Push to Amazon Glacier using amazonka

Here is a small Haskell package for pushing files to Amazon Glacier: https://github.com/carlohamalainen/glacier-push. It uses Brendan Hay’s amazonka API, in particular amazonka-glacier.

One thing that I couldn’t find in amazonka was a way to calculate the tree hash of a file. The Glacier API needs this for each part that is uploaded as well as the whole file. Amazon explains how to calculate the tree hash in their Glacier docs and provides sample code in Java and C++. Since the algorithm is recursive, it is quite short in Haskell:

oneMb :: Int64
oneMb = 1024*1024

treeHash :: BS.ByteString -> Hash
treeHash s = treeHash' $ map sha256 $ oneMbChunks s
  where
    treeHash' []  = error "Internal error in treeHash'."
    treeHash' [x] = B16.encode x
    treeHash' xs  = treeHash' $ next xs

    next :: [BS.ByteString] -> [BS.ByteString]
    next []       = []
    next [a]      = [a]
    next (a:b:xs) = sha256 (BS.append a b) : next xs

    oneMbChunks :: BS.ByteString -> [BS.ByteString]
    oneMbChunks x
      | BS.length x <= oneMb = [x]
      | otherwise            = BS.take oneMb x : oneMbChunks (BS.drop oneMb x)

    sha256 :: BS.ByteString -> BS.ByteString
    sha256 = cs . SHA256.hashlazy

To push a large file to Glacier we do three things: initiate the multipart upload, upload each part (say, 100Mb chunks), and then finalize the upload.

Initiate the multipart upload

We do this to get an uploadId which is then used for each of the multipart uploads. We use initiateMultipartUpload, and need to set the part size.

initiateMulti env vault _partSize = send' env mpu
  where
    mpu = initiateMultipartUpload accountId vault
            & imuPartSize .~ (Just $ cs $ show _partSize)

Upload the parts

With the response from initiating the multipart upload (the mu parameter in uploadOnePart) we can push a single part using uploadMultipartPart. Here we have to also set the checksum and content range:

uploadOnePart env vault mu p = do
    let Part{..} = p

    body <- toHashed <$> getPart _path (_partStart, _partEnd)

    uploadId <- case mu ^. imursUploadId of
                    Nothing     -> throw MissingUploadID
                    Just uid    -> return uid

    let ump = uploadMultipartPart accountId vault uploadId body
                & umpChecksum .~ (Just $ cs $ p ^. partHash)
                & umpRange    .~ Just cr

    send' env ump

  where

    contentRange :: Int64 -> Int64 -> Text
    contentRange x y = "bytes " `append` cs (show x) `append` accountId `append` cs (show y) `append` "/*"

Complete the multipart upload

Completing the multipart upload lets Glacier know that it should start its job of assembling all the parts into a full archive. We have to set the archive size and the tree hash of the entire file.

completeMulti env vault mp mu = do
    uploadId <- case mu ^. imursUploadId of
                    Nothing     -> throw MissingUploadID
                    Just uid    -> return uid

    let cmu = completeMultipartUpload accountId vault uploadId
                & cmuArchiveSize .~ (Just $ cs $ show $ mp ^. multipartArchiveSize)
                & cmuChecksum    .~ (Just $ cs $ mp ^. multipartFullHash)

    send' env cmu

Notes

In each of these functions I used a helper for sending the request:

send' env x = liftIO $ runResourceT . runAWST env $ send x

I run the main block of work in a KatipContextT since I am using katip for structured logging. Adding new key-value info to the log context is accomplished using katipAddContext.

go vault' path = do
    $(logTM) InfoS "Startup."

    let vault = cs vault'

    let myPartSize = 128*oneMb

    mp  <- liftIO $ mkMultiPart path myPartSize

    env <- liftIO $ newEnv'
    mu  <- liftIO $ initiateMulti env vault myPartSize

    let uploadId = fromMaybe (error "No UploadId in response, can't continue multipart upload.")
                 $ mu ^. imursUploadId

    partResponses <- forM (mp ^. multiParts) $ \p ->
        katipAddContext (sl "uploadId" uploadId) $
        katipAddContext (sl "location" $ fromMaybe "(nothing)" $ mu ^. imursLocation) $ do
            doWithRetries 10 (uploadOnePart env vault mu p)

    case lefts partResponses of
        []   -> do $(logTM) InfoS "All parts uploaded successfully, now completing the multipart upload."
                   catch (do completeResponse <- completeMulti env vault mp mu
                             katipAddContext (sl "uploadId" uploadId)                             $
                              katipAddContext (sl "archiveId" $ completeResponse ^. acoArchiveId) $
                               katipAddContext (sl "checksum" $ completeResponse ^. acoChecksum ) $
                                katipAddContext (sl "location" $ completeResponse ^. acoLocation) $ do
                                  $(logTM) InfoS "Done"
                                  liftIO exitSuccess)
                         (\e -> do logServiceError "Failed to complete multipart upload." e
                                   $(logTM) ErrorS "Failed."
                                   liftIO exitFailure)

        errs -> do forM_ errs (logServiceError "Failed part upload.")
                   $(logTM) ErrorS "Too many part errors."
                   liftIO exitFailure

logServiceError msg (ServiceError e)
    = let smsg :: Text
          smsg = toText $ fromMaybe "" $ e ^. serviceMessage

          scode :: Text
          scode = toText $ e ^. serviceCode

        in katipAddContext (sl "serviceMessage" smsg) $
            katipAddContext (sl "serviceCode" scode)  $
             (headersAsContext $ e ^. serviceHeaders) $
               $(logTM) ErrorS msg

logServiceError msg (TransportError e)
    = let txt :: Text
          txt = toText $ show e
        in katipAddContext (sl "TransportError" txt) $
            $(logTM) ErrorS msg

logServiceError msg (SerializeError e)
    = let txt :: Text
          txt = toText $ show e
        in katipAddContext (sl "SerializeError" txt) $
            $(logTM) ErrorS msg

I found it handy to write this little helper function to turn each header from an amazonka ServiceError into a Katip context key/value pair:

headersAsContext :: KatipContext m => [Header] -> m a -> m a
headersAsContext hs = foldl (.) id $ map headerToContext hs
  where
    headerToContext :: KatipContext m => Header -> m a -> m a
    headerToContext (h, x) = let h' = cs $ CI.original h :: Text
                                 x' = cs x               :: Text
                               in katipAddContext (sl h' x')

Katip can write to ElasticSearch using katip-elasticsearch. Then you’d be able to search for errors on specific header fields, etc.

Sample run

glacier-push-exe basement myfile
[2017-08-30 12:44:47][glacier-push.main][Info][x4][1724][ThreadId 7][main:Main app/Main.hs:300:7] Startup.
[2017-08-30 12:44:50][glacier-push.main][Info][x4][1724][ThreadId 7][location:/998720554704/vaults/basement/multipart-uploads/vZMCGNsLGhfTJ_hJ-CJ_OF_juCAY1IaZDl_A3YqOZXnuQRH_AtPiMaUE-K1JDew-ZwiIuDZgR3QbjsJIEfWtGeMNeKDs][partEnd:134217727][partStart:0][uploadId:vZMCGNsLGhfTJ_hJ-CJ_OF_juCAY1IaZDl_A3YqOZXnuQRH_AtPiMaUE-K1JDew-ZwiIuDZgR3QbjsJIEfWtGeMNeKDs][main:Main app/Main.hs:213:15] Uploading part.
[2017-08-30 12:45:45][glacier-push.main][Info][x4][1724][ThreadId 7][location:/998720554704/vaults/basement/multipart-uploads/vZMCGNsLGhfTJ_hJ-CJ_OF_juCAY1IaZDl_A3YqOZXnuQRH_AtPiMaUE-K1JDew-ZwiIuDZgR3QbjsJIEfWtGeMNeKDs][partEnd:268435455][partStart:134217728][uploadId:vZMCGNsLGhfTJ_hJ-CJ_OF_juCAY1IaZDl_A3YqOZXnuQRH_AtPiMaUE-K1JDew-ZwiIuDZgR3QbjsJIEfWtGeMNeKDs][main:Main app/Main.hs:213:15] Uploading part.
[2017-08-30 12:46:37][glacier-push.main][Info][x4][1724][ThreadId 7][location:/998720554704/vaults/basement/multipart-uploads/vZMCGNsLGhfTJ_hJ-CJ_OF_juCAY1IaZDl_A3YqOZXnuQRH_AtPiMaUE-K1JDew-ZwiIuDZgR3QbjsJIEfWtGeMNeKDs][partEnd:293601279][partStart:268435456][uploadId:vZMCGNsLGhfTJ_hJ-CJ_OF_juCAY1IaZDl_A3YqOZXnuQRH_AtPiMaUE-K1JDew-ZwiIuDZgR3QbjsJIEfWtGeMNeKDs][main:Main app/Main.hs:213:15] Uploading part.
[2017-08-30 12:46:55][glacier-push.main][Info][x4][1724][ThreadId 7][main:Main app/Main.hs:320:22] All parts uploaded successfully, now completing the multipart upload.
[2017-08-30 12:46:57][glacier-push.main][Info][x4][1724][ThreadId 7][archiveId:bImG6jM0eQGNC7kIJTsC_wtcAXdPDUtJ-NyfstrkxeyTtXC_iEgkvenH-h_eQH-LYbhVKWJM7WuZlb7OHKtgKJNEpOtVaqxEhlNRTHphUtLCurcHAQDHKkiTnIXTpFxgPgvP9Q0axA][checksum:4f08645d8f3705dc222eef7547591c400362806abb7a6298b9267ebf2be7d901][location:/998720554704/vaults/basement/archives/bImG6jM0eQGNC7kIJTsC_wtcAXdPDUtJ-NyfstrkxeyTtXC_iEgkvenH-h_eQH-LYbhVKWJM7WuZlb7OHKtgKJNEpOtVaqxEhlNRTHphUtLCurcHAQDHKkiTnIXTpFxgPgvP9Q0axA][uploadId:vZMCGNsLGhfTJ_hJ-CJ_OF_juCAY1IaZDl_A3YqOZXnuQRH_AtPiMaUE-K1JDew-ZwiIuDZgR3QbjsJIEfWtGeMNeKDs][main:Main app/Main.hs:326:37] Done

The lines are pretty long (as they are intended for consumption into ElasticSearch, not human parsing) so here is one with line breaks:

[2017-08-30 12:45:45]
 [glacier-push.main]
 [Info]
 [x4]
 [1724]
 [ThreadId 7]
 [location:/998720554704/vaults/basement/multipart-uploads/vZMCGNsLGhfTJ_hJ-CJ_OF_juCAY1IaZDl_A3YqOZXnuQRH_AtPiMaUE-K1JDew-ZwiIuDZgR3QbjsJIEfWtGeMNeKDs]
 [partEnd:268435455]
 [partStart:134217728]
 [uploadId:vZMCGNsLGhfTJ_hJ-CJ_OF_juCAY1IaZDl_A3YqOZXnuQRH_AtPiMaUE-K1JDew-ZwiIuDZgR3QbjsJIEfWtGeMNeKDs]
 [main:Main app/Main.hs:213:15] Uploading part.

Checking out and compiling

Use Stack. Then:

$ git clone https://github.com/carlohamalainen/glacier-push.git
$ cd glacier-push
$ stack build

To browse the source on github, have a look at:

FSA B3164 bottom bracket replacement

My Boardman Team TXC 650b hardtail mountain bike has an FSA crankset and bottom bracket. After a year and a half the bottom bracket got quite rough so I decided to swap it out with a Hope Hollowtech II bottom bracket.

I couldn’t find much online about this bottom bracket. Markings include: “FSA B3164 MegaExo 24mm MS185” and “BC1.37″ x 24T”.

I replaced it with this Hope bottom bracket, the 68/73mm model.

hope

It comes with three spacers but not all are needed.

The drive side cup of the new bottom bracket screwed in easily but I stripped the thread of the bottom bracket on the non-drive side, probably due to gunk in the threads.

I managed to get the cup out and then used a dremel (on low speed) to clean up the threads.

To chase the threads, I made a few cuts into the new bottom bracket cup (following this video), which let me push into the bottom bracket enough to bite into the thread. If you do this you’ll probably destroy the thread completely, rendering the frame junk.

 

ghc-imported-from => ghc-mod (August 2017)

I have a pull request to merge ghc-imported-from into ghc-mod. The main benefit of being part of ghc-mod is that I don’t have to duplicate ghc-mod’s infrastructure for handling sandboxes, GHC options, interfaces to other build tools like Stack, and compatibility with more versions of GHC.

The pull request is still under review, so until then you can try it out by cloning the development branches:

git clone -b imported-from https://github.com/DanielG/ghc-mod.git ghc-mod-imported-from
cd ghc-mod-imported-from
cabal update && cabal sandbox init && cabal install
export PATH=`pwd`/.cabal-sandbox/bin:$PATH

Assuming that you use Plugged for managing Vim/Neovim plugins, use my branch of ghcmod-vim by adding this to your vimrc:

call plug#begin('~/.vim/plugged')

Plug 'carlohamalainen/ghcmod-vim', { 'branch': 'ghcmod-imported-from-cmd', 'for' : 'haskell' }

Install the plugin with :PlugInstall in vim.

Recently, xdg-open stopped working for me (others have had the same issue) so I recommend setting ghcmod_browser in your vimrc:

let g:ghcmod_browser = '/usr/bin/firefox'

Here are some handy key mappings:

au FileType  haskell nnoremap  :GhcModType
au FileType  haskell nnoremap  :GhcModInfo
au FileType  haskell nnoremap  :GhcModTypeClear

au FileType lhaskell nnoremap  :GhcModType
au FileType lhaskell nnoremap  :GhcModInfo
au FileType lhaskell nnoremap  :GhcModTypeClear

au FileType haskell  nnoremap  :GhcModOpenDoc
au FileType lhaskell nnoremap  :GhcModOpenDoc

au FileType haskell  nnoremap  :GhcModDocUrl
au FileType lhaskell nnoremap  :GhcModDocUrl

au FileType haskell  vnoremap  :GhcModOpenHaddockVismode
au FileType lhaskell vnoremap  :GhcModOpenHaddockVismode

au FileType haskell  vnoremap  :GhcModEchoUrlVismode
au FileType lhaskell vnoremap  :GhcModEchoUrlVismode

On the command line, use the imported-from command. It tells you the defining module, the exporting module, and the Haddock URL:

$ ghc-mod imported-from Foo.hs 9 34
base-4.8.2.0:System.IO.print Prelude /opt/ghc/7.10.3/share/doc/ghc/html/libraries/base-4.8.2.0/Prelude.html

From Vim/Neovim, navigate to a symbol and hit F4 which will open the Haddock URL in your browser, or F5 to echo the command-line output.

Ubuntu 17 – device not managed

I plugged in a D-Link DUB-1312 to my laptop running Ubuntu Zesty but Network Manager said that the interface was “not managed”.

The fix, found here, is to remove the contents of one file. Better to save the original file and touch an empty one:

$ sudo mv    /usr/lib/NetworkManager/conf.d/10-globally-managed-devices.conf{,_ORIGINAL}
$ sudo touch /usr/lib/NetworkManager/conf.d/10-globally-managed-devices.conf

For reference, here’s the info about the DUB-1312 USB ethernet adapter:

$ sudo apt update
$ sudo apt install hwinfo
$ sudo hwinfo --netcard

(other output snipped)

40: USB 00.0: 0200 Ethernet controller
  [Created at usb.122]
  Unique ID: VQs5.d0KcpDt5qE6
  Parent ID: 75L1.MLPSY0FvjsF
  SysFS ID: /devices/pci0000:00/0000:00:14.0/usb2/2-6/2-6.4/2-6.4.3/2-6.4.3:1.0
  SysFS BusID: 2-6.4.3:1.0
  Hardware Class: network
  Model: "D-Link DUB-1312"
  Hotplug: USB
  Vendor: usb 0x2001 "D-Link"
  Device: usb 0x4a00 "D-Link DUB-1312"
  Revision: "1.00"
  Serial ID: "000000000005FA"
  Driver: "ax88179_178a"
  Driver Modules: "ax88179_178a"
  Device File: enxe46f13f4be18
  HW Address: e4:6f:13:f4:be:18
  Permanent HW Address: e4:6f:13:f4:be:18
  Link detected: yes
  Module Alias: "usb:v2001p4A00d0100dcFFdscFFdp00icFFiscFFip00in00"
  Driver Info #0:
    Driver Status: ax88179_178a is active
    Driver Activation Cmd: "modprobe ax88179_178a"
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #33 (Hub)

Structured logging to AWS ElasticSearch

A while ago I wrote about how to set up a structured logging service using PostgreSQL. AWS now makes it possible to have the same functionality (plus more) in the “serverless” style. For background on the idea of serverless architecture, watch this talk: GOTO 2017 • Serverless: the Future of Software Architecture • Peter Sbarski. Parts of this post are based on this guide on serverless AWS lambda elasticsearch and kibana.

First, create an Amazon Elasticsearch Service Domain. I used the smallest instance size since it’s just for personal use. Full docs are here.

For programmatic access control, create an AWS IAM user and make a note of its “arn” identifier, e.g. arn:aws:iam::000000000000:user/myiamuser. Then add an access policy as follows. We also add access to our IP address for the kibana interface. I made an ElasticSearch domain called “logs”; see the Resource field below:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::000000000000:user/myiamuser"
      },
      "Action": "es:*",
      "Resource": "arn:aws:es:ap-southeast-1:000000000000:domain/logs/*"
    },
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "*"
      },
      "Action": "es:*",
      "Resource": "arn:aws:es:ap-southeast-1:000000000000:domain/logs/*",
      "Condition": {
        "IpAddress": {
          "aws:SourceIp": "xxx.xxx.xxx.xxx"
        }
      }
    }
  ]
}

To post to the ElasticSearch instance we use requests-aws4auth:

sudo pip install requests-aws4auth

Then we can post a document, a json blob, using the following script. Set the host, region, AWS key, and AWS secret key. This script saves the system temperature under an index system-stats with the ISO date attached.

import datetime 

from elasticsearch import Elasticsearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth

HOST        = 'search-logs-xxxxxxxxxxxxxxxxxxxxxxxxxx.ap-southeast-1.es.amazonaws.com' # see 'Endpoint' in ES status page
REGION      = 'ap-southeast-1' # choose the correct region
AWS_KEY     = 'XXXXXXXXXXXXXXXXXXXX'
AWS_SECRET  = 'YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY'
 
def get_temp():
    return 42.0 # actually read from 'sensors' or similar

if __name__ == '__main__':
    now  = datetime.datetime.now()
    date = now.date().isoformat()

    doc = {'host': 'blah', 'temperature': get_temp(), 'datetime': now.isoformat()}

    awsauth = AWS4Auth(AWS_KEY, AWS_SECRET, REGION, 'es')

    es = Elasticsearch(
            hosts=[{'host': HOST, 'port': 443}],
            http_auth=awsauth, use_ssl=True, verify_certs=True,
            connection_class=RequestsHttpConnection)

    _index = 'system-stats-' + date
    _type  = 'temperature'
    print doc
    print es.index(index=_index, doc_type=_type, body=doc)

To query data we use elasticsearch-dsl.

sudo pip install elasticsearch-dsl
from elasticsearch import Elasticsearch
from elasticsearch import RequestsHttpConnection
from requests_aws4auth import AWS4Auth

from elasticsearch_dsl import Search

from datetime import datetime

HOST        = 'search-logs-xxxxxxxxxxxxxxxxxxxxxxxxxx.ap-southeast-1.es.amazonaws.com' # see 'Endpoint' in ES status page
REGION      = 'ap-southeast-1' # choose the correct region
AWS_KEY     = 'XXXXXXXXXXXXXXXXXXXX'
AWS_SECRET  = 'YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY'

awsauth = AWS4Auth(AWS_KEY, AWS_SECRET, REGION, 'es')

client = Elasticsearch(
            hosts=[{'host': HOST, 'port': 443}],
            http_auth=awsauth, use_ssl=True, verify_certs=True,
            connection_class=RequestsHttpConnection)

plot_date      = '2017-08-06'
monitored_host = 'blah'

s = Search(using=client, index='system-stats-' + plot_date) \
       .query('match', host=monitored_host)

response = s.execute()

xy = [(datetime.strptime(hit.datetime, '%Y-%m-%dT%H:%M:%S.%f'), hit.temperature) for hit in response]
xy = sorted(xy, key=lambda z: z[0])

for (x, y) in xy:
    print(x,y)

Sample output:

$ python3 dump.py 
2017-08-06 04:00:02.337370 32.0
2017-08-06 05:00:01.779796 37.0
2017-08-06 07:00:01.789370 37.0
2017-08-06 11:00:01.711586 40.0
2017-08-06 12:00:02.054906 42.0
2017-08-06 16:00:02.075869 44.0
2017-08-06 18:00:01.619764 43.0
2017-08-06 19:00:02.319470 38.0
2017-08-06 20:00:03.098032 43.0
2017-08-06 22:00:03.629017 43.0

For exploring the data you can also use kibana, which is included with the ElasticSearch service from AWS.

Another nifty thing about the AWS infrastructure is that you can use Lambda to create ElasticSearch entries when objects drop in an S3 bucket. More details in this post.

Stripping html tags using TagSoup

I had a situation, when converting old blog posts to WordPress, where I wanted to strip all the extra info on the pre tags. For example this:

<pre><code><span style="">&gt;</span> <span style="color: blue; font-weight: bold;">import</span> <span style="">Data</span><span style="">.</span><span style="">Char</span>

would turn into:

>import Data.Char

It turns out that this is really easy using TagSoup.

module Detag where

import Control.Monad
import Text.HTML.TagSoup

The function to strip tags works on a list of tags of strings:

strip :: [Tag String] -> [Tag String]

strip [] = []

If we hit a pre tag, ignore its info (the underscore) and continue on recursively:

strip (TagOpen "pre" _ : rest) = TagOpen "pre" [] : strip rest

Similarly, strip the info off an opening code tag:

strip (TagOpen  "code" _ : rest) = strip rest
strip (TagClose "code"   : rest) = strip rest

If we hit a span, followed by some text, and a closing span, then keep the text tag and continue:

strip (TagOpen "span" _ : TagText t : TagClose "span" : rest)
  = TagText t : strip rest

Don’t change other tags:

strip (t:ts) = t : strip ts

Parsing input from stdin is straightforward. We use optEscape and optRawTag to avoid mangling other html in the input.

main :: IO ()
main = do
    s <- getContents
    let tags = parseTags s
        ropts = renderOptions{optEscape = id, optRawTag = const True}
    putStrLn $ renderTagsOptions ropts $ strip tags

Example output:

$ runhaskell Detag.hs 
<pre class="sourceCode haskell"><code class="sourceCode haskell"><span style="">&gt;</span> <span style="color: green;">{-# LANGUAGE RankNTypes          #-}</span>
<pre>> {-# LANGUAGE RankNTypes          #-}

Data.Proxy

Short note on Data.Proxy based on this Stackoverflow answer.

First, a few imports:

> {-# LANGUAGE RankNTypes          #-}
> {-# LANGUAGE ScopedTypeVariables #-}
>
> module Proxy where
>
> import Data.Proxy
> import Text.Read

Suppose we want to check if some fuzzy real world data can be read as certain concrete types. We could write a few helper functions using readMaybe:

> readableAsInt :: String -> Bool
> readableAsInt s
>   = case readMaybe s of
>       Just (_ :: Int) -> True
>       _               -> False
>
> readableAsDouble :: String -> Bool
> readableAsDouble s
>   = case readMaybe s of
>       Just (_ :: Double) -> True
>       _                  -> False
>
> readableAsBool :: String -> Bool
> readableAsBool s
>   = case readMaybe s of
>       Just (_ :: Bool) -> True
>       _                -> False

These are all basically the same. How to generalise? Let’s try a typeclass.

> class ReadableAs t where
>    readableAs :: String -> Bool

This doesn’t work since readableAs doesn’t depend on the type t:

    The class method ‘readableAs’
    mentions none of the type or kind variables of the class ‘ReadableAs t’
    When checking the class method: readableAs :: String -> Bool
    In the class declaration for ‘ReadableAs’
Failed, modules loaded: none.

So put the type in:

> class ReadableAs' t where
>    readableAs' :: t -> String -> Bool

This compiles, so let’s write some instances:

> instance ReadableAs' Int where
>   readableAs' _ s
>      = case readMaybe s of
>          Just (_ :: Int) -> True
>          _               -> False
>
> instance ReadableAs' Double where
>   readableAs' _ s
>      = case readMaybe s of
>          Just (_ :: Double) -> True
>          _                  -> False
>
> instance ReadableAs' Bool where
>   readableAs' _ s
>      = case readMaybe s of
>          Just (_ :: Bool) -> True
>          _                -> False

Using it is clunky since we have to come up with a concrete value for the first argument:

 > readableAs' (0::Int) "0"
 True
 > readableAs' (0::Double) "0"
 True

For some types we could use Data.Default for this placeholder value. But for other types nothing will make sense. How do we choose a default value for Foo?

> data Foo = Cat | Dog

Haskell has non-strict evaluation so we can use undefined, but, ugh. Bad idea.

 > readableAs' (undefined::Int) "0"
 True

So let’s try out Proxy. It has a single constructor and a free type variable that we can set:

 > :t Proxy
 Proxy :: Proxy t
 > Proxy :: Proxy Bool
 Proxy
 > Proxy :: Proxy Int
 Proxy
 > Proxy :: Proxy Double
 Proxy

Let’s use Proxy t instead of t:

> class ReadableAsP t where
>    readableAsP :: Proxy t -> String -> Bool
>
> instance ReadableAsP Int where
>   readableAsP _ s
>      = case readMaybe s of
>          Just (_ :: Int) -> True
>          _               -> False
>
> instance ReadableAsP Double where
>   readableAsP _ s
>      = case readMaybe s of
>          Just (_ :: Double) -> True
>          _                  -> False
>
> instance ReadableAsP Bool where
>   readableAsP _ s
>      = case readMaybe s of
>          Just (_ :: Bool) -> True
>          _                -> False

This works, and we don’t have to come up with the unused concrete value:

 > readableAsP (Proxy :: Proxy Bool) "0"
 False
 > readableAsP (Proxy :: Proxy Bool) "True"
 True
 > readableAsP (Proxy :: Proxy Int) "0"
 True
 > readableAsP (Proxy :: Proxy Double) "0"
 True
 > readableAsP (Proxy :: Proxy Double) "0.0"
 True

Still, there’s a lot of duplication in the class and instances. We can do away with the class entirely. With the ScopedTypeVariables language extension and the forall, the t in the type signature can be referred to in the body:

> readableAs :: forall t. Read t => Proxy t -> String -> Bool
> readableAs _ s
>      = case readMaybe s of
>          Just (_ :: t) -> True
>          _             -> False
 > readableAs (Proxy :: Proxy Int) "0"
 True
 > readableAs (Proxy :: Proxy Int) "foo"
 False

Archived comments


Franklin Chen

2017-03-25 02:23:07.94742 UTC

This can also now be done without Proxy, thanks to explicit type application:

{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE AllowAmbiguousTypes #-}

import Data.Maybe (isJust)

readableAs :: forall t. Read t => String -> Bool
readableAs = isJust @t . readMaybe

example1 = readableAs @Int "0"
example2 = readableAs @Int "foo"
example3 = readableAs @Double "0.1"

ghc-imported-from => ghc-mod (March 2017)

I have a pull request to merge ghc-imported-from into ghc-mod. The main benefit of being part of ghc-mod is that I don’t have to duplicate ghc-mod’s infrastructure for handling sandboxes, GHC options, interfaces to other build tools like Stack, and compatibility with more versions of GHC.

The pull request is still under review, so until then you can try it out by cloning the development branches:

git clone -b imported-from https://github.com/DanielG/ghc-mod.git ghc-mod-imported-from
cd ghc-mod-imported-from
cabal update && cabal sandbox init && cabal install
export PATH=`pwd`/.cabal-sandbox/bin:$PATH

Assuming that you use Plugged for managing Vim/Neovim plugins, use my branch of ghcmod-vim by adding this to your vimrc:

call plug#begin('~/.vim/plugged')

Plug 'carlohamalainen/ghcmod-vim', { 'branch': 'ghcmod-imported-from-cmd', 'for' : 'haskell' }

Install the plugin with :PlugInstall in vim.

Here are some handy key mappings:

au FileType  haskell nnoremap            :GhcModType
au FileType  haskell nnoremap            :GhcModInfo
au FileType  haskell nnoremap    :GhcModTypeClear

au FileType lhaskell nnoremap            :GhcModType
au FileType lhaskell nnoremap            :GhcModInfo
au FileType lhaskell nnoremap    :GhcModTypeClear

au FileType haskell  nnoremap   :GhcModOpenDoc
au FileType lhaskell nnoremap   :GhcModOpenDoc

au FileType haskell  nnoremap   :GhcModDocUrl
au FileType lhaskell nnoremap   :GhcModDocUrl

au FileType haskell  vnoremap  : GhcModOpenHaddockVismode
au FileType lhaskell vnoremap  : GhcModOpenHaddockVismode

au FileType haskell  vnoremap  : GhcModEchoUrlVismode
au FileType lhaskell vnoremap  : GhcModEchoUrlVismode

On the command line, use the imported-from command. It tells you the defining module, the exporting module, and the Haddock URL:

$ ghc-mod imported-from Foo.hs 9 34 show
base-4.8.2.0:GHC.Show.show Prelude https://hackage.haskell.org/package/base-4.8.2.0/docs/Prelude.html

From Vim/Neovim, navigate to a symbol and hit F4 which will open the Haddock URL in your browser, or F5 to echo the command-line output.

Raspbian with full disk encryption

This blog post shows how to convert a standard Raspbian installation to full disk encryption. The encryption passphrase can be entered at the physical console or via a dropbear ssh session.

I mainly follow the Offensive Security guide.

What you need:

  • Raspberry Pi.
  • Laptop with a microSD card slot. I used my X1 Carbon running Ubuntu xenial (amd64).

First, install Raspbian. With a 32Gb microSD card the partitions are:

/dev/mmcblk0p2                29G  4.8G   23G  18% /media/carlo/7f593562-9f68-4bb9-a7c9-2b70ad620873
/dev/mmcblk0p1                63M   21M   42M  34% /media/carlo/boot

It’s a good idea to make a backup of the working installation:

dd if=/dev/mmcblk0 of=pi-debian-unencrypted-backup.img

Also make a note of the start/end of the main partition. This will be needed later.

Install the qemu static (on the laptop, not the Pi):

sudo apt update
sudo apt install qemu-user-static

Create directories for the chroot. Easiest to do all of this as root. Pop the sd card into the laptop and drop into a chroot:

mkdir -p pi/chroot/boot

mount /dev/mmcblk0p2 pi/chroot/
mount /dev/mmcblk0p1 pi/chroot/boot/

mount -t proc  none     pi/chroot/proc
mount -t sysfs none     pi/chroot/sys
mount -o bind /dev      pi/chroot/dev
mount -o bind /dev/pts  pi/chroot/dev/pts

cp /usr/bin/qemu-arm-static pi/chroot/usr/bin/
LANG=C chroot pi/chroot/

Next we need to install a few things in the chroot. If these fail with

root@x4:/# apt update
qemu: uncaught target signal 4 (Illegal instruction) - core dumped
Illegal instruction (core dumped)

then comment out the libarmmem line in ld.so.preload:

root@x4:~# cat pi/chroot/etc/ld.so.preload
#/usr/lib/arm-linux-gnueabihf/libarmmem.so

Install the things:

apt update
apt install busybox cryptsetup dropbear

To create an initramfs image we we need the kernel version. In my case
it is 4.4.50-v7+.

root@x4:/# ls -l /lib/modules/
total 8
drwxr-xr-x 3 root root 4096 Mar 11 20:31 4.4.50+
drwxr-xr-x 3 root root 4096 Mar 11 20:31 4.4.50-v7+

Create the image, enable ssh, and set the root password:

root@x4:/# mkinitramfs -o /boot/initramfs.gz 4.4.50-v7+
root@x4:/# update-rc.d ssh enable
root@x4:/# passwd

Set the boot command line. Previously I had:

root@x4:/# cat /boot/cmdline.txt
dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=/dev/mmcblk0p2 rootfstype=ext4 elevator=deadline fsck.repair=yes rootwait

The new one refers to the encrypted partition:

dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=/dev/mapper/crypt_sdcard cryptdevice=/dev/mmcblk0p2:crypt_sdcard rootfstype=ext4 elevator=deadline fsck.repair=yes rootwait

Add this to the boot config:

echo "initramfs initramfs.gz 0x00f00000" >> /boot/config.txt

Cat the private key and copy to the laptop; save as pikey:

root@x4:/# cat /etc/initramfs-tools/root/.ssh/id_rsa

The Offensive Security guide talks about editing /etc/initramfs-tools/root/.ssh/authorized_keys so that the ssh login can only run /scripts/local-top/cryptroot. I had no luck getting this to work, running into weird issues with Plymouth. For some reason the password prompt appeared on the physical console of the Pi, not the ssh session. So I skipped this and manually use the dropbear session to enter the encryption passphrase.

Set /etc/fstab to point to the new root partition. Original file:

root@x4:/# cat /etc/fstab
proc            /proc           proc    defaults          0       0
/dev/mmcblk0p1  /boot           vfat    defaults          0       2
/dev/mmcblk0p2  /               ext4    defaults,noatime  0       1
# a swapfile is not a swap partition, no line here
#   use  dphys-swapfile swap[on|off]  for that

New file (only one line changed, referring to /dev/mapper):

root@x4:/# cat /etc/fstab
proc            /proc           proc    defaults          0       0
/dev/mmcblk0p1  /boot           vfat    defaults          0       2
/dev/mapper/crypt_sdcard /               ext4    defaults,noatime  0       1
# a swapfile is not a swap partition, no line here
#   use  dphys-swapfile swap[on|off]  for that

Edit /etc/crypttab to look like this:

root@x4:/# cat /etc/crypttab
# 				
crypt_sdcard /dev/mmcblk0p2 none luks

The Offensive Security guide mentions that there can be issues with ports taking a while to wake up, so they recommend adding a 5 second sleep before the configure_networking line in /usr/share/initramfs-tools/scripts/init-premount/dropbear:

echo "Waiting 5 seconds for USB to wake"
sleep 5
configure_networking &

Regenerate the image:

root@x4:/# mkinitramfs -o /boot/initramfs.gz 4.4.50-v7+
device-mapper: table ioctl on crypt_sdcard failed: No such device or address
Command failed
cryptsetup: WARNING: failed to determine cipher modules to load for crypt_sdcard
Unsupported ioctl: cmd=0x5331

Now pop out of the chroot (Ctrl-D), unmount some things, and make a backup:

umount pi/chroot/boot
umount pi/chroot/sys
umount pi/chroot/proc
mkdir -p pi/backup
rsync -avh pi/chroot/* pi/backup/

umount pi/chroot/dev/pts
umount pi/chroot/dev
umount pi/chroot

Encrypt the partition, unlock it, and rsync the data back:

cryptsetup -v -y --cipher aes-cbc-essiv:sha256 --key-size 256 luksFormat /dev/mmcblk0p2
cryptsetup -v luksOpen /dev/mmcblk0p2 crypt_sdcard
mkfs.ext4 /dev/mapper/crypt_sdcard

mkdir -p pi/encrypted
mount /dev/mapper/crypt_sdcard pi/encrypted/
rsync -avh pi/backup/* pi/encrypted/

umount pi/encrypted/
cryptsetup luksClose /dev/mapper/crypt_sdcard
sync

Now put the sd card into the Pi and boot it up. If you see this on the console:

/scripts/local-top/cryptroot: line 1: /sbin/cryptsetup: not found

it means that the initramfs image didn’t include the cryptsetup binary. It is a known bug and the workaround that worked for me was:

echo "export CRYPTSETUP=y" >> /usr/share/initramfs-tools/conf-hooks.d/forcecryptsetup

(I had to do this in the chroot environment and rebuild the initramfs image. Ugh.)

For some reason I had Plymouth asking for the password on the physical console instead of the ssh dropbear connection. This is another known issue. This workaround looked promising but it broke the physical console as well as the ssh connection. No idea why.

What does work for me is to use the dropbear session to manually kill Plymouth and then enter the encryption password:

ssh -i pikey root@192.168.0.xxx

Then in the busybox session:

kill $(pidof plymouthd)
# Wait a few seconds...
echo -ne password > /lib/cryptsetup/passfifo
/scripts/local-top/cryptroot

This lets you enter the encryption passphrase. After a few seconds the normal boot process continues. So you can enter the encryption passphrase with a real keyboard if you are physically with the Pi, or you can ssh in if you are remote.

Subkeys in GPG for a YubiKey

I recently got two YubiKeys to try out another form of 2FA and to see how they work with my PGP setup (Enigmail and Thunderbird). I followed ankitrasto’s guide (part 1 and part 2) to move a key to the YubiKey.

I then exported my public key with gpg2 -a --export carlo@carlo-hamalainen.net and sent it to a friend. He replied with the reasonable question: why didn’t the fingerprint E3E4A5B8 change? The exported data changed (was longer) yet the fingerprint, which looks like a hash, was the same.

What’s going on here is that originally I had a main key E3E4A5B8 for signing (the “S” next to usage) and certification (the “C”). Meanwhile, encryption was done using a subkey 81E07A3C (the “E”).

$ gpg2 --edit-key carlo@carlo-hamalainen.net
gpg (GnuPG) 2.1.11; Copyright (C) 2016 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Secret key is available.

sec  rsa4096/E3E4A5B8
     created: 2013-07-10  expires: never       usage: SC
     trust: ultimate      validity: ultimate
ssb  rsa4096/81E07A3C
     created: 2013-07-10  expires: never       usage: E
[ultimate] (1). Carlo Hamalainen 

From my friend’s perspective, I only had one “public key”, the one with fingerprint E3E4A5B8.

When I added two subkeys (one for each YubiKey), I got BE8897FA and 766D56F8. These entries have a card-no which refers to the serial number of the YubiKey where the subkey lives.

$ gpg2 --edit-key carlo@carlo-hamalainen.net
gpg (GnuPG) 2.1.11; Copyright (C) 2016 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Secret key is available.

sec  rsa4096/E3E4A5B8
     created: 2013-07-10  expires: never       usage: SC
     trust: ultimate      validity: ultimate
ssb  rsa4096/81E07A3C
     created: 2013-07-10  expires: never       usage: E
ssb  rsa2048/BE8897FA
     created: 2016-11-20  expires: never       usage: E
     card-no: 0006 XXXXXXXX
ssb  rsa2048/766D56F8
     created: 2016-12-13  expires: never       usage: E
     card-no: 0006 YYYYYYYY
[ultimate] (1). Carlo Hamalainen 

To see more detail about the keys, we have to inspect the packets in the export.

Packet dump

We can use gpg2 --list-packets --verbose to see what is in the output of gpg2 -a --export.

Here is the original packet dump when I had just the main key and the encryption subkey:

$ gpg2 -a --export | gpg2 --list-packets --verbose

# off=0 ctb=99 tag=6 hlen=3 plen=525
:public key packet:
    version 4, algo 1, created 1373439395, expires 0
    pkey[0]: C35E579D722847E70515382572B28200 (very long line, snipped)
    pkey[1]: 010001
    keyid: 269E0DC4E3E4A5B8
# off=528 ctb=b4 tag=13 hlen=2 plen=45
:user ID packet: "Carlo Hamalainen "
# off=575 ctb=89 tag=2 hlen=3 plen=568
:signature packet: algo 1, keyid 269E0DC4E3E4A5B8
    version 4, created 1373439395, md5len 0, sigclass 0x13
    digest algo 2, begin of digest 05 44
    hashed subpkt 2 len 4 (sig created 2013-07-10)
    hashed subpkt 27 len 1 (key flags: 03)
    hashed subpkt 11 len 5 (pref-sym-algos: 9 8 7 3 2)
    hashed subpkt 21 len 5 (pref-hash-algos: 8 2 9 10 11)
    hashed subpkt 22 len 3 (pref-zip-algos: 2 3 1)
    hashed subpkt 30 len 1 (features: 01)
    hashed subpkt 23 len 1 (key server preferences: 80)
    subpkt 16 len 8 (issuer key ID 269E0DC4E3E4A5B8)
    data: REDACTED001
# off=1146 ctb=b9 tag=14 hlen=3 plen=525
:public sub key packet:
    version 4, algo 1, created 1373439395, expires 0
    pkey[0]: REDACTED002
    pkey[1]: 010001
    keyid: EF86D47281E07A3C
# off=1674 ctb=89 tag=2 hlen=3 plen=543
:signature packet: algo 1, keyid 269E0DC4E3E4A5B8
    version 4, created 1373439395, md5len 0, sigclass 0x18
    digest algo 2, begin of digest b9 63
    hashed subpkt 2 len 4 (sig created 2013-07-10)
    hashed subpkt 27 len 1 (key flags: 0C)
    subpkt 16 len 8 (issuer key ID 269E0DC4E3E4A5B8)
    data: REDACTED003

With the two subkeys added for the YubiKeys, the dump now looks like this:


# off=0 ctb=99 tag=6 hlen=3 plen=525
:public key packet:
    version 4, algo 1, created 1373439395, expires 0
    pkey[0]: C35E579D722847E70515382572B28200 (very long line, snipped)
    pkey[1]: 010001
    keyid: 269E0DC4E3E4A5B8
# off=528 ctb=b4 tag=13 hlen=2 plen=45
:user ID packet: "Carlo Hamalainen "
# off=575 ctb=89 tag=2 hlen=3 plen=568
:signature packet: algo 1, keyid 269E0DC4E3E4A5B8
    version 4, created 1373439395, md5len 0, sigclass 0x13
    digest algo 2, begin of digest 05 44
    hashed subpkt 2 len 4 (sig created 2013-07-10)
    hashed subpkt 27 len 1 (key flags: 03)
    hashed subpkt 11 len 5 (pref-sym-algos: 9 8 7 3 2)
    hashed subpkt 21 len 5 (pref-hash-algos: 8 2 9 10 11)
    hashed subpkt 22 len 3 (pref-zip-algos: 2 3 1)
    hashed subpkt 30 len 1 (features: 01)
    hashed subpkt 23 len 1 (key server preferences: 80)
    subpkt 16 len 8 (issuer key ID 269E0DC4E3E4A5B8)
    data: REDACTED001
# off=1146 ctb=b9 tag=14 hlen=3 plen=525
:public sub key packet:
    version 4, algo 1, created 1373439395, expires 0
    pkey[0]: REDACTED002
    pkey[1]: 010001
    keyid: EF86D47281E07A3C
# off=1674 ctb=89 tag=2 hlen=3 plen=543
:signature packet: algo 1, keyid 269E0DC4E3E4A5B8
    version 4, created 1373439395, md5len 0, sigclass 0x18
    digest algo 2, begin of digest b9 63
    hashed subpkt 2 len 4 (sig created 2013-07-10)
    hashed subpkt 27 len 1 (key flags: 0C)
    subpkt 16 len 8 (issuer key ID 269E0DC4E3E4A5B8)
    data: REDACTED003
# off=2220 ctb=b9 tag=14 hlen=3 plen=269
:public sub key packet:
    version 4, algo 1, created 1479618635, expires 0
    pkey[0]: REDACTED004
    pkey[1]: 010001
    keyid: 6D682AD2BE8897FA
# off=2492 ctb=89 tag=2 hlen=3 plen=543
:signature packet: algo 1, keyid 269E0DC4E3E4A5B8
    version 4, created 1479618635, md5len 0, sigclass 0x18
    digest algo 8, begin of digest bc 2c
    hashed subpkt 2 len 4 (sig created 2016-11-20)
    hashed subpkt 27 len 1 (key flags: 0C)
    subpkt 16 len 8 (issuer key ID 269E0DC4E3E4A5B8)
    data: REDACTED005
# off=3038 ctb=b9 tag=14 hlen=3 plen=269
:public sub key packet:
    version 4, algo 1, created 1481611279, expires 0
    pkey[0]: REDACTED006
    pkey[1]: 010001
    keyid: 17118623766D56F8
# off=3310 ctb=89 tag=2 hlen=3 plen=543
:signature packet: algo 1, keyid 269E0DC4E3E4A5B8
    version 4, created 1481611279, md5len 0, sigclass 0x18
    digest algo 8, begin of digest a2 63
    hashed subpkt 2 len 4 (sig created 2016-12-13)
    hashed subpkt 27 len 1 (key flags: 0C)
    subpkt 16 len 8 (issuer key ID 269E0DC4E3E4A5B8)
    data: REDACTED007

1. In both dumps, my main key is the same (the green block). The fingerprint is 269E0DC4E3E4A5B8 in both dumps.

2. The next block is the same in both dumps – it is the encryption subkey. The signature packet is our way of proving that the subkey is attached to our main key. Note the line issuer key ID 269E0DC4E3E4A5B8.

3. This part is for my first YubiKey subkey, BE8897FA. The signature packet claims that the subkey is issued by 269E0DC4E3E4A5B8.

4. This part is for my second YubiKey subkey, 766D56F8. Its signature packet also claims that the subkey is issued by 269E0DC4E3E4A5B8.

pgpdump

An alternative to gpg2 --list-packets --verbose is pgpdump. This formats the packet dump in a nicer way, for example rendering the timestamp created 1373439395 as Public key creation time - Wed Jul 10 14:56:35 SGT 2013.

$ gpg2 --armor --export carlo@carlo-hamalainen.net | pgpdump
Old: Public Key Packet(tag 6)(525 bytes)
    Ver 4 - new
    Public key creation time - Wed Jul 10 14:56:35 SGT 2013
    Pub alg - RSA Encrypt or Sign(pub 1)
    RSA n(4096 bits) - ...
    RSA e(17 bits) - ...
Old: User ID Packet(tag 13)(45 bytes)
    User ID - Carlo Hamalainen 
Old: Signature Packet(tag 2)(568 bytes)
    Ver 4 - new
    Sig type - Positive certification of a User ID and Public Key packet(0x13).
    Pub alg - RSA Encrypt or Sign(pub 1)
    Hash alg - SHA1(hash 2)
    Hashed Sub: signature creation time(sub 2)(4 bytes)
        Time - Wed Jul 10 14:56:35 SGT 2013
    Hashed Sub: key flags(sub 27)(1 bytes)
        Flag - This key may be used to certify other keys
        Flag - This key may be used to sign data
    Hashed Sub: preferred symmetric algorithms(sub 11)(5 bytes)
        Sym alg - AES with 256-bit key(sym 9)
        Sym alg - AES with 192-bit key(sym 8)
        Sym alg - AES with 128-bit key(sym 7)
        Sym alg - CAST5(sym 3)
        Sym alg - Triple-DES(sym 2)
    Hashed Sub: preferred hash algorithms(sub 21)(5 bytes)
        Hash alg - SHA256(hash 8)
        Hash alg - SHA1(hash 2)
        Hash alg - SHA384(hash 9)
        Hash alg - SHA512(hash 10)
        Hash alg - SHA224(hash 11)
    Hashed Sub: preferred compression algorithms(sub 22)(3 bytes)
        Comp alg - ZLIB (comp 2)
        Comp alg - BZip2(comp 3)
        Comp alg - ZIP (comp 1)
    Hashed Sub: features(sub 30)(1 bytes)
        Flag - Modification detection (packets 18 and 19)
    Hashed Sub: key server preferences(sub 23)(1 bytes)
        Flag - No-modify
    Sub: issuer key ID(sub 16)(8 bytes)
        Key ID - 0x269E0DC4E3E4A5B8
    Hash left 2 bytes - 05 44
    RSA m^d mod n(4095 bits) - ...
        -> PKCS-1
Old: Public Subkey Packet(tag 14)(525 bytes)
    Ver 4 - new
    Public key creation time - Wed Jul 10 14:56:35 SGT 2013
    Pub alg - RSA Encrypt or Sign(pub 1)
    RSA n(4096 bits) - ...
    RSA e(17 bits) - ...
Old: Signature Packet(tag 2)(543 bytes)
    Ver 4 - new
    Sig type - Subkey Binding Signature(0x18).
    Pub alg - RSA Encrypt or Sign(pub 1)
    Hash alg - SHA1(hash 2)
    Hashed Sub: signature creation time(sub 2)(4 bytes)
        Time - Wed Jul 10 14:56:35 SGT 2013
    Hashed Sub: key flags(sub 27)(1 bytes)
        Flag - This key may be used to encrypt communications
        Flag - This key may be used to encrypt storage
    Sub: issuer key ID(sub 16)(8 bytes)
        Key ID - 0x269E0DC4E3E4A5B8
    Hash left 2 bytes - b9 63
    RSA m^d mod n(4095 bits) - ...
        -> PKCS-1
Old: Public Subkey Packet(tag 14)(269 bytes)
    Ver 4 - new
    Public key creation time - Sun Nov 20 13:10:35 SGT 2016
    Pub alg - RSA Encrypt or Sign(pub 1)
    RSA n(2048 bits) - ...
    RSA e(17 bits) - ...
Old: Signature Packet(tag 2)(543 bytes)
    Ver 4 - new
    Sig type - Subkey Binding Signature(0x18).
    Pub alg - RSA Encrypt or Sign(pub 1)
    Hash alg - SHA256(hash 8)
    Hashed Sub: signature creation time(sub 2)(4 bytes)
        Time - Sun Nov 20 13:10:35 SGT 2016
    Hashed Sub: key flags(sub 27)(1 bytes)
        Flag - This key may be used to encrypt communications
        Flag - This key may be used to encrypt storage
    Sub: issuer key ID(sub 16)(8 bytes)
        Key ID - 0x269E0DC4E3E4A5B8
    Hash left 2 bytes - bc 2c
    RSA m^d mod n(4096 bits) - ...
        -> PKCS-1
Old: Public Subkey Packet(tag 14)(269 bytes)
    Ver 4 - new
    Public key creation time - Tue Dec 13 14:41:19 SGT 2016
    Pub alg - RSA Encrypt or Sign(pub 1)
    RSA n(2048 bits) - ...
    RSA e(17 bits) - ...
Old: Signature Packet(tag 2)(543 bytes)
    Ver 4 - new
    Sig type - Subkey Binding Signature(0x18).
    Pub alg - RSA Encrypt or Sign(pub 1)
    Hash alg - SHA256(hash 8)
    Hashed Sub: signature creation time(sub 2)(4 bytes)
        Time - Tue Dec 13 14:41:19 SGT 2016
    Hashed Sub: key flags(sub 27)(1 bytes)
        Flag - This key may be used to encrypt communications
        Flag - This key may be used to encrypt storage
    Sub: issuer key ID(sub 16)(8 bytes)
        Key ID - 0x269E0DC4E3E4A5B8
    Hash left 2 bytes - a2 63
    RSA m^d mod n(4093 bits) - ...
        -> PKCS-1

Manually checking the packets

I noticed that the gpg2 dump for the packet at offset 3038 refers to the key id 17118623766D56F8 but the equivalent block from pgpdump has no key id:

# off=3038 ctb=b9 tag=14 hlen=3 plen=269
:public sub key packet:
    version 4, algo 1, created 1481611279, expires 0
    pkey[0]: REDACTED006
    pkey[1]: 010001
    keyid: 17118623766D56F8
Old: Public Subkey Packet(tag 14)(269 bytes)
    Ver 4 - new
    Public key creation time - Tue Dec 13 14:41:19 SGT 2016
    Pub alg - RSA Encrypt or Sign(pub 1)
    RSA n(2048 bits) - ...
    RSA e(17 bits) - ...

The RFC tells us how to calculate the fingerprint of a V4 packet:

A V4 fingerprint is the 160-bit SHA-1 hash of the octet 0x99, followed by the two-octet packet length, followed by the entire Public-Key packet starting with the version field. The Key ID is the low-order 64 bits of the fingerprint.

My Python snippet to do this is here. A snippet is below:

xs = read_binary_public_key()

SUBKEY_OFFSET = 3038
SUBKEY_PLEN   = 269

subkey_packet = xs[SUBKEY_OFFSET:SUBKEY_OFFSET+SUBKEY_PLEN+2] # +2 because the PLEN is short by one?

assert subkey_packet[2] == 'x04'

for_fingerprint = ['x99'] + subkey_packet

h = hashlib.sha1(to_str(for_fingerprint))

assert h.digest_size == 20
key_id = to_hexdump(h.digest()[12:], s='')

assert key_id == '17118623766D56F8'

Subkey binding signature

The subkey packet at off=3038 defines the subkey 17118623766D56F8. The next packet at off=3310 provides proof that the key 17118623766D56F8 is attached to our main public key 269E0DC4E3E4A5B8.

The signature packet doesn’t refer to the offset 3038 or the key id 17118623766D56F8 of the subkey packet, so let’s check the contents of the signature packet to see if it really does match the subkey data.

# off=3038 ctb=b9 tag=14 hlen=3 plen=269
:public sub key packet:
    version 4, algo 1, created 1481611279, expires 0
    pkey[0]: REDACTED006
    pkey[1]: 010001
    keyid: 17118623766D56F8
# off=3310 ctb=89 tag=2 hlen=3 plen=543
:signature packet: algo 1, keyid 269E0DC4E3E4A5B8
    version 4, created 1481611279, md5len 0, sigclass 0x18
    digest algo 8, begin of digest a2 63
    hashed subpkt 2 len 4 (sig created 2016-12-13)
    hashed subpkt 27 len 1 (key flags: 0C)
    subpkt 16 len 8 (issuer key ID 269E0DC4E3E4A5B8)
    data: REDACTED007

The first thing that we can check is that the left 16 bits of the hash matches A2 63 (red line above). Checking this hash wasn’t completely straightforward, just reading from RFC. (Other people ran into similar issues.) The full block of code is here. Sample is below:

signature_block = xs[3310:3310+543+2]

# Starts off with two bytes for the length.
assert 543 == (ord(signature_block[0]) << 8) + ord(signature_block[1])

hash_subpacket_length = (ord(signature_block[6]) << 8) + ord(signature_block[7])
assert hash_subpacket_length == 9

start_of_hashed_part   = 8
start_of_unhashed_part = 7 + hash_subpacket_length + 1

unhash_subpacket_length = ( (ord(signature_block[start_of_unhashed_part]) << 8)
                          +  ord(signature_block[start_of_unhashed_part+1])
                          )

start_of_left_16 = start_of_unhashed_part + unhash_subpacket_length + 2

left_16 = signature_block[start_of_left_16:start_of_left_16+2]

assert left_16 == ['xA2', 'x63']

hashed_part_of_sig = signature_block[start_of_hashed_part:start_of_hashed_part+hash_subpacket_length]
assert len(hashed_part_of_sig) == hash_subpacket_length

public_key_block = xs[0:0+525+2] # +2 for what?
assert public_key_block[2] == 'x04'

header1     = ['x99'] + public_key_block[:8]
pubkey_body = public_key_block[8:]

header2     = ['x99'] + subkey_packet[:8]
subsig_body = subkey_packet[8:]

# Version, class, pub key algo, digest algo.
version_class_algos = [ 'x04', 'x18',
                        signature_block[4],
                        signature_block[5]
                      ]

m = hash_subpacket_length
assert m == 9
hash_chunk_length = [chr(m >> 8), chr(m)]

"""
According to https://tools.ietf.org/html/rfc4880#section-5.2.4
the final bit of data is a trailer of six octets:

   V4 signatures also hash in a final trailer of six octets: the
   version of the Signature packet, i.e., 0x04; 0xFF; and a four-octet,
   big-endian number that is the length of the hashed data from the
   Signature packet (note that this number does not include these final
   six octets).

But in gnupg-2.1.11, we see the following in g10/sig-check.c:

410     else {
411     byte buf[6];
412     size_t n;
413     gcry_md_putc( digest, sig->pubkey_algo );
414     gcry_md_putc( digest, sig->digest_algo );
415     if( sig->hashed ) {
416         n = sig->hashed->len;
417             gcry_md_putc (digest, (n >> 8) );
418             gcry_md_putc (digest,  n       );
419         gcry_md_write (digest, sig->hashed->data, n);
420         n += 6;
421     }
422     else {
423       /* Two octets for the (empty) length of the hashed
424              section. */
425           gcry_md_putc (digest, 0);
426       gcry_md_putc (digest, 0);
427       n = 6;
428     }
429     /* add some magic per Section 5.2.4 of RFC 4880.  */
430     buf[0] = sig->version;
431     buf[1] = 0xff;
432     buf[2] = n >> 24;
433     buf[3] = n >> 16;
434     buf[4] = n >>  8;
435     buf[5] = n;
436     gcry_md_write( digest, buf, 6 );
437     }
438     gcry_md_final( digest );

Line 420 adds 6, so we'll do the same even though it
seems to disagree with the RFC.

"""

n = m + 6
assert n == 15

magic = ['x04', 'xff',
         chr(n >> 24),
         chr(n >> 16),
         chr(n >>  8),
         chr(n)]

for_digest = []

for_digest += header1
for_digest += pubkey_body

for_digest += header2
for_digest += subsig_body

for_digest += version_class_algos

for_digest += hash_chunk_length
for_digest += hashed_part_of_sig

for_digest += magic

# According to https://tools.ietf.org/html/rfc4880#section-9.4
# the hash algo 8 is SHA256.
assert 'x08' == signature_block[5]
digest = hashlib.sha256(to_str(for_digest)).digest()

assert 'A2 63' == to_hexdump(digest[:2])

We find that the first 16 bits of the digest is A2 63, matching the data in the signature packet, so the binding signature passes the first test.

The second test is to convert the digest to an MPI and verify it using the specified public key algorithm (an exercise for the reader since pygcrypt fails to install on my laptop 🙂

The RFC alone wasn’t enough for me to reconstruct the subkey hash check, for example the +6 quirk. I had to poke around in gpg 2.1.11 to see what was being used in the hash. For efficiency reasons, libgcrypt lets you push single characters or blocks of bytes to the hash buffer (gcry_md_putc and gcry_md_write; see the libgcrypt docs) so you can’t dump a contiguous block of memory to compare against for_digest). My hacky debugging (print statements) is in this patch. For some reason gpg2 --check-sigs 766D56F8! wasn’t exercising the signature checking code (cached somewhere!?) so on line 108 of my patch I had to force opt.no_sig_cache = 1;.

Enigmail and key fingerprints

So why doesn’t Enigmail let you choose which subkey is being used for encryption? As far as I can tell this is by design:

https://sourceforge.net/p/enigmail/forum/support/thread/37b7a5c8

Enigmail does not try to expose all possible features to end users. The goal of Enigmail is to be usable for everyone, including beginners. I think that the concept of subkeys is way too complex even for many average users. Most of them are already confused by public and secret keys.

You can use gpg.conf to configure the way you want to use subkeys. I will not implement specific features for this in Enigmail.

And then:

Unfortunately, this doesn’t work in this case. gpg is invoked by enigmail with the -u / –local-user argument, completely overriding my settings in gpg.conf. If you / enigmail invoked it with the –default-key argument, it would be a different story. But it does not.

If you would change the next enigmail update to use the –default-key argument instead of the -u argument, it would really help.

EDIT:

Ok, patched enigmail myself. It works as expected with –default-key instead of -u.

And then:

I have decided that I will not replace “-u” (or the equivalent “–local-user”) by “–default-key” in Enigmail. Here is why:

If a user specified local-user in gpg.conf, then use of “-u” in Enigmail will lead to the key being signed by both keys. This is what some users (especially companies) want, expect from Enigmail, and know that it’s been supported for the last 10 years. Using –default-key will break this; in other words, gpg.conf will “win” over Enigmail.

The requirement to a specific subkey for signing is by far less common, and average users don’t need to do this.

We can see what’s going on by using an old Unix trick: make a gpg2 “binary” that is a shell script and put it before the real gpg2 binary in the $PATH:

$ cat bin/gpg2
#!/bin/bash

echo $@ >> /tmp/gpg2.txt

/usr/bin/gpg2 `echo $@ | sed 's/-u 0x7679121C22964C12888893D1269E0DC4E3E4A5B8/-u 766D56F8!/g'`

The -u 766D56F8! parameter forcibly chooses that subkey (the exclamation mark is needed).

This trick is stupid, and potentially dangerous, since someone could convince you to sign a document with the encryption key instead of the signing key. So don’t do it! By default gpg2 uses the last encryption subkey for encryption.

Links

Source code for the snippets in this blog post: https://github.com/carlohamalainen/playground/tree/master/pgp/key_id

Anatomy of a gpg key

RFC 4880

Part 1/2: Email Encryption with the Yubikey-NEO, GPG and Linux

Part 2/2: Email Encryption with the Yubikey-NEO, GPG and Linux

Libgcrypt Reference Manual

GNU Privacy Guard

gnupg-2.1.11.tar.bz2 and signature