Better Nix setup for Spacemacs

In an earlier post I documented my setup for getting Spacemacs/Emacs to work with Nix. I’ve since found a much more elegant solution based on

No more Emacs packages for Nix and no need to defining functions that wrap executables in an invocation of nix-shell.

There’s a nice bonus too, with this setup I don’t need to run nix-shell, which always drops me at a bash prompt, instead I get a working setup in my shell of choice.

Setting up direnv

The steps for setting up direnv depends a bit on your setup, but luckily I found the official instructions for installing direnv to be very clear and easy to follow. There’s not much I can add to that.

Setting up Spacemacs

Since emacs-direnv isn’t included by default in Spacemacs I needed to do a bit of setup. I opted to create a layer for it, rather than just drop it in the list dotspacemacs-additional-packages. Yes, a little more complicated, but not difficult and I nurture an intention of submitting the layer for inclusion in Spacemacs itself at some point. I’ll see where that goes.

For now, I put the following in the file ~/.emacs.d/private/layers/direnv/packages.el:

(defconst direnv-packages
      '(direnv))

(defun direnv/init-direnv ()
  (use-package direnv
    :init
    (direnv-mode)))

Setting up the project folders

In each project folder I then add the file .envrc containing a single line:

use_nix

Then I either run direnv allow from the command line, or run the function direnv-allow after opening the folder in Emacs.

Using it

It’s as simple as moving into the folder in a shell – all required envvars are set up on entry and unset on exit.

In Emacs it’s just as simple, just open a file in a project and the envvars are set. When switching to a buffer outside the project the envvars are unset.

There is only one little caveat, nix-build doesn’t work inside a Nix shell. I found out that running

IN_NIX_SHELL= nix-build

does work though.

X-Ray and WAI

For a while we’ve been planning on introducing AWS X-Ray into our system at work. There’s official support for a few languages, but not too surprisingly Haskell isn’t on that list. I found freckle/aws-xray-client on GitHub, which is so unofficial that it isn’t even published on Hackage. While it looks very good, I suspect it does more than I need and since it lacks licensing information I decided to instead implement a version tailored to our needs.

As a first step I implemented a WAI middleware that wraps an HTTP request and reports the time it took to produce a response. Between the X-Ray Developer Guide and the code in Freckle’s git repo it turned out to be fairly simple.

First off, this is the first step towards X-Ray nirvana, so all I’m aiming for is minimal support. That means all I want is to send minimal X-Ray segments, with the small addition that I want to support parent_id from the start.

The first step then is to parse the HTTP header containing the X-Ray information – X-Amzn-Trace-Id. For now I’m only interested in two parts, Root and Parent, so for simplicity’s sake I use a tuple to keep them in. The idea is to take the header’s value, split on ; to get the parts, then split each part in two, a key and a value, and put them into an association list ([(Text, Text)]) for easy lookup using, well lookup.

parseXRayTraceIdHdr :: Text -> Maybe (Text, Maybe Text)
parseXRayTraceIdHdr hdr = do
  bits <- traverse parseHeaderComponent $ T.split (== ';') hdr
  traceId <- lookup "Root" bits
  let parent = lookup "Parent" bits
  pure (traceId, parent)

parseHeaderComponent :: Text -> Maybe (Text, Text)
parseHeaderComponent cmp = case T.split (== '=') cmp of
                            [name, value] -> Just (name, value)
                            _ -> Nothing

The start and end times for processing a request are also required. The docs say that using at least millisecond resolution is a good idea, so I decided to do exactly that. NominalDiffTime, which is what getPOSIXTime produces, supports a resolution of picoseconds (though I doubt my system’s clock does) which requires a bit of (type-based) converting.

mkTimeInMilli :: IO Milli
mkTimeInMilli = ndfToMilli <$> getPOSIXTime
  where
    ndfToMilli = fromRational . toRational

The last support function needed is one that creates the segment. Just building the JSON object, using aeson’s object, is enough at this point.

mkSegment :: Text -> Text -> Milli -> Milli -> (Text, Maybe Text) -> Value
mkSegment name id startTime endTime (root, parent) =
  object $ [ "name" .= name
           , "id" .= id
           , "trace_id" .= root
           , "start_time" .= startTime
           , "end_time" .= endTime
           ] <> p
  where
    p = maybe [] (\ v -> ["parent_id" .= v]) parent

Armed with all this, I can now put together a WAI middleware that

  1. records the start time of the call
  2. processes the request
  3. sends off the response and keeps the result of it
  4. records the end time
  5. parses the tracing header
  6. builds the segment prepended with the X-Ray daemon header
  7. sends the segment to the X-Ray daemon
traceId :: Text -> Middleware
traceId xrayName app req sendResponse = do
  startTime <- mkTimeInMilli
  app req $ \ res -> do
    rr <- sendResponse res
    endTime <- mkTimeInMilli
    theId <- T.pack . (\ v -> showHex v "") <$> randomIO @Word64
    let traceParts = (decodeUtf8 <$> requestHeaderTraceId req) >>= parseXRayTraceIdHdr
        segment = mkSegment xrayName theId startTime endTime <$> traceParts
    case segment of
      Nothing -> pure ()
      Just segment' -> sendXRayPayload $ toStrict $ prepareXRayPayload segment'
    pure rr

  where
    prepareXRayPayload segment =
      let header = object ["format" .= ("json" :: String), "version" .= (1 :: Int)]
      in encode header <> "\n" <> encode segment

    sendXRayPayload payload = do
      addrInfos <- S.getAddrInfo Nothing (Just "127.0.0.1") (Just "2000")
      case addrInfos of
        [] -> pure () -- silently skip
        (xrayAddr:_) -> do
          sock <- S.socket (S.addrFamily xrayAddr) S.Datagram S.defaultProtocol
          S.connect sock (S.addrAddress xrayAddr)
          sendAll sock payload
          S.close sock

The next step will be to instrument the actual processing. The service I’m instrumenting is asynchronous, so all the work happens after the response has been sent. My plan for this is to use subsegments to record it. That means I’ll have to

  • keep the Root and ID (theId in traceId above) for use in subsegments
  • keep the original tracing header, for use in outgoing calls
  • make sure all outgoing HTTP calls include a tracing header with a proper Parent
  • wrap all outgoing HTTP calls with time keeping and sending a subsegment to the X-Ray daemon

I’m saving that work for a rainy day though, or rather, for a day when I’m so upset at Clojure that I don’t want to see another parenthesis.

Edit (2020-04-10): Corrected the segment field name for the parent ID, it should be parent_id.

My ghcide build for Nix

I was slightly disappointed to find out that not all packages on Hackage that are marked as present in Nix(pkgs) actually are available. Quite a few of them are marked broken and hence not installable. One of these packages is ghcide.

There are of course expressions available for getting a working ghcide executable installed, like ghcide-nix. However, since I have rather simple needs for my Haskell projects I thought I’d play with my own approach to it.

What I care about is:

  1. availability of the development tools I use, at the moment it’s mainly ghcide but I’m planning on making use of ormolu in the near future
  2. pre-built packages
  3. ease of use

So, I put together ghcide-for-nix. It’s basically just a constumized Nixpkgs where the packages needed to un-break ghcide are present.

Usage is a simple import away:

import (builtins.fetchGit {
  name = "ghcide-for-nix";
  url = https://github.com/magthe/ghcide-for-nix;
  rev = "927a8caa62cece60d9d66dbdfc62b7738d61d75f";
}) 

and it’ll give you a superset of Nixpkgs. Pre-built packages are available on Cachix.

It’s not sophisticated, but it’s rather easy to use and suffices for my purposes.

Nix setup for Spacemacs

Edit 2020-06-22: I’ve since found a better setup for this.

When using ghcide and LSP, as I wrote about in my post on Haskell, ghcide, and Spacemacs, I found myself ending up recompiling a little too often. This pushed me to finally start looking at Nix. After a bit of a fight I managed to get ghcide from Nix, which brought me the issue of setting up Spacemacs. Inspired by a gist from Samuel Evans-Powell and a guide to setting up an environment for Reflex by Thales Macedo Garitezi I ended up with the following setup:

(defun dotspacemacs/layers ()
  (setq-default
   ...
   dotspacemacs-additional-packages
   '(
     nix-sandbox
     nix-haskell-mode
     ...
     )
   ...
   ))
(defun dotspacemacs/user-config ()
  ...
  (add-hook 'haskell-mode-hook #'lsp)
  (add-hook 'haskell-mode-hook 'nix-haskell-mode)
  (add-hook 'haskell-mode-hook
            (lambda ()
              (setq-local flycheck-executable-find
                          (lambda (cmd)
                            (nix-executable-find (nix-current-sandbox) cmd)))
              (setq-local flycheck-command-wrapper-function
                          (lambda (argv)
                            (apply 'nix-shell-command (nix-current-sandbox) argv)))
              (setq-local haskell-process-wrapper-function
                          (lambda (argv)
                            (apply 'nix-shell-command (nix-current-sandbox) argv)))
              (setq-local lsp-haskell-process-wrapper-function
                          (lambda (argv)
                            `("nix-shell" "-I" "." "--command" "ghcide --lsp" ,(nix-current-sandbox))))))
  (add-hook 'haskell-mode-hook
            (lambda ()
              (flycheck-add-next-checker 'lsp-ui '(warning . haskell-stack-ghc))))
  ...
  )

It seems to work, but please let me know if you have suggestions for improvements.

Populating Projectile's cache

As I track the develop branch of Spacemacs I occasionally clean out my cache of projects known to Projectile. Every time it takes a while before I’m back at a stage where I very rarely have to visit something that isn’t already in the cache.

However, today I found the function projectile-add-known-project, which prompted me to write the following function that’ll help me quickly re-building the cache the next time I need to reset Spacemacs to a known state again.

(defun projectile-extra-add-projects-in-subfolders (projects-root)
  (interactive (list (read-directory-name "Add to known projects: ")))
  (message "Searching for projects in %s..." projects-root)
  (let ((dirs (seq-map 'file-name-directory (directory-files-recursively projects-root "^.git$" t))))
    (seq-do 'projectile-add-known-project dirs)
    (message "Added %d projects" (length dirs))))

Ditaa in Org mode

Just found out that Emacs ships with Babel support for ditaa (yes, I’m late to the party).

Sweet! That is yet another argument for converting all our README.mds into README.orgs at work.

Dr. Evil 

The changes I made to my Spacemacs config are

(defun dotspacemacs/user-config ()
  ...
  (with-eval-after-load 'org
    ...
    (add-to-list 'org-babel-load-languages '(ditaa . t))
    (setq org-ditaa-jar-path "/usr/share/java/ditaa/ditaa-0.11.jar"))
  ...)

Haskell, ghcide, and Spacemacs

The other day I read Chris Penner’s post on Haskell IDE Support and thought I’d make an attempt to use it with Spacemacs.

After running stack build hie-bios ghcide haskell-lsp --copy-compiler-tool I had a look at the instructions on using haskell-ide-engine with Spacemacs. After a bit of trial and error I came up with these changes to my ~/.spacemacs:

(defun dotspacemacs/layers ()
  (setq-default
   dotspacemacs-configuration-layers
   '(
    ...
    lsp
    (haskell :variables
             haskell-completion-backend 'lsp
             )
    ...)
  )
)
(defun dotspacemacs/user-config ()
  (setq lsp-haskell-process-args-hie '("exec" "ghcide" "--" "--lsp")
        lsp-haskell-process-path-hie "stack"
        lsp-haskell-process-wrapper-function (lambda (argv) (cons (car argv) (cddr argv)))
        )
  (add-hook 'haskell-mode-hook
            #'lsp)
)

The slightly weird looking lsp-haskell-process-wrapper-function is removing the pesky --lsp inserted by this line.

That seems to work. Though I have to say I’m not ready to switch from intero just yet. Two things in particular didn’t work with ghcide/LSP:

  1. Switching from one the Main.hs in one executable to the Main.hs of another executable in the same project didn’t work as expected – I had hints and types in the first, but nothing in the second.
  2. Jump to the definition of a function defined in the package didn’t work – I’m not willing to use GNU GLOBAL or some other source tagging system.

Nested tmux

I’ve finally gotten around to sorting out running nested tmux instances. I found the base for the configuration in the article Tmux in practice: local and nested remote tmux sessions, which links a few other related resources.

What I ended up with was this:

# Toggle tmux keybindings on/off, for use with inner tmux
# https://is.gd/slxE45
bind -T root F12  \
  set prefix None \;\
  set key-table off \;\
  set status-left "#[fg=black,bg=blue,bold] OFF " \;\
  refresh-client -S

bind -T off F12 \
  set -u prefix \;\
  set -u key-table \;\
  set -u status-left \;\
  refresh-client -S

It’s slightly simpler than what’s in the article above, but it works and it fits rather nicely with the nord theme.

Hedgehog on a REST API, part 3

In my previous post on using Hedgehog on a REST API, Hedgehog on a REST API, part 2 I ran the test a few times and adjusted the model to deal with the incorrect assumptions I had initially made. In particular, I had to adjust how I modelled the User ID. Because of the simplicity of the API that wasn’t too difficult. However, that kind of completely predictable ID isn’t found in all APIs. In fact, it’s not uncommon to have completely random IDs in API (often they are UUIDs).

So, I set out to try to deal with that. I’m still using the simple API from the previous posts, but this time I’m pretending that I can’t build the ID into the model myself, or, put another way, I’m capturing the ID from the responses.

The model state

When capturing the ID it’s no longer possible to use a simple Map Int Text for the state, because I don’t actually have the ID until I have an HTTP response. However, the ID is playing an important role in the constructing of a sequence of actions. The trick is to use Var Int v instead of an ordinary Int. As I understand it, and I believe that’s a good enough understanding to make use of Hedgehog possible, is that this way the ID is an opaque blob in the construction phase, and it’s turned into a concrete value during execution. When in the opaque state it implements enough type classes to be useful for my purposes.

newtype State (v :: * -> *)= State (M.Map (Var Int v) Text)
  deriving (Eq, Show)

The API calls: add user

When taking a closer look at the Callback type not all the callbacks will get the state in the same form, opaque or concrete, and one of them, Update actually receives the state in both states depending on the phase of execution. This has the most impact on the add user action. To deal with it there’s a need to rearrange the code a bit, to be specific, commandExecute can no longer return a tuple of both the ID and the status of the HTTP response because the update function can’t reach into the tuple, which it needs to update the state.

That means the commandExecute function will have to do tests too. It is nice to keep all tests in the callbacks, but by sticking a MonadTest m constraint on the commandExecute it turns into a nice solution anyway.

addUser :: (MonadGen n, MonadIO m, MonadTest m) => Command n m State
addUser = Command gen exec [ Update u
                           ]
  where
    gen _ = Just $ AddUser <$> Gen.text (Range.linear 0 42) Gen.alpha

    exec (AddUser n) = do
      (s, ui) <- liftIO $ do
        mgr <- newManager defaultManagerSettings
        addReq <- parseRequest "POST http://localhost:3000/users"
        let addReq' = addReq { requestBody = RequestBodyLBS (encode $ User 0 n)}
        addResp <- httpLbs addReq' mgr
        let user = decode (responseBody addResp) :: Maybe User
        return (responseStatus addResp, user)
      status201 === s
      assert $ isJust ui
      (userName <$> ui) === Just n
      return $ userId $ fromJust ui

    u (State m) (AddUser n) o = State (M.insert o n m)

I found that once I’d come around to folding the Ensure callback into the commandExecute function the rest fell out from the types.

The API calls: delete user

The other actions, deleting a user and getting a user, required only minor changes and the changes were rather similar in both cases.

Not the type for the action needs to take a Var Int v instead of just a plain Int.

newtype DeleteUser (v :: * -> *) = DeleteUser (Var Int v)
  deriving (Eq, Show)

Which in turn affect the implementation of HTraversable

instance HTraversable DeleteUser where
  htraverse f (DeleteUser vi) = DeleteUser <$> htraverse f vi

Then the changes to the Command mostly comprise use of concrete in places where the real ID is needed.

deleteUser :: (MonadGen n, MonadIO m) => Command n m State
deleteUser = Command gen exec [ Update u
                              , Require r
                              , Ensure e
                              ]
  where
    gen (State m) = case M.keys m of
      [] -> Nothing
      ks -> Just $ DeleteUser <$> Gen.element ks

    exec (DeleteUser vi) = liftIO $ do
      mgr <- newManager defaultManagerSettings
      delReq <- parseRequest $ "DELETE http://localhost:3000/users/" ++ show (concrete vi)
      delResp <- httpNoBody delReq mgr
      return $ responseStatus delResp

    u (State m) (DeleteUser i) _ = State $ M.delete i m

    r (State m) (DeleteUser i) = i `elem` M.keys m

    e _ _ (DeleteUser _) r = r === status200

Conclusion

This post concludes my playing around with state machines in Hedgehog for this time. I certainly hope I find the time to put it to use on some larger API soon. In particular I’d love to put it to use at work; I think it’d be an excellent addition to the integration tests we currently have.

Architecture of a service

Early this summer it was finally time to put this one service I’ve been working on into our sandbox environment. It’s been running without hickups so last week I turned it on for production as well. In this post I thought I’d document the how and why of the service in the hope that someone will find it useful.

The service functions as an interface to external SMS-sending services, offering a single place to change if we find that we are unhappy with the service we’re using.1 This service replaces an older one, written in Ruby and no one really dares touch it. Hopefully the Haskell version will prove to be a joy to work with over time.

Overview of the architecture

The service is split into two parts, one web server using scotty, and streaming data processing using conduit. Persistent storage is provided by a PostgreSQL database. The general idea is that events are picked up from the database, acted upon, which in turn results in other events which written to the database. Those are then picked up and round and round we go. The web service accepts requests, turns them into events and writes the to the database.

Hopefully this crude diagram clarifies it somewhat.

Diagram of the service architecture

There are a few things that might need some explanation

  • In the past we’ve wanted to have the option to use multiple external SMS services at the same time. One is randomly chosen as the request comes in. There’s also a possibility to configure the frequency for each external service.

    Picker implements the random picking and I’ve written about that earlier in Choosing a conduit randomly.

    Success and fail are dummy senders. They don’t actually send anything, and the former succeeds at it while the latter fails. I found them useful for manual testing.

  • Successfully sending off a request to an external SMS service, getting status 200 back, doesn’t actually mean that the SMS has been sent, or even that it ever will be. Due to the nature of SMS messaging there are no guarantees of timeliness at all. Since we are interested in finding out whether an SMS actually is sent a delayed action is scheduled, which will fetch the status of a sent SMS after a certain time (currently 2 minutes). If an SMS hasn’t been sent after that time it might as well never be – it’s too slow for our end-users.

    This is what report-fetcher and fetcher-func do.

  • The queue sink and queue src are actually sourceTQueue and sinkTQueue. Splitting the stream like that makes it trivial to push in events by using writeTQueue.

  • I use sequenceConduits in order to send a single event to multiple Conduits and then combine all their results back into a single stream. The ease with which this can be done in conduit is one of the main reasons why I choose to use it.2

Effects and tests

I started out writing everything based on a type like ReaderT <my cfg type> IO and using liftIO for effects that needed lifting. This worked nicely while I was setting up the basic structure of the service, but as soon as I hooked in the database I really wanted to do some testing also of the effectful code.

After reading Introduction to Tagless Final and The ReaderT Design Patter, playing a bit with both approaches, and writing Tagless final and Scotty and The ReaderT design pattern or tagless final?, I finally chose to go down the route of tagless final. There’s no strong reason for that decision, maybe it was just because I read about it first and found it very easy to move in that direction in small steps.

There’s a split between property tests and unit tests:

  • Data types, their monad instances (like JSON (de-)serialisation), pure functions and a few effects are tested using properties. I’m using QuickCheck for that. I’ve since looked a little closer at hedgehog and if I were to do a major overhaul of the property tests I might be tempted to rewrite them using that library instead.

  • Most of the Conduits are tested using HUnit.

Configuration

The service will be run in a container and we try to follow the 12-factor app rules, where the third one says that configuration should be stored in the environment. All previous Haskell projects I’ve worked on have been command line tools were configuration is done (mostly) using command line argument. For that I usually use optparse-applicative, but it’s not applicable in this setting.

After a bit of searching on hackage I settled on etc. It turned out to be nice an easy to work with. The configuration is written in JSON and only specifies environment variables. It’s then embedded in the executable using file-embed. The only thing I miss is a ToJSON instance for Config – we’ve found it quite useful to log the active configuration when starting a service and that log entry would become a bit nicer if the message was JSON rather than the (somewhat difficult to read) string that Config’s Show instance produces.

Logging

There are two requirements we have when it comes to logging

  1. All log entries tied to a request should have a correlation ID.
  2. Log requests and responses

I’ve written about correlation ID before, Using a configuration in Scotty.

Logging requests and responses is an area where I’m not very happy with scotty. It feels natural to solve it using middleware (i.e. using middleware) but the representation, especially of responses, is a bit complicated so for the time being I’ve skipped logging the body of both. I’d be most interested to hear of libraries that could make that easier.

Data storage and picking up new events

The data stream processing depends heavily on being able to pick up when new events are written to the database. Especially when there are more than one instance running (we usually have at least two instance running in the production environment). To get that working I’ve used postgresql-simple’s support for LISTEN and NOTIFY via the function getNotification.

When I wrote about this earlier, Conduit and PostgreSQL I got some really good feedback that made my solution more robust.

Delayed actions

Some things in Haskell feel almost like cheating. The light-weight threading makes me confident that a forkIO followed by a threadDelay (or in my case, the ones from unliftio) will suffice.


  1. It has happened in the past that we’ve changed SMS service after finding that they weren’t living up to our expectations.↩︎

  2. A while ago I was experimenting with other streaming libraries, but I gave up on getting re-combination to work – Zipping streams↩︎