Systemd for auto deploy from AWS

Over the last week I’ve completely rebuilt the only on-premise system we have at work. One of the bits I’m rather happy with is the replacement of M/Monit with plain systemd. We didn’t use any of the advanced features of M/Monit; we only wanted ordinary process monitoring and restart on a single system. It really was a bit of overkill.

So, instead of

  • using M/Monit to monitor processes
  • a monit configuration for the app (both start and stop instructions)
  • a script to start the app and capture its PID in a file
  • a script to resync the app against the S3 bucket where Travis puts the build artifacts, and if a new version has arrived, remove the PID file thereby triggering M/Monit to restart the app
  • a crontab entry to run the sync/restart script every few minutes

we now have

  • a (simplified) script to start the app1
  • a service unit (app.service) for the app
  • a timer unit (app-sync.timer) to trigger the resync of the app against the S3 bucket
  • a oneshot service unit (app-sync.service), triggered by the timer, to perform the sync of the app with the latest build, i.e. call aws s3 sync
  • a path unit (app-restart.path) to monitor one of the build artifacts, i.e. to pick up that a new version has arrived
  • a oneshot service unit (app-restart.service), triggered by the path unit, calling systemctl restart app.service
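Sketches of what those units might contain — the unit names come from the list above, but every path, bucket name, and interval here is a guess on my part:

```ini
# app-sync.timer -- trigger the sync every few minutes (interval is a guess)
[Unit]
Description=Timer for syncing app against S3

[Timer]
OnCalendar=*:0/5
Unit=app-sync.service

[Install]
WantedBy=timers.target

# app-sync.service -- oneshot sync against the S3 bucket (bucket and paths hypothetical)
[Unit]
Description=Sync app against the S3 bucket

[Service]
Type=oneshot
ExecStart=/usr/bin/aws s3 sync s3://example-build-bucket/app /opt/app

# app-restart.path -- watch one of the build artifacts (artifact name hypothetical)
[Unit]
Description=Pick up that a new version of app has arrived

[Path]
PathChanged=/opt/app/app.jar
Unit=app-restart.service

[Install]
WantedBy=multi-user.target

# app-restart.service -- oneshot restart of the app
[Unit]
Description=Restart app

[Service]
Type=oneshot
ExecStart=/usr/bin/systemctl restart app.service
```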

There’s one more piece to the setup than before, but

  • the start script is simplified since it no longer needs to push things to the background and capture the PID
  • the sync/restart script is completely gone (arguably the more complicated of the two scripts in the M/Monit setup)
  • responsibility is cleanly separated leading to a solution that’s easier to understand (as long as you know a bit about systemd of course)

so I think it’s a net improvement.

  1. All it does is set a bunch of environment variables and then start the app, so I’m planning on moving the environment variables into a separate file and putting the start command in the service unit file instead.
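That planned change could look something like this (file locations are hypothetical):

```ini
# app.service -- sketch of the footnote's plan: environment variables in a
# separate file, start command in the unit itself
[Unit]
Description=The app

[Service]
EnvironmentFile=/etc/app/app.env
ExecStart=/opt/app/bin/app
Restart=on-failure

[Install]
WantedBy=multi-user.target
```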

Components feel so not FP

At work we’re writing most of the new stuff using Clojure. That’s been going on since before I started working there, and from the beginning there’s been exploration of style and libraries, yes, even of frameworks (ugh!). Now there’s discussion of standardising. The discussion is still in its infancy, but it prompted me to start thinking about what I’ve come across in the code base we have now, what I like and what I dislike. The first thing that came to mind was how our different services are set up. I mean set up internally. Like, what’s actually in the main function (or in -main, since we’re talking Clojure). In many services we use Stuart Sierra’s component, in a growing number of services we use integrant, and in one we use mount.

The current discussion is going in the direction of integrant, and I don’t like it!

AFAICS integrant suffers from all the same things as component (the author of mount has put it into words better than I ever could in his text on mount’s differences from component) plus one more thing to boot: systems are “configured into being”. It’s touted as “data-driven architecture”; I tend to see it as architecture defined in a language separate from my functions. The integrant README says that one of its strengths over component is that

In Integrant, systems are created from a configuration data structure, typically loaded from an edn resource. The architecture of the application is defined through data, rather than code.

Somehow that statement makes me think of OO and DI frameworks. I suspect the above paragraph is just as true after the two replacements s/Integrant/Spring/ and s/edn/XML/. I’m not convinced this is a strength at all! My experience with DI frameworks in OO is limited (I’ve never used external configuration), but the enduring impression is that they’re unwieldy. In particular, the distance between the cause of an error and its effect was often very large. So far this is true for integrant as well.

Also, it makes me think of “functional in the small, OO in the large”1, which is a comment coming out of the F# world. Maybe there’s a connection here. Maybe “OO in the large” is something that resonates with OO-turned-FP developers. Maybe that means it’s only a question of time (and exposure to FP) before they embrace “functional all the way”? Or, maybe I’m simply missing something crucial.

In any case I’m going to have to take a closer look at mount in the near future. I’ll also have to take a look at its brother yurt, and at its distant cousin mount-lite.


Is this a good way to do JSON validation?

At work, where we use Clojure, we’ve been improving our error messages in the public API to

  1. return as many errors as possible in a response, and
  2. be in humanly readable English.

If one adopts spec, as we have, one gets the former for free, but the output of spec can hardly be called humanly readable. For the latter part we chose to use phrase.

Given that I’d like to see Haskell used more at work (currently there are 2 minor services written in Haskell and around a score in Clojure) I thought I’d take a look at JSON validation in Haskell. I ended up being less than impressed. We have at least one great library for parsing JSON, aeson, but there are probably a few more that I haven’t noticed. It’s of course possible to mix validation into the parsing, but parsers, and this is true for aeson’s parser too, tend to be monads, which means that item 1 above, finding as many errors as possible, isn’t on the table.

A quick look at Hackage gave that

  • there is a package called aeson-better-errors that looked promising but didn’t fit my needs (I explain at the end why it isn’t passing muster)
  • the support for JSON Schema is very lacking in Haskell: hjsonschema is deprecated, and aeson-schema only supports version 3 of the draft (the current version is 7) while its authors state that hjsonschema is more modern and more actively maintained

So, a bit disappointed I started playing with the problem myself and found that, just as is stated in the description of the validation library, I want something that’s isomorphic to Either but accumulates on the error side. That is, something like
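The original code isn’t reproduced here, so the following is a sketch under names of my own choosing: a result type isomorphic to Either (NonEmpty String) (), but whose Semigroup instance accumulates on the error side instead of short-circuiting.

```haskell
import Data.List.NonEmpty (NonEmpty ((:|)))

-- Either-like, but <> accumulates errors rather than stopping at the first
data JSONValidationResult
  = ValidationOK
  | ValidationError (NonEmpty String)
  deriving (Eq, Show)

instance Semigroup JSONValidationResult where
  ValidationOK      <> r                 = r
  l                 <> ValidationOK      = l
  ValidationError a <> ValidationError b = ValidationError (a <> b)
```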

I decided it was all right to limit validation to proper JSON expressions, i.e. a validator could have the type Value -> JSONValidationResult. I want to combine validators so I decided to wrap it in a newtype and write a Semigroup instance for it as well:
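Again a sketch, not the post’s own code: a tiny stand-in replaces aeson’s Value so the fragment is self-contained, and the result type from above is repeated for the same reason.

```haskell
import Data.List.NonEmpty (NonEmpty ((:|)))

-- Minimal stand-in for aeson's Value (aeson itself not assumed here)
data Value = VNull | VBool Bool | VString String | VNumber Double
  deriving (Eq, Show)

-- (result type repeated from above so this compiles on its own)
data JSONValidationResult
  = ValidationOK
  | ValidationError (NonEmpty String)
  deriving (Eq, Show)

instance Semigroup JSONValidationResult where
  ValidationOK      <> r                 = r
  l                 <> ValidationOK      = l
  ValidationError a <> ValidationError b = ValidationError (a <> b)

-- A validator is a function from a JSON value to a result; the newtype's
-- Semigroup instance runs both validators on the same value and combines
-- the outcomes
newtype JSONValidator = JSONValidator (Value -> JSONValidationResult)

instance Semigroup JSONValidator where
  JSONValidator f <> JSONValidator g = JSONValidator (\v -> f v <> g v)
```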

The function to actually run the validation is rather straightforward

After writing a few validators I realised a few patterns emerged and the following functions simplified things a bit:

With this in place I started writing validators for the basic JSON types:

The number type in JSON is a float (well, in aeson it’s a Scientific), so to check for an integer a bit more than the above is needed

as well as functions that check for the presence of a specific key
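Pulling the pieces from the last few paragraphs together in one self-contained sketch (again with my own names, and with aeson’s Value replaced by a stand-in; in aeson the number would be a Scientific rather than a Double):

```haskell
import Data.List.NonEmpty (NonEmpty ((:|)))

data Value
  = VNull
  | VBool Bool
  | VString String
  | VNumber Double                 -- stand-in for aeson's Scientific
  | VObject [(String, Value)]
  deriving (Eq, Show)

data JSONValidationResult = ValidationOK | ValidationError (NonEmpty String)
  deriving (Eq, Show)

instance Semigroup JSONValidationResult where
  ValidationOK      <> r                 = r
  l                 <> ValidationOK      = l
  ValidationError a <> ValidationError b = ValidationError (a <> b)

newtype JSONValidator = JSONValidator (Value -> JSONValidationResult)

instance Semigroup JSONValidator where
  JSONValidator f <> JSONValidator g = JSONValidator (\v -> f v <> g v)

-- Running a validator is just unwrapping the newtype
runValidator :: JSONValidator -> Value -> JSONValidationResult
runValidator (JSONValidator f) = f

-- Helper: a single error message
invalid :: String -> JSONValidationResult
invalid msg = ValidationError (msg :| [])

-- Validators for basic JSON types
validString :: JSONValidator
validString = JSONValidator $ \v -> case v of
  VString _ -> ValidationOK
  _         -> invalid "expected a string"

-- JSON numbers are floats, so checking for an integer needs a little more
validInt :: JSONValidator
validInt = JSONValidator $ \v -> case v of
  VNumber n | fromIntegral (round n :: Integer) == n -> ValidationOK
  VNumber _ -> invalid "expected an integer"
  _         -> invalid "expected a number"

-- Check for the presence of a key, and validate its value
withKey :: String -> JSONValidator -> JSONValidator
withKey k val = JSONValidator $ \v -> case v of
  VObject kvs ->
    maybe (invalid ("missing key: " ++ k)) (runValidator val) (lookup k kvs)
  _ -> invalid "expected an object"
```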

With this in place I can now create a validator for a person with a name and an age:

and run it on a Value:

and all failures are picked up
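The person example, sketched end to end with the same hypothetical definitions (repeated so this fragment stands on its own); running it on a value with a misspelled name key and a fractional age reports both problems at once:

```haskell
import Data.List.NonEmpty (NonEmpty ((:|)), toList)

-- (definitions repeated from the sketches above)
data Value = VNull | VString String | VNumber Double | VObject [(String, Value)]
  deriving (Eq, Show)

data JSONValidationResult = ValidationOK | ValidationError (NonEmpty String)
  deriving (Eq, Show)

instance Semigroup JSONValidationResult where
  ValidationOK      <> r                 = r
  l                 <> ValidationOK      = l
  ValidationError a <> ValidationError b = ValidationError (a <> b)

newtype JSONValidator = JSONValidator (Value -> JSONValidationResult)

instance Semigroup JSONValidator where
  JSONValidator f <> JSONValidator g = JSONValidator (\v -> f v <> g v)

runValidator :: JSONValidator -> Value -> JSONValidationResult
runValidator (JSONValidator f) = f

invalid :: String -> JSONValidationResult
invalid m = ValidationError (m :| [])

validString, validInt :: JSONValidator
validString = JSONValidator $ \v -> case v of
  VString _ -> ValidationOK
  _         -> invalid "expected a string"
validInt = JSONValidator $ \v -> case v of
  VNumber n | fromIntegral (round n :: Integer) == n -> ValidationOK
  _ -> invalid "expected an integer"

withKey :: String -> JSONValidator -> JSONValidator
withKey k val = JSONValidator $ \v -> case v of
  VObject kvs ->
    maybe (invalid ("missing key: " ++ k)) (runValidator val) (lookup k kvs)
  _ -> invalid "expected an object"

-- A person must have a string name and an integer age
vPerson :: JSONValidator
vPerson = withKey "name" validString <> withKey "age" validInt

-- All failures are accumulated, not just the first one
demo :: [String]
demo = case runValidator vPerson
              (VObject [("nam", VString "Alice"), ("age", VNumber 2.5)]) of
  ValidationOK       -> []
  ValidationError es -> toList es
```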


  1. I quickly realised I wanted slightly more complex validation of course, so all the validators for basic JSON types above have a version taking a custom validator of type a -> JSONValidationResult (where a is the Haskell type contained in the particular Value).
  2. I started out thinking that I want an Applicative for my validations, but slowly I relaxed that to Semigroup. I’m still not sure about this decision, because I can see a real use of or, which I don’t really have now. Maybe that means I should switch back towards Applicative, just so I can implement an Alternative instance for validators.
  3. Well, I simply don’t know if this is even a good way to implement validators. I’d love to hear suggestions both for improvements and for completely different ways of tackling the problems.
  4. I would love to find out that there already is a library that does all this in a much better way. Please point me in its direction!

Appendix: A look at aeson-better-errors

The issue with aeson-better-errors is easiest to illustrate using the same example as in its announcement:
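The example itself isn’t reproduced here; from memory it looks roughly like the following, so treat the details as an unverified reconstruction of the announcement’s code:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson.BetterErrors (Parse, asIntegral, asString, key, parse)

data Person = Person String Int
  deriving (Show)

-- Parse an object with a "name" and an "age" key
asPerson :: Parse e Person
asPerson = Person <$> key "name" asString <*> key "age" asIntegral
```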

and with this loaded in GHCi (and make sure to either pass -XOverloadedStrings on the command line, or :set -XOverloadedStrings in GHCi itself)

*> parse asPerson "{\"name\": \"Alice\", \"age\": 32}"
Right (Person "Alice" 32)
*> parse asPerson "{\"name\": \"Alice\"}"
Left (BadSchema [] (KeyMissing "age"))
*> parse asPerson "{\"nam\": \"Alice\"}"
Left (BadSchema [] (KeyMissing "name"))

Clearly aeson-better-errors isn’t fulfilling the bit about reporting as many errors as possible. Something I would have realised right away had I read its API reference on Hackage a bit more carefully: the parser type ParseT is an instance of Monad!

Zipping streams

Writing the following is easy after glancing through the documentation for conduit:
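The snippet isn’t included here; a sketch of the kind of thing I mean, using conduit’s ZipSource applicative to pair up two streams (details reconstructed, not the post’s own code):

```haskell
import Conduit
import Data.Conduit (ZipSource (..), getZipSource)

-- Zip an infinite stream of numbers with a finite stream of characters;
-- the zipped source ends when the shorter one does
zipped :: Monad m => ConduitT () (Int, Char) m ()
zipped = getZipSource $
  (,) <$> ZipSource (yieldMany [1 ..]) <*> ZipSource (yieldMany "abc")
```

Printing the pairs is then just `runConduit (zipped .| mapM_C print)`.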

Neither pipes nor streaming makes it as easy to figure out. I must be missing something! What functions should I be looking at?

Picking up new rows in a DB

I’ve been playing around a little with postgresql-simple in combination with streaming for picking up inserted rows in a DB table. What I came up with was

It seems almost too simple so I’m suspecting I missed something.

  • Am I missing something?
  • Should I be using streaming, or is pipes or conduit better (in case they are, in what way)?

Using a configuration in Scotty

At work we’re only now getting around to putting correlation IDs to use. We write most of our code in Clojure, but since I’d really like to use more Haskell at work I thought I’d dive into Scotty and see how to deal with logging, and in particular how to get correlation IDs into the logs.

The types

For the configuration I decided to use the reader monad inside ActionT from Scotty. Enter Chell:

In order to run it I wrote a function corresponding to scotty:

Correlation ID

To deal with the correlation ID each incoming request should be checked for the HTTP header X-Correlation-Id and if present it should be used during logging. If no such header is present then a new correlation ID should be created. Since it’s per request it feels natural to create a WAI middleware for this.

The easiest way I could come up with was to push the correlation ID into the request’s headers before it’s passed on:

It also turns out to be useful to have both a default correlation ID and a function for pulling it out of the headers:

Getting the correlation ID into the configuration

Since the correlation ID should be picked out of the request on handling of every request it’s useful to have it in the configuration when running the ChellActionM actions. However, since the correlation ID isn’t available when running the reader (the call to runReaderT in chell) something else is called for. When looking around I found local (and later I was pointed to the more general withReaderT) but it doesn’t have a suitable type. After some help on Twitter I arrived at withConfig, which allows me to run an action in a modified configuration:
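The actual withConfig works on Scotty’s ActionT, which isn’t shown here; the underlying trick is the same shape as withReaderT, and can be sketched on a plain ReaderT (transformers only, Scotty not assumed):

```haskell
import Control.Monad.Trans.Reader (ReaderT (..), ask, runReaderT)

-- Run an action in a configuration obtained by applying f to the current
-- one; this is exactly withReaderT, written out
withConfig :: (c -> c') -> ReaderT c' m a -> ReaderT c m a
withConfig f a = ReaderT (runReaderT a . f)
```

In the Chell setting this is what lets the correlation ID pulled out of the request be pushed into the configuration before the handler’s actions run.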

Making it handy to use

Armed with this I can put together some functions to replace Scotty’s get, post, etc. With a configuration type like this:

The modified get looks like this (Scotty’s original is S.get)

With this in place I can use the simpler ReaderT Config IO for inner functions that need to log.

QuickCheck on a REST API

Since I’m working with web stuff nowadays I thought I’d play a little with translating my old post on using QuickCheck to test C APIs to the web.

The goal and how to reach it

I want to use QuickCheck to test a REST API, just like in the case of the C API the idea is to

  1. generate a sequence of API calls (a program), then
  2. run the sequence against a model, as well as
  3. run the sequence against the web service, and finally
  4. compare the resulting model against reality.


I’ll use a small web service I’m working on, and then concentrate on only a small part of the API to begin with.

The parts of the API I’ll use for the programs at this stage are

Method Route Example in Example out
POST /users {"userId": 0, "userName": "Yogi Berra"} {"userId": 42, "userName": "Yogi Berra"}
DELETE /users/:id

The following API calls will also be used, but not in the programs

Method Route Example in Example out
GET /users [0,3,7]
GET /users/:id {"userId": 42, "userName": "Yogi Berra"}
POST /reset

Representing API calls

Given the information about the API above it seems the following is enough to represent the two calls of interest together with a constructor representing the end of a program

and a program is just a sequence of calls, so list of ApiCall will do. However, since I want to generate sequences of calls, i.e. implement Arbitrary, I’ll wrap it in a newtype
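Something along these lines — the constructor names are my own, since the original definitions aren’t shown:

```haskell
-- The two calls of interest, plus a marker for the end of a program
data ApiCall
  = AddUser String      -- POST /users
  | DeleteUser Int      -- DELETE /users/:id
  | EndProgram
  deriving (Eq, Show)

-- A program is a sequence of calls, wrapped so it can get its own
-- Arbitrary instance later
newtype Program = Program [ApiCall]
  deriving (Eq, Show)
```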

Running against a model (simulation)

First of all I need to decide what model to use. Based on the part of the API I’m using I’ll use an ordinary dictionary of Int and Text

Simulating execution of a program is simulating each call against a model that’s updated with each step. I expect the final model to correspond to the state of the real service after the program is run for real. The simulation begins with an empty dictionary.

The simulation of the API calls must then be a function taking a model and a call, returning an updated model

Here I have to make a few assumptions. First, I assume the indices for the users start at 1. Second, that the next index used is always the successor of the highest currently used index. We’ll see how well this holds up to reality later on.
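A sketch of the model and the simulation step under exactly those two assumptions (names are mine; the model is a Map from user id to user name):

```haskell
import qualified Data.Map.Strict as M

data ApiCall = AddUser String | DeleteUser Int | EndProgram
  deriving (Eq, Show)

type Model = M.Map Int String

-- Assumptions from the text: ids start at 1, and the next id is the
-- successor of the highest id currently in use
simulateCall :: Model -> ApiCall -> Model
simulateCall m (AddUser name)  = M.insert nextId name m
  where nextId = maybe 1 ((+ 1) . fst) (M.lookupMax m)
simulateCall m (DeleteUser i)  = M.delete i m
simulateCall m EndProgram      = m

-- The simulation begins with an empty dictionary
simulateProgram :: [ApiCall] -> Model
simulateProgram = foldl simulateCall M.empty
```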

Running against the web service

Running the program against the actual web service follows the same pattern, but here I’m dealing with the real world, so it’s a little more messy, i.e. IO is involved. First the running of a single call

The running of a program is slightly more involved. Of course I have to set up the Manager needed for the HTTP calls, but I also need to

  1. ensure that the web service is in a well-known state before starting, and
  2. extract the state of the web service after running the program, so I can compare it to the model

The call to POST /reset resets the web service. I would have liked to simply restart the service completely, but I failed to automate it. I think I’ll have to take a closer look at the implementation of scotty to find a way.

Extracting the web service state and packaging it in a Model is a matter of calling GET /users and then repeatedly calling GET /users/:id with each id gotten from the first call

Generating programs

My approach to generating a program is based on the idea that given a certain state there is only a limited number of possible calls that make sense. Given a model m it makes sense to make one of the following calls:

  • add a new user
  • delete an existing user
  • end the program

Based on this, writing genProgram is rather straightforward

Armed with that the Arbitrary instance for Program can be implemented as1
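The original generator isn’t shown; one plausible shape, tracking a model while generating so that only sensible calls are produced (weights, names, and structure are all my guesses):

```haskell
import qualified Data.Map.Strict as M
import Test.QuickCheck

data ApiCall = AddUser String | DeleteUser Int | EndProgram
  deriving (Eq, Show)

newtype Program = Program [ApiCall]
  deriving (Eq, Show)

-- At each step: usually add a user, sometimes delete an existing one
-- (only possible when the model is non-empty), sometimes end the program
genProgram :: Gen Program
genProgram = Program <$> go (M.empty :: M.Map Int String)
  where
    go m = frequency $
      [ (4, do name <- elements ["Yogi", "Alice", "Bob"]
               let k = maybe 1 ((+ 1) . fst) (M.lookupMax m)
               (AddUser name :) <$> go (M.insert k name m))
      , (1, pure [EndProgram])
      ] ++
      [ (2, do k <- elements (M.keys m)
               (DeleteUser k :) <$> go (M.delete k m))
      | not (M.null m)
      ]

instance Arbitrary Program where
  arbitrary = genProgram
```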

The property of an API

The steps in the first section can be used as a recipe for writing the property

What next?

There are some improvements that I’d like to make:

  • Make the generation of Program better in the sense that the programs become longer. I think this is important as I start tackling larger APIs.
  • Write an implementation of shrink for Program. With longer programs it’s of course more important to actually implement shrink.

I’d love to hear if others are using QuickCheck to test REST APIs in some way, if anyone has suggestions for improvements, and of course ideas for how to implement shrink in a nice way.

  1. Yes, I completely skip the issue of shrinking programs at this point. That’s OK though, because the generated Programs do end up being very short indeed.

A simple zaw widget for jumping to git projects

A colleague at work showed me a script he put together to quickly jump to the numerous git projects we work with. It’s based on dmenu and looks rather nice. However, I’d rather have something based on zsh, but when looking around I didn’t find anything that really fit. So, instead I ended up writing a simple widget for zaw.

I then bound it like this

On mocks and stubs in python (free monad or interpreter pattern)

A few weeks ago I watched a video where Ken Scambler talks about mocks and stubs. In particular he talks about how to get rid of them.

One part is about coding IO operations as data and using the GoF interpreter pattern to execute them.

What he’s talking about is of course free monads, but I feel he’s glossing over a lot of details. Based on some of the questions asked during the talk I think I share that feeling with some people in the audience. Specifically I feel he skipped over the following:

  • How does one actually write such code in a mainstream OO/imperative language?
  • What’s required of the language in order to allow using the techniques he’s talking about?
  • Errors tend to break abstractions, so how does one deal with errors (i.e. exceptions)?

Every time I’ve used mocks and stubs for unit testing I’ve had a feeling that “this can’t be how it’s supposed to be done!” So to me, Ken’s talk offered some hope, and I really want to know how applicable the ideas are in mainstream OO/imperative languages.

The example

To play around with this I picked the following function (in Python):
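The function isn’t reproduced here; a plausible reconstruction of something with its shape (the names are taken from later references in the text, the body is my guess):

```python
def count_chars_of_file(fn):
    # open a file, read its contents, count the characters, close it
    f = open(fn)
    text = f.read()
    n = len(text)
    f.close()
    return n
```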

It’s small and simple, but I think it suffices to highlight a few important points. So the goal is to rewrite this function such that calls to IO operations (actions) are replaced by data (instances of some data type) conveying the intent of the operation. This data can later be passed to an interpreter of actions.

Thoughts on the execution of actions and the interpreter pattern

When reading the examples in the description of the interpreter pattern what stands out to me is that they are either

  1. a list of expressions, or
  2. a tree of expressions

that is passed to an interpreter. Will this do for us when trying to rewrite count_chars_of_file?

No, it won’t! Here’s why:

  • A tree of actions doesn’t really make sense. Our actions are small and simple, they encode the intent of a single IO operation.
  • A list of actions can’t deal with interspersed non-actions, in this case it’s the line n = len(text) that causes a problem.

The interpreter pattern misses something that is crucial in this case: the running of the interpreter must be intermingled with running non-interpreted code. The way I think of it is that not only does the action need to be present and dealt with, but also the rest of the program; the latter is commonly called a continuation.

So, can we introduce actions and rewrite count_chars_of_file such that we pause the program when interpretation of an action is required, interpret it, and then resume where we left off?

Sure, but it’s not really idiomatic Python code!

Actions and continuations

The IO operations (actions) are represented as a named tuple:

and the functions returning actions can then be written as

The interpreter is then an if statement checking the value of op.op with each branch executing the IO operation and passing the result to the rest of the program. I decided to wrap it directly in the program runner:
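A sketch of how that could look (names and representation are mine): an action carries the operation’s name, its arguments, and the continuation, and the runner loops, interpreting actions until a plain value falls out.

```python
from collections import namedtuple

# An action: operation name, its arguments, and the continuation,
# i.e. the rest of the program
Op = namedtuple('Op', ['op', 'args', 'cont'])


def open_file(fn, k):
    return Op('open', (fn,), k)


def read_file(f, k):
    return Op('read', (f,), k)


def close_file(f, k):
    return Op('close', (f,), k)


def run_program(prog):
    # Interpret actions until the program returns a plain (non-Op) value
    op = prog
    while isinstance(op, Op):
        if op.op == 'open':
            op = op.cont(open(op.args[0]))
        elif op.op == 'read':
            op = op.cont(op.args[0].read())
        else:  # 'close'
            op.args[0].close()
            op = op.cont(None)
    return op
```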

So far so good, but what will all of this do to count_chars_of_file?

Well, it’s not quite as easy to read any more (basically it’s rewritten in CPS):
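Something like this (actions and runner repeated from the previous sketch so the fragment stands on its own); every step now passes “the rest of the program” along as a lambda:

```python
from collections import namedtuple

Op = namedtuple('Op', ['op', 'args', 'cont'])


def open_file(fn, k): return Op('open', (fn,), k)
def read_file(f, k): return Op('read', (f,), k)
def close_file(f, k): return Op('close', (f,), k)


def run_program(prog):
    op = prog
    while isinstance(op, Op):
        if op.op == 'open':
            op = op.cont(open(op.args[0]))
        elif op.op == 'read':
            op = op.cont(op.args[0].read())
        else:  # 'close'
            op.args[0].close()
            op = op.cont(None)
    return op


# The original function, rewritten in continuation-passing style
def count_chars_of_file(fn):
    return open_file(fn, lambda f:
           read_file(f, lambda text:
           close_file(f, lambda _ignored:
           len(text))))
```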

Generators to the rescue

Python does have a notion of continuations in the form of generators.1 By making count_chars_of_file into a generator it’s possible to remove the explicit continuations, and the program actually resembles the original one again.

The type for the actions loses one member, and the functions creating them lose an argument:

The interpreter and program runner must be modified to step the generator until its end:

Finally, the generator-version of count_chars_of_file goes back to being a bit more readable:
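Put together, the generator-based sketch looks like this (again my own names, repeated in full so it stands on its own): actions lose the continuation member, the runner steps the generator with `send`, and the function itself reads almost like the original.

```python
from collections import namedtuple

# Actions no longer carry a continuation; the generator plays that role
Op = namedtuple('Op', ['op', 'args'])


def open_file(fn): return Op('open', (fn,))
def read_file(f): return Op('read', (f,))
def close_file(f): return Op('close', (f,))


def run_program(gen):
    # Step the generator, sending each operation's result back in, until it
    # finishes; the generator's return value is the program's result
    try:
        op = next(gen)
        while True:
            if op.op == 'open':
                res = open(op.args[0])
            elif op.op == 'read':
                res = op.args[0].read()
            else:  # 'close'
                res = op.args[0].close()
            op = gen.send(res)
    except StopIteration as e:
        return e.value


def count_chars_of_file(fn):
    f = yield open_file(fn)
    text = yield read_file(f)
    n = len(text)
    yield close_file(f)
    return n
```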

Generators all the way

Limitations of Python generators mean that we either have to push the interpreter (run_program) down to where count_chars_of_file is used, or make all intermediate layers into generators and rewrite the interpreter to deal with this. It could look something like this:
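One way to sketch an intermediate layer (a hypothetical function of my own, not from the talk) is to make it a generator too and delegate with `yield from`, which bubbles the actions up to the interpreter unchanged:

```python
from collections import namedtuple

# (actions, runner, and count_chars_of_file as in the previous sketch)
Op = namedtuple('Op', ['op', 'args'])


def open_file(fn): return Op('open', (fn,))
def read_file(f): return Op('read', (f,))
def close_file(f): return Op('close', (f,))


def run_program(gen):
    try:
        op = next(gen)
        while True:
            if op.op == 'open':
                res = open(op.args[0])
            elif op.op == 'read':
                res = op.args[0].read()
            else:  # 'close'
                res = op.args[0].close()
            op = gen.send(res)
    except StopIteration as e:
        return e.value


def count_chars_of_file(fn):
    f = yield open_file(fn)
    text = yield read_file(f)
    n = len(text)
    yield close_file(f)
    return n


# A hypothetical intermediate layer: also a generator, delegating with
# `yield from` so the same interpreter can drive the whole stack
def count_chars_of_files(fns):
    total = 0
    for fn in fns:
        total += yield from count_chars_of_file(fn)
    return total
```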

Final thoughts

I think I’ve shown one way to achieve, at least parts of, what Ken talks about. The resulting code looks almost like “normal Python”. There are some things to note:

  1. Exception handling is missing. I know of no way to inject an exception into a generator in Python so I’m guessing that exceptions from running the IO operations would have to be passed in via generator.send as a normal value, which means that exception handling code would have to look decidedly non-Pythonic.
  2. Using this approach means the language must have support for generators (or some other way to represent the rest of the program). I think this rules out Java, but probably it can be done in C#.
  3. I’ve only used a single interpreter here, but I see no issues with combining interpreters (to combine domains of operations like file operations and network operations). I also think it’d be possible to use it to realize what De Goes calls Onion Architecture.

Would I ever advocate this approach for a larger Python project, or even any project in an OO/imperative language?

I’m not sure! I think that testing using mocks and stubs has led to a smelly code base each and every time I’ve used it, but this approach feels a little too far from how OO/imperative code usually is written. I would love to try it out and see what the implications really are.

  1. I know, I know, coroutines! I’m simplifying and brushing over details here, but I don’t think I’m brushing over any details that are important for this example.

That's a large project

From LWN October 6, 2016:

Over the course of the last year, the project accepted about eight changes per hour — every hour — from over 4,000 developers sponsored by over 400 companies.