Posts tagged "scotty":
Architecture of a service
Early this summer it was finally time to put this one service I've been working on into our sandbox environment. It's been running without hickups so last week I turned it on for production as well. In this post I thought I'd document the how and why of the service in the hope that someone will find it useful.
The service functions as an interface to external SMS-sending services, offering a single place to change if we find that we are unhappy with the service we're using.1 This service replaces an older one, written in Ruby and no one really dares touch it. Hopefully the Haskell version will prove to be a joy to work with over time.
Overview of the architecture
The service is split into two parts, one web server using scotty, and streaming data processing using conduit. Persistent storage is provided by a PostgreSQL database. The general idea is that events are picked up from the database, acted upon, which in turn results in other events which written to the database. Those are then picked up and round and round we go. The web service accepts requests, turns them into events and writes the to the database.
Hopefully this crude diagram clarifies it somewhat.
There are a few things that might need some explanation
In the past we've wanted to have the option to use multiple external SMS services at the same time. One is randomly chosen as the request comes in. There's also a possibility to configure the frequency for each external service.
Picker implements the random picking and I've written about that earlier in Choosing a conduit randomly.
Success and fail are dummy senders. They don't actually send anything, and the former succeeds at it while the latter fails. I found them useful for manual testing.
Successfully sending off a request to an external SMS service, getting status 200 back, doesn't actually mean that the SMS has been sent, or even that it ever will be. Due to the nature of SMS messaging there are no guarantees of timeliness at all. Since we are interested in finding out whether an SMS actually is sent a delayed action is scheduled, which will fetch the status of a sent SMS after a certain time (currently 2 minutes). If an SMS hasn't been sent after that time it might as well never be – it's too slow for our end-users.
This is what report-fetcher and fetcher-func do.
- The queue sink and queue src are actually
sourceTQueue
andsinkTQueue
. Splitting the stream like that makes it trivial to push in events by usingwriteTQueue
. - I use
sequenceConduits
in order to send a single event to multiple =Conduit=s and then combine all their results back into a single stream. The ease with which this can be done in conduit is one of the main reasons why I choose to use it.2
Effects and tests
I started out writing everything based on a type like ReaderT <my cfg type> IO
and using liftIO
for effects that needed lifting. This worked nicely while I
was setting up the basic structure of the service, but as soon as I hooked in
the database I really wanted to do some testing also of the effectful code.
After reading Introduction to Tagless Final and The ReaderT Design Patter, playing a bit with both approaches, and writing Tagless final and Scotty and The ReaderT design pattern or tagless final?, I finally chose to go down the route of tagless final. There's no strong reason for that decision, maybe it was just because I read about it first and found it very easy to move in that direction in small steps.
There's a split between property tests and unit tests:
- Data types, their monad instances (like JSON (de-)serialisation), pure functions and a few effects are tested using properties. I'm using QuickCheck for that. I've since looked a little closer at hedgehog and if I were to do a major overhaul of the property tests I might be tempted to rewrite them using that library instead.
- Most of the =Conduit=s are tested using HUnit.
Configuration
The service will be run in a container and we try to follow the 12-factor app rules, where the third one says that configuration should be stored in the environment. All previous Haskell projects I've worked on have been command line tools were configuration is done (mostly) using command line argument. For that I usually use optparse-applicative, but it's not applicable in this setting.
After a bit of searching on hackage I settled on etc. It turned out to be nice
an easy to work with. The configuration is written in JSON and only specifies
environment variables. It's then embedded in the executable using file-embed.
The only thing I miss is a ToJSON
instance for Config
– we've found it
quite useful to log the active configuration when starting a service and that
log entry would become a bit nicer if the message was JSON rather than the
(somewhat difficult to read) string that Config
's Show
instance produces.
Logging
There are two requirements we have when it comes to logging
- All log entries tied to a request should have a correlation ID.
- Log requests and responses
I've written about correlation ID before, Using a configuration in Scotty.
Logging requests and responses is an area where I'm not very happy with scotty.
It feels natural to solve it using middleware (i.e. using middleware
) but the
representation, especially of responses, is a bit complicated so for the time
being I've skipped logging the body of both. I'd be most interested to hear of
libraries that could make that easier.
Data storage and picking up new events
The data stream processing depends heavily on being able to pick up when new
events are written to the database. Especially when there are more than one
instance running (we usually have at least two instance running in the
production environment). To get that working I've used postgresql-simple's
support for LISTEN
and NOTIFY
via the function getNotification
.
When I wrote about this earlier, Conduit and PostgreSQL I got some really good feedback that made my solution more robust.
Delayed actions
Some things in Haskell feel almost like cheating. The light-weight threading
makes me confident that a forkIO
followed by a threadDelay
(or in my case,
the ones from unliftio) will suffice.
Footnotes:
It has happened in the past that we've changed SMS service after finding that they weren't living up to our expectations.
A while ago I was experimenting with other streaming libraries, but I gave up on getting re-combination to work – Zipping streams
Tagless final and Scotty
For a little while I've been playing around with event sourcing in Haskell using
Conduit
and Scotty
. I've come far enough that the basic functionality I'm
after is there together with all those little bits that make it a piece of
software that's fit for deployment in production (configuration, logging, etc.).
There's just one thing that's been nagging me, testability.
The app is built of two main parts, a web server (Scotty
) and a pipeline of
stream processing components (Conduit
). The part using Scotty
is utilising a
simple monad stack, ReaderT Config IO
, and the Conduit
part is using
Conduit In Out IO
. This means that in both parts the outer edge, the part
dealing with the outside world, is running in IO
directly. Something that
isn't really aiding in testing.
I started out thinking that I'd rewrite what I have using a free monad with a
bunch of interpreters. Then I remembered that I have "check out tagless final".
This post is a record of the small experiments I did to see how to use it with
Scotty
to achieve (and actually improve) on the code I have in my
production-ready code.
1 - Use tagless final with Scotty
As a first simple little experiment I wrote a tiny little web server that would
print a string to stdout
when receiving the request to GET /route0
.
The printing to stdout
is the operation I want to make abstract.
class Monad m => MonadPrinter m where mPutStrLn :: Text -> m ()
I then created an application type that is an instance of that class.
newtype AppM a = AppM { unAppM :: IO a } deriving (Functor, Applicative, Monad, MonadIO) instance MonadPrinter AppM where mPutStrLn t = liftIO $ putStrLn (unpack t)
Then I added a bit of Scotty
boilerplate. It's not strictly necessary, but
does make the code a bit nicer to read.
type FooM = ScottyT Text AppM type FooActionM = ActionT Text AppM foo :: MonadIO m => Port -> ScottyT Text AppM () -> m () foo port = scottyT port unAppM
With that in place the web server itself is just a matter of tying it all together.
main :: IO () main = do foo 3000 $ do get "/route0" $ do lift $ mPutStrLn "getting /route0" json $ object ["route0" .= ("ok" :: String)] notFound $ json $ object ["error" .= ("not found" :: String)]
That was simple enough.
2 - Add configuration
In order to try out how to deal with configuration I added a class for doing some simple logging
class Monad m => MonadLogger m where mLog :: Text -> m ()
The straight forward way to deal with configuration is to create a monad stack
with ReaderT
and since it's logging I want to do the configuration consists of
a single LoggerSet
(from fast-logger
).
newtype AppM a = AppM { unAppM :: ReaderT LoggerSet IO a } deriving (Functor, Applicative, Monad, MonadIO, MonadReader LoggerSet)
That means the class instance can be implemented like this
instance MonadLogger AppM where mLog msg = do ls <- ask liftIO $ pushLogStrLn ls $ toLogStr msg
Of course foo
has to be changed too, and it becomes a little easier with a
wrapper for runReaderT
and unAppM
.
foo :: MonadIO m => LoggerSet -> Port -> ScottyT Text AppM () -> m () foo ls port = scottyT port (`runAppM` ls) runAppM :: AppM a -> LoggerSet -> IO a runAppM app ls = runReaderT (unAppM app) ls
With that in place the printing to stdout
can be replaced by a writing to the
log.
main :: IO () main = do ls <- newStdoutLoggerSet defaultBufSize foo ls 3000 $ do get "/route0" $ do lift $ mLog "log: getting /route0" json $ object ["route0" .= ("ok" :: String)] notFound $ json $ object ["error" .= ("not found" :: String)]
Not really a big change, I'd say. Extending the configuration is clearly straight forward too.
3 - Per-request configuration
At work we use correlation IDs1 and I think that the most convenient way to
deal with it is to put the correlation ID into the configuration after
extracting it. That is, I want to modify the configuration on each request.
Luckily it turns out to be possible to do that, despite using ReaderT
for
holding the configuration.
I can't be bothered with a full implementation of correlation ID for this little
experiment, but as long as I can get a new AppM
by running a function on the
configuration it's just a matter of extracting the correct header from the
request. For this experiment it'll do to just modify an integer in the
configuration.
I start with defining a type for the configuration and changing AppM
.
type Config = (LoggerSet, Int) newtype AppM a = AppM { unAppM :: ReaderT Config IO a } deriving (Functor, Applicative, Monad, MonadIO, MonadReader Config)
The logger instance has to be changed accordingly of course.
instance MonadLogger AppM where mLog msg = do (ls, i) <- ask liftIO $ pushLogStrLn ls $ toLogStr msg <> toLogStr (":" :: String) <> toLogStr (show i)
The get
function that comes with scotty
isn't going to cut it, since it has
no way of modifying the configuration, so I'll need a new one.
mGet :: ScottyError e => RoutePattern -> ActionT e AppM () -> ScottyT e AppM () mGet p a = get p $ do withCfg (\ (ls, i) -> (ls, succ i)) a
The tricky bit is in the withCfg
function. It's indeed not very easy to read,
I think
withCfg = mapActionT . withAppM where mapActionT f (ActionT a) = ActionT $ (mapExceptT . mapReaderT . mapStateT) f a withAppM f a = AppM $ withReaderT f (unAppM a)
Basically it reaches into the guts of scotty
's ActionT
type (the details are
exposed in Web.Scotty.Internal.Types
, thanks for not hiding it completely),
and modifies the ReaderT Config
I've supplied.
The new server has two routes, the original one and a new one at GET /route1
.
main :: IO () main = do putStrLn "Starting" ls <- newStdoutLoggerSet defaultBufSize foo (ls, 0) 3000 $ do get "/route0" $ do lift $ mLog "log: getting /route0" json $ object ["route0" .= ("ok" :: String)] mGet "/route1" $ do lift $ mLog "log: getting /route1" json $ object ["route1" .= ("bar" :: String)] notFound $ json $ object ["error" .= ("not found" :: String)]
It's now easy to verify that the original route, GET /route0
, logs a string
containing the integer '0', while the new route, GET /route1
, logs a string
containing the integer '1'.
Footnotes:
If you don't know what it is you'll find multiple sources by searching for "http correlation-id". A consistent approach to track correlation IDs through microservices is as good a place to start as any.
Using a configuration in Scotty
At work we're only now getting around to put correlation IDs into use. We write most our code in Clojure but since I'd really like to use more Haskell at work I thought I'd dive into Scotty and see how to deal with logging and then especially how to get correlation IDs into the logs.
The types
For configuration it decided to use the reader monad inside ActionT
from
Scotty. Enter Chell:
type ChellM c = ScottyT Text (ReaderT c IO) type ChellActionM c = ActionT Text (ReaderT c IO)
In order to run it I wrote a function corresponding to scotty
:
chell :: c -> Port -> ChellM () -> IO () chell cfg port a = scottyOptsT opts (flip runReaderT cfg) a where opts = def { verbose = 0 , settings = (settings def) { settingsPort = port } }
Correlation ID
To deal with the correlation ID each incoming request should be checked for the
HTTP header X-Correlation-Id
and if present it should be used during logging.
If no such header is present then a new correlation ID should be created. Since
it's per request it feels natural to create a WAI middleware for this.
The easiest way I could come up with was to push the correlation ID into the request's headers before it's passed on:
requestHeaderCorrelationId :: Request -> Maybe ByteString requestHeaderCorrelationId = lookup "X-Correlation-Id" . requestHeaders correlationId :: Middleware correlationId app req sendResponse = do u <- (randomIO :: IO UUID) let corrId = maybe (toASCIIBytes u) id (requestHeaderCorrelationId req) newHeaders = ("X-Correlation-Id", corrId) : (requestHeaders req) app (req { requestHeaders = newHeaders }) $ \ res -> sendResponse res
It also turns out to be useful to have both a default correlation ID and a function for pulling it out of the headers:
defaultCorrelationString :: ByteString defaultCorrelationString = "no-correlation-id" getCorrelationId :: Request -> ByteString getCorrelationId r = maybe defaultCorrelationString id (requestHeaderCorrelationId r)
Getting the correlation ID into the configuration
Since the correlation ID should be picked out of the request on handling of
every request it's useful to have it the configuration when running the
ChellActionM
actions. However, since the correlation ID isn't available when
running the reader (the call to runReaderT
in chell
) something else is
called for. When looking around I found local
(and later I was pointed to the
more general withReaderT
) but it doesn't have a suitable type. After some help
on Twitter I arrived at withConfig
which allows me to run an action in a
modified configuration:
withConfig :: (c -> c') -> ChellActionM c' () -> ChellActionM c () withConfig = mapActionT . withReaderT where mapActionT f (ActionT a) = ActionT $ (mapExceptT . mapReaderT . mapStateT) f a
Making it handy to use
Armed with this I can put together some functions to replace Scotty's get
,
post
, etc. With a configuration type like this:
data Config = Cfg LoggerSet ByteString
The modified get
looks like this (Scotty's original is S.get
)
get :: RoutePattern -> ChellActionM Config () -> ChellM Config () get p a = S.get p $ do r <- request let corrId = getCorrelationId r withConfig (\ (Cfg l _) -> Cfg l corrId) a
With this in place I can use the simpler ReaderT Config IO
for inner functions
that need to log.