02 Feb 2019

The ReaderT design pattern or tagless final?

The other week I read V. Kevroletin's Introduction to Tagless Final and realised that a couple of my projects, both at work and at home, would benefit from a refactoring to that approach. All in all I was happy with the changes I made, even though I haven't made use of all the way. In particular there I could further improve the tests in a few places by adding more typeclasses. For now it's good enough and I've clearly gotten some value out of it.

I found mr. Kevroletin's article to be a good introduction so I've been passing it on when people on the Functional programming slack bring up questions about how to organize their code as applications grow. In particular if they mention that they're using monad transformers. I did exactly that just the other day @solomon wrote

so i've created a rats nest of IO where almost all the functions in my program are in ReaderT Env IO () and I'm not sure how to purify everything and move the IO to the edge of the program

I proposed tagless final and passed the URL on, and then I got a pointer to the article The ReaderT Design Patter which I hadn't seen before.

The two approches are similar, at least to me, and I can't really judge if one's better than the other. Just to get a feel for it I thought I'd try to rewrite the example in the ReaderT article in a tagless final style.

A slightly changed example of `ReaderT` design pattern

I decided to make a few changes to the example in the article:

I removed the modify function, instead the code uses the typeclass function modifyBalance directly.
I separated the instances needed for the tests spatially in the code just to make it easier to see what's "production" code and what's test code.
I combined the main functions from the various examples to that both an example (main0) and the test (main1) are run.
I switched from Control.Concurrent.Async.Lifted.Safe (from monad-control) to UnliftIO.Async (from unliftio)

After that the code looks like this

{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}

import           Control.Concurrent.STM
import           Control.Monad.Reader
import qualified Control.Monad.State.Strict as State
import           Say
import           Test.Hspec
import           UnliftIO.Async

data Env = Env
  { envLog :: !(String -> IO ())
  , envBalance :: !(TVar Int)
  }

class HasLog a where
  getLog :: a -> (String -> IO ())

instance HasLog Env where
  getLog = envLog

class HasBalance a where
  getBalance :: a -> TVar Int

instance HasBalance Env where
  getBalance = envBalance

class Monad m => MonadBalance m where
  modifyBalance :: (Int -> Int) -> m ()

instance (HasBalance env, MonadIO m) => MonadBalance (ReaderT env m) where
  modifyBalance f = do
    env <- ask
    liftIO $ atomically $ modifyTVar' (getBalance env) f

logSomething :: (MonadReader env m, HasLog env, MonadIO m) => String -> m ()
logSomething msg = do
  env <- ask
  liftIO $ getLog env msg

main0 :: IO ()
main0 = do
  ref <- newTVarIO 4
  let env = Env { envLog = sayString , envBalance = ref }
  runReaderT
    (concurrently_
      (modifyBalance (+ 1))
      (logSomething "Increasing account balance"))
    env
  balance <- readTVarIO ref
  sayString $ "Final balance: " ++ show balance

instance HasLog (String -> IO ()) where
  getLog = id

instance HasBalance (TVar Int) where
  getBalance = id

instance Monad m => MonadBalance (State.StateT Int m) where
  modifyBalance = State.modify

main1 :: IO ()
main1 = hspec $ do
  describe "modify" $ do
    it "works, IO" $ do
      var <- newTVarIO (1 :: Int)
      runReaderT (modifyBalance (+ 2)) var
      res <- readTVarIO var
      res `shouldBe` 3
    it "works, pure" $ do
      let res = State.execState (modifyBalance (+ 2)) (1 :: Int)
      res `shouldBe` 3
  describe "logSomething" $
    it "works" $ do
      var <- newTVarIO ""
      let logFunc msg = atomically $ modifyTVar var (++ msg)
          msg1 = "Hello "
          msg2 = "World\n"
      runReaderT (logSomething msg1 >> logSomething msg2) logFunc
      res <- readTVarIO var
      res `shouldBe` (msg1 ++ msg2)

main :: IO ()
main = main0 >> main1

I think the distinguising features are

The application environmant, Env will contain configuraiton values (not in this example), state, envBalance, and functions we might want to vary, envLog
There is no explicit type representing the execution context
Typeclasses are used to abstract over application environment, HasLog and HasBalance
Typeclasses are used to abstract over operations, MonadBalance
Typeclasses are implemented for both the application environment, HasLog and HasBalance, and the execution context, MonadBalance

In the end this makes for code with very loose couplings; there's not really any single concrete type that implements all the constraints to work in the "real" main function (main0). I could of course introduce a type synonym for it

type App = ReaderT Env IO

but it brings no value – it wouldn't be used explicitly anywhere.

A tagless final version

In order to compare the ReaderT design pattern to tagless final (as I understand it) I made an attempt to translate the code above. The code below is the result.¹

{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeFamilies #-}

import           Control.Concurrent.STM
import qualified Control.Monad.Identity as Id
import           Control.Monad.Reader
import qualified Control.Monad.State.Strict as State
import           Say
import           Test.Hspec
import           UnliftIO (MonadUnliftIO)
import           UnliftIO.Async

newtype Env = Env {envBalance :: TVar Int}

newtype AppM a = AppM {unAppM :: ReaderT Env IO a}
  deriving (Functor, Applicative, Monad, MonadIO, MonadReader Env, MonadUnliftIO)

runAppM :: Env -> AppM a -> IO a
runAppM env app = runReaderT (unAppM app) env

class Monad m => ModifyM m where
  mModify :: (Int -> Int) -> m ()

class Monad m => LogSomethingM m where
  mLogSomething :: String -> m()

instance ModifyM AppM where
  mModify f = do
    ref <- asks envBalance
    liftIO $ atomically $ modifyTVar' ref f

instance LogSomethingM AppM where
  mLogSomething = liftIO . sayString

main0 :: IO ()
main0 = do
  ref <- newTVarIO 4
  let env = Env ref
  runAppM env
    (concurrently_
      (mModify (+ 1))
      (mLogSomething "Increasing account balance"))
  balance <- readTVarIO ref
  sayString $ "Final balance: " ++ show balance

newtype ModifyAppM a = ModifyAppM {unModifyAppM :: State.StateT Int Id.Identity a}
  deriving (Functor, Applicative, Monad, State.MonadState Int)

runModifyAppM :: Int -> ModifyAppM a -> (a, Int)
runModifyAppM s app = Id.runIdentity $ State.runStateT (unModifyAppM app) s

instance ModifyM ModifyAppM where
  mModify = State.modify'

newtype LogAppM a = LogAppM {unLogAppM :: ReaderT (TVar String) IO a}
  deriving (Functor, Applicative, Monad, MonadIO, MonadReader (TVar String))

runLogAppM :: TVar String -> LogAppM a -> IO a
runLogAppM env app = runReaderT (unLogAppM app) env

instance LogSomethingM LogAppM where
  mLogSomething msg = do
    var <- ask
    liftIO $ atomically $ modifyTVar var (++ msg)

main1 :: IO ()
main1 = hspec $ do
  describe "mModify" $ do
    it "works, IO" $ do
      var <- newTVarIO 1
      runAppM (Env var) (mModify (+ 2))
      res <- readTVarIO var
      res `shouldBe` 3
    it "works, pure" $ do
      let (_, res) = runModifyAppM 1 (mModify (+ 2))
      res `shouldBe` 3
  describe "mLogSomething" $
    it "works" $ do
      var <- newTVarIO ""
      runLogAppM var (mLogSomething "Hello" >> mLogSomething "World!")
      res <- readTVarIO var
      res `shouldBe` "HelloWorld!"

main :: IO ()
main = main0 >> main1

The steps for the "real" part of the program were

Introduce an execution type, AppM, with a convenience function for running it, runAppM
Remove the log function from the environment type, envLog in Env
Remove all the HasX classes
Create a new operations typeclass for logging, LogSomethingM
Rename the operations typeclass for modifying the balance to match the naming found in the tagless article a bit better, ModifyM
Implement instances of both operations typeclasses for AppM

For testing the steps were

Define an execution type for each test, ModifyAppM and LogAppM, with some convenience functions for running them, runModifyAppM and runLogAppM
Write instances for the operations typeclasses, one for each

So I think the distinguising features are

There's both an environment type, Env, and an execution type AppM that wraps it
The environment holds only configuration values (none in this example), and state (envBalance)
Typeclasses are used to abstract over operations, LogSomethingM and ModifyM
Typeclasses are only implemented for the execution type

This version has slightly more coupling, the execution type specifies the environment to use, and the operations are tied directly to the execution type. However, this coupling doesn't really make a big difference – looking at the pure modify test the amount of code don't differ by much.

A short note (mostly to myself)

I did write it using monad-control first, and then I needed an instance for MonadBaseControl IO. Deriving it automatically requires UndecidableInstances and I didn't really dare turn that on, so I ended up writing the instance. After some help on haskell-cafe it ended up looking like this

instance MonadBaseControl IO AppM where
  type StM AppM a = a
  liftBaseWith f = AppM (liftBaseWith $ \ run -> f (run . unAppM))
  restoreM = return

Conclusion

My theoretical knowledge isn't anywhere near good enough to say anything objectively about the difference in expressiveness of the two design patterns. That means that my conclusion comes down to taste, do you like the readerT patter or tagless final better?

I like the slightly looser coupling I get with the ReaderT pattern. Loose coupling is (almost) always a desirable goal. However, I can see that tying the typeclass instances directly to a concrete execution type results in the intent being communicated a little more clearly. Clearly communicating intent in code is also a desirable goal. In particular I suspect it'll result in more actionable error messages when making changes to the code – the error will tell me that my execution type lacks an instance of a specific typeclass, instead of it telling me that a particular transformer stack does. On the other hand, in the ReaderT pattern that stack is very shallow.

One possibility would be that one pattern is better suited for libraries and the other for applications. I don't think that's the case though as in both cases the library code would be written in a style that results in typeclass constraints on the caller and providing instances for those typeclasses is roughly an equal amount of work for both styles.

Footnotes:

Please do point out any mistakes I've made in this, in particular if they stem from me misunderstanding tagless final completely.