The ReaderT design pattern or tagless final?
The other week I read V. Kevroletin's Introduction to Tagless Final and realised that a couple of my projects, both at work and at home, would benefit from a refactoring to that approach. All in all I was happy with the changes I made, even though I haven't made use of all the way. In particular there I could further improve the tests in a few places by adding more typeclasses. For now it's good enough and I've clearly gotten some value out of it.
I found mr. Kevroletin's article to be a good introduction so I've been passing it on when people on the Functional programming slack bring up questions about how to organize their code as applications grow. In particular if they mention that they're using monad transformers. I did exactly that just the other day @solomon wrote
so i've created a rats nest of IO where almost all the functions in my program are in
ReaderT Env IO ()
and I'm not sure how to purify everything and move the IO to the edge of the program
I proposed tagless final and passed the URL on, and then I got a pointer to the article The ReaderT Design Patter which I hadn't seen before.
The two approches are similar, at least to me, and I can't really judge if one's
better than the other. Just to get a feel for it I thought I'd try to rewrite
the example in the ReaderT
article in a tagless final style.
A slightly changed example of ReaderT
design pattern
I decided to make a few changes to the example in the article:
- I removed the
modify
function, instead the code uses the typeclass functionmodifyBalance
directly. - I separated the instances needed for the tests spatially in the code just to make it easier to see what's "production" code and what's test code.
- I combined the
main
functions from the various examples to that both an example (main0
) and the test (main1
) are run. - I switched from
Control.Concurrent.Async.Lifted.Safe
(frommonad-control
) toUnliftIO.Async
(fromunliftio
)
After that the code looks like this
{-# LANGUAGE FlexibleContexts #-} {-# LANGUAGE FlexibleInstances #-} import Control.Concurrent.STM import Control.Monad.Reader import qualified Control.Monad.State.Strict as State import Say import Test.Hspec import UnliftIO.Async data Env = Env { envLog :: !(String -> IO ()) , envBalance :: !(TVar Int) } class HasLog a where getLog :: a -> (String -> IO ()) instance HasLog Env where getLog = envLog class HasBalance a where getBalance :: a -> TVar Int instance HasBalance Env where getBalance = envBalance class Monad m => MonadBalance m where modifyBalance :: (Int -> Int) -> m () instance (HasBalance env, MonadIO m) => MonadBalance (ReaderT env m) where modifyBalance f = do env <- ask liftIO $ atomically $ modifyTVar' (getBalance env) f logSomething :: (MonadReader env m, HasLog env, MonadIO m) => String -> m () logSomething msg = do env <- ask liftIO $ getLog env msg main0 :: IO () main0 = do ref <- newTVarIO 4 let env = Env { envLog = sayString , envBalance = ref } runReaderT (concurrently_ (modifyBalance (+ 1)) (logSomething "Increasing account balance")) env balance <- readTVarIO ref sayString $ "Final balance: " ++ show balance instance HasLog (String -> IO ()) where getLog = id instance HasBalance (TVar Int) where getBalance = id instance Monad m => MonadBalance (State.StateT Int m) where modifyBalance = State.modify main1 :: IO () main1 = hspec $ do describe "modify" $ do it "works, IO" $ do var <- newTVarIO (1 :: Int) runReaderT (modifyBalance (+ 2)) var res <- readTVarIO var res `shouldBe` 3 it "works, pure" $ do let res = State.execState (modifyBalance (+ 2)) (1 :: Int) res `shouldBe` 3 describe "logSomething" $ it "works" $ do var <- newTVarIO "" let logFunc msg = atomically $ modifyTVar var (++ msg) msg1 = "Hello " msg2 = "World\n" runReaderT (logSomething msg1 >> logSomething msg2) logFunc res <- readTVarIO var res `shouldBe` (msg1 ++ msg2) main :: IO () main = main0 >> main1
I think the distinguising features are
- The application environmant,
Env
will contain configuraiton values (not in this example), state,envBalance
, and functions we might want to vary,envLog
- There is no explicit type representing the execution context
- Typeclasses are used to abstract over application environment,
HasLog
andHasBalance
- Typeclasses are used to abstract over operations,
MonadBalance
- Typeclasses are implemented for both the application environment,
HasLog
andHasBalance
, and the execution context,MonadBalance
In the end this makes for code with very loose couplings; there's not really any
single concrete type that implements all the constraints to work in the "real"
main function (main0
). I could of course introduce a type synonym for it
type App = ReaderT Env IO
but it brings no value – it wouldn't be used explicitly anywhere.
A tagless final version
In order to compare the ReaderT
design pattern to tagless final (as I
understand it) I made an attempt to translate the code above. The code below is
the result.1
{-# LANGUAGE FlexibleContexts #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE GeneralizedNewtypeDeriving #-} {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE TypeFamilies #-} import Control.Concurrent.STM import qualified Control.Monad.Identity as Id import Control.Monad.Reader import qualified Control.Monad.State.Strict as State import Say import Test.Hspec import UnliftIO (MonadUnliftIO) import UnliftIO.Async newtype Env = Env {envBalance :: TVar Int} newtype AppM a = AppM {unAppM :: ReaderT Env IO a} deriving (Functor, Applicative, Monad, MonadIO, MonadReader Env, MonadUnliftIO) runAppM :: Env -> AppM a -> IO a runAppM env app = runReaderT (unAppM app) env class Monad m => ModifyM m where mModify :: (Int -> Int) -> m () class Monad m => LogSomethingM m where mLogSomething :: String -> m() instance ModifyM AppM where mModify f = do ref <- asks envBalance liftIO $ atomically $ modifyTVar' ref f instance LogSomethingM AppM where mLogSomething = liftIO . sayString main0 :: IO () main0 = do ref <- newTVarIO 4 let env = Env ref runAppM env (concurrently_ (mModify (+ 1)) (mLogSomething "Increasing account balance")) balance <- readTVarIO ref sayString $ "Final balance: " ++ show balance newtype ModifyAppM a = ModifyAppM {unModifyAppM :: State.StateT Int Id.Identity a} deriving (Functor, Applicative, Monad, State.MonadState Int) runModifyAppM :: Int -> ModifyAppM a -> (a, Int) runModifyAppM s app = Id.runIdentity $ State.runStateT (unModifyAppM app) s instance ModifyM ModifyAppM where mModify = State.modify' newtype LogAppM a = LogAppM {unLogAppM :: ReaderT (TVar String) IO a} deriving (Functor, Applicative, Monad, MonadIO, MonadReader (TVar String)) runLogAppM :: TVar String -> LogAppM a -> IO a runLogAppM env app = runReaderT (unLogAppM app) env instance LogSomethingM LogAppM where mLogSomething msg = do var <- ask liftIO $ atomically $ modifyTVar var (++ msg) main1 :: IO () main1 = hspec $ do describe "mModify" $ do it "works, IO" $ do var <- newTVarIO 1 runAppM (Env var) (mModify (+ 2)) res <- readTVarIO var res `shouldBe` 3 it "works, pure" $ do let (_, res) = runModifyAppM 1 (mModify (+ 2)) res `shouldBe` 3 describe "mLogSomething" $ it "works" $ do var <- newTVarIO "" runLogAppM var (mLogSomething "Hello" >> mLogSomething "World!") res <- readTVarIO var res `shouldBe` "HelloWorld!" main :: IO () main = main0 >> main1
The steps for the "real" part of the program were
- Introduce an execution type,
AppM
, with a convenience function for running it,runAppM
- Remove the log function from the environment type,
envLog
inEnv
- Remove all the
HasX
classes - Create a new operations typeclass for logging,
LogSomethingM
- Rename the operations typeclass for modifying the balance to match the naming
found in the tagless article a bit better,
ModifyM
- Implement instances of both operations typeclasses for
AppM
For testing the steps were
- Define an execution type for each test,
ModifyAppM
andLogAppM
, with some convenience functions for running them,runModifyAppM
andrunLogAppM
- Write instances for the operations typeclasses, one for each
So I think the distinguising features are
- There's both an environment type,
Env
, and an execution typeAppM
that wraps it - The environment holds only configuration values (none in this example), and
state (
envBalance
) - Typeclasses are used to abstract over operations,
LogSomethingM
andModifyM
- Typeclasses are only implemented for the execution type
This version has slightly more coupling, the execution type specifies the environment to use, and the operations are tied directly to the execution type. However, this coupling doesn't really make a big difference – looking at the pure modify test the amount of code don't differ by much.
A short note (mostly to myself)
I did write it using monad-control
first, and then I needed an instance for
MonadBaseControl IO
. Deriving it automatically requires UndecidableInstances
and I didn't really dare turn that on, so I ended up writing the instance. After
some help on haskell-cafe it ended up looking like this
instance MonadBaseControl IO AppM where type StM AppM a = a liftBaseWith f = AppM (liftBaseWith $ \ run -> f (run . unAppM)) restoreM = return
Conclusion
My theoretical knowledge isn't anywhere near good enough to say anything
objectively about the difference in expressiveness of the two design patterns.
That means that my conclusion comes down to taste, do you like the readerT
patter or tagless final better?
I like the slightly looser coupling I get with the ReaderT
pattern. Loose
coupling is (almost) always a desirable goal. However, I can see that tying the
typeclass instances directly to a concrete execution type results in the intent
being communicated a little more clearly. Clearly communicating intent in code
is also a desirable goal. In particular I suspect it'll result in more
actionable error messages when making changes to the code – the error will tell
me that my execution type lacks an instance of a specific typeclass, instead of
it telling me that a particular transformer stack does. On the other hand, in
the ReaderT
pattern that stack is very shallow.
One possibility would be that one pattern is better suited for libraries and the other for applications. I don't think that's the case though as in both cases the library code would be written in a style that results in typeclass constraints on the caller and providing instances for those typeclasses is roughly an equal amount of work for both styles.
Footnotes:
Please do point out any mistakes I've made in this, in particular if they stem from me misunderstanding tagless final completely.