In a tool I'm writing I want to load a file that may reside on the local disk, but if it isn't there I want to fetch it from the web. Basically it's very similar to having a cache and dealing with a miss, except in my case I don't populate the cache.
Let me first define the functions to play with
    loadFromDisk :: String -> IO (Either String Int)
    loadFromDisk k@"bad key" = do
      putStrLn $ "local: " <> k
      pure $ Left $ "no such local key: " <> k
    loadFromDisk k = do
      putStrLn $ "local: " <> k
      pure $ Right $ length k

    loadFromWeb :: String -> IO (Either String Int)
    loadFromWeb k@"bad key" = do
      putStrLn $ "web: " <> k
      pure $ Left $ "no such remote key: " <> k
    loadFromWeb k = do
      putStrLn $ "web: " <> k
      pure $ Right $ length k
Discarded solution: using the Alternative instance of IO
It's fairly easy to get the desired behaviour, but the Alternative instance of
IO is based on exceptions, which doesn't strike me as a good idea unless one is
using IO directly. That is fine in a smallish application, but in my case it
makes sense to use tagless style (or the ReaderT pattern), so I'll skip
exploring this option.
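For reference, here is a minimal sketch of what the discarded option could look like. It assumes hypothetical loaders, loadFromDiskIO and loadFromWebIO, that signal failure by throwing an IOError instead of returning an Either; they are not part of the tool.

```haskell
import Control.Applicative ((<|>))

-- Hypothetical loaders that fail by throwing an IOError.
loadFromDiskIO :: String -> IO Int
loadFromDiskIO "bad key" = ioError (userError "no such local key: bad key")
loadFromDiskIO k = pure (length k)

loadFromWebIO :: String -> IO Int
loadFromWebIO "bad key" = ioError (userError "no such remote key: bad key")
loadFromWebIO k = pure (length k)

-- (<|>) for IO runs the second action only if the first throws an IOError.
loadIO :: String -> String -> IO Int
loadIO diskKey webKey = loadFromDiskIO diskKey <|> loadFromWebIO webKey

main :: IO ()
main = do
  loadIO "bad key" "good key" >>= print -- prints 8, fetched "from the web"
  loadIO "good key" "bad key" >>= print -- prints 8, the web is never tried
```

The fallback works, but the failure travels as an exception rather than as a value, which is the part that fits poorly with an application monad.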
First attempt: lifting into the Alternative instance of Either e
There's an instance of Alternative for Either e in version 0.5 of
transformers. It's deprecated, and it's gone in newer versions of the library,
as one really should use ExceptT instead. Even if I don't think it's
where I want to end up, it's not an altogether bad place to start.
Now let's define a function using liftA2 (<|>) to make it easy to see what the
behaviour is.
    fallBack :: Applicative m => m (Either String res) -> m (Either String res) -> m (Either String res)
    fallBack = liftA2 (<|>)
    λ> loadFromDisk "bad key" `fallBack` loadFromWeb "good key"
    local: bad key
    web: good key
    Right 8
    λ> loadFromDisk "bad key" `fallBack` loadFromWeb "bad key"
    local: bad key
    web: bad key
    Left "no such remote key: bad key"
The first example shows that it falls back to loading from the web, and the
second one shows that it's only the last failure that survives. The latter part,
that only the last failure survives, isn't ideal, but I think I can live with
that. If I were interested in collecting all failures I would reach for the
Validation type from validation-selective (other packages offer a similar type
that should work too).
So far so good, but the next example shows a behaviour I don't want
    λ> loadFromDisk "good key" `fallBack` loadFromWeb "good key"
    local: good key
    web: good key
    Right 8
or to make it even more explicit
    λ> loadFromDisk "good key" `fallBack` undefined
    local: good key
    *** Exception: Prelude.undefined
    CallStack (from HasCallStack):
      error, called at libraries/base/GHC/Err.hs:79:14 in base:GHC.Err
      undefined, called at <interactive>:451:36 in interactive:Ghci4
There's no short-circuiting! [1]
The behaviour I want is of course that if the first action is successful, then the second action shouldn't take place at all.
It looks like either <|> is strict in its second argument, or maybe it's
liftA2 that forces it. I've not bothered digging into the details; it's enough
to observe it to realise that this approach isn't good enough.
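A likely explanation is that liftA2 always sequences the effects of both actions and only afterwards hands their results to <|>, so the laziness of <|> never gets a chance to matter. The sketch below makes that visible. It is an illustration under assumptions: Maybe stands in for Either String (so it doesn't depend on the deprecated instance), and logged, fallBackA and demo are hypothetical names.

```haskell
import Control.Applicative (liftA2, (<|>))
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Hypothetical action: records its name in the log, then returns a result.
logged :: IORef [String] -> String -> Maybe Int -> IO (Maybe Int)
logged ref name result = do
  modifyIORef ref (++ [name])
  pure result

-- Same shape as the first attempt, with Maybe standing in for Either String.
fallBackA :: Applicative m => m (Maybe a) -> m (Maybe a) -> m (Maybe a)
fallBackA = liftA2 (<|>)

-- Runs a successful first action followed by a second one and returns the
-- combined result together with the names of the actions that actually ran.
demo :: IO (Maybe Int, [String])
demo = do
  ref <- newIORef []
  r <- logged ref "first" (Just 1) `fallBackA` logged ref "second" (Just 2)
  ran <- readIORef ref
  pure (r, ran)

main :: IO ()
main = demo >>= print -- prints (Just 1,["first","second"])
```

The result is the first action's, yet the log shows that the second action ran as well.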
Second attempt: cutting it short, manually
Short-circuiting the evaluation after the first success isn't too difficult to do manually. Something like this does it:
    -- requires the LambdaCase extension
    fallBack :: Monad m => m (Either String a) -> m (Either String a) -> m (Either String a)
    fallBack first other = do
      first >>= \case
        r@(Right _) -> pure r
        r@(Left _) -> (r <|>) <$> other
It does indeed show the behaviour I want
    λ> loadFromDisk "bad key" `fallBack` loadFromWeb "good key"
    local: bad key
    web: good key
    Right 8
    λ> loadFromDisk "bad key" `fallBack` loadFromWeb "bad key"
    local: bad key
    web: bad key
    Left "no such remote key: bad key"
    λ> loadFromDisk "good key" `fallBack` undefined
    local: good key
    Right 8
Excellent! And to switch over to use Validation one just has to switch the
constructors: Right becomes Success and Left becomes Failure. Collecting the
failures by concatenating strings isn't the best idea of course. Switching to
some other Monoid (that's the constraint on the failure type) isn't too
difficult.
    fallBack :: (Monad m, Monoid e) => m (Validation e a) -> m (Validation e a) -> m (Validation e a)
    fallBack first other = do
      first >>= \case
        r@(Success _) -> pure r
        r@(Failure _) -> (r <|>) <$> other
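The manual approach can also be written without any Alternative instance at all, which avoids both the deprecated instance and the extra dependency. This is a sketch under assumptions: fallBackE, failing and succeeding are hypothetical names, and on a double failure only the last error is kept, just as in the Either version above.

```haskell
{-# LANGUAGE LambdaCase #-}

-- Hypothetical variant of the manual fallBack: plain pattern matching,
-- no Alternative instance involved. Only the last error survives.
fallBackE :: Monad m => m (Either e a) -> m (Either e a) -> m (Either e a)
fallBackE first other =
  first >>= \case
    r@(Right _) -> pure r
    Left _ -> other

-- Hypothetical stand-ins for the loaders.
failing :: String -> IO (Either String Int)
failing = pure . Left

succeeding :: Int -> IO (Either String Int)
succeeding = pure . Right

main :: IO ()
main = do
  (failing "disk" `fallBackE` succeeding 8) >>= print     -- Right 8
  (succeeding 8 `fallBackE` undefined) >>= print          -- Right 8
  (failing "disk" `fallBackE` failing "web") >>= print    -- Left "web"
```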
Third attempt: pulling failures out to MonadPlus
After writing the fallBack function I still wanted to explore other solutions.
There's almost always something more out there in the Haskell ecosystem, right?
So I asked in the #haskell-beginners channel on the Functional Programming
Slack. The way I asked the question resulted in answers that iterate over a
list of actions and cut at the first success.
The first suggestion had me a little confused at first, but once I re-organised the helper function a little it made more sense to me.
    mFromRight :: MonadPlus m => m (Either err res) -> m res
    mFromRight = (either (const mzero) return =<<)
To use it, put the actions in a list, map the helper above, and finally run
asum on it all [2]. I think it makes it a little clearer what happens if
it's rewritten like this.
    firstRightM :: MonadPlus m => [m (Either err res)] -> m res
    firstRightM = asum . fmap go
      where
        go m = m >>= either (const mzero) return
    λ> firstRightM [loadFromDisk "bad key", loadFromWeb "good key"]
    local: bad key
    web: good key
    8
    λ> firstRightM [loadFromDisk "good key", undefined]
    local: good key
    8
So far so good, but I left out the case where both fail, because that's sort of the fly in the ointment here
    λ> firstRightM [loadFromDisk "bad key", loadFromWeb "bad key"]
    local: bad key
    web: bad key
    *** Exception: user error (mzero)
It's not nice to be back to deal with exceptions, but it's possible to recover,
e.g. by appending
<|> pure 0.
    λ> firstRightM [loadFromDisk "bad key", loadFromWeb "bad key"] <|> pure 0
    local: bad key
    web: bad key
    0
However, that removes the ability to deal with the situation where all actions
fail. Not nice! Add to that the difficulty of coming up with a good
MonadPlus instance for an application monad; one basically has to resort to
the same thing as for IO, i.e. to throw an exception. Also not nice!
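One way to keep the shape of this attempt without throwing is to let MaybeT from transformers supply the Alternative instance, so the underlying monad needs nothing more than Monad. This is only a sketch, not something that came up on Slack: firstRightMaybe and load are hypothetical names, and the error values are discarded entirely, which is its own kind of not nice.

```haskell
import Control.Monad.Trans.Maybe (MaybeT (..))
import Data.Foldable (asum)

-- Hypothetical variant of firstRightM: a Left becomes Nothing inside MaybeT,
-- so the case where every action fails is a plain Nothing, not an exception.
firstRightMaybe :: Monad m => [m (Either err res)] -> m (Maybe res)
firstRightMaybe = runMaybeT . asum . fmap go
  where
    go m = MaybeT (either (const Nothing) Just <$> m)

-- Hypothetical stand-in for the two loaders; the first argument names the source.
load :: String -> String -> IO (Either String Int)
load source "bad key" = pure (Left ("no such " <> source <> " key: bad key"))
load _ k = pure (Right (length k))

main :: IO ()
main = do
  firstRightMaybe [load "local" "bad key", load "remote" "good key"] >>= print -- Just 8
  firstRightMaybe [load "local" "good key", undefined] >>= print               -- Just 8
  firstRightMaybe [load "local" "bad key", load "remote" "bad key"] >>= print  -- Nothing
```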
Fourth attempt: wrapping in ExceptT to get its Alternative behaviour
This was another suggestion from the Slack channel, and it is the one I like the most. Again it was suggested as a way to stop at the first successful action in a list of actions.
    firstRightM :: (Foldable t, Functor t, Monad m, Monoid err) => t (m (Either err res)) -> m (Either err res)
    firstRightM = runExceptT . asum . fmap ExceptT
Which can be used similarly to the previous one. It's also easy to write a
fallBack for it.
    fallBack :: (Monad m, Monoid err) => m (Either err res) -> m (Either err res) -> m (Either err res)
    fallBack first other = runExceptT $ ExceptT first <|> ExceptT other
    λ> loadFromDisk "bad key" `fallBack` loadFromWeb "good key"
    local: bad key
    web: good key
    Right 8
    λ> loadFromDisk "good key" `fallBack` undefined
    local: good key
    Right 8
    λ> loadFromDisk "bad key" `fallBack` loadFromWeb "bad key"
    local: bad key
    web: bad key
    Left "no such local key: bad keyno such remote key: bad key"
Yay! This solution has the short-circuiting behaviour I want, as well as collecting all errors on failure.
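Since the failure type only has to be a Monoid, the run-together error string above is easy to avoid. Here is a self-contained sketch that collects the failures in a list instead; loadFrom is a hypothetical stand-in for the two loaders, with the source name passed as an argument.

```haskell
import Control.Applicative ((<|>))
import Control.Monad.Trans.Except (ExceptT (..), runExceptT)

fallBack :: (Monad m, Monoid err) => m (Either err res) -> m (Either err res) -> m (Either err res)
fallBack first other = runExceptT $ ExceptT first <|> ExceptT other

-- Hypothetical loader that reports failures as a list of messages.
loadFrom :: String -> String -> IO (Either [String] Int)
loadFrom source "bad key" = pure (Left ["no such " <> source <> " key: bad key"])
loadFrom _ k = pure (Right (length k))

main :: IO ()
main = do
  -- Right 8
  (loadFrom "local" "bad key" `fallBack` loadFrom "remote" "good key") >>= print
  -- Right 8, the second action is never run
  (loadFrom "local" "good key" `fallBack` undefined) >>= print
  -- Left ["no such local key: bad key","no such remote key: bad key"]
  (loadFrom "local" "bad key" `fallBack` loadFrom "remote" "bad key") >>= print
```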
I'm still a little disappointed that liftA2 (<|>) isn't short-circuiting, as I
still think it's the easiest of the approaches. However, it's a problem that one
has to rely on a deprecated instance of Alternative for Either e, but switching
to use Validation would be only a minor change.
Manually writing the fallBack function, as I did in the second attempt,
results in very explicit code, which is nice as it often reduces the cognitive
load for the reader. It's a contender, but using the deprecated Alternative
instance is problematic, and introducing Validation, an arguably not very
common type, takes away a little of the appeal.
In the end I prefer the fourth attempt. It behaves exactly like I want, and even
though ExceptT lives in transformers (I pull it in via mtl), I feel that it
is in such wide use that most Haskell programmers will be familiar with it.
One final thing to add is that the documentation of
Validation is an excellent
inspiration when it comes to the behaviour of its instances. I wish that the
documentation of other packages, in particular commonly used ones like base,
transformers, and mtl, would be more like it.
Comments, feedback, and questions
Dustin sent me a comment via email a while ago. It's now March 2022, so it's taken me embarrassingly long to publish it here.
I removed a bit from the beginning of the email as it doesn't relate to this post.
… a thing I've written code for before that I was reasonably pleased with. I have a suite of software for managing my GoPro media which involves doing some metadata extraction from images and video. There will be multiple transcodings of each medium with each that contains the metadata having it completely intact (i.e., low quality encodings do not lose metadata fidelity). I also run this on multiple machines and store a cache of Metadata in S3.
Sometimes, I've already processed the metadata on another machine. Often, I can get it from the lowest quality. Sometimes, there's no metadata at all. The core of my extraction looks like this:
    ms <- asum [ Just . BL.toStrict <$> getMetaBlob mid,
                 fv "mp4_low" (fn "low"),
                 fv "high_res_proxy_mp4" (fn "high"),
                 fv "source" (fn "src"),
                 pure Nothing]
The first version grabs the processed blob from S3. The next three fetch (and process) increasingly larger variants of the uploaded media. The last one just gives up and says there's no metadata available (and memoizes that in the local DB and S3).
Some of these objects are in the tens of gigs, and I had a really bad internet connection when I first wrote this software, so I needed it to work.
[1] I'm not sure if short-circuiting is a good term to use in this case, as Wikipedia says it's for Boolean operators. I hope it's not too far a stretch to use it in this context too.
[2] In the version of base I'm using there is no asum, so I simply copied
the implementation from a later version:
    asum :: (Foldable t, Alternative f) => t (f a) -> f a
    asum = foldr (<|>) empty