27 Nov 2021

Fallback of actions

In a tool I'm writing I want to load a file that may reside on the local disk, but if it isn't there I want to fetch it from the web. Basically it's very similar to having a cache and dealing with a miss, except in my case I don't populate the cache.

Let me first define the functions to play with

loadFromDisk :: String -> IO (Either String Int)
loadFromDisk k@"bad key" = do
    putStrLn $ "local: " <> k
    pure $ Left $ "no such local key: " <> k
loadFromDisk k = do
    putStrLn $ "local: " <> k
    pure $ Right $ length k

loadFromWeb :: String -> IO (Either String Int)
loadFromWeb k@"bad key" = do
    putStrLn $ "web: " <> k
    pure $ Left $ "no such remote key: " <> k
loadFromWeb k = do
    putStrLn $ "web: " <> k
    pure $ Right $ length k

Discarded solution: using the Alternative of IO directly

It's fairly easy to get the desired behaviour but Alternative of IO is based on exceptions which doesn't strike me as a good idea unless one is using IO directly. That is fine in a smallish application, but in my case it makes sense to use tagless style (or ReaderT pattern) so I'll skip exploring this option completely.
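For reference, this is roughly what that discarded option would look like. It is only a sketch: diskOrThrow, webOrThrow and loadEither are hypothetical names, and the loaders are assumed to signal failure by throwing an IOException instead of returning Left.

```haskell
import Control.Applicative ((<|>))

-- Hypothetical loader that signals failure by throwing an IOException.
diskOrThrow :: String -> IO Int
diskOrThrow "bad key" = ioError (userError "no such local key: bad key")
diskOrThrow k = pure (length k)

-- Hypothetical web loader; in this sketch it always succeeds.
webOrThrow :: String -> IO Int
webOrThrow k = pure (length k)

-- (<|>) on IO runs the right-hand action only when the left-hand one
-- throws an IOException; other kinds of exceptions are not caught.
loadEither :: String -> IO Int
loadEither k = diskOrThrow k <|> webOrThrow k
```

The fallback works, but the failure is carried by an exception and the message from the local lookup is thrown away.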

First attempt: lifting into the Alternative of Either e

There's an instance of Alternative for Either e in version 0.5 of transformers. It's deprecated and it's gone in newer versions of the library as one really should use Except or ExceptT instead. Even if I don't think it's where I want to end up, it's not an altogether bad place to start.

Now let's define a function using liftA2 (<|>) to make it easy to see what the behaviour is

fallBack ::
    Applicative m =>
    m (Either String res) ->
    m (Either String res) ->
    m (Either String res)
fallBack = liftA2 (<|>)

λ> loadFromDisk "bad key" `fallBack` loadFromWeb "good key"
local: bad key
web: good key
Right 8

λ> loadFromDisk "bad key" `fallBack` loadFromWeb "bad key"
local: bad key
web: bad key
Left "no such remote key: bad key"

The first example shows that it falls back to loading from the web, and the second one shows that it's only the last failure that survives. The latter part, that only the last failure survives, isn't ideal but I think I can live with that. If I were interested in collecting all failures I would reach for Validation from validation-selective (there's one in validation that should work too).

So far so good, but the next example shows a behaviour I don't want

λ> loadFromDisk "good key" `fallBack` loadFromWeb "good key"
local: good key
web: good key
Right 8

or to make it even more explicit

λ> loadFromDisk "good key" `fallBack` undefined
local: good key
*** Exception: Prelude.undefined
CallStack (from HasCallStack):
  error, called at libraries/base/GHC/Err.hs:79:14 in base:GHC.Err
  undefined, called at <interactive>:451:36 in interactive:Ghci4

There's no short-circuiting! [1]

The behaviour I want is of course that if the first action is successful, then the second action shouldn't take place at all.

It looks like either <|> is strict in its second argument, or maybe it's liftA2 that forces it. I've not bothered digging into the details; it's enough to observe it to realise that this approach isn't good enough.
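For what it's worth, the observation is consistent with how liftA2 sequences effects in IO: both actions are run, in order, before the pure function is applied to their results. Below is a minimal sketch of that. The helpers orElse, tracked and demo are hypothetical; orElse stands in for <|> on Either so the example doesn't rely on the deprecated instance.

```haskell
import Control.Applicative (liftA2)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Pure "first Right wins" combinator, standing in for (<|>) on Either.
-- It never inspects its second argument when the first is a Right.
orElse :: Either e a -> Either e a -> Either e a
orElse (Left _) r = r
orElse l _ = l

-- An action that records its name in the log and returns the given result.
tracked :: IORef [String] -> String -> Either String Int -> IO (Either String Int)
tracked ref name result = do
    modifyIORef' ref (<> [name])
    pure result

-- Combine two successful actions with liftA2 and report which ones ran.
demo :: IO (Either String Int, [String])
demo = do
    ref <- newIORef []
    r <- liftA2 orElse (tracked ref "disk" (Right 1)) (tracked ref "web" (Right 2))
    ran <- readIORef ref
    pure (r, ran)
```

Running demo gives (Right 1, ["disk", "web"]): the result comes from the first action, but the second action has been executed anyway, even though orElse itself ignores its second argument in that case.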

Second attempt: cutting it short, manually

Adding the missing short-circuiting after the first success isn't too difficult to do manually. Something like this does it

fallBack ::
    Monad m =>
    m (Either String a) ->
    m (Either String a) ->
    m (Either String a)
fallBack first other = do
    first >>= \case
        r@(Right _) -> pure r
        r@(Left _) -> (r <|>) <$> other

It does indeed show the behaviour I want

λ> loadFromDisk "bad key" `fallBack` loadFromWeb "good key"
local: bad key
web: good key
Right 8

λ> loadFromDisk "bad key" `fallBack` loadFromWeb "bad key"
local: bad key
web: bad key
Left "no such remote key: bad key"

λ> loadFromDisk "good key" `fallBack` undefined
local: good key
Right 8

Excellent! And to switch over to using Validation one just has to switch constructors: Right becomes Success and Left becomes Failure. Though collecting the failures by concatenating strings isn't the best idea of course. Switching to some other Monoid (that's the constraint on the failure type) isn't too difficult.

fallBack ::
    (Monad m, Monoid e) =>
    m (Validation e a) ->
    m (Validation e a) ->
    m (Validation e a)
fallBack first other = do
    first >>= \case
        r@(Success _) -> pure r
        r@(Failure _) -> (r <|>) <$> other

Third attempt: pulling failures out to MonadPlus

After writing the fallBack function I still wanted to explore other solutions. There's almost always something more out there in the Haskell ecosystem, right? So I asked in the #haskell-beginners channel on the Functional Programming Slack. The way I asked the question resulted in answers that iterate over a list of actions and cut at the first success.

The first suggestion had me a little confused at first, but once I re-organised the helper function a little it made more sense to me.

mFromRight :: MonadPlus m => m (Either err res) -> m res
mFromRight = (either (const mzero) return =<<)

To use it put the actions in a list, map the helper above, and finally run asum on it all [2]. I think it makes it a little clearer what happens if it's rewritten like this.

firstRightM :: MonadPlus m => [m (Either err res)] -> m res
firstRightM = asum . fmap go
  where
    go m = m >>= either (const mzero) return

λ> firstRightM [loadFromDisk "bad key", loadFromWeb "good key"]
local: bad key
web: good key
8

λ> firstRightM [loadFromDisk "good key", undefined]
local: good key
8

So far so good, but I left out the case where both fail, because that's sort of the fly in the ointment here

λ> firstRightM [loadFromDisk "bad key", loadFromWeb "bad key"]
local: bad key
web: bad key
*** Exception: user error (mzero)

It's not nice to be back to dealing with exceptions, but it's possible to recover, e.g. by appending <|> pure 0.

λ> firstRightM [loadFromDisk "bad key", loadFromWeb "bad key"] <|> pure 0
local: bad key
web: bad key
0

However, that removes the ability to deal with the situation where all actions fail. Not nice! Add to that the difficulty of coming up with a good MonadPlus instance for an application monad; one basically has to resort to the same thing as for IO, i.e. to throw an exception. Also not nice!
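One way to sidestep the exception, sketched here under the assumption that losing the error messages is acceptable, is to run the same firstRightM in MaybeT IO (from transformers) instead of in IO directly. The name firstRightMaybe is hypothetical. For MaybeT, mzero is simply Nothing, so the case where everything fails shows up as an ordinary value.

```haskell
import Control.Monad (MonadPlus, mzero)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Maybe (runMaybeT)
import Data.Foldable (asum)

-- Same definition as in the post.
firstRightM :: MonadPlus m => [m (Either err res)] -> m res
firstRightM = asum . fmap go
  where
    go m = m >>= either (const mzero) return

-- Lift each IO action into MaybeT IO before combining, so that a Left
-- becomes Nothing instead of an IOException. The error values are lost.
firstRightMaybe :: [IO (Either err res)] -> IO (Maybe res)
firstRightMaybe = runMaybeT . firstRightM . fmap lift
```

With the loaders from the beginning of the post, firstRightMaybe [loadFromDisk "bad key", loadFromWeb "bad key"] should return Nothing without throwing, but there is still no way to find out why the actions failed.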

Fourth attempt: wrapping in ExceptT to get its Alternative behaviour

This was another suggestion from the Slack channel, and it is the one I like the most. Again it was suggested as a way to stop at the first successful action in a list of actions.

firstRightM ::
    (Foldable t, Functor t, Monad m, Monoid err) =>
    t (m (Either err res)) ->
    m (Either err res)
firstRightM = runExceptT . asum . fmap ExceptT

Which can be used similarly to the previous one. It's also easy to write a variant of fallBack for it.

fallBack ::
    (Monad m, Monoid err) =>
    m (Either err res) ->
    m (Either err res) ->
    m (Either err res)
fallBack first other = runExceptT $ ExceptT first <|> ExceptT other

λ> loadFromDisk "bad key" `fallBack` loadFromWeb "bad key"
local: bad key
web: good key
Right 8

λ> loadFromDisk "good key" `fallBack` undefined
local: good key
Right 8

λ> loadFromDisk "bad key" `fallBack` loadFromWeb "bad key"
local: bad key
web: bad key
Left "no such local key: bad keyno such remote key: bad key"

Yay! This solution has the short-circuiting behaviour I want, as well as collecting all errors on failure.
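Concatenating the error strings gives a rather unreadable message, as the last example shows. A small variation, assuming a list of messages is an acceptable error type, wraps each error in a singleton list before combining. This is only a sketch: fallBackList is a hypothetical name, and withExceptT comes from Control.Monad.Trans.Except in transformers.

```haskell
import Control.Applicative ((<|>))
import Control.Monad.Trans.Except (ExceptT (..), runExceptT, withExceptT)

-- Like fallBack, but every String error is first wrapped in a singleton
-- list, so the Monoid instance of lists keeps the failures as separate entries.
fallBackList ::
    Monad m =>
    m (Either String res) ->
    m (Either String res) ->
    m (Either [String] res)
fallBackList first other = runExceptT $ wrap first <|> wrap other
  where
    -- Turn an action into ExceptT and map its error into a one-element list.
    wrap = withExceptT (\e -> [e]) . ExceptT
```

When both loaders fail, the result should be a Left holding two separate messages, local first and remote second. The short-circuiting is unchanged, since it comes from the same Alternative instance of ExceptT.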

Conclusion

I'm still a little disappointed that liftA2 (<|>) isn't short-circuiting as I still think it's the easiest of the approaches. However, it's a problem that one has to rely on a deprecated instance of Alternative for Either String, but switching to use Validation would be only a minor change.

Manually writing the fallBack function, as I did in the second attempt, results in very explicit code which is nice as it often reduces the cognitive load for the reader. It's a contender, but using the deprecated Alternative instance is problematic and introducing Validation, an arguably not very common type, takes away a little of the appeal.

In the end I prefer the fourth attempt. It behaves exactly like I want and even though ExceptT lives in transformers I feel that it (I pull it in via mtl) is in such wide use that most Haskell programmers will be familiar with it.

One final thing to add is that the documentation of Validation is an excellent inspiration when it comes to the behaviour of its instances. I wish that the documentation of other packages, in particular commonly used ones like base, transformers, and mtl, would be more like it.

Comments, feedback, and questions

[2021-11-28 Sun] Dustin Sallings

Dustin sent me a comment via email a while ago; it's now March 2022, so it's taken me embarrassingly long to publish it here.

I removed a bit from the beginning of the email as it doesn't relate to this post.

… a thing I've written code for before that I was reasonably pleased with. I have a suite of software for managing my GoPro media which involves doing some metadata extraction from images and video. There will be multiple transcodings of each medium with each that contains the metadata having it completely intact (i.e., low quality encodings do not lose metadata fidelity). I also run this on multiple machines and store a cache of Metadata in S3.

Sometimes, I've already processed the metadata on another machine. Often, I can get it from the lowest quality. Sometimes, there's no metadata at all. The core of my extraction looks like this:

ms <- asum [
  Just . BL.toStrict <$> getMetaBlob mid,
  fv "mp4_low" (fn "low"),
  fv "high_res_proxy_mp4" (fn "high"),
  fv "source" (fn "src"),
  pure Nothing]

The first version grabs the processed blob from S3. The next three fetch (and process) increasingly larger variants of the uploaded media. The last one just gives up and says there's no metadata available (and memoizes that in the local DB and S3).

Some of these objects are in the tens of gigs, and I had a really bad internet connection when I first wrote this software, so I needed it to work.

Footnotes:

1

I'm not sure if it's a good term to use in this case as Wikipedia says it's for Boolean operators. I hope it's not too far a stretch to use it in this context too.

2

In the version of base I'm using there is no asum, so I simply copied the implementation from a later version:

asum :: (Foldable t, Alternative f) => t (f a) -> f a
asum = foldr (<|>) empty

Tags: alternative_typeclass caching fallback haskell