17 Feb 2019

Choosing a conduit randomly

Lately I've been playing around conduit. One thing I wanted to try out was to set up processing where one processing step was chosen on random from a number of components, based on weights. In short I guess I wanted a function with a type something like this

foo :: [(Int, ConduitT i o m r)] -> ConduitT i o m r

I have to admit I don't even know where to start writing such a function1 but after a little bit of thinking I realised I could get the same effect by controlling how chunks of data is routed. That is, instead of choosing a component randomly, I can choose a route randomly. It would look something like when choosing from three components

                        +---------+   +----------+   +-------------+
                        | Filter  |   | Drop tag |   | Component A |
                    +-->| Value-0 |-->|          |-->|             |--+
                    |   +---------+   +----------+   +-------------+  |
+----------------+  |   +---------+   +----------+   +-------------+  |
| Choose random  |  |   | Filter  |   | Drop tag |   | Component B |  |
| value based on +----->| Value-1 |-->|          |-->|             |----->
| weights        |  |   +---------+   +----------+   +-------------+  |
+----------------+  |   +---------+   +----------+   +-------------+  |
                    |   | Filter  |   | Drop tag |   | Component C |  |
                    +-->| Value-2 |-->|          |-->|             |--+
                        +---------+   +----------+   +-------------+

That is

  1. For each chunk that comes in, choose a value randomly based on weights and tag the chunk with the choosen value, then
  2. split the processing into one route for each component,
  3. in each route filter out chunks tagged with a single value, and
  4. remove the tag, then
  5. pass the chunk to the component, and finally
  6. bring the routes back together again.

Out of these steps all but the very first one are already available in conduit:

What's left is the beginning. I started with a function to pick a value on random based on weights2

pickByWeight :: [(Int, b)] -> IO b
pickByWeight xs = randomRIO (1, tot) >>= \ n -> return (pick n xs)
  where
    tot = sum $ map fst xs

    pick n ((k, x):xs)
      | n <= k = x
      | otherwise = pick (n - k) xs
    pick _ _ = error "pick error"

Using that I then made a component that tags chunks

picker ws = do
  evt <- await
  case evt of
    Nothing -> return ()
    Just e -> do
      p <- liftIO $ pickByWeight ws
      yield (p, e)
      picker ws

I was rather happy with this…

@snoyberg just have to let you know, conduit is a joy to use. Thanks for sharing it.

– Magnus Therning (@magthe) February 6, 2019

Footnotes:

1

Except maybe by using Template Haskell to generate the code I did come up with.

2

I used Quickcheck's frequency as inspiration for writing it.

Tags: haskell conduit