Optparse-applicative and custom argument parsers

The latest update of optparse-applicative prompted me to look over the functions in cblrepo for parsing a few custom command line options. I used to do the parsing in a rather ad-hoc way, with lots of list functions to split on specific characters. For instance, some option values are pairs of package name and version separated by a comma: PKG-NAME,VERSION. This worked fine and was easy to plug into version 0.10 of optparse-applicative. It was also easily extended to triples, PKG-NAME,VERSION,RELEASE, but it started feeling a bit brittle when some tuples got extended with an optional list of flag assignments, PKG-NAME,VERSION[:FLAG,FLAG,FLAG,...]. The recent release of version 0.11 of optparse-applicative changed the API for custom option value parsers radically; instead of a string being passed to the parser, the parser has to use readerAsk to get the string. In short, ReadM turned into a reader monad.

In adjusting to the new API I noticed that the code was organised in such a way that some low-level parsing functions were used directly from command line option definitions, while also being used as building blocks for the more complex parsers. This of course meant that the functions had to be restructured completely to deal with the API change.

It turns out there already was a parser that was written in a different style (here already adjusted to the 0.11 API):

readerGhcVersion :: ReadM Version
readerGhcVersion = do
    arg <- readerAsk
    case lastMay $ readP_to_S parseVersion arg of
        Just (v, "") -> return v
        _ -> fail $ "cannot parse value `" ++ arg ++ "`"
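
The reason for looking at the last result is that readP_to_S returns every successful parse together with its unconsumed input, and parseVersion can stop after any version component. Here is a minimal sketch of that pattern using only base; the names parseFully and partialParses are made up for illustration and are not part of cblrepo:

```haskell
import Data.Version (Version, makeVersion, parseVersion)
import Text.ParserCombinators.ReadP (ReadP, readP_to_S)

-- | Hypothetical helper: run a ReadP parser and accept the result only
-- if the last reported parse consumed the whole input.
parseFully :: ReadP a -> String -> Maybe a
parseFully p s = case reverse (readP_to_S p s) of
    ((v, "") : _) -> Just v
    _             -> Nothing

-- | Every parse that parseVersion reports for "7.8.3", each paired with
-- the input it left unconsumed.
partialParses :: [(Version, String)]
partialParses = readP_to_S parseVersion "7.8.3"

-- | Example values to try in GHCi.
goodVersion, badVersion :: Maybe Version
goodVersion = parseFully parseVersion "7.8.3"
badVersion  = parseFully parseVersion "7.8.x"
```

Evaluating map snd partialParses in GHCi should show the remainders shrinking down to the empty string, which is why matching the last element against (v, "") accepts only a complete parse.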

So I rewrote the rest of the parsers in a similar style. Arguably the most complicated one is this:

readPkgNVersion :: ReadP (String, Version)
readPkgNVersion = do
    n <- many (satisfy (/= ','))
    char ','
    v <- parseVersion
    return (n, v)

readFlag :: ReadP (FlagName, Bool)
readFlag = readNegFlag <++ readPosFlag
    where
        readNegFlag = do
            char '-'
            n <- many (satisfy (/= ','))
            return (FlagName n, False)

        readPosFlag = do
            n0 <- get
            n <- many (satisfy (/= ','))
            return (FlagName (n0 : n), True)

strCblPkgArgReader :: ReadM (String, Version, FlagAssignment)
strCblPkgArgReader = let
        readWithFlags = do
            (n, v) <- readPkgNVersion
            char ':'
            fas <- sepBy readFlag (char ',')
            return (n, v, fas)

        readWithoutFlags = do
            (n, v) <- readPkgNVersion
            return (n, v, [])

    in do
        s <- readerAsk
        case lastMay (readP_to_S (readWithFlags <++ readWithoutFlags) s) of
            Just (r, "") -> return r
            _ -> fail $ "Cannot parse: " ++ s

It is slightly longer, but after this rewrite it is a lot easier to see what is happening. ReadP feels like a lighter option than pulling in parsec as a dependency, but I’d love to hear any comments or suggestions, as well as pointers to how other people deal with parsing of non-trivial types of arguments in combination with optparse-applicative.
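
One possible variation, offered as a sketch rather than as what cblrepo does: ending the parser with eof and using the greedy munch family means only complete parses are reported, so there is no need to pick the last result. The sketch uses only base, so a plain (String, Bool) pair stands in for Cabal's flag type, and the names pkgArg and parsePkgArg are hypothetical. Its behaviour on odd inputs may differ from the parsers above.

```haskell
import Data.Version (Version, makeVersion, parseVersion)
import Text.ParserCombinators.ReadP

-- Stand-in for Cabal's flag type, so that the sketch only needs base.
type Flag = (String, Bool)

-- | PKG-NAME,VERSION[:FLAG,FLAG,...], where a leading '-' negates a flag.
pkgArg :: ReadP (String, Version, [Flag])
pkgArg = do
    n   <- munch1 (/= ',')
    _   <- char ','
    v   <- parseVersion
    fas <- option [] (char ':' *> sepBy flag (char ','))
    eof  -- reject any parse that leaves input behind
    return (n, v, fas)
  where
    flag    = negFlag <++ posFlag
    negFlag = do
        _ <- char '-'
        f <- munch (/= ',')
        return (f, False)
    posFlag = do
        f <- munch1 (/= ',')
        return (f, True)

-- | Pure wrapper; in a ReadM reader the Left case would become a failure.
parsePkgArg :: String -> Either String (String, Version, [Flag])
parsePkgArg s = case readP_to_S pkgArg s of
    [(r, "")] -> Right r
    _         -> Left ("Cannot parse: " ++ s)

-- | Example values to try in GHCi.
withFlags, withoutFlags :: Either String (String, Version, [Flag])
withFlags    = parsePkgArg "foo,1.2.3:-bar,baz"
withoutFlags = parsePkgArg "foo,1.2.3"
```

Keeping the ReadP part pure like this also makes it easy to try out in GHCi without going through the command line parser at all.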
