Is this a good way to do JSON validation?
- Magnus Therning
At work, where we use Clojure, we’ve been improving our error messages in the public API to
- return as many errors as possible in a response, and
- be in humanly readable English.
If one adopts spec, as we have, one gets the former for free, but the output of spec can hardly be called humanly readable. For the latter part we chose to use phrase.
Given that I’d like to see Haskell used more at work (currently there are 2 minor services written in Haskell and around a score in Clojure) I thought I’d take a look at JSON validation in Haskell. I ended up being less than impressed. We have at least one great library for parsing JSON, aeson, but there are probably a few more that I haven’t noticed. It’s of course possible to mix in validation with the parsing, but parsers, and this is true for aeson’s parser too, tend to be monads, and that means that item 1 above, finding as many errors as possible, isn’t on the table.
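To make the short-circuiting concrete, here is a minimal sketch using plain Either from base rather than aeson's own Parser. The checks checkName and checkAge are hypothetical and only there for illustration; the point is that monadic sequencing stops at the first failure, while running the checks independently lets every failure be collected.

```haskell
-- Hypothetical field checks, not taken from any library.
checkName :: String -> Either String String
checkName n
  | null n    = Left "name is empty"
  | otherwise = Right n

checkAge :: Int -> Either String Int
checkAge a
  | a < 0     = Left "age is negative"
  | otherwise = Right a

-- Monadic composition: checkAge never runs if checkName fails,
-- so at most one error is reported.
monadic :: String -> Int -> Either String (String, Int)
monadic n a = do
  n' <- checkName n
  a' <- checkAge a
  pure (n', a')

-- Running each check on its own and keeping every Left reports all failures.
accumulated :: String -> Int -> [String]
accumulated n a = [e | Left e <- [() <$ checkName n, () <$ checkAge a]]
```

Loaded in GHCi, monadic "" (-1) gives Left "name is empty", whereas accumulated "" (-1) gives both messages.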
A quick look at Hackage showed that
- there is a package called aeson-better-errors that looked promising but didn’t fit my needs (I explain at the end why it isn’t passing muster)
- the support for JSON Schema is very lacking in Haskell: hjsonschema is deprecated and aeson-schema only supports version 3 of the draft (the current version is 7), and its authors claim that hjsonschema is more modern and more actively maintained
So, a bit disappointed, I started playing with the problem myself and found that, just as is stated in the description of the validation library, I want something that’s isomorphic to Either but accumulates on the error side. That is, something like
data JSONValidationResult = JVRInvalid [JSONValidationFailure]
                          | JVRValid
                          deriving (Eq, Show)

instance Semigroup JSONValidationResult where
  (JVRInvalid es0) <> (JVRInvalid es1) = JVRInvalid $ es0 <> es1
  JVRValid <> r = r
  r <> JVRValid = r
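The post never shows the definition of JSONValidationFailure. Judging by the GHCi output further down (JVFDesc "..." and JVFPath "age" (...)), it presumably looks something like the sketch below; treat the exact shape, and the use of Text, as an assumption. The Monoid instance is an addition that is not in the post.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)

-- Assumed shape, reconstructed from the GHCi output shown later:
-- either a plain description, or a failure nested under an object key.
data JSONValidationFailure
  = JVFDesc Text
  | JVFPath Text JSONValidationFailure
  deriving (Eq, Show)

data JSONValidationResult
  = JVRInvalid [JSONValidationFailure]
  | JVRValid
  deriving (Eq, Show)

instance Semigroup JSONValidationResult where
  JVRInvalid es0 <> JVRInvalid es1 = JVRInvalid (es0 <> es1)
  JVRValid <> r = r
  r <> JVRValid = r

-- Not in the post: JVRValid is an identity for <>, so a Monoid is possible.
instance Monoid JSONValidationResult where
  mempty = JVRValid

-- Combining results keeps every failure, in order.
example :: JSONValidationResult
example =
  JVRInvalid [JVFDesc "not a string"] <> JVRValid <> JVRInvalid [JVFDesc "not an integer"]
```

Here example evaluates to JVRInvalid [JVFDesc "not a string",JVFDesc "not an integer"].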
I decided it was all right to limit validation to proper JSON expressions, i.e. a validator could have the type Value -> JSONValidationResult. I want to combine validators, so I decided to wrap it in a newtype and write a Semigroup instance for it as well:
newtype JSONValidator = JV (A.Value -> JSONValidationResult)

instance Semigroup JSONValidator where
  (JV v0) <> (JV v1) = JV $ \ val -> v0 val <> v1 val
The function to actually run the validation is rather straightforward:
runJSONValidator (JV validator) val = validator val
After writing a few validators I realised a few patterns emerged and the following functions simplified things a bit:
mapInvalid _ JVRValid = JVRValid
mapInvalid f (JVRInvalid es) = JVRInvalid $ map f es

valid = JVRValid
invalid s = JVRInvalid [JVFDesc s]
With this in place I started writing validators for the basic JSON types:
isNumber = JV go
  where
    go (A.Number _) = valid
    go _ = invalid "not a number"

isString = JV go
  where
    go (A.String _) = valid
    go _ = invalid "not a string"

isBool = JV go
  where
    go (A.Bool _) = valid
    go _ = invalid "not a bool"

isNull = JV go
  where
    go A.Null = valid
    go _ = invalid "not 'null'"
The number type in JSON is a float (well, in aeson it’s a Scientific), so to check for an integer a bit more than the above is needed
isInt = JV go
  where
    go (A.Number i) = if i == fromInteger (round i)
                      then valid
                      else invalid "not an integer"
    go _ = invalid "not an integer"
as well as functions that check for the presence of a specific key
reqKey n v = JV go
  where
    go (A.Object obj) = case HM.lookup n obj of
      Nothing -> invalid $ "required key '" <> n <> "' is missing"
      Just val -> mapInvalid (JVFPath n) $ runJSONValidator v val
    go _ = invalid "not an object"

optKey n v = JV go
  where
    go (A.Object obj) = case HM.lookup n obj of
      Nothing -> valid
      Just val -> mapInvalid (JVFPath n) $ runJSONValidator v val
    go _ = invalid "not an object"
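The HM.lookup calls above match aeson versions before 2.0, where Object wraps a HashMap keyed by Text. From aeson 2.0 on, Object holds a KeyMap keyed by Key, so on a newer aeson a sketch of reqKey might look as follows. The failure type is the shape assumed from the GHCi output later in the post, and the helper definitions are repeated only to keep the sketch self-contained.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Aeson as A
import qualified Data.Aeson.Key as Key
import qualified Data.Aeson.KeyMap as KM
import Data.Text (Text)

-- Assumed shape of the failure type (not shown in the post).
data JSONValidationFailure = JVFDesc Text | JVFPath Text JSONValidationFailure
  deriving (Eq, Show)

data JSONValidationResult = JVRInvalid [JSONValidationFailure] | JVRValid
  deriving (Eq, Show)

instance Semigroup JSONValidationResult where
  JVRInvalid es0 <> JVRInvalid es1 = JVRInvalid (es0 <> es1)
  JVRValid <> r = r
  r <> JVRValid = r

newtype JSONValidator = JV (A.Value -> JSONValidationResult)

runJSONValidator :: JSONValidator -> A.Value -> JSONValidationResult
runJSONValidator (JV validator) = validator

mapInvalid :: (JSONValidationFailure -> JSONValidationFailure)
           -> JSONValidationResult -> JSONValidationResult
mapInvalid _ JVRValid = JVRValid
mapInvalid f (JVRInvalid es) = JVRInvalid (map f es)

isString :: JSONValidator
isString = JV go
  where
    go (A.String _) = JVRValid
    go _ = JVRInvalid [JVFDesc "not a string"]

-- Same logic as the post's reqKey, but the key is converted to a Key
-- and looked up in a KeyMap (aeson >= 2.0).
reqKey :: Text -> JSONValidator -> JSONValidator
reqKey n v = JV go
  where
    go (A.Object obj) = case KM.lookup (Key.fromText n) obj of
      Nothing  -> JVRInvalid [JVFDesc ("required key '" <> n <> "' is missing")]
      Just val -> mapInvalid (JVFPath n) (runJSONValidator v val)
    go _ = JVRInvalid [JVFDesc "not an object"]
```

optKey changes in the same way: only the lookup line differs from the version above.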
With this in place I can now create a validator for a person with a name and an age:
vPerson = reqKey "name" isString <>
          reqKey "age" isInt
and run it on a Value:
*> runJSONValidator vPerson <$> (decode "{\"name\": \"Alice\", \"age\": 32}" :: Maybe Value)
Just JVRValid
and all failures are picked up
*> runJSONValidator vPerson <$> (decode "{\"name\": \"Alice\", \"age\": \"foo\"}" :: Maybe Value)
Just (JVRInvalid [JVFPath "age" (JVFDesc "not an integer")])
*> runJSONValidator vPerson <$> (decode "{\"name\": \"Alice\"}" :: Maybe Value)
Just (JVRInvalid [JVFDesc "required key 'age' is missing"])
*> runJSONValidator vPerson <$> (decode "{\"nam\": \"Alice\"}" :: Maybe Value)
Just (JVRInvalid [JVFDesc "required key 'name' is missing",JVFDesc "required key 'age' is missing"])
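Since the goal stated at the top is messages in readable English, the nested failures still need flattening into text. The following is a rough sketch of one way to do that; renderFailure and renderResult are hypothetical names, the dotted path format is an arbitrary choice, and the failure type is the shape assumed from the output above.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as T

-- Assumed shape of the failure type (not shown in the post).
data JSONValidationFailure = JVFDesc Text | JVFPath Text JSONValidationFailure
  deriving (Eq, Show)

data JSONValidationResult = JVRInvalid [JSONValidationFailure] | JVRValid
  deriving (Eq, Show)

-- Walk down the nested JVFPath wrappers collecting keys, then prefix the
-- description with the dotted path. Failures without a path stay as they are.
renderFailure :: JSONValidationFailure -> Text
renderFailure = go []
  where
    go path (JVFPath key inner) = go (key : path) inner
    go []   (JVFDesc msg)       = msg
    go path (JVFDesc msg)       = T.intercalate "." (reverse path) <> ": " <> msg

-- One message per failure; a valid result yields no messages.
renderResult :: JSONValidationResult -> [Text]
renderResult JVRValid = []
renderResult (JVRInvalid es) = map renderFailure es
```

With this, the failure JVFPath "age" (JVFDesc "not an integer") renders as "age: not an integer".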
Reflections
- I quickly realised I wanted slightly more complex validation of course, so all the validators for basic JSON types above have a version taking a custom validator of type a -> JSONValidationResult (where a is the Haskell type contained in the particular Value).
- I started out thinking that I want an Applicative for my validations, but slowly I relaxed that to Semigroup. I’m still not sure about this decision, because I can see a real use for "or", which I don’t really have now. Maybe that means I should switch back towards Applicative, just so I can implement an Alternative instance for validators.
- Well, I simply don’t know if this is even a good way to implement validators. I’d love to hear suggestions both for improvements and for completely different ways of tackling the problems.
- I would love to find out that there already is a library that does all this in a much better way. Please point me in its direction!
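Two of the points above can be sketched without leaving the Semigroup design. The names isStringWith, nonEmpty, orElse and nullableName are all hypothetical, and the failure type is the assumed shape used earlier. Note that orElse is an ordinary function, not an Alternative instance (JSONValidator has no type parameter), and reporting the failures of both branches when neither succeeds is just one possible choice.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Aeson as A
import Data.Text (Text)
import qualified Data.Text as T

-- Assumed shape of the failure type (not shown in the post).
data JSONValidationFailure = JVFDesc Text | JVFPath Text JSONValidationFailure
  deriving (Eq, Show)

data JSONValidationResult = JVRInvalid [JSONValidationFailure] | JVRValid
  deriving (Eq, Show)

instance Semigroup JSONValidationResult where
  JVRInvalid es0 <> JVRInvalid es1 = JVRInvalid (es0 <> es1)
  JVRValid <> r = r
  r <> JVRValid = r

newtype JSONValidator = JV (A.Value -> JSONValidationResult)

runJSONValidator :: JSONValidator -> A.Value -> JSONValidationResult
runJSONValidator (JV validator) = validator

invalid :: Text -> JSONValidationResult
invalid s = JVRInvalid [JVFDesc s]

isNull :: JSONValidator
isNull = JV go
  where
    go A.Null = JVRValid
    go _ = invalid "not 'null'"

-- A string validator that hands the contained Text to a custom check.
isStringWith :: (Text -> JSONValidationResult) -> JSONValidator
isStringWith check = JV go
  where
    go (A.String s) = check s
    go _ = invalid "not a string"

-- Example custom check on the contained Text.
nonEmpty :: Text -> JSONValidationResult
nonEmpty s
  | T.null s  = invalid "empty string"
  | otherwise = JVRValid

-- "or": valid if either side is valid, otherwise the failures of both sides.
orElse :: JSONValidator -> JSONValidator -> JSONValidator
orElse (JV v0) (JV v1) = JV $ \val ->
  case (v0 val, v1 val) of
    (JVRValid, _) -> JVRValid
    (_, JVRValid) -> JVRValid
    (e0, e1)      -> e0 <> e1

-- Accepts null or a non-empty string.
nullableName :: JSONValidator
nullableName = isNull `orElse` isStringWith nonEmpty
```

Running nullableName on A.String "" yields both "not 'null'" and "empty string", which shows the cost of this choice: the messages of an "or" describe every branch that failed rather than a single cause.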
Appendix: A look at aeson-better-errors
The issue with aeson-better-errors is easiest to illustrate using the same example as in its announcement:
{-# LANGUAGE OverloadedStrings #-}
module Play where
import Data.Aeson
import Data.Aeson.BetterErrors
data Person = Person String Int
deriving (Show)
asPerson :: Parse e Person
asPerson = Person <$> key "name" asString <*> key "age" asIntegral
and with this loaded in GHCi (and make sure to either pass -XOverloadedStrings on the command line, or :set -XOverloadedStrings in GHCi itself)
*> parse asPerson "{\"name\": \"Alice\", \"age\": 32}"
Right (Person "Alice" 32)
*> parse asPerson "{\"name\": \"Alice\"}"
Left (BadSchema [] (KeyMissing "age"))
*> parse asPerson "{\"nam\": \"Alice\"}"
Left (BadSchema [] (KeyMissing "name"))
Clearly aeson-better-errors isn’t fulfilling the bit about reporting as many errors as possible. I would have realised that right away if I had bothered reading its API reference on Hackage a bit more carefully: the parser type ParseT is an instance of Monad!