14 Jul 2018

QuickCheck on a REST API

Since I'm working with web stuff nowadays, I thought I'd play a little with translating my old post on using QuickCheck to test C APIs to the web.

The goal and how to reach it

I want to use QuickCheck to test a REST API. Just like in the case of the C API, the idea is to

generate a sequence of API calls (a program), then
run the sequence against a model, as well as
run the sequence against the web service, and finally
compare the resulting model against reality.

The REST API

I'll use a small web service I'm working on, and then concentrate on only a small part of the API to begin with.

The parts of the API I'll use for the programs at this stage are

Method	Route	Example in	Example out
`POST`	`/users`	`{"userId": 0, "userName": "Yogi Berra"}`	`{"userId": 42, "userName": "Yogi Berra"}`
`DELETE`	`/users/:id`

The following API calls will also be used, but not in the programs

Method	Route	Example out
`GET`	`/users`	`[0,3,7]`
`GET`	`/users/:id`	`{"userId": 42, "userName": "Yogi Berra"}`
`POST`	`/reset`

Representing API calls

Given the information about the API above it seems the following is enough to represent the two calls of interest together with a constructor representing the end of a program

data ApiCall = AddUser Text
             | DeleteUser Int
             | EndProgram
             deriving (Show)

and a program is just a sequence of calls, so a list of ApiCall will do. However, since I want to generate sequences of calls, i.e. implement Arbitrary, I'll wrap it in a newtype

newtype Program = Prog [ApiCall]
                deriving (Show)

Running against a model (simulation)

First of all I need to decide what model to use. Based on the part of the API I'm using, an ordinary dictionary from Int to Text will do

type Model = M.Map Int Text

Simulating execution of a program is simulating each call against a model that's updated with each step. I expect the final model to correspond to the state of the real service after the program is run for real. The simulation begins with an empty dictionary.

simulateProgram :: Program -> Model
simulateProgram (Prog cs) = foldl simulateCall M.empty cs

The simulation of the API calls must then be a function taking a model and a call, returning an updated model

simulateCall :: Model -> ApiCall -> Model
simulateCall m (AddUser t) = M.insert k t m
  where
    k = succ $ foldl max 0 (M.keys m)
simulateCall m (DeleteUser k) = M.delete k m
simulateCall m EndProgram = m

Here I have to make a few assumptions. First, I assume the indices for the users start at 1. Second, I assume that the next index used is always the successor of the highest index currently in use. We'll see how well this holds up to reality later on.
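To make the two assumptions concrete, here is a small worked example that repeats the definitions above so it can be loaded on its own. The user names and the values firstUser and withGap are made up for illustration.

```haskell
import qualified Data.Map as M
import qualified Data.Text as T

data ApiCall = AddUser T.Text
             | DeleteUser Int
             | EndProgram
             deriving (Show)

newtype Program = Prog [ApiCall]

type Model = M.Map Int T.Text

simulateProgram :: Program -> Model
simulateProgram (Prog cs) = foldl simulateCall M.empty cs

simulateCall :: Model -> ApiCall -> Model
simulateCall m (AddUser t) = M.insert k t m
  where
    k = succ $ foldl max 0 (M.keys m)
simulateCall m (DeleteUser k) = M.delete k m
simulateCall m EndProgram = m

-- Assumption 1: the first user added to an empty model gets index 1.
firstUser :: Model
firstUser = simulateProgram (Prog [AddUser (T.pack "Yogi")])

-- Assumption 2: the next index follows the highest one in use, so the gap
-- left by deleting user 1 is not filled; the model ends up with keys 2 and 3.
withGap :: Model
withGap = simulateProgram $ Prog
  [ AddUser (T.pack "Yogi")
  , AddUser (T.pack "Boo-Boo")
  , DeleteUser 1
  , AddUser (T.pack "Cindy")
  ]
```

In GHCi, M.toList withGap shows the pairs (2, "Boo-Boo") and (3, "Cindy").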

Running against the web service

Running the program against the actual web service follows the same pattern, but here I'm dealing with the real world, so it's a little messier, i.e. IO is involved. First the running of a single call

runCall :: Manager -> ApiCall -> IO ()
runCall mgr (AddUser t) = do
  ireq <- parseRequest "POST http://localhost:3000/users"
  let req = ireq { requestBody = RequestBodyLBS (encode $ User 0 t)}
  resp <- httpLbs req mgr
  guard (status201 == responseStatus resp)

runCall mgr (DeleteUser k) = do
  req <- parseRequest $ "DELETE http://localhost:3000/users/" ++ show k
  resp <- httpNoBody req mgr
  guard (status200 == responseStatus resp)

runCall _ EndProgram = return ()
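A note on guard here: in IO a failed guard raises an IOError that only says "mzero", which makes it hard to tell which call went wrong. Below is a minimal sketch of a more talkative alternative using only base. The names expectEq and checked are hypothetical helpers, not part of http-client or QuickCheck.

```haskell
import Control.Exception (IOException, try)
import Control.Monad (unless)

-- | Hypothetical helper: like 'guard', but the raised IOError says what
-- was being checked, what was expected, and what was actually seen.
expectEq :: (Eq a, Show a) => String -> a -> a -> IO ()
expectEq what expected actual =
  unless (expected == actual) $
    ioError . userError $
      what ++ ": expected " ++ show expected ++ ", got " ++ show actual

-- | Runs an action and turns a failed expectation into a 'Left'.
checked :: IO a -> IO (Either IOException a)
checked = try
```

In runCall the guard lines could then read, for example, expectEq "POST /users" status201 (responseStatus resp), since Status has both Eq and Show instances.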

The running of a program is slightly more involved. Of course I have to set up the Manager needed for the HTTP calls, but I also need to

ensure that the web service is in a well-known state before starting, and
extract the state of the web service after running the program, so I can compare it to the model

runProgram :: Program -> IO Model
runProgram (Prog cs) = do
  mgr <- newManager defaultManagerSettings
  resetReq <- parseRequest "POST http://localhost:3000/reset"
  _ <- httpNoBody resetReq mgr
  mapM_ (runCall mgr) cs
  extractModel mgr

The call to POST /reset resets the web service. I would have liked to simply restart the service completely, but I failed to automate that. I think I'll have to take a closer look at the implementation of scotty to find a way.

Extracting the web service state and packaging it in a Model is a matter of calling GET /users and then calling GET /users/:id for each id returned by the first call

extractModel :: Manager -> IO Model
extractModel mgr = do
  req <- parseRequest "http://localhost:3000/users"
  resp <- httpLbs req mgr
  let (Just ids) = decode (responseBody resp) :: Maybe [Int]
  users <- forM ids $ \ uid -> do
    userReq <- parseRequest $ "http://localhost:3000/users/" ++ show uid
    userResp <- httpLbs userReq mgr
    let (Just (user:_)) = decode (responseBody userResp) :: Maybe [User]
    return user
  return $ foldl (\ acc (User uid name) -> M.insert uid name acc) M.empty users
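The User type used above isn't shown anywhere in this post. A minimal sketch of what it might look like follows, assuming the field names match the JSON keys in the API tables and that aeson's generic instances are used. The helper toModel and the value exampleUser are hypothetical names added for illustration.

```haskell
{-# LANGUAGE DeriveGeneric #-}

import Data.Aeson (FromJSON, ToJSON, decode, encode)
import qualified Data.ByteString.Lazy.Char8 as BL
import qualified Data.Map as M
import qualified Data.Text as T
import GHC.Generics (Generic)

-- Assumed shape of User: the record fields are named after the JSON keys,
-- so the default generic instances read and write {"userId": .., "userName": ..}.
data User = User { userId :: Int, userName :: T.Text }
  deriving (Show, Eq, Generic)

instance FromJSON User
instance ToJSON User

type Model = M.Map Int T.Text

-- | The pure part of extractModel. As with the fold, a later duplicate id
-- overwrites an earlier one.
toModel :: [User] -> Model
toModel users = M.fromList [ (uid, name) | User uid name <- users ]

-- | The "Example out" body from the table, decoded into a User.
exampleUser :: Maybe User
exampleUser = decode (BL.pack "{\"userId\": 42, \"userName\": \"Yogi Berra\"}")
```

Keeping the packaging step pure like this makes it possible to test it without a running service.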

Generating programs

My approach to generating a program is based on the idea that given a certain state there is only a limited number of possible calls that make sense. Given a model m it makes sense to make one of the following calls:

add a new user
delete an existing user
end the program

Based on this, writing genProgram is rather straightforward

genProgram :: Gen Program
genProgram = Prog <$> go M.empty
  where
    possibleAddUser _ = [AddUser <$> arbitrary]
    possibleDeleteUser m = map (return . DeleteUser) (M.keys m)
    possibleEndProgram _ = [return EndProgram]

    go m = do
      let possibles = possibleDeleteUser m ++ possibleAddUser m ++ possibleEndProgram m
      s <- oneof possibles
      let m' = simulateCall m s
      case s of
        EndProgram -> return []
        _ -> (s:) <$> go m'

Armed with that the Arbitrary instance for Program can be implemented as¹

instance Arbitrary Program where
  arbitrary = genProgram
  shrink _ = []

The property of an API

The steps in the first section can be used as a recipe for writing the property

prop_progCorrectness :: Program -> Property
prop_progCorrectness program = monadicIO $ do
  let simulatedModel = simulateProgram program
  runModel <- run $ runProgram program
  assert $ simulatedModel == runModel
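To get a feel for what the comparison catches without having a service running, the same check can be sketched against an in-memory stand-in. Everything named Fake, fakeCall, runFake and agrees is hypothetical. The counter that never goes backwards is only an assumption about how a service might hand out ids, not a claim about the service tested here.

```haskell
import qualified Data.Map as M
import qualified Data.Text as T

data ApiCall = AddUser T.Text | DeleteUser Int | EndProgram
  deriving (Show)

newtype Program = Prog [ApiCall]
  deriving (Show)

type Model = M.Map Int T.Text

simulateCall :: Model -> ApiCall -> Model
simulateCall m (AddUser t) = M.insert (succ $ foldl max 0 (M.keys m)) t m
simulateCall m (DeleteUser k) = M.delete k m
simulateCall m EndProgram = m

simulateProgram :: Program -> Model
simulateProgram (Prog cs) = foldl simulateCall M.empty cs

-- Hypothetical stand-in for the service: ids come from a counter that
-- starts at 1 and is never decremented, not even after a delete.
data Fake = Fake { nextId :: Int, users :: Model }

fakeCall :: Fake -> ApiCall -> Fake
fakeCall (Fake n us) (AddUser t) = Fake (n + 1) (M.insert n t us)
fakeCall (Fake n us) (DeleteUser k) = Fake n (M.delete k us)
fakeCall f EndProgram = f

runFake :: Program -> Model
runFake (Prog cs) = users $ foldl fakeCall (Fake 1 M.empty) cs

-- | The pure counterpart of the property: do model and stand-in agree?
agrees :: Program -> Bool
agrees p = simulateProgram p == runFake p
```

With this stand-in, agrees holds for a program that only adds users, but fails for Prog [AddUser "a", DeleteUser 1, AddUser "b"]: the model reuses index 1 while the counter moves on to 2. That is the kind of mismatch the real property would report if the service allocated ids this way.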

What next?

There are some improvements that I'd like to make:

Make the generation of Program better in the sense that the programs become longer. I think this is important as I start tackling larger APIs.
Write an implementation of shrink for Program. With longer programs it's of course more important to actually implement shrink.
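On the first point: oneof picks uniformly, so with n users in the model EndProgram is chosen with probability 1/(n+2), and half of all programs are empty. One possible way to get longer programs is to weight the choices with frequency. The sketch below assumes a weight parameter w and a simple lower-case name generator genName instead of an Arbitrary instance for Text. The names genLongerProgram and samplePrograms are hypothetical.

```haskell
import qualified Data.Map as M
import qualified Data.Text as T
import Test.QuickCheck (Gen, elements, frequency, generate, listOf1, vectorOf)

data ApiCall = AddUser T.Text | DeleteUser Int | EndProgram
  deriving (Show)

newtype Program = Prog [ApiCall]
  deriving (Show)

type Model = M.Map Int T.Text

simulateCall :: Model -> ApiCall -> Model
simulateCall m (AddUser t) = M.insert (succ $ foldl max 0 (M.keys m)) t m
simulateCall m (DeleteUser k) = M.delete k m
simulateCall m EndProgram = m

-- | Like genProgram, but EndProgram has weight 1 while every other possible
-- call has weight w (w must not be negative). With n users in the model a
-- step ends the program with probability 1 / (1 + w * (n + 1)).
genLongerProgram :: Int -> Gen Program
genLongerProgram w = Prog <$> go M.empty
  where
    -- Assumed stand-in for 'arbitrary': a non-empty lower-case name.
    genName = T.pack <$> listOf1 (elements ['a' .. 'z'])

    go m = do
      let continues = (AddUser <$> genName) : map (pure . DeleteUser) (M.keys m)
      s <- frequency $ (1, pure EndProgram) : [ (w, g) | g <- continues ]
      case s of
        EndProgram -> pure []
        _          -> (s :) <$> go (simulateCall m s)

-- | Draws n programs with weight 5, handy for eyeballing lengths in GHCi.
samplePrograms :: Int -> IO [Program]
samplePrograms n = generate (vectorOf n (genLongerProgram 5))
```

With w = 5 the first step ends the program with probability 1/6 instead of 1/2. Whether that is a good weight depends on the API, so treat it as a knob to experiment with.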

I'd love to hear if others are using QuickCheck to test REST APIs in some way, if anyone has suggestions for improvements, and of course ideas for how to implement shrink in a nice way.
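As a possible starting point for shrink, here is a minimal sketch that drops one call at a time and keeps only the candidates that still make sense according to the model, i.e. where every DeleteUser targets an id that exists at that point. The names isValid and shrinkProgram are hypothetical, and validity is judged purely by the model's id assumptions.

```haskell
import qualified Data.Map as M
import qualified Data.Text as T

data ApiCall = AddUser T.Text | DeleteUser Int | EndProgram
  deriving (Show, Eq)

newtype Program = Prog [ApiCall]
  deriving (Show, Eq)

type Model = M.Map Int T.Text

simulateCall :: Model -> ApiCall -> Model
simulateCall m (AddUser t) = M.insert (succ $ foldl max 0 (M.keys m)) t m
simulateCall m (DeleteUser k) = M.delete k m
simulateCall m EndProgram = m

-- | True when every DeleteUser refers to an id that is present in the
-- model at the point where the call is made.
isValid :: [ApiCall] -> Bool
isValid = go M.empty
  where
    go _ [] = True
    go m (c@(DeleteUser k) : cs) = M.member k m && go (simulateCall m c) cs
    go m (c : cs) = go (simulateCall m c) cs

-- | Candidate shrinks: remove exactly one call, discard invalid programs.
shrinkProgram :: Program -> [Program]
shrinkProgram (Prog cs) =
  [ Prog cs'
  | i <- [0 .. length cs - 1]
  , let cs' = take i cs ++ drop (i + 1) cs
  , isValid cs'
  ]
```

It could be wired in with shrink = shrinkProgram in the Arbitrary instance. Note that dropping an AddUser shifts the ids of later users, so a surviving DeleteUser may end up targeting a different user than in the original program. The shrunk program is still a valid test case, just not necessarily the same scenario.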

Footnotes:

¹ Yes, I completely skip the issue of shrinking programs at this point. That's OK for now though, because the generated Programs do end up being very short indeed.