Listing files in Haskell

As I promised earlier, here’s a post on my experiments with files and directories in Haskell. This was a few days ago, so I’ve forgotten a few of the twists and turns that took me to the goal. Forgive me for that.

First, my goal was to list all files below a directory, recursively. I was sort of hoping to find something similar to Python’s os.walk(). No such luck!

I found out a few things.

  1. getDirectoryContents returns everything in a directory, including . and .. (the current and parent directory entries). I needed a filter to remove them:

     isDODD f = not $ (endswith "/." f) || (endswith "/.." f)

    (At first I called it isDotOrDotDot but I like isDODD better. Despite the name, it returns True for entries that are not . or .., which is what a filter needs.)

  2. I also needed to separate out the directories and files from the result of getDirectoryContents:

     listDirs = filterM doesDirectoryExist
     listFiles = filterM doesFileExist

  3. getDirectoryContents returns a list of the contents of the directory you point it to. All file and directory names are relative to that path, so the next thing I needed was a way to join paths. At first I couldn’t believe that there wasn’t a function to do that. I mean, I can list the contents of a directory, I can find out if something’s a file or a directory, but the most basic manipulation of paths isn’t there. I started out by simply concatenating strings, without worrying about making it cross-platform. Then I found that Cabal comes with a library that handles cross-platform issues properly, but that library was “closed”. After moaning (well, asking) on haskell-cafe, I found FilePath. It’s even packaged for Debian.

    FilePath.joinPath takes a list of strings to join, while I was only interested in joining two strings at a time:

     joinFN p1 p2 = joinPath [p1, p2]
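
The three helpers above can also be sketched with nothing but base and the filepath package. This is only an illustration under a few assumptions: the names notDotOrDotDot and endsWith' are made up for this sketch (endsWith' stands in for MissingH's endswith), and the module is assumed to be System.FilePath as shipped with current GHC installations rather than the FilePath package mentioned above.

```haskell
import Data.List (isSuffixOf)
import System.FilePath ((</>))

-- Hypothetical helper: True for every entry except the "." and ".."
-- pseudo-entries. It checks the bare entry name, so it is meant to be
-- applied before the name is joined to its parent path.
notDotOrDotDot :: FilePath -> Bool
notDotOrDotDot name = name `notElem` [".", ".."]

-- Hypothetical stand-in for MissingH's endswith: does str end with suffix?
endsWith' :: String -> String -> Bool
endsWith' suffix str = suffix `isSuffixOf` str

-- Join two path components using the platform's separator.
joinFN :: FilePath -> FilePath -> FilePath
joinFN p1 p2 = p1 </> p2
```

Filtering on the bare name before joining avoids hard-coding "/" in the test, which may matter on platforms that use a different path separator.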

Putting it all together I ended up with the following:

listFilesR :: FilePath -> IO [FilePath]
listFilesR path = let
    isDODD :: String -> Bool
    isDODD f = not $ (endswith "/." f) || (endswith "/.." f)

    listDirs :: [FilePath] -> IO [FilePath]
    listDirs = filterM doesDirectoryExist

    listFiles :: [FilePath] -> IO [FilePath]
    listFiles = filterM doesFileExist

    joinFN :: String -> String -> FilePath
    joinFN p1 p2 = joinPath [p1, p2]

    in do
        allfiles <- getDirectoryContents path
        -- isDODD is pure, so a plain filter is enough here
        let no_dots = filter isDODD (map (joinFN path) allfiles)
        dirs <- listDirs no_dots
        -- recurse into every subdirectory and flatten the results
        subdirfiles <- fmap concat (mapM listFilesR dirs)
        files <- listFiles no_dots
        return $ files ++ subdirfiles
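
For readers who want to try this without MissingH, here is a minimal self-contained sketch of the same idea. It is not the code from the post: the name listFilesRec is made up for this example, and it assumes the directory and filepath packages (System.Directory and System.FilePath) are available, as they are with a standard GHC installation.

```haskell
import Control.Monad (filterM)
import System.Directory
    (doesDirectoryExist, doesFileExist, getDirectoryContents)
import System.FilePath ((</>))

-- Hypothetical variant of listFilesR: recursively collect every
-- regular file below the given directory.
listFilesRec :: FilePath -> IO [FilePath]
listFilesRec path = do
    names <- getDirectoryContents path
    -- Drop "." and ".." by name, then prefix each entry with path.
    let entries = [path </> n | n <- names, n `notElem` [".", ".."]]
    files  <- filterM doesFileExist entries
    dirs   <- filterM doesDirectoryExist entries
    -- Recurse into each subdirectory and flatten the results.
    nested <- mapM listFilesRec dirs
    return (files ++ concat nested)
```

Loaded into GHCi, it can be tried with something like `listFilesRec "." >>= mapM_ putStrLn`. Note that doesDirectoryExist follows symbolic links, so a link that points back up the tree could make this sketch recurse forever.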

Magnus

Greg, thanks for the pointer. I really should start looking at MissingH, but only after I’ve made my own implementations. If I look there first, I miss a learning opportunity :-)

Magnus

Taking Greg’s excellent advice, I’ve now replaced the hand-crafted function from my original post with the following:

listFilesR :: FilePath -> IO [FilePath]
listFilesR path = do
    cur_path <- getCurrentDirectory
    files <- recurseDir SystemFS $ normalise $ combine cur_path path
    filterM doesFileExist files

A lot shorter and nicer :-)

Paul

Very helpful; but how/where is endswith defined?

Magnus

@Paul, endswith is in MissingH, in the Data.List.Utils module.
