Joe Ray

Principal Engineer

Analysing Oyster data dumps with Clojure

Having moved to London last December, I’m using my Oyster card a lot more than I used to. In March I discovered that if you’ve registered your Oyster card with TfL you can get them to send you CSV dumps of your usage. And what’s a programmer to do with CSV dumps if not analyse them somehow?

I’ve been interested in Clojure for a while and started learning it just before Christmas, so it seemed a natural fit to write the analyser in Clojure.

The result is a command-line application which takes a set of the CSV files and outputs figures such as the average journey cost and total journey time. Here’s my output for the first three months of 2015:

 From 01 Jan 2015 to 31 Mar 2015

    Total Duration  21 hrs 2 mins
     Avg. Duration        23 mins
  Shortest Journey        10 mins
   Longest Journey   1 hrs 6 mins
        Total Cost        £213.99
         Avg. Cost          £1.88
          Journeys            114
 Most popular mode      bus (51%)

As you can see, I’ve depressingly spent almost a whole day on public transport since January, and that only includes journeys that were timed: 51% of my journeys were by bus, and bus journeys don’t have end times!
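To give a flavour of the kind of calculation involved, here is a minimal sketch. It is not the code from the repository. The column layout (date, start time, end time, journey description, charge), the sample rows and every function name are assumptions made for illustration, and the naive comma split would not cope with quoted fields in a real export.

```clojure
(ns oyster-sketch
  (:require [clojure.string :as str]))

;; Hypothetical sample data; the column layout is an assumption.
(def sample-csv
  (str "Date,Start Time,End Time,Journey/Action,Charge\n"
       "05-Jan-2015,08:10,08:40,Brixton to Oxford Circus,2.90\n"
       "05-Jan-2015,18:05,,Bus journey route 59,1.50\n"
       "06-Jan-2015,09:00,09:20,Oxford Circus to Bank,2.30\n"))

(defn parse-minutes
  "Converts \"HH:mm\" into minutes since midnight, or nil for a blank field."
  [s]
  (when-not (str/blank? s)
    (let [[h m] (str/split s #":")]
      (+ (* 60 (Long/parseLong h)) (Long/parseLong m)))))

(defn parse-row
  "Turns one CSV line into a map. Untimed journeys get a nil :duration."
  [line]
  (let [[date start end journey charge] (str/split line #",")
        start-min (parse-minutes start)
        end-min   (parse-minutes end)]
    {:date       date
     :journey    journey
     ;; Store money as whole pence to avoid floating-point totals.
     :cost-pence (Math/round (* 100 (Double/parseDouble charge)))
     ;; mod keeps the duration positive if a journey crosses midnight.
     :duration   (when (and start-min end-min)
                   (mod (- end-min start-min) 1440))}))

(defn parse-csv
  "Parses the whole file, skipping the header row and blank lines."
  [text]
  (->> (str/split-lines text)
       rest
       (remove str/blank?)
       (map parse-row)))

(defn mean
  "Rounded arithmetic mean of a collection of numbers, or nil when empty."
  [xs]
  (when (seq xs)
    (Math/round (/ (reduce + xs) (double (count xs))))))

(defn summarise
  "Durations only count timed journeys; costs count every journey."
  [journeys]
  (let [durations (keep :duration journeys)
        costs     (map :cost-pence journeys)]
    {:journeys    (count journeys)
     :total-mins  (reduce + durations)
     :avg-mins    (mean durations)
     :total-pence (reduce + costs)
     :avg-pence   (mean costs)}))

(summarise (parse-csv sample-csv))
;; => {:journeys 3, :total-mins 50, :avg-mins 25, :total-pence 670, :avg-pence 223}
```

Note that the bus row has no end time, so it contributes to the cost figures but not to the durations, which is the same effect described above.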

You can find the code on GitHub – improvements and comments welcome.

Thoughts on Clojure

I’m really enjoying programming in Clojure. I’ve not done any functional programming before and the approach is making a lot of sense to me, especially in the context of the Alan Perlis quote:

“It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures.”
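As a small illustration of what that looks like in practice, here is a sketch using a vector of plain maps. The journeys and key names are made up for the example. Every function used is a general-purpose one from clojure.core rather than something written for this particular data.

```clojure
;; One data structure: a vector of maps (hypothetical journeys).
(def journeys
  [{:mode :bus  :cost-pence 150}
   {:mode :tube :cost-pence 290 :duration 30}
   {:mode :bus  :cost-pence 150}
   {:mode :rail :cost-pence 230 :duration 20}])

;; Many functions, all operating on that same structure.
(def journey-count (count journeys))                        ;; => 4
(def total-pence   (reduce + (map :cost-pence journeys)))   ;; => 820
(def timed         (filter :duration journeys))             ;; journeys with an end time
(def mode-counts   (frequencies (map :mode journeys)))      ;; => {:bus 2, :tube 1, :rail 1}

;; The most popular mode is the entry with the highest count.
(def most-popular-mode
  (key (apply max-key val mode-counts)))                    ;; => :bus
```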

Clojure for the Brave and True has been an unusual introduction to the language, the Exercism.io and 4Clojure exercises are useful for practising, and I’ve used jafingerhut’s Grimoire cheatsheet a lot for reference.
