Dan Poltawski's Blog

Should iRun - the Tech

This is the second of two posts on creating my first app Should iRun. You can find the first post here.

Obtaining timetable information

To recap from my first post: the idea of Should iRun is to provide quick access to departure times of trains passing through nearby stations. To make this idea a reality, I first needed access to timetable information in an electronic format. Fortunately, I discovered that Transperth provide access to their timetables in Google Transit [GTFS] format, under a license which allows my app to use it. Kudos to Transperth for releasing information to the public and not just to Google.

Armed with the GTFS timetable data, I had two choices for the way I could construct the application:

  • Provide a central web service which stores the timetables and serves them to my application over the network
  • Package the timetable information and ship it as part of the application.

I decided it would be best to ship the data with my application. Since a goal for my app is to be as fast as possible, it turned out to be an obvious choice: at rush hour the cellular network in Perth can become congested and slow, so looking up timetable information over the data connection would be slow and frustrating. With the timetables pre-populated on the device, they can be looked up instantly.

The GTFS data is provided as a series of flat files (CSV). Looking up timetable information directly in flat files is not really a practical approach, so the data needed to be imported into sqlite (a relational database which comes with iOS as standard). Shortly after constructing a schema and writing a small perl script™, I realised the error of my ways. I stopped the DIY approach, searched github and found A Small Python Tool for Importing GTFS Data into SQL, which maps the CSV files to database tables and does the import job (plus some basic indexes).
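To give a feel for what such an import involves, here is a minimal sketch in Python. It is not the tool I used, just an illustration: the function name import_gtfs_table, the gtfs_ table prefix and the sample rows are all made up for the example, and a real importer would also deal with column types, byte-order marks and much bigger files.

```python
import csv
import io
import sqlite3


def import_gtfs_table(conn, table, csv_file):
    """Load one GTFS CSV file into a new table; returns the number of rows."""
    reader = csv.reader(csv_file)
    # The first CSV line holds the column names (stop_id, stop_name, ...).
    header = [name.strip() for name in next(reader)]
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    rows = [row for row in reader if row]  # skip blank lines
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    return len(rows)


# Made-up sample standing in for a real stops.txt file.
sample = io.StringIO(
    "stop_id,stop_name,stop_lat,stop_lon\n"
    "1,Example Stn,-31.95,115.86\n"
    "2,Another Stn,-31.94,115.87\n"
)
conn = sqlite3.connect(":memory:")
imported = import_gtfs_table(conn, "gtfs_stops", sample)
# A basic index of the kind an import tool might add.
conn.execute("CREATE INDEX idx_stops_id ON gtfs_stops (stop_id)")
```

Naming each table after its file keeps the mapping from the feed to the database obvious, which made it easier to check the imported data by hand.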

Once I verified the imported data, it was time to get to work constructing the queries needed to create the basic shell of the app. Since I spend my days on Moodle working with SQL all the time, this was probably the most straightforward part of the whole operation. :)

I ended up with a set of queries like this one, which shows the next departures from a specific stop:

SELECT t.trip_id, s.departure_time, t.trip_headsign
FROM gtfs_stop_times s
JOIN gtfs_trips t ON s.trip_id = t.trip_id
JOIN gtfs_calendar c ON t.service_id = c.service_id
WHERE sunday = 1 AND start_date <= '20130210' AND end_date >= '20130210' AND s.stop_id = 99531
AND s.departure_time > '18:00:00'
ORDER BY s.departure_time
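In the query above the day of the week, the date, the stop and the time are all hard-coded. A rough sketch of how they could be filled in at runtime follows (in Python rather than the Objective-C the app uses). The function name next_departures, the limit parameter and the fixture rows are assumptions for illustration only.

```python
import sqlite3
from datetime import datetime

# GTFS calendar day columns, indexed by datetime.weekday() (Monday == 0).
DAY_COLUMNS = ("monday", "tuesday", "wednesday", "thursday",
               "friday", "saturday", "sunday")


def next_departures(conn, stop_id, when, limit=5):
    """Return (trip_id, departure_time, trip_headsign) rows after `when`."""
    # The column name comes from the fixed tuple above, never from user
    # input, so interpolating it is safe; all values are bound parameters.
    day_column = DAY_COLUMNS[when.weekday()]
    sql = f"""
        SELECT t.trip_id, s.departure_time, t.trip_headsign
        FROM gtfs_stop_times s
        JOIN gtfs_trips t ON s.trip_id = t.trip_id
        JOIN gtfs_calendar c ON t.service_id = c.service_id
        WHERE c.{day_column} = 1
          AND c.start_date <= :day AND c.end_date >= :day
          AND s.stop_id = :stop_id
          AND s.departure_time > :time
        ORDER BY s.departure_time
        LIMIT :limit
    """
    params = {
        "day": when.strftime("%Y%m%d"),
        "time": when.strftime("%H:%M:%S"),
        "stop_id": stop_id,
        "limit": limit,
    }
    return conn.execute(sql, params).fetchall()


# Tiny in-memory fixture (made-up rows) so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gtfs_stop_times (trip_id TEXT, stop_id INTEGER, departure_time TEXT);
    CREATE TABLE gtfs_trips (trip_id TEXT, service_id TEXT, trip_headsign TEXT);
    CREATE TABLE gtfs_calendar (service_id TEXT, monday INTEGER, tuesday INTEGER,
        wednesday INTEGER, thursday INTEGER, friday INTEGER, saturday INTEGER,
        sunday INTEGER, start_date TEXT, end_date TEXT);
    INSERT INTO gtfs_trips VALUES ('T1', 'SUN', 'Perth'), ('T2', 'SUN', 'Perth'),
        ('T3', 'WEEKDAY', 'Perth');
    INSERT INTO gtfs_calendar VALUES
        ('SUN', 0, 0, 0, 0, 0, 0, 1, '20130101', '20131231'),
        ('WEEKDAY', 1, 1, 1, 1, 1, 0, 0, '20130101', '20131231');
    INSERT INTO gtfs_stop_times VALUES ('T1', 99531, '17:45:00'),
        ('T2', 99531, '18:15:00'), ('T3', 99531, '18:20:00');
""")
rows = next_departures(conn, 99531, datetime(2013, 2, 10, 18, 0, 0))
```

Comparing times as strings works because they are zero-padded HH:MM:SS. One thing a sketch like this ignores is that GTFS allows times past 24:00:00 for trips which run over midnight, so a fuller version would need to look at the previous service day as well.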

While the queries were straightforward, there were a few things I ended up adding to the ‘import process’ when creating the sqlite database:

  • The imported data ended up being huge (50MB+ when uncompressed) and it is shipped with the main application, so I needed to reduce the size. I ended up stripping out the ‘shapes’ table. This table describes the physical path which every journey takes; I wasn’t using this info at all and figured that if I were to add it then I’d need a network connection to display the map anyway, so I could potentially provide it using a web service.
  • The GTFS data is formatted to be as generic as possible for all situations and is probably pretty close to first normal form. As it turned out, computing some of the queries for 13,000 stops was a little bit too slow and would be even slower on the iPhone’s hardware. So I did some denormalisation and index optimisation to get the information quickly enough (whilst not undoing the size savings from the first point with huge indexes).
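The slimming step can be pictured roughly like this. It is only a sketch: the table name gtfs_shapes, the index name and the choice of indexed columns are assumptions, and the actual denormalisation I did is not shown.

```python
import sqlite3


def slim_database(conn):
    """Drop unused route geometry, add one targeted index, reclaim space."""
    # The shapes table only describes the path each journey takes.
    conn.execute("DROP TABLE IF EXISTS gtfs_shapes")
    # A single composite index aimed at the "next departures from a stop"
    # lookup, instead of indexing every column and bloating the file.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_stop_times_stop_departure "
        "ON gtfs_stop_times (stop_id, departure_time)"
    )
    # VACUUM rebuilds the file so the freed pages are actually released.
    conn.execute("VACUUM")


# isolation_level=None keeps the connection in autocommit mode, because
# VACUUM cannot run inside an open transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE gtfs_shapes (shape_id, shape_pt_lat, shape_pt_lon, shape_pt_sequence)")
conn.execute("CREATE TABLE gtfs_stop_times (trip_id, stop_id, departure_time)")
slim_database(conn)
remaining = {row[0] for row in conn.execute("SELECT name FROM sqlite_master")}
```

Dropping a table does not shrink a sqlite file by itself, which is why the VACUUM at the end matters when the database is going to be bundled with an app.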

With the basics of the data model sorted, I could get down to business and construct the app!

Objective-C and iOS frameworks

Objective-C is an ugly language to look at (it really takes some time to get used to those horrible square brackets) and going back to the dark ages of manual memory management takes some getting used to. But after using it for some time, I’ve really come to like it. The work that Apple have put into the LLDB debugger and the Xcode integration with the static analyser makes it feel a lot less low level than it really is. Apple’s frameworks are (for the most part) really well designed and use standard software patterns, and writing applications in the way they recommend is an exercise in learning software design best practices. It’s interesting to think how long they will be able to keep this up, but it feels modern enough at the moment. For interesting commentary on this topic see John Siracusa’s articles Avoiding Copland 2010, Copland 2010 revisited and the podcast A Dark Age of Objective-C.

Should iRun is a fairly stock application using the stock iOS frameworks, so I don’t have much to say on the matter which isn’t covered elsewhere.

  • I use storyboards to construct the user interface. Mostly storyboarding is really helpful, though I have found rough edges and gotchas, such as a stale storyboard seemingly being used, which I only discover after changing all the code to work out what’s going on. The interface is almost entirely made up of UITableView and standard controls, so using storyboards was straightforward.
  • I use ARC. Some old time Objective-C developers seem to be sceptical about it. But for someone like me who works in a garbage collected language most of the day it really helps to avoid having to spend way too much time thinking about memory management. The first time I tried out iOS development I did manual reference counting and got on mostly fine, but it was definitely easier for me with ARC.
  • I briefly considered using Core Data for connecting to the sqlite database. But I don’t think it is really fit for my purposes, where the data model was already set up and I was familiar with how to construct the appropriate sql queries. Instead I use sqlite through the convenient fmdb Objective-C wrapper.
  • With the exception of fmdb, I avoided third-party wrappers around core frameworks. Partly because I wanted to learn the frameworks, partly to avoid too many dependencies and partly to ensure that I can take advantage of any improvements Apple put into the frameworks in future versions.
  • Once I got my head around and started using KVO (and to a lesser extent NSNotificationCenter), I greatly improved the software design/compartmentalisation of my app.
  • I translated the app into English, Catalan, Chinese, French, Korean and Spanish. I didn’t attempt this till a couple of versions in. It wasn’t much work to switch (because the app is small) but in future I won’t forget to use NSLocalizedString from the start. (I used friends at Moodle HQ and tethras.com for the translations, both were efficient :–) ).

Location, location, location

Location information is a crucial element of the app: it means that seeing the train times the user wants takes no effort. All a user of the app has to do is open it and it’ll present the right stations and times immediately. Sounds simple, but there were a couple of challenges.

Getting a very accurate location takes time and uses a lot of battery life (the GPS radio needs to be powered up and get a fix on location). In fact, to establish location, iOS will first use cell tower information (accuracy to ~1 or 2km), then use nearby wifi networks (accuracy to ~100m) before finally using the GPS. Therein lies a dilemma: the app could do with a near-instant location to show the coming trains, but I didn’t want to be constantly refreshing the screen with minor changes in location, as that would be distracting for the user. It’s surprisingly hard to test this too. The iOS simulator allows you to simulate being at a given location, but it doesn’t really simulate the real-world behaviour of requesting a location, so debugging was limited to real-world use. As a result, I didn’t really get this aspect of the application working as well as I liked until quite recently.

When I was using the app in the real world, I often got frustrated by it displaying a completely wrong location due to stale information and other factors. In the end, the solution was straightforward: react to the various conditions that can occur. I use inaccurate locations so that at least we might be in the ballpark area, I continue to wait for a more accurate location, and I only update the user interface when the location gets significantly better. I stop asking for location updates once I get an accuracy of 50m or better. It sounds simple when I write it here, but it’s surprising how much time it took me to get this code working as well as I liked.
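Stripped of the Core Location plumbing, the decision logic is roughly the following, sketched in Python. The 50m cut-off is the one I mentioned; the maximum fix age and the "at least twice as accurate" rule are assumed numbers to make the example concrete, and the class and field names are invented.

```python
from dataclasses import dataclass
from typing import Optional

STOP_ACCURACY_M = 50.0    # stop asking for updates at this accuracy or better
MAX_FIX_AGE_S = 60.0      # assumed: ignore fixes older than this as stale
IMPROVEMENT_FACTOR = 0.5  # assumed: "significantly better" = error halved


@dataclass
class Fix:
    lat: float
    lon: float
    accuracy_m: float  # radius of uncertainty in metres; negative = invalid
    age_s: float       # seconds since the fix was taken


class LocationFilter:
    """Decides which location fixes are worth refreshing the screen for."""

    def __init__(self) -> None:
        self.best: Optional[Fix] = None
        self.listening = True

    def offer(self, fix: Fix) -> bool:
        """Return True if the user interface should be updated with `fix`."""
        if not self.listening or fix.accuracy_m < 0 or fix.age_s > MAX_FIX_AGE_S:
            return False
        good_enough = fix.accuracy_m <= STOP_ACCURACY_M
        # The first usable fix always counts, so a ballpark location is
        # shown straight away; later ones must be a clear improvement.
        improved = (self.best is None
                    or fix.accuracy_m <= self.best.accuracy_m * IMPROVEMENT_FACTOR)
        if good_enough:
            self.listening = False  # the app would stop location updates here
        if improved or (good_enough and fix.accuracy_m < self.best.accuracy_m):
            self.best = fix
            return True
        return False


location_filter = LocationFilter()
decisions = [
    location_filter.offer(Fix(-31.95, 115.86, 1500.0, 2.0)),  # cell tower
    location_filter.offer(Fix(-31.95, 115.86, 1200.0, 1.0)),  # barely better
    location_filter.offer(Fix(-31.95, 115.86, 100.0, 1.0)),   # wifi
    location_filter.offer(Fix(-31.95, 115.86, 40.0, 1.0)),    # GPS
    location_filter.offer(Fix(-31.95, 115.86, 10.0, 1.0)),    # after stopping
]
```

Keeping the rules in one small object like this also gets around the simulator problem a little, since the sequence of fixes can be replayed without a device.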

The other challenge with using location was ordering stations by distance so that I could display the list of nearby stations to the user. The ideal solution would be to use the database to retrieve stations ordered by location, but sqlite doesn’t come with functionality to order the data this way. I found some C functions which could be added to sqlite to do the maths on latitudes and longitudes and use this for sorting. However, I wasn’t so keen on this, as this sort of maths is not my strong point and I didn’t really like adding something I didn’t understand properly. As an alternative, I loaded the stations into memory and used the Core Location distanceFrom methods to sort the list. This was inelegant, but it worked quickly and efficiently enough and I understood it, so I’ve stuck with it.
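For the curious, the in-memory approach looks something like this in Python. Core Location does the distance calculation for me in the app; here the standard haversine formula stands in for it, and the station names and coordinates are invented for the example.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, good enough for sorting


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_stations(stations, lat, lon, limit=3):
    """Sort (name, lat, lon) tuples by distance from the given position."""
    ordered = sorted(stations, key=lambda s: distance_m(lat, lon, s[1], s[2]))
    return ordered[:limit]


# Invented stations, roughly 1km apart along a north-south line.
stations = [
    ("Far Stn", -31.98, 115.86),
    ("Near Stn", -31.951, 115.86),
    ("Middle Stn", -31.96, 115.86),
]
closest = nearest_stations(stations, -31.95, 115.86, limit=2)
```

With only a few hundred stations to sort, doing this outside the database costs next to nothing, which is why the inelegance never bothered me much.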

Introducing the server side component

After a few months of monitoring timetable changes, I realised that Transperth were updating their GTFS timetable data reasonably frequently. It was changing often enough that I thought it best to provide a way for the application to notify users of new timetables and update them from within the app. I could’ve just released a new version of the app with new timetables each time, but there are disadvantages to releasing a new app in the store:

  • Reviews and ratings in the app store get reset each version
  • Users tend to expect new features with app store updates
  • You need to wait for Apple approval (typically a week at a time) before it gets released.

So, I came up with a simple mechanism to serve timetable updates:

  • Periodically, the app will make an https request to the updates API
  • Along with the request for the timetables, the app sends with it some useful info such as the type of device, operating system version, locale, timetable version etc.
  • The server side logs this information and returns a JSON response containing:
    • The latest timetable version number
    • The location where the timetable update can be retrieved (over http)
    • The sha512 hash of the update file
    • A description of the update
  • After receiving this JSON response, the app determines whether it’s running the latest version of the timetables or if it needs an update. If an update is available it’ll make that known to the user.
  • At the user’s convenience, the timetable update will be downloaded, its hash compared with the one supplied and the update installed if it is verified as good.
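The client side of that exchange boils down to two checks, sketched here in Python. The JSON field names (version, url, sha512, description) and the example URL are placeholders I have made up for the illustration, not the real API.

```python
import hashlib
import hmac
import json


def update_available(installed_version, info):
    """True if the server advertises a newer timetable than the one installed."""
    return int(info["version"]) > installed_version


def verify_download(data, expected_sha512):
    """Compare the sha512 of the downloaded bytes with the advertised hash."""
    actual = hashlib.sha512(data).hexdigest()
    # compare_digest does a constant-time comparison of the two hex strings.
    return hmac.compare_digest(actual, expected_sha512.strip().lower())


# Made-up payload and response, standing in for the real download and API.
payload = b"pretend this is the new timetable database"
response_body = json.dumps({
    "version": 7,
    "url": "http://updates.example.invalid/timetable-7.sqlite",
    "sha512": hashlib.sha512(payload).hexdigest(),
    "description": "Example timetable update",
})

info = json.loads(response_body)
needs_update = update_available(6, info)
download_ok = verify_download(payload, info["sha512"])
tampered_ok = verify_download(payload + b"x", info["sha512"])
```

Because the hash arrives over https while the large file comes over plain http, checking it before installing is what keeps a corrupted or tampered download from replacing a good timetable.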

The API itself is a short php script which receives the parameters from the app, logs them to a database and then returns the JSON. One advantage of passing all the versions in the request is that I can change the API behaviour based on version (to work around bugs, or make exceptions if necessary); I also versioned the url for this reason. For testing, I have two symlinks, production.json and beta.json. I always serve beta.json to my personal devices, so I test by updating the beta symlink and then, after verifying, I can quickly switch the production symlink.

The API is running on a virtual server I have in the UK, but the updates are served from Amazon S3 in Sydney. Having the largish (20MB) download served from Australia makes a big difference to the download speed here in the most isolated city in the world!

When the API went live last week, I was able to start seeing real information about users of my app. It was really inspiring to see that people actually are using the app and updating the timetables; as I said in my last post, it was difficult to tell before. The stats are currently public, if you are interested (though I may change that in the future).

The final server side component I currently have is for push notifications. If a user chooses to receive push notifications then the app sends a push token to another API endpoint. I maintain a database table of push notification tokens and can also tie these to the log entries (so I can target users who have not updated to the latest timetable update, for example). The most complex part of notifications was getting the SSL certificate machinery set up, but once that was sorted, I had a perl script which uses Net::APNS::Persistent to send notifications and Net::APNS::Feedback to disable notifications based on notification failures.

In Conclusion

Thanks for reading this far. This post was much longer and less interesting than I had originally hoped.

Developing Should iRun has been my favourite sort of project, mixing a diverse range of technologies together to create a single solution. My buzzword chart will be off the scale: objective-c, iOS, SQL, php, python, mysql, sqlite, bash, perl, S3. It’s like a web 2.0 recruitment handbook!