Greater Greater Washington

WMATA might offer open data for all regional transit

WMATA planners helped STLTransit create an animation of transit across the entire Washington region. That's possible because WMATA has a single data file with all regional agencies' schedules. They hope to make that file public; that would fuel even more tools that aid the entire region.


Click full screen and HD to see the most detail.

One of the obstacles for people who want to build trip planners, analyze what areas are accessible by transit, design visualizations, or create mobile apps is that our region has a great many transit agencies, each with their own separate data files.

Want to build a tool that integrates Metrobus, Fairfax Connector, and Ride On? You have to chase down a number of separate files from different agencies in a number of different places, and not all agencies offer open data at all.

The effect is that many tool builders, especially those outside the region, don't bother to include all of our regional systems. For example, the fun tool Mapnificent, which shows you everywhere you can reach in a set time from one point by transit, only includes WMATA, DC Circulator, and ART services. That means it just won't know about some places you can reach in Fairfax, Alexandria, Montgomery, or Prince George's.

Sites like this can show data for many cities all across the world without the site's author having to do a bunch of custom work in every city, because many transit agencies release their schedules in an open file format called the General Transit Feed Specification (GTFS). Software developer Matt Caywood has been maintaining a list of which local agencies offer GTFS files as well as open real-time data.
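To give a sense of why a common format helps, here is a minimal Python sketch of reading one. A GTFS feed is a zip archive of comma-separated text files such as agency.txt, routes.txt, and stops.txt. The function names `list_agencies` and `build_demo_feed`, and the agencies in the demo feed, are made up for illustration; they are not taken from any real agency's feed or from Mapnificent's code.

```python
import csv
import io
import zipfile


def list_agencies(feed):
    """Return the agency_name values from a GTFS feed's agency.txt.

    `feed` is a path or a file-like object holding a GTFS zip archive.
    """
    with zipfile.ZipFile(feed) as archive:
        with archive.open("agency.txt") as raw:
            # utf-8-sig drops a byte-order mark if the file starts with one.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return [row["agency_name"] for row in csv.DictReader(text)]


def build_demo_feed():
    """Build a tiny in-memory feed (hypothetical agencies) for the demo."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "agency.txt",
            "agency_id,agency_name,agency_url,agency_timezone\n"
            "EB,Example Bus,http://example.com/bus,America/New_York\n"
            "ES,Example Shuttle,http://example.com/shuttle,America/New_York\n",
        )
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name in list_agencies(build_demo_feed()):
        print(name)
```

The same few lines would work on any agency's feed, which is the point: a tool author writes the parsing once and can then add cities just by pointing at more files.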

We've made some progress. Fairfax Connector, for example, recently started offering its own GTFS feed. But while DASH has one, you have to email them for it, and there's none for Prince George's TheBus.

The best way to foster more neat tools and apps would be to have a single GTFS file that includes all systems. As it turns out, there is such a beast. WMATA already has all of the schedules for all regional systems for its own trip planner. It even creates a single GTFS file now.
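For readers curious what combining feeds involves, here is a rough Python sketch that merges the same table (say, routes.txt) from several feeds into one. This is not how WMATA builds its file; the prefixing scheme, the `merge_tables` name, and the short `ID_COLUMNS` list are assumptions made for illustration. The main wrinkle it shows is that two agencies can both have a route with ID "1", so IDs need to be made unique before the rows are combined.

```python
import csv
import io

# Identifier columns to prefix. A hypothetical short list for this sketch;
# a real merge would cover every ID column in every GTFS table.
ID_COLUMNS = ("agency_id", "route_id")


def merge_tables(tables):
    """Merge one GTFS table from several feeds into a single CSV string.

    `tables` maps a feed label (used as the ID prefix) to the CSV text of
    the same file, such as routes.txt, taken from each feed.
    """
    fieldnames = []
    merged_rows = []
    for label, text in tables.items():
        reader = csv.DictReader(io.StringIO(text))
        # Feeds may use different optional columns, so keep the union.
        for name in reader.fieldnames or []:
            if name not in fieldnames:
                fieldnames.append(name)
        for row in reader:
            for column in ID_COLUMNS:
                if row.get(column):
                    # Prefix IDs so they cannot collide across feeds.
                    row[column] = f"{label}:{row[column]}"
            merged_rows.append(row)

    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(merged_rows)
    return out.getvalue()


if __name__ == "__main__":
    print(merge_tables({
        "a": "route_id,agency_id,route_short_name\n1,A,Red\n",
        "b": "route_id,route_short_name,route_type\n1,Blue,3\n",
    }))
```

A complete merge would have to apply the same prefix consistently across stops, trips, stop times, and calendars, and redo the work whenever any one agency changes its schedule. That is real effort, which is why a ready-made regional file is so valuable.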

Michael Eichler wrote on PlanItMetro that they give this file to the regional Transportation Planning Board for its modeling, and offered it to STLTransit, who have been making animations showing all transit in a region across a single day.

This is one of many useful ways people could use the file. How about letting others get it? Eichler writes, "We are working to make this file publicly available."

Based on the STLTransit video, WMATA's file apparently includes five agencies that Caywood's list says have no public GTFS files: PG's TheBus, PRTC OmniLink and OmniRide, Fairfax CUE, Frederick TransIT, and Loudoun County Transit. It also covers Laurel Connect-a-Ride, Reston LINK, Howard Transit, the UM Shuttle, and Annapolis Transit, which aren't on that list at all and which most software developers might not think to look for even if those agencies did publish files.

Last I heard, the obstacles to the file being public included WMATA getting permission from the regional transit agencies, and some trepidation by folks inside the agency about whether they should take on the extra work to do this or would get criticized if the file has any errors.

Let's hope they can make this file public as soon as possible. Since it already exists, it should be a no-brainer. If any regional agencies or folks at WMATA don't understand why this is good for transit, a look at this video should bring it into clear focus.


David Alpert is the founder and editor-in-chief of Greater Greater Washington. He worked as a Product Manager for Google for six years and has lived in the Boston, San Francisco, and New York metro areas in addition to Washington, DC. He now lives with his wife and daughter in Dupont Circle. 

Comments


Great article. The strange thing here is that agencies have no problem with releasing this data for essentially PR purposes -- cool as the visual is, it doesn't help anyone get around. But when it comes to releasing data to create apps which create new passengers who bring in money, suddenly agencies become reluctant.

The real barrier is the widespread idea that "bad data is worse than no data," which enables organizations' most dilatory tendencies. One agency has spent months iterating 8 times with Google to get their GTFS absolutely perfect including exact transfer fares and other quibbles. I appreciate their dedication to quality, but they need to produce the open data that serves their citizens and passengers!

A better motto for open data is "release early, release often." Data isn't static, it gets updated, and problems can be fixed in the update process! All that's needed for data quality to improve over time is:

1) regular data updates -- all agencies produce schedules on a regular basis
2) identification of data problems -- developers and even app users are good at this
3) fixing data problems -- all agencies have at least a complaint line, and if the data is valuable enough to passengers, their complaints will escalate enough to get noticed

WMATA actually created this regional data file by hand inputting the paper (PDF) schedule data released by the regional agencies themselves. There are some minor concerns about timeliness and who is going to fix data entry errors, but WMATA seems willing to handle it and I very much hope they will be able to release the data.

by Matt Caywood on Feb 11, 2013 2:52 pm

It's a bad idea to have all regional transit info in one GTFS file, for many reasons. The agencies all update their schedules at different times, which means a file that is updated at odd intervals. Any program that is made to parse GTFS can handle multiple files, so having them all in one or split out makes no big difference. Also, part of how Google handles GTFS is that it requires the transit authority to host the GTFS files on a public website, and Google points its servers to that website. That means they're all publicly available, so there shouldn't be a need to email anyone for them. WMATA does not create the GTFS itself; the trip planner does, which is the same for all the regional transit providers. That is how they are able to consolidate all the info into a single GTFS file.
A single file might be fine for research, but in reality data quality is going to differ between authorities, and that alone should be the reason to separate them out.

by Derpy on Feb 11, 2013 2:57 pm

@Derpy -- we can argue about the pros and cons of a monolithic regional data file, but seeing as it's the only way we're likely to get GTFS schedule data for five agencies in the near future, I would like the file.

One reason a single file would be helpful: small developers typically don't have an automated process for updating schedules in their apps. The only agency they bother with is WMATA, because they think the others aren't worth the trouble. For them, a frequently updated regional GTFS file (from WMATA or a third party) would be nice.

"Also part of how google handles GTFS is they require the transit authority to host on a public website the GTFS files and google points their servers to the website."

That's not correct. Just because Google gets sent a link doesn't mean everyone else has it. Many agencies set up a public website, some host on gtfs-data-exchange.com, but others don't share with anyone except Google (preferential treatment).

by Matt Caywood on Feb 11, 2013 4:20 pm

So you're saying it would be easier for app developers to be lazy? I'd agree; however, that's not the real problem app developers have (and I'm making some assumptions here, so bear with me). I'd say that the lack of standardization for real-time updates that indicate a vehicle's position and estimated time of arrival is the main problem with having a single app that can operate worldwide. GTFS is not the issue in that arena. Having a single GTFS for all the agencies in the region is not going to do much. Who uses an app to get just scheduled times?

by Derpy on Feb 11, 2013 4:38 pm

Derpy,

Lots of people use (or would use) an app to get scheduled times. Sometimes people need to make plans, like figuring out how to get home from somewhere later in the day, or just to see what the options are for getting around an area. People seem to be fixated on real-time information these days; real-time is great but it's not everything.

by jimble on Feb 11, 2013 6:52 pm

Matt,
If someone at WMATA made that file by hand-inputting PDF schedules, then someone there is doing something wrong. The other agencies in the area send specifically formatted Excel versions of their schedules ahead of any service change.

by Ginger on Feb 12, 2013 1:31 pm

jimble,
Why not use Google Transit as a trip planner, then? I mean, that's what it's for. Why would an app developer reinvent that? If you need an app and don't want to use a browser, Google Maps for Android (and probably Apple) can do navigation and lets you select public transit. They already have all the GTFS feeds available for you.
What Google doesn't have (yet) is real-time predictions. Only a select handful of agencies have been giving them gtfs-realtime data. That's what app developers are focusing on, but they can't make a universal app because there are no standards.

by Derpy on Feb 13, 2013 4:45 pm

@Ginger -- thanks, I may have misinterpreted. Excel is better than a PDF schedule!

by Matt Caywood on Feb 17, 2013 11:12 pm

@Derpy -- App developers are doing lots of other things that aren't supported by Google APIs. For example, TransitNearMe.com.

In addition, developers would be very naive to rely on Google APIs, as everyone depending on Google Maps found out when Google decided to jack its commercial prices up to ridiculous levels. Google is a commercial enterprise, not a public service, and it is quite capable of acting like it.

Finally, Google Transit is pretty stagnant (where's simple multimodal car to transit? Bikeshare? Zipcar? car2go?) and having rival efforts, particularly open source projects like OpenTripPlanner, keeps pushing them forward. But those efforts can't happen without open data.

by Matt Caywood on Feb 17, 2013 11:27 pm
