How New York's MTA Uses Drupal Caching to Get Riders There on Time [March 6, 2013]
New York's Metropolitan Transportation Authority recently announced the availability of Real Time Data Feeds for developers. Let Blink Reaction explain how this custom API portal was built on Drupal 7 to provide critical up-to-the-minute travel information for millions of New York's daily commuters and visitors.
This Drupal-based site enables developers to create an account and obtain an
API key for specific feeds created by the MTA. Developers are able to use
the real-time data on their website or mobile application to inform riders
about scheduling and service notices.
In this webinar, you will learn the techniques and best practices used to bring
about Enterprise Drupal success including:
• Caching with Varnish 3
• Edge Side Includes
• Proxies and Mobile Load Testing
• Project Management Tools and Tips
Moderator: Today’s webinar is on how New York’s MTA uses Drupal caching to get riders there on time. Our speakers today are Ray Saltini, the Drupal evangelist from Blink Reaction, and Evgeniy Kashchenko, who’s a project manager at Blink Reaction. We’re really excited to have them on the call today and we hope you enjoy their presentation.
Ray Saltini: Great. I’d like to thank you all for coming today. We’re going to get started. We’re going to tell you briefly who we are at Blink and what we do. I’m going to talk a little bit more than we typically do at these sorts of presentations about the significance and context of this project. Then we’ll jump right into it and do an overview. After the important PM considerations, we’ll talk about the technology needs and how they fit into the architecture and implementation of the project, and of course, we’ll give you a couple of different resources. First of all, Blink Reaction is privileged to be providing services for some Fortune 500 organizations and large government and not-for-profit organizations. We’re particularly proud of our efforts to train in the Drupal space. We see that really as our biggest contribution to the Drupal community, one that has started to help level the playing field and also advance innovation and integration with Drupal. We are active in the community, enjoy our support of it, and hope to see you all, one way or another, at an upcoming event. So first and foremost, let’s talk a little bit about - where did we go about the Ocean Railway, as we talked about it. We had an opportunity to work on a transportation project in New York City. Obviously, we worked on the technology component, as opposed to the bricks-and-mortar infrastructure of it, but transportation is a critical issue in cities, in the US in particular. One of the things that we were very aware of is how important this was to the MTA. If you’re a student of New York, you can understand that the transit system is often used as a barometer of the health of the city. So we took this project very seriously. Back in the day where that analogy was made for packet service, there were a whole lot of ships and boats on Long Island. 
The first efforts at transportation solutions were on the ground or above the ground, as the case may be; underground solutions were just being dreamt of. The subway - planning it out and developing it - is not unlike the process that many of us go through when we plan and build interactive web applications. Once the prototypes got a little bit more reasonable, it was understood that they would actually have to start building underground. Of course, remnants of the above-ground systems still exist all over the place in the boroughs. You can see just from these old shots what a tremendous major initiative it was in New York City: nearly 100 years of building and maintaining the system. Eventually, they got three different companies to service the lines. So, we took this challenge very seriously because it was part of an effort to really create a solution around the inevitable problem of delays and service interruptions and the like, which, as you can see this old cartoon demonstrates, the MTA as well as other urban locations throughout the globe have dealt with for a long time. It really was an attempt to bring us back to the future. So, this was a problem from when the service first opened: a challenge to inform riders. We have a lot of folks looking over the platform. That was the highest technological solution at the time. Things weren’t much different in the ‘70’s, then they got a little bit weirder and in the ‘80’s, they got downright scary. Now, in the ‘90’s and the new millennium, we can see that as much as things change, things often remain the same. So it’s this historical sense of gravitas that everyone on a large project staff brought to working on this project, dealing with legacy systems, analog systems, but really trying to service these incredibly large numbers of commuters. 
We joked amongst ourselves when we were doing this presentation that the MTA, in fact, does keep track of passengers entering the subway system, but they don’t necessarily keep track of passengers exiting the system. Beyond that joke, this was a huge undertaking that the MTA planned for a number of years, and despite the one-step-forward, one-step-back scheduling of the implementation, they have made tremendous progress. What they wanted to do with this project is bring that information out of the tunnels, off the platforms and into the hands of people that could distribute it. So, as the MTA upgraded their infrastructure, they’ve been able to pass this information along, but they have also faced the tremendous challenge of it. It’s at this point that I want to introduce Evgeniy Kashchenko, an enterprise project manager for Blink Reaction who helped manage a team of talented developers and committed stakeholders.
Evgeniy Kashchenko: So, the MTA was already transitioning to Drupal and, therefore, continuing to build on Drupal made sense. The Datamine website was built using Drupal 7 and is hosted on Acquia’s cloud. We worked in tight coordination with many groups here. Of course, the MTA provided the requirements. They also provided the overall look and feel and branding standards for the site, as well as the most important thing, which is the trains’ data. From our side at Blink, we managed the project, did a lot of custom development and Drupal configuration, configured caching, and created a separate mobile app to demonstrate how the feed can be used. We partnered with Acquia on this, and the Acquia team worked on project management as well as client communication, configured the technology stack including Varnish, and provided load testing services through their partner. So, let me show you a screenshot of the website. Here it is. It’s taken from the Datamine site. So, the goal was, again, to make up-to-the-second information about MTA services available to millions of New Yorkers and visitors. This Drupal website allows developers to create an account and obtain an API key to access the feeds created by the MTA so that they can build apps to inform riders and others about scheduling and service notices. So, this site is integrated into the MTA’s portal and is an integral part of it. They had started to announce the launch of this platform and the access to their data, which put a lot of pressure on the team to meet the deadline.
Let’s jump to the next slide. This is how the dashboard looks for a developer who has already registered. Basically, all you need is your MTA key, which is shown on the right. This is the key that allows you to get the data from the APIs. So it’s straightforward. To register, you can do it yourself just by going to the MTA’s info website and providing information about yourself. Right now, it’s a pilot with a couple of trains’ data, but they are looking to add more in the future as they become available, because a lot of infrastructure changes to switch from analog tracking to digital need to happen so that this data can be gathered and then presented to everybody.
So, let’s talk about project management a bit. A key to successful project delivery is communication and tight coordination with everybody who’s involved. One of the tools that helps to achieve this is the RASCI chart presented on the screen right now, which basically allows you to outline the main responsibilities and activities on the project and assign them to specific individuals. Besides that, you would probably want to create a communication plan defining the cadence, participants and format of the different meetings, calls and reports, which in our case were weekly status reports combined with calls and screen sharing for demo purposes. Of course, you would want to stay on top of your dependencies and risks to make sure that they are mitigated in time. So, this project was on a very aggressive three-month schedule, including discovery, and we added more resources to meet the date and also built in some review time, but at the end of last year Hurricane Sandy interrupted and, basically, disrupted the plans. That’s why the MTA had a lot of work to do on their infrastructure and the actual launch was delayed. The iterations that we went through were discovery, which included specs, information architecture and designs; then three sprints of development with demos and reviews; and two separate launches, because we were launching two apps, the portal itself and the demo app. Now, let’s talk about the solution itself.
So as I mentioned before, it’s Drupal 7 with heavy use of Varnish and other types of caching, widely used contributed modules, and five custom modules for the theme, for ESI, for flushing the cache and for messaging. As everybody can register on the website, it’s open, so that you can access the data feed at once, and only if you’re abusing the service will your account be deactivated. Another part of this is the real-time feed, which is provided as a GTFS binary file, and some static feeds are also provided as a separate service. If you don’t know much about Edge Side Includes, it’s a technology that strives to resolve the problem of web infrastructure scaling and bring more of the work to the edge. You can look up more on these types of caching. In our case, Varnish was used to basically cache the data and, again, this is the page that shows its documentation, which can be accessed online in the Varnish Cache docs. So let me show you the architecture of the app. How it works is the developer comes to the website and registers using the web form; then the data is stored in Drupal and an access key is generated. Then the developer can access the API using this key. This API is proxied by Varnish for scaling purposes, because Varnish helps us get the reply to the user faster if we know who the user is and if the data is still valid. As this is a real-time application, the cache lifetime for the data itself is just 30 seconds, which is displayed here. The other part, which is also cached in Varnish, is the validity of your API key. That is cached for a longer period. So how it works: first, the request is sent to a standalone file in the docroot, which is this wrapper. It checks in Drupal whether the API key is valid or not. Then, if it’s valid, it’s stored in Memcache. Then the data is provided via Edge Side Includes. 
Basically, what it is is just a tag in the reply that is then processed by Varnish, and the appropriate file is inserted into the response in place of the tag. One of the key performance indicators for the project was load and sustaining load. So in this case, as I mentioned before, Acquia partnered with SOASTA, which is a distributed load testing provider that can hammer your service from around the globe. What we were able to prove with this load testing is that our solution handles five thousand hits per second with the architecture that was put in place. That’s about it for the portal. Now, I will transfer the presenter role to Ray and he can talk about the demo application that shows how you can actually use this data and get some real results that can be used to actually know if the train is on time.
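The tag substitution described above can be illustrated with a minimal Python sketch. It only imitates what an ESI-capable proxy such as Varnish does with an `<esi:include>` tag; it is not the project's code. The fragment path `/feeds/realtime.bin`, the `fragments` dictionary and the `render_esi` helper are all hypothetical names chosen for the example.

```python
import re

# Matches a self-closing ESI include tag and captures its src attribute.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def render_esi(body, fetch):
    """Replace each <esi:include src="..."/> tag with the fetched fragment."""
    return ESI_INCLUDE.sub(lambda match: fetch(match.group(1)), body)


# Hypothetical fragment store standing in for the feed file on disk.
fragments = {"/feeds/realtime.bin": "FEED-BYTES"}

# What the wrapper would emit once the key has been validated (assumed shape).
wrapper_output = '<esi:include src="/feeds/realtime.bin"/>'

response = render_esi(wrapper_output, fragments.__getitem__)
print(response)
```

The point of the design is that the wrapper's small, key-specific reply and the large, shared feed file can be cached separately, because the proxy stitches them together only at delivery time.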
Ray Saltini: Then getting actual - if you don’t mind, let’s just keep it on your screen and I’ll speak to that. It might just be more expedient since I’m going to be…
Evgeniy Kashchenko: Yes, absolutely!
Ray Saltini: Thank you. Good. Great! We have these two ends of this project. Obviously, there is the caching of the rights and credentials and of the data, but what we were also asked to do is put together a very quick demonstration of how developers could access feeds from the MTA and, in particular, these real-time feeds. Typically, the development community that the MTA has been supporting for quite some time now, in the interest of increased transparency, with other feeds, would develop a native phone application, if you will. In this case, the business requirement from the MTA was to actually develop something in Drupal. That posed some challenges in particular, because some of the transit feed specifications did not have a native PHP parser. Effectively, this stage of the project involved setting up a completely standalone Drupal 7 site, and it is a terrific demonstration of how Drupal can be used as a platform to serve a mobile application. We developed a responsive theme specifically for the iPhone, a mobile web theme, and used fairly typical modules from Drupal’s contributed modules section to accomplish this. What was not easy about the project, of course, was getting around the fact that there didn’t exist a parser for the transit files used by the transit associates of the MTA. So we were able to actually tap and customize a library that was contributed for this and is available on GitHub. We actually had to rewrite parts of it because the stack that we were on was built on one version of PHP, and the library, the General Transit Feed Specification parser, was built around a different version of PHP. So, what you have on your right is an actual copy of the General Transit Feed Specification that’s used to convey this information. 
The real-time data is actually a binary file, but the static feed is a CSV file, which we actually didn’t handle directly, but the two feeds were matched up in the application to give users the ability to go ahead and select stations, which obviously is a static list of stations, and then see the actual arrival time of their train. So, one of the things that we can do is go to a quick demo of the MTA site. So, I guess it would be helpful if you want to switch over to me. If Evgeniy has that - I should have that up here on my screen if I can manage to find it.
Evgeniy Kashchenko: Yes, maybe I can turn it up. Just give me one minute. It’s just a bit slow with the screen sharing, Ray.
Ray Saltini: Yes. [Pause] I have it up if you want to transfer the point. [Pause]
Moderator: Ray, I passed it to you so you could share your screen.
Ray Saltini: Terrific. So folks can see here that it’s really optimized for the iPhone, and it’s limited. The choices, of course, are limited by the infrastructure that the MTA has in place to actually deliver the feeds for a particular line, but we can take the local line and then we get a little map presented here. Then we get to choose to see when our next train will be at a particular station, and click 168th Street in Washington Heights. This is something that you can actually try yourself if you go to datamine.mta.info. You’ll see this application on your lower right-hand side; then click and you’ll get your arrival times for the next several trains. So, it’s a very, very simple implementation of this feeds technology, but it’s an implementation that is being expanded upon every day by the developer community. It is leveraged by more and more feeds as they become available through the MTA. That is the heart of the matter on this application screen. Never maximize your screen when you are using two monitors. Okay. So whatever’s down here… okay. So, one of the things that the MTA has done is really engage and invest in the open source and developer community, and there are terrific resources, such as the MTA developer resources group on Google Groups, as well. That and, of course, some of the technology sites that we showed you around Edge Side Includes, the transit feeds and Varnish will help you leverage this kind of implementation for your own projects. I think that this was a very out-of-the-box implementation of Drupal and, although Drupal is a very robust enterprise-level website platform, more and more we see it every day being used to create very robust, sophisticated web applications. We also saw an atypical implementation of Edge Side Includes with this, limiting the Drupal bootstrap to really accelerate access to that level of caching. 
It was an important part of the engineering and architecting work that the team was able to accomplish with this project. We were actually able to ratchet the system up to 5,000 hits per second, specifically so that the system could accommodate increased load in the medium term. So, although they only have one feed set up and running at the moment, the system was really built to handle increased capacity on an incremental, phased basis. It really is an opportunity to show off the robustness of Drupal and some of these other technologies when they’re planned out and built carefully and optimized. So thank you, everyone, for your time and indulgence. If we have some questions, and if there are some things that we could provide some clarity around, we certainly will.
Moderator: Great! Thank you so much, Ray and Evgeniy. If you have any questions, please ask them in the Q&A tab right now and we’ll get to them. [Pause] It doesn’t look like we have any questions. Ray and Evgeniy, do you want to end with anything, or with contact information in case they think of anything?
Ray Saltini: Yes, absolutely! We are available at blinkreaction.com. We offer a growing selection of introductory and advanced Drupal training classes. We, of course, do professional services work on the front and back end as a function of our relationship with Acquia. We’re just looking to have the opportunity to share. Evgeniy, I don’t know if you want to add anything…
Moderator: We actually have some questions that came in if you wouldn’t mind answering them.
Ray Saltini: No, not at all.
Moderator: Okay. The first one was, what was the biggest challenge with the parser?
Evgeniy Kashchenko: Basically, it didn’t exist for our version of PHP; it existed for another version. We needed to adapt it: the short array syntax, the use of namespaces and stuff like that. So we needed to scale it back to PHP 5.2 because that’s what was available on the hosting. Basically, that’s how we achieved it.
Moderator: Okay. Great! The next question is how do you handle caching for authenticated sessions?
Evgeniy Kashchenko: Yes, sure. So, basically, there are many different ways to do it. Specifically, for the back-end, we use Edge Side Includes, and that’s how we are able to sustain this huge load while verifying the keys and still providing the almost real-time data, which is cached for just 30 seconds. For the front-end, it’s the basic, standard caching that Drupal provides, such as block caching.
Ray Saltini: So, through the use of this caching that Evgeniy is speaking of, we were able to limit Drupal’s bootstrap. We didn’t need to load a full instance of Drupal in order to update the cache at any given point in time to check developer credentials. This was handled in cache. If at any given point in time a developer’s credentials have been revoked, the access rights for that particular developer, or their access to even a particular feed, will be selectively removed from the cache. This concept of limiting Drupal’s bootstrapping is very essential. So, we used it for what we needed.
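As a rough illustration of checking credentials in cache and selectively removing a revoked key, here is a minimal Python sketch. It is an assumption-laden stand-in, not the project's PHP: the `KeyValidator` class, its plain dictionaries (standing in for Memcache and the Drupal database) and the key `abc123` are all hypothetical.

```python
class KeyValidator:
    """Checks API keys against a cache first, then a slower backing store."""

    def __init__(self, database):
        self.database = database  # stand-in for the Drupal database
        self.cache = {}           # stand-in for Memcache
        self.db_lookups = 0       # counts the expensive lookups

    def is_valid(self, api_key):
        # Fast path: the key was validated earlier and is still cached.
        if api_key in self.cache:
            return True
        # Slow path: stands in for a limited bootstrap plus a database query.
        self.db_lookups += 1
        valid = self.database.get(api_key, False)
        if valid:
            self.cache[api_key] = True  # only valid keys are cached
        return valid

    def revoke(self, api_key):
        # Deactivate the key and selectively drop just its cache entry.
        self.database[api_key] = False
        self.cache.pop(api_key, None)


validator = KeyValidator({"abc123": True})
validator.is_valid("abc123")
validator.is_valid("abc123")
print(validator.db_lookups)          # 1: the second check was served from cache
validator.revoke("abc123")
print(validator.is_valid("abc123"))  # False: the revoked key was removed
```

Removing only the affected entry means a revocation takes effect on the next request without flushing other developers' cached credentials.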
Moderator: Okay. Great!
Evgeniy Kashchenko: Yes, and actually there are a couple of other questions about how it works internally. So, I have two more slides that I can go over, specifically on what’s stored in Drupal versus what’s stored in cache, how it all works, and how the results are matched with GTFS data. I just saw a couple of questions about it. So, if you want to make me the presenter…
Moderator: Okay. Let me pass it over to you. [Pause]
Evgeniy Kashchenko: So are you passing the presenter to me then?
Moderator: Yes, you have it.
Evgeniy Kashchenko: Maybe it’s on the way [laughter].
Ray Saltini: Maybe it’s still seeing the Acquia presentation.
Evgeniy Kashchenko: Yes, I still see questions slide.
Ray Saltini: Is it me who has it?
Moderator: You’ll have to re-share your screen. [Pause]
Evgeniy Kashchenko: Okay. Let’s see. Does it work?
Evgeniy Kashchenko: Okay. Cool! So, I hope this will answer some of the more technical questions that were asked already. This is a more detailed architectural diagram of how it works, or the data flow. Here is how the requests go through. Basically, when somebody tries to request a feed through the API, the request is sent to Varnish. That request is actually for MTAESI.PHP. That’s the wrapper that basically checks your key’s validity and then includes, via ESI, the real-time data. So, Varnish first tries to look up this page in its cache in internal memory, and if it’s not found, it goes to the back-end to actually get it. What happens there is we first check in Memcache whether we know this key and, if it’s valid, we can basically return the data at this step. If not, then we need to load Drupal at a low bootstrap level, which is basically the level needed to get access to the database. The bootstrap steps are on the right of the screen here. So, we take the first three, but the rest are skipped. This allows us to save a lot of Drupal initialization time. So again, we bootstrap Drupal at a low level. We get the information about the key from the Drupal database and make sure that it’s valid; after that, we can store it in Memcache for further reference and return the ESI tag for the data file, which we do here using this ESI tag that is on the right. Basically, the path here is the location of this file on the file system. It gets there by the MTA pushing it to that location so that it is up to date all the time. So, we serve this tag with PHP; then Varnish understands that it’s not something that it should return as is, but that the tag needs to be replaced with the file that is here, which is done, and then the result is returned to the consumer. So, basically, Varnish does two different types of caching. 
First, it caches the MTAESI.PHP response for your access key, and that duration is 30 minutes, whereas the included file itself is cached for only 30 seconds, because that’s the frequency of the updates and we want the data to be up to date. So, basically, what this allows us to achieve is that the file that is included via ESI is shared among all of the requests. The first request to Varnish populates it, and then it’s just served from cache, whereas a request to MTAESI.PHP is key specific. If you have accessed the service within the last 30 minutes, then Varnish will remember you and will give you the file you want. Otherwise, if it’s been an hour or a couple of hours, then we’ll still have the data about the key in Memcache, which has a longer lifetime for your key. If the key is for some reason deactivated by admin personnel, then it is also removed from Memcache. So hopefully this explains how the caching is done for the API portal. Now, let’s talk about the demo app and how the data is matched there. Basically, we have two types of GTFS data files here. One is static, and that one has train schedules, basically, train names and stations, whereas the real-time one has actual delays or any changes in the schedule. The static data doesn’t change that often, maybe a couple of times a year. It’s cached in Drupal tables. So, basically, admin personnel can refresh it at will when they need to, and it will be updated in the Drupal tables, whereas the real-time GTFS data is parsed by this protobuf PHP module, which can be found on GitHub at the URL shown here. What it does is parse this binary data file and populate the Drupal cache so that when users come to the website, we can get the data about lines and the different stops on them, and then match it with the real-time data to provide the results that you saw in Ray’s demo. 
So, I hope that helps and we’ll look through the other questions that you might have.
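The two cache lifetimes described above (30 minutes for the key-specific wrapper reply, 30 seconds for the shared feed file) can be sketched in Python as follows. This is a simplified model under stated assumptions, not Varnish itself: the `TTLCache` class, the `handle_request` function, the loader functions and the key names are hypothetical, and the clock is a manually advanced counter so the behavior is deterministic.

```python
WRAPPER_TTL = 30 * 60  # seconds: key-specific wrapper reply
FEED_TTL = 30          # seconds: shared real-time feed file


class TTLCache:
    """A tiny cache whose entries expire independently of one another."""

    def __init__(self, clock):
        self.clock = clock
        self.entries = {}  # cache key -> (expires_at, value)

    def get_or_load(self, key, ttl_seconds, loader):
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh
        value = loader()     # expired or missing: go to the back-end
        self.entries[key] = (now + ttl_seconds, value)
        return value


now = [0]  # simulated time in seconds
cache = TTLCache(lambda: now[0])
calls = {"wrapper": 0, "feed": 0}


def load_wrapper():
    calls["wrapper"] += 1
    return "esi-tag"


def load_feed():
    calls["feed"] += 1
    return "feed-v%d" % calls["feed"]


def handle_request(api_key):
    # The wrapper entry is per key; the feed entry is shared by every key.
    cache.get_or_load(("wrapper", api_key), WRAPPER_TTL, load_wrapper)
    return cache.get_or_load("feed", FEED_TTL, load_feed)


for key in ("key-a", "key-b", "key-a"):
    handle_request(key)
print(calls)  # {'wrapper': 2, 'feed': 1}

now[0] = 31   # the feed entry has expired, the wrapper entries have not
print(handle_request("key-a"))  # feed-v2
print(calls)  # {'wrapper': 2, 'feed': 2}
```

Because each entry carries its own expiry, nothing is flushed wholesale: only the shared feed entry is reloaded every 30 seconds, while validated keys stay cached for their longer lifetime.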
Moderator: Okay. Great! The next question is, is Drupal storing data from the parser or simply rendering it? If it’s just rendering the data, do you have to write a custom module to interpret that data and output it into the view?
Evgeniy Kashchenko: Yes, basically, that’s what I just went over. So the real time data is parsed by this protobuf PHP by Ivan Montes. Then it’s stored in Drupal cache.
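The matching of static and real-time data can be sketched in Python. This is only an illustration under assumptions: the stop IDs, the second station name and the timestamps are made up, the static data is reduced to the `stop_id` and `stop_name` columns of a GTFS `stops.txt` file, and the real-time feed is represented as already-decoded dictionaries rather than the binary file the demo app parsed.

```python
import csv
import io

# Hypothetical excerpt of a static GTFS stops.txt file (CSV).
STOPS_TXT = """stop_id,stop_name
S1,168 St
S2,Example Ave
"""

# Stand-in for decoded real-time updates; times are seconds since the epoch.
realtime_updates = [
    {"stop_id": "S1", "arrival_time": 1000},
    {"stop_id": "S2", "arrival_time": 1100},
    {"stop_id": "S1", "arrival_time": 1300},
]


def load_stop_names(stops_csv):
    """Map stop_id to stop_name from the static feed."""
    reader = csv.DictReader(io.StringIO(stops_csv))
    return {row["stop_id"]: row["stop_name"] for row in reader}


def next_arrivals(stop_name, stop_names, updates, now):
    """Return whole minutes until each upcoming arrival at the named stop."""
    stop_ids = {sid for sid, name in stop_names.items() if name == stop_name}
    times = sorted(
        update["arrival_time"]
        for update in updates
        if update["stop_id"] in stop_ids and update["arrival_time"] >= now
    )
    return [(t - now) // 60 for t in times]


stop_names = load_stop_names(STOPS_TXT)
print(next_arrivals("168 St", stop_names, realtime_updates, now=940))  # [1, 6]
```

The static feed supplies the human-readable station list, and the real-time feed supplies the arrival times; the stop ID is the join key between the two.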
Moderator: Okay. The next question is, how many instances of the app do you have for the load? How scalable is this?
Evgeniy Kashchenko: There are a couple of servers that are shared among a couple of different applications that the MTA has. So it’s not only this demo app and the API portal; there are also some other Drupal applications that the MTA had already built before. What makes it performant is actually the caching architecture and infrastructure that is put in place.
Moderator: Okay. Great. Are there any real-time statistics? Let’s say - how many users are using it at the same time?
Evgeniy Kashchenko: There are some based on operational data. There is a plan to add more of real time monitoring there and generate administrative reports, as well. That’s something that is planned for the next phase. In terms of real numbers, I’m sorry. I just don’t have those [laughter].
Moderator: Of course. The next question that came in is, so does it mean that the entire cache is being flushed and rebuilt every 30 seconds?
Evgeniy Kashchenko: Let me go back to this slide. I’m not sure which application you’re asking about, but if it’s the API portal, then the part with your key is cached for 30 minutes within Varnish and for a longer period within Memcache, and the file itself with the real-time data is refreshed every 30 seconds. If you’re talking about the demo app, then again, there are two pieces in it. The static data is refreshed as needed, and the real-time data - I think right now, it’s set to a minute or so. So again, it’s an app for demo purposes. The intention is for other developers to understand what data is inside and how it can be leveraged.
Moderator: Okay. Great! I think that’s it for questions. On my end, the recording will be posted in the next 48 hours to the website and we’ll email it out to you. I want to say a big thanks to Ray and Evgeniy for presenting and thank you everyone for attending. Evgeniy, do you want to end with anything?
Evgeniy Kashchenko: No. Thanks a lot everybody for joining us today and let me just share this slide. So basically, you can reach me or Ray at the emails that are on the screen. It’s firstname.lastname@example.org or evginey.ka - oh wait a minute that’s misspelled. It’s email@example.com.
Moderator: Alright. Thank you so much.
Evgeniy Kashchenko: Again, thanks for participating today.
Ray Saltini: Thank you.