Three Powerful Tools for Improving the Performance of your Drupal Site [June 21, 2012]
Want to learn more about Acquia’s products, services, and happenings in the Drupal Community? Visit our site: http://bit.ly/yLaHO5.
Since its open-source inception in 2001, Drupal adoption has spread like wildfire. Developers have fallen in love with the speed at which they can innovate, and organizations large and small are exiting five+ year investments in proprietary CMSs in favor of the agility Drupal delivers to the business.
Although there are over 16,000 modules, only a small percentage are geared toward the performance and scalability of Drupal applications. Acquia and its partners have solutions that make it just as easy to build a high-performing Drupal site as it is to build an innovative one.
Join this webinar to learn how you can improve the performance of your Drupal site with the help of tools available in the Acquia Network including Acquia Insight for code and configuration testing, Blitz.io for load testing, and New Relic for application performance analytics.
Participants of this webinar will learn how to:
* Uncover Drupal code and configuration mistakes
* Test performance and scalability
* Detect and resolve common Drupal performance issues
Speaker 1: Hi everyone. Thank you for joining the webinar today. I’m going to go over a few slides and then pass it over to Michael Cooper. Today’s webinar is Three Powerful Tools for Improving the Performance of Your Drupal Site, with Michael Cooper from Acquia, Michael Smith from Blitz.io, and Bjorn Freeman-Benson from New Relic. For more information you can always visit Acquia.com or email us at email@example.com or call us at any time. Now I’m going to pass it over to Michael Cooper.
Michael C.: I want to talk to you a little bit about the Acquia Network and one of its main pieces, Insight, which can be used to help you get better performance from your Drupal website. I’m going to give an overview of how our tools can make sure all of your bases are covered and everything’s functioning properly, and then from there Blitz.io will talk about how you load test your site. Obviously it doesn’t make sense to load test your site until you know everything is set up properly. Then following that, New Relic is a great tool for digging into the reasons why your site might have either a) failed load testing, or b) for ensuring that new things don’t get introduced into your site afterwards that affect performance, so that your uptime and your scalability stay in place.
Our main goal at the Acquia Network is simply to make it easier to fully develop and maintain Drupal websites. All aspects of a Drupal website, from beginning to end, are things that we want to help and empower you to do better with. Our slogan is something like “software and services to help maintain killer web experiences built on Drupal.” I think the last part, even though it’s not bold, is the part I feel most strongly about as a geek, because I want stuff to be awesome and to rock and run amazingly. I believe quite firmly that Drupal is a platform for doing that.
We have three main offerings as part of the Acquia Network. We have a library which has a lot of answers. This is a knowledge base written by Drupalists and maintained and curated by Drupalists. It gives you a lot of answers for edge cases, for weird scenarios, and even for best practices, like how you do reverse proxy caching and which one to use, or whether to use them together; a lot of things like that can be very useful to someone wanting to get really into the nuts and bolts of how to improve performance, or just if you need to understand some of the basics.
And of course we have our Acquia support team, which has your back, and the cloud services, which have all the tools that you can use, be they our internal tools or third-party services like New Relic and Blitz that will help you make your site better. As far as the library goes, we have 800 articles (probably a lot more than that; probably over 900 articles now), thousands of FAQs at this point, and lots of podcasts, webinars, and videos, with a lot more coming that is going to improve it even further.
For example, Drupalize.Me: if you sign up for an Acquia Network subscription, you actually get a free account on Drupalize.Me, and I believe they have hundreds of additional videos on how to perform tasks with Drupal, how to improve your experience with Drupal, and how to just get a better overall experience. Then of course there’s our support team, though “support team” is not really a great way to think of it, because these folks are actual Drupalists. They are all developers and site builders. They’re system admins and whatnot who have had years of experience in the field and now operate as our customer service team, and they will help you with any problem you have in Drupal, be it integration with a third-party service or even your own custom code. It’s round-the-clock support as well. We have people all around the globe doing this; we’re always there.
Then of course what we’ll be focusing on today is the tools that you can use for extending and managing your site and improving your entire experience, for yourself and for your customers. The three main tools that we have are Acquia Insight, Acquia Search, and SEO Grader. Insight is the one we’ll be focusing on the most today, because it provides real-time analysis and proactive alerts about your Drupal site and its configuration, and a little bit about your stack insofar as it pertains to Drupal. It’s great stuff. It lets you know how your site’s configured and whether you have poor configuration issues. There are a lot of obvious things that we look at and say, “Wow, you should never have these two modules working together because they’re going to conflict and you’re going to have performance issues,” or we say, “In production you should never be running this module here because it’s going to hammer your database the whole time.” We make sure that we give you a lot of these basic recommendations. We make sure that all of the settings for your site are in place and proper as your baseline. Then from there you can ensure, using other tools, that custom code and other scenarios, like creating new views or whatnot, are also not causing problems. This is an essential baseline.
It’s kind of shocking. You’ll see a little bit more when I get into it. Some of the things we check for here will strike you as very obvious, but the thing is, when you have a big website and you have a lot of people working on it, or even a lot of websites, it’s easy sometimes to miss some of the basic things; that’s where Insight really comes in, to make sure that all that stuff is covered.
The SEO Grader, I’m not going to get into it today, but it’s an additional tool that we have that helps you configure your Drupal website to be optimal for SEO, which is a pretty useful thing.
Acquia Search is all about essentially offloading all of the search overhead from your site. Search is really important, but it’s also one of those things that you don’t want to think about too much. What we like to do is use our Acquia Search service to offload all of the search resource load and indexing and everything to our servers. All your Drupal website does is just send off information to our servers. Because we use Apache Solr (we’re using 3.5), we have all of the latest and greatest features in there, and it’s just something you don’t have to worry about anymore, which is a great thing, and also a little boost of performance too.
Something that’s now available too is what we call site portfolio management. If you have a lot of Drupal websites running, you can actually see at a glance if any of those websites have problems: if you have modules that are out of date on a particular website, or flagged performance issues or security issues on a website. It gives you sort of a one-stop-shop report card telling you whether you need to do anything with your sites and what you need to do. Then from there you can create all of your to-dos off of that list.
The Insight overview page is the main page you get to when you come into Insight. What we do is give you a score based on about 150 different tests that we run on your site. Not all sites have all 150 tests applied to them; it really depends on what modules you’re running, what version of Drupal you’re running, and whatnot. There’s a big set of tests that can get applied. We score your website based on how well we think you have it configured. This particular website here is a test website and it has an 83%. It has a decent score, but a number of important tests have failed. In this particular scenario, they’re mostly security tests. Some of them may be minor ones, but we have them flagged along with the things that are important.
The number of tests that pass or fail does not always directly correlate to the actual percentage score, because some things of course are much more minor. Then down at the bottom you can see that we give historical graphs too. You can look at this and say, “Wow, we did see a decrease in leads or web traffic, or maybe we saw an increase in web traffic,” and you can correlate that to the score and use it to determine why historical events might have occurred. Also, you’ll notice at the top here, we flagged an important alert that one or more modules on this website has a security update that’s necessary. There are a number of different alerts that are considered super critical and will actually surface all throughout our UI; wherever you go in the UI, this alert will be there to bother you all the time, saying, “You have page caching turned off,” or “You have modules that are out of date with security updates,” or “You have things like unauthenticated users being able to use the PHP filter.” Stuff like that is what will show up and constantly bother you at the top of the site.
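The point that the pass/fail count doesn’t map directly onto the percentage can be illustrated with a toy severity-weighted score. This is not Acquia’s actual formula; the weights and test names below are invented for illustration:

```python
# Hypothetical severity weights: failing a critical test costs far more
# points than failing a minor one, so two sites with the same number of
# failures can end up with very different scores.
SEVERITY_WEIGHTS = {"critical": 10, "major": 5, "minor": 1}

def site_score(results):
    """results: list of (severity, passed) pairs -> percentage score."""
    total = sum(SEVERITY_WEIGHTS[sev] for sev, _ in results)
    earned = sum(SEVERITY_WEIGHTS[sev] for sev, passed in results if passed)
    return round(100 * earned / total) if total else 100

# Same pass/fail count (8 of 10 passing), very different scores:
minor_failures = [("minor", False)] * 2 + [("critical", True)] * 8
critical_failures = [("critical", False)] * 2 + [("minor", True)] * 8
print(site_score(minor_failures), site_score(critical_failures))  # 98 29
```

Both sites fail exactly two tests, but the one failing minor tests barely loses any points while the one failing critical tests drops to 29%.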
When you have an alert there’s a handy “fix it” link, and this is an example of what you could get. This particular one shows the database logging module enabled. It explains why you don’t want to have the database logging module enabled, and we provide options as far as what other systems we think can give you the same thing; in this case, syslog is the one you should run. There might be some scenarios, we understand, where maybe it’s a development environment, or maybe you’ve said, “You know what, this is a really small website, I’m not that concerned about database logging,” and you can choose to ignore certain alerts. That way you can say, “I understand why Acquia is offering this recommendation, but I have my own workaround in place and I feel alright,” and you can ignore that alert and it will no longer count against your score.
If you remember back to the very first page, the screen I showed you with Insight, one of the things was that there was a security update for a module. What I have here is a page in Insight that shows all of the modules for the site that need updates, and you can get to that page simply by clicking this code tab. It shows me a bunch of modules in here that need to be updated. All of these either have bug fixes that are the reason for the update or they’re security updates. I know that this is basically a to-do list for myself or for one of the developers that needs to take care of this, so we’ll go and perform these updates. If I click on one, say “panels,” I’ll actually see that I am running version 3.9, which was from November 1st last year, and there’s a new version out, 3.10, which has security updates and bug fixes. We give the download link as well as the project page, so if you wanted a little bit more info you can get into that.
One of the things that…let me go back two slides and just point this out. You’ll see, looking down here, that a couple of entries have a “view diff” link over on the right. What that means is that these modules have been modified, which is always the bane of anyone trying to maintain something like Drupal. Maybe you took over a site from somebody else, or you outsourced the site, and you’re not positive whether any modules have been modified. What this allows you to do is see at a glance if any of the modules have been modified and know from there what you have to do. We know we have to update these two modules, and we know they’re modified, so we know we actually have to go in and do something about that; we can’t just willy-nilly update these modules, because we might actually break functionality on the site.
If I look at this one module, Typekit, I can see in here that the developers have made a change where they’ve switched from loading the Typekit JS files off of the Typekit website to loading a local version. Obviously that probably would have been done for performance, which is a good thing. We’ve actually flagged what was changed in the module, we provide a diff, and we’ve also highlighted the fact that this typekit.js file is not in the module you get off of drupal.org, so if you were to go and update this module, you’re going to need to either a) see if the new version has this functionality, or b) modify the new version of the module as well to have this functionality.
Another thing, and this is actually kind of interesting: this is something that I spotted just using the tool while I was preparing these slides a little while back. The token module needs to be updated. I looked at it and thought that’s kind of odd, because it’s one of those modules you would almost never have a reason to change. When I looked at the actual module I saw that inside of this folder there’s a second copy of the token module, and it says neither of these files is in the distro; and yes, because I look at this and I see token, then token.module and whatnot. What that means is that at some point in the past, instead of just updating the module, someone accidentally put a second copy of the token module inside the token module. If you’ve ever looked at how Drupal figures out which module to load, you almost have no idea which of those two code bases is actually being used as the token module. Updating this module might not actually do anything, because Drupal might read the other, erroneous version of the module. This is sort of an eagle-eye thing; it’s something you may not spot on a regular basis, but because Insight highlights modules that don’t look the way they’re supposed to, you can quite easily pick out things like that.
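Both checks above, flagging modified files and flagging files that aren’t in the drupal.org release at all (like the stray nested token copy), can be sketched by comparing content hashes of the installed module against the pristine release. This is a simplified illustration, not Insight’s implementation; the file contents below are made up:

```python
import hashlib

def fingerprint(files):
    """Map each relative path to a SHA-256 digest of its contents."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}

def audit_module(pristine_files, installed_files):
    """Compare an installed module against the pristine release.
    Returns paths that were changed, added (not in the distro), or removed."""
    pristine, installed = fingerprint(pristine_files), fingerprint(installed_files)
    return {
        "changed": sorted(p for p in pristine
                          if p in installed and installed[p] != pristine[p]),
        "added": sorted(p for p in installed if p not in pristine),
        "removed": sorted(p for p in pristine if p not in installed),
    }

# A locally patched file shows up as 'changed'; a stray nested copy of
# token.module shows up as 'added' (i.e. not in the distro):
release = {"token.module": b"<?php // pristine", "token.info": b"name = Token"}
on_disk = {"token.module": b"<?php // locally patched",
           "token.info": b"name = Token",
           "token/token.module": b"<?php // accidental second copy"}
print(audit_module(release, on_disk))
```

Anything in the "added" bucket is exactly the kind of surprise (a patched JS file, a duplicated module tree) that makes blind updates risky.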
Once you’ve got all of your modules up to date and you’ve got all of your security stuff in place, you know that you’re in a good place, and now you’ll start looking at performance. We can see that this particular website is pretty good from a performance standpoint; 16 out of 19 checks have passed. We have a lot of different checks; you can’t see them all in the slide, they run off the page. We check for things like: is page caching turned on, do you have reverse proxy caching in place, are you sending out the right cacheable headers to browsers, are you using APC or memcache; all of those sorts of things get checked. A couple of things have been flagged here. The Views UI module is enabled; that’s not a huge performance hit, so it doesn’t actually have a very big impact on the score, but it is there. The database logging module can be a huge impact on performance, because if you have your site set to log notices you could be writing hundreds of things into the watchdog database table on every single page load, so that could be a huge performance impact. And the devel module can do a lot of extra logging and whatnot that could also impact performance, so we recommend of course turning them all off. If you click the “fix it” link, it explains what the actual impact is, gives some information about that, and gives the link to turn it off and all that sort of thing as well.
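The “right cacheable headers” check mentioned above can be approximated with a first-pass look at the Cache-Control header. This is a deliberately simplified sketch: real reverse proxies also honor Expires, Vary, Set-Cookie, and more, so treat it as a heuristic only.

```python
def proxy_cacheable(headers):
    """Rough check of whether response headers permit reverse-proxy caching.
    Returns False on explicit no-cache/no-store/private directives; otherwise
    requires an explicit freshness lifetime (max-age or s-maxage)."""
    cc = headers.get("Cache-Control", "").lower()
    directives = {d.strip() for d in cc.split(",") if d.strip()}
    if directives & {"no-cache", "no-store", "private"}:
        return False
    return any(d.startswith(("max-age", "s-maxage")) for d in directives)

print(proxy_cacheable({"Cache-Control": "public, max-age=300"}))        # True
print(proxy_cacheable({"Cache-Control": "no-cache, must-revalidate"}))  # False
```

A missing Cache-Control header also comes back False here, which is the conservative answer: without an explicit lifetime, a proxy cannot safely reuse the page.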
Beyond that, everything now is good to test, so what you could then do is say, “You know what? I have only two issues currently flagged on this website, they’re under security, I don’t think they’re a big deal, I think I’m ready now to actually start doing some load testing on my site.” At this point we’ll hand it off to Mike Smith at Blitz.io to make sure your site’s performing as well as you want it to.
Mike S.: Hi, thanks everyone. Give me one second and I’m going to screen share, and then I’ll walk through a demo of using Blitz and using it to tune the performance of your site. I’ll just give it one minute to load. While it’s loading, I’ll give a little background on Blitz. Blitz is a cloud-based load testing service. It’s a pay-as-you-go service with a freemium payment model. We offer a free sign-up to any user. When you sign up, you get our basic plan, which allows you to test with up to 250 concurrent users running against your site. From there you can scale up. We support scaling all the way up to 50,000 users if you’re really ambitious and want to see whether your site handles that amount of traffic. The pricing model is very flexible. You pay for the time you want to use and for how much testing you actually need.
What I’m going to walk through is an example that I’ve created using a very plain default Drupal install which I set up on a server. It has a single page; it’s actually a single static page. What I’m going to demonstrate is one of the things that you just heard about, which is the impact of turning page caching on and off on your site, and show you how you can use Blitz.io to demonstrate the real-world impact that it has on end users.
Let me go back to Blitz. Here we have our initial sign-in screen. I’m going to go ahead and log in. For this purpose I’m just logging in directly to the Blitz site. We also have a partnership with Acquia, so if you’re using the Acquia dashboard or the Acquia tools you can actually add Blitz as an add-on and sign in to Blitz directly from there. The benefit of that is that you don’t have to create a second account; Blitz will do single sign-on for you. It’s just a little bit more convenient, and also, as an additional thank-you, we are offering a free one-week trial where we’ll bump your user capacity up to 1,000, up from the normal 250. It’s a good way of trying out the service with a somewhat more demanding load test.
What we have here is the first page you see when you log into Blitz. We have what we call the Blitz bar. We’re going to start out by doing what we call a sprint, which is really just a single request against your website. I’m going to take the URL from my example site, paste it into the Blitz bar, and start my sprint. What’s happening, or what’s happened already, is that a Blitz engine running in a data center in Virginia performed a request against my site and told me what it got back. It told me that the site responded in 125 milliseconds, which is actually a pretty decent response time for a plain vanilla app. It also gives me some additional information about the request: it tells me what the response headers were and what the request headers were. When we did the request we identified ourselves as coming from Blitz, so if you’re a site admin and you see traffic coming to your site it’s a little easier to know where it’s coming from. You can use this if you’re interested in seeing some of the technical details about how your site might be configured, or how Blitz is seeing your site.
One thing you might notice here is that the Cache-Control header is actually set to no-cache, which means there’s no caching happening whatsoever. Every page load that your users trigger is dynamically generating the page, which is definitely a big red flag in terms of performance. Let’s take the next step, which is actually running a load test and seeing how the performance changes when you go from just one user to dozens. Let’s say we’re going to start with one user and ramp up to 50 users over a period of 30 seconds. This will give us a gradual ramp, and we can see how the performance changes over time.
The load test starts running immediately and we’re seeing the results as it’s running. We’re basically showing you both the response time and statistics on the number of hits that your site is seeing. If we go down here we see that it initially starts off with a response time of around 200 milliseconds. Pretty quickly we get to the point where we have about 20 users and the response time has jumped up to 1 second. For every user, it’s taking a whole second to load the page. By default we have a timeout of one second configured; at that point we actually start counting these requests as timeouts. Potentially a user came to your site, it was loading for one second, they got bored, and they decided to go look at something else. That’s essentially what you’re seeing here. When you initially looked at the site with the sprint, with a single request, the performance was decent, but you can immediately see that once you start getting a number of concurrent users, the performance really goes downhill quickly.
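Blitz drives real HTTP traffic, but the shape of the result, response times climbing past the timeout as concurrency ramps up, can be sketched with a toy simulation. The latency models and step counts below are invented for illustration; this is not the Blitz API:

```python
def simulate_rush(latency_model, start_users=1, end_users=50, steps=10, timeout_s=1.0):
    """Ramp concurrency linearly and record, per step, whether the modeled
    response time exceeds the timeout (counted as a lost user)."""
    results = []
    for i in range(steps):
        users = start_users + (end_users - start_users) * i // (steps - 1)
        latency = latency_model(users)
        results.append({"users": users, "latency_s": latency,
                        "timeout": latency > timeout_s})
    return results

# Hypothetical latency models:
uncached = lambda users: 0.2 + 0.05 * users   # dynamic rendering degrades with load
cached = lambda users: 0.05                   # served from cache: flat

print(sum(r["timeout"] for r in simulate_rush(uncached)))  # 7 of 10 steps time out
print(sum(r["timeout"] for r in simulate_rush(cached)))    # 0
```

With the uncached model, everything past roughly 16 concurrent users blows the one-second budget, which mirrors what the live test shows around 20 users.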
If you were running this site, one of the very first things you would probably want to do, and one of the things recommended to you, is to configure your Drupal site to enable full page caching. I am by no means a Drupal expert, but I did manage to figure out that you are able to do that very easily through the configuration. What I’m going to do is turn on page caching for anonymous users. It’s pretty much just a one-click operation here; configuration options saved, website still up. Now I’m going to go back to Blitz and run the exact same load test. If you remember, before, we got to about 20 users before the performance really dropped off. Let’s see how things are doing now with page caching enabled. You can immediately see the response time is already much lower; it’s less than 100 milliseconds. In this case, the page is being pre-generated and served from the cache; it’s not being dynamically generated on every hit. You can see that the performance is great. Most of the users coming to your site are immediately getting a response, and your server is happily handling all the way up to 50 users. That’s great news. Of course you can scale this up more. I’ll go up to 250 users, and we’ll probably start hitting some more issues with that. From this you can see that Blitz is a useful tool when you’re configuring your site and trying to figure out your peak performance, but you might also use Blitz to determine whether the performance of your site has changed over time. Maybe you made a configuration change or deployed it on some different servers; something like that where you want to know what the impact was.
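The one-click Drupal setting above can be pictured as a render-once cache sitting in front of the page builder. This is a hypothetical sketch that ignores everything the real page cache handles (expiry, cache clearing, per-role bypass), but it shows why the Blitz numbers flatten out:

```python
import functools

def page_cache(render_fn):
    """Serve anonymous page requests from a stored copy after the first render."""
    store = {}
    @functools.wraps(render_fn)
    def serve(path):
        if path not in store:
            store[path] = render_fn(path)   # the only expensive dynamic render
        return store[path]
    return serve

renders = []

@page_cache
def render_page(path):
    renders.append(path)                    # stands in for bootstrap + DB queries
    return f"<html>{path}</html>"

for _ in range(50):                         # 50 hits, like the Blitz rush
    render_page("/")
print(len(renders))                         # 1 dynamic render served 50 requests
```

Fifty concurrent anonymous users cost one dynamic render instead of fifty, which is exactly the sub-100 ms flat line the second test shows.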
Going back to our results here, what we saw was that the performance with 250 users definitely slowed down quite a bit, so after a point we started getting back to more like one-second response times, which is probably not ideal. We also got a number of timeouts, because we went over one second. From this point you might start investigating other things, like doing some more database tuning, or you might try pushing onto more capable hardware. You might try something like deploying a load balancer in front of your site. A lot of these things do well to help you identify some of the internal bottlenecks of your application.
One thing that I want to demonstrate quickly is that we actually try to integrate with the tools that our users are comfortable using. We want Blitz to be something that you can use without having to learn a bunch of new things. We want to work with what you already have. An example of that is our analytics plug-ins. One of the ones that we support, one of the first ones we added actually, was the New Relic analytics plug-in. If you have a New Relic account, you can configure your Blitz account to connect to New Relic and get some internal metrics on your machine, such as how much memory is being used or how much CPU time is being taken, and you can have those results overlaid onto your Blitz chart. It’s a good way to get started with performance. Of course, from there you could go into the full New Relic user interface, where you get a lot more detail once you log in, but this was just a demonstration of how the two tools can integrate together.
One other thing that I really wanted to point out is that we actually provide an API, so that you can run Blitz in an automated fashion if you have some sort of internal tools that you want to integrate with; we provide API clients for a variety of different languages. There’s really a lot of flexibility. There are a lot of capabilities that of course we don’t have time to cover today, but I definitely would encourage everyone to check it out. Especially if you are a member of the Acquia Network and you can take advantage of the one-week offer, that would be great. I think that’s about my time. At this point I can go ahead and hand it off to New Relic, and they’ll give you a bit more of an introduction to how you can use New Relic to further tune the performance of your site.
Speaker 2: Well, let’s see. Hopefully the slides will switch to my slides and people can hear me. Maybe someone will speak up if they can’t. That was great from Mike, looking at Blitz. That’s pretty interesting stuff. I’m going to tell you about how New Relic monitors web applications, and how you can use it with Acquia Insight, with Blitz, or just standalone. Just a brief background: New Relic is an application performance monitoring tool that runs on your site in production. We deliver software as a service, just like Acquia does, on a monthly subscription. We track both the application performance and the real user performance, what’s typically called real user monitoring, so we can see how long each page takes in the browser, and we have server monitoring as well.
Mike showed how Blitz takes some of that data and shows it with their stuff as well; that’s pretty cool how they integrate there. The idea behind New Relic is that it’s a production monitoring system. In addition to being able to use it in testing environments, you want to run it live all the time, because we’ve found that users tend to find much more interesting problems in your site than anyone could ever imagine in a testing environment. I’m not saying you don’t want to do load testing, but you definitely also want to monitor production.
I thought I’d show a few things about how you find some common Drupal performance problems. One of the most common Drupal performance problems is slow database queries, and you can find those with New Relic. Let’s look at a couple of examples. Here’s a screenshot from when I was browsing Dries’s site, Dries’s being a Drupal site. You can look at some of this information and you see here that Dries’s site has a certain amount of time every minute that goes to PHP, a certain amount that goes to database time, and so on. We can drill down into that a bit further and ask: why are we spending so much time in the database there? We can drill down in a couple of different ways. We can either drill down into database time, where we look at it filtered a different way, to see which of the SQL statements being executed are the slowest ones on the site. Let’s drill down a little bit further; here’s a general view (again, a screenshot of a general view of database queries) that happened on Dries’s site, and you’ll see that the number one database query consuming the most time is a select statement on the accesslog table. I don’t know exactly what modules Dries has running on his site or how the code’s written, but my guess is that we’re looking there at whether the user is currently logged in or has logged in recently, and we’re accessing that table a lot. There are a few other things that are taking up some time. We could drill in a little further and look at, say, the url_alias table and find out where that table happens to be referenced in the code. If we look down at time consumption by caller, we see that the default node page callback takes up most of the time, and then there’s another function that looks at the URL alias, and so on down the line. We can go even further than that and drill down to see the actual SQL statements being executed that take a lot of time.
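The drill-down described above, ranking SQL statements by the total time they consume, can be approximated from a simple query log. The log format and query text below are invented for illustration; an APM like New Relic does this normalization and aggregation for you:

```python
from collections import defaultdict

def slowest_queries(query_log, top=2):
    """Rank normalized SQL statements by total time consumed across all calls.
    query_log: iterable of (sql_text, duration_ms) pairs."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for sql, ms in query_log:
        totals[sql] += ms
        counts[sql] += 1
    ranked = sorted(totals, key=totals.get, reverse=True)
    return [(sql, counts[sql], totals[sql]) for sql in ranked[:top]]

# Hypothetical log: a moderately slow query run often beats a slow one run rarely.
log = ([("SELECT * FROM accesslog WHERE uid = ?", 40.0)] * 5
       + [("SELECT alias FROM url_alias WHERE source = ?", 12.0)] * 3
       + [("SELECT * FROM node WHERE nid = ?", 5.0)] * 2)
print(slowest_queries(log))
```

Note that ranking by total time, not per-call time, is what surfaces the accesslog query first: 40 ms each is unremarkable, but it runs on every request.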
This is a screenshot not from Dries’s site but from our own site, since I couldn’t find a good example from his. Here we have the actual SQL statements. Here’s one that took hundreds of seconds, like 4 minutes, which is ridiculous. You can drill down even further. One of the things about New Relic is that you start from the general view; if you’re using Blitz to drive your load and you see a slow point, as Mike was demonstrating, you drill down into that and you just keep drilling down until you find the problem.
Here’s a screenshot where we’ve drilled down all the way to find the slow SQL statement and we’ve done a query analysis, i.e. an explain plan; New Relic does that on the fly for you, and it also shows you the exact spot in the code where the statement is being executed. You can see all of this and then you go, “Oh look, here’s where I need to improve that particular performance.”
We can look at this another way as well. That was looking at what the slow SQL statements are, but we might also look at it from the point of view of which slow web transactions, or web requests, take more time and are caused by SQL statements. We can drill down through transaction traces, and here we see a particular piece of code called Find and Aggregate; it issues a lot of SQL calls. We can then keep drilling down and find out that when people come in with this web request, it causes a particular set of SQL calls which are slow. This is a way you can find out what’s going on there.
Now, one of the things that you might be interested in is what sort of overhead you’re incurring in external HTTP calls. A lot of people are building sites today (we’re building sites today) that make a lot of external calls, and Dries’s site is a perfect example of this. If we look again at this screenshot of Dries’s site, we see the web external time, that dark green color; a certain amount of the overhead is allocated to that, so what’s going on there? Let’s drill down into why the site is making these external calls that are then slow, since we’re packaging together services like that. If we drill down into what’s going on inside, we find out that it’s calling this particular IP address, and if you look further into that IP address, you discover that what it is, is a comment spam checking service. When people post comments on Dries’s blog, it calls out to this spam checking service, which has a variable response rate.
Here’s a similar shot, again from our own site. We have various systems that we call, like the Heroku API and Campfire and so on. You can see multiple services if you call multiple services. Here’s another view of the same thing for Dries’s site. Here we’re showing that Dries’s site calls out to the database a lot of the time and to the external comment service some of the time; the database is very quick, while the external service tends to be a bit slower. If you have a lot more services, you see a much wider range of these things. You can see what the range of calls is, how fast those things are, and so on. You can also look at how those services have been responding historically. Again, much like Mike showed you with Blitz, you might want to look at what’s been going on not just today but in the past: where the characteristic changes are in your performance, and in this case in the external services you’re using, not just the services you’ve written. You can compare them historically, or you can drill down into particular time frames. If we look at this particular time frame, we see that briefly the external service used for comment filtering took a bit more time than it had before during a particular period, so we could look into spikes and find out why they’re slow.
A third reason your site might have performance problems is that you’re just getting a lot of traffic. If you were using Blitz to generate a lot of traffic, you would know why you had a lot of traffic, but what if you have a lot of traffic in general on your production site? Well, you can go look at the graphs and see what your throughput is. Dries’s site doesn’t get a lot of traffic, or it didn’t when I was taking these snapshots: 20 to 30 page hits per minute. But if you had more than that, you could drill in and look at what was going on in production.
This is one of my favorite charts in New Relic, partly because it’s just pretty with lots of colors, but also because it’s very useful for looking at the scaling issues as your site comes under more and more load. If we had been running this chart during that example Mike was showing with Blitz, we would have seen a different curve. Here we’re seeing a very horizontal line, which shows that the site scales very well with load. In that second example Mike showed, where his site started to suffer under the load once he had 250 users, you would see much more of a hockey stick shape. Again, here’s Dries’s site, and here’s one with more data points; this is the one from our own site. Notice that we’re running somewhere between 4,000 and 6,000 requests a minute through our site, and New Relic is collecting all of that information and showing that we’re getting a pretty much linear response time there.
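The flat-line-versus-hockey-stick distinction can be checked numerically from chart data. A toy heuristic, assuming you can export (throughput, response time) points from your monitoring; the threshold of 1.5 is arbitrary:

```python
def scaling_shape(samples):
    """Rough check of how response time scales with load.

    `samples` is a list of (requests_per_minute, avg_response_ms) points.
    Compares the average response time of the busiest third of samples
    with the quietest third: a ratio near 1 means a flat, well-scaling
    line; a large ratio suggests the 'hockey stick' where the site
    suffers under load. Illustrative heuristic only."""
    pts = sorted(samples)                      # sort by throughput
    third = max(1, len(pts) // 3)
    quiet = sum(ms for _, ms in pts[:third]) / third
    busy = sum(ms for _, ms in pts[-third:]) / third
    return "flat" if busy / quiet < 1.5 else "hockey stick"
```

A site like the one in the second Blitz example, whose response time blows up past a certain user count, would come back as "hockey stick" while a linearly scaling site stays "flat".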
You can also see by the coloring that we tend to get a lot more of our hits during the day and fewer during the night; a sort of classic North American traffic curve there.
Another thing that might cause your Drupal site to be slow, and that you would want to check with the load tester, is that it’s just using too much CPU time. One of the things about New Relic, as I said, is that we monitor not only the application but also the underlying server, and the user’s browser as well. Here we’re clicking into the servers tab, looking at all the servers that are running, and we see a page that shows various stats for those servers: unbelievably boring, which of course, as an ops person, is exactly what you want to see; nice level graphs of load averages and so on. There can be cases where it’s not unbelievably boring, when things are taking a long time.
Here’s a case where some of our code was taking a long time, and if you look at the response times for particular individual functions, they were taking a long time for certain pages for certain customers, so we could drill down into those particular transactions and find out what was taking all the CPU time. In this particular example we see that this function called find-and-aggregate is taking most of the time, followed by this multi-app-data function, which takes another chunk of the time. We can continue drilling down to find out exactly why those things are taking all of the time, and in this case I can just tell you, because I know what the code did: find-and-aggregate was doing in our scripting language what we should have been doing in SQL. Instead of using the powerful SQL engine to do the aggregation, we were grabbing the data and aggregating it in our code base, so it was a lot slower in terms of CPU time than it would have been if we had done it in SQL; we then changed the code to do it in SQL. You’ll find the same sort of thing when you write custom code in your Drupal modules and you’re not doing the computation in the right place: you’ll find slow transactions, and you can drill down and see what’s going on there.
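The aggregate-in-code-versus-aggregate-in-SQL contrast can be shown concretely. This sketch uses Python with SQLite and an invented `metrics` table (the function names echo the talk but are not the actual code being described):

```python
import sqlite3

# Hypothetical metrics table, just to illustrate the pattern.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (app_id INTEGER, value REAL)")
conn.executemany("INSERT INTO metrics VALUES (?, ?)",
                 [(i % 3, float(i)) for i in range(9)])

def find_and_aggregate_slow(conn):
    """The slow pattern: fetch every row into the scripting language
    and aggregate there, burning application CPU time."""
    totals = {}
    for app_id, value in conn.execute("SELECT app_id, value FROM metrics"):
        totals[app_id] = totals.get(app_id, 0.0) + value
    return totals

def find_and_aggregate_fast(conn):
    """The fix: push the aggregation down into the SQL engine."""
    rows = conn.execute(
        "SELECT app_id, SUM(value) FROM metrics GROUP BY app_id")
    return dict(rows)

# Both produce the same totals; only where the work happens differs.
assert find_and_aggregate_slow(conn) == find_and_aggregate_fast(conn)
```

On nine rows the difference is invisible, but as the table grows, the first version’s CPU time grows with it inside your application, which is exactly the kind of hot transaction a profiler drill-down surfaces.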
As Mike pointed out in his Blitz demo, one of the best things you can do to improve your Drupal site is to turn on caching. How do you detect caching problems when you’re looking at New Relic? Again, it’s looking at that overview page. You look at the historical performance, or the performance of your site there: are there any caching problems? It turns out Dries’s site is very well structured, so it’s kind of hard to find problems when demoing Dries’s site, but here’s an example where we had a caching problem in one of our own applications. You’ll see that the Ruby code (our application is written in Ruby, not PHP, but you see the same thing when you look at Drupal) was taking up a lot of the time, alongside the database time and the garbage collection time. The problem was that we weren’t caching page fragments, just like the caching that was turned on in the Blitz demo. That deployment line showing up on the chart is when we deployed a new version of the code that cached page fragments. You’ll notice that very quickly the application time went down as the caching kicked in; we started caching more and more of those fragments, and our total computation time went down.
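Fragment caching itself is a small mechanism. Drupal and Rails each have real implementations of this; the following is just a minimal sketch of the idea, with an invented TTL-based API:

```python
import time

class FragmentCache:
    """A tiny page-fragment cache with a time-to-live, sketching the
    mechanism behind the fragment caching described here. Not any
    framework's actual cache API."""

    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (rendered_html, cached_at)

    def fetch(self, key, render_fn):
        """Return the cached fragment for `key`, calling `render_fn`
        (the expensive rendering work) only when the entry is missing
        or has expired."""
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0]
        html = render_fn()
        self._store[key] = (html, now)
        return html
```

Once deployed, repeated requests stop paying the rendering cost, which is why the application-time band on the chart drops as the cache warms up.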
When you see a historical chart that’s sort of the reverse of this, where it used to run at a particular rate and then went up dramatically, that’s typically a case where you’ve accidentally turned off caching in your application, and you can go configure it one way or another to turn caching back on. You can also find this by looking at the transaction traces as web requests come in, to see which ones are taking a lot of time. If certain requests take a lot of time that on average don’t, go look at caching. In this particular case we’re going to drill in, and we’re going to find that it wasn’t caching this time: it’s that external comment spam filtering service on Dries’s site that was taking up the time. It’s not a caching problem in this case, though you might have assumed it was from looking at the performance; by drilling down you can say, oh, in this case it wasn’t a caching problem, and look at it in more detail.
That’s a quick overview of New Relic and some of the things you can do with it for looking at performance problems inside Drupal. I think it goes really well with Acquia and with Blitz because we’re looking at three different areas within your overall website build. Acquia Insight is a fabulous tool for deciding whether you built your Drupal system well; you use Blitz to run some load tests against it; and then, once you’re in production, you can use New Relic to make sure your customers are still using your site the way you predicted they would. Acquia customers can get New Relic automatically provisioned by clicking a few buttons. With that, I’m going to hand it back over to, I think it’s Michael.
Michael: Thanks for wrapping it up. That’s exactly how I feel about these services too and how well they complement one another. I think that Hannah’s going to read off some Q & A questions.
Hannah: Yes, please ask any questions you may have in the Q & A section of the Webex UI. Our first question is how much performance overhead does New Relic typically have?
Speaker 3: That’s an excellent question, and of course you always want to know that before you put anything into your production site. We’ve measured the performance. We have an internal goal of less than 5% overhead for continuous monitoring, and we’ve measured our overhead at around 2-3% on a typical application. As an engineer I have to tell you it’s always possible to write a diabolical application where the overhead will be more than that, but across the Drupal sites of all of our customers, we’re looking at about 3-4%.
Hannah: Okay, that’s it for questions. Could people please ask any questions they may have in the Webex Q&A section? Alright, that’s it. Thank you everyone for joining, and thank you for the great presentations. Again, the slides will be posted on SlideShare, and the recorded webinar will be posted to Acquia.com in the next 48 hours. Thank you.