Following the Sun: Ops Team Keeps the Lights on - Part 2
(Part 2 of "The Adventures of the Acquia Cloud Ops Team" blog series)
When we last encountered our intrepid Operations heroes, they were providing a super support effort for a client’s big day on a major morning news show. This week we want to talk about exactly what they did and how Acquia’s cross-functional support effort made the client’s big day, in fact, their best day.
To begin with, the Operations team addressed infrastructure: web servers were strategically added to optimize site performance and handle increased traffic, and slow SQL queries were sped up as they were identified. All aspects of the stack (PHP, memory, CPU, etc.) were checked for usage and for whether they were sufficient to handle the anticipated increase in traffic. What we saw in analyzing utilization was that there were irregular "bumps" in CPU load that, while related to traffic increases, were a symptom rather than a cause. By looking at what was happening at the time with the Linux tool strace, we were able to identify the code triggering the CPU spike and work with the customer to tweak it so it was less of a burden when site visitors spiked.
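For readers who want to try this kind of investigation, here is a minimal sketch of one way to summarize strace output. It assumes the process was traced with strace's `-c` option, which prints a per-syscall summary table; the post does not say which strace options the team actually used, and the sample numbers below are invented for illustration. Note that strace only shows what a process asks the kernel to do, so a summary like this narrows the search rather than pointing directly at a line of application code.

```python
def parse_strace_summary(text):
    """Parse the table printed by `strace -c` into a list of dicts.

    Rows are returned sorted by time spent, largest first.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, the dashed rulers and the total row.
        if (not fields or fields[0] == "%" or fields[0].startswith("-")
                or fields[-1] == "total"):
            continue
        # A data row has 5 columns, or 6 when the "errors" column is filled in.
        if len(fields) not in (5, 6):
            continue
        try:
            rows.append({
                "syscall": fields[-1],
                "pct_time": float(fields[0]),
                "seconds": float(fields[1]),
                "calls": int(fields[3]),
            })
        except ValueError:
            continue  # not a data row after all
    return sorted(rows, key=lambda r: r["seconds"], reverse=True)


# Invented sample in the shape of `strace -c` output (not real measurements).
SAMPLE = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 61.50    0.123000          41      3000           stat
 30.00    0.060000          20      3000       120 open
  8.50    0.017000          17      1000           read
------ ----------- ----------- --------- --------- ----------------
100.00    0.200000                  7000       120 total
"""

if __name__ == "__main__":
    for row in parse_strace_summary(SAMPLE):
        print(row["syscall"], row["calls"], row["seconds"])
```

A syscall that dominates the table during a CPU bump (for example, thousands of file lookups per request) is a hint about which part of the application to review with the customer.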
What made Acquia’s efforts unique, however, was the fact that senior-level Drupal experts from across the nation were pulled in to ensure the project’s success.
Acquia then engaged its Professional Services group to look at best-tuning practices, performance audits and architectural recommendations. Recommendations from the PS team led to modifications of module usage within the Drupal application. Then, by reviewing the MySQL slow query log, we identified a table that was being accessed frequently but without a necessary index. As part of the Acquia Cloud platform, Acquia also caught some highly utilized pages that were bypassing Varnish caching because of session cookies.
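As an illustration of how a slow query log review can surface a missing index, the sketch below parses slow log entries and flags queries that examine far more rows than they return, which is a common (though not conclusive) sign of a full table scan. The header layout it expects (`Query_time`, `Lock_time`, `Rows_sent`, `Rows_examined`) is the usual MySQL slow log format, but details vary by version; the threshold, the table names and the sample entries are all hypothetical and are not taken from the customer's site.

```python
import re

# Matches the statistics line that precedes each statement in the slow log.
HEADER = re.compile(
    r"# Query_time: (?P<query_time>[\d.]+)\s+Lock_time: [\d.]+\s+"
    r"Rows_sent: (?P<rows_sent>\d+)\s+Rows_examined: (?P<rows_examined>\d+)"
)


def parse_slow_log(text):
    """Return one dict per logged statement with its timing and row counts."""
    entries = []
    current = None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = {
                "query_time": float(match["query_time"]),
                "rows_sent": int(match["rows_sent"]),
                "rows_examined": int(match["rows_examined"]),
                "sql": "",
            }
            entries.append(current)
        elif current is not None and not line.startswith("#"):
            # Ignore bookkeeping statements the server writes before the query.
            if line.startswith("SET timestamp=") or line.lower().startswith("use "):
                continue
            current["sql"] = (current["sql"] + " " + line.strip()).strip()
    return entries


def likely_unindexed(entries, min_ratio=1000):
    """Flag statements that examine at least min_ratio rows per row returned."""
    flagged = [
        e for e in entries
        if e["rows_examined"] / max(e["rows_sent"], 1) >= min_ratio
    ]
    return sorted(flagged, key=lambda e: e["rows_examined"], reverse=True)


# Invented sample entries (hypothetical tables and numbers).
SAMPLE_LOG = """\
# Time: 130101 10:00:00
# User@Host: app[app] @ web1 [10.0.0.1]
# Query_time: 3.200000  Lock_time: 0.000100 Rows_sent: 10  Rows_examined: 450000
SET timestamp=1357034400;
SELECT nid FROM example_table WHERE status = 1 ORDER BY created DESC LIMIT 10;
# User@Host: app[app] @ web1 [10.0.0.1]
# Query_time: 1.100000  Lock_time: 0.000050 Rows_sent: 500  Rows_examined: 500
SET timestamp=1357034401;
SELECT * FROM other_table;
"""

if __name__ == "__main__":
    for entry in likely_unindexed(parse_slow_log(SAMPLE_LOG)):
        print(entry["rows_examined"], entry["sql"])
```

Any statement this flags is only a candidate; running `EXPLAIN` on it in MySQL is the usual next step to confirm whether an index is actually missing.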
A senior performance engineer conducted an analysis, looking at caching strategy and recommending configuration or code changes to stay ahead of the curve. The analysis included Varnish, Drupal and Memcache utilization. As one example, data being populated into Memcache was causing frequent “evictions,” which led to sub-par caching performance. Working with the customer, we optimized this so that the benefits of Memcache were realized.
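To make the Memcache symptom concrete, here is a small sketch of how this kind of churn can be spotted. The memcached server reports counters such as `get_hits`, `get_misses`, `evictions` and `total_items` in response to its `stats` command; the code below parses that text and derives a hit ratio and an eviction rate. The sample values are invented, and the post does not describe the exact checks the engineer ran, so treat this as one plausible approach rather than a record of what was done.

```python
def parse_memcached_stats(text):
    """Turn the 'STAT <name> <value>' lines of a stats response into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def cache_health(stats):
    """Derive a few simple indicators from the raw counters."""
    hits = int(stats.get("get_hits", 0))
    misses = int(stats.get("get_misses", 0))
    evictions = int(stats.get("evictions", 0))
    total_items = int(stats.get("total_items", 0))
    gets = hits + misses
    return {
        # Share of lookups that were served from the cache.
        "hit_ratio": hits / gets if gets else 0.0,
        # Items pushed out to make room for new ones.
        "evictions": evictions,
        # Evictions relative to everything ever stored; high means churn.
        "evictions_per_item_stored": evictions / total_items if total_items else 0.0,
    }


# Invented sample response (not real measurements).
SAMPLE_STATS = """\
STAT get_hits 7000
STAT get_misses 3000
STAT evictions 2500
STAT total_items 10000
END
"""

if __name__ == "__main__":
    print(cache_health(parse_memcached_stats(SAMPLE_STATS)))
```

A steadily climbing eviction count alongside a mediocre hit ratio suggests the cache is being filled with data that is pushed out before it can be reused, which is the situation described above.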
Because Acquia has experts located across four time zones, highly trained Support and Operations experts were always available as the need arose. Multiple improvements and recommendations were made on the fly, and as traffic ramped up, the application in fact became more efficient. As a result, the customer had an extremely successful campaign. They drove an increase in users and won bragging rights for the largest web server configuration that Acquia had hosted at the time.
End to end, Acquia had 10 experts involved in supporting this one event. This is not a unique story for Acquia. Our customers get the full support of our teams of experts, our proactive planning, and the global locations of our staff.
In coming weeks, we’ll look at other examples of the Acquia Operations team’s most challenging projects, and detail the key roles and proactive support that often go on behind the scenes at Acquia so that our customers can concentrate on what they do best.