Avoiding Performance Issues: When and How to Debug Production

User satisfaction depends on site performance, and engineers have increasingly built websites with an eye towards availability, correctness, and speed. Why, then, do sites with robust unit and functional test suites go dark? And with easy cloud deployments and trivial git reverts, why do performance regressions take so long to fix?

Organizations are quick to blame "bad code", but a module that worked without a hitch on a development box, or even during a load test, can fail catastrophically when confronted with idiosyncratic user behavior or a different hosting environment. Keeping sites running under these conditions means not just meeting the challenges you expect, but also withstanding the ones you don't.

You'll learn techniques that help Drupal teams prepare for the worst:

  • Continuous monitoring across production
  • Agility in following problems through complex systems
  • Making sure the right team members are aware of problems
  • Structuring deployments to keep them responsive and reversible
  • Writing code that won't cause debugging headaches

James Meickle is one of the Drupal developers who kept Mitt Romney's campaign website running smoothly during the 2012 presidential election. He now works at AppNeta, where he has helped dozens of customers deploy performance monitoring solutions across dozens of frameworks.