Pantheon is a website operations and hosting platform designed to deliver improved speed, scalability, and reliability for customers leveraging Drupal and WordPress for website delivery. Powering more than 200,000 sites and supporting billions of monthly pageviews, Pantheon is used by organizations across the globe, including Stitch Fix, DataStax, and the ACLU.
Success at Pantheon comes from a consistent focus on engineering excellence and the adoption of modern development methodologies and technologies. From the beginning, the platform was built on public cloud services to maximize performance and keep the company’s developers focused on its core technology. Written in a combination of Go, PHP, Python, and Node.js, the Pantheon platform runs on both open source technologies (Cassandra, Linux, and Kubernetes) and licensed technologies such as the Fastly content delivery network (CDN) to ensure performance, agility, and scalability for customers. The combination of a DevOps culture and a cloud-native approach to application development and delivery enables rapid iteration, resulting in a steady flow of innovative features for customers.
Optimizing cloud delivery
Pantheon was originally built on a first-generation cloud services provider. While that provider’s platform was satisfactory at first, the pace of innovation at Pantheon soon began to outpace that of the provider.
After evaluating several solutions, Pantheon chose to replace its existing provider with Google Cloud Platform (GCP), including Google Compute Engine, Google BigQuery, Google Kubernetes Engine (GKE), and Cloud SQL. With GCP, Pantheon has access to capabilities supporting its current and future road map, including machine learning, integration with G Suite, and support for Kubernetes, which is critical to the future of Pantheon’s core container orchestration layer.
Migrating with confidence
After choosing GCP, Pantheon went to work mapping out its migration strategy. With over 200,000 websites on the platform, all with custom application code and many with multiple active development environments and any number of technical variations, migration was anything but straightforward. Consistent with its DevOps culture, however, Pantheon planned and tested over the course of three months to ensure the migration would be a success. “Before we moved a single workload over to GCP, our engineering team mapped out a detailed plan for how we would successfully migrate, including establishing baseline metrics and acceptance criteria,” says Josh Koenig, co-founder and head of product at Pantheon.
Critical to this process was New Relic APM. By using APM in both its legacy and GCP environments, Pantheon was able to gain insight into baseline metrics such as response time, error rates, and throughput. (For examples of cloud migration best practices, read the New Relic Tutorial: Measure Twice, Cut Once). “Using New Relic, if an error occurred with a workload migrated from our old cloud platform to GCP, we could see down to the line of code what was causing the issue and resolve it quickly,” says Koenig. “Because we had clear insight into our data, we never had to question if an error or performance degradation was caused by the app or the cloud platform.”
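To make that approach concrete, here is a minimal sketch (in Python, using New Relic’s Insights query API with placeholder account details, query, and application names) of how baseline response time, error rate, and throughput might be pulled for an application in each environment and compared side by side. It illustrates the general technique, not Pantheon’s actual tooling or queries.

```python
# Illustrative only: pull the baseline metrics named above (response time,
# error rate, throughput) from New Relic via the Insights query API and
# compare the legacy environment against its GCP counterpart. The account ID,
# query key, application names, and NRQL are placeholders.
import requests

ACCOUNT_ID = "1234567"             # placeholder New Relic account ID
QUERY_KEY = "NRIQ-XXXXXXXXXXXXXX"  # placeholder Insights query key

# One NRQL query per application: average response time, error rate, and
# throughput over the last day.
NRQL = (
    "SELECT average(duration), "
    "percentage(count(*), WHERE error IS true), "
    "rate(count(*), 1 minute) "
    "FROM Transaction WHERE appName = '{app}' SINCE 1 day ago"
)

def baseline(app_name):
    """Fetch the baseline metrics for one application."""
    resp = requests.get(
        f"https://insights-api.newrelic.com/v1/accounts/{ACCOUNT_ID}/query",
        headers={"X-Query-Key": QUERY_KEY, "Accept": "application/json"},
        params={"nrql": NRQL.format(app=app_name)},
    )
    resp.raise_for_status()
    return resp.json()["results"]  # one result object per selected function

# Compare the legacy environment with the freshly migrated GCP environment.
# In a real migration plan, each metric would have its own acceptance
# threshold (e.g., response time and error rate must not rise, throughput
# must not fall).
legacy = baseline("example-site-legacy")  # hypothetical application names
gcp = baseline("example-site-gcp")

for old, new in zip(legacy, gcp):
    metric, old_value = next(iter(old.items()))
    new_value = next(iter(new.values()))
    change = (new_value - old_value) / old_value * 100 if old_value else float("nan")
    print(f"{metric}: legacy={old_value:.3f}  gcp={new_value:.3f}  change={change:+.1f}%")
```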
After testing was complete, the migration began. “Using New Relic, we migrated 50,000 customers and over 200,000 total websites in just two weeks with no major incidents,” says Koenig. “We didn’t receive a single customer support call due to the migration and didn’t open a single ticket with GCP during the process. We were amazed.”