Challenge

27Global introduced Site Reliability Engineering (SRE) as a key business pillar, and leverages this capability both for its internal DevOps teams and as a service offering for customers. The company traditionally used a mix of tools, including Grafana, Graylog, and Zabbix, and the SRE team needed more consistent, consolidated observability across multiple development pipelines on-premises and in the cloud.

27Global has an offshore team of engineers in Vietnam in addition to its U.S. team, and found it difficult to communicate complex operational issues, such as performance problems, between the two teams. The SRE team lacked cohesive metrics to serve as evidence that performance issues existed. Assembling operational data, such as events, logs, and traces, into end-of-day dashboards for the teams took 45 minutes each night.

With technology changing at a rapid pace, 27Global’s SRE team needed a solution to provide accurate, consolidated, and easy-to-share measurement data to improve DevOps efficiency and deliver value to SRE customers.

Solution

Using New Relic One, 27Global now has observability across its multi-cloud/hybrid cloud DevOps environments. The company can easily instrument containers and microservices in AWS and Microsoft Azure, as well as on-premises infrastructure, and provide engineers and SRE customers with a single view for observability.

For custom software development work for clients, 27Global uses New Relic One as a communication tool, providing developers with context to understand their impact on operations, and enabling the operations team to better understand the developers’ work. 27Global also uses New Relic One as the platform for its SRE service offering, providing clients with operational insights to guide strategic decisions on where and how they invest in their infrastructure.

Outcomes

Reduced Mean Time to Recovery: With APM and Logs capabilities in New Relic One, 27Global identified a client database problem and delivered a performance improvement to the application within 24 hours.

Accelerated Time to Market: 27Global cut in half the time to stand up a new project, and continues to shrink time to market by using full-stack observability and automation.

Reduced Toil: Automating with the New Relic Terraform Provider streamlines deployments, reducing toil for 27Global’s DevOps teams.

Improved Communications: Automatically generated end-of-day dashboards ensure continuity, eliminating language barriers across time zones when handing off projects between offshore and onshore teams. Bimonthly reports to clients now take minutes instead of four hours.

Increased Visibility: 27Global has granular visibility into client environments, and can present data-driven evidence to support uptime reports, giving clients more confidence in the SRE service.

Saved Time: Having a single source for logs cut in half the time developers spent correlating measurements with a service problem.

Lowered Costs: Consolidating from multiple tools to New Relic One saved 27Global $1,500 per month, which can now be redirected to new hardware for the SRE team.

“Having a single view with New Relic One is a lifesaver for me. When we tie logs to an application and try to figure out the contributing factors, I can go to one spot instead of three different tools like before.”

Thomas Martin, Director, Site Reliability Engineering, 27Global

Thomas Martin from 27Global shares how New Relic One with log management delivers value for his global development and operations teams.