New Relic has nearly 17,000 customers worldwide that benefit daily from the company’s observability platform. Yet to see the success and impact of the New Relic platform, it’s possible to look much closer to home: the company’s own distributed IT team uses the platform to keep the company up and running without a hitch.

Like many of its customers, New Relic has been expanding its IT environment in recent years to tap into the flexibility and power of the cloud. The distributed system supports multiple products with numerous users at any point in time. The hybrid environment now includes hundreds of machines across multiple cloud providers. New Relic combines its own products with niche products and services from multiple third-party providers, each of which typically has its own system for management and visibility. Everything needs to be monitored closely to help ensure high performance for customers.

“We don’t have the resources to watch ten production environments at once,” says Tony Mancill, Lead Software Engineer, Telemetry Data Platform for New Relic. “We need a way to look at data across our environment, slice it up, and see how things are working. That all helped lay the foundation for New Relic One.”

A powerful, unified observability platform

In 2019, New Relic introduced New Relic One, a new way of deploying and managing New Relic solutions. New Relic One supports streamlined rollout and offers a unified platform for teams to visualize and understand everything happening across a software environment, including on-premises infrastructure, multiple clouds, virtualized containers, and third-party services.

“At New Relic, we’re one of the most demanding customers of our own platform,” says Tim Krajcar, Senior Director of Engineering, Full-Stack Observability at New Relic. “Our users push the limits of what can be done. We know that if New Relic One is performing well for us, it will perform well for our customers.”

After spending some time working with New Relic One, teams started to look for ways to make the platform even more intuitive, surface more information for users, and improve the overall New Relic One experience.

“I originally led a team building APM solutions for New Relic One, but we realized that we needed to stop thinking of each New Relic solution as a different experience,” says Krajcar. “We poured so much effort into developing a unified platform in New Relic One that we needed a unified interface to expose all of the available capabilities in one place.”

The reimagined New Relic One experience unites all systems into a single dashboard and alert stream to greatly improve ease of use. Users no longer need to switch between apps to view different systems. Queries, filters, searches, and other operations extend across systems, allowing users to compare performance and draw connections between data across the environment. The new interface also provides several ways to highlight different products, solutions, and features, so people get more out of the New Relic platform.

“The work we did to reimagine the New Relic One experience was one of the most exciting projects I’ve ever worked on,” says Krajcar. “It was inspiring to see so many people coming together to create a new approach to observability in a modern environment. We had dozens of internal developers focusing on just the interface.”

New Relic wanted feedback from its most involved users, so it rolled out the updated experience to all internal teams early in the development process. Feedback has been overwhelmingly positive, with teams praising it for being extremely intuitive and providing new ways to understand what’s happening in the environment.

New Relic is now bringing this bold new experience to all New Relic customers. While the experience retains familiar views and dashboards, New Relic feels confident that customers will quickly recognize the benefits of the reimagined platform.

Accelerating performance with greater observability

Mancill could be considered one of New Relic’s many power users, working with the New Relic platform every day to manage performance of petabytes of data. He praises the updated New Relic One experience for giving him incredible visibility into a complex environment, which helps him spot and troubleshoot issues faster than ever.

In one case, Mancill’s team noticed that a cluster was not performing up to the same standard as other clusters. With hundreds of instances in play, pinning down the source of the performance problem would normally be time-consuming. Using New Relic One, they conducted a massive A/B test across two sets of 400 instances. By comparing performance across all instances and running targeted queries and filters to pinpoint outliers, the team was able to isolate the issue to a specific revision of hardware and trace it all the way back to a firmware bug. Having visibility across a range of software and system metrics allowed the team to restore the cluster performance quickly.

“In a less-observed environment, it might have taken a full-time engineer months to comb through all of the data and find this type of discrepancy,” says Mancill. “New Relic One allows us to ask questions that we didn’t even know were possible. The answers that we’re getting help us boost performance for our environment so that we can deliver stronger solutions for our customers.”

While Mancill praises observability improvements in a complex environment, he also gets excited when talking about other New Relic One features. Improvements to logging with New Relic Logs have been a “game-changer” because users can search through logs in a central location. While Mancill was not a heavy user of AI previously, he’s very interested in incorporating AIOps to proactively detect and solve minor deviations that can turn into major issues and even outages if not addressed quickly.

“I wouldn’t want to go back to the way IT was done before New Relic One,” says Mancill. “Strong observability allows you to validate systems and code, giving you the confidence that you need to move forward quickly and deliver more perfect software.”

Enabling new open source support

An exciting development for New Relic has been its increased developer support, including a growing body of open source apps and APIs.

“Many engineers working at New Relic are active in the open source community, including myself and Tony,” says Krajcar. “We understand the open source mindset. Engineers want to take things apart, see how things work, and create their own spin on things.”

New Relic is committed to open standards, open source instrumentation, and the open communities that support them. The company has shared most of its agents through GitHub, with a goal of sharing all agents by 2021. Future observability offerings will be standardized on OpenTelemetry, an emerging standard for open instrumentation. New Relic has also built an entire site devoted to cataloging and highlighting open source projects. The company hopes that this will inspire the growing New Relic user community to get involved and share solutions to make the platform even better for everyone.

“At New Relic, we know that ‘best-in-class’ means something different to everyone,” says Krajcar. “While we work every day to improve the New Relic platform, every developer has their own priorities and opinions of what the platform needs to make it truly special. With an open platform, developers don’t need to wait for New Relic to create the features for them. They can make their own tweaks to the system and develop their own best-in-class apps.”

As New Relic One evolves, Krajcar, Mancill, and the rest of the team look forward to seeing how observability will continue to help New Relic and other forward-looking organizations maintain smooth operations and provide unmatched customer experiences.