Log management refers to all of the processes involved in handling log data, including generating, aggregating, storing, analyzing, archiving, and disposing of logs. A log management system should record everything that happens in an application, network, or server—whether that’s an error, an HTTP request, or something else. This log data can then be used for troubleshooting and analysis. Modern applications often generate millions or even billions of events across different services each day, which can make log data difficult to manage and, more importantly, difficult to turn into actionable insights during incidents. For this reason, log management is an essential part of DevOps, observability, and IT practices.
What is a log?
A log is a timestamped, computer-generated record of a discrete, specific action or event. Almost everything in modern software systems can produce log data, and it’s a best practice to ensure that you are doing so. Examples of data that you’d typically log include function calls, errors, HTTP requests, database transactions, and much more. Logs provide a play-by-play of actions as they happen.
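To make that concrete, here’s a minimal sketch of producing log records with Python’s standard logging module. The logger name, format string, and messages are illustrative, not a prescribed setup.

```python
import logging

# Configure a logger that emits timestamped records; this format and handler
# are one common setup, not a requirement.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
logger = logging.getLogger("checkout-service")  # hypothetical service name

# Each call below produces one discrete, timestamped record of an event.
logger.info("HTTP GET /cart completed in 42 ms")
logger.error("database transaction failed: connection timed out")
```

Each call emits a single timestamped line, such as `2024-05-01 12:00:00,000 INFO checkout-service HTTP GET /cart completed in 42 ms`, which is exactly the kind of play-by-play record a log management system collects.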
Why are logs important?
Logs are a record of every single thing that happens in your application, network, or server. They provide a foundation for application monitoring, error tracking, and error reporting, making them an essential part of observability. None of these things would be possible without log data.
There are four kinds of telemetry data used in observability and monitoring: metrics, events, logs, and traces. You can use the acronym MELT to remember them.
- Metrics are based on aggregated log data and provide information on how your application is performing. Tools like New Relic automatically generate some metrics for you, and you can define custom metrics depending on your needs.
- Events describe things that happen in an application. They consist of multiple lines of log data. Events take up more storage space than logs, so they tend not to be kept in storage for as long as logs are.
- Logs are much more granular than events and describe every single step that happens in an application.
- Traces use spans to connect related events across services, making it possible to track down the root cause of an issue and fix it.
All four kinds of telemetry data have logs in common. Without logs, there is no MELT, and you aren't able to observe what is happening in your application. Instead of being able to proactively monitor your data for issues, you learn about problems from your end users, forcing you to be reactive, not proactive, when it comes to addressing issues. Even worse, once you learn there’s a problem, it's very difficult to fix without logs because you have no record of the errors in your application. The end result: unhappy users, stressed engineers, frustrated customers, and most likely an unsuccessful product.
Logs provide another key benefit: they tend to be very small. That makes them much easier to transmit and store than many other types of data, such as events.
Why is log management important?
It’s not enough to simply configure an application to emit log data. You’ve probably heard of the following philosophical thought experiment: “If a tree falls and no one is around to hear it, does it make a sound?” In the case of log data, if the data is produced but it’s not properly collected and stored, that log information is lost. Log data needs to be sent somewhere—ideally, a centralized location where it can be properly analyzed and retrieved as needed, in context with data from other services.
However, centralized data collection is just one step in the log management process. Log management involves taking care of every part of the logging lifecycle, from the moment log data is emitted to its eventual archival or deletion.
Many modern applications include microservices, distributed systems, and cloud-based services, with each part of the system emitting its own log data. For example, let’s say you need to know which HTTP requests have the longest average response time in your application. If your application is distributed, multiple services could be making HTTP requests, and you’ll only be able to compare them if the log data is available in one place. That’s where successful log management comes in.
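As a rough sketch of why that matters, the snippet below shows how a single pass over centralized, consistently formatted logs can answer the response-time question. The field names (`service`, `path`, `duration_ms`) are assumptions for the example; if each service kept its logs in its own format and location, the same comparison would require separate tooling per service.

```python
import json
from collections import defaultdict

# Hypothetical centralized log store: one JSON record per line, emitted by
# different services; the field names are assumptions for this example.
raw_logs = [
    '{"service": "cart", "path": "/cart", "duration_ms": 42}',
    '{"service": "search", "path": "/search", "duration_ms": 180}',
    '{"service": "cart", "path": "/cart", "duration_ms": 58}',
]

totals = defaultdict(lambda: [0, 0])  # path -> [total duration, request count]
for line in raw_logs:
    record = json.loads(line)
    totals[record["path"]][0] += record["duration_ms"]
    totals[record["path"]][1] += 1

# Average response time per path, slowest first.
averages = {path: total / count for path, (total, count) in totals.items()}
for path, avg in sorted(averages.items(), key=lambda item: item[1], reverse=True):
    print(f"{path}: {avg:.1f} ms")
```

In practice a log management tool runs this kind of aggregation for you through its query layer, but the principle is the same: the comparison is only possible because all of the logs are in one place and in a consistent shape.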
Successful log management allows you to:
- Reduce context switching: Storing your log data in one place removes the need to switch between tools and contexts. If your log data is stored in different places, you potentially need to check multiple locations and tools in order to debug an issue.
- Find and fix problems faster: A log management solution allows you to quickly retrieve, analyze, and visualize log data in context, helping you identify and eliminate problems before they impact your users.
- Instantly search your logs for the data you need: A good log solution provides all the search functionality you need to drill down into your logs and get the data you need quickly.
- Visualize all of your data in a single place: With centralized log data, you can combine metrics, events, logs, and traces in custom visualizations and dashboards that give you a high-level overview of how your application is performing.
So what features do you need to have to ensure you get these benefits? Let’s take a look.
What features should your log management solution include?
You’ll get the maximum benefit from your log management tools if they include the following features.
- Flexible, comprehensive instrumentation: In order to collect all of your log data in one place, your application needs to be instrumented. Instrumentation is the process of installing agents that track the data flowing through your application. Consider an application that has multiple cloud services as well as Java, Rails, and .NET APIs, and primarily uses React and JavaScript on the front end. Each of those services needs to be instrumented, and your log management tool should have agents available for as many different programming languages and services as possible so you can get complete coverage. While you can always choose not to instrument part of your application (for example, services that handle sensitive data), any services that aren’t instrumented will leave gaps in your log data.
- Compatible with log forwarding: If it’s not possible to instrument a service, you’ll need to forward its logs instead, so your log management tool should be able to ingest forwarded logs. (A minimal hand-rolled forwarder sketch follows this list.)
- Powerful querying: If an error comes up, you need to be able to access your logs immediately. That’s why it’s important for your log management solution to have the ability to quickly and efficiently query data. For instance, New Relic uses NRQL (New Relic Query Language), which offers a wide range of flexible queries that allow users to get the data they need.
- Secure data storage: Security is essential when handling sensitive log data, especially for applications subject to compliance requirements such as HIPAA (the US regulation governing the privacy and security of health and medical records).
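For illustration only, here’s a minimal hand-rolled forwarder in Python that tails a log file and ships new lines to a hypothetical ingest endpoint. The URL, API key header, payload shape, and file path are assumptions; in practice you’d use a purpose-built forwarder or your log management vendor’s agent rather than writing your own.

```python
import json
import time
import urllib.request

# Hypothetical ingest endpoint and API key; real deployments would configure
# an off-the-shelf forwarder instead of hand-rolling one.
INGEST_URL = "https://logs.example.com/ingest"
API_KEY = "YOUR_API_KEY"

def forward(lines):
    """POST a batch of raw log lines to the central ingest endpoint."""
    body = json.dumps([{"message": line} for line in lines]).encode()
    request = urllib.request.Request(
        INGEST_URL,
        data=body,
        headers={"Content-Type": "application/json", "Api-Key": API_KEY},
    )
    urllib.request.urlopen(request)

# Tail the service's log file and forward new lines as they appear.
with open("/var/log/legacy-service.log") as log_file:
    log_file.seek(0, 2)  # start at the end of the file
    while True:
        line = log_file.readline()
        if line:
            forward([line.rstrip("\n")])
        else:
            time.sleep(1)
```

The point of the sketch is simply that forwarding fills the gaps instrumentation can’t reach: even services you can’t attach an agent to still end up represented in your centralized log data.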
What are the steps in the log management process?
Here are the major steps in the log management process:
- Creation: First, your services must produce log data. Third-party services often include functionality to emit log data. You can also instrument your services to record logs.
- Collection: Collecting log data is the second step in the process. This usually involves a combination of instrumentation and log forwarding so that all logs are collected in a centralized location.
- Aggregation: Aggregation is the process of taking the collected log data and organizing it into a useful, standardized format. This includes parsing log data from different sources and transforming the data format as needed so that the data is consistent. For instance, your aggregated data might be standardized in JSON format. It’s also typical to enrich logs with metadata that provides additional context, such as the service or IP address where the log was emitted. (A minimal parsing and enrichment sketch follows this list.)
- Storage: Once log data is collected, it needs to be stored in a database. For instance, all log data sent to New Relic is stored in the NRDB (New Relic Database). Ideally, the data should be indexed for efficient querying and analysis.
- Archival: After a certain period of time, you no longer need continuous access to some of your log data, but you may need to retain that data for other purposes, including legal reasons or company policy. By archiving older data, you can optimize your storage and make it more efficient to query your recent log data.
- Deletion: You may also opt to delete log data after a certain period of time in order to save space.
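As a rough illustration of the aggregation step, the sketch below parses two hypothetical raw formats into one standardized JSON shape and enriches each record with metadata. The regular expressions, field names, and services are assumptions for the example, not a standard.

```python
import json
import re
from datetime import datetime, timezone

# Two hypothetical raw formats from different services; the patterns and
# field names are illustrative, not a standard.
ACCESS_LOG = re.compile(r'(?P<ip>\S+) .* "(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d+)')
APP_LOG = re.compile(r"(?P<level>[A-Z]+) (?P<message>.*)")

def aggregate(raw_line, source_service):
    """Parse one raw log line into a standardized JSON record and enrich it
    with metadata about where it came from."""
    record = {"raw": raw_line}
    match = ACCESS_LOG.match(raw_line) or APP_LOG.match(raw_line)
    if match:
        record.update(match.groupdict())
    # Enrichment: attach context that was not in the original line.
    record["service"] = source_service
    record["ingested_at"] = datetime.now(timezone.utc).isoformat()
    return json.dumps(record)

print(aggregate('10.0.0.5 - - "GET /cart HTTP/1.1" 200', "web-frontend"))
print(aggregate("ERROR payment declined for order 1234", "billing"))
```

Once every record shares this consistent shape, the later steps in the lifecycle (storage, querying, archival, and deletion) can treat logs from very different sources the same way.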
Ultimately, the goal of the log management process is to make logs available for troubleshooting, analysis, visualization, reporting, and alerts. None of these are possible without an effective log management solution. Some log management tools even include this functionality, but it’s also typical to use application performance monitoring and observability solutions in combination with log management.
Get started with log management. Try New Relic.
The best way to learn more about log management is to get hands-on experience. Sign up for a free New Relic account to get started. Your free account includes 100 GB/month of free data ingest, one free full-access user, and unlimited free basic users. Then take a deeper dive into our log management documentation.