Managing log costs in Datadog is a common challenge. Security teams in particular often find themselves navigating between Datadog Standard Logs, Datadog Flex Logs, and Datadog Cloud SIEM, each with its own complexities and cost considerations. In this article, we'll explore the specific challenges that arise with these three features and show how Scanner can augment Datadog to address them. With Scanner, you can improve your log management while keeping costs under control.
The Datadog Standard Logs feature is the main workhorse of log management in Datadog. One challenge that teams have with Datadog Standard Logs is deciding how many log events to keep indexed. Datadog charges per million log events indexed, and as your retention window grows, the cost can quickly spiral out of control. As log volumes scale up, teams are often forced to reduce their retention windows from 30 days to 15, 7, or even just 3 days.
To manage costs, many teams choose to archive older logs to cloud storage, such as AWS S3. While Datadog offers a log rehydration feature to investigate archived logs, there are limitations. Rehydration is capped at a maximum of 1 billion log events (or roughly 1 TB of data), and the process can be painfully slow, taking hours to rehydrate terabytes of data.
These limitations become a serious problem for security use cases. Security teams often need to look further back than a few weeks—sometimes months or even years—to find patterns, investigate potential threats, and uncover indicators of compromise. Reducing your retention window can mean missing out on crucial information, such as malicious IP addresses or domains that accessed your systems months ago.
Consider a typical cloud audit log source, such as AWS CloudTrail, with a volume of 1 TB of logs per day. Retaining 30 days of these logs in Datadog can cost $75k per month or $900k annually. For many teams, that’s prohibitively expensive—especially when considering the limited ability to investigate logs beyond the retention window.
Rehydrating logs from S3 is a cumbersome solution. You can only rehydrate 1 billion log events at a time, which means that investigating anything beyond a day’s worth of log data involves piecing together multiple rehydration efforts, each of which can take hours. And in a complex threat investigation, when you need to pivot across different data sources like DNS logs or email logs, those hours add up quickly.
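To make the batching problem concrete, here is a minimal Python sketch that counts how many separate rehydrations a given lookback window would require. It assumes the 1 billion event cap mentioned above and roughly 1 billion events per day (1 TB per day at about 1 KB per event). The function and constant names are ours, chosen for illustration.

```python
REHYDRATION_CAP_EVENTS = 1_000_000_000  # cap per rehydration, as stated above
EVENTS_PER_DAY = 1_000_000_000          # assumption: 1 TB/day at ~1 KB per event


def rehydrations_needed(days: int,
                        events_per_day: int = EVENTS_PER_DAY,
                        cap: int = REHYDRATION_CAP_EVENTS) -> int:
    """Number of separate rehydrations needed to cover `days` of logs."""
    total_events = days * events_per_day
    # Integer ceiling division: a partial batch still needs its own rehydration.
    return (total_events + cap - 1) // cap


for window in (1, 7, 30, 90):
    print(f"{window:>3} days of logs -> {rehydrations_needed(window)} rehydrations")
```

Under these assumptions the number of rehydrations grows linearly with the lookback window, and each one can take hours.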
Datadog has introduced a new log type called Flex Logs, which makes it easier to retain logs for longer time periods, up to a maximum of 15 months. However, this comes with trade-offs, particularly slower query times when searching these logs.
Let's consider the pricing for Flex Logs. Suppose you have 1 TB per day of log volume and want to retain those logs for 12 months, which means storing 365 TB of data. The storage price for Datadog Flex Logs is $0.05 per million log events per month. If we assume the typical log event is 1 KB in size, the storage cost would be $18.6k per month or $223k annually. In addition, there is an ingestion cost of $0.10 per GB, which totals $37.3k per year. There is also a compute cost for Flex Logs, though Datadog does not make this pricing public. Based on some sources, a "Medium" Flex compute configuration for this log volume could cost around $75k per year. In total, this would amount to approximately $335k annually.
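The arithmetic behind this estimate can be sketched in a few lines of Python. This is a rough model rather than Datadog's billing logic: it assumes 1 KB per event, treats 1 TB as 1,024 GB (which appears to be what the rounded figures above imply), assumes the full 365 days are already in storage, and takes the $75k compute figure as given. All names are ours.

```python
GB_PER_TB = 1024           # assumption: binary terabytes
EVENTS_PER_GB = 1_000_000  # assumption: ~1 KB per log event


def flex_logs_annual_cost(tb_per_day: float = 1.0,
                          retention_days: int = 365,
                          storage_per_million_month: float = 0.05,
                          ingest_per_gb: float = 0.10,
                          compute_per_year: float = 75_000.0) -> dict:
    """Rough annual Flex Logs cost at steady state (full retention window stored)."""
    gb_per_day = tb_per_day * GB_PER_TB
    events_per_day = gb_per_day * EVENTS_PER_GB
    stored_millions = events_per_day * retention_days / 1_000_000
    storage = stored_millions * storage_per_million_month * 12
    ingestion = gb_per_day * ingest_per_gb * 365
    return {
        "storage": storage,
        "ingestion": ingestion,
        "compute": compute_per_year,
        "total": storage + ingestion + compute_per_year,
    }


for item, dollars in flex_logs_annual_cost().items():
    print(f"{item:>9}: ${dollars:,.0f} per year")
```

The unrounded results land within about 1% of the figures quoted above. The small differences come from rounding the monthly storage cost before multiplying by 12.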
This cost is significantly lower than the $900k per year required for Datadog Standard pricing with 30 days of retention. However, the downside of Flex Logs is the slower query performance. While performance data is not publicly available, Datadog states that Flex Logs are suitable for log sources that only need to be queried a few times per day. Our estimate is that querying Flex Logs could be up to 10 times slower than querying Standard Logs, though this is not certain.
Datadog Cloud SIEM allows you to run a few hundred out-of-the-box detections on logs that flow into Datadog, and you can create your own custom detections. This can be helpful for catching threats in many different log sources. However, the price of Datadog Cloud SIEM is extremely high. According to public pricing information on Datadog's website, the price is $5 per million events analyzed.
To illustrate the cost, let's consider an example scenario with 1 TB per day of logs. Assuming the typical log event is 1 KB in size, then 1 TB per day of log volume translates to roughly 1 billion log events per day. With Datadog Cloud SIEM, this level of log volume would cost $5,000 per day to analyze, or $1.8M per year.
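Because Cloud SIEM is priced per event rather than per byte, the estimate is sensitive to the assumed event size. The short Python sketch below reproduces the calculation and shows how the bill would change if your average event were smaller or larger than 1 KB. It treats 1 TB as one billion KB, and the function name is ours.

```python
def cloud_siem_cost(tb_per_day: float,
                    avg_event_kb: float = 1.0,
                    price_per_million: float = 5.0) -> tuple[float, float]:
    """Return (daily, annual) analysis cost in dollars."""
    events_per_day = tb_per_day * 1e9 / avg_event_kb  # 1 TB taken as 1e9 KB
    daily = events_per_day / 1e6 * price_per_million
    return daily, daily * 365


for kb in (0.5, 1.0, 2.0):
    daily, annual = cloud_siem_cost(1.0, avg_event_kb=kb)
    print(f"{kb} KB/event: ${daily:,.0f} per day, ${annual:,.0f} per year")
```

Halving the average event size doubles the number of events, and therefore the cost, for the same 1 TB per day.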
That is a steep price for detections. As a result, teams often drop high-volume log sources from Datadog Cloud SIEM and lose detection coverage for those sources.
Scanner can be a powerful companion to Datadog, helping you achieve more efficient continuous detections and faster historical search capabilities, all while significantly reducing costs. Let's explore how Scanner can help augment Datadog's capabilities, using the example of 1 TB of log data per day.
To start, you can configure Datadog Log Archives to forward all your log data—1 TB per day in this scenario—to an S3 bucket. Then, point Scanner to that S3 bucket. Scanner will organize the logs for rapid search by saving an optimized, indexed version of these logs to another S3 bucket in your AWS account. With this setup, you can take advantage of Scanner's efficiency while retaining Datadog's core monitoring and tracing capabilities.
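If you prefer to set up the archive through automation rather than the Datadog UI, the request body might look something like the Python sketch below. The field names reflect our understanding of Datadog's Logs Archives API (v2) and should be checked against the current API reference before use. The archive name, bucket, path, account ID, and role name are placeholders, and the sketch only builds the payload without sending it.

```python
import json


def build_s3_archive_payload(name: str, query: str, bucket: str, path: str,
                             aws_account_id: str, role_name: str) -> dict:
    """Build a request body for creating an S3 log archive (field names assumed)."""
    return {
        "data": {
            "type": "archives",
            "attributes": {
                "name": name,
                "query": query,  # which logs to archive; "*" is intended to match all
                "destination": {
                    "type": "s3",
                    "bucket": bucket,
                    "path": path,
                    "integration": {
                        "account_id": aws_account_id,
                        "role_name": role_name,
                    },
                },
            },
        }
    }


# Placeholder values for illustration only.
payload = build_s3_archive_payload(
    name="all-logs-for-scanner",
    query="*",
    bucket="example-datadog-log-archive",
    path="/datadog",
    aws_account_id="123456789012",
    role_name="ExampleDatadogArchiveRole",
)
print(json.dumps(payload, indent=2))
```

Scanner would then be pointed at the same bucket and path, which is configured on the Scanner side and not shown here.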
Instead of using Datadog's Cloud SIEM, which costs $5 per million log events analyzed, you can use Scanner's detection rules for just $0.10 per million log events—50 times less expensive. In our example of analyzing 1 TB of logs per day, this means reducing costs from $1.8 million per year with Datadog Cloud SIEM to less than $40,000 per year with Scanner. That's a significant saving while maintaining effective threat detection across your log data.
For example, you can take a small number of high-volume log sources, like AWS CloudTrail or Cisco Umbrella logs, and stop pushing them into Datadog Cloud SIEM. Instead, you can configure Datadog to archive these logs directly to an S3 bucket and use Scanner Detections to get threat coverage on these logs. If these high-volume log sources are reaching 1 TB per day or more in total volume, this approach could save over $1.7M per year in costs.
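The comparison is easy to rerun with your own volumes. The Python sketch below uses the two per-million-event prices quoted above and assumes roughly 1 billion events per day (1 TB per day at about 1 KB per event). The names are ours.

```python
EVENTS_PER_DAY = 1_000_000_000  # assumption: 1 TB/day at ~1 KB per event


def annual_detection_cost(price_per_million: float,
                          events_per_day: int = EVENTS_PER_DAY) -> float:
    """Annual cost of analyzing every event at a flat per-million-event price."""
    return events_per_day / 1_000_000 * price_per_million * 365


siem = annual_detection_cost(5.00)     # Datadog Cloud SIEM price from above
scanner = annual_detection_cost(0.10)  # Scanner detections price from above

print(f"Cloud SIEM:         ${siem:,.0f} per year")
print(f"Scanner detections: ${scanner:,.0f} per year")
print(f"Savings:            ${siem - scanner:,.0f} per year")
print(f"Price ratio:        {siem / scanner:.0f}x")
```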
You can also use Scanner to reduce the retention period of Datadog Standard Logs. Instead of keeping logs for 30 days, consider keeping only 3 days of logs in Datadog. For 1 TB of logs per day, this would bring the cost of Datadog Standard Logs down from $900k per year to $38k per year. With Scanner in place, you can still perform high-speed searches on any logs older than 3 days, allowing you to reduce retention costs while retaining your search capabilities.
When you need to perform "needle-in-a-haystack" searches, such as finding indicators of compromise (like IP addresses or domains), Scanner can search an entire year of logs (365 TB in our scenario) in just 30 to 60 seconds. Indexing and querying 1 TB of logs per day with Scanner costs less than $100,000 per year, compared with approximately $335,000 per year for Datadog Flex Logs. That puts Scanner's search capabilities at roughly one-third of the cost of Datadog Flex Logs.
Scanner achieves its high search speed by using a coarse-grained inverted index stored in S3, coupled with serverless Lambda functions that rapidly traverse the log files. This approach uses special CPU instructions, specifically SIMD text matching implemented in Rust, to maximize efficiency. Scanner's search performance is both highly effective and cost-efficient, allowing you to query vast log datasets in a short amount of time—30 to 60 seconds to search 365 TB of logs in our example.
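To illustrate the general idea of a coarse-grained inverted index, here is a toy Python sketch. It is not Scanner's implementation, which runs as Rust with SIMD matching over files in S3. Instead of mapping each token to individual log lines, the index maps each token to the chunks (think log files) that contain it, so a search only scans the few chunks that could match. The tokenizer, chunk names, and sample log lines are all invented for the example.

```python
import re
from collections import defaultdict

# Toy tokenizer: keeps IPs, domains, and key names as single lowercase tokens.
TOKEN_RE = re.compile(r"[a-z0-9_.:-]+")


def tokenize(line: str) -> set[str]:
    return set(TOKEN_RE.findall(line.lower()))


def build_coarse_index(chunks: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each token to the chunk IDs containing it (not to individual lines)."""
    index: dict[str, set[str]] = defaultdict(set)
    for chunk_id, lines in chunks.items():
        for line in lines:
            for token in tokenize(line):
                index[token].add(chunk_id)
    return index


def search(needle: str, index: dict[str, set[str]],
           chunks: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Use the index to pick candidate chunks, then scan only those chunks."""
    needle = needle.lower()
    hits = []
    for chunk_id in sorted(index.get(needle, set())):
        for line in chunks[chunk_id]:
            if needle in tokenize(line):
                hits.append((chunk_id, line))
    return hits


# Hypothetical chunks standing in for log files in object storage.
chunks = {
    "2024-01-01.log": ["src=10.0.0.5 action=allow", "src=203.0.113.7 action=deny"],
    "2024-01-02.log": ["src=10.0.0.5 action=allow"],
    "2024-01-03.log": ["dns query=evil.example.com src=203.0.113.7"],
}
index = build_coarse_index(chunks)
for chunk_id, line in search("203.0.113.7", index, chunks):
    print(chunk_id, "|", line)
```

The index stays small because it records chunks rather than line positions. The trade-off is that candidate chunks still have to be scanned, which is the step a real system would parallelize and optimize.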
Scanner also works nicely with other Datadog workflows. For example, if you're looking at a log event in Scanner that contains a trace ID, you can click a button in Scanner to open that trace in Datadog in a new browser tab. Likewise, if you're in Datadog and want to look up the Scanner logs related to a particular trace, Scanner makes it simple to execute a query for that trace ID and view those logs. This interoperability makes Scanner an excellent companion tool that blends smoothly with your existing Datadog workflow.
By using Scanner in combination with Datadog, the cost savings are substantial. In our scenario of analyzing 1 TB of logs per day, using Scanner Detections instead of Datadog Cloud SIEM can save you over $1.7 million per year. Additionally, you can reduce the Datadog Standard Log retention from 30 days to 3 days and use Scanner to search your logs that are older than 3 days, saving $860k per year in Datadog Standard Log costs. Altogether, Scanner allows you to maximize detection efficiency and search speed while dramatically reducing costs.
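As a quick check on these totals, the Python sketch below adds up the figures used in this article for the 1 TB per day scenario. The detection costs are derived from the per-million-event prices at roughly 1 billion events per day, the Standard Logs costs are the rounded figures quoted earlier, and the Scanner search cost is taken at its stated upper bound of $100,000. The variable names are ours.

```python
# Annual costs in dollars for the 1 TB/day scenario, taken from the article.
CLOUD_SIEM = 1_825_000         # $5 per million events, ~1e9 events per day
SCANNER_DETECTIONS = 36_500    # $0.10 per million events, ~1e9 events per day
STANDARD_30_DAYS = 900_000     # Standard Logs with 30-day retention
STANDARD_3_DAYS = 38_000       # Standard Logs with 3-day retention
SCANNER_SEARCH_MAX = 100_000   # upper bound for Scanner indexing and querying


def savings_summary() -> dict:
    detection = CLOUD_SIEM - SCANNER_DETECTIONS
    retention = STANDARD_30_DAYS - STANDARD_3_DAYS
    return {
        "detection_savings": detection,
        "retention_savings": retention,
        # Conservative net: subtract the most Scanner search could cost.
        "net_after_scanner_search": detection + retention - SCANNER_SEARCH_MAX,
    }


for label, dollars in savings_summary().items():
    print(f"{label}: ${dollars:,}")
```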
If your team is not covering certain log sources in Datadog Cloud SIEM or Datadog Logs due to high costs, Scanner can help. With Scanner's cost efficiency, it's affordable to run detections and searches on high log volumes, even at 1 TB per day or more. This means you can gain significantly greater visibility and threat detection coverage for high-volume log sources, such as cloud audit logs, network flow logs, and others.
By integrating Scanner, you can overcome the cost and efficiency challenges that come with using Datadog's logging and SIEM features. Scanner empowers your security team to retain visibility across high-volume log sources without compromising on detection capabilities or facing prohibitive expenses. Whether it's reducing the costs of continuous threat detection, enabling rapid historical searches, or managing retention more effectively, Scanner is a versatile and cost-effective tool to augment your Datadog environment. Take control of your log data, enhance your threat detection coverage, and unlock significant savings—all with Scanner.
Ready to see Scanner in action? Sign up for the Scanner playground at scanner.dev/demo and experience how Scanner can enhance your Datadog setup.