June 24, 2025

The Future of SIEM: Why Data Lakes Are Gaining Traction

Legacy SIEM platforms were designed to bring order to chaos. But today, many teams are actively searching for SIEM alternatives as data volumes grow and pricing models break under pressure. Security teams increasingly find themselves forced to drop critical log sources and limit retention windows, not because it’s smart, but because the pricing model leaves them no choice.

The Cost and Risk of Traditional SIEMs

SIEM is now one of the highest-budget line items in many security programs. According to real-world accounts, some organizations have seen contracts with platforms like Google Chronicle jump from the mid–five figures to over $1 million annually. Other platforms, like Sumo Logic, have raised prices by as much as 4–6x.

In response, teams cut corners. High-volume log sources get dropped. Retention shrinks to 30–90 days. But threat actors aren’t working on a 30-day cycle. IBM’s 2023 Cost of a Data Breach Report found the average breach lifecycle lasts 277 days. In 2024, it improved only slightly to 258 days. Without long-term logs, security teams are flying blind, unable to conduct thorough investigations, meet compliance requirements, or improve detection models.
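To make that gap concrete, here is a rough back-of-the-envelope sketch. It assumes, simplistically, that an investigation starts at the end of the breach lifecycle and can only look back as far as the retention window; the function name is hypothetical and only illustrates the arithmetic.

```python
def visible_fraction(retention_days: int, breach_lifecycle_days: int) -> float:
    """Fraction of a breach lifecycle still covered by retained logs.

    Simplifying assumption: investigators look backward from the end of
    the lifecycle and can only see the last `retention_days` of logs.
    """
    if breach_lifecycle_days <= 0:
        raise ValueError("breach_lifecycle_days must be positive")
    return min(retention_days, breach_lifecycle_days) / breach_lifecycle_days


# 258 days is the 2024 average breach lifecycle cited above.
for retention in (30, 90, 365):
    pct = visible_fraction(retention, 258) * 100
    print(f"{retention:>3}-day retention covers about {pct:.0f}% of a 258-day lifecycle")
```

Under these assumptions, a 30-day window covers roughly 12% of the lifecycle and a 90-day window roughly 35%. Only retention longer than the lifecycle itself covers all of it.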

SIEM Is Being Unbundled

Modern security teams aren’t waiting for legacy vendors to catch up. Instead, they’re embracing modular architectures, unbundling the SIEM into best-of-breed tools:

  • SOAR: Tines, Torq
  • Log Collection: Grove, Scanner.dev
  • Log Enrichment & Routing: Cribl, Substation
  • Search & Investigation: Athena, Snowflake, Scanner.dev
  • Detection Logic: Panther, Scanner.dev
  • Case Management: Torq, Incident.io

This new architecture is cheaper, more flexible, and scales with infrastructure, not against it. Forward-thinking teams are building detection pipelines that ingest logs into low-cost storage, enrich and route them intelligently, detect threats using open rule formats, and surface alerts into their tools of choice.
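The flow described above (ingest, enrich, route, detect, alert) can be sketched in a few lines of Python. This is a toy illustration rather than any vendor's implementation: the watchlist, the rule, the storage prefixes, and every function name are hypothetical, an in-memory dict stands in for object storage, and plain Python predicates stand in for an open rule format.

```python
from dataclasses import dataclass
from typing import Callable

Event = dict

# Hypothetical enrichment data: a watchlist of suspicious source IPs.
WATCHLIST = {"203.0.113.7"}


def enrich(event: Event) -> Event:
    """Return a copy of the event with extra context attached."""
    enriched = dict(event)
    enriched["ip_on_watchlist"] = event.get("src_ip") in WATCHLIST
    return enriched


def route(event: Event) -> str:
    """Choose a storage prefix based on the log source."""
    return f"logs/{event.get('source', 'unknown')}/"


@dataclass
class Rule:
    name: str
    matches: Callable[[Event], bool]


# Hypothetical detection rule, written as a plain predicate.
RULES = [
    Rule(
        "console_login_from_watchlisted_ip",
        lambda e: e.get("action") == "ConsoleLogin" and e.get("ip_on_watchlist", False),
    ),
]


def run_pipeline(events):
    """Enrich, store, and evaluate each event; return (stored, alerts)."""
    stored = {}   # prefix -> list of events (stand-in for object storage)
    alerts = []   # (rule name, event) pairs to hand to alerting or SOAR
    for raw in events:
        event = enrich(raw)
        stored.setdefault(route(event), []).append(event)
        alerts.extend((rule.name, event) for rule in RULES if rule.matches(event))
    return stored, alerts


sample = [
    {"source": "cloudtrail", "action": "ConsoleLogin", "src_ip": "203.0.113.7"},
    {"source": "dns", "action": "query", "src_ip": "198.51.100.4"},
]
stored, alerts = run_pipeline(sample)
print(sorted(stored), [name for name, _ in alerts])
```

Every event lands in storage whether or not a rule fires, so detection and retention stay decoupled. That decoupling is what lets each stage be swapped out independently.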

The Rise of Object Storage

One of the driving forces behind this shift is object storage. Platforms like Amazon S3, Google Cloud Storage, and Azure Blob Storage power the rest of the data ecosystem, from Snowflake to AI pipelines. Now, security is catching up.

Object storage is:

  • Durable and scalable
  • Vendor-neutral
  • Extremely cost-effective (typically $0.02–$0.03 per GB per month)

This changes the economics of retention. Suddenly, keeping a year or more of logs becomes viable without sacrificing search performance.
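A quick estimate shows why. The sketch below assumes the price quoted above is per GB per month, that ingest volume is steady, and that logs are stored uncompressed. It ignores request, retrieval, and query charges, so treat the result as a rough storage-only figure. The 1 TB/day volume and the function name are illustrative assumptions.

```python
def annual_storage_cost(daily_gb: float, retention_days: int, price_per_gb_month: float) -> float:
    """Steady-state yearly storage cost for a rolling retention window.

    Once the window is full, the amount stored stays constant at
    daily_gb * retention_days, billed every month.
    """
    stored_gb = daily_gb * retention_days
    return stored_gb * price_per_gb_month * 12


daily_gb = 1_000  # assumed ingest of roughly 1 TB/day
for price in (0.02, 0.03):
    for retention in (90, 365):
        cost = annual_storage_cost(daily_gb, retention, price)
        print(f"{retention:>3} days at ${price:.2f}/GB-month: about ${cost:,.0f} per year")
```

With these assumptions, a full year of retention at 1 TB/day comes to roughly $88,000 to $131,000 per year in storage. Compression would typically lower that further.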

But What About Performance?

Historically, object storage wasn’t fast enough for interactive investigations. That’s changing fast. Tools like Amazon S3 Tables, built on Apache Iceberg, are accelerating structured queries. Engines like Scanner bring SIEM-like performance to cloud-native storage, with needle-in-haystack search running 100x faster than Athena on JSON-based logs.

This means teams can:

  • Retain all logs
  • Keep years of searchable history
  • Investigate freely without cost spikes

Retention Isn’t Just a Compliance Checkbox

Short log retention is more than a budgeting compromise; it’s a security risk. PCI DSS, for example, requires audit logs to be retained for at least 12 months, with the most recent three months immediately available for analysis. And many cyber insurance providers now consider long-term visibility a prerequisite for coverage.

Longer retention also means better detections. Analysts can baseline normal behavior, spot low-and-slow threats, and answer questions like: “Have we seen this before?” without switching tools or restoring cold archives.
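Here is a small sketch of how the retention window changes the answer to “Have we seen this before?”. The event shape, field names, and helper functions are hypothetical, and a Python list stands in for the log store.

```python
from datetime import datetime, timedelta


def within_retention(events, now, retention_days):
    """Keep only the events that a given retention window would still hold."""
    cutoff = now - timedelta(days=retention_days)
    return [e for e in events if e["timestamp"] >= cutoff]


def first_seen(events, field, value):
    """Earliest timestamp where `field == value`, or None if never seen."""
    return min(
        (e["timestamp"] for e in events if e.get(field) == value),
        default=None,
    )


now = datetime(2025, 6, 24)
history = [
    {"timestamp": now - timedelta(days=200), "src_ip": "203.0.113.7"},
    {"timestamp": now - timedelta(days=1), "src_ip": "203.0.113.7"},
]

# With 90 days of logs the IP looks brand new; with a year it does not.
print(first_seen(within_retention(history, now, 90), "src_ip", "203.0.113.7"))
print(first_seen(within_retention(history, now, 365), "src_ip", "203.0.113.7"))
```

In this toy data set, a 90-day window suggests the address first appeared yesterday, while a 365-day window shows it was already active 200 days ago. That difference changes the scope of an investigation.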

The Strategic Advantage of Modular Architectures

Modular SIEM architectures offer multiple advantages:

  • Cost Savings: Storing logs in object storage instead of a traditional SIEM can reduce costs by 50–90%.
  • Flexibility: Teams avoid vendor lock-in and can swap out components as needs evolve.
  • Scalability: Add new log sources without fear of cost spikes or performance degradation.
  • Speed: Search performance is improving rapidly thanks to open formats, columnar storage, and indexing engines.

A Practical Path Forward

CISOs don’t need to rip and replace overnight. Many teams start by offloading high-volume, low-signal logs, such as cloud audit trails or DNS records, into object storage like Amazon S3. Tools like Grove make it easy to collect and route these logs, while enrichment layers like Substation can add valuable context before storage. From there, logs can be queried directly using Athena (though this can be slow and expensive per query), or piped into systems like Snowflake for more structured analysis at relatively low storage cost. For teams that want immediate value, indexing logs in place with a tool like Scanner.dev enables fast search and real-time detections, without the heavy overhead of traditional SIEMs. This approach offers a cost-efficient onramp that reduces spend today while laying the foundation for a modern, scalable logging strategy.
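One practical detail when offloading logs is how objects are named. The sketch below builds Hive-style partitioned keys (`key=value` path segments), a common convention that query engines such as Athena can use to skip irrelevant data. The prefix, partition names, file extension, and function name are all illustrative assumptions, not a layout any particular tool requires.

```python
from datetime import datetime, timezone


def object_key(source: str, ts: datetime, batch_id: str) -> str:
    """Build a Hive-style partitioned object key for a batch of logs.

    Partitioning by source and UTC date lets a query for one source and
    one day read only that prefix instead of the whole bucket.
    """
    if ts.tzinfo is None:
        raise ValueError("ts must be timezone-aware")
    ts = ts.astimezone(timezone.utc)
    return (
        f"logs/source={source}/year={ts:%Y}/month={ts:%m}/day={ts:%d}/"
        f"{batch_id}.json.gz"
    )


print(object_key("cloudtrail", datetime(2025, 6, 24, 12, 0, tzinfo=timezone.utc), "batch-0001"))
```

Normalizing to UTC before partitioning keeps events from different regions in consistent daily prefixes.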

Modern SIEM strategy is no longer about squeezing logs into a box with fixed capacity. It’s about creating an open, scalable, and efficient architecture that supports long-term visibility and rapid investigation.

The future of SIEM isn’t all-in-one. It’s modular, storage-first, and built for the realities of today’s security landscape.

We believe that traditional log architectures are broken for modern log volumes. Scanner enables fast search and detections for log data lakes – directly in your S3 buckets. Reduce the total cost of ownership of logs by 80-90%.
Cliff Crosland
CEO, Co-founder
Scanner, Inc.
Cliff is the CEO and co-founder of Scanner.dev, which provides fast search and threat detections for log data in S3. Prior to founding Scanner, he was a Principal Engineer at Cisco where he led the backend infrastructure team for the Webex People Graph. He was also the engineering lead for the data platform team at Accompany before its acquisition by Cisco. He has a love-hate relationship with Rust, but it's mostly love these days.