July 23, 2025

Taking on Black Hat with Something New

We're heading to Black Hat. Not for the slots (we're terrible at gambling), but to connect with the people who deal with detection chaos daily – CISOs, SOC leaders, cloud teams, security engineers. The folks in the trenches, making tough calls and solving hard problems.

For the past year, we've been listening closely. And what we're hearing isn't new, but it's getting louder:

  • SIEM costs are spiraling. We're hearing from teams that tools like Google Cloud SecOps (Chronicle), Sumo Logic, and others are raising prices—sometimes 5-10x.
  • There is a move to data lakes. In response, many are turning to security data lakes in S3, GCS, or Azure Blob Storage. It's a far more scalable, cost-effective approach.
  • But the shift isn’t easy. Getting data into the lake is still too complex, and without full-text search, investigations become slow and frustrating.

That’s what we’re showing up to talk about at Black Hat: the pain points security teams know all too well, what we’ve already built to make detection more usable, and a new addition we’re excited to unveil – a zero-maintenance log ingestion tool that makes getting data into your security lake effortless.

What Security Teams Keep Telling Us

We attended Black Hat in 2024, and a year later we are still hearing the same things. After talking with dozens of teams across many different types of companies, a few statements keep coming up:

“We’re paying an even bigger fortune for SIEM this year, but only sending in what we can afford to.”

“I’ve collected a lot of logs in my data lake, and I can do basic lookups, like searching by IP or domain, but when I need to dig into messy fields, like command-line arguments buried in PowerShell logs, it falls apart. Full-text search just isn’t practical in most data lakes.”

“We’ve got all this data sitting in our lake, but running detections on it is way harder than it was with our old SIEM. It feels like we have more data, but less signal.”

“Our ingestion pipelines into our data lake are a mess, scripts duct-taped together that no one wants to touch.”

If any of this sounds familiar, we’d love to hear how you’ve approached it – and if you’re still searching for a solution, we’d love to help.

New Feature Alert: Scanner Collect

As SIEM costs continue to rise, sometimes by 5–10x, teams are shifting more logs to cloud storage like S3, GCS, or Azure Blob. It’s the right move for cost and scalability.

But let’s be honest: it’s painful.

Pain point #1: Ingesting logs into your data lake is a never-ending project.

Every team we talk to is going through the same grind:

  • Pull Okta logs into S3.
  • Then Google Workspace.
  • Then Slack, AWS CloudTrail, GCP Audit, and 50 others.

Each source takes some custom work, then requires constant maintenance. There’s always something breaking: rate limits, auth changes, expired tokens. It’s tedious and distracting.
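
To make that grind concrete, here is a minimal sketch of the plumbing a hand-rolled puller tends to accumulate for just one source: cursor pagination plus retry on rate limits. Everything named here (`pull_events`, `fetch_page`, `RateLimited`, the fake source) is hypothetical and for illustration only. It is not tied to any real vendor API, and it is not how Scanner Collect works internally.

```python
import time
from typing import Callable, Dict, Iterator, List, Optional, Tuple


class RateLimited(Exception):
    """Raised by a (hypothetical) source client when the API asks us to slow down."""

    def __init__(self, retry_after: float):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


# fetch_page(cursor) returns (events, next_cursor); next_cursor is None on the last page.
FetchPage = Callable[[Optional[str]], Tuple[List[Dict], Optional[str]]]


def pull_events(
    fetch_page: FetchPage,
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Dict]:
    """Yield every event from a paginated source, retrying when rate limited."""
    cursor: Optional[str] = None
    while True:
        for attempt in range(max_retries + 1):
            try:
                events, next_cursor = fetch_page(cursor)
                break
            except RateLimited as err:
                if attempt == max_retries:
                    raise  # give up after max_retries retries on the same page
                sleep(err.retry_after)
        yield from events
        if next_cursor is None:
            return
        cursor = next_cursor


def make_fake_source() -> FetchPage:
    """A stand-in for a real API: two pages, throttled once on the second page."""
    pages = {None: ([{"id": 1}, {"id": 2}], "p2"), "p2": ([{"id": 3}], None)}
    state = {"throttled": False}

    def fetch_page(cursor: Optional[str]):
        if cursor == "p2" and not state["throttled"]:
            state["throttled"] = True
            raise RateLimited(retry_after=0.0)
        return pages[cursor]

    return fetch_page


if __name__ == "__main__":
    for event in pull_events(make_fake_source()):
        print(event)
```

Even this toy version leaves out token refresh, checkpointing, deduplication, and writing to the bucket, and each real source needs its own variant of it.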

That’s why we built Scanner Collect.

Our newest feature connects directly to dozens of log sources and pulls data into your S3 bucket – automatically. 

We support audit logs from major cloud providers and popular SaaS tools—including Okta, Google Workspace, Slack, GitHub, AWS CloudTrail, GCP, Azure, and many more. You can also send logs and webhook events directly via HTTP, and we’ll automatically load them into your S3 bucket.

No more brittle Python scripts. No more weekly connector maintenance. Just click through our UI, connect your sources, and your data lake builds itself.

Set up dozens of log sources in an afternoon.

Pain point #2: Searching your data lake is harder than your old SIEM.

SQL-based tools are fine for point lookups – like filtering by IP or domain – but they struggle with the messy, text-heavy logs that matter most. Try searching PowerShell command-line arguments or fuzzy regex patterns, and things fall apart. Common data lake query tools like Trino typically fall back to full table scans for these queries, which can take hours to finish and stall investigations.
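
As a rough illustration of why scans hurt, the toy sketch below contrasts a full scan, which reads every log line on every query, with a generic inverted index, which maps each token to the lines that contain it. This shows the general technique only, on three made-up log lines held in memory. The names (`scan_search`, `build_index`, `index_search`) and the tokenizer are our own assumptions for the example, and none of it describes Scanner's actual indexing engine.

```python
import re
from collections import defaultdict

# Made-up sample log lines, for illustration only.
LOGS = [
    "powershell.exe -NoProfile -EncodedCommand SQBFAFgA",
    "cmd.exe /c whoami",
    "powershell.exe -File backup.ps1",
]

# Simplistic tokenizer: runs of letters, digits, underscores, and dots.
TOKEN = re.compile(r"[A-Za-z0-9_.]+")


def tokenize(line):
    """Return the set of lowercased tokens in a log line."""
    return {tok.lower() for tok in TOKEN.findall(line)}


def scan_search(logs, term):
    """Full scan: every query re-reads and re-tokenizes every line."""
    term = term.lower()
    return [i for i, line in enumerate(logs) if term in tokenize(line)]


def build_index(logs):
    """Inverted index: token -> set of line numbers, built once at ingest time."""
    index = defaultdict(set)
    for i, line in enumerate(logs):
        for tok in tokenize(line):
            index[tok].add(i)
    return index


def index_search(index, term):
    """Index lookup: a single dictionary access, independent of log volume."""
    return sorted(index.get(term.lower(), set()))


if __name__ == "__main__":
    index = build_index(LOGS)
    print(scan_search(LOGS, "EncodedCommand"))
    print(index_search(index, "EncodedCommand"))
```

Both functions return the same line numbers. The difference is that the scan's cost grows with the total volume of logs, while the lookup's cost depends mostly on how many lines match.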

That’s where Scanner’s full-text search comes in.

We’ve built a data lake search indexing engine designed for raw, messy logs – whether you’re working with tens of terabytes or petabytes of data.

Look up anything: IPs, file hashes, command-line flags, or the weird string that just feels suspicious. And do it in seconds.

Here’s what you get out of the box:

  • Ingest once, use instantly: Scanner Collect pulls logs from dozens of sources into S3 with zero maintenance.
  • Search at scale: Query years of logs—structured or unstructured—in seconds.
  • Alert with confidence: Build explainable, code-driven rules with full transparency.
  • Detect in real time: Start running powerful detections just minutes after ingestion.
  • Keep everything: Retain raw or enriched logs without worrying about cost.

What We Want to Show You at Black Hat

We’d love to show you what we’ve been building – and how Scanner Collect is already helping teams streamline their security operations.

Scanner powers fast, scalable detection and investigation for teams that demand clarity and speed. At Black Hat, we’re excited to show you how we’re solving the toughest challenges in modern detection – starting with ingestion.

Scanner Collect, our newest feature, completes the loop: log collection, search, and detection—all in one workflow.

It’s built for teams who are tired of maintaining brittle ingestion pipelines and want to spend more time focusing on what really matters: catching threats.

What We’re Hoping to Learn (and Who We’re Looking to Meet)

We’re heading to Black Hat to show what we’ve been building—and to get real feedback from the people doing the hard work every day. We’ll be pitching some ideas, but just as importantly, we’re here to listen.

We’re especially excited to meet with folks thinking through questions like:

  • What's still painful about building and maintaining a security data lake today?
  • What makes searching your data lake frustrating or slower than you’d like?
  • What capabilities do traditional SIEMs still provide that data lakes haven’t matched yet?
  • Where does the maintenance burden feel too high – connectors, schema management, detections?
  • What kind of investigations are still too hard to do in your current data lake setup?
  • What’s stopping your team from going “all in” on a data lake-based detection model?

If any of that resonates, we'd love to chat.

Want to Connect?

We’ll be at Black Hat all week, and we’re booking meetings throughout the event. If you want to see how Scanner is solving detection challenges or just want to trade ideas about pipelines, visibility, or what’s broken in security tooling, let’s connect.

And if you’re curious about Scanner Collect, the new feature we just rolled out, we’d be happy to show you what it looks like in action.

We believe that traditional log architectures are broken for modern log volumes. Scanner enables fast search and detections for log data lakes – directly in your S3 buckets. Reduce the total cost of ownership of logs by 80-90%.
Cliff Crosland
CEO, Co-founder
Scanner, Inc.
Cliff is the CEO and co-founder of Scanner.dev, which provides fast search and threat detections for log data in S3. Prior to founding Scanner, he was a Principal Engineer at Cisco where he led the backend infrastructure team for the Webex People Graph. He was also the engineering lead for the data platform team at Accompany before its acquisition by Cisco. He has a love-hate relationship with Rust, but it's mostly love these days.