Search Petabytes in Seconds

Full-text search across years of security logs in seconds, not hours. Inverted indexes and serverless execution make iterative investigation actually possible.

<10s

Search 100TB of logs

100x

Faster than Athena

1–10s

Typical query time

$0.01–$0.10

Cost per query

Traditional Data Lakes Are Too Slow for Security

When queries take 30+ minutes, investigation becomes impossible. You can't iterate, can't pivot, can't pursue multiple hypotheses.

Problem: Full Scans

Traditional tools (Athena, Presto) scan entire tables even for simple queries. Searching for a specific IP or API key means reading and parsing every log file.

30+ minute queries on months of data

Scan entire dataset to find matching events

$75-100 per query in compute costs

Can't search nested JSON efficiently

Partitioning helps but doesn't solve the core problem

Solution: Inverted Indexes

Scanner builds indexes at ingestion time. Queries look up which files contain matching data, then scan only those files. Skip everything else.

1-10 second queries on years of data

Scan only files with matching events

$0.01-0.10 per query

Native nested field access

Works on any data, no partitioning required

How Scanner Search Works

Step 1:

Indexes built when logs arrive in S3

When logs arrive in S3, Scanner parses them once and builds an inverted index: a lookup table mapping every field value to the files containing it. Index files are stored alongside your logs in S3.
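To make the idea of an inverted index concrete, here is a minimal sketch in Python. It is not Scanner's implementation: the file names, the `build_inverted_index` helper, and the in-memory dictionary are all assumptions chosen for illustration. It only shows the general technique of mapping each field value to the set of files that contain it.

```python
import json
from collections import defaultdict


def build_inverted_index(files):
    """Map each (field, value) pair to the set of files containing it.

    `files` is a hypothetical dict of file name -> list of JSON log lines.
    """
    index = defaultdict(set)
    for name, lines in files.items():
        for line in lines:
            event = json.loads(line)  # each log line is parsed once
            for field, value in event.items():
                index[(field, str(value))].add(name)
    return index


# Illustrative data only; real logs would be read from object storage.
files = {
    "logs/a.json": ['{"src_ip": "10.0.0.5", "user": "alice"}'],
    "logs/b.json": ['{"src_ip": "10.0.0.9", "user": "alice"}'],
}
index = build_inverted_index(files)
```

With this structure, looking up `("src_ip", "10.0.0.5")` returns only `logs/a.json`, so a query for that IP never needs to open `logs/b.json`.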

Docs: How Scanner Works
Step 2:

Queries find relevant data instantly

When you search, Scanner reads the index files, not the original log files. It looks up each search term, retrieves the list of index segments that contain it, and intersects those lists to find the segments that match all your conditions. Only those segments get scanned.
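The intersection step can be sketched as follows. This is an illustrative example under stated assumptions, not Scanner's actual code: the `matching_segments` function, the segment IDs, and the sample index are hypothetical. It shows how intersecting per-term segment lists narrows a query down to the few segments worth scanning.

```python
def matching_segments(index, conditions):
    """Return the segment IDs that appear in the list for every condition."""
    postings = [index.get(cond, set()) for cond in conditions]
    if not postings:
        return set()
    postings.sort(key=len)       # start from the smallest list
    result = set(postings[0])
    for posting in postings[1:]:
        result &= posting        # keep only segments matching so far
        if not result:
            break                # nothing can match; stop early
    return result


# Hypothetical index: (field, value) -> segment IDs containing it.
index = {
    ("access_key", "AKIA_EXAMPLE"): {"seg-02", "seg-07", "seg-11"},
    ("src_ip", "203.0.113.7"): {"seg-07", "seg-11", "seg-30"},
    ("event", "GetObject"): {"seg-01", "seg-07"},
}
hits = matching_segments(
    index,
    [
        ("access_key", "AKIA_EXAMPLE"),
        ("src_ip", "203.0.113.7"),
        ("event", "GetObject"),
    ],
)
```

Here only `seg-07` satisfies all three conditions, so it is the only segment that would need to be scanned.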

Docs: How Scanner Achieves Fast Queries
Step 3:

Parallel serverless execution

Lambda workers spawn automatically and analyze index files in parallel. They identify matching log segments, scan only the relevant data, and merge the results. Functions terminate immediately afterward, so you pay only for the seconds of compute you actually use.
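The fan-out and merge pattern described here can be sketched locally. In this example a Python thread pool stands in for Lambda workers, which is an assumption made purely so the code runs on one machine. The `SEGMENTS` data, `scan_segment`, and `parallel_search` are hypothetical names, not part of any Scanner API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical segment contents; a real worker would read these from S3.
SEGMENTS = {
    "seg-1": [{"ts": 3, "key": "AKIA_EXAMPLE"}, {"ts": 1, "key": "other"}],
    "seg-2": [{"ts": 2, "key": "AKIA_EXAMPLE"}],
}


def scan_segment(segment_id, key):
    """Scan one segment and return only the events matching the key."""
    return [e for e in SEGMENTS[segment_id] if e["key"] == key]


def parallel_search(segment_ids, key, max_workers=8):
    """Scan segments concurrently, then merge results in timestamp order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = pool.map(lambda s: scan_segment(s, key), segment_ids)
        merged = [event for part in partials for event in part]
    return sorted(merged, key=lambda e: e["ts"])


results = parallel_search(["seg-1", "seg-2"], "AKIA_EXAMPLE")
```

Each worker handles one segment independently, which is why this kind of workload maps naturally onto short-lived serverless functions: there is no shared state to coordinate until the final merge.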

Speed changes what's possible

Investigation is iterative. Every answer leads to more questions. Fast queries mean you can actually follow every lead. Traditional data lake tools like Athena and Presto are too slow for this workflow.

Traditional Tools

3 queries in 2 hours

Scenario:

A suspicious API key is accessing S3 buckets from an unknown IP address.

Query 1:

When did this key first appear?

45 minutes

Query 2:

What other buckets has it accessed?

38 minutes

Query 3:

Any related suspicious activity?

52 minutes

Total: 2 hours, 15 minutes

Investigation has barely started. Window for containment is closing.

Scanner

20 queries in 4 minutes

Same scenario:

But you can pivot immediately on every finding.

Query 1:

When did this key first appear?

8 seconds

Query 2:

What other buckets has it accessed?

5 seconds

Query 3:

Any related suspicious activity?

12 seconds

Queries 4–20:

Who created the key? When? From where? What else did they do? Which resources are affected? Any lateral movement?

3 minutes combined

Total: 4 minutes

Root cause identified: compromised CI/CD pipeline. All affected resources mapped. Systems isolated.

Built for security investigations

Fast queries are just the start. Scanner is designed for how security teams actually work.

Full-text search

Search for any text in any field. No schema required. Find IPs, usernames, file paths, or error messages across all your logs with one query.

Nested field access

Query deeply nested JSON directly. No JSON extraction functions. Indexes work on nested fields automatically.
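One common way to make nested fields searchable without extraction functions is to flatten each event into dotted paths before indexing. The sketch below assumes that approach for illustration only; the page does not say how Scanner handles nesting internally. The `flatten` helper and the sample event are hypothetical.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into a single dict keyed by dotted paths."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))  # recurse into nested objects
        else:
            flat[path] = value
    return flat


# Illustrative event shaped loosely like a cloud audit log record.
event = {
    "eventName": "GetObject",
    "userIdentity": {
        "accessKeyId": "AKIA_EXAMPLE",
        "sessionContext": {"mfa": "false"},
    },
}
flat = flatten(event)
```

After flattening, a field such as `userIdentity.accessKeyId` can be indexed and queried exactly like a top-level field.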

Temporal context

"Show me everything from this user in a 10-minute window." Jump from one event to all related activity across log sources. Context is critical for investigations.

Saved queries

Save complex queries and share them with your team. Rerun investigations instantly. Build a library of investigative playbooks that work.

AI explain

Click any log event to get a plain-English explanation. Understand what happened, why it might matter, and what to look for next, without being a log format expert.

API access

Query programmatically from notebooks, scripts, or automation. Same speed as the UI. Build custom workflows, enrich alerts, or integrate with your tools.

Search your data like it’s 2025

See how Scanner can turn your S3 data lake into a high-performance search engine. Query years of logs in seconds, not hours.

Book a Demo