LobsterHoney Docs
Concepts

Architecture Overview

How LobsterHoney's detection pipeline works from trap hit to alert.

LobsterHoney's detection pipeline turns a single trap visit into a fully classified, scored, and alerted security event.

The Detection Pipeline

Agent Visits a Trap

An AI agent crawls your site and hits one of your trap endpoints (e.g. /s/your-org/robots.txt). The trap looks like legitimate content -- a robots.txt file, an .env config, or an API endpoint.

Tokens Are Embedded

LobsterHoney embeds three types of tokens into the response:

  • Callback tokens -- hidden URLs that phone home when followed
  • Extraction tokens -- payloads designed to capture agent instructions and identity
  • Canary credentials -- realistic fake API keys and database URLs that are monitored for use

These are invisible to human visitors but are processed by AI agents that parse and act on content.

Signals Fire

As the agent interacts with the injected tokens, signals fire:

  • CALLBACK_HIT when a hidden URL is followed
  • SYSTEM_PROMPT_LEAKED when an agent's instructions are captured
  • CREDENTIAL_USED when a canary credential is used

Each signal is categorized as either a tripwire (high-confidence) or behavioral (supporting evidence).

Scoring and Classification

The scoring engine aggregates all signals for the session and produces:

  • A threat score (0-100+)
  • A classification (HUMAN, BOT, AI_AGENT, or AI_AGENT_MALICIOUS)
  • A confidence level (0-100%)
  • A severity rating (Low, Medium, High, Critical)

Classification can escalate as new signals accumulate -- a session that starts as BOT can be reclassified to AI_AGENT_MALICIOUS.

Alert and Response

When a session meets the notification threshold:

  • A webhook fires to your configured Slack channel (or custom endpoint)
  • The incident appears in the dashboard with full session details
  • All extracted intelligence (system prompts, model identity, credentials used) is available for review

The entire pipeline runs in real-time. From trap hit to Slack notification is typically under 5 seconds.

Key Design Principles

  • Zero false positives on tripwires -- Callback URLs, extraction tokens, and canary credentials are never visible to humans. Only software that processes hidden content will trigger them.
  • Layered detection -- Each token type catches different agent behaviors. Used together, they build a comprehensive picture.
  • Passive deployment -- Traps sit quietly until visited. No impact on your site's performance or user experience.
  • Attribution by design -- Every signal captures metadata (IP, user agent, timing, extracted data) that helps identify the operator behind the agent.

On this page