Auto-Tracing with OpenTelemetry

Flock includes automatic OpenTelemetry instrumentation for all agent methods, providing detailed observability for debugging and monitoring.

Quick Start

Enable auto-tracing by setting the environment variable:

export FLOCK_AUTO_TRACE=true
python your_agent.py

This automatically:

  • ✅ Wraps all public methods with OTEL spans
  • ✅ Configures logging to DEBUG level
  • ✅ Captures trace IDs, correlation IDs, and agent metadata
  • ✅ Creates parent-child span relationships for call hierarchies

Configuration

Basic Usage (Console Only)

# Enable auto-tracing with console logs only
export FLOCK_AUTO_TRACE=true
python your_agent.py

Export to DuckDB

# Export traces to .flock/traces.duckdb
export FLOCK_AUTO_TRACE=true
export FLOCK_TRACE_FILE=true
python your_agent.py

Traces are stored in a DuckDB database, which provides:

  • ✅ 10-100x faster queries than JSON/SQLite
  • ✅ Built-in trace viewer UI in the Flock dashboard
  • ✅ SQL analytics for debugging and monitoring
  • ✅ Efficient columnar storage

Export to Grafana/Jaeger (OTLP)

# Send traces to OTLP endpoint (Grafana, Jaeger, etc.)
export FLOCK_AUTO_TRACE=true
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
python your_agent.py

Disable Auto-Tracing

export FLOCK_AUTO_TRACE=false
python your_agent.py

Filtering: Whitelist and Blacklist

Control which operations get traced to reduce overhead and noise. This matters most for streaming token operations, which can cause performance issues when traced.

How Filtering Works: Two-Stage Process

Stage 1: Wrapping Methods with @traced_and_logged

Methods must first be wrapped with the tracing decorator. This happens automatically via metaclasses:

  • AutoTracedMeta: Wraps all public methods in Agent and Flock classes
  • TracedModelMeta: Wraps all public methods in AgentComponent subclasses

Classes using these metaclasses:

  • Agent (from agent.py)
  • Flock (from orchestrator.py)
  • AgentComponent and all subclasses:
      • EngineComponent (DSPyEngine, ClaudeEngine, etc.)
      • OutputUtilityComponent
      • DashboardEventCollector
      • MetricsUtility, LoggingUtility
      • All custom components
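The wrapping step can be illustrated with a small, self-contained sketch. It is not Flock's implementation: `AutoTracedSketch`, `traced`, and `CALLS` are hypothetical names, and the stand-in decorator only records a span name instead of creating an OTEL span. It also assumes synchronous methods, whereas real agent methods may be async.

```python
import functools
import inspect

CALLS = []  # stand-in for exported spans


def traced(func, span_name):
    """Hypothetical stand-in for @traced_and_logged: records the span name."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        CALLS.append(span_name)
        return func(*args, **kwargs)
    return wrapper


class AutoTracedSketch(type):
    """Wraps every public function defined in the class body at class creation."""

    def __new__(mcls, name, bases, namespace, **kwargs):
        for attr, value in list(namespace.items()):
            # Only plain functions whose name does not start with "_" are wrapped.
            if inspect.isfunction(value) and not attr.startswith("_"):
                namespace[attr] = traced(value, f"{name}.{attr}")
        return super().__new__(mcls, name, bases, namespace, **kwargs)


class Agent(metaclass=AutoTracedSketch):
    def execute(self):
        return self._helper()

    def _helper(self):  # private: never wrapped
        return "done"


class CustomAgent(Agent):  # subclasses inherit the metaclass
    def run(self):
        return self.execute()
```

Calling `CustomAgent().run()` records `CustomAgent.run` followed by `Agent.execute`, while `_helper` stays untraced because it was never wrapped.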

Stage 2: Filter Check (Whitelist/Blacklist)

Only wrapped methods reach the filter. Before creating an OTEL span, the decorator checks:

service_name = span_name.split('.')[0]  # Extract class name
# Example: "Agent.execute" → service_name = "Agent"

if not should_trace(service_name, span_name):
    return func(*args, **kwargs)  # Skip span creation entirely

Key Point: Methods that are not wrapped (e.g., plain functions, non-component classes) are never traced, regardless of whitelist/blacklist settings.

Whitelist: FLOCK_TRACE_SERVICES

Filters by CLASS NAME (case-insensitive). Only trace methods from specified classes.

# Only trace Agent and Flock classes
export FLOCK_TRACE_SERVICES='["agent", "flock"]'

Behavior:

  • If set: Only traces methods from listed classes
  • If empty/unset: Traces ALL wrapped classes
  • Case-insensitive: "agent", "Agent", "AGENT" all match the Agent class

Example:

FLOCK_TRACE_SERVICES='["agent", "flock", "dspyengine"]'

Result:

  • ✅ Agent.execute - traced (Agent in whitelist)
  • ✅ Flock.publish - traced (Flock in whitelist)
  • ✅ DSPyEngine.evaluate - traced (DSPyEngine in whitelist)
  • ❌ OutputUtilityComponent.on_post_evaluate - NOT traced (not in whitelist)
  • ❌ DashboardEventCollector.collect_event - NOT traced (not in whitelist)

Blacklist: FLOCK_TRACE_IGNORE

Filters by FULL OPERATION NAME (exact match). Never trace specific methods.

# Never trace these specific methods
export FLOCK_TRACE_IGNORE='["DashboardEventCollector.set_websocket_manager", "Agent.get_identity"]'

Behavior:

  • Exact match on ClassName.method_name
  • Takes priority over whitelist
  • Use this to exclude noisy or low-value operations

Example:

FLOCK_TRACE_SERVICES='["agent", "dashboardeventcollector"]'
FLOCK_TRACE_IGNORE='["DashboardEventCollector.set_websocket_manager"]'

Result:

  • ✅ Agent.execute - traced (in whitelist, not in blacklist)
  • ✅ DashboardEventCollector.collect_event - traced (in whitelist, not in blacklist)
  • ❌ DashboardEventCollector.set_websocket_manager - NOT traced (in blacklist)

Filter Priority

  1. Blacklist (highest priority) - If in FLOCK_TRACE_IGNORE, never trace
  2. Whitelist - If FLOCK_TRACE_SERVICES is set, only trace listed services
  3. Default - If no filters set, trace everything that's wrapped
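The priority order can be sketched as a small filter object. This is an illustration under assumptions, not Flock's `TraceFilterConfig`: the class name `FilterSketch`, its fields, and the `from_env` helper are hypothetical. Only the two environment variable names and the matching rules (case-insensitive class name for the whitelist, exact operation name for the blacklist) come from the text above.

```python
import json
import os
from dataclasses import dataclass


def _load_json_list(name, env):
    """Parse a JSON list from an environment mapping; empty or unset gives []."""
    raw = env.get(name, "").strip()
    return json.loads(raw) if raw else []


@dataclass(frozen=True)
class FilterSketch:
    services: frozenset = frozenset()  # lowercased class names (whitelist)
    ignore: frozenset = frozenset()    # exact "Class.method" names (blacklist)

    @classmethod
    def from_env(cls, env=None):
        env = os.environ if env is None else env
        services = frozenset(
            s.lower() for s in _load_json_list("FLOCK_TRACE_SERVICES", env)
        )
        ignore = frozenset(_load_json_list("FLOCK_TRACE_IGNORE", env))
        return cls(services, ignore)

    def should_trace(self, service_name, span_name):
        # 1. Blacklist wins over everything else.
        if span_name in self.ignore:
            return False
        # 2. A non-empty whitelist restricts tracing to the listed classes.
        if self.services:
            return service_name.lower() in self.services
        # 3. No filters configured: trace everything that is wrapped.
        return True
```

For example, `FilterSketch.from_env({"FLOCK_TRACE_SERVICES": '["agent"]'}).should_trace("Agent", "Agent.execute")` evaluates to `True`, while the same call for `"DSPyEngine"` evaluates to `False`.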

A recommended starting point for your .env file:

# Trace core agent execution, avoid streaming token overhead
FLOCK_TRACE_SERVICES=["flock", "agent", "dspyengine", "outpututilitycomponent"]

# Exclude noisy operations
FLOCK_TRACE_IGNORE=["DashboardEventCollector.set_websocket_manager"]

# Auto-delete traces older than 30 days
FLOCK_TRACE_TTL_DAYS=30

Why these defaults?

  • flock - Core orchestration (publish, scheduling)
  • agent - Agent lifecycle and execution
  • dspyengine - LLM calls and responses
  • outpututilitycomponent - Output formatting
  • Excludes dashboard/streaming operations to avoid performance issues
  • TTL keeps database size manageable by removing old debugging data

Trace Time-To-Live (TTL)

Control database size by automatically deleting old traces:

# Delete traces older than 30 days
export FLOCK_TRACE_TTL_DAYS=30

How it works:

  • Cleanup runs on application startup (when the DuckDB exporter initializes)
  • Uses the created_at timestamp field from the spans table
  • Deletes all spans older than the specified number of days
  • Prints a summary: [DuckDB TTL] Deleted 1234 spans older than 30 days

When to use TTL:

  • ✅ Development environments - Keep database small, remove old debugging sessions
  • ✅ Production - Retain recent traces for debugging, delete historical data
  • ✅ CI/CD - Clean up test traces automatically
  • ❌ Long-term analytics - If you need historical trace data, disable TTL or export to separate storage

Performance impact:

  • Cleanup uses the indexed created_at field for fast deletion
  • Runs only once per application startup
  • Near-zero runtime overhead

Example scenarios:

# Development: Keep last 7 days only
FLOCK_TRACE_TTL_DAYS=7

# Production: Keep last 30 days
FLOCK_TRACE_TTL_DAYS=30

# Long-term retention: Keep last 90 days
FLOCK_TRACE_TTL_DAYS=90

# No cleanup: Keep all traces forever
# FLOCK_TRACE_TTL_DAYS=  (leave empty or comment out)
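The cleanup logic amounts to a single timestamp-bounded delete. The sketch below shows the idea with the standard library's `sqlite3` as a stand-in for DuckDB, so it runs without extra dependencies. The `spans` table and `created_at` field are named in the text above; the `name` column, the function name `delete_expired_spans`, and the ISO-string timestamp storage are assumptions made for illustration.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def delete_expired_spans(conn, ttl_days, now=None):
    """Delete spans whose created_at is older than ttl_days; return the count."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=ttl_days)).isoformat()
    # ISO-8601 strings in the same timezone sort chronologically.
    cur = conn.execute("DELETE FROM spans WHERE created_at < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


# In-memory database standing in for .flock/traces.duckdb
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spans (name TEXT, created_at TEXT)")

now = datetime(2025, 10, 7, tzinfo=timezone.utc)
conn.executemany(
    "INSERT INTO spans VALUES (?, ?)",
    [
        ("old", (now - timedelta(days=45)).isoformat()),
        ("recent", (now - timedelta(days=2)).isoformat()),
    ],
)

deleted = delete_expired_spans(conn, ttl_days=30, now=now)
print(f"Deleted {deleted} spans older than 30 days")
```

Passing `now` explicitly keeps the example deterministic; a startup hook would normally rely on the current time.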

What Gets Captured

Span Attributes

Every traced method automatically captures:

| Attribute | Description | Example |
|---|---|---|
| class | Class name of the method | Agent, Flock, DSPyEngine |
| function | Method name | execute, publish, evaluate |
| module | Python module path | flock.orchestrator |
| agent.name | Agent identifier (if applicable) | movie, tagline |
| agent.description | Agent description | Generate movie ideas |
| correlation_id | Request correlation ID | 12d0fcda-e7f7-4c96-ae8e-14ae4eca1518 |
| task_id | Task identifier | task_abc123 |
| result.type | Return type | list, EvalResult, Artifact |
| result.length | Collection size (if applicable) | 3 |
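As a rough illustration of how some of these attributes could be derived, the sketch below builds the `class`, `function`, `module`, `result.type`, and `result.length` entries from a method and its return value. The helper names `method_attributes` and `result_attributes` are hypothetical, and Flock's actual extraction logic may differ (for example in how it treats strings or generators).

```python
def method_attributes(cls, func):
    """Static attributes known before the call (hypothetical helper)."""
    return {
        "class": cls.__name__,
        "function": func.__name__,
        "module": func.__module__,
    }


def result_attributes(result):
    """Attributes derived from the return value (hypothetical helper)."""
    attrs = {"result.type": type(result).__name__}
    # Only sized objects get a length; note that str also qualifies here.
    if hasattr(result, "__len__"):
        attrs["result.length"] = len(result)
    return attrs


class Agent:
    def execute(self):
        return ["artifact-1", "artifact-2", "artifact-3"]


static_attrs = method_attributes(Agent, Agent.execute)
dynamic_attrs = result_attributes(Agent().execute())
```

Here `dynamic_attrs` ends up with `result.type` set to `list` and `result.length` set to `3`, matching the example column in the table.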

Span Hierarchy Example

Flock.publish (trace_id: ae40f0061e3f1bcfebe169191d138078)
└── Agent.execute
    ├── Agent.on_initialize
    │   ├── OutputUtilityComponent.on_initialize
    │   └── DSPyEngine.on_initialize
    ├── Agent.on_pre_consume
    ├── Agent.on_pre_evaluate
    ├── Agent.evaluate
    │   └── DSPyEngine.evaluate
    ├── Agent.on_post_evaluate
    ├── Agent.on_post_publish
    └── Agent.on_terminate

All spans within the same execution share the same trace_id, making it easy to trace a complete request flow.

Console Output

With auto-tracing enabled, you'll see:

2025-10-07 15:32:40 | DEBUG | [trace_id: ae40f0061e3f1bcfebe169191d138078] | [tools] | Flock.publish executed successfully
2025-10-07 15:32:40 | DEBUG | [trace_id: ae40f0061e3f1bcfebe169191d138078] | [tools] | Agent.execute executed successfully
2025-10-07 15:32:40 | DEBUG | [trace_id: ae40f0061e3f1bcfebe169191d138078] | [tools] | DSPyEngine.evaluate executed successfully

Notice how all logs share the same trace_id, making it easy to filter and follow execution flow.
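Because every line carries the trace ID, the console output can be grouped per trace with a few lines of Python. This is a minimal sketch that assumes the log format shown above (pipe-separated fields with a `[trace_id: <32 hex chars>]` segment); the function name `group_by_trace` is illustrative.

```python
import re
from collections import defaultdict

# Matches the trace ID and the last pipe-separated field (the message).
LOG_PATTERN = re.compile(
    r"\[trace_id: (?P<trace_id>[0-9a-f]{32})\].*\| (?P<message>[^|]+)$"
)


def group_by_trace(lines):
    """Return {trace_id: [message, ...]} preserving log order."""
    groups = defaultdict(list)
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match:
            groups[match["trace_id"]].append(match["message"].strip())
    return dict(groups)


sample = [
    "2025-10-07 15:32:40 | DEBUG | [trace_id: ae40f0061e3f1bcfebe169191d138078] | [tools] | Flock.publish executed successfully",
    "2025-10-07 15:32:40 | DEBUG | [trace_id: ae40f0061e3f1bcfebe169191d138078] | [tools] | Agent.execute executed successfully",
    "2025-10-07 15:32:40 | DEBUG | [trace_id: d1339d844b78a63d9a2e2b2f4f726e25] | Flock.register_agent executed successfully",
]
grouped = group_by_trace(sample)
```

Lines without a trace ID are skipped, so ordinary application output does not pollute the groups.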

Using with Grafana

1. Start Grafana + Tempo (OTLP Collector)

# docker-compose.yml
version: '3'
services:
  tempo:
    image: grafana/tempo:latest
    command: [ "-config.file=/etc/tempo.yaml" ]
    volumes:
      - ./tempo.yaml:/etc/tempo.yaml
    ports:
      - "4317:4317"  # OTLP gRPC
      - "3200:3200"  # Tempo

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"

# tempo.yaml
server:
  http_listen_port: 3200

distributor:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: 0.0.0.0:4317

storage:
  trace:
    backend: local
    local:
      path: /tmp/tempo/traces

2. Run Your Agent with OTLP Export

export FLOCK_AUTO_TRACE=true
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
python your_agent.py

3. Query in Grafana

  • Open Grafana at http://localhost:3000
  • Add Tempo as a data source
  • Query by:
      • trace_id - View complete request trace
      • correlation_id - Group related agent executions
      • agent.name - Filter by specific agent
      • service.name=flock-auto-trace - All Flock traces

4. Create Dashboards

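Useful queries for Grafana panels. Treat these as illustrative: the metric and label names depend on how your backend derives metrics from spans, and dotted attribute names such as service.name generally have to be adapted to the query language you use.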

# Agent execution duration by agent name
histogram_quantile(0.95,
  rate(traces{service.name="flock-auto-trace", agent.name!=""}[5m])
)

# Error rate by agent
sum(rate(traces{service.name="flock-auto-trace", status.code="ERROR"}[5m]))
  by (agent.name)

# Traces by correlation ID
traces{correlation_id="12d0fcda-e7f7-4c96-ae8e-14ae4eca1518"}

Using with Jaeger

1. Start Jaeger

docker run -d --name jaeger \
  -e COLLECTOR_OTLP_ENABLED=true \
  -p 4317:4317 \
  -p 16686:16686 \
  jaegertracing/all-in-one:latest

2. Run Your Agent

export FLOCK_AUTO_TRACE=true
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
python your_agent.py

3. View Traces

  • Open Jaeger UI at http://localhost:16686
  • Select service: flock-auto-trace
  • Search by:
      • Agent name
      • Correlation ID
      • Time range

Skipping Methods from Tracing

Option 1: Use Environment Variables

Use environment variables to control tracing at runtime without code changes:

# Exclude specific operations
export FLOCK_TRACE_IGNORE='["MyComponent.noisy_helper"]'

This is preferred because:

  • ✅ No code changes required
  • ✅ Can be adjusted per environment (dev/staging/prod)
  • ✅ Easy to add/remove without modifying source

Option 2: Use @skip_trace Decorator

For methods that should never be traced in any environment:

from flock.components import AgentComponent  # module path assumed from components.py
from flock.logging.auto_trace import skip_trace

class MyComponent(AgentComponent):
    def important_method(self):
        # This will be traced (if class is whitelisted)
        pass

    @skip_trace
    def noisy_helper(self):
        # This will NEVER be traced, even if whitelisted
        pass

When to use @skip_trace:

  • Methods called thousands of times per second
  • Methods with sensitive data you never want logged
  • Internal helpers that provide no debugging value
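One common way to implement such an opt-out is a marker attribute that the wrapping step checks before decorating. The sketch below shows that pattern with hypothetical names (`skip_trace_sketch`, `maybe_trace`, and the `_skip_trace` attribute); whether Flock's `skip_trace` works this way internally is an assumption.

```python
import functools

TRACED_CALLS = []  # stand-in for exported spans


def skip_trace_sketch(func):
    """Mark a function so the wrapping step leaves it untouched."""
    func._skip_trace = True  # hypothetical marker attribute
    return func


def maybe_trace(func, span_name):
    """Wrap func with tracing unless it carries the skip marker."""
    if getattr(func, "_skip_trace", False):
        return func

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        TRACED_CALLS.append(span_name)
        return func(*args, **kwargs)
    return wrapper


class MyComponent:
    def important_method(self):
        return "important"

    @skip_trace_sketch
    def noisy_helper(self):
        return "noisy"


# Simulate what a tracing metaclass would do at class creation time.
for attr in ("important_method", "noisy_helper"):
    original = MyComponent.__dict__[attr]
    setattr(MyComponent, attr, maybe_trace(original, f"MyComponent.{attr}"))
```

Because the marked method is never wrapped, calling it carries no per-call tracing cost at all, which is the main difference from filtering through FLOCK_TRACE_IGNORE.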

Performance Considerations

  • Overhead: Each span adds ~0.1-0.5ms overhead
  • Console logging: DEBUG logs can slow down execution significantly
  • DuckDB export: Minimal overhead (~0.01ms per span)
  • OTLP export: Batched, minimal overhead (~0.02ms per span)
  • Filtering: Filter check happens before span creation, so filtered operations have near-zero overhead

For production:

  • Use filtering to trace only core services (recommended)
  • Use DuckDB export instead of OTLP for lower overhead
  • Disable DEBUG console logging
  • Use blacklist to exclude high-frequency operations

Example production config:

export FLOCK_AUTO_TRACE=true
export FLOCK_TRACE_FILE=true  # DuckDB only, no console spam
export FLOCK_TRACE_SERVICES='["agent", "flock"]'  # Core services only
export FLOCK_TRACE_IGNORE='["DashboardEventCollector.set_websocket_manager"]'

Troubleshooting

Trace IDs showing as "no-trace"

Cause: Telemetry not initialized

Fix: Ensure FLOCK_AUTO_TRACE=true is set before importing flock

OTLP connection timeout

Cause: OTLP endpoint not reachable

Fix: Don't set OTEL_EXPORTER_OTLP_ENDPOINT unless you have a collector running
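A quick pre-flight check can tell you whether anything is listening on the configured endpoint before you enable OTLP export. The sketch below only tests TCP reachability with the standard library; it does not verify that the listener actually speaks OTLP. The function name `otlp_endpoint_reachable` and the fallback to port 4317 are illustrative choices.

```python
import socket
from urllib.parse import urlparse


def otlp_endpoint_reachable(endpoint, timeout=1.0):
    """Return True if a TCP connection to the endpoint's host and port succeeds."""
    parsed = urlparse(endpoint)
    host = parsed.hostname
    if host is None:
        return False  # not a usable URL such as http://localhost:4317
    port = parsed.port or 4317  # fall back to the usual OTLP gRPC port
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A typical use is to call `otlp_endpoint_reachable(os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"])` in a startup script and unset the variable when it returns `False`.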

Too verbose logs

Cause: DEBUG level captures everything

Fix: Narrow what is traced with FLOCK_TRACE_SERVICES and FLOCK_TRACE_IGNORE, reduce console logging, or disable auto-tracing in production

Architecture

Auto-tracing uses a two-stage approach:

Stage 1: Method Wrapping (Class Creation Time)

  • Metaclasses: AutoTracedMeta and TracedModelMeta wrap all public methods at class creation time
  • Decorator: @traced_and_logged added to each public method
  • Applied to:
      • Agent (agent.py) - via AutoTracedMeta
      • Flock (orchestrator.py) - via AutoTracedMeta
      • AgentComponent (components.py) - via TracedModelMeta
      • All their subclasses inherit the metaclass

Stage 2: Filter Check (Runtime)

  • Filter Configuration: TraceFilterConfig loaded from environment variables at startup
  • Filter Check: Before creating spans, decorator checks should_trace(service, operation)
  • Early Exit: Filtered operations skip span creation entirely (near-zero overhead)

OTEL Span Creation

  • Context Propagation: Uses OTEL's start_as_current_span for parent-child relationships
  • Attribute Extraction: Automatically extracts agent name, correlation ID, etc. from method arguments
  • Error Handling: Records exceptions and sets span status
  • Output Capture: Serializes return values to JSON for debugging

Data Flow

Method Call
  → Filter Check (whitelist/blacklist)
    → If filtered: Execute method directly (no tracing)
    → If not filtered: Create OTEL span
      → Extract attributes (class, agent.name, correlation_id, etc.)
      → Execute method
      → Capture output
      → Set span status
      → Export to DuckDB/OTLP

Why This Matters for AI Development

When AI agents (like Claude) debug your code, they typically rely on printf-style debugging because they can't use interactive debuggers. Auto-tracing provides:

  1. Complete execution trace - See exactly what methods were called and in what order
  2. Correlation tracking - Group related operations across multiple agents
  3. Automatic context - No manual logging needed
  4. Visual debugging - View traces in Grafana/Jaeger for complex flows

This dramatically improves AI-assisted development by making execution flows transparent.

Example Output

$ export FLOCK_AUTO_TRACE=true && python examples/showcase/02_hello_flock.py

2025-10-07 15:32:40 | DEBUG | [trace_id: d1339d844b78a63d9a2e2b2f4f726e25] | Flock.register_agent executed successfully
2025-10-07 15:32:40 | DEBUG | [trace_id: d1339d844b78a63d9a2e2b2f4f726e25] | Flock.agent executed successfully
2025-10-07 15:32:40 | DEBUG | [trace_id: ae40f0061e3f1bcfebe169191d138078] | Flock.publish executed successfully
2025-10-07 15:32:40 | DEBUG | [trace_id: ae40f0061e3f1bcfebe169191d138078] | OutputUtilityComponent.on_initialize executed successfully
2025-10-07 15:32:40 | DEBUG | [trace_id: ae40f0061e3f1bcfebe169191d138078] | DSPyEngine.on_initialize executed successfully
...
✅ Movie and tagline generated!

Notice:

  • Each trace has a unique trace_id
  • Related operations share the same trace_id
  • Method names show full context: Class.method

Further Reading