
# API Reference

This section provides complete documentation for the reverse-attention Python API.

## Overview

The package has a simple, focused API:

| Class/Function | Purpose |
| --- | --- |
| `ReverseAttentionTracer` | Main class for tracing attention |
| `TraceResult` | Result container returned by a trace |
| `BeamPath` | A single attention path |
| `SankeyData` | Visualization data |
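For orientation, the result containers might be pictured as plain dataclasses. This is an illustrative sketch, not the package's actual definitions: the `TraceResult` attributes match those documented under Accessing Results below, but the internals of `BeamPath` and `SankeyData` (`positions`, `score`, `nodes`, `links`) are assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class BeamPath:
    positions: List[int]  # token positions the path visits (assumed layout)
    score: float          # normalized path score (assumed)

@dataclass
class SankeyData:
    nodes: List[str]                     # node labels (assumed)
    links: List[Tuple[int, int, float]]  # (source, target, weight) triples (assumed)

@dataclass
class TraceResult:
    seq_len: int             # Sequence length
    target_pos: int          # Resolved target position
    layer: int               # Resolved layer index
    tokens: List[str]        # All tokens in the sequence
    beams: List[BeamPath]    # Traced attention paths
    paths_text: List[str]    # Human-readable paths
    sankey: SankeyData       # Visualization data
```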

## Quick Reference

### Basic Usage

```python
from reverse_attention import ReverseAttentionTracer

# Create a tracer around an existing model and tokenizer
tracer = ReverseAttentionTracer(model, tokenizer)

# Trace from raw text
result = tracer.trace_text("Hello world")

# ...or from token IDs
result = tracer.trace(input_ids)

# Generate an HTML visualization
tracer.render_html(result, "output/")
```

### Key Parameters

```python
result = tracer.trace_text(
    text,
    target_pos=-1,              # Position to trace from
    layer=-1,                   # Layer to analyze
    top_beam=5,                 # Number of beams to keep
    top_k=5,                    # Predecessors considered per step
    min_attn=0.0,               # Minimum attention weight threshold
    agg_heads="mean",           # How attention heads are aggregated
    length_norm="avg_logprob",  # Score normalization
)
```
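The `length_norm="avg_logprob"` option suggests that a path's score is its mean per-step log-probability rather than the raw sum, so longer paths are not penalized merely for having more steps. A minimal sketch of that idea; the function below is illustrative, not part of the package:

```python
import math
from typing import List

def normalize_score(step_logprobs: List[float], length_norm: str = "avg_logprob") -> float:
    """Hypothetical score normalization.

    With 'avg_logprob' (a guess at the semantics), the path score is the
    mean of its per-step log-probabilities; otherwise the raw sum is used.
    """
    total = sum(step_logprobs)
    if length_norm == "avg_logprob":
        return total / len(step_logprobs)
    return total

# A two-step path whose steps carry attention weights 0.5 and 0.25:
scores = [math.log(0.5), math.log(0.25)]
avg = normalize_score(scores)  # mean log-prob across the two steps
```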

### Accessing Results

```python
# Metadata
result.seq_len       # Sequence length
result.target_pos    # Resolved target position
result.layer         # Resolved layer index
result.tokens        # All tokens in the sequence

# Beams
result.beams         # List[BeamPath]
result.paths_text    # Human-readable path strings

# Visualization
result.sankey        # SankeyData for rendering
```
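`paths_text` presumably renders each beam as a readable chain of tokens. A hypothetical sketch of such formatting, given the sequence's token list and the positions a beam visits (the `token[pos]` layout and arrow direction are assumptions, not the package's actual output):

```python
from typing import List

def format_path(tokens: List[str], positions: List[int]) -> str:
    """Render a traced path as 'token[pos] <- token[pos] <- ...',
    reading from the target token back toward its predecessors."""
    return " <- ".join(f"{tokens[p]}[{p}]" for p in positions)

tokens = ["Hello", " world"]
path = format_path(tokens, [1, 0])  # " world[1] <- Hello[0]"
```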

## Module Structure

```text
reverse_attention/
├── __init__.py          # Exports: ReverseAttentionTracer
├── tracer.py            # Main tracer class
├── beam.py              # Beam search + data classes
├── attn_extract.py      # Attention extraction
├── sankey.py            # Sankey conversion
├── tokenize.py          # Token utilities
└── utils.py             # Score normalization
```
