Installation

Requirements

  • Python 3.9 or higher
  • PyTorch 2.0 or higher
  • A HuggingFace transformer model

Install from PyPI

The simplest way to install:

pip install reverse-attention

Install from Source

For development or the latest features:

git clone https://github.com/ovshake/rat
cd rat
pip install -e .

Optional Dependencies

Development

For running tests and contributing:

pip install -e ".[dev]"

This installs pytest and pytest-cov.

Documentation

For building the documentation locally:

pip install -e ".[docs]"

This installs MkDocs, the Material theme, and mkdocstrings.

Verify Installation

To confirm the package installed correctly, run:

from reverse_attention import ReverseAttentionTracer
print("Installation successful!")
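If you would rather check the installed version without importing the package, the standard-library importlib.metadata lookup below is one option. The distribution name reverse-attention is taken from the pip install command above; the helper name is ours.

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(dist_name="reverse-attention"):
    """Return the installed version string, or None if not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None

print(installed_version() or "reverse-attention is not installed")
```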

GPU Support

The package automatically uses CUDA if available. To verify GPU support:

import torch
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"MPS available: {torch.backends.mps.is_available()}")

The tracer will automatically select the best available device:

  1. CUDA (NVIDIA GPU)
  2. MPS (Apple Silicon)
  3. CPU (fallback)
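The fallback order above can be sketched as a small helper. This is only an illustration of the selection logic using standard PyTorch checks, not the tracer's actual internals:

```python
import torch

def pick_device() -> str:
    """Pick a device in the documented preference order: CUDA, MPS, CPU."""
    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"

print(f"Selected device: {pick_device()}")
```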

You can also specify a device explicitly:

tracer = ReverseAttentionTracer(model, tokenizer, device="cuda")

Troubleshooting

SDPA Attention Error

If you see 'sdpa' attention does not support 'output_attentions=True':

# Fix: load the model with the eager attention implementation
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "model-name",
    attn_implementation="eager",
)

Recent versions of transformers (4.36 and later) default to the SDPA attention implementation, which does not support returning attention weights. Always pass attn_implementation="eager" when loading a model for use with this package.

Out of Memory

If you encounter OOM errors:

  1. Use a smaller model (e.g., Qwen/Qwen2-0.5B instead of larger variants)
  2. Reduce top_beam and top_k parameters
  3. Use shorter input sequences
  4. Enable half-precision: load model with torch_dtype=torch.float16
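To see why step 4 helps, note that float16 parameters occupy half the memory of float32. The sketch below demonstrates this on a plain PyTorch layer (the layer size is arbitrary and the param_bytes helper is ours); the same halving applies to a model loaded with torch_dtype=torch.float16:

```python
import torch
from torch import nn

def param_bytes(module: nn.Module) -> int:
    # Total bytes occupied by a module's parameters.
    return sum(p.numel() * p.element_size() for p in module.parameters())

fp32_layer = nn.Linear(1024, 1024)                    # float32 by default
fp16_layer = nn.Linear(1024, 1024).to(torch.float16)  # half precision

print(f"float32: {param_bytes(fp32_layer)} bytes")
print(f"float16: {param_bytes(fp16_layer)} bytes")
```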

Model Not Outputting Attention

Ensure your model supports output_attentions=True. Most HuggingFace models do, but some custom models may not. See Model Compatibility for details.