LLM Inference Pipeline

Prompt: "The cat sat" — watch how it flows through the system

📥 Input → 🔤 Tokenize → ⚡ Prefill → 🔄 Decode → 📤 Output

Decode loop: 0 / 3

CPU: idle
RAM: idle
GPU Compute: idle
VRAM / KV Cache: idle

Tokens: waiting for input...

KV Cache entries: empty

Click Auto to run the full pipeline, or use Forward / Back to walk through one stage at a time.
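
For readers without the interactive view, the stages above (tokenize, prefill, a three-step decode loop, and a growing KV cache) can be sketched roughly in Python. This is a toy illustration only, not how any particular inference engine is implemented: the vocabulary, the `KVCache` class, the `next_token` rule, and `run_pipeline` are all hypothetical names invented for this example, and the "model" is a fixed lookup rather than a neural network.

```python
from dataclasses import dataclass, field

# Hypothetical toy vocabulary; real tokenizers use subword units, not whole words.
VOCAB = ["the", "cat", "sat", "on", "a", "mat", "."]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}


@dataclass
class KVCache:
    # One entry per token position seen so far. A real cache stores
    # per-layer key/value tensors in VRAM; here it is (position, token_id).
    entries: list = field(default_factory=list)

    def append(self, token_id: int) -> None:
        self.entries.append((len(self.entries), token_id))


def tokenize(prompt: str) -> list[int]:
    # Tokenize stage: map whitespace-separated words to integer IDs.
    return [TOKEN_TO_ID[word] for word in prompt.lower().split()]


def next_token(cache: KVCache) -> int:
    # Toy stand-in for the model's forward pass: pick the vocabulary
    # entry that follows the most recently cached token.
    last_id = cache.entries[-1][1]
    return (last_id + 1) % len(VOCAB)


def prefill(token_ids: list[int], cache: KVCache) -> None:
    # Prefill stage: process the whole prompt and populate the cache.
    # (Real engines do this in one parallel pass and also obtain the first
    # output token here; this sketch leaves that to the decode loop.)
    for token_id in token_ids:
        cache.append(token_id)


def decode(cache: KVCache, steps: int) -> list[int]:
    # Decode stage: one new token per iteration, each step reusing the
    # cached entries instead of reprocessing the whole sequence.
    generated = []
    for _ in range(steps):
        token_id = next_token(cache)
        generated.append(token_id)
        cache.append(token_id)
    return generated


def run_pipeline(prompt: str, steps: int = 3) -> tuple[str, int]:
    cache = KVCache()
    prefill(tokenize(prompt), cache)
    output_ids = decode(cache, steps)
    # Output stage: map IDs back to text.
    return " ".join(VOCAB[i] for i in output_ids), len(cache.entries)


if __name__ == "__main__":
    text, cache_size = run_pipeline("The cat sat", steps=3)
    print(text)        # on a mat
    print(cache_size)  # 6 (3 prompt tokens + 3 generated tokens)
```

The point of the sketch is the shape of the loop: the prompt is processed once during prefill, and each decode step adds exactly one entry to the cache rather than recomputing earlier positions.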