LLM Inference Pipeline

Prompt: "The cat sat" — watch how it flows through the system

Stages: 📥 Input → 🔤 Tokenize → ⚡ Prefill → 🔄 Decode → 📤 Output
loop 0 / 3
CPU: idle
RAM: idle
GPU Compute: idle
VRAM / KV Cache: idle
Tokens: Waiting for input...
KV Cache entries: empty
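
The stages named above can be sketched as a toy program. This is a minimal illustration under stated assumptions, not how any real inference engine is implemented: the whitespace tokenizer, the string placeholders for keys and values, the hard-coded continuation "on the mat", and every name in it (KVCache, tokenize, prefill, decode, run_pipeline) are hypothetical. It only shows the shape of the flow: the cache starts empty, prefill adds one entry per prompt token, and each of the 3 decode loops adds one more.

```python
from dataclasses import dataclass, field


@dataclass
class KVCache:
    """Toy stand-in for a KV cache: one (key, value) entry per token."""
    entries: list = field(default_factory=list)

    def append(self, token):
        # A real cache holds per-layer key/value tensors in VRAM;
        # placeholder strings are used here for illustration only.
        self.entries.append((f"K({token})", f"V({token})"))


def tokenize(prompt):
    # Assumption: split on whitespace. Real tokenizers use subword vocabularies.
    return prompt.split()


def prefill(tokens, cache):
    # Prefill handles the whole prompt in one pass and caches an entry per token.
    for tok in tokens:
        cache.append(tok)


def decode(cache, continuation, steps):
    # Decode produces one token per loop iteration, reusing the cached entries.
    # Simplification: the new token is cached as soon as it is produced; real
    # engines cache it when the token is fed back in on the following step.
    generated = []
    for step in range(steps):
        tok = continuation[step]  # placeholder for model forward pass + sampling
        cache.append(tok)
        generated.append(tok)
    return generated


def run_pipeline(prompt, steps=3):
    cache = KVCache()                      # KV Cache entries: empty
    tokens = tokenize(prompt)              # Tokenize
    prefill(tokens, cache)                 # Prefill
    generated = decode(cache, ["on", "the", "mat"], steps)  # Decode, 3 loops
    return " ".join(tokens + generated), len(cache.entries)  # Output


if __name__ == "__main__":
    text, cache_size = run_pipeline("The cat sat")
    print(text)        # The cat sat on the mat
    print(cache_size)  # 6
```

With the prompt "The cat sat", the toy cache grows from 0 to 3 entries during prefill and to 6 after three decode loops, which is why the decode stage is the one that repeats.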