Oracle interview question

Describe 3 different optimisations applied to LLM inference.

Interview Answer

Anonymous

7 Jul 2025

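KV caching (reuse the attention keys and values of earlier tokens instead of recomputing them at every decoding step), speculative decoding (a small draft model proposes several tokens, which the large model then verifies in a single parallel pass), and operator fusion (merge adjacent operations into one kernel to cut memory traffic and kernel-launch overhead).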
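
To make the first of these concrete, here is a minimal sketch of KV caching for a single attention head in plain Python. It is an illustration under stated assumptions, not how any particular inference engine implements it: the names `KVCache`, `project`, `attend`, `decode_step_cached` and `decode_step_uncached` are all hypothetical, the weights are toy 2x2 matrices, and a real system would use batched tensors on an accelerator. The point it shows is that the cached step projects only the newest token, while the uncached step re-projects the whole prefix, yet both return the same attention output.

```python
import math


def project(x, w):
    """Multiply vector x (length n) by matrix w (n rows, m columns)."""
    return [sum(xi * row[j] for xi, row in zip(x, w)) for j in range(len(w[0]))]


def attend(q, keys, values):
    """Scaled dot-product attention for one query over all given keys/values."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total for d in range(dim)]


class KVCache:
    """Hypothetical cache holding one key and one value vector per past token."""

    def __init__(self):
        self.keys = []
        self.values = []

    def append(self, key, value):
        self.keys.append(key)
        self.values.append(value)


def decode_step_cached(x, wq, wk, wv, cache):
    """Project only the newest token, then attend over the cached prefix."""
    cache.append(project(x, wk), project(x, wv))
    return attend(project(x, wq), cache.keys, cache.values)


def decode_step_uncached(xs, wq, wk, wv):
    """Baseline: re-project every token in the prefix at each step."""
    keys = [project(x, wk) for x in xs]
    values = [project(x, wv) for x in xs]
    return attend(project(xs[-1], wq), keys, values)


# Toy weights and token embeddings (made-up numbers, for illustration only).
WQ = [[0.5, -0.2], [0.1, 0.3]]
WK = [[0.4, 0.1], [-0.3, 0.2]]
WV = [[1.0, 0.0], [0.5, -1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

cache = KVCache()
for t, token in enumerate(tokens):
    cached = decode_step_cached(token, WQ, WK, WV, cache)
    full = decode_step_uncached(tokens[: t + 1], WQ, WK, WV)
    # Both paths should agree; the cached one did far fewer projections.
    assert all(abs(a - b) < 1e-9 for a, b in zip(cached, full))
```

In this sketch the uncached path performs a number of key/value projections that grows with the prefix length at every step, whereas the cached path performs exactly one pair per step, at the cost of memory that grows linearly with sequence length.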