Best Results per LLM
This page filters all results by LLM, Workload, and Quant (quantization), then shows the fastest tested setup for each GPU, CPU, or combined configuration.
For more context, see Systems and All Results.
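The filter-then-pick-fastest behaviour described above can be sketched roughly as follows. This is only an illustration, not the site's actual code: the `Result` record, its field names, and the `best_per_config` helper are all hypothetical, and a "configuration" is assumed to be identified by a plain string label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    # Hypothetical record layout; the real schema is not given in the text.
    llm: str
    workload: str
    quant: str
    config: str           # label for a GPU, CPU, or combined configuration
    total_seconds: float  # total time for the run (lower is faster)


def best_per_config(results, llm, workload, quant):
    """Keep only results matching the filters, then the fastest one per config."""
    best = {}
    for r in results:
        if (r.llm, r.workload, r.quant) != (llm, workload, quant):
            continue  # does not match the selected filters
        current = best.get(r.config)
        if current is None or r.total_seconds < current.total_seconds:
            best[r.config] = r
    # Fastest configuration first, matching "shorter bars are better".
    return sorted(best.values(), key=lambda r: r.total_seconds)
```

If no result matches the selected filters, the function returns an empty list, which corresponds to the page having nothing to display.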
Each bar shows the total time taken to process a prompt of the selected workload length and then generate 500 tokens. Shorter bars mean faster results. Click a bar to open its detail page.
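As a rough illustration of what such a total represents, the sketch below adds a prompt-processing time to the time needed to generate 500 tokens. It assumes both phases can be summarised by a constant tokens-per-second rate, which is a simplification; the function and parameter names are hypothetical, and the page may measure the total directly rather than derive it this way.

```python
GENERATED_TOKENS = 500  # number of generated tokens stated in the description


def total_time_seconds(prompt_tokens, prompt_tokens_per_s, gen_tokens_per_s,
                       generated_tokens=GENERATED_TOKENS):
    """Estimate total time as prompt-processing time plus generation time.

    Assumes constant throughput in each phase (an illustrative simplification).
    """
    if prompt_tokens_per_s <= 0 or gen_tokens_per_s <= 0:
        raise ValueError("throughput values must be positive")
    prompt_time = prompt_tokens / prompt_tokens_per_s
    generation_time = generated_tokens / gen_tokens_per_s
    return prompt_time + generation_time
```

For example, with made-up numbers, a 1000-token prompt processed at 500 tokens/s followed by generation at 25 tokens/s gives 2 s + 20 s = 22 s in total.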