Public GPU and NPU uploads
NVIDIA RTX 3060 8 GB
3584 CUDA cores
Type: GPU
VRAM: 8 GB
Memory bandwidth: 240 GB/s
TDP: 170 W
Benchmark results
| Llama 2 7B | Q4_0 | 512 | 1,815.70 | 282.00 ms | 76 | llama.cpp | Vulkan | — | uploaded 4 weeks ago |
Standardized test: Llama-Bench
Used prompt: llama-bench -p 512 -n 128
Notes: llama-bench / Vulkan scoreboard; Flash Attention disabled
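
The figures in the result row can be cross-checked with a little arithmetic. The sketch below is only illustrative and rests on an assumption: the table's column headers are not shown here, so it assumes that 512 is the prompt length (matching `-p 512`), 1,815.70 is prompt-processing throughput in tokens per second, 282.00 ms is the time taken to process the prompt, and 76 is generation throughput in tokens per second. The variable and function names are made up for this example.

```python
# Values copied from the benchmark row. The meaning assigned to each
# value is an ASSUMPTION, because the column headers are not available.
prompt_tokens = 512            # assumed: prompt length, as in `-p 512`
prompt_tokens_per_s = 1815.70  # assumed: prompt-processing throughput
reported_latency_ms = 282.00   # assumed: time to process the prompt
gen_tokens = 128               # generated tokens, as in `-n 128`
gen_tokens_per_s = 76          # assumed: generation throughput


def processing_time_ms(tokens: int, tokens_per_s: float) -> float:
    """Time in milliseconds to handle `tokens` at a steady throughput."""
    return tokens / tokens_per_s * 1000.0


derived_latency_ms = processing_time_ms(prompt_tokens, prompt_tokens_per_s)
derived_gen_time_ms = processing_time_ms(gen_tokens, gen_tokens_per_s)

print(f"derived prompt time:  {derived_latency_ms:.2f} ms")
print(f"reported prompt time: {reported_latency_ms:.2f} ms")
print(f"derived time for {gen_tokens} generated tokens: "
      f"{derived_gen_time_ms / 1000.0:.2f} s")
```

Under these assumptions the derived prompt time lands within a small fraction of a millisecond of the reported 282.00 ms, which suggests the latency column is simply the prompt length divided by the prompt throughput.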