"Not all quantized model perform good", serving framework ollama uses NVIDIA gpu, llama.cpp uses CPU with AVX & AMX
v1k
xbruce22
AI & ML interests
None yet
Recent Activity
liked a model 2 days ago: Qwen/Qwen3.5-2B
liked a dataset 9 days ago: stepfun-ai/Step-3.5-Flash-SFT
liked a Space 13 days ago: Presidentlin/llm-pricing-calculator