GSQ: Highly-Accurate Low-Precision Scalar Quantization for LLMs via Gumbel-Softmax Sampling, https://huggingface.co/papers/2604.18556
- GSQ: Highly-Accurate Low-Precision Scalar Quantization for LLMs via Gumbel-Softmax Sampling
  Paper • 2604.18556 • Published • 7
- ISTA-DASLab/Kimi-K2.6-2Bit-GSQ
  Image-Text-to-Text • 84B • Updated • 105
- ISTA-DASLab/Kimi-K2.5-2Bit-GSQ
  Image-Text-to-Text • 84B • Updated • 128
- ISTA-DASLab/Llama-3.1-70B-Instruct-2Bit-GSQ
  Text Generation • 7B • Updated • 270