| Topic | Replies | Views | Date |
|---|---|---|---|
| Wave Field LLM — O(n log n) attention via wave equation dynamics, within 5% of standard transformer | 0 | 10 | February 18, 2026 |
| LLaVA Steering: Why does grounding fix hallucinations in captioning but not in Yes/No QA? | 0 | 5 | February 18, 2026 |
| Issue: Hidden State Steering Improves Generative Grounding (CHAIR) but Fails on Yes/No Probing (POPE) | 0 | 3 | February 18, 2026 |
| Gemma 3 12B: 4-bit Quantization failing/ignored in Transformers v5.1.0 (Gemma3ForConditionalGeneration) | 7 | 28 | February 17, 2026 |
| KV caching problem with Gemma 3 | 2 | 18 | February 17, 2026 |
| num_beam_groups removed in v5? | 1 | 10 | February 14, 2026 |
| [LLaVA-1.5] Implementing Control Barrier Functions (LCBF) via Attention Hooking – Persistent AttributeError: 'LlamaAttention' object has no attribute 'rotary_emb' | 4 | 9 | February 13, 2026 |
| Error while importing "Trainer" | 1 | 25 | February 13, 2026 |
| [LLaVA-1.5] Very low hallucination rate & weak attention correlation in "Attention Gap" experiment – Is my implementation of output_attentions correct? | 4 | 19 | February 12, 2026 |
| Confusion with freezing Whisper's feature encoder | 3 | 13 | February 11, 2026 |
| When using Whisper, pipeline notifies that generation_config default values have been modified, even for base models | 4 | 34 | February 8, 2026 |
| Hyperparameters vs. message-format prompt tuning | 2 | 26 | February 6, 2026 |
| SFT Conversation llama3-8b-Instruct fails with assistant_only_loss=True | 2 | 54 | February 5, 2026 |
| How to train T5 to distinguish task-relevant tokens from contextual noise? | 1 | 19 | February 5, 2026 |
| Fine-tuning Whisper: attention mask not set and cannot be inferred | 5 | 6180 | February 4, 2026 |
| Abnormal generation after multi-GPU | 4 | 37 | February 4, 2026 |
| 500 Internal Error – We're working hard to fix this as soon as possible | 46 | 3160 | February 1, 2026 |
| Caching image prototype embeddings for image-guided object detection using OWL-ViT | 3 | 494 | January 31, 2026 |
| [Question] How to specify 'model_type' of 'Qwen/Qwen3-VL-8B-Instruct-GGUF'? | 4 | 47 | January 30, 2026 |
| SAM3Video: CLIPTextModelOutput passed as tensor causes crash with text prompts | 0 | 40 | January 29, 2026 |
| Different lm_head size and vocab_size | 1 | 918 | January 28, 2026 |
| Custom KV Cache Steering Implementation Fails with IndexError in LLaVA Generation | 1 | 17 | January 28, 2026 |
| Transformers v5 timelines | 1 | 39 | January 28, 2026 |
| Issue: Discrepancy Between Layer-Wise Density Plots vs. Mean Trajectory Plots in LLaVA-1.5 Attention Analysis | 2 | 18 | January 25, 2026 |
| [Discussion] Validating Attention Map Visualization for Visual Fading in LLaVA-1.5 | 4 | 45 | January 23, 2026 |
| No fix for high vulnerabilities in latest transformers package | 2 | 36 | January 22, 2026 |
| How to disable caching in .from_pretrained() | 4 | 1272 | January 18, 2026 |
| DetLLM – Deterministic Inference Checks | 0 | 26 | January 17, 2026 |
| Distributed LLaMA Inference Engine Built from Scratch (KV Cache, GQA, RoPE) | 0 | 29 | January 16, 2026 |
| Run name issue: different run name file in web page vs. local | 1 | 91 | January 16, 2026 |