Accessing LLM, response without <think> start tag
When accessing the LLM, the response comes back without the <think> start tag, for example:
======================request=====================
hello
======================response====================
Hmm, the user just sent "hello" - that's a pretty minimal opening.
First impression: they're probably testing the waters or starting a casual conversation. Could be a new user unsure how to engage, or someone busy just dropping a quick greeting. No context about their needs yet.
I should keep it warm but open-ended. A simple mirrored greeting feels right - "Hello!" matches their casual tone. Then add an invitation to guide them: "How can I assist you today?" gives them an easy way to steer the conversation without pressure.
...Wait, is there any chance they need urgent help? Unlikely with just "hello," but I'll keep the tone ready for anything. Better not overcomplicate this - they'll share more if they want to.
End with the glasses emoji - it's friendly but professional. Keeps the door wide open for whatever comes next.Hello! How can I assist you today? 😊
https://github.com/vllm-project/vllm/issues/31319
One can use the deepseek_r1 reasoning parser as a temporary fix.
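For example, a minimal sketch of launching the server with that parser (the model name below is just a placeholder; depending on the vLLM version, --enable-reasoning may also be required):

  # placeholder model; the relevant part is the parser flag
  vllm serve Qwen/Qwen3-8B --reasoning-parser deepseek_r1

The deepseek_r1 parser tolerates a missing opening <think> tag, so text before </think> is still routed into reasoning_content and the rest into content.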
Cool, thank you!
Thanks! But deepseek_r1 only works in thinking mode:
"chat_template_kwargs": {
"enable_thinking": true
}
but with "enable_thinking": false, the output is all reasoning, not content.
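For reference, a minimal sketch of a request that reproduces this against a local vLLM OpenAI-compatible endpoint (the URL and model name are placeholders):

  # with --reasoning-parser deepseek_r1 active, the reply to this request
  # lands entirely in reasoning_content instead of content
  curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
          "model": "Qwen/Qwen3-8B",
          "messages": [{"role": "user", "content": "hello"}],
          "chat_template_kwargs": {"enable_thinking": false}
        }'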