Accessing the LLM, response without `<think>` start tag

#2 · opened by sudage

When accessing the LLM, the response is missing the `<think>` start tag. For example:

======================request=====================
hello
======================response====================
Hmm, the user just sent "hello" - that's a pretty minimal opening.

First impression: they're probably testing the waters or starting a casual conversation. Could be a new user unsure how to engage, or someone busy just dropping a quick greeting. No context about their needs yet.

I should keep it warm but open-ended. A simple mirrored greeting feels right - "Hello!" matches their casual tone. Then add an invitation to guide them: "How can I assist you today?" gives them an easy way to steer the conversation without pressure.

...Wait, is there any chance they need urgent help? Unlikely with just "hello," but I'll keep the tone ready for anything. Better not overcomplicate this - they'll share more if they want to.

End with the glasses emoji - it's friendly but professional. Keeps the door wide open for whatever comes next.</think>Hello! How can I assist you today? 😊

QuantTrio org

https://github.com/vllm-project/vllm/issues/31319

One can use the deepseek_r1 reasoning parser as a temporary fix.
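
A minimal sketch of that workaround, assuming vLLM's OpenAI-compatible server on the default port and a placeholder model name (the `--reasoning-parser deepseek_r1` flag selects the parser at serve time):

```python
# Launch the server with the DeepSeek-R1 reasoning parser, e.g.:
#   vllm serve QuantTrio/your-model --reasoning-parser deepseek_r1
# This parser tolerates a missing opening <think> tag: everything up to
# </think> is treated as reasoning, everything after it as the final answer.

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="QuantTrio/your-model",  # placeholder; use the served model name
    messages=[{"role": "user", "content": "hello"}],
)

msg = resp.choices[0].message
# With the parser active, the reasoning is split out of the visible answer:
print("reasoning:", getattr(msg, "reasoning_content", None))
print("content:  ", msg.content)
```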

Cool, thank you.

Thanks! But deepseek_r1 works for thinking mode:

```json
"chat_template_kwargs": {
  "enable_thinking": true
}
```

But with `"enable_thinking": false`, the output is parsed entirely as reasoning, with no content.
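
For reference, here is how that flag reaches the server, in a sketch that assumes vLLM's OpenAI-compatible API (which accepts `chat_template_kwargs` through the request's `extra_body`) and the same placeholder model name:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Disable thinking via the chat template. With the deepseek_r1 parser still
# active, no </think> is ever emitted, so the parser keeps the entire output
# in reasoning_content, which is the behavior reported above.
resp = client.chat.completions.create(
    model="QuantTrio/your-model",  # placeholder; use the served model name
    messages=[{"role": "user", "content": "hello"}],
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)

msg = resp.choices[0].message
print("reasoning:", getattr(msg, "reasoning_content", None))  # full answer lands here
print("content:  ", msg.content)                              # empty
```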
