@handsomelcx

Your diagnosis is right. This is a server-side schema validation issue, not something related to Cherry Studio, OpenWebUI, or Qwen3 reasoning mode. The failure happens before inference even starts.

Turn 1 works because the request contains only a user message. Turn 2 fails because the request now includes the assistant message from the chat history, and Pydantic rejects that message when validating it against ChatCompletionRequestAssistantMessage.

In the current fork, llama_cpp/llama_types.py defines the assistant message like this:

class ChatCompletionRequestAssistantMessage(TypedDict):
    role: Literal["assistant"]
    name: Optional[str]
    content: NotRequired[O…

Answer selected by handsomelcx