bl omni --audio always 400: sends audio_url, OpenAI-compatible endpoint requires input_audio
Summary
bl omni --audio <file> fails for every Qwen-Omni model with HTTP 400. The CLI builds the audio content part as type audio_url, but the DashScope OpenAI-compatible /compatible-mode/v1/chat/completions endpoint only accepts input_audio for audio. So audio transcription via the CLI is completely broken.
Environment
- bailian-cli 1.3.0
- Windows 11, Node (npm global install)
- Auth: api-key from ~/.bailian/config.json
Repro
bl omni --model qwen3.5-omni-flash --audio ./clip.wav --text-only \
--system "transcribe this audio verbatim" --message "transcribe this audio." --output json
Actual
{
  "error": {
    "code": 1,
    "message": "Invalid value: audio_url. Supported values are: 'text','image_url','video_url' and 'video'.",
    "http_status": 400,
    "api_code": "invalid_request_error"
  }
}
Reproduced identically on qwen3.5-omni-flash, qwen3-omni-flash, qwen-omni-turbo, and qwen2.5-omni-7b, so the failure is not model-specific; it comes from the request shape.
Root cause
The CLI almost certainly emits:
{ "type": "audio_url", "audio_url": { "url": "data:audio/wav;base64,..." } }
The OpenAI-compatible Qwen-Omni API instead expects the input_audio shape:
{ "type": "input_audio", "input_audio": { "data": "data:audio/wav;base64,...", "format": "wav" } }
Fix
Change the audio content part builder in the omni command to emit input_audio instead of audio_url, with { data, format } (format inferred from the file extension: wav/mp3/opus/pcm). I verified that the input_audio shape works against the same endpoint and key with a hand-rolled streaming request (text-only modality + stream_options.include_usage) across all four models above, so only the CLI's content-part construction needs to change.
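For illustration, here is a minimal sketch of what the corrected builder could look like. I have not read the bailian-cli source, so every name here (AudioFormat, InputAudioPart, inferAudioFormat, buildAudioPart) is hypothetical, and the audio/<format> MIME subtype in the data URL simply mirrors the wav example above rather than anything the API documents for the other formats.

```typescript
// Hypothetical sketch: none of these names come from the bailian-cli source.

type AudioFormat = "wav" | "mp3" | "opus" | "pcm";

interface InputAudioPart {
  type: "input_audio";
  input_audio: { data: string; format: AudioFormat };
}

// Formats listed in this report; the real set accepted by the API may differ.
const SUPPORTED_FORMATS: readonly AudioFormat[] = ["wav", "mp3", "opus", "pcm"];

// Infer the audio format from the file extension (case-insensitive).
function inferAudioFormat(filePath: string): AudioFormat {
  const dot = filePath.lastIndexOf(".");
  const ext = dot >= 0 ? filePath.slice(dot + 1).toLowerCase() : "";
  const match = SUPPORTED_FORMATS.find((f) => f === ext);
  if (match === undefined) {
    throw new Error(`Unsupported audio extension: "${ext}"`);
  }
  return match;
}

// Build the content part in the input_audio shape instead of audio_url.
// base64Audio is the already base64-encoded file content.
function buildAudioPart(filePath: string, base64Audio: string): InputAudioPart {
  const format = inferAudioFormat(filePath);
  return {
    type: "input_audio",
    input_audio: {
      // Assumption: data-URL prefix follows the wav example in this report.
      data: `data:audio/${format};base64,${base64Audio}`,
      format,
    },
  };
}
```

In the CLI the base64 string would presumably come from something like fs.readFileSync(path).toString("base64"); I kept file I/O out of the sketch so the builder stays a pure function that is easy to unit test.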
Ref: https://help.aliyun.com/zh/model-studio/qwen-omni (OpenAI-compatible audio input uses input_audio).