Which vLLM version should I use to run fun-asr-nano? #2968

When using vLLM to accelerate FunASR, which vLLM version is recommended?

## Before asking

1. Search existing issues: https://github.com/modelscope/FunASR/issues
2. Search the docs: https://modelscope.github.io/FunASR/
3. Check the README quick start and deployment section.

## Question

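1. I started the server with `funasr-server --model fun-asr-nano --port 8000`.
2. At localhost:8000/docs, I used the "Try it out" button for /v1/audio/transcriptions to send a WAV audio file to the server.
3. After funasr-server starts, the first WAV file is transcribed correctly. The server usually (but not always) raises an error when the second WAV file is sent.
4. The paraformer and sensevoice models have never shown this problem. fun-asr-nano also works when vllm is not installed (except that the model seems to be reloaded for every WAV file, which is slow).

The server error from step 3 above is as follows: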
```log
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "funasr/lib/python3.12/site-packages/uvicorn/protocols/http/httptools_impl.py", line 421, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/uvicorn/middleware/proxy_headers.py", line 63, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/fastapi/applications.py", line 1159, in __call__
    await super().__call__(scope, receive, send)
  File "funasr/lib/python3.12/site-packages/starlette/applications.py", line 90, in __call__
    await self.middleware_stack(scope, receive, send)
  File "funasr/lib/python3.12/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "funasr/lib/python3.12/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "funasr/lib/python3.12/site-packages/starlette/middleware/exceptions.py", line 63, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "funasr/lib/python3.12/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "funasr/lib/python3.12/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "funasr/lib/python3.12/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "funasr/lib/python3.12/site-packages/starlette/routing.py", line 660, in __call__
    await self.middleware_stack(scope, receive, send)
  File "funasr/lib/python3.12/site-packages/starlette/routing.py", line 680, in app
    await route.handle(scope, receive, send)
  File "funasr/lib/python3.12/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "funasr/lib/python3.12/site-packages/fastapi/routing.py", line 134, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "funasr/lib/python3.12/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "funasr/lib/python3.12/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "funasr/lib/python3.12/site-packages/fastapi/routing.py", line 120, in app
    response = await f(request)
               ^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/fastapi/routing.py", line 674, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/fastapi/routing.py", line 328, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/funasr/bin/_server_app.py", line 204, in transcribe
    result = _process_vllm(audio_data, sr, language=language, use_spk=spk)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/funasr/bin/_server_app.py", line 142, in _process_vllm
    results = app.state.engine.generate(inputs=seg_audios, **gen_kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/funasr/models/fun_asr_nano/inference_vllm.py", line 609, in generate
    outputs = self.vllm_engine.generate(prompts, sampling_params, use_tqdm=len(inputs) > 1)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/vllm/entrypoints/llm.py", line 500, in generate
    return self._run_completion(
           ^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/vllm/entrypoints/llm.py", line 1651, in _run_completion
    return self._run_engine(use_tqdm=use_tqdm, output_type=output_type)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/vllm/entrypoints/llm.py", line 1861, in _run_engine
    step_outputs = self.llm_engine.step()
                   ^^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/vllm/v1/engine/llm_engine.py", line 295, in step
    outputs = self.engine_core.get_output()
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/vllm/v1/engine/core_client.py", line 793, in get_output
    raise self._format_exception(outputs) from None
vllm.v1.engine.exceptions.EngineDeadError: EngineCore encountered an issue. See stack trace (above) for the root cause.
[rank0]:[W610 12:53:34.768469318 ProcessGroupNCCL.cpp:1575] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())

```

## Code or command

```bash
funasr-server --model fun-asr-nano --port 8000
```
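Steps 1 to 3 of the question can also be exercised without the /docs page. The script below is a minimal sketch for reproducing the failure by sending several WAV files back to back. It assumes the server is reachable at `http://localhost:8000` and that the upload field is named `file` (as in OpenAI-style transcription APIs); neither detail is confirmed by the FunASR docs quoted here, and `build_multipart` and `transcribe` are hypothetical helper names.

```python
import sys
import urllib.request
import uuid
from pathlib import Path

# Assumption: the server started in step 1 is listening at this address.
URL = "http://localhost:8000/v1/audio/transcriptions"


def build_multipart(field: str, filename: str, payload: bytes) -> tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: audio/wav\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(wav_path: str, url: str = URL) -> str:
    """POST one WAV file to the server and return the raw response body."""
    path = Path(wav_path)
    # Assumption: the form field is called "file".
    body, content_type = build_multipart("file", path.name, path.read_bytes())
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        return resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Send every file given on the command line, one request after another.
    for i, wav in enumerate(sys.argv[1:], start=1):
        try:
            print(f"request {i} ({wav}): {transcribe(wav)}")
        except Exception as exc:  # an HTTP 500 surfaces as urllib.error.HTTPError
            print(f"request {i} ({wav}) failed: {exc!r}")
```

Saved under a name of your choice and run with two or more WAV paths as arguments, it should show whether the second request fails in the same way as through the /docs page.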

## What have you tried?



## Environment

- OS: Ubuntu 26.04
- Python version: 3.12.13
- FunASR version: 1.3.9
- ModelScope version: 1.37.1
- PyTorch / torchaudio version: 2.11.0+cu130
- Install method (`pip`, source, Docker): pip, plus uv (for vllm)
- Device (`cuda`, `cpu`, `mps`): cuda
- GPU model: RTX 5070ti 16G
- CUDA/cuDNN version: 13.3
- Docker image tag, if used:
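The list above does not include the vLLM version that was installed, which is the key detail for this question. A minimal sketch for collecting it, assuming the packages are installed under their usual distribution names (`vllm`, `funasr`, `torch`, `torchaudio`, `modelscope`); `installed_versions` is a hypothetical helper name:

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: these are the distribution names used by pip/uv in this environment.
PACKAGES = ["vllm", "funasr", "torch", "torchaudio", "modelscope"]


def installed_versions(names):
    """Map each distribution name to its installed version, or 'not installed'."""
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = "not installed"
    return result


if __name__ == "__main__":
    # Print in the same bullet style as the Environment section.
    for name, ver in installed_versions(PACKAGES).items():
        print(f"- {name}: {ver}")
```

Run it with the same Python interpreter that launches `funasr-server`, so the reported versions match the ones the server actually loads.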
