Which vllm version should be used to run fun-asr-nano? #2968

@ZhangQi-HUST

Description

When using vllm to accelerate FunASR, which vllm version is recommended?

Before asking

  1. Search existing issues: https://github.com/modelscope/FunASR/issues
  2. Search the docs: https://modelscope.github.io/FunASR/
  3. Check the README quick start and deployment section.

Question

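1. I started the server with funasr-server --model fun-asr-nano --port 8000.
2. In localhost:8000/docs, I used the try button on /v1/audio/transcriptions to send a WAV audio file to the server.
3. After funasr-server starts, the first WAV file is transcribed correctly. The server usually (but not always) raises an error when the second WAV file is sent.
4. The paraformer and sensevoice models have never shown this problem. fun-asr-nano also works when vllm is not installed (except that the model seems to be reloaded for every WAV file, which is a bit slow).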
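
The steps above can also be reproduced without the browser. The following is a minimal sketch, not the project's own client: it posts WAV files one after another to the endpoint named in step 2 and stops at the first failure. The multipart field name "file" is an assumption (it follows the usual OpenAI-style transcription API), and the helper names are hypothetical.

```python
import sys
import uuid
import urllib.request
from pathlib import Path

# URL taken from the steps above; adjust host and port if the server runs elsewhere.
URL = "http://localhost:8000/v1/audio/transcriptions"


def build_multipart(field, filename, data, content_type="audio/wav"):
    """Encode one file as a multipart/form-data body; return (body, header value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def send_wav(path, url=URL, field="file"):
    """POST a single WAV file. The field name "file" is an assumption."""
    p = Path(path)
    body, ctype = build_multipart(field, p.name, p.read_bytes())
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": ctype}, method="POST"
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


def reproduce(paths):
    """Send files sequentially; record (index, status, text) and stop on the first error."""
    results = []
    for i, path in enumerate(paths, 1):
        try:
            results.append((i, *send_wav(path)))
        except Exception as exc:  # e.g. HTTP 500 once the engine has died
            results.append((i, None, repr(exc)))
            break
    return results


if __name__ == "__main__":
    for row in reproduce(sys.argv[1:]):
        print(row)
```

Passing the same WAV file twice on the command line should be enough to check whether the second request fails as described in step 3.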

The server error from step 3 above is as follows:

ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "funasr/lib/python3.12/site-packages/uvicorn/protocols/http/httptools_impl.py", line 421, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/uvicorn/middleware/proxy_headers.py", line 63, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/fastapi/applications.py", line 1159, in __call__
    await super().__call__(scope, receive, send)
  File "funasr/lib/python3.12/site-packages/starlette/applications.py", line 90, in __call__
    await self.middleware_stack(scope, receive, send)
  File "funasr/lib/python3.12/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "funasr/lib/python3.12/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "funasr/lib/python3.12/site-packages/starlette/middleware/exceptions.py", line 63, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "funasr/lib/python3.12/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "funasr/lib/python3.12/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "funasr/lib/python3.12/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "funasr/lib/python3.12/site-packages/starlette/routing.py", line 660, in __call__
    await self.middleware_stack(scope, receive, send)
  File "funasr/lib/python3.12/site-packages/starlette/routing.py", line 680, in app
    await route.handle(scope, receive, send)
  File "funasr/lib/python3.12/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "funasr/lib/python3.12/site-packages/fastapi/routing.py", line 134, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "funasr/lib/python3.12/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "funasr/lib/python3.12/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "funasr/lib/python3.12/site-packages/fastapi/routing.py", line 120, in app
    response = await f(request)
               ^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/fastapi/routing.py", line 674, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/fastapi/routing.py", line 328, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/funasr/bin/_server_app.py", line 204, in transcribe
    result = _process_vllm(audio_data, sr, language=language, use_spk=spk)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/funasr/bin/_server_app.py", line 142, in _process_vllm
    results = app.state.engine.generate(inputs=seg_audios, **gen_kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/funasr/models/fun_asr_nano/inference_vllm.py", line 609, in generate
    outputs = self.vllm_engine.generate(prompts, sampling_params, use_tqdm=len(inputs) > 1)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/vllm/entrypoints/llm.py", line 500, in generate
    return self._run_completion(
           ^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/vllm/entrypoints/llm.py", line 1651, in _run_completion
    return self._run_engine(use_tqdm=use_tqdm, output_type=output_type)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/vllm/entrypoints/llm.py", line 1861, in _run_engine
    step_outputs = self.llm_engine.step()
                   ^^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/vllm/v1/engine/llm_engine.py", line 295, in step
    outputs = self.engine_core.get_output()
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "funasr/lib/python3.12/site-packages/vllm/v1/engine/core_client.py", line 793, in get_output
    raise self._format_exception(outputs) from None
vllm.v1.engine.exceptions.EngineDeadError: EngineCore encountered an issue. See stack trace (above) for the root cause.
[rank0]:[W610 12:53:34.768469318 ProcessGroupNCCL.cpp:1575] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())

Code or command

What have you tried?

Environment

  • OS: Ubuntu 26.04
  • Python version: 3.12.13
  • FunASR version: 1.3.9
  • ModelScope version: 1.37.1
  • PyTorch / torchaudio version: 2.11.0+cu130
  • Install method (pip, source, Docker): pip+uv(for vllm)
  • Device (cuda, cpu, mps): cuda
  • GPU model: RTX 5070ti 16G
  • CUDA/cuDNN version: 13.3
  • Docker image tag, if used:
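
The environment list does not state the installed vllm version, which is what the question is about. A minimal sketch for collecting it together with the related packages, using only the standard library; the package list is an assumption based on the names mentioned in this issue.

```python
from importlib import metadata

# Distribution names as they would appear in `pip list`; assumed from this issue.
PACKAGES = ["vllm", "funasr", "torch", "torchaudio", "modelscope"]


def installed_versions(names=PACKAGES):
    """Return {package: version string}, or "not installed" when a package is absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver}")
```

Run it with the same interpreter that launches funasr-server, so the reported versions match the environment in the traceback.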
