Question 1

How do I fix the 'Dimension out of range' error when running VoxCPM on CPU with PyTorch 2.11?

Accepted Answer

This is a known bug in PyTorch 2.11.0 and later that causes scaled_dot_product_attention to fail on CPU with the error 'Dimension out of range (expected to be in range of [-1, 0], but got -2)'. Workaround: downgrade PyTorch to a release below 2.11, for example 2.5.1. For CPU-only setups, install it with pip: pip install torch==2.5.1. For GPU (CUDA 12.1) setups, use torch==2.5.1+cu121. See PyTorch issue #163597 for details.
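To confirm whether an environment falls in the affected range before downgrading, a small version check can help. The following is a minimal sketch, assuming the 2.11.0 boundary stated in the answer above; the helper names `parse_major_minor` and `is_affected` are illustrative and not part of PyTorch or VoxCPM.

```python
import re

# Assumption taken from the answer above: releases from 2.11 onward are affected.
FIRST_AFFECTED = (2, 11)


def parse_major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings such as '2.5.1' or '2.5.1+cu121'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_affected(version: str) -> bool:
    """Return True if the version is at or above the first affected release."""
    return parse_major_minor(version) >= FIRST_AFFECTED
```

In a real environment, pass `torch.__version__` to `is_affected` after importing torch; a True result suggests the downgrade described above is worth trying.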

Question 2

Why does VoxCPM2 crash with CUDA errors (such as "Offset increment outside graph capture") when multiple subprocess workers share the same GPU?

Accepted Answer

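This is a known instability that arises when torch.compile's CUDA graph optimization runs in multiple processes sharing one GPU memory pool. The recommended fix is a single-process serving architecture such as nano-vllm-voxcpm (https://github.com/a710128/nanovllm-voxcpm) or vllm-omni (https://github.com/OpenBMB/VoxCPM#-production-serving-vllm-omni), which avoids multi-process CUDA graph conflicts. A production-grade FastAPI wrapper around nano-vllm-voxcpm is available at https://github.com/uttera/uttera-tts-vllm.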
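The single-process idea can be illustrated with a queue that funnels every request to one thread owning the model, so only one process ever touches the GPU. This is a rough sketch of the pattern only, not the API of nano-vllm-voxcpm or vllm-omni; `synthesize` is a hypothetical placeholder for the real model call, and `SingleProcessWorker` is an illustrative name.

```python
import queue
import threading
from concurrent.futures import Future


def synthesize(text: str) -> bytes:
    """Hypothetical stand-in for the real model call; returns fake audio bytes."""
    return text.encode("utf-8")


class SingleProcessWorker:
    """Serializes all requests through one thread that owns the model."""

    def __init__(self):
        self._jobs = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            item = self._jobs.get()
            if item is None:  # shutdown sentinel
                break
            text, future = item
            try:
                future.set_result(synthesize(text))
            except Exception as exc:
                # Report the failure to the caller instead of killing the worker.
                future.set_exception(exc)

    def submit(self, text: str) -> Future:
        """Queue a request and return a Future for its result."""
        future = Future()
        self._jobs.put((text, future))
        return future

    def close(self):
        """Stop the worker after it drains the jobs queued before this call."""
        self._jobs.put(None)
        self._thread.join()
```

Callers from any thread (for example, web request handlers) use `worker.submit(text).result()`; requests are processed strictly in arrival order by the single model owner.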

Question 3

Why does audio quality gradually degrade when using LoRA fine-tuning with nano-vllm on a Blackwell (RTX 5090) GPU?

Accepted Answer

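This is a known issue caused by a conflict between the CUDA graph memory pool and LoRA, together with an object leak in the nano-vllm scheduler on the Blackwell (sm_120) architecture. The only workaround that currently works is to restart the inference process every 2-3 hours, which clears the leaked objects and defragments GPU memory. Follow issue #326 and nano-vllm-voxcpm #61 for a permanent fix.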
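The periodic restart can be automated with a small supervisor. The sketch below assumes the inference server runs as a separate command; the function name `run_with_periodic_restart` and its parameters are illustrative, and `max_cycles` exists only to keep the example bounded (a real supervisor would typically loop indefinitely).

```python
import subprocess
import sys
import time


def run_with_periodic_restart(cmd, max_lifetime_s, max_cycles):
    """Run cmd, restarting it whenever it exits or exceeds max_lifetime_s.

    Returns the number of times the process was started.
    """
    starts = 0
    for _ in range(max_cycles):
        proc = subprocess.Popen(cmd)
        starts += 1
        try:
            # Block until the child exits on its own or its lifetime runs out.
            proc.wait(timeout=max_lifetime_s)
        except subprocess.TimeoutExpired:
            proc.terminate()  # ask the child to shut down
            try:
                proc.wait(timeout=10)
            except subprocess.TimeoutExpired:
                proc.kill()  # force-stop if it ignores the request
                proc.wait()
    return starts
```

For the 2-3 hour interval mentioned above, `max_lifetime_s` would be somewhere between 7200 and 10800, with `cmd` set to whatever command launches the inference server.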

Question 4

Why does voxcpm2 voice cloning produce distorted, demonic-sounding output with an incorrect audio duration?

Accepted Answer

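This is a known instability in voxcpm2 and voxcpm1.5. Temporary workaround: switch to voxcpm0.5b, which works correctly on the same input. There is no permanent fix yet; follow the GitHub issue for updates.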

Question 5

How do I fix the "triton is not installed" warning when using torch.compile?

Accepted Answer

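Install the triton release that matches your PyTorch version. For torch==2.5.1, use triton==3.1.0 (Linux with an NVIDIA GPU). Check that your hardware supports triton (compute capability 7.0 or higher). Windows support is limited; if functionality is unaffected, the warning can be ignored. Fix: pip install triton==3.1.0. If the wrong version is installed (for example, 2.1.0, which causes errors), uninstall it first with pip uninstall triton, then install the correct version.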
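A quick way to see whether the installed triton matches is to compare package versions against the expected pairing. This sketch lists only the torch 2.5.1 / triton 3.1.0 pair given in the answer above; the table `EXPECTED_TRITON` and the helper names are illustrative assumptions, and other pairings would need to be added from the PyTorch release notes.

```python
from importlib import metadata

# Only the pairing stated in the answer above is listed; extend as needed.
EXPECTED_TRITON = {"2.5.1": "3.1.0"}


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def triton_status(torch_version, triton_version):
    """Classify the torch/triton pair as 'ok', 'mismatch', 'missing', or 'unknown'."""
    base = torch_version.split("+")[0]  # drop local tags such as '+cu121'
    expected = EXPECTED_TRITON.get(base)
    if expected is None:
        return "unknown"  # no pairing recorded for this torch release
    if triton_version is None:
        return "missing"
    return "ok" if triton_version == expected else "mismatch"
```

When torch is installed, `triton_status(installed_version("torch"), installed_version("triton"))` reports the state of the current environment; a "mismatch" result corresponds to the uninstall-then-reinstall step described above.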
