Skip to content
#

multimodal-large-language-models

Here are 137 public repositories matching this topic...

实时语音交互数字人,支持端到端语音方案(GLM-4-Voice - THG)和级联方案(ASR-LLM-TTS-THG)。可自定义形象与音色,无须训练,支持音色克隆,首包延迟低至3s。Real-time voice interactive digital human, supporting end-to-end voice solutions (GLM-4-Voice - THG) and cascaded solutions (ASR-LLM-TTS-THG). Customizable appearance and voice, supporting voice cloning, with initial package delay as low as 3s.

  • Updated Nov 15, 2024
  • Python

Improve this page

Add a description, image, and links to the multimodal-large-language-models topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the multimodal-large-language-models topic, visit your repo's landing page and select "manage topics."

Learn more