MiniMax Launches a New Speech Model: Human-Like Generation in 40 Languages, Already Adopted by Ximalaya and NetEase

Author | Wang Han

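Zhidx (智东西), August 7: MiniMax today released Speech 2.5, its new-generation speech generation model.

Compared with Speech 02, released in May, Speech 2.5 makes three main advances: more natural multilingual speech, more faithful voice cloning, and broader coverage at 40 languages.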

Speech 2.5 is now available worldwide. Users can try it by logging in to the MiniMax Open Platform or the MiniMax Audio website:

MiniMax Open Platform: minimaxi.com/platform_overview

MiniMax Audio: minimaxi.com/audio

Speech 2.5 home page

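On the Speech 2.5 home page, users pick a voice, type text into the input box or upload a file, and generate the audio they need with one click. Below are the official demos of Speech 2.5 audio, along with Zhidx's own hands-on tests: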

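1. Natural Multilingual Expression with Less of a Mechanical Feel

MiniMax Speech 2.5 improves the similarity and prosodic naturalness of generated audio, lowers the word error rate, and reduces the mechanical feel of AI-generated business meetings, everyday conversations, and English-language podcasts.

In Zhidx's tests, it could also add ambient scene sound to the audio, for example an American high-school girl giving a speech over the school broadcast: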

https://oss.zhidx.com/c8ca347142979c4c9b0275ec7c968c4e/68937c00/uploads/2025/08/68947cbca223f_68947cbc9335e_68947cbc9332e_%E7%BE%8E%E5%9B%BD%E5%B0%91%E5%A5%B3%E5%B9%BF%E6%92%AD%E6%BC%94%E8%AE%B2.mp3

Audio content: Two years is nothing, but at the same time a lot can be accomplished in two years. You can try a sport you’ve always wanted to start, and become great at it. You can start a morning routine and affect your mood and stress at a deep level. You can meditate for a few minutes per day, become more self-aware and change the way you react to problems. You can start a business and make it a big success.

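The generated audio not only reads the text clearly and accurately, it also has the pauses and intonation of a native speaker.

Hamlet swearing his oath of revenge: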

https://oss.zhidx.com/8bf418d721c03169570590d39c883d98/68937c00/uploads/2025/08/689452e5721a9_689452e56e7c6_689452e56e79a_%E7%AB%8B%E4%B8%8B%E5%A4%8D%E4%BB%87%E8%AA%93%E8%A8%80%E7%9A%84%E5%93%88%E5%A7%86%E9%9B%B7%E7%89%B9.mp3

Audio content: Remember? Yea, from the tables of my memory, I’ll wipe away all trivial fond records. All saws of books, all forms, all pressures past, that youth and observation copied there. And then commandment all alone shall live within the book and volume of my brain, unmixed with baser matter. Yes, yes by heaven.

Another example is a passionate Spanish sports commentator:

https://oss.zhidx.com/5483bff4b111fd564b365c075fc2828d/68937c00/uploads/2025/08/68947e6f5f179_68947e6f5bba0_68947e6f5bb70_%E5%85%85%E6%BB%A1%E6%BF%80%E6%83%85%E7%9A%84%E8%A5%BF%E7%8F%AD%E7%89%99%E4%BD%93%E8%82%B2%E8%B5%9B%E4%BA%8B%E8%A7%A3%E8%AF%B4%E5%91%98.mp3

Audio content: ¡Arranca el genio por la derecha, deja atrás a uno, se saca de encima al segundo, entra al área, prepara el remate…¡GOLAZO MONUMENTAL! ¡Una obra de arte que sella la victoria y desata la locura total!

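2. Cross-Lingual Accent Cloning That Preserves the Voice

Speech 2.5 can also clone accents across languages, preserve regional accents within the same language, and retain the vocal characteristics of particular age groups. Users are free to choose whichever voice they want.

In Zhidx's test, a "domineering CEO" voice delivers the emperor's classic lines from Empresses in the Palace (甄嬛传):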

https://oss.zhidx.com/fe787f468b25b4ed933658e514374a84/68937c00/uploads/2025/08/689478c44acd9_689478c445a5a_689478c445a19_%E9%9C%B8%E9%81%93%E6%80%BB%E8%A3%81%E5%A3%B0%E7%BA%BF%E8%AF%B4%E7%94%84%E5%AC%9B%E4%BC%A0%E5%8F%B0%E8%AF%8D.mp3

Audio content: 嬛嬛一袅楚宫腰，那更春来香减玉消。紫禁城的风水养人，必不会叫你玉减香消。

What does it sound like when the new Speech 2.5 is introduced in the classic accent of the Queen of England?

https://oss.zhidx.com/ea597c9308a5660833ba2d22bb7d9b35/68937c00/uploads/2025/08/68945359ae9bb_68945359a911a_68945359a90c6_%E8%8B%B1%E5%9B%BD%E5%A5%B3%E7%8E%8B%E7%9A%84%E7%BB%8F%E5%85%B8%E5%8F%91%E9%9F%B3.mp3

Audio content: Hello everyone. We’re thrilled to introduce the next generation of our voice model: MiniMax Speech 2.5. Building on its predecessor, Speech 2.0, this new version is more powerful than ever. But where it truly shines is in its incredible realism. The model masterfully captures the subtle nuances of the human voice, from trailing intonation and vocal style, to the full spectrum of emotion, all reproduced with stunning authenticity.

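From pauses and rhythm to the handling of pronunciation, the generated speech keeps an authentic "Queen's English" accent.

Cross-lingual cloning works as well. Zhidx had Speech 2.5 deliver the "美美桑内" lyrics in the voice of a hot-blooded Korean manhwa hero, switching between Korean and English: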

https://oss.zhidx.com/a64f1d5f97c6f960bb11d0eb655ac21f/68937c00/uploads/2025/08/68947938d5422_68947938cfa2f_68947938cf9fa_%E7%83%AD%E8%A1%80%E9%9F%A9%E6%BC%AB%E7%94%B7%E4%B8%BB%E5%A3%B0%E7%BA%BF%E8%AF%B4%E2%80%9C%E7%BE%8E%E7%BE%8E%E6%A1%91%E5%86%85%E2%80%9D.mp3

Audio content: 매일매일 설레，이리저리 바빠，never stop burn it，참 예쁜 날이야 oh you know?

The same voice switching between Italian and English:

https://oss.zhidx.com/6233b4848810ccc8579a8afa1f63daaf/68937c00/uploads/2025/08/6894537784b59_689453777ef17_689453777eee9_%E6%84%8F%E5%A4%A7%E5%88%A9%E8%AF%AD%E8%8B%B1%E8%AF%AD%E5%88%87%E6%8D%A2.mp3

Audio content: Questa è la mia vera voce. I find speaking English a bit difficult. It’s like trying to speak Italian without using hand gestures.

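Even when switching between languages, the audio Speech 2.5 generates retains the distinctive details of the accent.

3. Several Lesser-Spoken Languages Added, Bringing the Total to 40

Speech 2.5 adds Bulgarian, Danish, Hebrew, Malay, Persian, Slovak, and other lesser-spoken languages, bringing the number of supported languages to 40. For cross-border e-commerce, overseas customer service, and localized marketing, global content can be created with one click.

Malay, for example: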

https://oss.zhidx.com/4d20b9aeff9f8db4af88a605506328d6/68937c00/uploads/2025/08/689453a561e81_689453a55c370_689453a55c348_%E9%A9%AC%E6%9D%A5%E8%AF%AD.mp3

Audio content: Selamat datang, semoga hari anda indah.

Hebrew:

https://oss.zhidx.com/4faac7c2307610b77b2fa04b7f719d27/68937c00/uploads/2025/08/689453b42355d_689453b41f914_689453b41f8e2_%E5%B8%8C%E4%BC%AF%E6%9D%A5%E8%AF%AD.mp3

Audio content: הקשיבו למנגינה היפה הזו.

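4. Boosting Cross-Border Business: Ximalaya and NetEase Are Already Using It

The MiniMax Speech models can be applied in many scenarios, such as multilingual customer service, voice-over for international advertising, cross-border education, and cross-border e-commerce.

MiniMax Speech is already widely adopted around the world. Overseas, agent platforms such as Vapi and Pipecat use MiniMax Speech to deliver their services, and leading AI applications such as Hedra, Icon, and Syllaby have integrated it as well.

In China, leading platforms and products including Gaotu (高途教育), Ximalaya (喜马拉雅), NetEase (网易), and Rokid glasses have all chosen MiniMax Speech.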

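Conclusion: MiniMax Keeps Investing in AI Audio

MiniMax is no newcomer to AI audio. Speech 02, released in May this year, took first place on two speech benchmark leaderboards, Artificial Analysis and Hugging Face TTS Arena, ahead of well-known models from OpenAI, ElevenLabs, and others.

Speech 2.5 can be seen as an advanced version of Speech 02: it inherits its predecessor's strengths and further refines multilingual performance, voice cloning, and language coverage.

With many companies and research institutions now moving into the field, competition in AI audio is intensifying, and the release of MiniMax Speech 2.5 adds fresh momentum to the market.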