Voice (Edge TTS)
Voice Skill (Edge TTS)
Text-to-speech skill using Microsoft Edge TTS engine with real-time streaming playback support.
Features
- Edge TTS Engine - High-quality text-to-speech using Microsoft Edge
- Streaming Playback - Real-time audio streaming (playback starts while audio is still being generated)
- Multiple Voices - Support for Chinese, English, Japanese, and Korean voices
- Customizable - Adjust rate, volume, and pitch
- Secure Implementation - Designed to prevent command injection (voice whitelist, spawn() with array arguments)
Installation
1. Install Python dependencies
pip install edge-tts
2. Install ffmpeg (required for streaming)
Windows:
Download from: https://github.com/GyanD/codexffmpeg/releases
Extract and add bin folder to PATH
macOS:
brew install ffmpeg
Linux (Debian/Ubuntu):
sudo apt install ffmpeg
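To confirm both dependencies are reachable before using the skill, a small check along these lines may help. It is a sketch, not part of the skill: `isAvailable` is a hypothetical helper, and it assumes that `pip install edge-tts` placed an `edge-tts` command on your PATH.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper: true if `command` starts with the given arguments and exits with code 0.
function isAvailable(command, args) {
  const result = spawnSync(command, args, { stdio: 'ignore', shell: false });
  // result.error is set (for example ENOENT) when the executable cannot be found on PATH.
  return !result.error && result.status === 0;
}

const checks = {
  ffmpeg: isAvailable('ffmpeg', ['-version']),
  // Assumption: the edge-tts package installed its command-line entry point on PATH.
  'edge-tts': isAvailable('edge-tts', ['--help']),
};
console.log(checks);
```

If either value is false, revisit the matching installation step above.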
Usage
Streaming Playback (Recommended)
Real-time audio generation and playback:
// Basic usage
await skill.execute({
  action: 'stream',
  text: '你好,我是小九' // "Hello, I'm Xiaojiu"
});
// With a custom voice (the ID must be on the voice whitelist)
await skill.execute({
  action: 'stream',
  text: 'Hello, how are you?',
  options: {
    voice: 'en-US-Standard-D',
    rate: '+10%',
    volume: '+0%',
    pitch: '+0Hz'
  }
});
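For readers curious how playback can start while audio is still being generated, the idea can be sketched as two processes joined by a pipe. This is an illustration under stated assumptions, not the skill's actual implementation (which calls its own Python script): it assumes the `edge-tts` command writes audio to stdout when `--write-media` is omitted, and that `ffplay` was installed together with ffmpeg. `buildPipeline` and `streamSpeech` are hypothetical names.

```javascript
const { spawn } = require('child_process');

// Defaults taken from the Options table of this README.
const DEFAULTS = { voice: 'zh-CN-XiaoxiaoNeural', rate: '+0%', volume: '+0%', pitch: '+0Hz' };

// Build the argument arrays for both processes. Nothing here passes through a shell.
function buildPipeline(text, options = {}) {
  const o = { ...DEFAULTS, ...options };
  return {
    // Assumption: edge-tts writes audio to stdout when --write-media is omitted.
    // The "=" form keeps values that start with "-" (such as -10%) from being parsed as flags.
    tts: ['edge-tts', [`--text=${text}`, '--voice', o.voice,
      `--rate=${o.rate}`, `--volume=${o.volume}`, `--pitch=${o.pitch}`]],
    // ffplay reads the audio from stdin and exits when the stream ends.
    player: ['ffplay', ['-nodisp', '-autoexit', '-loglevel', 'quiet', '-i', 'pipe:0']],
  };
}

// Hypothetical helper: start both processes and connect them with a pipe.
function streamSpeech(text, options) {
  const { tts, player } = buildPipeline(text, options);
  return new Promise((resolve, reject) => {
    const ttsProc = spawn(tts[0], tts[1], { shell: false, stdio: ['ignore', 'pipe', 'inherit'] });
    const playProc = spawn(player[0], player[1], { shell: false, stdio: ['pipe', 'ignore', 'inherit'] });
    playProc.stdin.on('error', () => {}); // ignore EPIPE if the player exits early
    ttsProc.stdout.pipe(playProc.stdin);  // audio chunks are played as they arrive
    ttsProc.on('error', reject);
    playProc.on('error', reject);
    playProc.on('close', (code) => resolve({ success: code === 0 }));
  });
}
```

Calling `streamSpeech('Hello', { rate: '+10%' })` would then resolve once playback finishes. Validate the options before building the pipeline, since this sketch passes them through unchanged.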
Text-to-Speech to File
await skill.execute({
  action: 'tts',
  text: 'Hello, how are you today?',
  options: {
    voice: 'zh-CN-XiaoxiaoNeural'
  }
});
// Returns: { success: true, media: 'MEDIA: /path/to/file.mp3' }
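If you need the plain file path from that result, a helper like the one below could strip the `MEDIA:` prefix. It is a sketch that assumes the result always has the shape shown in the comment above; `mediaPath` is a hypothetical name, not part of the skill.

```javascript
// Hypothetical helper: extract the file path from a result shaped like
// { success: true, media: 'MEDIA: /path/to/file.mp3' }.
function mediaPath(result) {
  if (!result || result.success !== true || typeof result.media !== 'string') return null;
  const match = /^MEDIA:\s*(.+)$/.exec(result.media);
  return match ? match[1].trim() : null;
}

console.log(mediaPath({ success: true, media: 'MEDIA: /path/to/file.mp3' })); // prints "/path/to/file.mp3"
console.log(mediaPath({ success: false })); // prints null
```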
Direct Speak
await skill.execute({
  action: 'speak',
  text: 'Hello!'
});
List Available Voices
await skill.execute({
  action: 'voices'
});
Available Voices
| Language | Voice ID |
|---|---|
| Chinese (Female) | zh-CN-XiaoxiaoNeural |
| Chinese (Male) | zh-CN-YunxiNeural |
| Chinese (Male) | zh-CN-YunyangNeural |
| English (US Female) | en-US-Standard-C |
| English (US Male) | en-US-Standard-D |
| English (UK) | en-GB-Standard-A |
| Japanese | ja-JP-NanamiNeural |
| Korean | ko-KR-SunHiNeural |
Options
| Option | Default | Description |
|---|---|---|
| voice | zh-CN-XiaoxiaoNeural | Voice ID |
| rate | +0% | Speech rate (-50% to +100%) |
| volume | +0% | Volume adjustment (-50% to +50%) |
| pitch | +0Hz | Pitch adjustment |
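The formats and ranges in the table can be checked before a request is sent. The following is a minimal sketch, assuming values are always written with an explicit sign as in the defaults; `isValidOption` is a hypothetical helper, and because the table gives no range for pitch, only its format is checked.

```javascript
// Ranges come from the Options table; pitch has no documented range, so only its format is checked.
const RULES = {
  rate: { pattern: /^([+-]\d{1,3})%$/, min: -50, max: 100 },
  volume: { pattern: /^([+-]\d{1,3})%$/, min: -50, max: 50 },
  pitch: { pattern: /^([+-]\d{1,3})Hz$/, min: -Infinity, max: Infinity },
};

// Hypothetical helper: true when `value` is a well-formed, in-range setting for option `name`.
function isValidOption(name, value) {
  if (!Object.prototype.hasOwnProperty.call(RULES, name) || typeof value !== 'string') return false;
  const rule = RULES[name];
  const match = rule.pattern.exec(value);
  if (!match) return false;
  const n = Number(match[1]); // "+10" becomes 10, "-50" becomes -50
  return n >= rule.min && n <= rule.max;
}

console.log(isValidOption('rate', '+10%'));   // true
console.log(isValidOption('rate', '+150%'));  // false: above +100%
console.log(isValidOption('volume', '10%'));  // false: missing sign
console.log(isValidOption('pitch', '+0Hz'));  // true
```

Anchoring the pattern at both ends also rejects strings that append extra text after a valid value.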
Security
The skill guards against command injection and path traversal with the following measures:
🛡️ Security Features
| Feature | Implementation |
|---|---|
| Input Validation | Voice parameter whitelist validation - only allowed voices can be used |
| No Shell Execution | Uses spawn() with array arguments instead of shell command concatenation |
| Command Injection Prevention | User input is validated and passed as separate arguments, never interpolated into a command string |
| Path Safety | Fixed script path prevents path traversal |
Security Details
// ❌ UNSAFE - Don't use exec with string concatenation
exec(`py script.py "${userText}" --voice ${userVoice}`);
// ✅ SAFE - Use spawn with array arguments
spawn('py', [scriptPath, text, '--voice', voice], { shell: false });
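To see why the array form is safe, the sketch below passes a hostile-looking string as a single argument and checks that the child process receives it unchanged. It uses Node itself as the child (via `process.execPath`) purely so the example runs without Python installed; that substitution is an assumption of this illustration, not how the skill works.

```javascript
const { spawnSync } = require('child_process');

// Text that would break out of the quotes if it were pasted into a shell command string.
const userText = 'hello"; echo INJECTED; "';

// The child prints its last command-line argument exactly as it received it.
const child = spawnSync(
  process.execPath,
  ['-e', 'console.log(process.argv[process.argv.length - 1])', userText],
  { shell: false, encoding: 'utf8' }
);

// Drop the newline added by console.log before comparing.
const received = child.stdout.trimEnd();
console.log(received === userText); // true when the argument arrived as one literal string
```

Because no shell parses the arguments, the quotes and semicolons are ordinary characters and the embedded `echo` never runs.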
Voice Whitelist
Only these voices are allowed:
const allowedVoices = [
  'zh-CN-XiaoxiaoNeural', 'zh-CN-YunxiNeural', 'zh-CN-YunyangNeural',
  'zh-CN-YunyouNeural', 'zh-CN-XiaomoNeural',
  'en-US-Standard-C', 'en-US-Standard-D', 'en-US-Wavenet-F',
  'en-GB-Standard-A', 'en-GB-Wavenet-A',
  'ja-JP-NanamiNeural', 'ko-KR-SunHiNeural'
];
A voice ID that is not on this list is rejected, and the default voice (zh-CN-XiaoxiaoNeural) is used instead.
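The fallback behavior described here can be sketched as follows. The list is copied from this README as-is, and `resolveVoice` is a hypothetical name; the skill's real function may differ.

```javascript
const DEFAULT_VOICE = 'zh-CN-XiaoxiaoNeural'; // default from the Options table

// Whitelist copied from the README; a Set gives exact-match lookups.
const ALLOWED_VOICES = new Set([
  'zh-CN-XiaoxiaoNeural', 'zh-CN-YunxiNeural', 'zh-CN-YunyangNeural',
  'zh-CN-YunyouNeural', 'zh-CN-XiaomoNeural',
  'en-US-Standard-C', 'en-US-Standard-D', 'en-US-Wavenet-F',
  'en-GB-Standard-A', 'en-GB-Wavenet-A',
  'ja-JP-NanamiNeural', 'ko-KR-SunHiNeural',
]);

// Hypothetical helper: return the requested voice only if it is whitelisted.
function resolveVoice(voice) {
  return typeof voice === 'string' && ALLOWED_VOICES.has(voice) ? voice : DEFAULT_VOICE;
}

console.log(resolveVoice('ja-JP-NanamiNeural'));         // ja-JP-NanamiNeural
console.log(resolveVoice('en-US; rm -rf /'));            // zh-CN-XiaoxiaoNeural
console.log(resolveVoice(undefined));                    // zh-CN-XiaoxiaoNeural
```

Exact matching means a value only passes if it is identical to a listed ID, so nothing user-controlled reaches the command line through this parameter.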
Changelog
v1.10 (2026-02-24)
- Enterprise-grade security - Full command injection protection
- Voice whitelist validation
- Replaced exec with spawn for secure process execution
- Input sanitization for all parameters
v1.1.0
- Add streaming playback support (playback starts while audio is still being generated)
- Add ffmpeg dependency
- Fix command injection vulnerability
- Add voice whitelist validation
v1.0.0
- Initial release with basic TTS support