AgentScope Realtime Voice Agents — OpenAI/Gemini/DashScope Realtime API
6 TTS models, RealtimeAgent, voice+tools integration, and multimodal pipelines for realtime voice agents.

Text-based agents require reading and typing. Voice agents let users talk to their agents — hands-free, natural, and fast.
AgentScope supports three realtime voice backends (OpenAI, Gemini, and DashScope), each with different strengths. In this post, we'll set up TTS models, build realtime voice agents, and add tools that respond to voice commands.
Series: Part 1: Getting Started | Part 2: Multi-Agent | Part 3: MCP Integration | Part 4: RAG + Memory | Part 5 (this post) | Part 6: Production
1. Voice Agent Overview