The primary goal is to create an LLM-backed voice agent that can order pizza over the phone. Although existing solutions are available, the creator wants to build their own interface for this agent in order to add custom functionality, such as asking about favorite toppings and displaying conversations.
The creator decided against using the ElevenLabs MCP server directly because it would have been too simple and quick: the implementation would have taken only about three minutes. That would not have been 'fun' enough for the creator, who wanted to build a more custom and involved solution.
Beyond basic ordering, the creator wants the agent to ask customers about their favorite pizza toppings and potentially include them in the order. They also plan to display the entire conversation history between the agent and the customer, and only then proceed with ordering the pizza.