How it works
The complete flow of an AI call — from speech recognition to response synthesis.
The Vocalis.pro platform combines AI speech recognition with LLM-based response generation to conduct telephone conversations in real time. Here is the simplified flow:
Call (incoming or outgoing) → Recognition (real-time transcription) → LLM (system prompt) → Synthesis (natural response) → Report (transcript and results)
Details of each step
1. Initiating the call
Incoming: A customer dials your assigned number, and the AI agent answers immediately.
Outgoing: The platform dials the numbers in your campaign list according to the parameters you defined.
2. Voice recognition
The AI listens continuously to the speaker and transcribes their words into text in real time.
3. LLM Decision
The large language model analyzes the transcription together with your system prompt to formulate the most appropriate response, or to decide on an action (transfer, scheduling an appointment, ending the call…).
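To make the "response or action" distinction concrete, here is a minimal Python sketch of how such a decision could be represented. It assumes the model returns a small structured object with an "action" field and an optional "text" field. That shape, and the names Action, Decision and parse_decision, are hypothetical illustrations and not the Vocalis.pro API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    # Names mirror the examples in the text above; the platform's real
    # action set is not documented here, so treat these as assumptions.
    RESPOND = "respond"
    TRANSFER = "transfer"
    SCHEDULE_APPOINTMENT = "schedule_appointment"
    END_CALL = "end_call"


@dataclass
class Decision:
    action: Action
    text: Optional[str] = None  # what the agent should say, if anything


def parse_decision(raw: dict) -> Decision:
    """Turn a hypothetical structured LLM output into a Decision.

    Unknown or missing actions fall back to a plain spoken response,
    so a malformed model output never triggers a transfer or hang-up.
    """
    try:
        action = Action(raw.get("action", "respond"))
    except ValueError:
        action = Action.RESPOND
    return Decision(action=action, text=raw.get("text"))
```

Falling back to a spoken response on unrecognized output is a conservative design choice: the call continues rather than being transferred or ended by mistake.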
4. Speech synthesis
The text response is converted into natural speech using high-quality TTS engines (ElevenLabs, Cartesia). The voice can be chosen from an extensive library or cloned from an audio recording.
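Steps 2 to 4 together form one conversational turn. The sketch below shows that turn as a plain Python function, under the assumption that recognition, the LLM and speech synthesis are each available as a simple callable. The function run_turn and its parameters are hypothetical stand-ins, not actual ElevenLabs, Cartesia or Vocalis.pro calls.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces for the three engines described above.
Transcriber = Callable[[bytes], str]                    # audio -> text
Responder = Callable[[str, List[Tuple[str, str]]], str]  # (system prompt, history) -> reply
Synthesizer = Callable[[str], bytes]                    # text -> audio


def run_turn(
    audio_in: bytes,
    history: List[Tuple[str, str]],
    system_prompt: str,
    transcribe: Transcriber,
    respond: Responder,
    synthesize: Synthesizer,
) -> bytes:
    """Run one turn: recognize speech, let the LLM reply, synthesize audio."""
    user_text = transcribe(audio_in)          # step 2: voice recognition
    history.append(("caller", user_text))
    reply_text = respond(system_prompt, history)  # step 3: LLM decision
    history.append(("agent", reply_text))
    return synthesize(reply_text)             # step 4: speech synthesis
```

The history list is updated in place, so by the end of the call it holds the full transcript that the reporting step relies on.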
5. Data & Reporting
At the end of each call, you have access to the full transcript, the audio recording, and the call result (answered, message, transfer, success…), and you can trigger automated post-call actions.
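As an illustration of how post-call data could drive automated actions, the following Python sketch pairs a call record with a small dispatcher keyed on the call result. The CallRecord fields and the PostCallDispatcher class are assumptions made for this example and do not describe the platform's actual export format or webhook mechanism.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CallRecord:
    # Hypothetical fields based on the items listed above.
    transcript: List[str]
    recording_path: str
    result: str  # e.g. "answered", "message", "transfer", "success"


PostCallAction = Callable[[CallRecord], None]


@dataclass
class PostCallDispatcher:
    """Maps a call result to the actions that should run afterwards."""

    handlers: Dict[str, List[PostCallAction]] = field(default_factory=dict)

    def on(self, result: str, action: PostCallAction) -> None:
        # Register an action for a given call result.
        self.handlers.setdefault(result, []).append(action)

    def dispatch(self, record: CallRecord) -> int:
        # Run every action registered for this record's result and
        # return how many actions were executed.
        actions = self.handlers.get(record.result, [])
        for action in actions:
            action(record)
        return len(actions)
```

For example, registering a CRM update with dispatcher.on("success", update_crm) would run that (hypothetical) function only for calls whose result is "success".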
