[Community](https://discord.gg/NaU4mMgcnC) | [Docs](https://docs.vocode.dev) | [Dashboard](https://app.vocode.dev)

# vocode

### **Build voice-based LLM apps in minutes**

Vocode is an open-source library that makes it easy to build voice-based LLM apps. Using Vocode, you can build real-time streaming conversations with LLMs and deploy them to phone calls, Zoom meetings, and more. You can also build personal assistants or apps like voice-based chess. Vocode provides simple abstractions and integrations so that everything you need is in a single library.

# **⭐️ Features**

- 🗣 Spin up a conversation with your system audio in minutes
- ➡️ 📞 Set up a phone number that responds with an LLM-based agent
- 📞 ➡️ Send out phone calls from your phone number, managed by an LLM-based agent
- Out-of-the-box integrations with:
  - Transcription services, including:
    - [Deepgram](https://deepgram.com/)
    - [AssemblyAI](https://www.assemblyai.com/)
    - [Google Cloud](https://cloud.google.com/speech-to-text)
    - [Whisper](https://openai.com/blog/introducing-chatgpt-and-whisper-apis)
  - LLMs, including:
    - [ChatGPT](https://openai.com/blog/chatgpt)
    - [GPT-4](https://platform.openai.com/docs/models/gpt-4)
  - Synthesis services, including:
    - [Microsoft Azure](https://azure.microsoft.com/en-us/products/cognitive-services/text-to-speech/)
    - [Google Cloud](https://cloud.google.com/text-to-speech)
    - [Eleven Labs](https://elevenlabs.io/)

Check out our React SDK [here](https://github.com/vocodedev/vocode-react-sdk)!

# **☁️ Quickstart (Hosted)**

First, get a *free* API key from our [dashboard](https://app.vocode.dev).
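If you'd rather keep the key out of source control, you can read it from an environment variable instead of pasting it into the quickstart code. A minimal sketch — note that `VOCODE_API_KEY` is our own naming convention here, not a variable the library reads automatically:

```python
import os

# "VOCODE_API_KEY" is a hypothetical variable name -- export it in your shell
# first (e.g. `export VOCODE_API_KEY=...`); the library does not read it on its own.
api_key = os.environ.get("VOCODE_API_KEY", "")

# Then, in the quickstart below, use it instead of a literal:
#   vocode.api_key = api_key
```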
```bash
pip install 'vocode[io]'
```

```python
import asyncio
import signal

import vocode
from vocode.streaming.hosted_streaming_conversation import HostedStreamingConversation
from vocode.helpers import create_microphone_input_and_speaker_output
from vocode.streaming.models.transcriber import (
    DeepgramTranscriberConfig,
    PunctuationEndpointingConfig,
)
from vocode.streaming.models.agent import ChatGPTAgentConfig
from vocode.streaming.models.message import BaseMessage
from vocode.streaming.models.synthesizer import AzureSynthesizerConfig

vocode.api_key = ""

if __name__ == "__main__":
    microphone_input, speaker_output = create_microphone_input_and_speaker_output(
        streaming=True, use_default_devices=False
    )
    conversation = HostedStreamingConversation(
        input_device=microphone_input,
        output_device=speaker_output,
        transcriber_config=DeepgramTranscriberConfig.from_input_device(
            microphone_input,
            endpointing_config=PunctuationEndpointingConfig(),
        ),
        agent_config=ChatGPTAgentConfig(
            initial_message=BaseMessage(text="Hello!"),
            prompt_preamble="Have a pleasant conversation about life",
        ),
        synthesizer_config=AzureSynthesizerConfig.from_output_device(speaker_output),
    )
    # Stop the conversation with a KeyboardInterrupt (Ctrl+C).
    signal.signal(signal.SIGINT, lambda _0, _1: conversation.deactivate())
    asyncio.run(conversation.start())
```

# **🚀 Quickstart (Self-hosted)**

```bash
pip install 'vocode[io]'
```

```python
import asyncio
import signal

import vocode
from vocode.streaming.streaming_conversation import StreamingConversation
from vocode.helpers import create_microphone_input_and_speaker_output
from vocode.streaming.models.transcriber import (
    DeepgramTranscriberConfig,
    PunctuationEndpointingConfig,
)
from vocode.streaming.models.agent import ChatGPTAgentConfig
from vocode.streaming.models.message import BaseMessage
from vocode.streaming.models.synthesizer import AzureSynthesizerConfig

# These can also be set as environment variables.
vocode.setenv(
    OPENAI_API_KEY="",
    DEEPGRAM_API_KEY="",
    AZURE_SPEECH_KEY="",
    AZURE_SPEECH_REGION="",
)


async def main():
    microphone_input, speaker_output = create_microphone_input_and_speaker_output(
        streaming=True, use_default_devices=False
    )

    conversation = StreamingConversation(
        output_device=speaker_output,
        transcriber_config=DeepgramTranscriberConfig.from_input_device(
            microphone_input, endpointing_config=PunctuationEndpointingConfig()
        ),
        agent_config=ChatGPTAgentConfig(
            initial_message=BaseMessage(text="Hello!"),
            prompt_preamble="Have a pleasant conversation about life",
        ),
        synthesizer_config=AzureSynthesizerConfig.from_output_device(speaker_output),
    )
    await conversation.start()
    print("Conversation started, press Ctrl+C to end")
    signal.signal(signal.SIGINT, lambda _0, _1: conversation.terminate())
    # Pump microphone audio into the conversation until it ends.
    while conversation.is_active():
        chunk = microphone_input.get_audio()
        if chunk:
            conversation.receive_audio(chunk)
        await asyncio.sleep(0)


if __name__ == "__main__":
    asyncio.run(main())
```

# **📞 Phone call quickstarts**

- [Inbound calls - Hosted](https://docs.vocode.dev/telephony#inbound-calls)
- [Outbound calls - Hosted](https://docs.vocode.dev/telephony#outbound-calls)
- [Telephony Server - Self-hosted](https://github.com/vocodedev/vocode-python/blob/main/examples/telephony_app.py)

# **🌱 Documentation**

[docs.vocode.dev](https://docs.vocode.dev/)
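Both quickstarts hand the Deepgram transcriber a `PunctuationEndpointingConfig`, which decides when the user's turn is over. The idea behind punctuation-based endpointing is to treat an utterance as final once the interim transcript ends in terminal punctuation and a short pause follows. A toy, library-free sketch of that rule (the silence threshold here is illustrative, not vocode's actual default):

```python
TERMINAL_PUNCTUATION = (".", "!", "?")


def utterance_is_final(
    transcript: str, seconds_of_silence: float, silence_threshold: float = 0.5
) -> bool:
    """Toy punctuation endpointing: the turn ends when the running transcript
    ends in terminal punctuation and the speaker has paused long enough."""
    text = transcript.strip()
    return text.endswith(TERMINAL_PUNCTUATION) and seconds_of_silence >= silence_threshold


print(utterance_is_final("How are you?", 0.6))   # True: punctuated + pause
print(utterance_is_final("I was thinking", 0.6))  # False: no punctuation yet
```

The real transcriber applies this kind of check to streaming interim results, so the agent can start responding as soon as a sentence-final pause is detected rather than waiting for a fixed timeout.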