ChatTTS Site: A high-quality, multi-functional text to speech model designed for dialogue scenarios.

What is ChatTTS Site

ChatTTS is a text-to-speech model designed for dialogue scenarios, capable of generating speech whose quality is comparable to human conversation. It supports Chinese and English speech generation and is trained on approximately 100,000 hours of Chinese and English data. ChatTTS is particularly suitable for the dialogue tasks of large language model assistants, as well as applications such as creating dialogue-based audio and video introductions. Built on open-source natural language processing and speech synthesis technologies, it gives developers a powerful and easy-to-use tool.

How to Use ChatTTS Site

Clone Project from GitHub

Open the repository on GitHub, pick a local folder, and clone the repository into it with git. Alternatively, you can download it manually from GitHub.

git clone https://github.com/2noise/ChatTTS.git

Install Requirements

In a terminal, change into the folder you cloned or downloaded, then run the following commands to install the dependencies.

pip install omegaconf -q
pip install vocos -q
pip install vector_quantize_pytorch -q
pip install nemo_text_processing -q
pip install WeTextProcessing -q
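
After installing, it can help to confirm that each dependency is importable before loading the model. The following is a minimal sketch that uses only the Python standard library. The import names are assumptions: in particular, WeTextProcessing is assumed to install a module named `tn`, which differs from its pip name. `missing_modules` is a hypothetical helper, not part of ChatTTS.

```python
import importlib.util

# Import names assumed to correspond to the pip packages listed above.
# WeTextProcessing is assumed to expose a top-level module called "tn".
ASSUMED_MODULES = [
    "omegaconf",
    "vocos",
    "vector_quantize_pytorch",
    "nemo_text_processing",
    "tn",
]


def missing_modules(names):
    """Return the module names from `names` that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Treat lookup errors the same as "not installed".
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    not_found = missing_modules(ASSUMED_MODULES)
    print("Missing modules:", not_found or "none")
```

If a name is reported as missing even though the pip install succeeded, the assumed import name is probably wrong for that package version rather than the install itself.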

Initialize ChatTTS

Import the package, create a Chat instance, and load the models.

import torch
import ChatTTS  # import the package itself so that ChatTTS.Chat resolves below
from IPython.display import Audio

chat = ChatTTS.Chat()
chat.load_models()  # load the models before calling infer

Declare Your Text

Decide which text you want to turn into speech, and store it in a list named `texts`.

texts = ["YOUR_TEXT_TO_GENERATE_AUDIO",]
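
For a longer passage, one option is to build `texts` from sentence-sized pieces instead of a single long string. The sketch below is only an illustration: `split_into_texts` is a hypothetical helper, and the choice of English and Chinese sentence-ending punctuation is an assumption, not something ChatTTS requires.

```python
import re

# One run of non-terminator characters followed by any sentence-ending
# punctuation (English and Chinese). The punctuation set is illustrative.
_SENTENCE = re.compile(r"[^.!?。！？]+[.!?。！？]*")


def split_into_texts(passage):
    """Split a passage into a list of sentence-sized strings."""
    pieces = (match.group().strip() for match in _SENTENCE.finditer(passage))
    return [piece for piece in pieces if piece]


texts = split_into_texts("Hello there. 你好，世界。How are you?")
print(texts)
```

This simple pattern also splits at the dot in decimals such as "3.14", so text containing numbers or abbreviations may need a more careful splitter.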

Generate Audio

Generate the speech.

wavs = chat.infer(texts, use_decoder=True)

Play Audio

Play the first generated waveform in a notebook at a 24 kHz sample rate.

Audio(wavs[0], rate=24_000, autoplay=True)
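
Outside a notebook, the waveform can be written to a file instead of played inline. The sketch below assumes that `wavs[0]` is a NumPy-compatible array of floats in the range [-1, 1] at the 24,000 Hz rate used above; neither the shape nor the value range is confirmed here. `save_wav` is a hypothetical helper, and a synthetic tone stands in for real model output so the example runs without ChatTTS.

```python
import os
import tempfile
import wave

import numpy as np


def save_wav(path, samples, sample_rate=24_000):
    """Write a mono float waveform in [-1, 1] to a 16-bit PCM WAV file.

    Returns the number of frames written.
    """
    # Flatten so that an array shaped (1, N) is handled like one shaped (N,).
    mono = np.asarray(samples, dtype=np.float32).reshape(-1)
    # Clip to the valid range, then scale to little-endian 16-bit integers.
    pcm = (np.clip(mono, -1.0, 1.0) * 32767).astype("<i2")
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())
    return len(pcm)


# Stand-in for wavs[0]: one second of a 440 Hz tone at 24 kHz.
t = np.arange(24_000) / 24_000
demo = 0.3 * np.sin(2 * np.pi * 440 * t)

out_path = os.path.join(tempfile.gettempdir(), "chattts_demo.wav")
frames_written = save_wav(out_path, demo)
```

With real output, the call would be `save_wav("output.wav", wavs[0])`, assuming the array assumptions above hold.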

Features of ChatTTS Site

Realistic Text to Speech

ChatTTS generates audio with human-like intonations and pauses, making it sound like a real person.

Language Support

ChatTTS supports both English and Chinese, breaking the language barrier for users.

Well-Trained

The main ChatTTS model is trained on approximately 100,000 hours of Chinese and English data; the openly released checkpoint is a pre-trained model trained on about 40,000 hours.

Open-Source

The source code for ChatTTS is available on GitHub, providing developers with a well-maintained and regularly updated tool.

FAQs from ChatTTS Site

What is ChatTTS?

ChatTTS is a text-to-speech model designed specifically for dialogue scenarios such as LLM assistants. It supports both English and Chinese and is trained on approximately 100,000 hours of Chinese and English data.

Is ChatTTS free to use?

Yes, ChatTTS is free to use. You can download the project files from the GitHub repository to your local machine. Other developers have also published free versions on well-known open-source platforms such as GitHub, Hugging Face, and ModelScope.

How do I install ChatTTS?

Installation steps for ChatTTS are outlined in the 'How to Use ChatTTS Site' section above. In short, clone or download the project from the GitHub repository to your machine and use it from Python. You can also follow the instructions on the official GitHub page for downloading and using ChatTTS.

Where can I find the source code for ChatTTS?

The source code for ChatTTS can be found on its GitHub repository at https://github.com/2noise/ChatTTS.

In what languages is ChatTTS available?

ChatTTS is currently available in English and Chinese.

How do I use ChatTTS in my project?

To use ChatTTS in your project, you can import it and use the `chat.infer` method with your text. More detailed usage examples might be available in the repository's documentation or example files.
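
As a rough sketch of how this might look inside a project, the hypothetical `synthesize` function below wraps the calls shown in the usage steps on this page (`ChatTTS.Chat()`, `load_models()`, and `infer(texts, use_decoder=True)`). The wrapper itself is not part of ChatTTS, and method names may differ between ChatTTS versions, so treat it as an illustration under those assumptions.

```python
def synthesize(texts, chat=None):
    """Return the result of chat.infer for the given text or list of texts.

    `chat` is any object with an `infer(texts, use_decoder=True)` method.
    When it is omitted, a ChatTTS instance is created and loaded, which
    requires the ChatTTS package and its models to be available.
    """
    if isinstance(texts, str):
        texts = [texts]  # accept a single string for convenience
    if chat is None:
        import ChatTTS  # imported lazily so this module loads without it

        chat = ChatTTS.Chat()
        chat.load_models()
    return chat.infer(list(texts), use_decoder=True)
```

Passing an already loaded `chat` instance avoids reloading the models on every call, which matters when the function is called repeatedly, for example once per assistant reply.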

Can I contribute to the ChatTTS project?

Yes, contributions to the ChatTTS project are welcome in various forms, including discussions, GitHub issues, and pull requests. You can also join the QQ group (808364215) for discussions.

What kind of support does ChatTTS offer?

For formal inquiries about the model and its roadmap, you can contact the developers at [email protected]. Joining their QQ group or submitting GitHub issues for support is also encouraged.

How is ChatTTS different from other TTS models?

ChatTTS is optimized for dialogue-based tasks, enabling natural and expressive speech synthesis with support for multiple speakers. It offers fine-grained control over prosodic features like laughter, pauses, and interjections, and surpasses most open-source TTS models in terms of prosody.
