Updating READMEs

Michael Hansen 2022-03-31 13:17:25 -04:00
commit 33ea9c086d
5 changed files with 189 additions and 2 deletions

README.md

@ -0,0 +1,144 @@
# Mimic 3
A fast and local neural text to speech system for [Mycroft](https://mycroft.ai/) and the [Mark II](https://mycroft.ai/product/mark-ii/).
[Available voices](https://github.com/MycroftAI/mimic3-voices)
## Dependencies
Mimic 3 requires:
* Python 3.7 or higher
* The [ONNX Runtime](https://onnxruntime.ai/)
* [gruut](https://github.com/rhasspy/gruut) or [eSpeak-ng](https://github.com/espeak-ng/espeak-ng) (depending on the voice)
## Installation
### eSpeak
Some voices depend on [eSpeak-ng](https://github.com/espeak-ng/espeak-ng), specifically `libespeak-ng.so`. For those voices, make sure that libespeak-ng is installed with:
``` sh
sudo apt-get install libespeak-ng1
```
### Mycroft TTS Plugin
Install the plugin:
``` sh
mycroft-pip install plugin-tts-mimic3[all]
```
Enable the plugin in your [mycroft.conf](https://mycroft-ai.gitbook.io/docs/using-mycroft-ai/customizations/mycroft-conf) file:
``` sh
mycroft-config set tts.module mimic3_tts_plug
```
See the [plugin's documentation](https://github.com/MycroftAI/plugin-tts-mimic3) for more options.
### Using pip
Install the command-line tool:
``` sh
pip install mimic3[all]
```
Once installed, the following commands will be available:
* `mimic3`
* `mimic3-download`
Install the HTTP web server:
``` sh
pip install mimic3-http[all]
```
Once installed, the following commands will be available:
* `mimic3-server`
* `mimic3-client`
Language support can be selectively installed by replacing `all` with:
* `de` - German
* `es` - Spanish
* `fr` - French
* `it` - Italian
* `nl` - Dutch
* `ru` - Russian
* `sw` - Kiswahili
Omitting the `[..]` suffix entirely will install support for English only.
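As a small illustration of the extras syntax described above, the following sketch builds a pip requirement string from a list of language codes. The helper `pip_requirement` and the `LANGUAGE_EXTRAS` tuple are hypothetical names made up for this example and are not part of Mimic 3; the language codes are the ones listed above.

``` python
from typing import Iterable

# Language extras listed in this README (assumption: the list is complete)
LANGUAGE_EXTRAS = ("de", "es", "fr", "it", "nl", "ru", "sw")


def pip_requirement(package: str, languages: Iterable[str] = ()) -> str:
    """Build a requirement string such as 'mimic3[de,fr]'.

    Hypothetical helper for illustration only.
    """
    languages = list(languages)
    unknown = [lang for lang in languages if lang not in LANGUAGE_EXTRAS]
    if unknown:
        raise ValueError(f"Unsupported language extras: {unknown}")

    if not languages:
        # No extras: English-only install
        return package

    # pip accepts several extras separated by commas inside the brackets
    return f"{package}[{','.join(languages)}]"


if __name__ == "__main__":
    print(pip_requirement("mimic3", ["de", "fr"]))  # mimic3[de,fr]
    print(pip_requirement("mimic3-http"))  # mimic3-http
```

The resulting string can be passed to `pip install`, quoted so that the shell does not interpret the brackets.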
### From Source
Clone the repository:
``` sh
git clone https://github.com/MycroftAI/mimic3.git
```
Run the install script:
``` sh
cd mimic3/
./install.sh
```
A virtual environment will be created in `mimic3/.venv` and each of the Python modules will be installed in editable mode (`pip install -e`).
Once installed, the following commands will be available in `.venv/bin`:
* `mimic3`
* `mimic3-server`
* `mimic3-client`
* `mimic3-download`
## Voice Keys
Mimic 3 references voices with the format:
* `<language>/<name>_<quality>` for single-speaker voices, and
* `<language>/<name>_<quality>#<speaker>` for multi-speaker voices
    * `<speaker>` can be a name or number starting at 0
    * Speaker names come from a voice's `speakers.txt` file
For example, the default [Alan Pope](https://popey.me/) voice key is `en_UK/apope_low`. The [CMU Arctic voice](https://github.com/MycroftAI/mimic3-voices/tree/master/voices/en_US/cmu-arctic_low) contains multiple speakers, with a commonly used voice being `en_US/cmu-arctic_low#slt`.
Voices are automatically downloaded from [GitHub](https://github.com/MycroftAI/mimic3-voices) and stored in `${HOME}/.local/share/mimic3`.
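To make the voice key format concrete, here is a minimal sketch that splits a key into its parts. `VoiceKey` and `parse_voice_key` are hypothetical names for this example, not part of the Mimic 3 API, and the sketch assumes that the quality is the text after the last underscore of the voice name.

``` python
from typing import NamedTuple, Optional


class VoiceKey(NamedTuple):
    """Parts of a voice key (hypothetical container for illustration)."""

    language: str
    name: str
    quality: str
    speaker: Optional[str]  # name or number as a string, None if absent


def parse_voice_key(key: str) -> VoiceKey:
    """Split '<language>/<name>_<quality>[#<speaker>]' into its parts."""
    # The optional speaker follows the first '#'
    voice, _, speaker = key.partition("#")

    language, slash, name_quality = voice.partition("/")
    if not slash or not language:
        raise ValueError(f"Missing language in voice key: {key!r}")

    # Assumption: quality is everything after the last underscore
    name, underscore, quality = name_quality.rpartition("_")
    if not underscore or not name or not quality:
        raise ValueError(f"Missing name or quality in voice key: {key!r}")

    return VoiceKey(language, name, quality, speaker or None)


if __name__ == "__main__":
    print(parse_voice_key("en_UK/apope_low"))
    # VoiceKey(language='en_UK', name='apope', quality='low', speaker=None)
    print(parse_voice_key("en_US/cmu-arctic_low#slt"))
    # VoiceKey(language='en_US', name='cmu-arctic', quality='low', speaker='slt')
```

The speaker is kept as a string because it may be either a name from `speakers.txt` or a number.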
## Running
### Command-Line Tools
The `mimic3` command can be used to synthesize audio on the command line:
``` sh
mimic3 --voice 'en_UK/apope_low' 'My hovercraft is full of eels.' > hovercraft_eels.wav
```
See [voice keys](#voice-keys) for how to reference voices and speakers.
See `mimic3 --help` or the [CLI documentation](mimic3-tts/README.md) for more details.
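The same invocation can be scripted. The sketch below wraps the command shown above with Python's `subprocess` module and writes the WAV data from standard output to a file. The function names `build_mimic3_command` and `synthesize_to_file` are hypothetical, and the sketch assumes only what the example above shows: a `--voice` option, the text as a positional argument, and WAV audio on standard output.

``` python
import shutil
import subprocess
from pathlib import Path
from typing import List, Union


def build_mimic3_command(text: str, voice: str) -> List[str]:
    """Return the argument list for the mimic3 invocation shown above."""
    return ["mimic3", "--voice", voice, text]


def synthesize_to_file(text: str, voice: str, wav_path: Union[str, Path]) -> None:
    """Run mimic3 and redirect its standard output into wav_path."""
    if shutil.which("mimic3") is None:
        raise FileNotFoundError("mimic3 was not found on PATH")

    with open(wav_path, "wb") as wav_file:
        # check=True raises CalledProcessError if mimic3 exits with an error
        subprocess.run(
            build_mimic3_command(text, voice), stdout=wav_file, check=True
        )


if __name__ == "__main__":
    synthesize_to_file(
        "My hovercraft is full of eels.",
        "en_UK/apope_low",
        "hovercraft_eels.wav",
    )
```

Passing the arguments as a list avoids shell quoting issues with the text being spoken.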
### Web Server and Client
## SSML
A [subset of SSML](mimic3-tts/README.md#SSML) is supported.
## License
See [license file](LICENSE)


@ -31,7 +31,7 @@ pip3 ${PIP_INSTALL} -e "${this_dir}/opentts-abc"
# Include support for languages besides English
pushd "${this_dir}/mimic3-tts" 2>/dev/null
-pip3 ${PIP_INSTALL} -e '.[de,fr,it,nl,ru,sw]'
+pip3 ${PIP_INSTALL} -e '.[all]'
popd 2>/dev/null
pip3 ${PIP_INSTALL} -e "${this_dir}/mimic3-http"


@ -1,5 +1,9 @@
# Mimic 3 Web Server
A small HTTP web server for the [Mimic 3](https://github.com/MycroftAI/mimic3) text to speech system.
[Available voices](https://github.com/MycroftAI/mimic3-voices)
## Installation


@ -56,7 +56,7 @@ for lang in [
"ru",
"sw",
]:
-    extras[f"gruut[{lang}]"] = [lang]
+    extras[f"mimic3-tts[{lang}]"] = [lang]
# Add "all" tag
for tags in extras.values():


@ -2,6 +2,40 @@
A fast and local neural text to speech system for [Mycroft](https://mycroft.ai/) and the [Mark II](https://mycroft.ai/product/mark-ii/).
[Available voices](https://github.com/MycroftAI/mimic3-voices)
## Command-Line Tools
### mimic3
### mimic3-download
## SSML
A subset of [SSML](https://www.w3.org/TR/speech-synthesis11/) is supported:
* `<speak>` - wrap around SSML text
    * `lang` - set language for document
* `<s>` - sentence (disables automatic sentence breaking)
    * `lang` - set language for sentence
* `<w>` / `<token>` - word (disables automatic tokenization)
* `<voice name="...">` - set voice of inner text
    * `voice` - name or language of voice
        * Name format is `tts:voice` (e.g., "glow-speak:en-us_mary_ann") or `tts:voice#speaker_id` (e.g., "coqui-tts:en_vctk#p228")
        * If one of the supported languages, a preferred voice is used (override with `--preferred-voice <lang> <voice>`)
* `<say-as interpret-as="">` - force interpretation of inner text
    * `interpret-as` one of "spell-out", "date", "number", "time", or "currency"
    * `format` - way to format text depending on `interpret-as`
        * number - one of "cardinal", "ordinal", "digits", "year"
        * date - string with "d" (cardinal day), "o" (ordinal day), "m" (month), or "y" (year)
* `<break time="">` - pause for given amount of time
    * time - seconds ("123s") or milliseconds ("123ms")
* `<sub alias="">` - substitute `alias` for inner text
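As an illustration of the tags listed above, the following sketch assembles a small SSML document with Python's standard `xml.etree.ElementTree` module. The function name `build_example_ssml` and the sample text are made up for this example. The sketch only constructs the markup; how the result is passed to Mimic 3 is not shown here.

``` python
import xml.etree.ElementTree as ET


def build_example_ssml() -> str:
    """Build a short SSML document using tags from the list above."""
    speak = ET.Element("speak")

    # First sentence: force "3" to be read as an ordinal number
    first = ET.SubElement(speak, "s")
    first.text = "You are number "
    say_as = ET.SubElement(
        first, "say-as", {"interpret-as": "number", "format": "ordinal"}
    )
    say_as.text = "3"
    say_as.tail = " in line."

    # Pause for half a second between the sentences
    ET.SubElement(speak, "break", {"time": "500ms"})

    # Second sentence: speak the alias instead of the inner text
    second = ET.SubElement(speak, "s")
    sub = ET.SubElement(second, "sub", {"alias": "text to speech"})
    sub.text = "TTS"
    sub.tail = " is ready."

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_example_ssml())
```

Building the document with an XML library rather than string concatenation keeps the markup well-formed and escapes special characters in the text.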
## Architecture
@ -23,3 +57,8 @@ Voices that use [eSpeak-ng](https://github.com/espeak-ng/espeak-ng) for phonemiz
### Character-based Voices
Voices whose "phonemes" are characters from an alphabet, typically with some punctuation.
## License
See [license file](LICENSE)