Updating READMEs
parent d72bba83c9
commit 33ea9c086d
5 changed files with 189 additions and 2 deletions

README.md (new file, 144 lines)

@@ -0,0 +1,144 @@
# Mimic 3

A fast and local neural text to speech system for [Mycroft](https://mycroft.ai/) and the [Mark II](https://mycroft.ai/product/mark-ii/).

[Available voices](https://github.com/MycroftAI/mimic3-voices)


## Dependencies

Mimic 3 requires:

* Python 3.7 or higher
* The [ONNX Runtime](https://onnxruntime.ai/)
* [gruut](https://github.com/rhasspy/gruut) or [eSpeak-ng](https://github.com/espeak-ng/espeak-ng) (depending on the voice)


## Installation


### eSpeak

Some voices depend on [eSpeak-ng](https://github.com/espeak-ng/espeak-ng), specifically `libespeak-ng.so`. For those voices, make sure that libespeak-ng is installed with:

``` sh
sudo apt-get install libespeak-ng1
```

### Mycroft TTS Plugin

Install the plugin:

``` sh
mycroft-pip install plugin-tts-mimic3[all]
```

Enable the plugin in your [mycroft.conf](https://mycroft-ai.gitbook.io/docs/using-mycroft-ai/customizations/mycroft-conf) file:

``` sh
mycroft-conf set tts.module mimic3_tts_plug
```

See the [plugin's documentation](https://github.com/MycroftAI/plugin-tts-mimic3) for more options.


### Using pip

Install the command-line tool:

``` sh
pip install mimic3[all]
```

Once installed, the following commands will be available:

* `mimic3`
* `mimic3-download`

Install the HTTP web server:

``` sh
pip install mimic3-http[all]
```

Once installed, the following commands will be available:

* `mimic3-server`
* `mimic3-client`

Language support can be selectively installed by replacing `all` with:

* `de` - German
* `es` - Spanish
* `fr` - French
* `it` - Italian
* `nl` - Dutch
* `ru` - Russian
* `sw` - Kiswahili

Omitting the `[..]` suffix entirely installs support for English only.


### From Source

Clone the repository:

``` sh
git clone https://github.com/MycroftAI/mimic3.git
```

Run the install script:

``` sh
cd mimic3/
./install.sh
```

A virtual environment will be created in `mimic3/.venv` and each of the Python modules will be installed in editable mode (`pip install -e`).

Once installed, the following commands will be available in `.venv/bin`:

* `mimic3`
* `mimic3-server`
* `mimic3-client`
* `mimic3-download`


## Voice Keys

Mimic 3 references voices with the format:

* `<language>/<name>_<quality>` for single speaker voices, and
* `<language>/<name>_<quality>#<speaker>` for multi-speaker voices
    * `<speaker>` can be a name or number starting at 0
    * Speaker names come from a voice's `speakers.txt` file

For example, the default [Alan Pope](https://popey.me/) voice key is `en_UK/apope_low`. The [CMU Arctic voice](https://github.com/MycroftAI/mimic3-voices/tree/master/voices/en_US/cmu-arctic_low) contains multiple speakers; a commonly used one is `en_US/cmu-arctic_low#slt`.

Voices are automatically downloaded from [GitHub](https://github.com/MycroftAI/mimic3-voices) and stored in `${HOME}/.local/share/mimic3`.
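
The voice key format above can be illustrated with a small parser. This is only a sketch and not part of Mimic 3: the `VoiceKey` type and `parse_voice_key` function are hypothetical names, and it assumes that `<quality>` never contains an underscore, so the last `_` separates name from quality.

``` python
from typing import NamedTuple, Optional


class VoiceKey(NamedTuple):
    """Parts of a voice key (hypothetical helper type, not a Mimic 3 API)."""

    language: str
    name: str
    quality: str
    speaker: Optional[str]  # None for single speaker voices


def parse_voice_key(key: str) -> VoiceKey:
    """Split '<language>/<name>_<quality>[#<speaker>]' into its parts."""
    # Everything after the first '#' is the speaker name or number
    voice, _, speaker = key.partition("#")

    language, slash, name_quality = voice.partition("/")
    if not slash or not language:
        raise ValueError(f"Missing '<language>/' in voice key: {key!r}")

    # Assumption: quality has no underscore, so split on the last '_'
    name, underscore, quality = name_quality.rpartition("_")
    if not underscore or not name or not quality:
        raise ValueError(f"Missing '<name>_<quality>' in voice key: {key!r}")

    return VoiceKey(language, name, quality, speaker or None)


if __name__ == "__main__":
    print(parse_voice_key("en_UK/apope_low"))
    print(parse_voice_key("en_US/cmu-arctic_low#slt"))
```

Whether a speaker given as a number should be converted to `int` is left open here; the sketch keeps it as a string.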


## Running


### Command-Line Tools

The `mimic3` command can be used to synthesize audio on the command line:

``` sh
mimic3 --voice 'en_UK/apope_low' 'My hovercraft is full of eels.' > hovercraft_eels.wav
```
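
The same invocation can be driven from Python. The following is a minimal sketch that only relies on what the shell example shows: the `--voice` option, the text as a positional argument, and WAV data written to standard output. The helper names `build_mimic3_command` and `synthesize_to_wav` are hypothetical.

``` python
import subprocess
from pathlib import Path
from typing import List


def build_mimic3_command(text: str, voice: str) -> List[str]:
    """Return the argument list equivalent to the shell example."""
    return ["mimic3", "--voice", voice, text]


def synthesize_to_wav(text: str, voice: str, wav_path: Path) -> None:
    """Run mimic3 and redirect its standard output into wav_path."""
    with open(wav_path, "wb") as wav_file:
        # check=True raises CalledProcessError if mimic3 exits with an error
        subprocess.run(build_mimic3_command(text, voice), stdout=wav_file, check=True)


if __name__ == "__main__":
    synthesize_to_wav(
        "My hovercraft is full of eels.",
        "en_UK/apope_low",
        Path("hovercraft_eels.wav"),
    )
```

Passing the arguments as a list avoids shell quoting issues with the text.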

See [voice keys](#voice-keys) for how to reference voices and speakers.

See `mimic3 --help` or the [CLI documentation](mimic3-tts/README.md) for more details.


### Web Server and Client


## SSML

A [subset of SSML](mimic3-tts/README.md#SSML) is supported.


## License

See [license file](LICENSE)

@@ -31,7 +31,7 @@ pip3 ${PIP_INSTALL} -e "${this_dir}/opentts-abc"

 # Include support for languages besides English
 pushd "${this_dir}/mimic3-tts" 2>/dev/null
-pip3 ${PIP_INSTALL} -e '.[de,fr,it,nl,ru,sw]'
+pip3 ${PIP_INSTALL} -e '.[all]'
 popd 2>/dev/null

 pip3 ${PIP_INSTALL} -e "${this_dir}/mimic3-http"

@@ -1,5 +1,9 @@
# Mimic 3 Web Server

A small HTTP web server for the [Mimic 3](https://github.com/MycroftAI/mimic3) text to speech system.

[Available voices](https://github.com/MycroftAI/mimic3-voices)


## Installation

@@ -56,7 +56,7 @@ for lang in [
     "ru",
     "sw",
 ]:
-    extras[f"gruut[{lang}]"] = [lang]
+    extras[f"mimic3-tts[{lang}]"] = [lang]

 # Add "all" tag
 for tags in extras.values():

@@ -2,6 +2,40 @@

A fast and local neural text to speech system for [Mycroft](https://mycroft.ai/) and the [Mark II](https://mycroft.ai/product/mark-ii/).

[Available voices](https://github.com/MycroftAI/mimic3-voices)


## Command-Line Tools


### mimic3


### mimic3-download


## SSML

A subset of [SSML](https://www.w3.org/TR/speech-synthesis11/) is supported:

* `<speak>` - wrap around SSML text
    * `lang` - set language for document
* `<s>` - sentence (disables automatic sentence breaking)
    * `lang` - set language for sentence
* `<w>` / `<token>` - word (disables automatic tokenization)
* `<voice name="...">` - set voice of inner text
    * `name` - name or language of voice
        * Name format is `tts:voice` (e.g., "glow-speak:en-us_mary_ann") or `tts:voice#speaker_id` (e.g., "coqui-tts:en_vctk#p228")
        * If one of the supported languages, a preferred voice is used (override with `--preferred-voice <lang> <voice>`)
* `<say-as interpret-as="">` - force interpretation of inner text
    * `interpret-as` one of "spell-out", "date", "number", "time", or "currency"
    * `format` - way to format text depending on `interpret-as`
        * number - one of "cardinal", "ordinal", "digits", "year"
        * date - string with "d" (cardinal day), "o" (ordinal day), "m" (month), or "y" (year)
* `<break time="">` - Pause for given amount of time
    * time - seconds ("123s") or milliseconds ("123ms")
* `<sub alias="">` - substitute `alias` for inner text
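
As an illustration of the elements listed above, the sketch below builds a small SSML document and checks that it is well-formed XML using Python's standard library. The sample sentences and the `break_seconds` helper are assumptions made for this example; it does not show how Mimic 3 itself parses SSML or how the document is passed to it.

``` python
import xml.etree.ElementTree as ET

# Sample document using only elements from the list above (contents are made up)
SSML = """\
<speak>
  <s>This is the <say-as interpret-as="number" format="ordinal">3</say-as> sentence.</s>
  <break time="500ms" />
  <s>The code is <say-as interpret-as="spell-out">abc</say-as>.</s>
  <s><sub alias="text to speech">TTS</sub> is fun.</s>
</speak>
"""


def break_seconds(value: str) -> float:
    """Convert a <break time=""> value such as "123s" or "123ms" to seconds."""
    # Check "ms" first, because it also ends with "s"
    if value.endswith("ms"):
        return float(value[:-2]) / 1000.0
    if value.endswith("s"):
        return float(value[:-1])
    raise ValueError(f"Expected a value ending in 's' or 'ms': {value!r}")


# Raises xml.etree.ElementTree.ParseError if the document is not well-formed
root = ET.fromstring(SSML)

if __name__ == "__main__":
    for element in root.iter("break"):
        print("pause of", break_seconds(element.get("time")), "seconds")
```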


## Architecture

@@ -23,3 +57,8 @@ Voices that use [eSpeak-ng](https://github.com/espeak-ng/espeak-ng) for phonemiz
### Character-based Voices

Voices whose "phonemes" are characters from an alphabet, typically with some punctuation.


## License

See [license file](LICENSE)