From 33ea9c086dc77e8af3c1aa59bdfd7d5356fafa14 Mon Sep 17 00:00:00 2001
From: Michael Hansen
Date: Thu, 31 Mar 2022 13:17:25 -0400
Subject: [PATCH] Updating READMEs

---
 README.md             | 144 ++++++++++++++++++++++++++++++++++++++++++
 install.sh            |   2 +-
 mimic3-http/README.md |   4 ++
 mimic3-http/setup.py  |   2 +-
 mimic3-tts/README.md  |  39 ++++++++++++
 5 files changed, 189 insertions(+), 2 deletions(-)
 create mode 100644 README.md

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..9c98f5c
--- /dev/null
+++ b/README.md
@@ -0,0 +1,144 @@
+# Mimic 3
+
+A fast and local neural text to speech system for [Mycroft](https://mycroft.ai/) and the [Mark II](https://mycroft.ai/product/mark-ii/).
+
+[Available voices](https://github.com/MycroftAI/mimic3-voices)
+
+
+## Dependencies
+
+Mimic 3 requires:
+
+* Python 3.7 or higher
+* The [Onnx runtime](https://onnxruntime.ai/)
+* [gruut](https://github.com/rhasspy/gruut) or [eSpeak-ng](https://github.com/espeak-ng/espeak-ng) (depending on the voice)
+
+
+## Installation
+
+
+### eSpeak
+
+Some voices depend on [eSpeak-ng](https://github.com/espeak-ng/espeak-ng), specifically `libespeak-ng.so`. For those voices, make sure that libespeak-ng is installed with:
+
+``` sh
+sudo apt-get install libespeak-ng1
+```
+
+### Mycroft TTS Plugin
+
+Install the plugin:
+
+``` sh
+mycroft-pip install plugin-tts-mimic3[all]
+```
+
+Enable the plugin in your [mycroft.conf](https://mycroft-ai.gitbook.io/docs/using-mycroft-ai/customizations/mycroft-conf) file:
+
+``` sh
+mycroft-conf set tts.module mimic3_tts_plug
+```
+
+See the [plugin's documentation](https://github.com/MycroftAI/plugin-tts-mimic3) for more options.
+
+
+### Using pip
+
+Install the command-line tool:
+
+``` sh
+pip install mimic3[all]
+```
+
+Once installed, the following commands will be available:
+ * `mimic3`
+ * `mimic3-download`
+
+Install the HTTP web server:
+
+``` sh
+pip install mimic3-http[all]
+```
+
+Once installed, the following commands will be available:
+ * `mimic3-server`
+ * `mimic3-client`
+
+Language support can be selectively installed by replacing `all` with:
+
+* `de` - German
+* `es` - Spanish
+* `fr` - French
+* `it` - Italian
+* `nl` - Dutch
+* `ru` - Russian
+* `sw` - Kiswahili
+
+Excluding `[..]` entirely will install support for English only.
+
+
+### From Source
+
+Clone the repository:
+
+``` sh
+git clone https://github.com/MycroftAI/mimic3.git
+```
+
+Run the install script:
+
+``` sh
+cd mimic3/
+./install.sh
+```
+
+A virtual environment will be created in `mimic3/.venv` and each of the Python modules will be installed in editable mode (`pip install -e`).
+
+Once installed, the following commands will be available in `.venv/bin`:
+ * `mimic3`
+ * `mimic3-server`
+ * `mimic3-client`
+ * `mimic3-download`
+
+
+## Voice Keys
+
+Mimic 3 references voices with the format:
+
+* `/_` for single speaker voices, and
+* `/_#` for multi-speaker voices
+    * `` can be a name or number starting at 0
+    * Speaker names come from a voice's `speakers.txt` file
+
+For example, the default [Alan Pope](https://popey.me/) voice key is `en_UK/apope_low`. The [CMU Arctic voice](https://github.com/MycroftAI/mimic3-voices/tree/master/voices/en_US/cmu-arctic_low) contains multiple speakers, with a commonly used voice being `en_US/cmu-arctic_low#slt`.
+
+Voices are automatically downloaded from [GitHub](https://github.com/MycroftAI/mimic3-voices) and stored in `${HOME}/.local/share/mimic3`
+
+
+## Running
+
+
+### Command-Line Tools
+
+The `mimic3` command can be used to synthesize audio on the command line:
+
+``` sh
+mimic3 --voice 'en_UK/apope_low' 'My hovercraft is full of eels.' > hovercraft_eels.wav
+```
+
+See [voice keys](#voice-keys) for how to reference voices and speakers.
+
+See `mimic3 --help` or the [CLI documentation](mimic3-tts/README.md) for more details.
+
+
+### Web Server and Client
+
+
+## SSML
+
+A [subset of SSML](mimic3-tts/README.md#SSML) is supported.
+
+
+## License
+
+See [license file](LICENSE)
diff --git a/install.sh b/install.sh
index 0ecd490..7922237 100755
--- a/install.sh
+++ b/install.sh
@@ -31,7 +31,7 @@ pip3 ${PIP_INSTALL} -e "${this_dir}/opentts-abc"
 
 # Include support for languages besides English
 pushd "${this_dir}/mimic3-tts" 2>/dev/null
-pip3 ${PIP_INSTALL} -e '.[de,fr,it,nl,ru,sw]'
+pip3 ${PIP_INSTALL} -e '.[all]'
 popd 2>/dev/null
 
 pip3 ${PIP_INSTALL} -e "${this_dir}/mimic3-http"
diff --git a/mimic3-http/README.md b/mimic3-http/README.md
index c7c6d49..60816fc 100644
--- a/mimic3-http/README.md
+++ b/mimic3-http/README.md
@@ -1,5 +1,9 @@
 # Mimic 3 Web Server
 
+A small HTTP web server for the [Mimic 3](https://github.com/MycroftAI/mimic3) text to speech system.
+
+[Available voices](https://github.com/MycroftAI/mimic3-voices)
+
 
 ## Installation
 
diff --git a/mimic3-http/setup.py b/mimic3-http/setup.py
index 38dc88b..0c1782f 100644
--- a/mimic3-http/setup.py
+++ b/mimic3-http/setup.py
@@ -56,7 +56,7 @@ for lang in [
     "ru",
     "sw",
 ]:
-    extras[f"gruut[{lang}]"] = [lang]
+    extras[f"mimic3-tts[{lang}]"] = [lang]
 
 # Add "all" tag
 for tags in extras.values():
diff --git a/mimic3-tts/README.md b/mimic3-tts/README.md
index e0431aa..b8877bb 100644
--- a/mimic3-tts/README.md
+++ b/mimic3-tts/README.md
@@ -2,6 +2,40 @@
 
 A fast and local neural text to speech system for [Mycroft](https://mycroft.ai/) and the [Mark II](https://mycroft.ai/product/mark-ii/).
 
+[Available voices](https://github.com/MycroftAI/mimic3-voices)
+
+
+## Command-Line Tools
+
+
+### mimic3
+
+
+### mimic3-download
+
+
+## SSML
+
+A subset of [SSML](https://www.w3.org/TR/speech-synthesis11/) is supported:
+
+* `` - wrap around SSML text
+    * `lang` - set language for document
+* `` - sentence (disables automatic sentence breaking)
+    * `lang` - set language for sentence
+* `` / `` - word (disables automatic tokenization)
+* `` - set voice of inner text
+    * `voice` - name or language of voice
+        * Name format is `tts:voice` (e.g., "glow-speak:en-us_mary_ann") or `tts:voice#speaker_id` (e.g., "coqui-tts:en_vctk#p228")
+        * If one of the supported languages, a preferred voice is used (override with `--preferred-voice `)
+* `` - force interpretation of inner text
+    * `interpret-as` one of "spell-out", "date", "number", "time", or "currency"
+    * `format` - way to format text depending on `interpret-as`
+        * number - one of "cardinal", "ordinal", "digits", "year"
+        * date - string with "d" (cardinal day), "o" (ordinal day), "m" (month), or "y" (year)
+* `` - Pause for given amount of time
+    * time - seconds ("123s") or milliseconds ("123ms")
+* `` - substitute `alias` for inner text
+
 
 ## Architecture
 
@@ -23,3 +57,8 @@ Voices that use [eSpeak-ng](https://github.com/espeak-ng/espeak-ng) for phonemiz
 ### Character-based Voices
 
 Voices whose "phonemes" are characters from an alphabet, typically with some punctuation.
+
+
+## License
+
+See [license file](LICENSE)