Updating READMEs

2022-03-31 13:17:25 -04:00
commit 33ea9c086d
parent d72bba83c9
5 changed files with 189 additions and 2 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,144 @@
+# Mimic 3
+
+A fast and local neural text to speech system for [Mycroft](https://mycroft.ai/) and the [Mark II](https://mycroft.ai/product/mark-ii/).
+
+[Available voices](https://github.com/MycroftAI/mimic3-voices)
+
+
+## Dependencies
+
+Mimic 3 requires:
+
+* Python 3.7 or higher
+* The [Onnx runtime](https://onnxruntime.ai/)
+* [gruut](https://github.com/rhasspy/gruut) or [eSpeak-ng](https://github.com/espeak-ng/espeak-ng) (depending on the voice)
+
+
+## Installation
+
+
+### eSpeak
+
+Some voices depend on [eSpeak-ng](https://github.com/espeak-ng/espeak-ng), specifically `libespeak-ng.so`. For those voices, make sure that libespeak-ng is installed with:
+
+``` sh
+sudo apt-get install libespeak-ng1
+```
+
+### Mycroft TTS Plugin
+
+Install the plugin:
+
+``` sh
+mycroft-pip install plugin-tts-mimic3[all]
+```
+
+Enable the plugin in your [mycroft.conf](https://mycroft-ai.gitbook.io/docs/using-mycroft-ai/customizations/mycroft-conf) file:
+
+``` sh
+mycroft-config set tts.module mimic3_tts_plug
+```
+
+See the [plugin's documentation](https://github.com/MycroftAI/plugin-tts-mimic3) for more options.
+
+
+### Using pip
+
+Install the command-line tool:
+
+``` sh
+pip install mimic3[all]
+```
+
+Once installed, the following commands will be available:
+* `mimic3`
+* `mimic3-download`
+
+Install the HTTP web server:
+
+``` sh
+pip install mimic3-http[all]
+```
+
+Once installed, the following commands will be available:
+* `mimic3-server`
+* `mimic3-client`
+
+Language support can be selectively installed by replacing `all` with:
+
+* `de` - German
+* `es` - Spanish
+* `fr` - French
+* `it` - Italian
+* `nl` - Dutch
+* `ru` - Russian
+* `sw` - Kiswahili
+
+Excluding `[..]` entirely will install support for English only.
+
+
+### From Source
+
+Clone the repository:
+
+``` sh
+git clone https://github.com/MycroftAI/mimic3.git
+```
+
+Run the install script:
+
+``` sh
+cd mimic3/
+./install.sh
+```
+
+A virtual environment will be created in `mimic3/.venv` and each of the Python modules will be installed in editable mode (`pip install -e`).
+
+Once installed, the following commands will be available in `.venv/bin`:
+* `mimic3`
+* `mimic3-server`
+* `mimic3-client`
+* `mimic3-download`
+
+
+## Voice Keys
+
+Mimic 3 references voices with the format:
+
+* `<language>/<name>_<quality>` for single speaker voices, and
+* `<language>/<name>_<quality>#<speaker>` for multi-speaker voices 
+    * `<speaker>` can be a name or number starting at 0
+    * Speaker names come from a voice's `speakers.txt` file
+    
+For example, the default [Alan Pope](https://popey.me/) voice key is `en_UK/apope_low`. The [CMU Arctic voice](https://github.com/MycroftAI/mimic3-voices/tree/master/voices/en_US/cmu-arctic_low) contains multiple speakers, with a commonly used voice being `en_US/cmu-arctic_low#slt`.
+
+Voices are automatically downloaded from [GitHub](https://github.com/MycroftAI/mimic3-voices) and stored in `${HOME}/.local/share/mimic3`.
+
+
+## Running
+
+
+### Command-Line Tools
+
+The `mimic3` command can be used to synthesize audio on the command line:
+
+``` sh
+mimic3 --voice 'en_UK/apope_low' 'My hovercraft is full of eels.' > hovercraft_eels.wav
+```
+
+See [voice keys](#voice-keys) for how to reference voices and speakers.
+
+See `mimic3 --help` or the [CLI documentation](mimic3-tts/README.md) for more details.
+
+
+### Web Server and Client
+
+
+## SSML
+
+A [subset of SSML](mimic3-tts/README.md#SSML) is supported.
+
+
+## License
+
+See [license file](LICENSE)
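
Note on the "Voice Keys" section added to README.md above: the two key formats can be made concrete with a small parser. This is a minimal sketch, not code from Mimic 3. `VoiceKey` and `parse_voice_key` are hypothetical names, and the sketch assumes that the quality is whatever follows the last underscore after the slash, so that names such as `cmu-arctic` or names containing underscores still parse.

```python
from typing import NamedTuple, Optional


class VoiceKey(NamedTuple):
    """Parts of a voice key (hypothetical helper type, not part of Mimic 3)."""

    language: str
    name: str
    quality: str
    speaker: Optional[str]  # name or number as text; None for single-speaker keys


def parse_voice_key(key: str) -> VoiceKey:
    """Split '<language>/<name>_<quality>[#<speaker>]' into its parts."""
    # Everything after '#' is the optional speaker name or number.
    voice, _, speaker = key.partition("#")

    # The language comes before the slash and may itself contain '_' (en_UK).
    language, slash, rest = voice.partition("/")
    if not slash or not language:
        raise ValueError(f"Missing '<language>/' prefix in voice key: {key!r}")

    # Assumption: the quality is the text after the LAST underscore.
    name, underscore, quality = rest.rpartition("_")
    if not underscore or not name or not quality:
        raise ValueError(f"Expected '<name>_<quality>' in voice key: {key!r}")

    return VoiceKey(language, name, quality, speaker or None)


if __name__ == "__main__":
    print(parse_voice_key("en_UK/apope_low"))
    print(parse_voice_key("en_US/cmu-arctic_low#slt"))
```

Splitting on `/` before looking for the underscore matters because language codes such as `en_UK` contain an underscore themselves.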
--- a/install.sh
+++ b/install.sh
@@ -31,7 +31,7 @@ pip3 ${PIP_INSTALL} -e "${this_dir}/opentts-abc"

 # Include support for languages besides English
 pushd "${this_dir}/mimic3-tts" 2>/dev/null
-pip3 ${PIP_INSTALL} -e '.[de,fr,it,nl,ru,sw]'
+pip3 ${PIP_INSTALL} -e '.[all]'
 popd 2>/dev/null

 pip3 ${PIP_INSTALL} -e "${this_dir}/mimic3-http"
--- a/mimic3-http/README.md
+++ b/mimic3-http/README.md
@@ -1,5 +1,9 @@
 # Mimic 3 Web Server

+A small HTTP web server for the [Mimic 3](https://github.com/MycroftAI/mimic3) text to speech system.
+
+[Available voices](https://github.com/MycroftAI/mimic3-voices)
+

 ## Installation

--- a/mimic3-http/setup.py
+++ b/mimic3-http/setup.py
@@ -56,7 +56,7 @@ for lang in [
    "ru",
    "sw",
 ]:
-    extras[f"gruut[{lang}]"] = [lang]
+    extras[f"mimic3-tts[{lang}]"] = [lang]

 # Add "all" tag
 for tags in extras.values():
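
Note on the setup.py hunk above: the loop maps each requirement string to the extras tags that should pull it in, and the change makes `mimic3-http[de]` request `mimic3-tts[de]` rather than `gruut[de]`. The rest of setup.py is not shown in this diff, so the following is only a sketch of how such a mapping could be turned into an `extras_require` dictionary. `LANGUAGES`, `build_extras_require`, and the final inversion step are assumptions; only the two loops mirror the lines in the hunk.

```python
from collections import defaultdict

# Assumption: the hunk shows only "ru" and "sw"; the remaining codes are
# taken from the language list in the top-level README.
LANGUAGES = ["de", "es", "fr", "it", "nl", "ru", "sw"]


def build_extras_require(package: str = "mimic3-tts") -> dict:
    """Build a setuptools-style extras_require mapping (hypothetical helper)."""
    # requirement string -> tags that should install it (as in the hunk)
    extras = {}
    for lang in LANGUAGES:
        extras[f"{package}[{lang}]"] = [lang]

    # Add "all" tag (as in the hunk)
    for tags in extras.values():
        tags.append("all")

    # Assumed step: invert to tag -> list of requirement strings.
    extras_require = defaultdict(list)
    for requirement, tags in extras.items():
        for tag in tags:
            extras_require[tag].append(requirement)

    return dict(extras_require)


if __name__ == "__main__":
    for tag, requirements in build_extras_require().items():
        print(tag, requirements)
```

Under these assumptions, the `de` tag resolves to `mimic3-tts[de]` only, while `all` collects every per-language requirement.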
--- a/mimic3-tts/README.md
+++ b/mimic3-tts/README.md
@@ -2,6 +2,40 @@

 A fast and local neural text to speech system for [Mycroft](https://mycroft.ai/) and the [Mark II](https://mycroft.ai/product/mark-ii/).

+[Available voices](https://github.com/MycroftAI/mimic3-voices)
+
+
+## Command-Line Tools
+
+
+### mimic3
+
+
+### mimic3-download
+
+
+## SSML
+
+A subset of [SSML](https://www.w3.org/TR/speech-synthesis11/) is supported:
+
+* `<speak>` - wrap around SSML text
+    * `lang` - set language for document
+* `<s>` - sentence (disables automatic sentence breaking)
+    * `lang` - set language for sentence
+* `<w>` / `<token>` - word (disables automatic tokenization)
+* `<voice name="...">` - set voice of inner text
+    * `name` - name or language of voice
+        * Name format is `tts:voice` (e.g., "glow-speak:en-us_mary_ann") or `tts:voice#speaker_id` (e.g., "coqui-tts:en_vctk#p228")
+        * If one of the supported languages, a preferred voice is used (override with `--preferred-voice <lang> <voice>`)
+* `<say-as interpret-as="">` - force interpretation of inner text
+    * `interpret-as` - one of "spell-out", "date", "number", "time", or "currency"
+    * `format` - way to format text depending on `interpret-as`
+        * number - one of "cardinal", "ordinal", "digits", "year"
+        * date - string with "d" (cardinal day), "o" (ordinal day), "m" (month), or "y" (year)
+* `<break time="">` - Pause for given amount of time
+    * `time` - seconds ("123s") or milliseconds ("123ms")
+* `<sub alias="">` - substitute `alias` for inner text
+

 ## Architecture

@@ -23,3 +57,8 @@ Voices that use [eSpeak-ng](https://github.com/espeak-ng/espeak-ng) for phonemiz
 ### Character-based Voices

 Voices whose "phonemes" are characters from an alphabet, typically with some punctuation.
+
+
+## License
+
+See [license file](LICENSE)
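
Note on the SSML list added to mimic3-tts/README.md above: a short document that uses several of the listed elements may help readers see how they combine. The sketch below builds one with Python's standard `xml.etree.ElementTree` and prints it. The sentence text is made up for illustration, and how the resulting string is handed to Mimic 3 is not covered by this diff, so that step is left out.

```python
import xml.etree.ElementTree as ET

# <speak> wraps the whole document.
speak = ET.Element("speak")

# <s> marks one sentence explicitly (no automatic sentence breaking).
first = ET.SubElement(speak, "s")
first.text = "This is the "

# <say-as> forces the inner text to be read as an ordinal number.
ordinal = ET.SubElement(first, "say-as", {"interpret-as": "number", "format": "ordinal"})
ordinal.text = "3"
ordinal.tail = " version."

# <break> pauses for the given time, here in milliseconds.
ET.SubElement(speak, "break", {"time": "500ms"})

second = ET.SubElement(speak, "s")

# <sub> speaks the alias in place of the inner text.
sub = ET.SubElement(second, "sub", {"alias": "text to speech"})
sub.text = "TTS"
sub.tail = " runs locally."

# Serialize to a string; the round-trip parse checks it is well-formed XML.
ssml = ET.tostring(speak, encoding="unicode")
ET.fromstring(ssml)
print(ssml)
```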