docs: watsonx embeddings component (#7621)

* initial-content

* docs: Update IBM watsonx embeddings component

* add-back-text-embedder

* explanation-of-flow

* Apply suggestions from code review

Co-authored-by: KimberlyFields <46325568+KimberlyFields@users.noreply.github.com>

---------

Co-authored-by: KimberlyFields <46325568+KimberlyFields@users.noreply.github.com>
This commit is contained in:
Mendon Kissling 2025-04-17 13:20:38 -04:00 committed by GitHub
commit b69175086e
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
2 changed files with 45 additions and 1 deletions

@ -239,6 +239,51 @@ There are two embeddings models in this flow that you can replace with **Hugging
4. In the **Hugging Face** components, set the **Inference Endpoint** field to the URL of your local inference model. The **API Key** field is not required for local inference.
5. Run the flow. The local inference models generate embeddings for the input text.
## IBM watsonx embeddings
This component generates embeddings using [IBM watsonx.ai](https://www.ibm.com/watsonx) foundation models.
To use **IBM watsonx.ai** embeddings in a flow, replace an existing embeddings component with the IBM watsonx.ai component.
An example document processing flow looks like the following:
![IBM watsonx embeddings model loading a chroma-db with split text](/img/component-watsonx-embeddings-chroma.png)
This flow loads a PDF file from local storage and splits the text into chunks.
The **IBM watsonx** embeddings component converts the text chunks into embeddings, which are then stored in a Chroma DB vector store.
The values for **API endpoint**, **Project ID**, **API key**, and **Model Name** are found in your IBM watsonx.ai deployment.
For more information, see the [LangChain documentation](https://python.langchain.com/docs/integrations/text_embedding/ibm_watsonx/).
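The flow above can be sketched in code. The following is a minimal, illustrative Python sketch, not the component's actual implementation: the `split_text` function and the `FakeWatsonxEmbedder` stub are hypothetical stand-ins for the flow's split-text component and the watsonx.ai API call, so the example runs offline.

```python
import hashlib

def split_text(text: str, chunk_size: int = 100) -> list[str]:
    # Naive fixed-size splitter standing in for the flow's split-text component.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

class FakeWatsonxEmbedder:
    """Hypothetical stand-in for the IBM watsonx embeddings component.

    The real component calls the watsonx.ai API using your API endpoint,
    Project ID, API key, and model name; here we derive deterministic
    pseudo-embeddings from a hash so the sketch needs no credentials.
    """

    def __init__(self, dimensions: int = 384):
        self.dimensions = dimensions

    def embed_documents(self, chunks: list[str]) -> list[list[float]]:
        vectors = []
        for chunk in chunks:
            digest = hashlib.sha256(chunk.encode()).digest()
            # Repeat the 32 digest bytes to fill the vector dimension.
            raw = (digest * (self.dimensions // len(digest) + 1))[: self.dimensions]
            vectors.append([b / 255 for b in raw])
        return vectors

text = "Example PDF text loaded from local storage." * 10
chunks = split_text(text, chunk_size=100)
embeddings = FakeWatsonxEmbedder(dimensions=384).embed_documents(chunks)
# Each chunk maps to one vector, ready to store in a vector store such as Chroma DB.
```

In the real flow, the vectors would be written to the Chroma DB vector store component rather than held in a list.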
### Default models
The component supports several default models with the following vector dimensions:
- `sentence-transformers/all-minilm-l12-v2`: 384-dimensional embeddings
- `ibm/slate-125m-english-rtrvr-v2`: 768-dimensional embeddings
- `ibm/slate-30m-english-rtrvr-v2`: 768-dimensional embeddings
- `intfloat/multilingual-e5-large`: 1024-dimensional embeddings
The component automatically fetches and updates the list of available models from your watsonx.ai instance when you provide your API endpoint and credentials.
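Because the default models produce vectors of different sizes, the vector store collection must be created with a matching dimensionality. The dimensions below are taken from the list above; the `embedding_dimension` helper is a hypothetical illustration, not part of the component's API.

```python
# Vector dimensions for the default models listed above.
DEFAULT_MODEL_DIMENSIONS = {
    "sentence-transformers/all-minilm-l12-v2": 384,
    "ibm/slate-125m-english-rtrvr-v2": 768,
    "ibm/slate-30m-english-rtrvr-v2": 768,
    "intfloat/multilingual-e5-large": 1024,
}

def embedding_dimension(model_name: str) -> int:
    """Hypothetical helper: look up the vector size for a default model."""
    try:
        return DEFAULT_MODEL_DIMENSIONS[model_name]
    except KeyError:
        raise ValueError(f"Unknown model: {model_name!r}") from None

print(embedding_dimension("intfloat/multilingual-e5-large"))  # 1024
```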
### Inputs
| Name | Display Name | Info |
|------|--------------|------|
| url | watsonx API Endpoint | The base URL of the API.|
| project_id | watsonx project id | The project ID for your watsonx.ai instance. |
| api_key | API Key | The API Key to use for the model.|
| model_name | Model Name | The name of the embedding model to use.|
| truncate_input_tokens | Truncate Input Tokens | The maximum number of tokens to process. Default: `200`. |
| input_text | Include the original text in the output | Determines if the original text is included in the output. Default: `True`. |
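The inputs above can be pictured as a simple configuration object. This is an illustrative sketch, not the component's real configuration class: the field names mirror the table's Name column, the endpoint value is only an example, and the whitespace-based `truncate` function is a hypothetical stand-in for the model's tokenizer.

```python
from dataclasses import dataclass

@dataclass
class WatsonxEmbeddingsConfig:
    """Illustrative container mirroring the inputs table above."""
    url: str
    project_id: str
    api_key: str
    model_name: str
    truncate_input_tokens: int = 200  # default from the table
    input_text: bool = True           # include original text in the output

def truncate(text: str, max_tokens: int) -> str:
    # Naive whitespace tokenization standing in for the model's tokenizer.
    return " ".join(text.split()[:max_tokens])

config = WatsonxEmbeddingsConfig(
    url="https://us-south.ml.cloud.ibm.com",  # example endpoint only
    project_id="your-project-id",
    api_key="your-api-key",
    model_name="ibm/slate-125m-english-rtrvr-v2",
)
short = truncate("one two three four five", max_tokens=3)
print(short)  # prints: one two three
```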
### Outputs
| Name | Display Name | Info |
|------|--------------|------|
| embeddings | Embeddings | An instance for generating embeddings using watsonx.ai. |
## LM Studio Embeddings
This component generates embeddings using [LM Studio](https://lmstudio.ai/docs) models.
@ -258,7 +303,6 @@ This component generates embeddings using [LM Studio](https://lmstudio.ai/docs)
|------|--------------|------|
| embeddings | Embeddings | The generated embeddings. |
## MistralAI
This component generates embeddings using [MistralAI](https://docs.mistral.ai/) models.

Binary file not shown (286 KiB).