Phil Nash 7a01cf7e5b

feat: adds model selection to Azure OpenAI Embeddings component (#3882)

Right now the Azure OpenAI Embeddings component doesn't allow you to pick the embedding model to use. The same models are available that OpenAI make available, so I used the constant that lists them to pull from.

2024-09-26 04:29:04 -07:00


title	sidebar_position	slug
Embedding Models	6	/components-embedding-models

Embedding Models

Embedding models convert text into numerical vectors. These vectors can be used for tasks such as similarity search, clustering, and classification.
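
As a rough illustration of similarity search over such vectors, the sketch below ranks a few documents against a query by cosine similarity. The three-dimensional vectors are hand-written toy values, and `most_similar` is a hypothetical helper, not part of Langflow; real embeddings come from one of the components on this page and typically have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query_vector, documents):
    """Return the name of the document whose vector is closest to the query."""
    return max(
        documents,
        key=lambda name: cosine_similarity(query_vector, documents[name]),
    )


# Toy vectors standing in for model output (assumption for illustration only).
documents = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.3, 0.1],
    "tax law": [0.0, 0.2, 0.9],
}
query = [0.85, 0.15, 0.05]
best_match = most_similar(query, documents)
```

With these values the query vector points in nearly the same direction as the "cats" vector, so that document ranks first.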

AI/ML

This component generates embeddings using the AI/ML API.

Parameters

Inputs

Name	Type	Description
model_name	String	The name of the AI/ML embedding model to use
aiml_api_key	SecretString	API key for authenticating with the AI/ML service

Outputs

Name	Type	Description
embeddings	Embeddings	An instance of AIMLEmbeddingsImpl for generating embeddings

Amazon Bedrock Embeddings

This component is used to load embedding models from Amazon Bedrock.

Parameters

Inputs

Name	Type	Description
credentials_profile_name	String	Name of the AWS credentials profile in ~/.aws/credentials or ~/.aws/config, which has access keys or role information
model_id	String	ID of the model to call, e.g., `amazon.titan-embed-text-v1`. This is equivalent to the `modelId` property in the `list-foundation-models` API
endpoint_url	String	URL to set a specific service endpoint other than the default AWS endpoint
region_name	String	AWS region to use, e.g., `us-west-2`. Falls back to `AWS_DEFAULT_REGION` environment variable or region specified in ~/.aws/config if not provided

Outputs

Name	Type	Description
embeddings	Embeddings	An instance for generating embeddings using Amazon Bedrock
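
The `region_name` fallback described above can be sketched as a simple lookup chain. This is a minimal sketch, not the component's actual code: `resolve_region` is a hypothetical helper, `config_region` stands in for a value already read from ~/.aws/config, and checking the environment variable before the config file is an assumption about ordering.

```python
import os


def resolve_region(region_name=None, environ=None, config_region=None):
    """Pick an AWS region: explicit value, then environment, then config.

    `config_region` is a placeholder for the region found in ~/.aws/config;
    parsing that file is out of scope for this sketch.
    """
    environ = os.environ if environ is None else environ
    if region_name:
        return region_name
    if environ.get("AWS_DEFAULT_REGION"):
        return environ["AWS_DEFAULT_REGION"]
    return config_region


explicit = resolve_region("us-west-2", environ={"AWS_DEFAULT_REGION": "eu-west-1"})
from_env = resolve_region(environ={"AWS_DEFAULT_REGION": "eu-west-1"})
from_config = resolve_region(environ={}, config_region="ap-south-1")
```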

Astra DB vectorize

Connect this component to the Embeddings port of the Astra DB vector store component to generate embeddings.

This component requires that your Astra DB database has a collection that uses a vectorize embedding provider integration. For more information and instructions, see Embedding Generation.

Parameters

Inputs

Name	Display Name	Info
provider	Embedding Provider	The embedding provider to use
model_name	Model Name	The embedding model to use
authentication	Authentication	The name of the API key in Astra that stores your vectorize embedding provider credentials. (Not required if using an Astra-hosted embedding provider.)
provider_api_key	Provider API Key	As an alternative to `authentication`, directly provide your embedding provider credentials.
model_parameters	Model Parameters	Additional model parameters

Outputs

Name	Type	Description
embeddings	Embeddings	An instance for generating embeddings using Astra vectorize

Azure OpenAI Embeddings

This component generates embeddings using Azure OpenAI models.

Parameters

Inputs

Name	Type	Description
Model	String	Name of the model to use (default: `text-embedding-3-small`)
Azure Endpoint	String	Your Azure endpoint, including the resource. Example: `https://example-resource.openai.azure.com/`
Deployment Name	String	The name of the deployment
API Version	String	The API version to use. Versions are identified by date
API Key	String	The API key to access the Azure OpenAI service

Outputs

Name	Type	Description
embeddings	Embeddings	An instance for generating embeddings using Azure OpenAI
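
To show how the endpoint, deployment name, and API version fit together, here is a minimal sketch that assembles the request URL used by the Azure OpenAI REST API for embeddings. `embeddings_url` is a hypothetical helper, and the deployment name `my-embeddings` and version `2023-05-15` are example values; the component itself builds its client through a library rather than by hand.

```python
from urllib.parse import urlencode


def embeddings_url(azure_endpoint, deployment_name, api_version):
    """Combine the three inputs into an Azure OpenAI embeddings URL."""
    base = azure_endpoint.rstrip("/")  # tolerate a trailing slash
    query = urlencode({"api-version": api_version})
    return f"{base}/openai/deployments/{deployment_name}/embeddings?{query}"


url = embeddings_url(
    "https://example-resource.openai.azure.com/",
    "my-embeddings",  # hypothetical deployment name
    "2023-05-15",  # example API version
)
```

Note that the URL is keyed by the deployment name, not the model name: the deployment determines which model actually serves the request.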

Cohere Embeddings

This component is used to load embedding models from Cohere.

Parameters

Inputs

Name	Type	Description
cohere_api_key	String	API key required to authenticate with the Cohere service
model	String	Language model used for embedding text documents and performing queries (default: `embed-english-v2.0`)
truncate	Boolean	Whether to truncate the input text to fit within the model's constraints (default: `False`)

Outputs

Name	Type	Description
embeddings	Embeddings	An instance for generating embeddings using Cohere

Embedding similarity

This component computes selected forms of similarity between two embedding vectors.

Parameters

Inputs

Name	Display Name	Info
embedding_vectors	Embedding Vectors	A list containing exactly two data objects with embedding vectors to compare.
similarity_metric	Similarity Metric	Select the similarity metric to use. Options: "Cosine Similarity", "Euclidean Distance", "Manhattan Distance".

Outputs

Name	Display Name	Info
similarity_data	Similarity Data	Data object containing the computed similarity score and additional information.
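
The three metric options can be sketched in plain Python as follows. This is an illustrative sketch, not the component's source: `compare` and `METRICS` are hypothetical names, and the vectors are passed as plain lists rather than Langflow data objects.

```python
import math


def cosine_similarity(a, b):
    """Higher means more similar; 1.0 for identical directions."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def euclidean_distance(a, b):
    """Straight-line distance; lower means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences; lower means more similar."""
    return sum(abs(x - y) for x, y in zip(a, b))


METRICS = {
    "Cosine Similarity": cosine_similarity,
    "Euclidean Distance": euclidean_distance,
    "Manhattan Distance": manhattan_distance,
}


def compare(embedding_vectors, similarity_metric):
    """Apply the selected metric to exactly two vectors."""
    if len(embedding_vectors) != 2:
        raise ValueError("expected exactly two embedding vectors")
    first, second = embedding_vectors
    return METRICS[similarity_metric](first, second)


euclidean = compare([[0.0, 0.0], [3.0, 4.0]], "Euclidean Distance")
manhattan = compare([[0.0, 0.0], [3.0, 4.0]], "Manhattan Distance")
cosine = compare([[1.0, 0.0], [0.0, 1.0]], "Cosine Similarity")
```

Cosine similarity is a similarity (larger is closer), while the two distances are dissimilarities (smaller is closer), so scores from different metrics are not directly comparable.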

Google generative AI embeddings

This component connects to Google's generative AI embedding service using the GoogleGenerativeAIEmbeddings class from the langchain-google-genai package.

Parameters

Inputs

Name	Display Name	Info
api_key	API Key	Secret API key for accessing Google's generative AI service (required)
model_name	Model Name	Name of the embedding model to use (default: "models/text-embedding-004")

Outputs

Name	Display Name	Info
embeddings	Embeddings	Built GoogleGenerativeAIEmbeddings object

Hugging Face Embeddings

:::note
This component is deprecated as of Langflow version 1.0.18. Instead, use the Hugging Face API Embeddings component.
:::

This component loads embedding models from Hugging Face.

Use this component to generate embeddings using locally downloaded Hugging Face models. Ensure you have sufficient computational resources to run the models.

Parameters

Inputs

Name	Display Name	Info
Cache Folder	Cache Folder	Folder path to cache Hugging Face models
Encode Kwargs	Encoding Arguments	Additional arguments for the encoding process
Model Kwargs	Model Arguments	Additional arguments for the model
Model Name	Model Name	Name of the Hugging Face model to use
Multi Process	Multi-Process	Whether to use multiple processes

Hugging Face embeddings Inference API

This component generates embeddings using Hugging Face Inference API models.

Use this component to create embeddings with Hugging Face's hosted models. Ensure you have a valid Hugging Face API key.

Parameters

Inputs

Name	Display Name	Info
API Key	API Key	API key for accessing the Hugging Face Inference API
API URL	API URL	URL of the Hugging Face Inference API
Model Name	Model Name	Name of the model to use for embeddings
Cache Folder	Cache Folder	Folder path to cache Hugging Face models
Encode Kwargs	Encoding Arguments	Additional arguments for the encoding process
Model Kwargs	Model Arguments	Additional arguments for the model
Multi Process	Multi-Process	Whether to use multiple processes

MistralAI

This component generates embeddings using MistralAI models.

Parameters

Inputs

Name	Type	Description
model	String	The MistralAI model to use (default: "mistral-embed")
mistral_api_key	SecretString	API key for authenticating with MistralAI
max_concurrent_requests	Integer	Maximum number of concurrent API requests (default: 64)
max_retries	Integer	Maximum number of retry attempts for failed requests (default: 5)
timeout	Integer	Request timeout in seconds (default: 120)
endpoint	String	Custom API endpoint URL (default: "https://api.mistral.ai/v1/")

Outputs

Name	Type	Description
embeddings	Embeddings	MistralAIEmbeddings instance for generating embeddings
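
The effect of `max_retries` can be sketched with a small retry loop. This is a hedged illustration only: `call_with_retries` and `flaky_request` are hypothetical, the linear backoff is an assumption, and the sketch reads `max_retries` as the number of extra attempts after the first failure, which may differ from how the underlying client counts.

```python
import time


def call_with_retries(request, max_retries=5, backoff_seconds=0.0):
    """Call `request()`; after a failure, retry up to `max_retries` more times."""
    retries_used = 0
    while True:
        try:
            return request()
        except Exception:
            if retries_used >= max_retries:
                raise  # retry budget exhausted: surface the last error
            retries_used += 1
            time.sleep(backoff_seconds * retries_used)  # assumed linear backoff


attempts = []


def flaky_request():
    """Hypothetical request that fails twice, then returns a vector."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("temporary failure")
    return [0.1, 0.2, 0.3]


vector = call_with_retries(flaky_request, max_retries=5)
```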

NVIDIA

This component generates embeddings using NVIDIA models.

Parameters

Inputs

Name	Type	Description
model	String	The NVIDIA model to use for embeddings (e.g., nvidia/nv-embed-v1)
base_url	String	Base URL for the NVIDIA API (default: https://integrate.api.nvidia.com/v1)
nvidia_api_key	SecretString	API key for authenticating with NVIDIA's service
temperature	Float	Model temperature for embedding generation (default: 0.1)

Outputs

Name	Type	Description
embeddings	Embeddings	NVIDIAEmbeddings instance for generating embeddings

Ollama Embeddings

This component generates embeddings using Ollama models.

Parameters

Inputs

Name	Type	Description
Ollama Model	String	Name of the Ollama model to use (default: `llama2`)
Ollama Base URL	String	Base URL of the Ollama API (default: `http://localhost:11434`)
Model Temperature	Float	Temperature parameter for the model. Adjusts the randomness in the generated embeddings

Outputs

Name	Type	Description
embeddings	Embeddings	An instance for generating embeddings using Ollama
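
For orientation, the sketch below builds (but does not send) an HTTP request to Ollama's `/api/embeddings` endpoint using the defaults listed above. `build_embedding_request` is a hypothetical helper, and passing the temperature under `options` is an assumption about how the component forwards it; the component itself relies on a client library rather than raw HTTP.

```python
import json
from urllib.request import Request


def build_embedding_request(
    prompt,
    model="llama2",
    base_url="http://localhost:11434",
    temperature=None,
):
    """Build a POST request for Ollama's embeddings endpoint."""
    payload = {"model": model, "prompt": prompt}
    if temperature is not None:
        payload["options"] = {"temperature": temperature}
    return Request(
        url=f"{base_url.rstrip('/')}/api/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_embedding_request("Hello, world")
```

Sending the request with `urllib.request.urlopen(request)` requires a running Ollama server with the named model already pulled.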

OpenAI Embeddings

This component is used to load embedding models from OpenAI.

Parameters

Inputs

Name	Type	Description
OpenAI API Key	String	The API key to use for accessing the OpenAI API
Default Headers	Dict	Default headers for the HTTP requests
Default Query	NestedDict	Default query parameters for the HTTP requests
Allowed Special	List	Special tokens allowed for processing (default: `[]`)
Disallowed Special	List	Special tokens disallowed for processing (default: `["all"]`)
Chunk Size	Integer	Maximum number of texts to embed in each batch (default: `1000`)
Client	Any	HTTP client for making requests
Deployment	String	Deployment name for the model (default: `text-embedding-3-small`)
Embedding Context Length	Integer	Length of embedding context (default: `8191`)
Max Retries	Integer	Maximum number of retries for failed requests (default: `6`)
Model	String	Name of the model to use (default: `text-embedding-3-small`)
Model Kwargs	NestedDict	Additional keyword arguments for the model
OpenAI API Base	String	Base URL of the OpenAI API
OpenAI API Type	String	Type of the OpenAI API
OpenAI API Version	String	Version of the OpenAI API
OpenAI Organization	String	Organization associated with the API key
OpenAI Proxy	String	Proxy server for the requests
Request Timeout	Float	Timeout for the HTTP requests
Show Progress Bar	Boolean	Whether to show a progress bar for processing (default: `False`)
Skip Empty	Boolean	Whether to skip empty inputs (default: `False`)
TikToken Enable	Boolean	Whether to enable TikToken (default: `True`)
TikToken Model Name	String	Name of the TikToken model

Outputs

Name	Type	Description
embeddings	Embeddings	An instance for generating embeddings using OpenAI
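
To make the `Chunk Size` input concrete, here is a minimal sketch of batching, assuming the value is the number of input texts sent per request, as in LangChain's `OpenAIEmbeddings`. `split_into_batches` is a hypothetical helper, not the library's implementation.

```python
def split_into_batches(texts, chunk_size=1000):
    """Group texts into consecutive batches of at most `chunk_size` items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [texts[i:i + chunk_size] for i in range(0, len(texts), chunk_size)]


# 2,500 placeholder documents split with the default chunk size.
batches = split_into_batches([f"doc {i}" for i in range(2500)])
batch_sizes = [len(batch) for batch in batches]
```

A smaller chunk size means more requests with less data each, which can help when individual requests approach size or rate limits.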

Text embedder

This component generates embeddings for a given message using a specified embedding model.

Parameters

Inputs

Name	Display Name	Info
embedding_model	Embedding Model	The embedding model to use for generating embeddings.
message	Message	The message for which to generate embeddings.

Outputs

Name	Display Name	Info
embeddings	Embedding Data	Data object containing the original text and its embedding vector.
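
The shape of the output can be pictured with the sketch below, which pairs a message with its vector. Everything here is illustrative: `toy_embedding` is a hypothetical stand-in for a real embedding model, and the dictionary keys `text` and `embeddings` are assumed names rather than the component's confirmed schema.

```python
def toy_embedding(text):
    """Hypothetical stand-in for a model: counts of letters, digits, spaces."""
    letters = sum(ch.isalpha() for ch in text)
    digits = sum(ch.isdigit() for ch in text)
    spaces = sum(ch.isspace() for ch in text)
    return [float(letters), float(digits), float(spaces)]


def embed_message(embedding_model, message):
    """Bundle the original text with the vector produced for it."""
    return {"text": message, "embeddings": embedding_model(message)}


embedding_data = embed_message(toy_embedding, "Hello 42")
```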

VertexAI Embeddings

This component is a wrapper around Google Vertex AI Embeddings API.

Parameters

Inputs

Name	Type	Description
credentials	Credentials	The default custom credentials to use
location	String	The default location to use when making API calls (default: `us-central1`)
max_output_tokens	Integer	Token limit determines the maximum amount of text output from one prompt (default: `128`)
model_name	String	The name of the Vertex AI large language model (default: `text-bison`)
project	String	The default GCP project to use when making Vertex API calls
request_parallelism	Integer	The amount of parallelism allowed for requests issued to VertexAI models (default: `5`)
temperature	Float	Tunes the degree of randomness in text generations. Should be a non-negative value (default: `0`)
top_k	Integer	How the model selects tokens for output: the next token is chosen from the `k` most probable tokens (default: `40`)
top_p	Float	Tokens are selected from the most probable to least until the sum of their probabilities exceeds the top `p` value (default: `0.95`)
tuned_model_name	String	The name of a tuned model. If provided, `model_name` is ignored
verbose	Boolean	This parameter controls the level of detail in the output. When set to `True`, it prints internal states of the chain to help debug (default: `False`)

Outputs

Name	Type	Description
embeddings	Embeddings	An instance for generating embeddings using VertexAI
