langflow/docs/docs/Starter-Projects/starter-projects-vector-store-rag.md
Mendon Kissling 0d11564dea
docs: v1.1.2 (#5850)
2025-01-24 14:24:57 +00:00


title: Vector store RAG
slug: /starter-projects-vector-store-rag


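Retrieval Augmented Generation, or RAG, is a pattern for answering questions with an LLM over your own data: relevant content is retrieved at query time and passed to the model as context, without retraining the model.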

RAG is backed by a vector store, a vector database that stores embeddings of the ingested data.

This enables vector search, which matches content by semantic similarity to the query rather than by exact keywords.
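
To make the idea of vector search concrete, here is a minimal sketch that ranks stored texts by cosine similarity to a query vector. The three-dimensional vectors and the `vector_search` helper are hypothetical and chosen for illustration only; real embedding models produce vectors with hundreds or thousands of dimensions, and Astra DB performs this search for you.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings"; not produced by any real model.
store = {
    "cats are small pets": [0.9, 0.1, 0.0],
    "dogs are loyal pets": [0.8, 0.3, 0.1],
    "stock prices fell": [0.0, 0.2, 0.9],
}


def vector_search(query_vector, store, top_k=2):
    """Return the top_k stored texts most similar to the query vector."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in store.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [text for _, text in scored[:top_k]]


query = [0.9, 0.1, 0.05]  # stands in for the embedding of a pet-related question
results = vector_search(query, store)
print(results)
```

The two pet-related texts rank above the unrelated one even though the query shares no keywords with them, which is the property RAG relies on.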

We've chosen Astra DB as the vector database for this starter flow, but you can follow along with any of Langflow's vector database options.

Prerequisites

  • An OpenAI API key
  • An Astra DB vector database with an application token and a collection

Open Langflow and start a new project

  1. From the Langflow dashboard, click New Flow.
  2. Select Vector Store RAG.
  3. The Vector Store RAG flow is created.

Build the vector RAG flow

The vector store RAG flow consists of two separate flows, one for ingestion and one for query.

The Load Data Flow (bottom of the screen) populates the vector store, creating a searchable index to be queried for contextual similarity. It ingests data from a local file, splits it into chunks, computes embeddings for the chunks using the OpenAI embeddings model, and indexes the chunks and their embeddings in Astra DB.
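
The ingestion steps above can be sketched in plain Python. This is a simplified illustration, not Langflow's implementation: `split_text`, `toy_embed`, and `load_data` are hypothetical names, `toy_embed` is a letter-frequency stand-in for the OpenAI embeddings model, and a Python list stands in for the Astra DB collection.

```python
import string


def split_text(text, chunk_size=40, overlap=10):
    """Split text into fixed-size chunks that overlap by `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


def toy_embed(chunk):
    """Assumption: a 26-dimensional letter-frequency vector replaces a real embedding."""
    lowered = chunk.lower()
    return [lowered.count(letter) for letter in string.ascii_lowercase]


def load_data(text):
    """Chunk the text, embed each chunk, and return records ready for indexing."""
    return [{"text": chunk, "vector": toy_embed(chunk)} for chunk in split_text(text)]


sample = "abcdefghij" * 10  # 100 characters of placeholder content
index = load_data(sample)
print(len(index), "chunks indexed")
```

Overlapping chunks are a common choice because a sentence cut at a chunk boundary still appears whole in one of the neighboring chunks.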

The Retriever Flow (top of the screen) embeds the user's queries into vectors, which are compared to the vector store data from the Load Data Flow for contextual similarity.

  • Chat Input receives user input from the Playground.
  • OpenAI Embeddings converts the user query into vector form.
  • Astra DB performs similarity search using the query vector.
  • Parse Data processes the retrieved chunks.
  • Prompt combines the user query with relevant context.
  • OpenAI generates the response using the prompt.
  • Chat Output returns the response to the Playground.
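
The Parse Data and Prompt steps in the list above can be sketched as follows. The template text, the record shape, and the `parse_data` and `build_prompt` helpers are assumptions made for illustration; the actual prompt shipped with the starter flow may differ.

```python
# Hypothetical prompt template with two variables: context and question.
PROMPT_TEMPLATE = (
    "{context}\n"
    "\n"
    "---\n"
    "Given the context above, answer the question as best as possible.\n"
    "\n"
    "Question: {question}\n"
    "\n"
    "Answer: "
)


def parse_data(records, separator="\n\n"):
    """Flatten retrieved records into one text block (the Parse Data step)."""
    return separator.join(record["text"] for record in records)


def build_prompt(question, records):
    """Combine the user question with the retrieved context (the Prompt step)."""
    return PROMPT_TEMPLATE.format(context=parse_data(records), question=question)


# Assumed shape of similarity-search results: one dict per retrieved chunk.
retrieved = [
    {"text": "Langflow is a visual builder for LLM applications."},
    {"text": "Flows are made of connected components."},
]

prompt = build_prompt("What is Langflow?", retrieved)
print(prompt)
```

The resulting string is what the model receives, so the model answers from the retrieved chunks rather than from its training data alone.
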
  1. Configure the OpenAI model component.
    1. To create a global variable for the OpenAI component, in the OpenAI API Key field, click the Globe button, and then click Add New Variable.
    2. In the Variable Name field, enter openai_api_key.
    3. In the Value field, paste your OpenAI API Key (sk-...).
    4. Click Save Variable.
  2. Configure the Astra DB component.
    1. In the Astra DB Application Token field, add your Astra DB application token. The component connects to your database and populates the menus with existing databases and collections.
    2. Select your Database.
    3. Select your Collection. Collections are created in your Astra DB deployment for storing vector data. If you don't have a collection, see the DataStax Astra DB Serverless documentation.
    4. Select Embedding Model to bring your own embeddings model, which in this flow is the connected OpenAI Embeddings component. The Dimensions value must match the dimensions of your collection. You can find this value in the collection's details in your Astra DB deployment.
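
The dimensions requirement in the last step can be checked before you run the flow. The sketch below is a hypothetical helper, not part of Langflow or Astra DB; the table lists the default output sizes of OpenAI's embedding models, and the model name and collection dimension are values you supply yourself.

```python
# Default output dimensions of OpenAI embedding models.
# The text-embedding-3 models can also be asked for shorter vectors,
# in which case the requested size is the one that must match.
MODEL_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}


def check_dimensions(model_name, collection_dimension):
    """Raise ValueError if the model's vectors would not fit the collection."""
    expected = MODEL_DIMENSIONS[model_name]
    if expected != collection_dimension:
        raise ValueError(
            f"{model_name} produces {expected}-dimensional vectors, "
            f"but the collection expects {collection_dimension}"
        )
    return expected


# A collection created with 1536 dimensions accepts text-embedding-3-small vectors.
print(check_dimensions("text-embedding-3-small", 1536))
```

A mismatch would otherwise surface only when the first chunk is written to the collection, so checking up front saves a failed ingestion run.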

If you used Langflow's Global Variables feature, the RAG application flow components are already configured with the necessary credentials.

Run the Vector Store RAG flow

  1. Click the Playground button. Here you can chat with the AI that uses context from the database you created.
  2. Type a message and press Enter. (Try something like "What topics do you know about?")
  3. The bot will respond with a summary of the data you've embedded.