docs: Update, refresh, and expand Vector Store and Processing component documentation (#9407)

* fix anchors
* type convert and structured output components
* vector store intro and flow example
* reorg some vector search components by provider
* still on vector stores
* vector store example and outputs
* finish vector store page
* corrections to astra db vector store
* start split text component
* save file and smart function
* llm router
* parser
* still on dataframe
* finish dataframe ops
* remove-extra-kv-pair-and-clarify-serialization-from-python

---------

Co-authored-by: Mendon Kissling <59585235+mendonk@users.noreply.github.com>
2025-08-18 06:46:47 -07:00 · 2025-08-18 06:46:47 -07:00 · cfb29134bb
commit cfb29134bb
parent e0816e58a2
12 changed files with 1442 additions and 1145 deletions
--- a/docs/docs/Components/components-agents.mdx
+++ b/docs/docs/Components/components-agents.mdx
@ -43,7 +43,7 @@ For examples of flows using the **Agent** and **MCP Tools** components, see the
 The **Agent** component is the primary agent actor in your agent flows.
 This component uses an LLM integration to respond to input, such as a chat message or file upload.

-The agent can use the tools already available in the base LLM model as well as additional tools that you connect to the **Agent** component's **Tools** port.
+The agent can use the tools already available in the base LLM as well as additional tools that you connect to the **Agent** component's **Tools** port.
 You can connect any Langflow component as a tool, including other **Agent** components and MCP servers through the [**MCP Tools** component](#mcp-connection).

 For more information about using this component, see [Use Langflow agents](/agents).
--- a/docs/docs/Components/components-data.mdx
+++ b/docs/docs/Components/components-data.mdx
@ -396,7 +396,7 @@ There are two settings that control the output of the **URL** component at diffe
 When used as a standard component in a flow, the **URL** component must be connected to a component that accepts the selected output data type (`DataFrame` or `Message`).
 You can connect the **URL** component directly to a compatible component, or you can use a [**Type Convert** component](/components-processing#type-convert) to convert the output to another type before passing the data to other components if the data types aren't directly compatible.

-Processing components, like the **Type Convert** component, are useful with the **URL** component because it can extract a large amount of data from the crawled pages.
+**Processing** components like the **Type Convert** component are useful with the **URL** component because it can extract a large amount of data from the crawled pages.
 For example, if you only want to pass specific fields to other components, you can use a [**Parser** component](/components-processing#parser) to extract only that data from the crawled pages before passing the data to other components.

 When used in **Tool Mode** with an **Agent** component, the **URL** component can be connected directly to the **Agent** component's **Tools** port without converting the data.
--- a/docs/docs/Components/components-logic.mdx
+++ b/docs/docs/Components/components-logic.mdx
@ -33,7 +33,10 @@ The following example uses the **If-Else** component to check incoming chat mess

 1. Add an **If-Else** component to your flow, and then configure it as follows:

-    * **Text Input**: Connect the **Text Input** port to a **Chat Input** component.
+    * **Text Input**: Connect the **Text Input** port to a **Chat Input** component or another `Message` input.
+
+        If your input isn't in `Message` format, you can use another component to transform it, such as the [**Type Convert** component](/components-processing#type-convert) or [**Parser** component](/components-processing#parser).
+        If your input isn't appropriate for `Message` format, consider using another component for conditional routing, such as the [**Data Operations** component](/components-processing#data-operations).

    * **Match Text**: Enter `.*(urgent|warning|caution).*` so the component looks for these values in incoming input. The regex match is case sensitive, so if you need to look for all permutations of `warning`, enter `warning|Warning|WARNING`.
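
The case sensitivity described for **Match Text** can be checked outside Langflow with Python's standard `re` module. This is a minimal sketch, not the component's code: the helper `is_match` and its `ignore_case` flag are hypothetical, and it assumes the component applies the pattern the way `re.match` does.

```python
import re

# Pattern copied from the Match Text example.
PATTERN = r".*(urgent|warning|caution).*"


def is_match(text: str, ignore_case: bool = False) -> bool:
    """Hypothetical helper: report whether text matches PATTERN."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.match(PATTERN, text, flags) is not None


print(is_match("This is urgent"))                         # True
print(is_match("WARNING: disk full"))                     # False, because the match is case sensitive
print(is_match("WARNING: disk full", ignore_case=True))   # True
```

The second call shows why the example suggests listing `warning|Warning|WARNING` explicitly when you need to catch several capitalizations.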

@ -96,7 +99,10 @@ You can toggle parameters through the <Icon name="SlidersHorizontal" aria-hidden

 ## Loop

-The **Loop** component iterates over a list of input by passing individual items to other components attached at the **Item** output port until there are no items left to process. Then, the **Loop** component passes the aggregated result of all looping to the component connected to the **Done** port.
+The **Loop** component iterates over a list of inputs by passing individual items to other components attached at the **Item** output port until there are no items left to process.
+Then, the **Loop** component passes the aggregated result of all looping to the component connected to the **Done** port.
+
+### The looping process

 The **Loop** component is like a miniature flow within your flow.
 Here's a breakdown of the looping process:
@ -115,9 +121,13 @@ Here's a breakdown of the looping process:

    Only one component connects to the **Item** port, but you can pass the data through as many components as you need, as long as the last component in the chain connects back to the **Looping** port.

+    The **If-Else** component isn't compatible with the **Loop** component.
+    For more information, see [Conditional looping](#conditional-looping).
+
 4. After processing all items, the results are aggregated into a single `Data` object that is passed from the **Loop** component's **Done** port to the next component in the flow.

-In terms of simplified code, the **Loop** component works like this:
+The following simplified Python code summarizes how the **Loop** component works.
+This _isn't_ the actual component code; it is only meant to help you understand the general process.

 ```python
 for i in input:             # Receive input data as a list
@ -132,8 +142,7 @@ done = aggregate_results()  # Compile all returned items
 print(done)                 # Send the aggregated results from the Done port to another component
 ```
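
For readers who want something they can run, the same idea can be sketched as a small self-contained function. This is only an illustration under stated assumptions: `run_loop` and `process_item` are hypothetical names, and the sketch returns a plain list, whereas the real component aggregates results into a `Data` object.

```python
from typing import Any, Callable


def run_loop(items: list[Any], process_item: Callable[[Any], Any]) -> list[Any]:
    """Hypothetical stand-in for the Loop component's item/looping/done cycle."""
    results = []
    for item in items:                   # Item port: emit one item at a time
        processed = process_item(item)   # Components chained after the Item port
        results.append(processed)        # Looping port: receive the processed item
    return results                       # Done port: emit the aggregated results


done = run_loop(["alpha", "beta"], str.upper)
print(done)  # ['ALPHA', 'BETA']
```

Here `str.upper` stands in for whatever chain of components you connect between the **Item** and **Looping** ports.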

-<details>
-<summary>Loop example</summary>
+### Loop example

 In the following example, the **Loop** component iterates over a CSV file until there are no rows left to process.
 In this case, the **Item** port passes each row to a **Type Convert** component, which converts the row into a `Message` object.
 The `Message` then passes to a **Structured Output** component, which processes it into structured data and passes the result back to the **Loop** component's **Looping** port.
@ -145,7 +154,13 @@ After processing all rows, the **Loop** component loads the aggregated list of s
 For more examples of the **Loop** component, try the **Research Translation Loop** template in Langflow, or see the video tutorial [Mastering the Loop Component & Agentic RAG in Langflow](https://www.youtube.com/watch?v=9Wx7WODSKTo).
 :::

-</details>
+### Conditional looping
+
+The **If-Else** component isn't compatible with the **Loop** component.
+If you need conditional loop events, redesign your flow to process conditions before the loop.
+For example, if you are looping over a `DataFrame`, you could use multiple [**DataFrame Operations** components](/components-processing#dataframe-operations) to conditionally filter data, and then run separate loops on each set of filtered data.
+
+![A flow with conditional looping.](/img/conditional-looping.png)
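
The filter-first design can be sketched in plain Python. This is a rough analogy rather than Langflow code: the sample rows, the `split_by_condition` helper, and the per-row actions are all hypothetical, and lists of dictionaries stand in for a `DataFrame`.

```python
# Hypothetical sample rows standing in for a DataFrame.
rows = [
    {"ticket": 1, "priority": "urgent"},
    {"ticket": 2, "priority": "normal"},
    {"ticket": 3, "priority": "urgent"},
]


def split_by_condition(rows, predicate):
    """Apply the condition once, before any looping happens."""
    matched = [row for row in rows if predicate(row)]
    unmatched = [row for row in rows if not predicate(row)]
    return matched, unmatched


urgent, other = split_by_condition(rows, lambda row: row["priority"] == "urgent")

# Each filtered set gets its own loop, with no condition inside the loop body.
urgent_results = [f"escalate ticket {row['ticket']}" for row in urgent]
other_results = [f"queue ticket {row['ticket']}" for row in other]

print(urgent_results)  # ['escalate ticket 1', 'escalate ticket 3']
print(other_results)   # ['queue ticket 2']
```

The condition runs once up front, so neither loop needs branching logic of its own.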

 ## Notify and Listen

--- a/docs/docs/Components/components-processing.mdx
+++ b/docs/docs/Components/components-processing.mdx
--- a/docs/docs/Components/components-vector-stores.mdx
+++ b/docs/docs/Components/components-vector-stores.mdx
--- a/docs/docs/Concepts/data-types.mdx
+++ b/docs/docs/Concepts/data-types.mdx
@ -39,39 +39,33 @@ The schema is defined in [`data.py`](https://github.com/langflow-ai/langflow/blo

 The following attributes are available:

- `data`: A dictionary that stores key-value pairs.
+- `data`: A `Data` object stores key-value pairs within the `.data` attribute. This is the `Data` object's core dictionary. Each key is a field name, and the values can be any supported data type.
 - `text_key`: The key in `data` that is considered the primary text value.
 - `default_value`: Fallback if `text_key` is missing. The default `text_key` is `"text"`.

-### Data structure
-
-A `Data` object stores key-value pairs within the `.data` attribute, where each key is a field name and its value can be any supported data type. `text_key` tells Langflow which key in the data dictionary is the primary text value for that object.
-
 ```python
 data_obj = Data(
-    text_key="text",            # Field 1
-    data={                      # Field 2 (the actual dict)
+    text_key="text",
+    data={
        "text": "Hello world",
        "name": "Charlie",
        "age": 28
    },
-    default_value=""            # Field 3
+    default_value=""
 )
 ```
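
To illustrate how `text_key` and `default_value` relate to the `data` dictionary, the following sketch uses a small stand-in class. `DataSketch` and its `get_text` method are hypothetical and written only for this illustration; they aren't Langflow's actual `Data` schema, which is defined in `data.py`.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DataSketch:
    """Hypothetical stand-in for a Data object; not the real Langflow class."""

    data: dict[str, Any] = field(default_factory=dict)
    text_key: str = "text"
    default_value: str = ""

    def get_text(self) -> str:
        # Return the primary text value, or the fallback if text_key is absent.
        return str(self.data.get(self.text_key, self.default_value))


with_text = DataSketch(data={"text": "Hello world", "name": "Charlie", "age": 28})
without_text = DataSketch(data={"name": "Charlie"}, default_value="(no text)")

print(with_text.get_text())     # Hello world
print(without_text.get_text())  # (no text)
```

The second object has no `"text"` key in `data`, so the lookup falls back to `default_value`.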

 `Data` objects can be serialized to JSON, created from JSON, or created from other dictionary data.
 However, the resulting `Data` object is a structured object with validation and methods, not a plain dictionary.
-
-For example, when serialized into JSON, the previous example becomes the following JSON object:
+For example, when serialized into JSON, the previous Python example becomes the following JSON object:

 ```json
 {
  "text_key": "text",
  "data": {
-    "text": "User Profile",
-    "name": "Charlie Lastname",
-    "age": 28,
-    "email": "charlie.lastname@example.com"
+    "text": "Hello world",
+    "name": "Charlie",
+    "age": 28
  },
  "default_value": ""
 }
@ -263,7 +257,7 @@ Hover over the port to see the accepted or produced data types.
 In Langflow, you can use <Icon name="TextSearch" aria-hidden="True" /> **Inspect output** to view the output of individual components.
 This can help you learn about the different data types and debug problems with invalid or malformed inputs and outputs.

-The following example shows how to inspect the output of a **Type Convert** component, which can convert `Message`, `Data`, or `DataFrame` input into `Message`, `Data`, or `DataFrame` output:
+The following example shows how to inspect the output of a [**Type Convert** component](/components-processing#type-convert), which can convert data from one type to another:

 1. Create a flow, and then connect a **Chat Input** component to a **Type Convert** component.

@ -344,6 +338,7 @@ The following example shows how to inspect the output of a **Type Convert** comp

 ## See also

+- [**Processing** components](/components-processing)
 - [Custom components](/components-custom-components)
 - [Pydantic Models](https://docs.pydantic.dev/latest/api/base_model/)
 - [pandas.DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html)
--- a/docs/docs/Configuration/api-keys-and-authentication.mdx
+++ b/docs/docs/Configuration/api-keys-and-authentication.mdx
@ -317,7 +317,7 @@ Additionally, you must sign in as a superuser to manage users and [create a Lang
    uv run langflow run --env-file .env
    ```

-    Starting Langflow with an `.env` file automatically authenticates you as the superuser set in `LANGFLOW_SUPERUSER` and `LANGFLOW_SUPERUSER_PASSWORD`.
+    Starting Langflow with a `.env` file automatically authenticates you as the superuser set in `LANGFLOW_SUPERUSER` and `LANGFLOW_SUPERUSER_PASSWORD`.
    If you don't explicitly set these variables, the default values are `langflow` and `langflow` for system auto-login.

 6. Verify the server is running. The default location is `http://localhost:7860`.
--- a/docs/docs/Configuration/configuration-cli.mdx
+++ b/docs/docs/Configuration/configuration-cli.mdx
@ -230,7 +230,7 @@ Use this mode to preview the changes that would be made to the database schema
  </TabItem>
 </Tabs>

-### langflow run
+### langflow run {#langflow-run}

 Starts the Langflow server.

--- a/docs/docs/Configuration/environment-variables.mdx
+++ b/docs/docs/Configuration/environment-variables.mdx
@ -151,7 +151,7 @@ The following table lists the environment variables supported by Langflow.
 | `LANGFLOW_AUTO_SAVING_INTERVAL` | Integer | `1000` | Set the interval for flow auto-saving in milliseconds. |
 | `LANGFLOW_BACKEND_ONLY` | Boolean | False | Run only the Langflow backend service (no frontend). |
 | `LANGFLOW_BUNDLE_URLS` | List[String] | `[]` | A list of URLs from which to load component bundles and flows. Supports GitHub URLs. If LANGFLOW_AUTO_LOGIN is enabled, flows from these bundles are loaded into the database. |
-| `LANGFLOW_CACHE_TYPE` | String | `async` | Set the cache type for Langflow. Possible values: `async`, `redis`, `memory`, `disk`. If you set the type to `redis`, then you must also set the following environment variables: `LANGFLOW_REDIS_HOST`, `LANGFLOW_REDIS_PORT`, `LANGFLOW_REDIS_DB`, and `LANGFLOW_REDIS_CACHE_EXPIRE`. |
+| `LANGFLOW_CACHE_TYPE` | String | `async` | Set the cache type for Langflow. Possible values: `async`, `redis`, `memory`, `disk`. If you set the type to `redis`, then you must also set the following environment variables: `LANGFLOW_REDIS_HOST`, `LANGFLOW_REDIS_PORT`, `LANGFLOW_REDIS_DB`, and `LANGFLOW_REDIS_CACHE_EXPIRE`. See also [`langflow run`](/configuration-cli#langflow-run). |
 | `LANGFLOW_COMPONENTS_PATH` | String | Not set | Path to the directory containing custom components. |
 | `LANGFLOW_CONFIG_DIR` | String | Varies | Set the Langflow configuration directory where files, logs, and the Langflow database are stored. Default path depends on your installation. See [Flow storage and logs](/concepts-flows#flow-storage-and-logs). |
 | `LANGFLOW_DATABASE_URL` | String | Not set | Set the database URL for Langflow. If not provided, Langflow uses a SQLite database. |
@ -163,14 +163,14 @@ The following table lists the environment variables supported by Langflow.
 | `LANGFLOW_DISABLE_TRACK_APIKEY_USAGE` | Boolean | False | Whether to disable tracking of API key usage. If true, Langflow doesn't track API key usage (`total_uses` and `last_used_at`), which avoids database contention under high concurrency. |
 | `LANGFLOW_ENABLE_SUPERUSER_CLI` | Boolean | True | Allow creation of superusers with the Langflow CLI command [`langflow superuser`](./configuration-cli.mdx#langflow-superuser). Recommended to be disabled (false) in production for security reasons. |
 | `LANGFLOW_FALLBACK_TO_ENV_VAR` | Boolean | True | If enabled, [global variables](/configuration-global-variables) set in your Langflow **Settings** can use an environment variable with the same name if Langflow can't retrieve the variable value from the global variables. |
-| `LANGFLOW_FRONTEND_PATH` | String | `./frontend` | Path to the frontend directory containing build files. This is for development purposes only. See [`--frontend-path`](./configuration-cli.mdx#run-frontend-path). |
-| `LANGFLOW_HEALTH_CHECK_MAX_RETRIES` | Integer | `5` | Set the maximum number of retries for the health check. See [`--health-check-max-retries`](./configuration-cli.mdx#run-health-check-max-retries). |
-| `LANGFLOW_HOST` | String | `localhost` | The host on which the Langflow server will run. See [`--host`](./configuration-cli.mdx#run-host). |
-| `LANGFLOW_LANGCHAIN_CACHE` | String | `InMemoryCache` | Type of cache to use. Possible values: `InMemoryCache`, `SQLiteCache`. See [`--cache`](./configuration-cli.mdx#run-cache). |
+| `LANGFLOW_FRONTEND_PATH` | String | `./frontend` | Path to the frontend directory containing build files. This is for development purposes only. See [`langflow run`](/configuration-cli#langflow-run). |
+| `LANGFLOW_HEALTH_CHECK_MAX_RETRIES` | Integer | `5` | Set the maximum number of retries for the health check. See [`langflow run`](/configuration-cli#langflow-run). |
+| `LANGFLOW_HOST` | String | `localhost` | The host on which the Langflow server will run. See [`langflow run`](/configuration-cli#langflow-run). |
+| `LANGFLOW_LANGCHAIN_CACHE` | String | `InMemoryCache` | Type of cache storage to use, separate from `LANGFLOW_CACHE_TYPE`. Possible values: `InMemoryCache`, `SQLiteCache`. |
 | `LANGFLOW_LOG_LEVEL` | String | `INFO` | Set the logging level for Langflow. Possible values: `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`. |
 | `LANGFLOW_LOG_FILE` | String | Not set | Path to the log file. If this option isn't set, logs are written to stdout. |
 | `LANGFLOW_LOG_RETRIEVER_BUFFER_SIZE` | Integer | `10000` | Set the buffer size for log retrieval. Only used if `LANGFLOW_ENABLE_LOG_RETRIEVAL` is enabled. |
-| `LANGFLOW_MAX_FILE_SIZE_UPLOAD` | Integer | `100` | Set the maximum file size for the upload in megabytes. See [`--max-file-size-upload`](./configuration-cli.mdx#run-max-file-size-upload). |
+| `LANGFLOW_MAX_FILE_SIZE_UPLOAD` | Integer | `100` | Set the maximum file size for the upload in megabytes. See [`langflow run`](/configuration-cli#langflow-run). |
 | `LANGFLOW_MAX_ITEMS_LENGTH` | Integer | `100` | Maximum number of items to store and display in the visual editor. Lists longer than this will be truncated when displayed in the visual editor. Doesn't affect data passed between components nor outputs. |
 | `LANGFLOW_MAX_TEXT_LENGTH` | Integer | `1000` | Maximum number of characters to store and display in the visual editor. Responses longer than this will be truncated when displayed in the visual editor. Doesn't truncate responses between components nor outputs. |
 | `LANGFLOW_MCP_SERVER_ENABLED` | Boolean | True | If this option is set to False, Langflow doesn't enable the MCP server. |
--- a/docs/docs/Support/troubleshooting.mdx
+++ b/docs/docs/Support/troubleshooting.mdx
@ -146,7 +146,7 @@ The following error can occur during Langflow upgrades when the new version can'

 To resolve this error, clear the cache by deleting the contents of your Langflow cache folder.
 The filepath depends on your operating system, installation type, and configuration options.
-For more information and default filepaths, see [Memory management options](/memory#flow-storage-and-logs).
+For more information and default filepaths, see [Memory management options](/memory).

 :::important
 Clearing the cache erases your settings.
--- a/docs/docs/Tutorials/chat-with-files.mdx
+++ b/docs/docs/Tutorials/chat-with-files.mdx
@ -31,8 +31,10 @@ The following steps modify the **Basic Prompting** template to accept file input
 2. In the **Language Model** component, enter your OpenAI API key.

    If you want to use a different provider or model, edit the **Model Provider**, **Model Name**, and **API Key** fields accordingly.
+
 3. To verify that your API key is valid, click <Icon name="Play" aria-hidden="true" /> **Playground**, and then ask the LLM a question.
 The LLM should respond according to the specifications in the **Prompt Template** component's **Template** field.
+
 4. Exit the **Playground**, and then modify the **Prompt Template** component to accept file input in addition to chat input.
 To do this, edit the **Template** field, and then replace the default prompt with the following text:

--- a/docs/static/img/conditional-looping.png
+++ b/docs/static/img/conditional-looping.png