fix: don't send duplicate messages to Agent (#8909)

* fix: update message input handling in LCAgentComponent and improve memory data retrieval

* Refactor MessageTextInput to MessageInput for consistency.
* Enhance input dictionary construction to handle different input types in LCAgentComponent.
* Update get_memory_data method to filter out current input value from retrieved messages.

* fix: update AgentComponent to include documentation link and improve input handling

* Added documentation link for AgentComponent.
* Removed memory inputs from the agent component for cleaner input management.
* Enhanced error handling in message_response method to ensure better validation and logging of exceptions.

* fix: enhance input handling in LCAgentComponent by updating message conversion

* Updated input dictionary construction in LCAgentComponent to use to_lc_message() for Message instances, improving input handling consistency.

* test: add regression test for message duplication in agent component

* Introduced a new test to verify that mathematical expressions do not experience message duplication when processed by the agent component.
* The test checks both input and output JSON to ensure correct handling of expressions like "2+2" without duplication errors.

* test: add workspace tag to regression test for message duplication in agent component

* Updated the regression test for mathematical expressions to include the "@workspace" tag, enhancing test categorization and organization.
* This change ensures better tracking and management of tests related to the agent component.

* fix: add temporary comment in get_memory_data to address message duplication

* Added a TODO comment in the get_memory_data method of AgentComponent to indicate a temporary fix for message duplication issues. This serves as a reminder to develop a more robust solution in the future.

* feat: add message extraction utility for BaseMessage

* Introduced a new helper function, _get_message_from_base_message, to extract and concatenate text content from BaseMessage instances, improving message handling.
* Updated input handling in handle_on_chain_start to utilize the new extraction function, ensuring consistent processing of input messages.

* refactor: standardize code snippets across starter project JSON files

* Updated the "value" field in multiple starter project JSON files to ensure consistent formatting and structure of code snippets.
* This change enhances readability and maintainability of the code examples provided in the starter projects.

* feat: add caching and content dictionary creation for images

* Introduced a new function, create_image_content_dict, to generate a content dictionary for multimodal inputs from image files, enhancing image handling capabilities.
* Implemented LRU caching to optimize performance for repeated image processing.
* Added comprehensive error handling and documentation for better usability and maintainability.

* refactor: update message handling to utilize create_image_content_dict

* Replaced direct image URL creation with create_image_content_dict for improved image content handling in the Data and Message classes.
* Adjusted the order of content in human messages to ensure text appears first, enhancing message structure and clarity.
* Removed deprecated to_lc_message method to streamline the codebase and improve maintainability.

* docs: enhance _get_message_from_base_message docstring for clarity

* Expanded the docstring for the _get_message_from_base_message function to provide detailed information on input types, expected behavior, and examples of usage.
* Improved documentation aims to enhance usability and maintainability of the code by clarifying how to extract text content from BaseMessage instances.

* refactor: enhance image path handling and update message content structure

* Modified the get_file_paths function to support both Image objects and string paths for improved flexibility in file handling.
* Updated test cases to reflect changes in image content structure, ensuring consistency in type and source type attributes.
* Introduced new tests for create_image_content_dict to validate successful creation and error handling for image content dictionaries.

* refactor: streamline message extraction in handle_on_chain_start

* Removed the _get_message_from_base_message function to simplify the codebase.
* Updated handle_on_chain_start to directly use the text method of BaseMessage for extracting message content, enhancing clarity and maintainability.

* feat: enhance input handling for multimodal messages

* Added functionality to process image content within input messages, ensuring images are included in chat history as HumanMessage instances.
* Updated input handling logic to separate image types from text, improving the structure and clarity of message content.
* This enhancement supports better management of multimodal inputs in the agent's chat history.

* feat: add to_lc_message method for converting Data to BaseMessage

* Introduced the to_lc_message method in the Message class to facilitate conversion of Data instances to BaseMessage.
* Implemented logic to handle both HumanMessage and AIMessage based on the presence of required keys and sender type.
* Added logging for missing required keys to improve debugging and maintainability.

* refactor: simplify sender check in Message class

* Updated the sender validation logic in the Message class to remove unnecessary checks for missing sender values.
* This change enhances code clarity and maintains the intended functionality for handling user messages with associated files.

* test: update test_message_from_human_text to reflect content type change

* Modified the test for message conversion to assert that lc_message.content is a string instead of a list.
* Updated assertions to ensure the content matches the expected text, enhancing test accuracy and reliability.

* fix: update sender validation in Message class and adjust test case

* Modified the sender validation logic to handle cases where the sender is not specified, defaulting to HumanMessage.
* Updated the corresponding test case to reflect this change, ensuring accurate type assertion for lc_message when no sender is provided.

* refactor: update import statements for consistency and clarity

* Replaced the import of BaseModel from langchain.pydantic_v1 with the direct import from pydantic to streamline dependencies.
* This change enhances code clarity and aligns with best practices for managing imports in the codebase.

---------

Co-authored-by: Edwin Jose <edwin.jose@datastax.com>
Co-authored-by: Carlos Coelho <80289056+carlosrcoelho@users.noreply.github.com>
This commit is contained in:
Gabriel Luiz Freitas Almeida 2025-07-07 22:36:39 -03:00 committed by GitHub
commit 7aeb687533
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
26 changed files with 232 additions and 46 deletions

View file

@ -4,6 +4,7 @@ from typing import TYPE_CHECKING, cast
from langchain.agents import AgentExecutor, BaseMultiActionAgent, BaseSingleActionAgent
from langchain.agents.agent import RunnableAgent
from langchain_core.messages import HumanMessage
from langchain_core.runnables import Runnable
from langflow.base.agents.callback import AgentAsyncHandler
@ -12,7 +13,7 @@ from langflow.base.agents.utils import data_to_messages
from langflow.custom.custom_component.component import Component, _get_component_toolkit
from langflow.field_typing import Tool
from langflow.inputs.inputs import InputTypes, MultilineInput
from langflow.io import BoolInput, HandleInput, IntInput, MessageTextInput
from langflow.io import BoolInput, HandleInput, IntInput, MessageInput
from langflow.logging import logger
from langflow.memory import delete_message
from langflow.schema.content_block import ContentBlock
@ -34,7 +35,7 @@ DEFAULT_AGENT_NAME = "Agent ({tools_names})"
class LCAgentComponent(Component):
trace_type = "agent"
_base_inputs: list[InputTypes] = [
MessageTextInput(
MessageInput(
name="input_value",
display_name="Input",
info="The input provided by the user for the agent to process.",
@ -135,7 +136,9 @@ class LCAgentComponent(Component):
verbose=verbose,
max_iterations=max_iterations,
)
input_dict: dict[str, str | list[BaseMessage]] = {"input": self.input_value}
input_dict: dict[str, str | list[BaseMessage]] = {
"input": self.input_value.to_lc_message() if isinstance(self.input_value, Message) else self.input_value
}
if hasattr(self, "system_prompt"):
input_dict["system_prompt"] = self.system_prompt
if hasattr(self, "chat_history") and self.chat_history:
@ -143,6 +146,18 @@ class LCAgentComponent(Component):
input_dict["chat_history"] = data_to_messages(self.chat_history)
if all(isinstance(m, Message) for m in self.chat_history):
input_dict["chat_history"] = data_to_messages([m.to_data() for m in self.chat_history])
if hasattr(input_dict["input"], "content") and isinstance(input_dict["input"].content, list):
# ! Because the input has to be a string, we must pass the images in the chat_history
image_dicts = [item for item in input_dict["input"].content if item.get("type") == "image"]
input_dict["input"].content = [item for item in input_dict["input"].content if item.get("type") != "image"]
if "chat_history" not in input_dict:
input_dict["chat_history"] = []
if isinstance(input_dict["chat_history"], list):
input_dict["chat_history"].extend(HumanMessage(content=[image_dict]) for image_dict in image_dicts)
else:
input_dict["chat_history"] = [HumanMessage(content=[image_dict]) for image_dict in image_dicts]
if hasattr(self, "graph"):
session_id = self.graph.session_id

View file

@ -63,8 +63,14 @@ async def handle_on_chain_start(
input_data = event["data"].get("input")
if isinstance(input_data, dict) and "input" in input_data:
# Cast the input_data to InputDict
input_message = input_data.get("input", "")
if isinstance(input_message, BaseMessage):
input_message = input_message.text()
elif not isinstance(input_message, str):
input_message = str(input_message)
input_dict: InputDict = {
"input": str(input_data.get("input", "")),
"input": input_message,
"chat_history": input_data.get("chat_history", []),
}
text_content = TextContent(

View file

@ -127,11 +127,15 @@ class AgentComponent(ToolCallingAgentComponent):
raise
async def get_memory_data(self):
return (
# TODO: This is a temporary fix to avoid message duplication. We should develop a function for this.
messages = (
await MemoryComponent(**self.get_base_args())
.set(session_id=self.graph.session_id, order="Ascending", n_messages=self.n_messages)
.retrieve_messages()
)
return [
message for message in messages if getattr(message, "id", None) != getattr(self.input_value, "id", None)
]
def get_llm(self):
if not isinstance(self.agent_llm, str):

View file

@ -4,8 +4,8 @@ from typing import Any
from astrapy import Collection, DataAPIClient, Database
from astrapy.admin import parse_api_endpoint
from langchain.pydantic_v1 import BaseModel, Field, create_model
from langchain_core.tools import StructuredTool, Tool
from pydantic import BaseModel, Field, create_model
from langflow.base.langchain_utilities.model import LCToolComponent
from langflow.io import BoolInput, DictInput, HandleInput, IntInput, SecretStrInput, StrInput, TableInput

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View file

@ -13,7 +13,7 @@ from loguru import logger
from pydantic import BaseModel, ConfigDict, model_serializer, model_validator
from langflow.utils.constants import MESSAGE_SENDER_AI, MESSAGE_SENDER_USER
from langflow.utils.image import create_data_url
from langflow.utils.image import create_image_content_dict
if TYPE_CHECKING:
from langflow.schema.dataframe import DataFrame
@ -171,11 +171,10 @@ class Data(BaseModel):
if files:
from langflow.schema.image import get_file_paths
contents = [{"type": "text", "text": text}]
resolved_file_paths = get_file_paths(files)
for file_path in resolved_file_paths:
image_url = create_data_url(file_path)
contents.append({"type": "image_url", "image_url": {"url": image_url}})
contents = [create_image_content_dict(file_path) for file_path in resolved_file_paths]
# add to the beginning of the list
contents.insert(0, {"type": "text", "text": text})
human_message = HumanMessage(content=contents)
else:
human_message = HumanMessage(

View file

@ -22,7 +22,7 @@ def get_file_paths(files: list[str]):
storage_service = get_storage_service()
file_paths = []
for file in files:
file_path = Path(file)
file_path = Path(file.path) if hasattr(file, "path") and file.path else Path(file)
flow_id, file_name = str(file_path.parent), file_path.name
file_paths.append(storage_service.build_full_path(flow_id=flow_id, file_name=file_name))
return file_paths

View file

@ -30,7 +30,7 @@ from langflow.utils.constants import (
MESSAGE_SENDER_NAME_USER,
MESSAGE_SENDER_USER,
)
from langflow.utils.image import create_data_url
from langflow.utils.image import create_image_content_dict
if TYPE_CHECKING:
from langflow.schema.dataframe import DataFrame
@ -206,8 +206,7 @@ class Message(Data):
if isinstance(file, Image):
content_dicts.append(file.to_content_dict())
else:
image_url = create_data_url(file)
content_dicts.append({"type": "image_url", "image_url": {"url": image_url}})
content_dicts.append(create_image_content_dict(file))
return content_dicts
def load_lc_prompt(self):

View file

@ -1,5 +1,6 @@
import base64
import mimetypes
from functools import lru_cache
from pathlib import Path
@ -67,3 +68,35 @@ def create_data_url(image_path: str | Path, mime_type: str | None = None) -> str
msg = f"Failed to create data URL: {e}"
raise type(e)(msg) from e
return f"data:{mime_type};base64,{base64_data}"
@lru_cache(maxsize=50)
def create_image_content_dict(image_path: str | Path, mime_type: str | None = None) -> dict:
"""Create a content dictionary for multimodal inputs from an image file.
Args:
image_path (str | Path): Path to the image file.
mime_type (Optional[str], optional): MIME type of the image.
If None, it will be guessed from the file extension.
Returns:
dict: Content dictionary with type, source_type, data, and mime_type fields.
Raises:
FileNotFoundError: If the image file does not exist.
IOError: If there's an error reading the image file.
ValueError: If the image path is empty or invalid.
"""
if not mime_type:
mime_type = mimetypes.guess_type(str(image_path))[0]
if not mime_type:
msg = f"Could not determine MIME type for: {image_path}"
raise ValueError(msg)
try:
base64_data = convert_image_to_base64(image_path)
except (OSError, FileNotFoundError, ValueError) as e:
msg = f"Failed to create image content dict: {e}"
raise type(e)(msg) from e
return {"type": "image", "source_type": "url", "url": f"data:{mime_type};base64,{base64_data}"}

View file

@ -39,9 +39,10 @@ class TestDataSchema:
assert message.content[0] == {"type": "text", "text": "Check out this image"}
# Check image content
assert message.content[1]["type"] == "image_url"
assert "url" in message.content[1]["image_url"]
assert message.content[1]["image_url"]["url"].startswith("data:image/png;base64,")
assert message.content[1]["type"] == "image"
assert message.content[1]["source_type"] == "url"
assert "url" in message.content[1]
assert message.content[1]["url"].startswith("data:image/png;base64,")
def test_data_to_message_with_multiple_images(self, sample_image, tmp_path):
"""Test conversion of Data to Message with multiple images."""
@ -66,9 +67,15 @@ class TestDataSchema:
assert message.content[0]["type"] == "text"
# Check both images
assert message.content[1]["type"] == "image_url"
assert message.content[2]["type"] == "image_url"
assert all(content["image_url"]["url"].startswith("data:image/png;base64,") for content in message.content[1:])
assert message.content[1]["type"] == "image"
assert message.content[1]["source_type"] == "url"
assert "url" in message.content[1]
assert message.content[1]["url"].startswith("data:image/png;base64,")
assert message.content[2]["type"] == "image"
assert message.content[2]["source_type"] == "url"
assert "url" in message.content[2]
assert message.content[2]["url"].startswith("data:image/png;base64,")
def test_data_to_message_ai_response(self):
"""Test conversion of Data to AI Message."""

View file

@ -65,6 +65,7 @@ def test_message_from_human_text():
lc_message = message.to_lc_message()
assert isinstance(lc_message, HumanMessage)
assert isinstance(lc_message.content, str)
assert lc_message.content == text
@ -94,9 +95,10 @@ def test_message_with_single_image(sample_image):
assert lc_message.content[0] == {"type": "text", "text": text}
# Check image content
assert lc_message.content[1]["type"] == "image_url"
assert "url" in lc_message.content[1]["image_url"]
assert lc_message.content[1]["image_url"]["url"].startswith("data:image/png;base64,")
assert lc_message.content[1]["type"] == "image"
assert lc_message.content[1]["source_type"] == "url"
assert "url" in lc_message.content[1]
assert lc_message.content[1]["url"].startswith("data:image/png;base64,")
def test_message_with_multiple_images(sample_image, langflow_cache_dir):
@ -129,7 +131,10 @@ def test_message_with_multiple_images(sample_image, langflow_cache_dir):
# Check both images
assert all(
content["type"] == "image_url" and content["image_url"]["url"].startswith("data:image/png;base64,")
content["type"] == "image"
and content["source_type"] == "url"
and "url" in content
and content["url"].startswith("data:image/png;base64,")
for content in lc_message.content[1:]
)
@ -169,10 +174,9 @@ def test_message_serialization():
def test_message_to_lc_without_sender():
"""Test converting a message without sender to langchain message."""
message = Message(text="Test message")
# When no sender is specified, it defaults to HumanMessage
# When no sender is specified, it defaults to AIMessage
lc_message = message.to_lc_message()
assert isinstance(lc_message, HumanMessage)
assert lc_message.content == "Test message"
def test_timestamp_serialization():

View file

@ -1,7 +1,7 @@
import base64
import pytest
from langflow.utils.image import convert_image_to_base64, create_data_url
from langflow.utils.image import convert_image_to_base64, create_data_url, create_image_content_dict
@pytest.fixture
@ -70,3 +70,39 @@ def test_create_data_url_unrecognized_extension(tmp_path):
invalid_file.touch()
with pytest.raises(ValueError, match="Could not determine MIME type"):
create_data_url(invalid_file)
def test_create_image_content_dict_success(sample_image):
"""Test successful creation of image content dict."""
content_dict = create_image_content_dict(sample_image)
assert content_dict["type"] == "image"
assert content_dict["source_type"] == "url"
assert "url" in content_dict
assert content_dict["url"].startswith("data:image/png;base64,")
# Verify the base64 part is valid
base64_part = content_dict["url"].split(",")[1]
assert base64.b64decode(base64_part)
def test_create_image_content_dict_with_custom_mime(sample_image):
"""Test creation of image content dict with custom MIME type."""
custom_mime = "image/custom"
content_dict = create_image_content_dict(sample_image, mime_type=custom_mime)
assert content_dict["type"] == "image"
assert content_dict["source_type"] == "url"
assert "url" in content_dict
assert content_dict["url"].startswith(f"data:{custom_mime};base64,")
def test_create_image_content_dict_invalid_file():
"""Test creation of image content dict with invalid file."""
with pytest.raises(FileNotFoundError):
create_image_content_dict("nonexistent.jpg")
def test_create_image_content_dict_unrecognized_extension(tmp_path):
"""Test creation of image content dict with unrecognized file extension."""
invalid_file = tmp_path / "test.unknown"
invalid_file.touch()
with pytest.raises(ValueError, match="Could not determine MIME type"):
create_image_content_dict(invalid_file)

View file

@ -0,0 +1,83 @@
import { expect, test } from "@playwright/test";
import dotenv from "dotenv";
import path from "path";
import { awaitBootstrapTest } from "../../utils/await-bootstrap-test";
test(
"user must not experience message duplication in mathematical expressions with agent component",
{ tag: ["@release", "@components", "@workspace"] },
async ({ page }) => {
test.skip(
!process?.env?.ANTHROPIC_API_KEY,
"ANTHROPIC_API_KEY required to run this test",
);
if (!process.env.CI) {
dotenv.config({ path: path.resolve(__dirname, "../../.env") });
}
await awaitBootstrapTest(page);
await page.getByTestId("side_nav_options_all-templates").click();
await page.getByRole("heading", { name: "Simple Agent" }).first().click();
await page.getByTestId("value-dropdown-dropdown_str_agent_llm").click();
await page.waitForTimeout(200);
await page.getByText("Anthropic").last().click();
await page
.getByTestId("popover-anchor-input-api_key")
.fill(process.env.ANTHROPIC_API_KEY || "");
await page.getByTestId("playground-btn-flow-io").click();
await page.waitForSelector('[data-testid="input-chat-playground"]', {
timeout: 100000,
});
// Test simple math expression
await page.getByTestId("input-chat-playground").fill("2+2");
await page.waitForSelector('[data-testid="button-send"]', {
timeout: 100000,
});
await page.getByTestId("button-send").click();
// Wait for response completion
await page.waitForSelector(
'[data-testid="header-icon"] svg[data-testid="icon-Check"]',
{
timeout: 30000,
},
);
// Click on the execution section to expand and reveal the JSON blocks
await page.locator('[data-testid="header-icon"]').first().click();
// Wait for the JSON code blocks to appear after clicking
await page.waitForSelector('[data-testid="chat-code-tab"]', {
timeout: 10000,
});
// Get all the JSON code content to check both input and output
const codeBlocks = await page
.locator('[data-testid="chat-code-tab"] code.language-json')
.allTextContents();
// First code block should contain the input expression
const inputJson = codeBlocks[0];
expect(inputJson).toContain('"expression": "2+2"');
// Verify the input is NOT duplicated (should not contain "2+22+2")
expect(inputJson).not.toContain('"expression": "2+22+2"');
expect(inputJson).not.toContain('"expression": "22+2"');
// Second code block should contain the output result
const outputJson = codeBlocks[1];
expect(outputJson).toContain('"result": "4"');
// Ensure the result is not 26 (which would be 2+22+2)
expect(outputJson).not.toContain('"result": "26"');
},
);