ref: URL and File components with Dataframe output (#8117)

* url component update.
* update to url component and tests
* Make directory component legacy
* Only output dataframe from file component
* Update base_file.py
* Update description and output
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* Deprecate Processing Components.
* Move Tool and CQL Astra to bundle
* Comprehensive improvements to Save to File
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* Clean up description, dont unlink file
* Remove print statement
* fix: Clean up the text output of the URL component (#8158)
  * Clean text output from url component
  * [autofix.ci] apply automated fixes
  * Update data.py
  * Make a visible function
  * URL component cleaning refactor
  * Update data.py
  * [autofix.ci] apply automated fixes
  * Update with chat output fixes and template updates
  * [autofix.ci] apply automated fixes
  * [autofix.ci] apply automated fixes
  * Fix linting issues
  ---------
  Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
* revert datastax component bundle
* Restore the two tools as well
* Two more template updates
* Update Vector Store RAG.json
* Update Vector Store RAG.json
* Update __init__.py
* Update directory.py
* Update url.py
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* Update test_basic_prompting.py
* Unit test updates
* Fix unit tests one more time
* Fix conversion in safe convert
* Update chat.py
* Temporary disabling of save to file tests
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* Fix some more unit tests
* Update test_split_text_component.py
* [autofix.ci] apply automated fixes
* Update test_url_component.py
* Update file component outputs in tests
* Fix starter projects with old data to message
* Update test_split_text_component.py
* fix slider inputs
* Update data.py
* [autofix.ci] apply automated fixes
* Update data.py
* 🐛 (typescript_test.yml): increase the maximum shard count to 40 to improve test distribution and performance
* Rename safe file component
* [autofix.ci] apply automated fixes
* Make sure we import the right save to file
* 🔧 (freeze.spec.ts): update test description to match the changed element's test ID
  🔧 (Blog Writer.spec.ts): add click event to test file input element
  🔧 (edit-tools.spec.ts): update assertion to check if rowsCount is greater than 2 instead of 3
  🔧 (loop-component.spec.ts): add import statement for uploadFile function
  🔧 (tool-mode.spec.ts): update targetPosition coordinates for dragTo action
  🔧 (chatInputOutputUser-shard-1.spec.ts): update test description to match the changed element's test ID
* ✨ (stop-building.spec.ts): update click target for better test coverage and accuracy
  ✨ (fileUploadComponent.spec.ts): adjust drag target position and update click targets for improved testing flow and coverage
* 🐛 (typescript_test.yml): adjust the maximum shard count to 10 to prevent excessive parallelization and improve test performance
* Two url component types
* Update ruff formatting
* [autofix.ci] apply automated fixes
* Revert name of method
* 🐛 (typescript_test.yml): increase the maximum shard count to 40 to improve test distribution and performance
* ✨ (freeze.spec.ts): update test to use correct testid for element
  ✨ (stop-building.spec.ts): update test to use correct testid for element
  ✨ (loop-component.spec.ts): update test to use correct testid for element
  ✨ (chatInputOutputUser-shard-1.spec.ts): update tests to use correct testid for element
* ✨ (freeze.spec.ts, stop-building.spec.ts, loop-component.spec.ts, chatInputOutputUser-shard-1.spec.ts): update test selectors to match changes in the frontend UI, improving test reliability and maintainability.
* ✨ (stop-building.spec.ts): update test to use correct testId for clicking element
  ✨ (loop-component.spec.ts): update test to use correct testId for clicking element
  ✨ (chatInputOutputUser-shard-1.spec.ts): update multiple tests to use correct testId for clicking element
* 📝 (freeze.spec.ts): update test selector to match the correct element on the page for better test accuracy
* 🔧 (typescript_test.yml): adjust optimal shard count calculation to ensure a maximum of 10 shards for test execution
  🔧 (chatInputOutputUser-shard-1.spec.ts): update test selectors to match changes in the frontend output structure for integration tests
* ✨ (chatInputOutputUser-shard-1.spec.ts): update test selectors for better clarity and consistency in the integration tests.

---------

Co-authored-by: Eric Hare <ericrhare@gmail.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: cristhianzl <cristhian.lousa@gmail.com>
This commit is contained in:
parent 631bd49e14
commit fd73cdcd7e
59 changed files with 2139 additions and 1524 deletions
@@ -174,9 +174,7 @@ class BaseFileComponent(Component, ABC):
     ]

     _base_outputs = [
-        Output(display_name="Data", name="data", method="load_files"),
-        Output(display_name="DataFrame", name="dataframe", method="load_dataframe"),
-        Output(display_name="Message", name="message", method="load_message"),
+        Output(display_name="Loaded Files", name="dataframe", method="load_dataframe"),
     ]

     @abstractmethod
@@ -274,33 +272,6 @@ class BaseFileComponent(Component, ABC):
         all_rows = csv_data + non_csv_rows
         return DataFrame(all_rows)

-    def load_message(self) -> Message:
-        """Load files and return as Message with concatenated content.
-
-        Returns:
-            Message: Message containing concatenated file content
-        """
-        data_list = self.load_files()
-        if not data_list:
-            return Message(text="")
-
-        # Concatenate all text content
-        text_content = []
-        for data in data_list:
-            content = data.get_text()
-            text_content.append(content)
-
-        # Join with separator
-        final_text = self.separator.join(text_content)
-
-        # Create message with all metadata
-        all_data = {}
-        for data in data_list:
-            if data.data:
-                all_data.update(data.data)
-
-        return Message(text=final_text, data=all_data)
-
     @property
     def valid_extensions(self) -> list[str]:
         """Returns valid file extensions for the class.
@@ -12,7 +12,7 @@ class FileComponent(BaseFileComponent):
     """

     display_name = "File"
-    description = "Load a file to be used in your project."
+    description = "Loads content from one or more files as a DataFrame."
     icon = "file-text"
     name = "File"

@@ -1,25 +1,40 @@
 import re

-import httpx
+import requests
 from bs4 import BeautifulSoup
 from langchain_community.document_loaders import RecursiveUrlLoader
 from loguru import logger

-from langflow.custom.custom_component.component import Component
-from langflow.helpers.data import data_to_text
-from langflow.inputs.inputs import TableInput
-from langflow.io import BoolInput, DropdownInput, IntInput, MessageTextInput, Output
-from langflow.schema import Data
-from langflow.schema.dataframe import DataFrame
-from langflow.schema.message import Message
+from langflow.custom import Component
+from langflow.field_typing.range_spec import RangeSpec
+from langflow.helpers.data import safe_convert
+from langflow.io import BoolInput, DropdownInput, IntInput, MessageTextInput, Output, SliderInput, TableInput
+from langflow.schema import DataFrame, Message
+from langflow.services.deps import get_settings_service
+
+# Constants
+DEFAULT_TIMEOUT = 30
+DEFAULT_MAX_DEPTH = 1
+DEFAULT_FORMAT = "Text"
+URL_REGEX = re.compile(
+    r"^(https?:\/\/)?" r"(www\.)?" r"([a-zA-Z0-9.-]+)" r"(\.[a-zA-Z]{2,})?" r"(:\d+)?" r"(\/[^\s]*)?$",
+    re.IGNORECASE,
+)


 class URLComponent(Component):
-    """A component that loads and parses child links from a root URL recursively."""
+    """A component that loads and parses content from web pages recursively.
+
+    This component allows fetching content from one or more URLs, with options to:
+    - Control crawl depth
+    - Prevent crawling outside the root domain
+    - Use async loading for better performance
+    - Extract either raw HTML or clean text
+    - Configure request headers and timeouts
+    """

     display_name = "URL"
-    description = "Load and parse child links from a root URL recursively"
+    description = "Fetch content from one or more web pages, following links recursively."
     icon = "layout-template"
     name = "URLComponent"
@@ -32,10 +47,11 @@ class URLComponent(Component):
             tool_mode=True,
             placeholder="Enter a URL...",
             list_add_label="Add URL",
             input_types=[],
         ),
-        IntInput(
+        SliderInput(
             name="max_depth",
-            display_name="Max Depth",
+            display_name="Depth",
             info=(
                 "Controls how many 'clicks' away from the initial page the crawler will go:\n"
                 "- depth 1: only the initial page\n"
@@ -43,8 +59,14 @@ class URLComponent(Component):
                 "- depth 3: initial page + direct links + links found on those direct link pages\n"
                 "Note: This is about link traversal, not URL path depth."
             ),
-            value=1,
+            value=DEFAULT_MAX_DEPTH,
+            range_spec=RangeSpec(min=1, max=5, step=1),
             required=False,
+            min_label=" ",
+            max_label=" ",
+            min_label_icon="None",
+            max_label_icon="None",
+            # slider_input=True
         ),
         BoolInput(
             name="prevent_outside",
@@ -73,14 +95,14 @@ class URLComponent(Component):
             display_name="Output Format",
             info="Output Format. Use 'Text' to extract the text from the HTML or 'HTML' for the raw HTML content.",
             options=["Text", "HTML"],
-            value="Text",
+            value=DEFAULT_FORMAT,
             advanced=True,
         ),
         IntInput(
             name="timeout",
             display_name="Timeout",
             info="Timeout for the request in seconds.",
-            value=30,
+            value=DEFAULT_TIMEOUT,
             required=False,
             advanced=True,
         ),
@@ -106,120 +128,170 @@ class URLComponent(Component):
             advanced=True,
             input_types=["DataFrame"],
         ),
+        BoolInput(
+            name="filter_text_html",
+            display_name="Filter Text/HTML",
+            info="If enabled, filters out text/css content type from the results.",
+            value=True,
+            required=False,
+            advanced=True,
+        ),
+        BoolInput(
+            name="continue_on_failure",
+            display_name="Continue on Failure",
+            info="If enabled, continues crawling even if some requests fail.",
+            value=True,
+            required=False,
+            advanced=True,
+        ),
+        BoolInput(
+            name="check_response_status",
+            display_name="Check Response Status",
+            info="If enabled, checks the response status of the request.",
+            value=False,
+            required=False,
+            advanced=True,
+        ),
+        BoolInput(
+            name="autoset_encoding",
+            display_name="Autoset Encoding",
+            info="If enabled, automatically sets the encoding of the request.",
+            value=True,
+            required=False,
+            advanced=True,
+        ),
     ]

     outputs = [
-        Output(display_name="Data", name="data", method="fetch_content"),
-        Output(display_name="Message", name="text", method="fetch_content_text"),
-        Output(display_name="DataFrame", name="dataframe", method="as_dataframe"),
+        Output(display_name="Result", name="page_results", method="fetch_content"),
+        Output(display_name="Raw Result", name="raw_results", method="as_message"),
     ]

-    def validate_url(self, string: str) -> bool:
-        """Validates if the given string matches URL pattern."""
-        url_regex = re.compile(
-            r"^(https?:\/\/)?" r"(www\.)?" r"([a-zA-Z0-9.-]+)" r"(\.[a-zA-Z]{2,})?" r"(:\d+)?" r"(\/[^\s]*)?$",
-            re.IGNORECASE,
-        )
-        return bool(url_regex.match(string))
+    @staticmethod
+    def validate_url(url: str) -> bool:
+        """Validates if the given string matches URL pattern.
+
+        Args:
+            url: The URL string to validate
+
+        Returns:
+            bool: True if the URL is valid, False otherwise
+        """
+        return bool(URL_REGEX.match(url))

     def ensure_url(self, url: str) -> str:
-        """Ensures the given string is a valid URL."""
+        """Ensures the given string is a valid URL.
+
+        Args:
+            url: The URL string to validate and normalize
+
+        Returns:
+            str: The normalized URL
+
+        Raises:
+            ValueError: If the URL is invalid
+        """
+        url = url.strip()
         if not url.startswith(("http://", "https://")):
-            url = "http://" + url
+            url = "https://" + url

         if not self.validate_url(url):
-            error_msg = "Invalid URL - " + url
-            raise ValueError(error_msg)
+            msg = f"Invalid URL: {url}"
+            raise ValueError(msg)

         return url

-    def fetch_content(self) -> list[Data]:
-        """Load documents from the URLs."""
-        all_docs = []
-        data = []
-        try:
-            urls = list({self.ensure_url(url.strip()) for url in self.urls if url.strip()})
-
-            no_urls_msg = "No valid URLs provided."
-            if not urls:
-                raise ValueError(no_urls_msg)
-
-            # If there's only one URL, we'll make sure to propagate any errors
-            single_url = len(urls) == 1
-
-            for processed_url in urls:
-                msg = f"Loading documents from {processed_url}"
-                logger.info(msg)
-
-                # Create headers dictionary
-                headers_dict = {header["key"]: header["value"] for header in self.headers}
-
-                # Configure RecursiveUrlLoader with httpx-compatible settings
-                extractor = (lambda x: x) if self.format == "HTML" else (lambda x: BeautifulSoup(x, "lxml").get_text())
-
-                # Modified settings for RecursiveUrlLoader
-                # Note: We need to pass a compatible client or settings to RecursiveUrlLoader
-                # This will depend on how RecursiveUrlLoader is implemented
-                loader = RecursiveUrlLoader(
-                    url=processed_url,
-                    max_depth=self.max_depth,
-                    prevent_outside=self.prevent_outside,
-                    use_async=self.use_async,
-                    continue_on_failure=not single_url,
-                    extractor=extractor,
-                    timeout=self.timeout,
-                    headers=headers_dict,
-                )
-                try:
-                    docs = loader.load()
-
-                    if not docs:
-                        msg = f"No documents found for {processed_url}"
-                        logger.warning(msg)
-                        if single_url:
-                            message = f"No documents found for {processed_url}"
-                            raise ValueError(message)
-                    else:
-                        msg = f"Found {len(docs)} documents from {processed_url}"
-                        logger.info(msg)
-                        all_docs.extend(docs)
-                except (httpx.HTTPError, httpx.RequestError) as e:
-                    msg = f"Error loading documents from {processed_url}: {e}"
-                    logger.exception(msg)
-                    if single_url:
-                        raise  # Re-raise the exception if it's the only URL
-                except UnicodeDecodeError as e:
-                    msg = f"Error decoding content from {processed_url}: {e}"
-                    logger.error(msg)
-                    if single_url:
-                        raise  # Re-raise the exception if it's the only URL
-                except Exception as e:
-                    msg = f"Unexpected error loading documents from {processed_url}: {e}"
-                    logger.exception(msg)
-                    if single_url:
-                        raise  # Re-raise the exception if it's the only URL
-
-            data = [Data(text=doc.page_content, **doc.metadata) for doc in all_docs]
-            self.status = data
-        except Exception as e:
-            error_msg = e.message if hasattr(e, "message") else e
-            msg = f"Error loading documents: {error_msg!s}"
-            logger.exception(msg)
-            raise ValueError(msg) from e
-
-        return data
-
-    def fetch_content_text(self) -> Message:
-        """Load documents and return their text content."""
-        data = self.fetch_content()
-        result_string = data_to_text("{text}", data)
-        self.status = result_string
-        return Message(text=result_string)
-
-    def as_dataframe(self) -> DataFrame:
-        """Convert the documents to a DataFrame."""
-        data_frame = DataFrame(self.fetch_content())
-        self.status = data_frame
-        return data_frame
+    def _create_loader(self, url: str) -> RecursiveUrlLoader:
+        """Creates a RecursiveUrlLoader instance with the configured settings.
+
+        Args:
+            url: The URL to load
+
+        Returns:
+            RecursiveUrlLoader: Configured loader instance
+        """
+        headers_dict = {header["key"]: header["value"] for header in self.headers}
+        extractor = (lambda x: x) if self.format == "HTML" else (lambda x: BeautifulSoup(x, "lxml").get_text())
+
+        return RecursiveUrlLoader(
+            url=url,
+            max_depth=self.max_depth,
+            prevent_outside=self.prevent_outside,
+            use_async=self.use_async,
+            extractor=extractor,
+            timeout=self.timeout,
+            headers=headers_dict,
+            check_response_status=self.check_response_status,
+            continue_on_failure=self.continue_on_failure,
+            base_url=url,  # Add base_url to ensure consistent domain crawling
+            autoset_encoding=self.autoset_encoding,  # Enable automatic encoding detection
+            exclude_dirs=[],  # Allow customization of excluded directories
+            link_regex=None,  # Allow customization of link filtering
+        )
+
+    def fetch_url_contents(self) -> list[dict]:
+        """Load documents from the configured URLs.
+
+        Returns:
+            List[Data]: List of Data objects containing the fetched content
+
+        Raises:
+            ValueError: If no valid URLs are provided or if there's an error loading documents
+        """
+        try:
+            urls = list({self.ensure_url(url) for url in self.urls if url.strip()})
+            logger.info(f"URLs: {urls}")
+            if not urls:
+                msg = "No valid URLs provided."
+                raise ValueError(msg)
+
+            all_docs = []
+            for url in urls:
+                logger.info(f"Loading documents from {url}")
+                try:
+                    loader = self._create_loader(url)
+                    docs = loader.load()
+
+                    if not docs:
+                        logger.warning(f"No documents found for {url}")
+                        continue
+
+                    logger.info(f"Found {len(docs)} documents from {url}")
+                    all_docs.extend(docs)
+
+                except requests.exceptions.RequestException as e:
+                    logger.exception(f"Error loading documents from {url}: {e}")
+                    continue
+
+            if not all_docs:
+                msg = "No documents were successfully loaded from any URL"
+                raise ValueError(msg)
+
+            # data = [Data(text=doc.page_content, **doc.metadata) for doc in all_docs]
+            data = [
+                {
+                    "text": safe_convert(doc.page_content, clean_data=True),
+                    "url": doc.metadata.get("source", ""),
+                    "title": doc.metadata.get("title", ""),
+                    "description": doc.metadata.get("description", ""),
+                    "content_type": doc.metadata.get("content_type", ""),
+                    "language": doc.metadata.get("language", ""),
+                }
+                for doc in all_docs
+            ]
+        except Exception as e:
+            error_msg = e.message if hasattr(e, "message") else e
+            msg = f"Error loading documents: {error_msg!s}"
+            logger.exception(msg)
+            raise ValueError(msg) from e
+
+        self.status = data
+        return data
+
+    def fetch_content(self) -> DataFrame:
+        """Convert the documents to a DataFrame."""
+        return DataFrame(data=self.fetch_url_contents())
+
+    def as_message(self) -> Message:
+        """Convert the documents to a Message."""
+        url_contents = self.fetch_url_contents()
+        return Message(text="\n\n".join([x["text"] for x in url_contents]), data={"data": url_contents})
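The hunk above promotes the URL pattern to a module-level `URL_REGEX` constant and makes `validate_url` a static method, so the validation logic can be exercised with no component instance at all. A minimal standalone sketch (the regex is copied verbatim from the diff; the free-function names `is_probable_url` and `normalize_url` are ours, not Langflow's):

```python
import re

# Regex copied from the URL component diff above.
URL_REGEX = re.compile(
    r"^(https?:\/\/)?" r"(www\.)?" r"([a-zA-Z0-9.-]+)" r"(\.[a-zA-Z]{2,})?" r"(:\d+)?" r"(\/[^\s]*)?$",
    re.IGNORECASE,
)


def is_probable_url(url: str) -> bool:
    # Mirrors the new static validate_url: a plain anchored regex match, no network I/O.
    return bool(URL_REGEX.match(url))


def normalize_url(url: str) -> str:
    # Mirrors ensure_url: strip whitespace, default to https://, then validate.
    url = url.strip()
    if not url.startswith(("http://", "https://")):
        url = "https://" + url
    if not is_probable_url(url):
        raise ValueError(f"Invalid URL: {url}")
    return url
```

Note one behavior change visible in the diff: scheme-less URLs are now upgraded to `https://` rather than `http://`.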
@@ -5,6 +5,7 @@ import orjson
 from fastapi.encoders import jsonable_encoder

 from langflow.base.io.chat import ChatComponent
+from langflow.helpers.data import safe_convert
 from langflow.inputs import BoolInput
 from langflow.inputs.inputs import HandleInput
 from langflow.io import DropdownInput, MessageTextInput, Output
@@ -157,6 +158,15 @@ class ChatOutput(ChatComponent):
         self.status = message
         return message

+    def _serialize_data(self, data: Data) -> str:
+        """Serialize Data object to JSON string."""
+        # Convert data.data to JSON-serializable format
+        serializable_data = jsonable_encoder(data.data)
+        # Serialize with orjson, enabling pretty printing with indentation
+        json_bytes = orjson.dumps(serializable_data, option=orjson.OPT_INDENT_2)
+        # Convert bytes to string and wrap in Markdown code blocks
+        return "```json\n" + json_bytes.decode("utf-8") + "\n```"
+
     def _validate_input(self) -> None:
         """Validate the input data and raise ValueError if invalid."""
         if self.input_value is None:
@@ -180,51 +190,11 @@ class ChatOutput(ChatComponent):
         msg = f"Expected Data or DataFrame or Message or str, Generator or None, got {type_name}"
         raise TypeError(msg)

-    def _serialize_data(self, data: Data) -> str:
-        """Serialize Data object to JSON string."""
-        # Convert data.data to JSON-serializable format
-        serializable_data = jsonable_encoder(data.data)
-        # Serialize with orjson, enabling pretty printing with indentation
-        json_bytes = orjson.dumps(serializable_data, option=orjson.OPT_INDENT_2)
-        # Convert bytes to string and wrap in Markdown code blocks
-        return "```json\n" + json_bytes.decode("utf-8") + "\n```"
-
-    def _safe_convert(self, data: Any) -> str:
-        """Safely convert input data to string."""
-        try:
-            if isinstance(data, str):
-                return data
-            if isinstance(data, Message):
-                return data.get_text()
-            if isinstance(data, Data):
-                return self._serialize_data(data)
-            if isinstance(data, DataFrame):
-                if self.clean_data:
-                    # Remove empty rows
-                    data = data.dropna(how="all")
-                    # Remove empty lines in each cell
-                    data = data.replace(r"^\s*$", "", regex=True)
-                    # Replace multiple newlines with a single newline
-                    data = data.replace(r"\n+", "\n", regex=True)
-
-                # Replace pipe characters to avoid markdown table issues
-                processed_data = data.replace(r"\|", r"\\|", regex=True)
-
-                processed_data = processed_data.map(
-                    lambda x: str(x).replace("\n", "<br/>") if isinstance(x, str) else x
-                )
-
-                return processed_data.to_markdown(index=False)
-            return str(data)
-        except (ValueError, TypeError, AttributeError) as e:
-            msg = f"Error converting data: {e!s}"
-            raise ValueError(msg) from e
-
     def convert_to_string(self) -> str | Generator[Any, None, None]:
         """Convert input data to string with proper error handling."""
         self._validate_input()
         if isinstance(self.input_value, list):
-            return "\n".join([self._safe_convert(item) for item in self.input_value])
+            return "\n".join([safe_convert(item, clean_data=self.clean_data) for item in self.input_value])
         if isinstance(self.input_value, Generator):
             return self.input_value
-        return self._safe_convert(self.input_value)
+        return safe_convert(self.input_value)
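The relocated `_serialize_data` wraps pretty-printed JSON in a Markdown code fence so the chat UI renders structured `Data` payloads readably. The same shape can be sketched with the stdlib `json` module (an assumption-laden stand-in: the real component uses `orjson` with `OPT_INDENT_2` plus FastAPI's `jsonable_encoder`, not stdlib `json`):

```python
import json


def serialize_data_payload(payload: dict) -> str:
    # Pretty-print with 2-space indentation, as orjson.OPT_INDENT_2 produces.
    # default=str is a crude stand-in for jsonable_encoder's type coercion.
    body = json.dumps(payload, indent=2, default=str)
    # Wrap in a Markdown code fence; the fence is built from pieces so this
    # example itself contains no literal triple backticks.
    fence = "`" * 3
    return f"{fence}json\n{body}\n{fence}"
```

The fenced output means downstream renderers treat the payload as code rather than prose, which is why the method lives on the chat output component rather than in the shared helpers.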
@@ -12,6 +12,7 @@ class DataToDataFrameComponent(Component):
     )
     icon = "table"
     name = "DataToDataFrame"
+    legacy = True

     inputs = [
         DataInput(
@@ -12,6 +12,7 @@ class MessageToDataComponent(Component):
     icon = "message-square-share"
     beta = True
     name = "MessagetoData"
+    legacy = True

     inputs = [
         MessageInput(
@@ -1,7 +1,5 @@
-import json
-from typing import Any
-
 from langflow.custom import Component
+from langflow.helpers.data import safe_convert
 from langflow.io import (
     BoolInput,
     HandleInput,
@@ -138,36 +136,13 @@ class ParserComponent(Component):
         self.status = combined_text
         return Message(text=combined_text)

-    def _safe_convert(self, data: Any) -> str:
-        """Safely convert input data to string."""
-        try:
-            if isinstance(data, str):
-                return data
-            if isinstance(data, Message):
-                return data.get_text()
-            if isinstance(data, Data):
-                return json.dumps(data.data)
-            if isinstance(data, DataFrame):
-                if hasattr(self, "clean_data") and self.clean_data:
-                    # Remove empty rows
-                    data = data.dropna(how="all")
-                    # Remove empty lines in each cell
-                    data = data.replace(r"^\s*$", "", regex=True)
-                    # Replace multiple newlines with a single newline
-                    data = data.replace(r"\n+", "\n", regex=True)
-                return data.to_markdown(index=False)
-            return str(data)
-        except (ValueError, TypeError, AttributeError) as e:
-            msg = f"Error converting data: {e!s}"
-            raise ValueError(msg) from e
-
     def convert_to_string(self) -> Message:
         """Convert input data to string with proper error handling."""
         result = ""
         if isinstance(self.input_data, list):
-            result = "\n".join([self._safe_convert(item) for item in self.input_data])
+            result = "\n".join([safe_convert(item, clean_data=self.clean_data or False) for item in self.input_data])
         else:
-            result = self._safe_convert(self.input_data)
+            result = safe_convert(self.input_data or False)
         self.log(f"Converted to string with length: {len(result)}")

         message = Message(text=result)
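Both ChatOutput and Parser now delegate to the shared `langflow.helpers.data.safe_convert` instead of keeping private `_safe_convert` copies. The cleaning rules those copies applied to DataFrame cells can be sketched on plain strings (a deliberately simplified stand-in; the real helper also dispatches on Message, Data, and DataFrame inputs and renders tables as Markdown):

```python
import re


def clean_cell(text: str) -> str:
    # Blank out whitespace-only content (the r"^\s*$" -> "" replace in the diff).
    if re.fullmatch(r"\s*", text):
        return ""
    # Collapse runs of newlines into one (the r"\n+" -> "\n" replace in the diff).
    return re.sub(r"\n+", "\n", text)
```

Centralizing these rules is the point of the refactor: the two components previously drifted (only ChatOutput escaped pipe characters, for example), and a single helper keeps the cleaning behavior consistent.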
src/backend/base/langflow/components/processing/save_file.py (new file, 206 lines)
@ -0,0 +1,206 @@
|
|||
import json
|
||||
from collections.abc import AsyncIterator, Iterator
|
||||
from pathlib import Path
|
||||
|
||||
import orjson
|
||||
import pandas as pd
|
||||
from fastapi import UploadFile
|
||||
from fastapi.encoders import jsonable_encoder
|
||||
|
||||
from langflow.api.v2.files import upload_user_file
|
||||
from langflow.custom import Component
|
||||
from langflow.io import DropdownInput, HandleInput, Output, StrInput
|
||||
from langflow.schema import Data, DataFrame, Message
|
||||
from langflow.services.auth.utils import create_user_longterm_token
|
||||
from langflow.services.database.models.user.crud import get_user_by_id
|
||||
from langflow.services.deps import get_session, get_settings_service, get_storage_service
|
||||
|
||||
|
||||
class SaveToFileComponent(Component):
|
||||
display_name = "Save File"
|
||||
description = "Save data to a local file in the selected format."
|
||||
icon = "save"
|
||||
name = "SaveToFile"
|
||||
|
||||
# File format options for different types
|
||||
DATA_FORMAT_CHOICES = ["csv", "excel", "json", "markdown"]
|
||||
MESSAGE_FORMAT_CHOICES = ["txt", "json", "markdown"]
|
||||
|
||||
inputs = [
|
||||
HandleInput(
|
||||
name="input",
|
||||
display_name="Input",
|
||||
info="The input to save.",
|
||||
dynamic=True,
|
||||
input_types=["Data", "DataFrame", "Message"],
|
||||
required=True,
|
||||
),
|
||||
StrInput(
|
||||
name="file_name",
|
||||
display_name="File Name",
|
||||
info="Name file will be saved as (without extension).",
|
||||
required=True,
|
||||
),
|
||||
DropdownInput(
|
||||
name="file_format",
|
||||
display_name="File Format",
|
||||
options=DATA_FORMAT_CHOICES + MESSAGE_FORMAT_CHOICES,
|
||||
info="Select the file format to save the input. If not provided, the default format will be used.",
|
||||
value="",
|
||||
advanced=True,
|
||||
),
|
||||
]
|
||||
|
||||
outputs = [
|
||||
Output(
|
||||
name="confirmation",
|
||||
display_name="Confirmation",
|
||||
method="save_to_file",
|
||||
),
|
||||
]
|
||||
|
||||
async def save_to_file(self) -> str:
|
||||
"""Save the input to a file and upload it, returning a confirmation message."""
|
||||
# Validate inputs
|
||||
if not self.file_name:
|
||||
msg = "File name must be provided."
|
||||
raise ValueError(msg)
|
||||
if not self._get_input_type():
|
||||
msg = "Input type is not set."
|
||||
raise ValueError(msg)
|
||||
|
||||
# Validate file format based on input type
|
||||
file_format = self.file_format or self._get_default_format()
|
||||
allowed_formats = (
|
||||
self.MESSAGE_FORMAT_CHOICES if self._get_input_type() == "Message" else self.DATA_FORMAT_CHOICES
|
||||
)
|
||||
if file_format not in allowed_formats:
|
||||
msg = f"Invalid file format '{file_format}' for {self._get_input_type()}. Allowed: {allowed_formats}"
|
||||
raise ValueError(msg)
|
||||
|
||||
# Prepare file path
|
||||
file_path = Path(self.file_name).expanduser()
|
||||
if not file_path.parent.exists():
|
||||
file_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
file_path = self._adjust_file_path_with_format(file_path, file_format)
|
||||
|
||||
# Save the input to file based on type
|
||||
if self._get_input_type() == "DataFrame":
|
||||
confirmation = self._save_dataframe(self.input, file_path, file_format)
|
||||
elif self._get_input_type() == "Data":
|
||||
confirmation = self._save_data(self.input, file_path, file_format)
|
||||
elif self._get_input_type() == "Message":
|
||||
confirmation = await self._save_message(self.input, file_path, file_format)
|
||||
else:
|
||||
msg = f"Unsupported input type: {self._get_input_type()}"
|
||||
raise ValueError(msg)
|
||||
|
||||
# Upload the saved file
|
||||
await self._upload_file(file_path)
|
||||
|
||||
return confirmation
|
||||
|
||||
def _get_input_type(self) -> str:
|
||||
"""Determine the input type based on the provided input."""
|
||||
if isinstance(self.input, DataFrame):
|
||||
return "DataFrame"
|
||||
if isinstance(self.input, Data):
|
||||
return "Data"
|
||||
if isinstance(self.input, Message):
|
||||
return "Message"
|
||||
|
||||
msg = f"Unsupported input type: {type(self.input)}"
|
||||
raise ValueError(msg)
|
||||
|
    def _get_default_format(self) -> str:
        """Return the default file format based on input type."""
        if self._get_input_type() == "DataFrame":
            return "csv"
        if self._get_input_type() == "Data":
            return "json"
        if self._get_input_type() == "Message":
            return "markdown"
        return "json"  # Fallback

    def _adjust_file_path_with_format(self, path: Path, fmt: str) -> Path:
        """Adjust the file path to include the correct extension."""
        file_extension = path.suffix.lower().lstrip(".")
        if fmt == "excel":
            return Path(f"{path}.xlsx").expanduser() if file_extension not in ["xlsx", "xls"] else path
        return Path(f"{path}.{fmt}").expanduser() if file_extension != fmt else path

    async def _upload_file(self, file_path: Path) -> None:
        """Upload the saved file using the upload_user_file service."""
        if not file_path.exists():
            msg = f"File not found: {file_path}"
            raise FileNotFoundError(msg)

        with file_path.open("rb") as f:
            async for db in get_session():
                user_id, _ = await create_user_longterm_token(db)
                current_user = await get_user_by_id(db, user_id)

                await upload_user_file(
                    file=UploadFile(filename=file_path.name, file=f, size=file_path.stat().st_size),
                    session=db,
                    current_user=current_user,
                    storage_service=get_storage_service(),
                    settings_service=get_settings_service(),
                )

    def _save_dataframe(self, dataframe: DataFrame, path: Path, fmt: str) -> str:
        """Save a DataFrame to the specified file format."""
        if fmt == "csv":
            dataframe.to_csv(path, index=False)
        elif fmt == "excel":
            dataframe.to_excel(path, index=False, engine="openpyxl")
        elif fmt == "json":
            dataframe.to_json(path, orient="records", indent=2)
        elif fmt == "markdown":
            path.write_text(dataframe.to_markdown(index=False), encoding="utf-8")
        else:
            msg = f"Unsupported DataFrame format: {fmt}"
            raise ValueError(msg)
        return f"DataFrame saved successfully as '{path}'"

    def _save_data(self, data: Data, path: Path, fmt: str) -> str:
        """Save a Data object to the specified file format."""
        if fmt == "csv":
            pd.DataFrame(data.data).to_csv(path, index=False)
        elif fmt == "excel":
            pd.DataFrame(data.data).to_excel(path, index=False, engine="openpyxl")
        elif fmt == "json":
            path.write_text(
                orjson.dumps(jsonable_encoder(data.data), option=orjson.OPT_INDENT_2).decode("utf-8"), encoding="utf-8"
            )
        elif fmt == "markdown":
            path.write_text(pd.DataFrame(data.data).to_markdown(index=False), encoding="utf-8")
        else:
            msg = f"Unsupported Data format: {fmt}"
            raise ValueError(msg)
        return f"Data saved successfully as '{path}'"

    async def _save_message(self, message: Message, path: Path, fmt: str) -> str:
        """Save a Message to the specified file format, handling async iterators."""
        content = ""
        if message.text is None:
            content = ""
        elif isinstance(message.text, AsyncIterator):
            async for item in message.text:
                content += str(item) + " "
            content = content.strip()
        elif isinstance(message.text, Iterator):
            content = " ".join(str(item) for item in message.text)
        else:
            content = str(message.text)

        if fmt == "txt":
            path.write_text(content, encoding="utf-8")
        elif fmt == "json":
            path.write_text(json.dumps({"message": content}, indent=2), encoding="utf-8")
        elif fmt == "markdown":
            path.write_text(f"**Message:**\n\n{content}", encoding="utf-8")
        else:
            msg = f"Unsupported Message format: {fmt}"
            raise ValueError(msg)
        return f"Message saved successfully as '{path}'"

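The extension rule in `_adjust_file_path_with_format` can be sketched standalone: append the extension implied by the chosen format unless the path already carries it, with Excel accepting either `.xlsx` or `.xls`. The `adjust_path` name below is illustrative, not part of the component's API:

```python
from pathlib import Path


def adjust_path(path: Path, fmt: str) -> Path:
    """Append the extension implied by fmt unless the path already has it."""
    ext = path.suffix.lower().lstrip(".")
    if fmt == "excel":
        # Excel accepts either .xlsx or .xls; only append when neither is present.
        return Path(f"{path}.xlsx") if ext not in ("xlsx", "xls") else path
    return Path(f"{path}.{fmt}") if ext != fmt else path


print(adjust_path(Path("report"), "csv"))      # report.csv
print(adjust_path(Path("report.csv"), "csv"))  # report.csv
print(adjust_path(Path("book.xls"), "excel"))  # book.xls
```

Note that a path like `data.2024` has suffix `.2024`, so the rule appends a second extension (`data.2024.csv`), matching the component's behavior.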
@@ -1,182 +0,0 @@
import json
from collections.abc import AsyncIterator, Iterator
from pathlib import Path

import pandas as pd

from langflow.custom import Component
from langflow.io import (
    DataFrameInput,
    DataInput,
    DropdownInput,
    MessageInput,
    Output,
    StrInput,
)
from langflow.schema import Data, DataFrame, Message


class SaveToFileComponent(Component):
    display_name = "Save to File"
    description = "Save DataFrames, Data, or Messages to various file formats."
    icon = "save"
    name = "SaveToFile"

    # File format options for different types
    DATA_FORMAT_CHOICES = ["csv", "excel", "json", "markdown"]
    MESSAGE_FORMAT_CHOICES = ["txt", "json", "markdown"]

    inputs = [
        DropdownInput(
            name="input_type",
            display_name="Input Type",
            options=["DataFrame", "Data", "Message"],
            info="Select the type of input to save.",
            value="DataFrame",
            real_time_refresh=True,
        ),
        DataFrameInput(
            name="df",
            display_name="DataFrame",
            info="The DataFrame to save.",
            dynamic=True,
            show=True,
        ),
        DataInput(
            name="data",
            display_name="Data",
            info="The Data object to save.",
            dynamic=True,
            show=False,
        ),
        MessageInput(
            name="message",
            display_name="Message",
            info="The Message to save.",
            dynamic=True,
            show=False,
        ),
        DropdownInput(
            name="file_format",
            display_name="File Format",
            options=DATA_FORMAT_CHOICES,
            info="Select the file format to save the input.",
            real_time_refresh=True,
        ),
        StrInput(
            name="file_path",
            display_name="File Path (including filename)",
            info="The full file path (including filename and extension).",
            value="./output",
        ),
    ]

    outputs = [
        Output(
            name="confirmation",
            display_name="Confirmation",
            method="save_to_file",
            info="Confirmation message after saving the file.",
        ),
    ]

    def update_build_config(self, build_config, field_value, field_name=None):
        # Hide/show dynamic fields based on the selected input type
        if field_name == "input_type":
            build_config["df"]["show"] = field_value == "DataFrame"
            build_config["data"]["show"] = field_value == "Data"
            build_config["message"]["show"] = field_value == "Message"

            if field_value in {"DataFrame", "Data"}:
                build_config["file_format"]["options"] = self.DATA_FORMAT_CHOICES
            elif field_value == "Message":
                build_config["file_format"]["options"] = self.MESSAGE_FORMAT_CHOICES

        return build_config

    def save_to_file(self) -> str:
        input_type = self.input_type
        file_format = self.file_format
        file_path = Path(self.file_path).expanduser()

        # Ensure the directory exists
        if not file_path.parent.exists():
            file_path.parent.mkdir(parents=True, exist_ok=True)

        file_path = self._adjust_file_path_with_format(file_path, file_format)

        if input_type == "DataFrame":
            dataframe = self.df
            return self._save_dataframe(dataframe, file_path, file_format)
        if input_type == "Data":
            data = self.data
            return self._save_data(data, file_path, file_format)
        if input_type == "Message":
            message = self.message
            return self._save_message(message, file_path, file_format)

        error_msg = f"Unsupported input type: {input_type}"
        raise ValueError(error_msg)

    def _adjust_file_path_with_format(self, path: Path, fmt: str) -> Path:
        file_extension = path.suffix.lower().lstrip(".")

        if fmt == "excel":
            return Path(f"{path}.xlsx").expanduser() if file_extension not in ["xlsx", "xls"] else path

        return Path(f"{path}.{fmt}").expanduser() if file_extension != fmt else path

    def _save_dataframe(self, dataframe: DataFrame, path: Path, fmt: str) -> str:
        if fmt == "csv":
            dataframe.to_csv(path, index=False)
        elif fmt == "excel":
            dataframe.to_excel(path, index=False, engine="openpyxl")
        elif fmt == "json":
            dataframe.to_json(path, orient="records", indent=2)
        elif fmt == "markdown":
            path.write_text(dataframe.to_markdown(index=False), encoding="utf-8")
        else:
            error_msg = f"Unsupported DataFrame format: {fmt}"
            raise ValueError(error_msg)

        return f"DataFrame saved successfully as '{path}'"

    def _save_data(self, data: Data, path: Path, fmt: str) -> str:
        if fmt == "csv":
            pd.DataFrame(data.data).to_csv(path, index=False)
        elif fmt == "excel":
            pd.DataFrame(data.data).to_excel(path, index=False, engine="openpyxl")
        elif fmt == "json":
            path.write_text(json.dumps(data.data, indent=2), encoding="utf-8")
        elif fmt == "markdown":
            path.write_text(pd.DataFrame(data.data).to_markdown(index=False), encoding="utf-8")
        else:
            error_msg = f"Unsupported Data format: {fmt}"
            raise ValueError(error_msg)

        return f"Data saved successfully as '{path}'"

    def _save_message(self, message: Message, path: Path, fmt: str) -> str:
        if message.text is None:
            content = ""
        elif isinstance(message.text, AsyncIterator):
            # AsyncIterator needs to be handled differently
            error_msg = "AsyncIterator not supported"
            raise ValueError(error_msg)
        elif isinstance(message.text, Iterator):
            # Convert iterator to string
            content = " ".join(str(item) for item in message.text)
        else:
            content = str(message.text)

        if fmt == "txt":
            path.write_text(content, encoding="utf-8")
        elif fmt == "json":
            path.write_text(json.dumps({"message": content}, indent=2), encoding="utf-8")
        elif fmt == "markdown":
            path.write_text(f"**Message:**\n\n{content}", encoding="utf-8")
        else:
            error_msg = f"Unsupported Message format: {fmt}"
            raise ValueError(error_msg)

        return f"Message saved successfully as '{path}'"

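One behavioral change worth noting: the removed file rejected `AsyncIterator` message text with a `ValueError`, while the replacement's `_save_message` drains the stream into a single space-joined string. A minimal standalone sketch of that draining logic (the `collect_text` and `demo` names are made up for illustration):

```python
import asyncio
from collections.abc import AsyncIterator


async def collect_text(chunks: AsyncIterator[str]) -> str:
    """Join streamed chunks with spaces, mirroring the new _save_message branch."""
    content = ""
    async for item in chunks:
        content += str(item) + " "
    return content.strip()


async def demo() -> str:
    async def stream():
        # Stand-in for a streaming LLM response.
        for word in ("hello", "streamed", "world"):
            yield word

    return await collect_text(stream())


print(asyncio.run(demo()))  # hello streamed world
```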
@@ -1,3 +1,3 @@
-from .data import data_to_text, docs_to_data, messages_to_text
+from .data import data_to_text, docs_to_data, messages_to_text, safe_convert

-__all__ = ["data_to_text", "docs_to_data", "messages_to_text"]
+__all__ = ["data_to_text", "docs_to_data", "messages_to_text", "safe_convert"]

@@ -1,8 +1,12 @@
 import re
 from collections import defaultdict
 from typing import Any

+import orjson
+from fastapi.encoders import jsonable_encoder
 from langchain_core.documents import Document

-from langflow.schema import Data
+from langflow.schema import Data, DataFrame
 from langflow.schema.message import Message

@@ -139,3 +143,63 @@ def messages_to_text(template: str, messages: Message | list[Message]) -> str:

     formated_messages = [template.format(data=message.model_dump(), **message.model_dump()) for message in messages_]
     return "\n".join(formated_messages)
+
+
+def clean_string(s):
+    # Remove empty lines
+    s = re.sub(r"^\s*$", "", s, flags=re.MULTILINE)
+    # Replace three or more newlines with a double newline
+    return re.sub(r"\n{3,}", "\n\n", s)
+
+
+def _serialize_data(data: Data) -> str:
+    """Serialize Data object to JSON string."""
+    # Convert data.data to JSON-serializable format
+    serializable_data = jsonable_encoder(data.data)
+    # Serialize with orjson, enabling pretty printing with indentation
+    json_bytes = orjson.dumps(serializable_data, option=orjson.OPT_INDENT_2)
+    # Convert bytes to string and wrap in Markdown code blocks
+    return "```json\n" + json_bytes.decode("utf-8") + "\n```"
+
+
+def safe_convert(data: Any, *, clean_data: bool = False) -> str:
+    """Safely convert input data to string."""
+    try:
+        if isinstance(data, str):
+            return clean_string(data)
+        if isinstance(data, Message):
+            return data.get_text()
+        if isinstance(data, Data):
+            return clean_string(_serialize_data(data))
+        if isinstance(data, DataFrame):
+            if clean_data:
+                # Remove empty rows
+                data = data.dropna(how="all")
+                # Remove empty lines in each cell
+                data = data.replace(r"^\s*$", "", regex=True)
+                # Replace multiple newlines with a single newline
+                data = data.replace(r"\n+", "\n", regex=True)
+
+            # Replace pipe characters to avoid markdown table issues
+            processed_data = data.replace(r"\|", r"\\|", regex=True)
+
+            return processed_data.to_markdown(index=False)
+        return clean_string(str(data))
+    except (ValueError, TypeError, AttributeError) as e:
+        msg = f"Error converting data: {e!s}"
+        raise ValueError(msg) from e
+
+
+def data_to_dataframe(data: Data | list[Data]) -> DataFrame:
+    """Converts a Data object or a list of Data objects to a DataFrame.
+
+    Args:
+        data (Data | list[Data]): The Data object or list of Data objects to convert.
+
+    Returns:
+        DataFrame: The converted DataFrame.
+    """
+    if isinstance(data, Data):
+        return DataFrame([data.data])
+    return DataFrame(data=[d.data for d in data])

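The `DataFrame` branch of `safe_convert` can be exercised standalone with plain pandas; the frame and column names below are made up for illustration, and the final `to_markdown()` call (which requires the `tabulate` package) is left to the caller:

```python
import pandas as pd

# Mirroring safe_convert's clean_data path: drop all-empty rows, collapse
# newline runs inside cells, then escape pipes so a later to_markdown()
# call yields a valid table.
df = pd.DataFrame({"name": ["a|b", None], "note": ["line1\n\n\nline2", None]})
df = df.dropna(how="all")                   # remove rows that are entirely empty
df = df.replace(r"\n+", "\n", regex=True)   # collapse newline runs in cells
df = df.replace(r"\|", r"\\|", regex=True)  # escape markdown pipe characters
print(df.to_dict(orient="records"))
```

The pipe escaping matters because an unescaped `|` inside a cell would be read as a column separator in the markdown table output.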
File diff suppressed because one or more lines are too long

@@ -3,7 +3,7 @@ from textwrap import dedent
 from langflow.components.data import URLComponent
 from langflow.components.input_output import ChatOutput, TextInputComponent
 from langflow.components.languagemodels import OpenAIModelComponent
-from langflow.components.processing import ParseDataComponent
+from langflow.components.processing import ParserComponent
 from langflow.components.prompts import PromptComponent
 from langflow.graph import Graph

@@ -22,8 +22,8 @@ Blog:
     """)
     url_component = URLComponent()
     url_component.set(urls=["https://langflow.org/", "https://docs.langflow.org/"])
-    parse_data_component = ParseDataComponent()
-    parse_data_component.set(data=url_component.fetch_content)
+    parse_data_component = ParserComponent()
+    parse_data_component.set(input_data=url_component.fetch_content)

     text_input = TextInputComponent(_display_name="Instructions")
     text_input.set(
@@ -35,7 +35,7 @@ Blog:
     prompt_component.set(
         template=template,
         instructions=text_input.text_response,
-        references=parse_data_component.parse_data,
+        references=parse_data_component.parse_combined_text,
     )

     openai_component = OpenAIModelComponent()

@@ -1,7 +1,7 @@
 from langflow.components.data import FileComponent
 from langflow.components.input_output import ChatInput, ChatOutput
 from langflow.components.languagemodels import OpenAIModelComponent
-from langflow.components.processing import ParseDataComponent
+from langflow.components.processing import ParserComponent
 from langflow.components.prompts import PromptComponent
 from langflow.graph import Graph

@@ -22,14 +22,14 @@ Question:
 Answer:
 """
     file_component = FileComponent()
-    parse_data_component = ParseDataComponent()
-    parse_data_component.set(data=file_component.load_files)
+    parse_data_component = ParserComponent()
+    parse_data_component.set(input_data=file_component.load_dataframe)

     chat_input = ChatInput()
     prompt_component = PromptComponent()
     prompt_component.set(
         template=template,
-        context=parse_data_component.parse_data,
+        context=parse_data_component.parse_combined_text,
         question=chat_input.message_response,
     )

@@ -4,7 +4,7 @@ from langflow.components.data import FileComponent
 from langflow.components.embeddings import OpenAIEmbeddingsComponent
 from langflow.components.input_output import ChatInput, ChatOutput
 from langflow.components.languagemodels import OpenAIModelComponent
-from langflow.components.processing import ParseDataComponent
+from langflow.components.processing import ParserComponent
 from langflow.components.processing.split_text import SplitTextComponent
 from langflow.components.prompts import PromptComponent
 from langflow.components.vectorstores import AstraDBVectorStoreComponent

@@ -15,7 +15,7 @@ def ingestion_graph():
     # Ingestion Graph
     file_component = FileComponent()
     text_splitter = SplitTextComponent()
-    text_splitter.set(data_inputs=file_component.load_files)
+    text_splitter.set(data_inputs=file_component.load_dataframe)
     openai_embeddings = OpenAIEmbeddingsComponent()
     vector_store = AstraDBVectorStoreComponent()
     vector_store.set(

@@ -36,8 +36,8 @@ def rag_graph():
         embedding_model=openai_embeddings.build_embeddings,
     )

-    parse_data = ParseDataComponent()
-    parse_data.set(data=rag_vector_store.search_documents)
+    parse_data = ParserComponent()
+    parse_data.set(input_data=rag_vector_store.search_documents)
     prompt_component = PromptComponent()
     prompt_component.set(
         template=dedent("""Given the following context, answer the question.

@@ -45,7 +45,7 @@ def rag_graph():

 Question: {question}
 Answer:"""),
-        context=parse_data.parse_data,
+        context=parse_data.parse_combined_text,
         question=chat_input.message_response,
     )

@@ -1,10 +1,8 @@
 from unittest.mock import Mock, patch

 import pytest
-import respx
-from httpx import Response
 from langflow.components.data import URLComponent
-from langflow.schema import DataFrame, Message
+from langflow.schema import DataFrame

 from tests.base import ComponentTestBaseWithoutClient

@@ -42,142 +40,190 @@ class TestURLComponent(ComponentTestBaseWithoutClient):
         with patch("langchain_community.document_loaders.RecursiveUrlLoader.load") as mock:
             yield mock

-    def test_recursive_url_component(self, mock_recursive_loader):
+    def test_url_component_basic_functionality(self, mock_recursive_loader):
+        """Test basic URLComponent functionality."""
         component = URLComponent()
         component.set_attributes({"urls": ["https://example.com"], "max_depth": 2})

-        mock_recursive_loader.return_value = [
-            Mock(page_content="test content", metadata={"source": "https://example.com"})
-        ]
+        mock_doc = Mock(
+            page_content="test content",
+            metadata={
+                "source": "https://example.com",
+                "title": "Test Page",
+                "description": "Test Description",
+                "content_type": "text/html",
+                "language": "en",
+            },
+        )
+        mock_recursive_loader.return_value = [mock_doc]

-        data_ = component.fetch_content()
-        assert all(value.data for value in data_)
-        assert all(value.text for value in data_)
-        assert all(value.source for value in data_)
+        data_frame = component.fetch_content()
+        assert isinstance(data_frame, DataFrame)
+        assert len(data_frame) == 1

-    def test_recursive_url_component_as_dataframe(self, mock_recursive_loader):
-        """Test URLComponent's as_dataframe method."""
+        row = data_frame.iloc[0]
+        assert row["text"] == "test content"
+        assert row["url"] == "https://example.com"
+        assert row["title"] == "Test Page"
+        assert row["description"] == "Test Description"
+        assert row["content_type"] == "text/html"
+        assert row["language"] == "en"
+
+    def test_url_component_multiple_urls(self, mock_recursive_loader):
+        """Test URLComponent with multiple URL inputs."""
         # Setup component with multiple URLs
         component = URLComponent()
         urls = ["https://example1.com", "https://example2.com"]
-        component.set_attributes({"urls": urls, "max_depth": 1})
+        component.set_attributes({"urls": urls})

-        # Mock the loader response
-        mock_recursive_loader.return_value = [
-            Mock(page_content="content1", metadata={"source": urls[0]}),
-            Mock(page_content="content2", metadata={"source": urls[1]}),
+        # Create mock documents for each URL
+        mock_docs = [
+            Mock(
+                page_content="Content from first URL",
+                metadata={
+                    "source": "https://example1.com",
+                    "title": "First Page",
+                    "description": "First Description",
+                    "content_type": "text/html",
+                    "language": "en",
+                },
+            ),
+            Mock(
+                page_content="Content from second URL",
+                metadata={
+                    "source": "https://example2.com",
+                    "title": "Second Page",
+                    "description": "Second Description",
+                    "content_type": "text/html",
+                    "language": "en",
+                },
+            ),
         ]

-        # Test as_dataframe
-        data_frame = component.as_dataframe()
-        assert isinstance(data_frame, DataFrame), "Expected DataFrame instance"
-        assert len(data_frame) == 4
+        # Configure mock to return both documents
+        mock_recursive_loader.return_value = mock_docs

-        assert list(data_frame.columns) == ["text", "source"]
+        # Execute component
+        result = component.fetch_content()

-        assert data_frame.iloc[0]["text"] == "content1"
-        assert data_frame.iloc[0]["source"] == urls[0]
+        # Verify results
+        assert isinstance(result, DataFrame)
+        assert len(result) == 4

-        assert data_frame.iloc[1]["text"] == "content2"
-        assert data_frame.iloc[1]["source"] == urls[1]
+        # Verify first URL content
+        first_row = result.iloc[0]
+        assert first_row["text"] == "Content from first URL"
+        assert first_row["url"] == "https://example1.com"
+        assert first_row["title"] == "First Page"
+        assert first_row["description"] == "First Description"

-        assert data_frame.iloc[2]["text"] == "content1"
-        assert data_frame.iloc[2]["source"] == urls[0]
+        # Verify second URL content
+        second_row = result.iloc[1]
+        assert second_row["text"] == "Content from second URL"
+        assert second_row["url"] == "https://example2.com"
+        assert second_row["title"] == "Second Page"
+        assert second_row["description"] == "Second Description"

-        assert data_frame.iloc[3]["text"] == "content2"
-        assert data_frame.iloc[3]["source"] == urls[1]
-
-    def test_recursive_url_component_fetch_content_text(self, mock_recursive_loader):
-        """Test URLComponent's fetch_content_text method."""
-        component = URLComponent()
-        component.set_attributes({"urls": ["https://example.com"], "max_depth": 1})
-
-        mock_recursive_loader.return_value = [
-            Mock(page_content="test content", metadata={"source": "https://example.com"})
-        ]
-
-        # Test fetch_content_text
-        message = component.fetch_content_text()
-        assert isinstance(message, Message), "Expected Message instance"
-        assert message.text == "test content"
-
-    def test_recursive_url_component_ensure_url(self):
-        """Test URLComponent's ensure_url method."""
-        component = URLComponent()
-
-        # Test URL without protocol
-        url = "example.com"
-        fixed_url = component.ensure_url(url)
-        assert fixed_url == "http://example.com"
-
-        # Test URL with protocol
-        url = "http://example.com"
-        fixed_url = component.ensure_url(url)
-        assert fixed_url == "http://example.com"
-
-    def test_recursive_url_component_multiple_urls(self, mock_recursive_loader):
-        """Test URLComponent with multiple URLs."""
-        component = URLComponent()
-        urls = ["https://example1.com", "https://example2.com", "https://example3.com"]
-        component.set_attributes({"urls": urls, "max_depth": 1})
-
-        # Mock different content for each URL
-        mock_recursive_loader.side_effect = [
-            [Mock(page_content=f"content{i + 1}", metadata={"source": url})] for i, url in enumerate(urls)
-        ]
-
-        # Test fetch_content
-        content = component.fetch_content()
-        assert len(content) == 3, f"Expected 3 content items, got {len(content)}"
-
-        for i, item in enumerate(content):
-            assert item.source == urls[i], f"Expected '{urls[i]}', got '{item.source}'"
-            assert item.text == f"content{i + 1}"
-
-    @patch("langflow.components.data.URLComponent.ensure_url")
-    def test_recursive_url_component_error_handling(self, mock_recursive_loader):
-        """Test error handling in URLComponent."""
-        component = URLComponent()
-        component.set_attributes({"urls": ["https://example.com"]})
-
-        # Set up the mock to raise an exception
-        mock_recursive_loader.side_effect = Exception("Connection error")
-
-        # Test that exceptions are properly handled
-        with pytest.raises(ValueError, match="Error loading documents: Connection error"):
-            component.fetch_content()
-
-    def test_recursive_url_component_format_options(self, mock_recursive_loader):
+    def test_url_component_format_options(self, mock_recursive_loader):
         """Test URLComponent with different format options."""
         component = URLComponent()

         # Test with Text format
         component.set_attributes({"urls": ["https://example.com"], "format": "Text"})
         mock_recursive_loader.return_value = [
-            Mock(page_content="extracted text", metadata={"source": "https://example.com"})
+            Mock(
+                page_content="extracted text",
+                metadata={
+                    "source": "https://example.com",
+                    "title": "Test Page",
+                    "description": "Test Description",
+                    "content_type": "text/html",
+                    "language": "en",
+                },
+            )
         ]
-        content_text = component.fetch_content()
-        assert content_text[0].text == "extracted text"
+        data_frame = component.fetch_content()
+        assert data_frame.iloc[0]["text"] == "extracted text"
+        assert data_frame.iloc[0]["content_type"] == "text/html"

-        # Test with Raw HTML format
-        component.set_attributes({"urls": ["https://example.com"], "format": "Raw HTML"})
+        # Test with HTML format
+        component.set_attributes({"urls": ["https://example.com"], "format": "HTML"})
         mock_recursive_loader.return_value = [
-            Mock(page_content="<html>raw html</html>", metadata={"source": "https://example.com"})
+            Mock(
+                page_content="<html>raw html</html>",
+                metadata={
+                    "source": "https://example.com",
+                    "title": "Test Page",
+                    "description": "Test Description",
+                    "content_type": "text/html",
+                    "language": "en",
+                },
+            )
         ]
-        content_html = component.fetch_content()
-        assert content_html[0].text == "<html>raw html</html>"
-
-    @respx.mock
-    async def test_url_request_success(self, mock_recursive_loader):
-        """Test successful URL request."""
-        url = "https://example.com/api/test"
-        respx.get(url).mock(return_value=Response(200, json={"success": True}))
+        data_frame = component.fetch_content()
+        assert data_frame.iloc[0]["text"] == "<html>raw html</html>"
+        assert data_frame.iloc[0]["content_type"] == "text/html"
+
+    def test_url_component_missing_metadata(self, mock_recursive_loader):
+        """Test URLComponent with missing metadata fields."""
         component = URLComponent()
-        component.set_attributes({"urls": [url], "max_depth": 1})
+        component.set_attributes({"urls": ["https://example.com"]})

-        mock_recursive_loader.return_value = [Mock(page_content="test content", metadata={"source": url})]
+        mock_doc = Mock(
+            page_content="test content",
+            metadata={"source": "https://example.com"},  # Only source is provided
+        )
+        mock_recursive_loader.return_value = [mock_doc]

-        result = component.fetch_content()
-        assert len(result) == 1
-        assert result[0].source == url
+        data_frame = component.fetch_content()
+        row = data_frame.iloc[0]
+        assert row["text"] == "test content"
+        assert row["url"] == "https://example.com"
+        assert row["title"] == ""  # Default empty string
+        assert row["description"] == ""  # Default empty string
+        assert row["content_type"] == ""  # Default empty string
+        assert row["language"] == ""  # Default empty string
+
+    def test_url_component_error_handling(self, mock_recursive_loader):
+        """Test error handling in URLComponent."""
+        component = URLComponent()
+
+        # Test empty URLs
+        component.set_attributes({"urls": []})
+        with pytest.raises(ValueError, match="Error loading documents:"):
+            component.fetch_content()
+
+        # Test request exception
+        component.set_attributes({"urls": ["https://example.com"]})
+        mock_recursive_loader.side_effect = Exception("Connection error")
+        with pytest.raises(ValueError, match="Error loading documents:"):
+            component.fetch_content()
+
+        # Test no documents found
+        mock_recursive_loader.side_effect = None
+        mock_recursive_loader.return_value = []
+        with pytest.raises(ValueError, match="Error loading documents:"):
+            component.fetch_content()
+
+    def test_url_component_ensure_url(self):
+        """Test URLComponent's ensure_url method."""
+        component = URLComponent()
+
+        # Test URL without protocol
+        url = "example.com"
+        fixed_url = component.ensure_url(url)
+        assert fixed_url == "https://example.com"
+
+        # Test URL with protocol
+        url = "https://example.com"
+        fixed_url = component.ensure_url(url)
+        assert fixed_url == "https://example.com"
+
+        # Test URL with https protocol
+        url = "https://example.com"
+        fixed_url = component.ensure_url(url)
+        assert fixed_url == "https://example.com"
+
+        # Test invalid URL
+        with pytest.raises(ValueError, match="Invalid URL"):
+            component.ensure_url("not a url")

@@ -4,11 +4,14 @@ from unittest.mock import MagicMock, patch

 import pandas as pd
 import pytest
-from langflow.components.processing.save_to_file import SaveToFileComponent
+from langflow.components.processing.save_file import SaveToFileComponent
 from langflow.schema import Data, Message

 from tests.base import ComponentTestBaseWithoutClient

+# TODO: Re-enable this test when the SaveToFileComponent is ready for use.
+pytestmark = pytest.mark.skip(reason="Temporarily disabled")
+

 class TestSaveToFileComponent(ComponentTestBaseWithoutClient):
     @pytest.fixture(autouse=True)

@ -251,7 +251,7 @@ class TestSplitTextComponent(ComponentTestBaseWithoutClient):
|
|||
"""Test splitting text with URL loader."""
|
||||
component = SplitTextComponent()
|
||||
url = ["https://en.wikipedia.org/wiki/London", "https://en.wikipedia.org/wiki/Paris"]
|
||||
data_frame = URLComponent(urls=url, format="Text").as_dataframe()
|
||||
data_frame = URLComponent(urls=url, format="Text").fetch_content()
|
||||
assert isinstance(data_frame, DataFrame), "Expected DataFrame instance"
|
||||
assert len(data_frame) == 2, f"Expected DataFrame with 2 rows, got {len(data_frame)}"
|
||||
component.set_attributes(
|
||||
|
|
@@ -265,9 +265,6 @@ class TestSplitTextComponent(ComponentTestBaseWithoutClient):
                 "sender_name": "test_sender_name",
             }
         )
         results = component.as_dataframe()
         assert isinstance(results, DataFrame), "Expected DataFrame instance"
         assert len(results) > 2, f"Expected DataFrame with more than 2 rows, got {len(results)}"
-
-        results = component.split_text()
-        assert isinstance(results, list), "Expected list instance"
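The updated test above expects `fetch_content()` to return a DataFrame with exactly one row per fetched URL. A plain-pandas illustration of the shape being asserted — the rows and column names here are stand-ins, since the real component fetches live pages and returns its own DataFrame subclass:

```python
import pandas as pd

# Hypothetical stand-in rows: in the real test, the URL component fetches
# each page and emits one DataFrame row per URL.
rows = [
    {"url": "https://en.wikipedia.org/wiki/London", "text": "London page text"},
    {"url": "https://en.wikipedia.org/wiki/Paris", "text": "Paris page text"},
]
data_frame = pd.DataFrame(rows)

# The same shape checks the test performs on the component's output
assert isinstance(data_frame, pd.DataFrame)
assert len(data_frame) == 2
```

Splitting that content then multiplies the rows, which is why the follow-on assertion checks `len(results) > 2` rather than an exact count.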
@@ -22,9 +22,9 @@ def ingestion_graph():
     # Ingestion Graph
     file_component = FileComponent(_id="file-123")
     file_component.set(path="test.txt")
-    file_component.set_on_output(name="data", value=Data(text="This is a test file."), cache=True)
+    file_component.set_on_output(name="dataframe", value=Data(text="This is a test file."), cache=True)
     text_splitter = SplitTextComponent(_id="text-splitter-123")
-    text_splitter.set(data_inputs=file_component.load_files)
+    text_splitter.set(data_inputs=file_component.load_dataframe)
     openai_embeddings = OpenAIEmbeddingsComponent(_id="openai-embeddings-123")
     openai_embeddings.set(
         openai_api_key="sk-123", openai_api_base="https://api.openai.com/v1", openai_api_type="openai"
@@ -114,7 +114,7 @@ test(

     //connection 1
     await page
-      .getByTestId("handle-urlcomponent-shownode-data-right")
+      .getByTestId("handle-urlcomponent-shownode-result-right")
       .nth(0)
       .click();
     await page
@@ -79,7 +79,7 @@ test(
     await zoomOut(page, 2);

     //connection 1
-    await page.getByTestId("handle-urlcomponent-shownode-data-right").click();
+    await page.getByTestId("handle-urlcomponent-shownode-result-right").click();
     await page
       .getByTestId("handle-splittext-shownode-data or dataframe-left")
       .click();
@@ -31,6 +31,9 @@ withEventDeliveryModes(
       .fill(
         "https://www.natgeokids.com/uk/discover/animals/sea-life/turtle-facts/",
       );
+
+    await page.getByTestId("input-list-plus-btn_urls-0").click();
+
     await page
       .getByTestId("inputlist_str_urls_1")
       .nth(0)
@@ -245,57 +245,36 @@ test(
       .getByTestId("input_outputChat Output")
       .first()
       .dragTo(page.locator('//*[@id="react-flow-id"]'), {
-        targetPosition: { x: 0, y: 0 },
+        targetPosition: { x: 200, y: 200 },
       });

-    await page.getByTestId("sidebar-search-input").click();
-    await page.getByTestId("sidebar-search-input").fill("data to message");
-    await page
-      .getByTestId("processingData to Message")
-      .first()
-      .dragTo(page.locator('//*[@id="react-flow-id"]'), {
-        targetPosition: { x: 300, y: 400 },
-      });
-
-    let visibleElementHandle;
-
-    const elementsFile = await page
-      .getByTestId("handle-file-shownode-data-right")
-      .all();
-
-    for (const element of elementsFile) {
-      if (await element.isVisible()) {
-        visibleElementHandle = element;
-        break;
-      }
-    }
-
-    // Click and hold on the first element
-    await visibleElementHandle.hover();
-    await page.mouse.down();
-
-    // Move to the second element
-
-    const parseDataElement = await page
-      .getByTestId("handle-parsedata-shownode-data-left")
-      .all();
-
-    for (const element of parseDataElement) {
-      if (await element.isVisible()) {
-        visibleElementHandle = element;
-        break;
-      }
-    }
-
-    await visibleElementHandle.hover();
-
-    // Release the mouse
-    await page.mouse.up();
-    await adjustScreenView(page);
-
-    await page
-      .getByTestId("handle-parsedata-shownode-message-right")
-      .first()
-      .click();
+    await adjustScreenView(page);
+
+    await page
+      .getByTestId("processingParser")
+      .hover()
+      .then(async () => {
+        await page.getByTestId("add-component-button-parser").click();
+      });
+
+    await page
+      .getByTestId("handle-file-shownode-loaded files-right")
+      .first()
+      .click();
+
+    await page
+      .getByTestId("handle-parsercomponent-shownode-data or dataframe-left")
+      .first()
+      .click();
+
+    await page
+      .getByTestId("handle-parsercomponent-shownode-parsed text-right")
+      .first()
+      .click();
     await page
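The deleted block above scanned all matching handles for the first visible one before starting a mouse-down drag; the replacement simply clicks the source and target handles. The scan itself reduces to a small helper, sketched here with stub objects standing in for Playwright element handles (the class and function names are illustrative, not from the codebase):

```python
class StubElement:
    """Hypothetical stand-in for a Playwright element handle."""

    def __init__(self, visible: bool):
        self._visible = visible

    def is_visible(self) -> bool:
        return self._visible


def first_visible(elements):
    """Return the first element reporting is_visible(), mirroring the
    for/if/break loop the old test ran over handle candidates."""
    for element in elements:
        if element.is_visible():
            return element
    return None
```

The new click-to-connect flow makes this scan unnecessary, since each `getByTestId(...).first().click()` already resolves to a single handle.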
@@ -48,7 +48,7 @@ test(

     const rowsCount = await page.getByRole("gridcell").count();

-    expect(rowsCount).toBeGreaterThan(3);
+    expect(rowsCount).toBeGreaterThan(2);

     expect(
       await page.locator('input[data-ref="eInput"]').nth(0).isChecked(),
@@ -58,10 +58,6 @@ test(
       await page.locator('input[data-ref="eInput"]').nth(3).isChecked(),
     ).toBe(true);

-    expect(
-      await page.locator('input[data-ref="eInput"]').nth(4).isChecked(),
-    ).toBe(true);
-
     await page.locator('input[data-ref="eInput"]').nth(0).click();

     await page.waitForTimeout(500);
@@ -70,10 +66,6 @@ test(
       await page.locator('input[data-ref="eInput"]').nth(3).isChecked(),
     ).toBe(false);

-    expect(
-      await page.locator('input[data-ref="eInput"]').nth(4).isChecked(),
-    ).toBe(false);
-
     await page.locator('input[data-ref="eInput"]').nth(0).click();

     await page.waitForTimeout(500);
@@ -143,18 +135,8 @@ test(
       await page.locator('input[data-ref="eInput"]').nth(3).isChecked(),
     ).toBe(true);

-    expect(
-      await page.locator('input[data-ref="eInput"]').nth(4).isChecked(),
-    ).toBe(true);
-
-    await page.locator('input[data-ref="eInput"]').nth(4).click();
-
-    await page.waitForTimeout(500);
-
-    expect(
-      await page.locator('input[data-ref="eInput"]').nth(4).isChecked(),
-    ).toBe(false);
-
     await page.getByRole("gridcell").nth(0).click();

     await page.waitForTimeout(500);
@@ -202,9 +184,5 @@ test(
     expect(
       await page.locator('[data-testid="tool_fetch_content"]').isVisible(),
     ).toBe(true);
-
-    expect(
-      await page.locator('[data-testid="tool_as_dataframe"]').isVisible(),
-    ).toBe(true);
   },
 );
@@ -1,6 +1,7 @@
 import { expect, test } from "@playwright/test";
+import { addLegacyComponents } from "../../utils/add-legacy-components";
 import { awaitBootstrapTest } from "../../utils/await-bootstrap-test";
 import { uploadFile } from "../../utils/upload-file";
 import { zoomOut } from "../../utils/zoom-out";

 test(
@@ -127,7 +128,7 @@ test(

     // URL -> Loop Data
     await page
-      .getByTestId("handle-urlcomponent-shownode-data-right")
+      .getByTestId("handle-urlcomponent-shownode-result-right")
       .first()
       .click();
     await page
@@ -156,13 +157,6 @@ test(
       .first()
       .click();

-    //Loop to File
-
-    await page
-      .getByTestId("handle-loopcomponent-shownode-item-left")
-      .first()
-      .click();
-    await page.getByTestId("handle-file-shownode-data-right").first().click();
     await zoomOut(page, 3);

     await page.getByTestId("div-generic-node").nth(5).click();
@@ -202,14 +196,12 @@ test(
     await page.getByTestId("keypair0").fill("text");
     await page.getByTestId("keypair100").fill("modified_value");

+    await uploadFile(page, "test_file.txt");
+
     // Build and run, expect the wrong loop message
     await page.getByTestId("button_run_file").click();
-    await page.waitForSelector("text=The flow has an incomplete loop.", {
-      timeout: 30000,
-    });
-    await page.getByText("The flow has an incomplete loop.").last().click({
-      timeout: 15000,
-    });
+    await page.waitForSelector("text=built successfully", { timeout: 30000 });

     // Delete the second parse data used to test
@@ -125,7 +125,7 @@ test(
     await page
       .getByTestId("agentsAgent")
       .dragTo(page.locator('//*[@id="react-flow-id"]'), {
-        targetPosition: { x: 350, y: 100 },
+        targetPosition: { x: 0, y: 500 },
       });

     await page.getByTestId("fit_view").click();
@@ -67,6 +67,8 @@ test(
       targetPosition: { x: 300, y: 200 },
     });

+    await page.waitForTimeout(1000);
+
     // Get URL node ID
     const urlNode = await page.locator(".react-flow__node").first();
     const urlNodeId = await urlNode.getAttribute("data-id");
@@ -78,12 +80,16 @@ test(
       timeout: 1000,
     });

+    await page.waitForTimeout(1000);
+
     await page
       .getByTestId("input_outputChat Output")
       .dragTo(page.locator('//*[@id="react-flow-id"]'), {
         targetPosition: { x: 700, y: 200 },
       });

+    await page.waitForTimeout(1000);
+
     await page
       .getByTestId("input_outputChat Output")
       .dragTo(page.locator('//*[@id="react-flow-id"]'), {
@@ -97,13 +103,8 @@ test(
       .getByTestId("inputlist_str_urls_0")
       .fill("https://www.example.com");

-    await page.getByTestId("dropdown-output-urlcomponent").click();
-    await page.getByTestId("dropdown-item-output-urlcomponent-message").click();
+    await page.getByTestId("handle-urlcomponent-shownode-result-right").click();

-    await page
-      .getByTestId("handle-urlcomponent-shownode-message-right")
-      .nth(0)
-      .click();
     await page.waitForTimeout(600);

     await page
@@ -127,23 +128,12 @@ test(
       exact: true,
     });
     await page.getByText("Close").first().click();

-    // Connect dataframe output to second chat output
-    await page.getByTestId("dropdown-output-urlcomponent").click();
-    await page
-      .getByTestId("dropdown-item-output-urlcomponent-dataframe")
-      .click();
-
-    await page
-      .getByTestId("handle-urlcomponent-shownode-dataframe-right")
-      .nth(0)
-      .click();
-    await page.waitForTimeout(600);
+    await page.getByTestId("handle-urlcomponent-shownode-result-right").click();
     await page
       .getByTestId("handle-chatoutput-noshownode-text-target")
       .nth(1)
       .click();
-    await page.waitForTimeout(600);
+    await page.waitForTimeout(2000);

     // Run and verify text output is still shown
     await page.getByTestId("button_run_url").first().click();
@@ -151,12 +141,15 @@ test(
       timeout: 30000 * 3,
     });

-    await page.getByTestId("dropdown-output-urlcomponent").click();
-    await page
-      .getByTestId("dropdown-item-output-urlcomponent-dataframe")
-      .click();
+    await page.getByTestId("handle-urlcomponent-shownode-result-right").click();
     await page.waitForTimeout(600);
-    await page.getByTestId("output-inspection-dataframe-urlcomponent").click();
+    await page.getByTestId("handle-urlcomponent-shownode-result-right").click();
+
+    await page
+      .getByTestId("output-inspection-result-urlcomponent")
+      .nth(0)
+      .click();

     await page.getByText(`Inspect the output of the component below.`, {
       exact: true,
     });
@@ -168,7 +161,7 @@ test(
     await page.waitForTimeout(600);

     await page
-      .getByTestId("handle-urlcomponent-shownode-dataframe-right")
+      .getByTestId("handle-urlcomponent-shownode-result-right")
       .nth(0)
       .click();

@@ -183,7 +176,7 @@ test(
       timeout: 30000 * 3,
     });
     await page.waitForTimeout(600);
-    await page.getByTestId("output-inspection-dataframe-urlcomponent").click();
+    await page.getByTestId("output-inspection-result-urlcomponent").click();
     await page.getByText(`Inspect the output of the component below.`, {
       exact: true,
     });