feat: Add support for Ingestion and Retrieval of Knowledge Bases (#9088)

* refactor: Standardize import statements and improve code readability across components

- Updated import statements to use consistent single quotes.
- Refactored various components to enhance readability and maintainability.
- Adjusted folder and file handling logic in the sidebar and file manager components.
- Introduced a new tabbed interface for the files page to separate files and knowledge bases, improving user experience.

* [autofix.ci] apply automated fixes

* feat: Introduce new Files and Knowledge Bases page with tabbed interface

- Added a new FilesPage component to manage file uploads and organization.
- Implemented a tabbed interface to separate Files and Knowledge Bases for improved user experience.
- Created FilesTab and KnowledgeBasesTab components for handling respective functionalities.
- Refactored routing to accommodate the new structure and updated import statements for consistency.
- Removed the old filesPage component to streamline the codebase.

* Create knowledgebase_utils.py

* Push initial ingest component

* [autofix.ci] apply automated fixes

* Create initial KB Ingestion component

* [autofix.ci] apply automated fixes

* Fix ruff check on utility functions

* [autofix.ci] apply automated fixes

* Some quick fixes

* Update kb_ingest.py

* [autofix.ci] apply automated fixes

* First version of retrieval component

* [autofix.ci] apply automated fixes

* Update icon

* Update kb_retrieval.py

* [autofix.ci] apply automated fixes

* Add knowledge bases feature with API integration and UI components

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes (attempt 2/3)

* Refactor imports and update routing paths for assets and main page components. Adjust tab handling in the assets page to reflect URL changes and improve user navigation experience.

* [autofix.ci] apply automated fixes

* Add CreateKnowledgeBaseButton, KnowledgeBaseEmptyState, and KnowledgeBaseSelectionOverlay components. Refactor KnowledgeBasesTab to utilize new components and improve UI for knowledge base management. Introduce utility functions for formatting numbers and average chunk sizes.

* [autofix.ci] apply automated fixes

* PoV: Add Parquet data retrieval to KBRetrievalComponent (#9097)

* Add Parquet data retrieval to KBRetrievalComponent

Introduces a new output to KBRetrievalComponent for returning knowledge base data by reading Parquet files. Updates dependencies to include fastparquet for Parquet support.

* [autofix.ci] apply automated fixes

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>

* Fix some ruff issues

* [autofix.ci] apply automated fixes

* feat: refactor file management and knowledge base components

- Replaced the existing assetsPage with a new filesPage to better organize file management functionalities.
- Introduced KnowledgePage to handle knowledge base operations, integrating KnowledgeBasesTab for displaying and managing knowledge bases.
- Added various components for file and knowledge base management, including CreateKnowledgeBaseButton, KnowledgeBaseEmptyState, and drag-and-drop functionality.
- Updated routing and imports to reflect the new structure and ensure consistency across the application.
- Enhanced user experience with improved UI elements and state management for file selection and operations.

* feat: implement delete confirmation modal for knowledge base deletion

- Added a DeleteConfirmationModal component to confirm deletion actions.
- Integrated the modal into the KnowledgeBasesTab for handling knowledge base deletions.
- Updated column definitions to include a delete button for each knowledge base.
- Enhanced user experience by ensuring deletion actions require confirmation.
- Adjusted styles for the knowledge base table to improve checkbox visibility.

* feat: enhance knowledge base metadata with embedding model detection

- Added `embedding_model` field to `KnowledgeBaseInfo` for improved metadata tracking.
- Implemented `detect_embedding_model` function to extract embedding model information from configuration files.
- Updated `get_kb_metadata` to prioritize metadata extraction from `embedding_metadata.json`, falling back to detection if necessary.
- Modified `KBIngestionComponent` to save embedding model metadata during ingestion.
- Adjusted frontend components to display embedding model information in knowledge base queries and tables.

* refactor: clean up tooltip and value getter comments in knowledge base columns

- Removed redundant comments in the `knowledgeBaseColumns.tsx` file to enhance code clarity.
- Simplified the tooltip and value getter functions for embedding model display.

* [autofix.ci] apply automated fixes

* refactor: simplify KnowledgeBaseSelectionOverlay component

- Removed the unused onExport prop and its associated functionality.
- Cleaned up code formatting for consistency and readability.
- Updated success message strings to use single quotes for uniformity.

* feat: implement bulk and single deletion for knowledge bases

- Added `BulkDeleteRequest` model to handle bulk deletion requests.
- Implemented `delete_knowledge_base` endpoint for single knowledge base deletion.
- Created `delete_knowledge_bases_bulk` endpoint for deleting multiple knowledge bases at once.
- Introduced `useDeleteKnowledgeBase` and `useDeleteKnowledgeBases` hooks for frontend integration.
- Updated `KnowledgeBaseSelectionOverlay` and `KnowledgeBasesTab` components to utilize new deletion functionality with user feedback on success and error handling.

* Initial support for vector search

* feat: add KnowledgeBaseDrawer component for enhanced knowledge base details

- Introduced `KnowledgeBaseDrawer` component to display detailed information about selected knowledge bases.
- Integrated mock data for source files and linked flows, with a layout for displaying descriptions and embedding models.
- Updated `KnowledgeBasesTab` to handle row clicks and open the drawer with relevant knowledge base data.
- Enhanced `KnowledgePage` to manage drawer state and selected knowledge base, improving user interaction and experience.

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes (attempt 2/3)

* Fix ruff checks

* Update knowledge_bases.py

* feat: update mock data and enhance drawer functionality in KnowledgeBase components

- Replaced mock data in `KnowledgeBaseDrawer` with more descriptive placeholders.
- Added a reference to the drawer in `KnowledgePage` for improved click handling.
- Implemented logic to close the drawer when clicking outside, except for table row clicks.
- Enhanced row click handling to toggle drawer state based on current visibility.

* [autofix.ci] apply automated fixes

* Append scores column to rows

* refactor: improve knowledge base deletion and UI components

- Updated `useDeleteKnowledgeBase` and `useDeleteKnowledgeBases` to enhance parameter naming for clarity.
- Removed the `CreateKnowledgeBaseButton` component and its references to streamline the UI.
- Simplified the `KnowledgeBaseDrawer` and `KnowledgeBasesTab` components by removing mock data and improving state management.
- Enhanced the `KnowledgeBaseSelectionOverlay` to better handle bulk deletions and selection states.
- Refactored various components for consistent styling and improved readability.

* refactor: standardize import statements and improve code readability in SideBarFoldersButtonsComponent

- Updated import statements to use consistent single quotes.
- Refactored various function calls and state management for improved clarity.
- Enhanced folder handling logic and UI interactions for better user experience.

* feat: Add encryption for API keys in KB ingest and retrieval (#9129)

Add encryption for API keys in KB ingest and retrieval

Introduces secure storage of embedding model API keys by encrypting them during knowledge base ingestion and decrypting them during retrieval. Refactors metadata handling to include encrypted API keys, updates retrieval to support decryption and dynamic embedder construction, and improves logging for key operations. Removes legacy embedding client code in retrieval in favor of a provider-based approach.

* [autofix.ci] apply automated fixes

* Fix import of auth utils

* Allow appending to existing knowledge base

* [autofix.ci] apply automated fixes

* Update kb_ingest.py

* Update kb_ingest.py

* feat: enhance table component with editable Vectorize column functionality

- Implemented logic to determine editability of the Vectorize column based on other row values.
- Added checks to refresh grid cells upon changes to the Vectorize column.
- Updated TableAutoCellRender to conditionally disable editing based on Vectorize column state.

* New ingestion creation dialog

* [autofix.ci] apply automated fixes

* Clean up the creation process for KB

* [autofix.ci] apply automated fixes

* Clean up names and descriptions

* Update kb_retrieval.py

* chroma retrieval

* [autofix.ci] apply automated fixes

* Further KB cleanup

* refactor: update KB ingestion component and enhance NodeDialog functionality

- Restored SecretStrInput for API key in KB ingestion component.
- Modified NodeDialog to handle new value format and added support for additional properties.
- Introduced custom hooks for managing global variable states in InputGlobalComponent.
- Improved dropdown component styling and interaction.
- Cleaned up input component code for better readability and maintainability.

* Hash the text as id

* [autofix.ci] apply automated fixes

* Update kb_retrieval.py

* [autofix.ci] apply automated fixes

* Make sure to write out the source parquet

* Remove unneeded old code

* Add ability to block duplicate ingestion chunks

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes (attempt 2/3)

* Rename retrieval component

* Better refresh mechanism for the retrieve

* Clean up some unused functionality

* Update kb_ingest.py

* Fix dropdown component logic to include checks for refresh button and dialog inputs

* Test the API key before saving knowledge

* [autofix.ci] apply automated fixes

* Allow storing updated api keys if provided at ingest time

* Add Knowledge Bases component and enhance Knowledge Base Empty State

- Introduced a new JSON configuration for Knowledge Bases, defining nodes and edges for data processing.
- Enhanced the KnowledgeBaseEmptyState component to include a button for creating a knowledge base template.
- Updated KnowledgeBasesTab to handle template creation, integrating flow management and navigation features.

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes (attempt 2/3)

* Update Knowledge Bases.json

* Update Knowledge Bases configuration and enhance UI components

- Updated the code hash in the Knowledge Bases JSON configuration.
- Modified the KnowledgeBaseEmptyState component to change the button icon and text from "Try Knowledge Base Template" to "Create Knowledge".
- Cleared the options for the knowledge base selection dropdowns to ensure they reflect the current state of available knowledge bases.

* [autofix.ci] apply automated fixes

* Implement feature flag for Knowledge Bases functionality

- Added FEATURE_FLAGS.knowledge_bases to control the visibility of knowledge base components in the API and UI.
- Updated the router to conditionally include the knowledge bases router based on the feature flag.
- Modified KBIngestionComponent and KBRetrievalComponent to hide if the knowledge bases feature is disabled.
- Enhanced the initial setup to skip loading knowledge base starter projects when the feature is disabled.
- Updated frontend routes and sidebar components to conditionally render knowledge base options based on the feature flag.
- Adjusted API queries to return an empty array if the knowledge bases feature is disabled.

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes (attempt 2/3)

* Refactor Knowledge Bases feature flag implementation

- Removed the FEATURE_FLAGS.knowledge_bases flag from backend components and frontend routes.
- Updated the API and UI to always include knowledge base components, simplifying the codebase.
- Adjusted the frontend feature flags to set ENABLE_KNOWLEDGE_BASES to false, ensuring knowledge base features are not displayed.
- Cleaned up related components and routes to reflect the removal of the feature flag, enhancing maintainability.

* revert

* [autofix.ci] apply automated fixes

* Remove Knowledge Bases JSON configuration and clean up KnowledgeBasesTab component by eliminating unused imports and template creation functionality.

* [autofix.ci] apply automated fixes

* Enhance routing structure by adding admin and login routes with protected access. Refactor flow routes for improved organization and clarity.

* added template back

* Use chroma for stats computation

* Fix ruff issue

* [autofix.ci] apply automated fixes

* Update Knowledge Bases.json

* Update Knowledge Bases.json

* Rename to just knowledge

* feat: enhance Jest configuration and add new tests for Knowledge Base components

- Updated jest.config.js to include a new setup file and refined test matching patterns.
- Introduced jest.setup.js for mocking globals and Vite-specific syntax.
- Added tests for KnowledgeBaseDrawer, KnowledgeBaseEmptyState, KnowledgeBaseSelectionOverlay, KnowledgeBasesTab, and KnowledgePage components.
- Created utility functions for testing and mock data for knowledge bases.
- Implemented tests for utility functions related to knowledge base formatting.

* [autofix.ci] apply automated fixes

* refactor: reorganize imports and clean up console log in Dropdown component

- Moved and re-imported necessary dependencies for better structure.
- Removed unnecessary console log statement to clean up the code.

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes (attempt 2/3)

* feat: add success callback for knowledge base creation in NodeDialog component

- Introduced a new success callback to handle knowledge base creation notifications.
- Enhanced dialog closing logic with a delay for Astra database tracking.
- Reorganized imports for better structure.

* refactor: update table component to handle single-toggle columns

- Renamed functions and variables to improve clarity regarding single-toggle columns (Vectorize and Identifier).
- Updated logic to ensure proper editability checks for single-toggle columns.
- Adjusted related components to reflect changes in column handling and rendering.

* [autofix.ci] apply automated fixes

* feat: Add unit tests for KBIngestionComponent (#9246)

* [autofix.ci] apply automated fixes

* fix: remove unnecessary drawer open state change in KnowledgePage

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes (attempt 2/3)

* Remove kb_info output from KBIngestionComponent (#9275)

* [autofix.ci] apply automated fixes

* Update Knowledge Bases.json

* Use settings service for knowledge base directory

Replaces the hardcoded knowledge base directory path with a value from the settings service. This improves configurability and centralizes directory management.

* Fix knowledge bases mypy issue

* test: Update file page tests for consistency and clarity

- Changed expected title text from "My Files" to "Files" for accuracy.
- Removed unnecessary parentheses in arrow functions for cleaner syntax.
- Updated test assertions to ensure visibility checks are clear and consistent.
- Improved readability by standardizing the formatting of test cases.

* test: Update expected title in file upload component test for accuracy

- Changed expected title text from "My Files" to "Files" to reflect the correct page title.

* [autofix.ci] apply automated fixes

* Fix tests on backend

* Update kb_ingest.py

* [autofix.ci] apply automated fixes

* Switch to two templates for KB

* Update names and descs

* [autofix.ci] apply automated fixes

* Rename templates

* [autofix.ci] apply automated fixes

---------

Co-authored-by: Deon Sanchez <69873175+deon-sanchez@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Edwin Jose <edwin.jose@datastax.com>
Eric Hare 2025-08-13 13:15:57 -07:00 committed by GitHub
commit e68f6a405a
GPG key ID: B5690EEEBB952194
53 changed files with 7475 additions and 607 deletions

@ -77,6 +77,7 @@ dependencies = [
"opensearch-py==2.8.0",
"langchain-google-genai==2.0.6",
"langchain-cohere==0.3.3",
"langchain-huggingface==0.3.1",
"langchain-anthropic==0.3.14",
"langchain-astradb~=0.6.0",
"langchain-openai>=0.2.12",
@ -126,6 +127,7 @@ dependencies = [
"docling_core>=2.36.1",
"filelock>=3.18.0",
"jigsawstack==0.2.7",
"fastparquet>=2024.11.0",
]
[dependency-groups]

@ -8,6 +8,7 @@ from langflow.api.v1 import (
files_router,
flows_router,
folders_router,
knowledge_bases_router,
login_router,
mcp_projects_router,
mcp_router,
@ -45,6 +46,7 @@ router_v1.include_router(monitor_router)
router_v1.include_router(folders_router)
router_v1.include_router(projects_router)
router_v1.include_router(starter_projects_router)
router_v1.include_router(knowledge_bases_router)
router_v1.include_router(mcp_router)
router_v1.include_router(voice_mode_router)
router_v1.include_router(mcp_projects_router)

@ -4,6 +4,7 @@ from langflow.api.v1.endpoints import router as endpoints_router
from langflow.api.v1.files import router as files_router
from langflow.api.v1.flows import router as flows_router
from langflow.api.v1.folders import router as folders_router
from langflow.api.v1.knowledge_bases import router as knowledge_bases_router
from langflow.api.v1.login import router as login_router
from langflow.api.v1.mcp import router as mcp_router
from langflow.api.v1.mcp_projects import router as mcp_projects_router
@ -23,6 +24,7 @@ __all__ = [
"files_router",
"flows_router",
"folders_router",
"knowledge_bases_router",
"login_router",
"mcp_projects_router",
"mcp_router",

@ -0,0 +1,437 @@
import json
import shutil
from http import HTTPStatus
from pathlib import Path

import pandas as pd
from fastapi import APIRouter, HTTPException
from langchain_chroma import Chroma
from loguru import logger
from pydantic import BaseModel

from langflow.services.deps import get_settings_service

router = APIRouter(tags=["Knowledge Bases"], prefix="/knowledge_bases")

settings = get_settings_service().settings
knowledge_directory = settings.knowledge_bases_dir
if not knowledge_directory:
    msg = "Knowledge bases directory is not set in the settings."
    raise ValueError(msg)
KNOWLEDGE_BASES_DIR = Path(knowledge_directory).expanduser()


class KnowledgeBaseInfo(BaseModel):
    id: str
    name: str
    embedding_provider: str | None = "Unknown"
    embedding_model: str | None = "Unknown"
    size: int = 0
    words: int = 0
    characters: int = 0
    chunks: int = 0
    avg_chunk_size: float = 0.0


class BulkDeleteRequest(BaseModel):
    kb_names: list[str]


def get_kb_root_path() -> Path:
    """Get the knowledge bases root path."""
    return KNOWLEDGE_BASES_DIR


def get_directory_size(path: Path) -> int:
    """Calculate the total size of all files in a directory."""
    total_size = 0
    try:
        for file_path in path.rglob("*"):
            if file_path.is_file():
                total_size += file_path.stat().st_size
    except (OSError, PermissionError):
        pass
    return total_size


def detect_embedding_provider(kb_path: Path) -> str:
    """Detect the embedding provider from config files and directory structure."""
    # Provider patterns to check for
    provider_patterns = {
        "OpenAI": ["openai", "text-embedding-ada", "text-embedding-3"],
        "HuggingFace": ["sentence-transformers", "huggingface", "bert-"],
        "Cohere": ["cohere", "embed-english", "embed-multilingual"],
        "Google": ["palm", "gecko", "google"],
        "Chroma": ["chroma"],
    }

    # Check JSON config files for provider information
    for config_file in kb_path.glob("*.json"):
        try:
            with config_file.open("r", encoding="utf-8") as f:
                config_data = json.load(f)
                if not isinstance(config_data, dict):
                    continue

                config_str = json.dumps(config_data).lower()

                # Check for explicit provider fields first
                provider_fields = ["embedding_provider", "provider", "embedding_model_provider"]
                for field in provider_fields:
                    if field in config_data:
                        provider_value = str(config_data[field]).lower()
                        for provider, patterns in provider_patterns.items():
                            if any(pattern in provider_value for pattern in patterns):
                                return provider

                # Check for model name patterns
                for provider, patterns in provider_patterns.items():
                    if any(pattern in config_str for pattern in patterns):
                        return provider

        except (OSError, json.JSONDecodeError) as _:
            # loguru formats messages with str.format, so use "{}" rather than "%s"
            logger.exception("Error reading config file '{}'", config_file)
            continue

    # Fallback to directory structure
    if (kb_path / "chroma").exists():
        return "Chroma"
    if (kb_path / "vectors.npy").exists():
        return "Local"

    return "Unknown"


def detect_embedding_model(kb_path: Path) -> str:
    """Detect the embedding model from config files."""
    # First check the embedding metadata file (most accurate)
    metadata_file = kb_path / "embedding_metadata.json"
    if metadata_file.exists():
        try:
            with metadata_file.open("r", encoding="utf-8") as f:
                metadata = json.load(f)
                if isinstance(metadata, dict) and "embedding_model" in metadata:
                    # Check for embedding model field
                    model_value = str(metadata.get("embedding_model", "unknown"))
                    if model_value and model_value.lower() != "unknown":
                        return model_value
        except (OSError, json.JSONDecodeError) as _:
            # loguru formats messages with str.format, so use "{}" rather than "%s"
            logger.exception("Error reading embedding metadata file '{}'", metadata_file)

    # Check other JSON config files for model information
    for config_file in kb_path.glob("*.json"):
        # Skip the embedding metadata file since we already checked it
        if config_file.name == "embedding_metadata.json":
            continue

        try:
            with config_file.open("r", encoding="utf-8") as f:
                config_data = json.load(f)
                if not isinstance(config_data, dict):
                    continue

                # Check for explicit model fields first and return the actual model name
                model_fields = ["embedding_model", "model", "embedding_model_name", "model_name"]
                for field in model_fields:
                    if field in config_data:
                        model_value = str(config_data[field])
                        if model_value and model_value.lower() != "unknown":
                            return model_value

                # Check for OpenAI specific model names
                if "openai" in json.dumps(config_data).lower():
                    openai_models = ["text-embedding-ada-002", "text-embedding-3-small", "text-embedding-3-large"]
                    config_str = json.dumps(config_data).lower()
                    for model in openai_models:
                        if model in config_str:
                            return model

                # Check for HuggingFace model names (usually in model field)
                if "model" in config_data:
                    model_name = str(config_data["model"])
                    # Common HuggingFace embedding models
                    hf_patterns = ["sentence-transformers", "all-MiniLM", "all-mpnet", "multi-qa"]
                    if any(pattern in model_name for pattern in hf_patterns):
                        return model_name

        except (OSError, json.JSONDecodeError) as _:
            logger.exception("Error reading config file '{}'", config_file)
            continue

    return "Unknown"


def get_text_columns(df: pd.DataFrame, schema_data: list | None = None) -> list[str]:
    """Get the text columns to analyze for word/character counts."""
    # First try schema-defined text columns
    if schema_data:
        text_columns = [
            col["column_name"]
            for col in schema_data
            if col.get("vectorize", False) and col.get("data_type") == "string"
        ]
        if text_columns:
            return [col for col in text_columns if col in df.columns]

    # Fallback to common text column names
    common_names = ["text", "content", "document", "chunk"]
    text_columns = [col for col in df.columns if col.lower() in common_names]
    if text_columns:
        return text_columns

    # Last resort: all string columns
    return [col for col in df.columns if df[col].dtype == "object"]


def calculate_text_metrics(df: pd.DataFrame, text_columns: list[str]) -> tuple[int, int]:
    """Calculate total words and characters from text columns."""
    total_words = 0
    total_characters = 0

    for col in text_columns:
        if col not in df.columns:
            continue

        # Fill missing values before casting; astype(str) first would turn
        # NaN/None into the literal strings "nan"/"None" and inflate the counts.
        text_series = df[col].fillna("").astype(str)
        total_characters += text_series.str.len().sum()
        total_words += text_series.str.split().str.len().sum()

    return int(total_words), int(total_characters)


def get_kb_metadata(kb_path: Path) -> dict:
    """Extract metadata from a knowledge base directory."""
    metadata: dict[str, float | int | str] = {
        "chunks": 0,
        "words": 0,
        "characters": 0,
        "avg_chunk_size": 0.0,
        "embedding_provider": "Unknown",
        "embedding_model": "Unknown",
    }

    try:
        # First check embedding metadata file for accurate provider and model info
        metadata_file = kb_path / "embedding_metadata.json"
        if metadata_file.exists():
            try:
                with metadata_file.open("r", encoding="utf-8") as f:
                    embedding_metadata = json.load(f)
                    if isinstance(embedding_metadata, dict):
                        if "embedding_provider" in embedding_metadata:
                            metadata["embedding_provider"] = embedding_metadata["embedding_provider"]
                        if "embedding_model" in embedding_metadata:
                            metadata["embedding_model"] = embedding_metadata["embedding_model"]
            except (OSError, json.JSONDecodeError) as _:
                # loguru formats messages with str.format, so use "{}" rather than "%s"
                logger.exception("Error reading embedding metadata file '{}'", metadata_file)

        # Fallback to detection if not found in metadata file
        if metadata["embedding_provider"] == "Unknown":
            metadata["embedding_provider"] = detect_embedding_provider(kb_path)
        if metadata["embedding_model"] == "Unknown":
            metadata["embedding_model"] = detect_embedding_model(kb_path)

        # Read schema for text column information
        schema_data = None
        schema_file = kb_path / "schema.json"
        if schema_file.exists():
            try:
                with schema_file.open("r", encoding="utf-8") as f:
                    schema_data = json.load(f)
                    if not isinstance(schema_data, list):
                        schema_data = None
            except (ValueError, TypeError, OSError) as _:
                logger.exception("Error reading schema file '{}'", schema_file)

        # Create vector store
        chroma = Chroma(
            persist_directory=str(kb_path),
            collection_name=kb_path.name,
        )

        # Access the raw collection
        collection = chroma._collection

        # Fetch all documents and metadata
        results = collection.get(include=["documents", "metadatas"])

        # Convert to pandas DataFrame
        source_chunks = pd.DataFrame(
            {
                "document": results["documents"],
                "metadata": results["metadatas"],
            }
        )

        # Process the source data for metadata
        try:
            metadata["chunks"] = len(source_chunks)

            # Get text columns and calculate metrics
            text_columns = get_text_columns(source_chunks, schema_data)
            if text_columns:
                words, characters = calculate_text_metrics(source_chunks, text_columns)
                metadata["words"] = words
                metadata["characters"] = characters

                # Calculate average chunk size
                if int(metadata["chunks"]) > 0:
                    metadata["avg_chunk_size"] = round(int(characters) / int(metadata["chunks"]), 1)

        except (OSError, ValueError, TypeError) as _:
            logger.exception("Error processing Chroma DB '{}'", kb_path.name)

    except (OSError, ValueError, TypeError) as _:
        logger.exception("Error processing knowledge base directory '{}'", kb_path)

    return metadata


@router.get("", status_code=HTTPStatus.OK)
@router.get("/", status_code=HTTPStatus.OK)
async def list_knowledge_bases() -> list[KnowledgeBaseInfo]:
    """List all available knowledge bases."""
    try:
        kb_root_path = get_kb_root_path()

        if not kb_root_path.exists():
            return []

        knowledge_bases = []

        for kb_dir in kb_root_path.iterdir():
            if not kb_dir.is_dir() or kb_dir.name.startswith("."):
                continue

            try:
                # Get size of the directory
                size = get_directory_size(kb_dir)

                # Get metadata from KB files
                metadata = get_kb_metadata(kb_dir)

                kb_info = KnowledgeBaseInfo(
                    id=kb_dir.name,
                    name=kb_dir.name.replace("_", " ").replace("-", " ").title(),
                    embedding_provider=metadata["embedding_provider"],
                    embedding_model=metadata["embedding_model"],
                    size=size,
                    words=metadata["words"],
                    characters=metadata["characters"],
                    chunks=metadata["chunks"],
                    avg_chunk_size=metadata["avg_chunk_size"],
                )

                knowledge_bases.append(kb_info)

            except OSError as _:
                # Log the exception and skip directories that can't be read.
                # loguru formats messages with str.format, so use "{}" rather than "%s".
                logger.exception("Error reading knowledge base directory '{}'", kb_dir)
                continue

        # Sort by name alphabetically
        knowledge_bases.sort(key=lambda x: x.name)

    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Error listing knowledge bases: {e!s}") from e
    else:
        return knowledge_bases


@router.get("/{kb_name}", status_code=HTTPStatus.OK)
async def get_knowledge_base(kb_name: str) -> KnowledgeBaseInfo:
    """Get detailed information about a specific knowledge base."""
    try:
        kb_root_path = get_kb_root_path()
        kb_path = kb_root_path / kb_name

        if not kb_path.exists() or not kb_path.is_dir():
            raise HTTPException(status_code=404, detail=f"Knowledge base '{kb_name}' not found")

        # Get size of the directory
        size = get_directory_size(kb_path)

        # Get metadata from KB files
        metadata = get_kb_metadata(kb_path)

        return KnowledgeBaseInfo(
            id=kb_name,
            name=kb_name.replace("_", " ").replace("-", " ").title(),
            embedding_provider=metadata["embedding_provider"],
            embedding_model=metadata["embedding_model"],
            size=size,
            words=metadata["words"],
            characters=metadata["characters"],
            chunks=metadata["chunks"],
            avg_chunk_size=metadata["avg_chunk_size"],
        )

    except HTTPException:
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Error getting knowledge base '{kb_name}': {e!s}") from e


@router.delete("/{kb_name}", status_code=HTTPStatus.OK)
async def delete_knowledge_base(kb_name: str) -> dict[str, str]:
    """Delete a specific knowledge base."""
    try:
        kb_root_path = get_kb_root_path()
        kb_path = kb_root_path / kb_name

        if not kb_path.exists() or not kb_path.is_dir():
            raise HTTPException(status_code=404, detail=f"Knowledge base '{kb_name}' not found")

        # Delete the entire knowledge base directory
        shutil.rmtree(kb_path)

    except HTTPException:
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Error deleting knowledge base '{kb_name}': {e!s}") from e
    else:
        return {"message": f"Knowledge base '{kb_name}' deleted successfully"}


@router.delete("", status_code=HTTPStatus.OK)
@router.delete("/", status_code=HTTPStatus.OK)
async def delete_knowledge_bases_bulk(request: BulkDeleteRequest) -> dict[str, object]:
    """Delete multiple knowledge bases."""
    try:
        kb_root_path = get_kb_root_path()
        deleted_count = 0
        not_found_kbs = []

        for kb_name in request.kb_names:
            kb_path = kb_root_path / kb_name

            if not kb_path.exists() or not kb_path.is_dir():
                not_found_kbs.append(kb_name)
                continue

            try:
                # Delete the entire knowledge base directory
                shutil.rmtree(kb_path)
                deleted_count += 1
            except (OSError, PermissionError):
                # logger.exception already records the active exception, and loguru
                # formats messages with str.format, so use "{}" rather than "%s".
                logger.exception("Error deleting knowledge base '{}'", kb_name)
                # Continue with other deletions even if one fails

        if not_found_kbs and deleted_count == 0:
            raise HTTPException(status_code=404, detail=f"Knowledge bases not found: {', '.join(not_found_kbs)}")

        result = {
            "message": f"Successfully deleted {deleted_count} knowledge base(s)",
            "deleted_count": deleted_count,
        }

        if not_found_kbs:
            result["not_found"] = ", ".join(not_found_kbs)

    except HTTPException:
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Error deleting knowledge bases: {e!s}") from e
    else:
        return result
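
The bulk-delete endpoint above reports partial success: names that do not exist are collected, and a 404 is raised only when nothing at all was deleted. A minimal, framework-free sketch of that contract follows. `bulk_delete` and the directory names are hypothetical, `LookupError` stands in for the HTTP 404, and unlike the endpoint this sketch does not catch `OSError` from `shutil.rmtree`.

```python
import shutil
import tempfile
from pathlib import Path


def bulk_delete(root: Path, names: list[str]) -> dict[str, object]:
    """Delete each named subdirectory of root and summarize the outcome."""
    deleted = 0
    not_found: list[str] = []
    for name in names:
        target = root / name
        if not target.is_dir():
            # Missing entries are reported, not treated as fatal.
            not_found.append(name)
            continue
        shutil.rmtree(target)
        deleted += 1
    if not_found and deleted == 0:
        # The endpoint maps this case to HTTP 404.
        raise LookupError(f"Knowledge bases not found: {', '.join(not_found)}")
    result: dict[str, object] = {"deleted_count": deleted}
    if not_found:
        result["not_found"] = ", ".join(not_found)
    return result


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "docs_kb").mkdir()
    (root / "faq_kb").mkdir()
    summary = bulk_delete(root, ["docs_kb", "missing_kb"])
    remaining = sorted(p.name for p in root.iterdir())
```

Here one directory is removed and one name is reported as missing, so the call succeeds; passing only unknown names would raise instead.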

@ -0,0 +1,104 @@
import math
from collections import Counter


def compute_tfidf(documents: list[str], query_terms: list[str]) -> list[float]:
    """Compute TF-IDF scores for query terms across a collection of documents.

    Args:
        documents: List of document strings
        query_terms: List of query terms to score

    Returns:
        List of TF-IDF scores for each document
    """
    # Tokenize documents (simple whitespace splitting)
    tokenized_docs = [doc.lower().split() for doc in documents]
    n_docs = len(documents)

    # Calculate document frequency for each term
    document_frequencies = {}
    for term in query_terms:
        document_frequencies[term] = sum(1 for doc in tokenized_docs if term.lower() in doc)

    scores = []

    for doc_tokens in tokenized_docs:
        doc_score = 0.0
        doc_length = len(doc_tokens)
        term_counts = Counter(doc_tokens)

        for term in query_terms:
            term_lower = term.lower()

            # Term frequency (TF)
            tf = term_counts[term_lower] / doc_length if doc_length > 0 else 0

            # Inverse document frequency (IDF)
            idf = math.log(n_docs / document_frequencies[term]) if document_frequencies[term] > 0 else 0

            # TF-IDF score
            doc_score += tf * idf

        scores.append(doc_score)

    return scores

def compute_bm25(documents: list[str], query_terms: list[str], k1: float = 1.2, b: float = 0.75) -> list[float]:
"""Compute BM25 scores for query terms across a collection of documents.
Args:
documents: List of document strings
query_terms: List of query terms to score
k1: Controls term frequency scaling (default: 1.2)
b: Controls document length normalization (default: 0.75)
Returns:
List of BM25 scores for each document
"""
# Tokenize documents
tokenized_docs = [doc.lower().split() for doc in documents]
n_docs = len(documents)
# Calculate average document length
avg_doc_length = sum(len(doc) for doc in tokenized_docs) / n_docs if n_docs > 0 else 0
# Handle edge case where all documents are empty
if avg_doc_length == 0:
return [0.0] * n_docs
# Calculate document frequency for each term
document_frequencies = {}
for term in query_terms:
document_frequencies[term] = sum(1 for doc in tokenized_docs if term.lower() in doc)
scores = []
for doc_tokens in tokenized_docs:
doc_score = 0.0
doc_length = len(doc_tokens)
term_counts = Counter(doc_tokens)
for term in query_terms:
term_lower = term.lower()
# Term frequency in document
tf = term_counts[term_lower]
# Inverse document frequency (IDF)
# Plain log(N / df) IDF; non-negative because df never exceeds N
idf = math.log(n_docs / document_frequencies[term]) if document_frequencies[term] > 0 else 0
# BM25 score calculation
numerator = tf * (k1 + 1)
denominator = tf + k1 * (1 - b + b * (doc_length / avg_doc_length))
# Handle division by zero when tf=0 and k1=0
term_score = 0 if denominator == 0 else idf * (numerator / denominator)
doc_score += term_score
scores.append(doc_score)
return scores
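The per-term arithmetic in `compute_bm25` can be reproduced in isolation. The following is a minimal standalone sketch, not part of this change: `plain_idf`, `smoothed_idf`, and `bm25_term_score` are hypothetical names introduced for illustration. It uses the same `log(N / df)` IDF as the code above and, for comparison only, the smoothed variant `log(1 + (N - df + 0.5) / (df + 0.5))` found in many BM25 implementations.

```python
import math


def plain_idf(n_docs: int, df: int) -> float:
    """IDF as computed above: log(N / df), or 0 when the term is absent."""
    return math.log(n_docs / df) if df > 0 else 0.0


def smoothed_idf(n_docs: int, df: int) -> float:
    """Smoothed BM25 IDF variant (assumption: shown for comparison only)."""
    return math.log(1 + (n_docs - df + 0.5) / (df + 0.5))


def bm25_term_score(tf, doc_len, avg_len, idf, k1=1.2, b=0.75):
    """Score one term in one document with the BM25 saturation formula."""
    denominator = tf + k1 * (1 - b + b * (doc_len / avg_len))
    return 0.0 if denominator == 0 else idf * tf * (k1 + 1) / denominator


# One of three documents contains the term once, and it has average length.
idf = plain_idf(n_docs=3, df=1)
score = bm25_term_score(tf=1, doc_len=4, avg_len=4.0, idf=idf)
```

When a document has exactly the average length and the term occurs once, the saturation factor is 1 and the score reduces to the IDF. A term that occurs in every document gets a plain IDF of 0 and so contributes nothing, whereas the smoothed variant would still give it a small positive weight.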


@ -3,6 +3,8 @@ from .csv_to_data import CSVToDataComponent
from .directory import DirectoryComponent
from .file import FileComponent
from .json_to_data import JSONToDataComponent
from .kb_ingest import KBIngestionComponent
from .kb_retrieval import KBRetrievalComponent
from .news_search import NewsSearchComponent
from .rss import RSSReaderComponent
from .sql_executor import SQLComponent
@ -16,6 +18,8 @@ __all__ = [
"DirectoryComponent",
"FileComponent",
"JSONToDataComponent",
"KBIngestionComponent",
"KBRetrievalComponent",
"NewsSearchComponent",
"RSSReaderComponent",
"SQLComponent",


@ -0,0 +1,585 @@
from __future__ import annotations
import hashlib
import json
import re
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from pathlib import Path
from typing import Any
import pandas as pd
from cryptography.fernet import InvalidToken
from langchain_chroma import Chroma
from loguru import logger
from langflow.base.models.openai_constants import OPENAI_EMBEDDING_MODEL_NAMES
from langflow.custom import Component
from langflow.io import BoolInput, DataFrameInput, DropdownInput, IntInput, Output, SecretStrInput, StrInput, TableInput
from langflow.schema.data import Data
from langflow.schema.dotdict import dotdict # noqa: TC001
from langflow.schema.table import EditMode
from langflow.services.auth.utils import decrypt_api_key, encrypt_api_key
from langflow.services.deps import get_settings_service
HUGGINGFACE_MODEL_NAMES = ["sentence-transformers/all-MiniLM-L6-v2", "sentence-transformers/all-mpnet-base-v2"]
COHERE_MODEL_NAMES = ["embed-english-v3.0", "embed-multilingual-v3.0"]
settings = get_settings_service().settings
knowledge_directory = settings.knowledge_bases_dir
if not knowledge_directory:
msg = "Knowledge bases directory is not set in the settings."
raise ValueError(msg)
KNOWLEDGE_BASES_ROOT_PATH = Path(knowledge_directory).expanduser()
class KBIngestionComponent(Component):
"""Create or append to Langflow Knowledge from a DataFrame."""
# ------ UI metadata ---------------------------------------------------
display_name = "Knowledge Ingestion"
description = "Create or update knowledge in Langflow."
icon = "database"
name = "KBIngestion"
@dataclass
class NewKnowledgeBaseInput:
functionality: str = "create"
fields: dict[str, dict] = field(
default_factory=lambda: {
"data": {
"node": {
"name": "create_knowledge_base",
"description": "Create new knowledge in Langflow.",
"display_name": "Create new knowledge",
"field_order": ["01_new_kb_name", "02_embedding_model", "03_api_key"],
"template": {
"01_new_kb_name": StrInput(
name="new_kb_name",
display_name="Knowledge Name",
info="Name of the new knowledge to create.",
required=True,
),
"02_embedding_model": DropdownInput(
name="embedding_model",
display_name="Model Name",
info="Select the embedding model to use for this knowledge base.",
required=True,
options=OPENAI_EMBEDDING_MODEL_NAMES + HUGGINGFACE_MODEL_NAMES + COHERE_MODEL_NAMES,
options_metadata=[{"icon": "OpenAI"} for _ in OPENAI_EMBEDDING_MODEL_NAMES]
+ [{"icon": "HuggingFace"} for _ in HUGGINGFACE_MODEL_NAMES]
+ [{"icon": "Cohere"} for _ in COHERE_MODEL_NAMES],
),
"03_api_key": SecretStrInput(
name="api_key",
display_name="API Key",
info="Provider API key for embedding model",
required=True,
load_from_db=True,
),
},
},
}
}
)
# ------ Inputs --------------------------------------------------------
inputs = [
DropdownInput(
name="knowledge_base",
display_name="Knowledge",
info="Select the knowledge to ingest data into.",
required=True,
options=[
str(d.name) for d in KNOWLEDGE_BASES_ROOT_PATH.iterdir() if not d.name.startswith(".") and d.is_dir()
]
if KNOWLEDGE_BASES_ROOT_PATH.exists()
else [],
refresh_button=True,
dialog_inputs=asdict(NewKnowledgeBaseInput()),
),
DataFrameInput(
name="input_df",
display_name="Data",
info="Table with all original columns (already chunked / processed).",
required=True,
),
TableInput(
name="column_config",
display_name="Column Configuration",
info="Configure column behavior for the knowledge base.",
required=True,
table_schema=[
{
"name": "column_name",
"display_name": "Column Name",
"type": "str",
"description": "Name of the column in the source DataFrame",
"edit_mode": EditMode.INLINE,
},
{
"name": "vectorize",
"display_name": "Vectorize",
"type": "boolean",
"description": "Create embeddings for this column",
"default": False,
"edit_mode": EditMode.INLINE,
},
{
"name": "identifier",
"display_name": "Identifier",
"type": "boolean",
"description": "Use this column as unique identifier",
"default": False,
"edit_mode": EditMode.INLINE,
},
],
value=[
{
"column_name": "text",
"vectorize": True,
"identifier": False,
}
],
),
IntInput(
name="chunk_size",
display_name="Chunk Size",
info="Batch size for processing embeddings",
advanced=True,
value=1000,
),
SecretStrInput(
name="api_key",
display_name="Embedding Provider API Key",
info="API key for the embedding provider to generate embeddings.",
advanced=True,
required=False,
),
BoolInput(
name="allow_duplicates",
display_name="Allow Duplicates",
info="Allow duplicate rows in the knowledge base",
advanced=True,
value=False,
),
]
# ------ Outputs -------------------------------------------------------
outputs = [Output(display_name="DataFrame", name="dataframe", method="build_kb_info")]
# ------ Internal helpers ---------------------------------------------
def _get_kb_root(self) -> Path:
"""Return the root directory for knowledge bases."""
return KNOWLEDGE_BASES_ROOT_PATH
def _validate_column_config(self, df_source: pd.DataFrame) -> list[dict[str, Any]]:
"""Validate column configuration using Structured Output patterns."""
if not self.column_config:
msg = "Column configuration cannot be empty"
raise ValueError(msg)
# Convert table input to list of dicts (similar to Structured Output)
config_list = self.column_config if isinstance(self.column_config, list) else []
# Validate column names exist in DataFrame
df_columns = set(df_source.columns)
for config in config_list:
col_name = config.get("column_name")
if col_name not in df_columns and not self.silent_errors:
msg = f"Column '{col_name}' not found in DataFrame. Available columns: {sorted(df_columns)}"
self.log(f"Warning: {msg}")
raise ValueError(msg)
return config_list
def _get_embedding_provider(self, embedding_model: str) -> str:
"""Get embedding provider by matching model name to lists."""
if embedding_model in OPENAI_EMBEDDING_MODEL_NAMES:
return "OpenAI"
if embedding_model in HUGGINGFACE_MODEL_NAMES:
return "HuggingFace"
if embedding_model in COHERE_MODEL_NAMES:
return "Cohere"
return "Custom"
def _build_embeddings(self, embedding_model: str, api_key: str):
"""Build embedding model using provider patterns."""
# Get provider by matching model name to lists
provider = self._get_embedding_provider(embedding_model)
# Validate provider and model
if provider == "OpenAI":
from langchain_openai import OpenAIEmbeddings
if not api_key:
msg = "OpenAI API key is required when using OpenAI provider"
raise ValueError(msg)
return OpenAIEmbeddings(
model=embedding_model,
api_key=api_key,
chunk_size=self.chunk_size,
)
if provider == "HuggingFace":
from langchain_huggingface import HuggingFaceEmbeddings
return HuggingFaceEmbeddings(
model=embedding_model,
)
if provider == "Cohere":
from langchain_cohere import CohereEmbeddings
if not api_key:
msg = "Cohere API key is required when using Cohere provider"
raise ValueError(msg)
return CohereEmbeddings(
model=embedding_model,
cohere_api_key=api_key,
)
if provider == "Custom":
# For custom embedding models, we would need additional configuration
msg = "Custom embedding models not yet supported"
raise NotImplementedError(msg)
msg = f"Unknown provider: {provider}"
raise ValueError(msg)
def _build_embedding_metadata(self, embedding_model, api_key) -> dict[str, Any]:
"""Build embedding model metadata."""
# Get provider by matching model name to lists
embedding_provider = self._get_embedding_provider(embedding_model)
api_key_to_save = None
if api_key and hasattr(api_key, "get_secret_value"):
api_key_to_save = api_key.get_secret_value()
elif isinstance(api_key, str):
api_key_to_save = api_key
encrypted_api_key = None
if api_key_to_save:
settings_service = get_settings_service()
try:
encrypted_api_key = encrypt_api_key(api_key_to_save, settings_service=settings_service)
except (TypeError, ValueError) as e:
self.log(f"Could not encrypt API key: {e}")
logger.error(f"Could not encrypt API key: {e}")
return {
"embedding_provider": embedding_provider,
"embedding_model": embedding_model,
"api_key": encrypted_api_key,
"api_key_used": bool(api_key),
"chunk_size": self.chunk_size,
"created_at": datetime.now(timezone.utc).isoformat(),
}
def _save_embedding_metadata(self, kb_path: Path, embedding_model: str, api_key: str) -> None:
"""Save embedding model metadata."""
embedding_metadata = self._build_embedding_metadata(embedding_model, api_key)
metadata_path = kb_path / "embedding_metadata.json"
metadata_path.write_text(json.dumps(embedding_metadata, indent=2))
def _save_kb_files(
self,
kb_path: Path,
config_list: list[dict[str, Any]],
) -> None:
"""Save KB files using File Component storage patterns."""
try:
# Create directory (following File Component patterns)
kb_path.mkdir(parents=True, exist_ok=True)
# Save column configuration
# Only do this if the file doesn't exist already
cfg_path = kb_path / "schema.json"
if not cfg_path.exists():
cfg_path.write_text(json.dumps(config_list, indent=2))
except Exception as e:
if not self.silent_errors:
raise
self.log(f"Error saving KB files: {e}")
def _build_column_metadata(self, config_list: list[dict[str, Any]], df_source: pd.DataFrame) -> dict[str, Any]:
"""Build detailed column metadata."""
metadata: dict[str, Any] = {
"total_columns": len(df_source.columns),
"mapped_columns": len(config_list),
"unmapped_columns": len(df_source.columns) - len(config_list),
"columns": [],
"summary": {"vectorized_columns": [], "identifier_columns": []},
}
for config in config_list:
col_name = config.get("column_name")
vectorize = config.get("vectorize") == "True" or config.get("vectorize") is True
identifier = config.get("identifier") == "True" or config.get("identifier") is True
# Add to columns list
metadata["columns"].append(
{
"name": col_name,
"vectorize": vectorize,
"identifier": identifier,
}
)
# Update summary
if vectorize:
metadata["summary"]["vectorized_columns"].append(col_name)
if identifier:
metadata["summary"]["identifier_columns"].append(col_name)
return metadata
def _create_vector_store(
self, df_source: pd.DataFrame, config_list: list[dict[str, Any]], embedding_model: str, api_key: str
) -> None:
"""Create vector store following Local DB component pattern."""
try:
# Set up vector store directory
base_dir = self._get_kb_root()
vector_store_dir = base_dir / self.knowledge_base
vector_store_dir.mkdir(parents=True, exist_ok=True)
# Create embeddings model
embedding_function = self._build_embeddings(embedding_model, api_key)
# Convert DataFrame to Data objects (following Local DB pattern)
data_objects = self._convert_df_to_data_objects(df_source, config_list)
# Create vector store
chroma = Chroma(
persist_directory=str(vector_store_dir),
embedding_function=embedding_function,
collection_name=self.knowledge_base,
)
# Convert Data objects to LangChain Documents
documents = []
for data_obj in data_objects:
doc = data_obj.to_lc_document()
documents.append(doc)
# Add documents to vector store
if documents:
chroma.add_documents(documents)
self.log(f"Added {len(documents)} documents to vector store '{self.knowledge_base}'")
except Exception as e:
if not self.silent_errors:
raise
self.log(f"Error creating vector store: {e}")
def _convert_df_to_data_objects(self, df_source: pd.DataFrame, config_list: list[dict[str, Any]]) -> list[Data]:
"""Convert DataFrame to Data objects for vector store."""
data_objects: list[Data] = []
# Set up vector store directory
base_dir = self._get_kb_root()
# If we don't allow duplicates, we need to get the existing hashes
chroma = Chroma(
persist_directory=str(base_dir / self.knowledge_base),
collection_name=self.knowledge_base,
)
# Get all documents and their metadata
all_docs = chroma.get()
# Extract all _id values from metadata
id_list = [metadata.get("_id") for metadata in all_docs["metadatas"] if metadata.get("_id")]
# Get column roles
content_cols = []
identifier_cols = []
for config in config_list:
col_name = config.get("column_name")
vectorize = config.get("vectorize") == "True" or config.get("vectorize") is True
identifier = config.get("identifier") == "True" or config.get("identifier") is True
if vectorize:
content_cols.append(col_name)
elif identifier:
identifier_cols.append(col_name)
# Convert each row to a Data object
for _, row in df_source.iterrows():
# Build content text from vectorized columns using list comprehension
content_parts = [str(row[col]) for col in content_cols if col in row and pd.notna(row[col])]
page_content = " ".join(content_parts)
# Build metadata from NON-vectorized columns only (simple key-value pairs)
data_dict = {
"text": page_content, # Main content for vectorization
}
# Add metadata columns as simple key-value pairs
for col in df_source.columns:
if col not in content_cols and col in row and pd.notna(row[col]):
# Convert to simple types for Chroma metadata
value = row[col]
data_dict[col] = str(value) # Convert complex types to string
# Hash the page_content for unique ID
page_content_hash = hashlib.sha256(page_content.encode()).hexdigest()
data_dict["_id"] = page_content_hash
# If duplicates are disallowed, and hash exists, prevent adding this row
if not self.allow_duplicates and page_content_hash in id_list:
self.log(f"Skipping duplicate row with hash {page_content_hash}")
continue
# Create Data object - everything except "text" becomes metadata
data_obj = Data(data=data_dict)
data_objects.append(data_obj)
return data_objects
def is_valid_collection_name(self, name, min_length: int = 3, max_length: int = 63) -> bool:
"""Validates collection name against conditions 1-3.
1. Contains 3-63 characters
2. Starts and ends with alphanumeric character
3. Contains only alphanumeric characters, underscores, or hyphens.
Args:
name (str): Collection name to validate
min_length (int): Minimum length of the name
max_length (int): Maximum length of the name
Returns:
bool: True if valid, False otherwise
"""
# Check length (condition 1)
if not (min_length <= len(name) <= max_length):
return False
# Check start/end with alphanumeric (condition 2)
if not (name[0].isalnum() and name[-1].isalnum()):
return False
# Check allowed characters (condition 3)
return re.match(r"^[a-zA-Z0-9_-]+$", name) is not None
# ---------------------------------------------------------------------
# OUTPUT METHODS
# ---------------------------------------------------------------------
def build_kb_info(self) -> Data:
"""Main ingestion routine → returns a Data object with KB metadata."""
try:
# Get source DataFrame
df_source: pd.DataFrame = self.input_df
# Validate column configuration (using Structured Output patterns)
config_list = self._validate_column_config(df_source)
column_metadata = self._build_column_metadata(config_list, df_source)
# Prepare KB folder (using File Component patterns)
kb_root = self._get_kb_root()
kb_path = kb_root / self.knowledge_base
# Read the embedding info from the knowledge base folder
metadata_path = kb_path / "embedding_metadata.json"
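# Defaults in case the metadata file is missing or the stored key cannot be decrypted
embedding_model = None
api_key = None
# Read the embedding model and the stored API key from the metadata file, if present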
if metadata_path.exists():
settings_service = get_settings_service()
metadata = json.loads(metadata_path.read_text())
embedding_model = metadata.get("embedding_model")
try:
api_key = decrypt_api_key(metadata["api_key"], settings_service)
except (InvalidToken, TypeError, ValueError) as e:
logger.error(f"Could not decrypt API key. Please provide it manually. Error: {e}")
# Check if a custom API key was provided, update metadata if so
if self.api_key:
api_key = self.api_key
self._save_embedding_metadata(
kb_path=kb_path,
embedding_model=embedding_model,
api_key=api_key,
)
# Create vector store following Local DB component pattern
self._create_vector_store(df_source, config_list, embedding_model=embedding_model, api_key=api_key)
# Save KB files (using File Component storage patterns)
self._save_kb_files(kb_path, config_list)
# Build metadata response
meta: dict[str, Any] = {
"kb_id": str(uuid.uuid4()),
"kb_name": self.knowledge_base,
"rows": len(df_source),
"column_metadata": column_metadata,
"path": str(kb_path),
"config_columns": len(config_list),
"timestamp": datetime.now(tz=timezone.utc).isoformat(),
}
# Set status message
self.status = f"✅ KB **{self.knowledge_base}** saved · {len(df_source)} chunks."
return Data(data=meta)
except Exception as e:
if not self.silent_errors:
raise
self.log(f"Error in KB ingestion: {e}")
self.status = f"❌ KB ingestion failed: {e}"
return Data(data={"error": str(e), "kb_name": self.knowledge_base})
def _get_knowledge_bases(self) -> list[str]:
"""Retrieve a list of available knowledge bases.
Returns:
A list of knowledge base names.
"""
# Return the list of directories in the knowledge base root path
kb_root_path = self._get_kb_root()
if not kb_root_path.exists():
return []
return [str(d.name) for d in kb_root_path.iterdir() if not d.name.startswith(".") and d.is_dir()]
def update_build_config(self, build_config: dotdict, field_value: Any, field_name: str | None = None) -> dotdict:
"""Update the build configuration when the knowledge base field changes, creating a new knowledge base if requested."""
# Create a new knowledge base
if field_name == "knowledge_base":
if isinstance(field_value, dict) and "01_new_kb_name" in field_value:
# Validate the knowledge base name (see is_valid_collection_name for the rules)
if not self.is_valid_collection_name(field_value["01_new_kb_name"]):
msg = f"Invalid knowledge base name: {field_value['01_new_kb_name']}"
raise ValueError(msg)
# We need to test the API Key one time against the embedding model
embed_model = self._build_embeddings(
embedding_model=field_value["02_embedding_model"], api_key=field_value["03_api_key"]
)
# Try to generate a dummy embedding to validate the API key
embed_model.embed_query("test")
# Create the new knowledge base directory
kb_path = KNOWLEDGE_BASES_ROOT_PATH / field_value["01_new_kb_name"]
kb_path.mkdir(parents=True, exist_ok=True)
# Save the embedding metadata
build_config["knowledge_base"]["value"] = field_value["01_new_kb_name"]
self._save_embedding_metadata(
kb_path=kb_path,
embedding_model=field_value["02_embedding_model"],
api_key=field_value["03_api_key"],
)
# Update the knowledge base options dynamically
build_config["knowledge_base"]["options"] = self._get_knowledge_bases()
if build_config["knowledge_base"]["value"] not in build_config["knowledge_base"]["options"]:
build_config["knowledge_base"]["value"] = None
return build_config
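The duplicate check in `_convert_df_to_data_objects` hashes the joined content of the vectorized columns and skips rows whose hash is already stored. A minimal standalone sketch of that idea follows, assuming plain dictionaries in place of DataFrame rows; `content_hash` and `filter_new_rows` are hypothetical helpers, and a set replaces the list of existing IDs used above.

```python
import hashlib


def content_hash(text: str) -> str:
    """SHA-256 hex digest of the content text, used as the row's _id."""
    return hashlib.sha256(text.encode()).hexdigest()


def filter_new_rows(rows, content_cols, existing_ids, allow_duplicates=False):
    """Return rows whose content hash is not already stored (hypothetical helper)."""
    kept = []
    for row in rows:
        # Join the vectorized columns into one content string, skipping missing values.
        text = " ".join(str(row[c]) for c in content_cols if row.get(c) is not None)
        row_id = content_hash(text)
        if not allow_duplicates and row_id in existing_ids:
            continue
        kept.append({"text": text, "_id": row_id})
    return kept


existing = {content_hash("hello world")}
rows = [{"text": "hello world", "source": "a"}, {"text": "new chunk", "source": "b"}]
new_rows = filter_new_rows(rows, ["text"], existing)
```

As in the component, the set of existing IDs is not updated inside the loop, so two identical rows in the same batch would both be kept. Only rows that match content already stored are skipped.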


@ -0,0 +1,254 @@
import json
from pathlib import Path
from typing import Any
from cryptography.fernet import InvalidToken
from langchain_chroma import Chroma
from loguru import logger
from langflow.custom import Component
from langflow.io import BoolInput, DropdownInput, IntInput, MessageTextInput, Output, SecretStrInput
from langflow.schema.data import Data
from langflow.schema.dataframe import DataFrame
from langflow.services.auth.utils import decrypt_api_key
from langflow.services.deps import get_settings_service
settings = get_settings_service().settings
knowledge_directory = settings.knowledge_bases_dir
if not knowledge_directory:
msg = "Knowledge bases directory is not set in the settings."
raise ValueError(msg)
KNOWLEDGE_BASES_ROOT_PATH = Path(knowledge_directory).expanduser()
class KBRetrievalComponent(Component):
display_name = "Knowledge Retrieval"
description = "Search and retrieve data from knowledge."
icon = "database"
name = "KBRetrieval"
inputs = [
DropdownInput(
name="knowledge_base",
display_name="Knowledge",
info="Select the knowledge to load data from.",
required=True,
options=[
str(d.name) for d in KNOWLEDGE_BASES_ROOT_PATH.iterdir() if not d.name.startswith(".") and d.is_dir()
]
if KNOWLEDGE_BASES_ROOT_PATH.exists()
else [],
refresh_button=True,
real_time_refresh=True,
),
SecretStrInput(
name="api_key",
display_name="Embedding Provider API Key",
info="API key for the embedding provider to generate embeddings.",
advanced=True,
required=False,
),
MessageTextInput(
name="search_query",
display_name="Search Query",
info="Optional search query to filter knowledge base data.",
),
IntInput(
name="top_k",
display_name="Top K Results",
info="Number of top results to return from the knowledge base.",
value=5,
advanced=True,
required=False,
),
BoolInput(
name="include_metadata",
display_name="Include Metadata",
info="Whether to include all metadata and embeddings in the output. If false, only content is returned.",
value=True,
advanced=True,
),
]
outputs = [
Output(
name="chroma_kb_data",
display_name="Results",
method="get_chroma_kb_data",
info="Returns the data from the selected knowledge base.",
),
]
def _get_knowledge_bases(self) -> list[str]:
"""Retrieve a list of available knowledge bases.
Returns:
A list of knowledge base names.
"""
if not KNOWLEDGE_BASES_ROOT_PATH.exists():
return []
return [str(d.name) for d in KNOWLEDGE_BASES_ROOT_PATH.iterdir() if not d.name.startswith(".") and d.is_dir()]
def update_build_config(self, build_config, field_value, field_name=None): # noqa: ARG002
if field_name == "knowledge_base":
# Update the knowledge base options dynamically
build_config["knowledge_base"]["options"] = self._get_knowledge_bases()
# If the selected knowledge base is not available, reset it
if build_config["knowledge_base"]["value"] not in build_config["knowledge_base"]["options"]:
build_config["knowledge_base"]["value"] = None
return build_config
def _get_kb_metadata(self, kb_path: Path) -> dict:
"""Load and process knowledge base metadata."""
metadata: dict[str, Any] = {}
metadata_file = kb_path / "embedding_metadata.json"
if not metadata_file.exists():
logger.warning(f"Embedding metadata file not found at {metadata_file}")
return metadata
try:
with metadata_file.open("r", encoding="utf-8") as f:
metadata = json.load(f)
except json.JSONDecodeError:
logger.error(f"Error decoding JSON from {metadata_file}")
return {}
# Decrypt API key if it exists
if "api_key" in metadata and metadata.get("api_key"):
settings_service = get_settings_service()
try:
decrypted_key = decrypt_api_key(metadata["api_key"], settings_service)
metadata["api_key"] = decrypted_key
except (InvalidToken, TypeError, ValueError) as e:
logger.error(f"Could not decrypt API key. Please provide it manually. Error: {e}")
metadata["api_key"] = None
return metadata
def _build_embeddings(self, metadata: dict):
"""Build embedding model from metadata."""
provider = metadata.get("embedding_provider")
model = metadata.get("embedding_model")
api_key = metadata.get("api_key")
chunk_size = metadata.get("chunk_size")
# If user provided a key in the input, it overrides the stored one.
if self.api_key and self.api_key.get_secret_value():
api_key = self.api_key.get_secret_value()
# Handle various providers
if provider == "OpenAI":
from langchain_openai import OpenAIEmbeddings
if not api_key:
msg = "OpenAI API key is required. Provide it in the component's advanced settings."
raise ValueError(msg)
return OpenAIEmbeddings(
model=model,
api_key=api_key,
chunk_size=chunk_size,
)
if provider == "HuggingFace":
from langchain_huggingface import HuggingFaceEmbeddings
return HuggingFaceEmbeddings(
model=model,
)
if provider == "Cohere":
from langchain_cohere import CohereEmbeddings
if not api_key:
msg = "Cohere API key is required when using Cohere provider"
raise ValueError(msg)
return CohereEmbeddings(
model=model,
cohere_api_key=api_key,
)
if provider == "Custom":
# For custom embedding models, we would need additional configuration
msg = "Custom embedding models not yet supported"
raise NotImplementedError(msg)
# Add other providers here if they become supported in ingest
msg = f"Embedding provider '{provider}' is not supported for retrieval."
raise NotImplementedError(msg)
def get_chroma_kb_data(self) -> DataFrame:
"""Retrieve data from the selected knowledge base by reading the Chroma collection.
Returns:
A DataFrame containing the data rows from the knowledge base.
"""
kb_path = KNOWLEDGE_BASES_ROOT_PATH / self.knowledge_base
metadata = self._get_kb_metadata(kb_path)
if not metadata:
msg = f"Metadata not found for knowledge base: {self.knowledge_base}. Ensure it has been indexed."
raise ValueError(msg)
# Build the embedder for the knowledge base
embedding_function = self._build_embeddings(metadata)
# Load vector store
chroma = Chroma(
persist_directory=str(kb_path),
embedding_function=embedding_function,
collection_name=self.knowledge_base,
)
# If a search query is provided, perform a similarity search
if self.search_query:
# Use the search query to perform a similarity search
logger.info(f"Performing similarity search with query: {self.search_query}")
results = chroma.similarity_search_with_score(
query=self.search_query or "",
k=self.top_k,
)
else:
results = chroma.similarity_search(
query=self.search_query or "",
k=self.top_k,
)
# For each result, make it a tuple to match the expected output format
results = [(doc, 0) for doc in results] # Assign a dummy score of 0
# If metadata is enabled, get embeddings for the results
id_to_embedding = {}
if self.include_metadata and results:
doc_ids = [doc[0].metadata.get("_id") for doc in results if doc[0].metadata.get("_id")]
# Only proceed if we have valid document IDs
if doc_ids:
# Access underlying client to get embeddings
collection = chroma._client.get_collection(name=self.knowledge_base)
embeddings_result = collection.get(where={"_id": {"$in": doc_ids}}, include=["embeddings", "metadatas"])
# Create a mapping from document ID to embedding
for i, metadata in enumerate(embeddings_result.get("metadatas", [])):
if metadata and "_id" in metadata:
id_to_embedding[metadata["_id"]] = embeddings_result["embeddings"][i]
# Build output data based on include_metadata setting
data_list = []
for doc in results:
if self.include_metadata:
# Include all metadata, embeddings, and content
kwargs = {
"content": doc[0].page_content,
**doc[0].metadata,
}
if self.search_query:
kwargs["_score"] = -1 * doc[1]
kwargs["_embeddings"] = id_to_embedding.get(doc[0].metadata.get("_id"))
else:
# Only include content
kwargs = {
"content": doc[0].page_content,
}
data_list.append(Data(**kwargs))
# Return the DataFrame containing the data
return DataFrame(data=data_list)
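The row-building loop at the end of `get_chroma_kb_data` can be illustrated without a vector store. The sketch below is an assumption-laden stand-in: `FakeDocument` and `shape_results` are hypothetical names, embeddings are omitted, and the score returned by the search is assumed to be a distance (smaller means closer), which is why it is negated.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Stand-in for a retrieved document (assumption: only these two fields are used)."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def shape_results(results, include_metadata=True, has_query=True):
    """Turn (document, distance) pairs into flat row dicts."""
    rows = []
    for doc, distance in results:
        row = {"content": doc.page_content}
        if include_metadata:
            row.update(doc.metadata)
            if has_query:
                # Negate the distance so that larger values mean closer matches.
                row["_score"] = -1 * distance
        rows.append(row)
    return rows


results = [
    (FakeDocument("alpha", {"_id": "a1"}), 0.25),
    (FakeDocument("beta", {"_id": "b2"}), 0.75),
]
rows = shape_results(results)
content_only = shape_results(results, include_metadata=False)
```

With metadata enabled, each row carries the content, the stored metadata, and a `_score`. With it disabled, only `content` remains.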

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long


@ -73,6 +73,9 @@ class Settings(BaseSettings):
"""Define if langflow database should be saved in LANGFLOW_CONFIG_DIR or in the langflow directory
(i.e. in the package directory)."""
knowledge_bases_dir: str | None = "~/.langflow/knowledge_bases"
"""The directory to store knowledge bases."""
dev: bool = False
"""If True, Langflow will run in development mode."""
database_url: str | None = None
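Both components expand the `knowledge_bases_dir` setting and list its non-hidden subdirectories to populate the dropdown. A minimal sketch of that lookup follows, assuming a temporary directory in place of the real root; `list_knowledge_bases` is a hypothetical helper, and the sorting is an addition for a stable result that the components above do not perform.

```python
import tempfile
from pathlib import Path


def list_knowledge_bases(root: Path) -> list[str]:
    """List non-hidden subdirectories of the knowledge base root."""
    if not root.exists():
        return []
    return sorted(d.name for d in root.iterdir() if d.is_dir() and not d.name.startswith("."))


# The default setting contains "~", so it must be expanded before use.
default_root = Path("~/.langflow/knowledge_bases").expanduser()

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "docs_kb").mkdir()
    (root / ".hidden").mkdir()
    (root / "notes.txt").write_text("not a knowledge base")
    names = list_knowledge_bases(root)
    missing = list_knowledge_bases(root / "does_not_exist")
```

Hidden directories and plain files are ignored, and a missing root yields an empty list rather than an error.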


@ -0,0 +1,458 @@
import pytest
from langflow.base.data.kb_utils import compute_bm25, compute_tfidf
class TestKBUtils:
"""Test suite for knowledge base utility functions."""
# Test data for TF-IDF and BM25 tests
@pytest.fixture
def sample_documents(self):
"""Sample documents for testing."""
return ["the cat sat on the mat", "the dog ran in the park", "cats and dogs are pets", "birds fly in the sky"]
@pytest.fixture
def query_terms(self):
"""Sample query terms for testing."""
return ["cat", "dog"]
@pytest.fixture
def empty_documents(self):
"""Empty documents for edge case testing."""
return ["", "", ""]
@pytest.fixture
def single_document(self):
"""Single document for testing."""
return ["hello world"]
def test_compute_tfidf_basic(self, sample_documents, query_terms):
"""Test basic TF-IDF computation."""
scores = compute_tfidf(sample_documents, query_terms)
# Should return a score for each document
assert len(scores) == len(sample_documents)
# All scores should be floats
assert all(isinstance(score, float) for score in scores)
# First document contains "cat", should have non-zero score
assert scores[0] > 0.0
# Second document contains "dog", should have non-zero score
assert scores[1] > 0.0
# Third document contains "cats" and "dogs"; matching is by exact token, so neither matches "cat" or "dog"
# The score may therefore be zero, so only non-negativity is asserted
assert scores[2] >= 0.0
# Fourth document contains neither term, should have zero score
assert scores[3] == 0.0
def test_compute_tfidf_case_insensitive(self):
"""Test that TF-IDF computation is case insensitive."""
documents = ["The CAT sat", "the dog RAN", "CATS and DOGS"]
query_terms = ["cat", "DOG"]
scores = compute_tfidf(documents, query_terms)
# First document should match "cat" (case insensitive)
assert scores[0] > 0.0
# Second document should match "dog" (case insensitive)
assert scores[1] > 0.0
def test_compute_tfidf_empty_documents(self, empty_documents, query_terms):
"""Test TF-IDF with empty documents."""
scores = compute_tfidf(empty_documents, query_terms)
# Should return scores for all documents
assert len(scores) == len(empty_documents)
# All scores should be zero since documents are empty
assert all(score == 0.0 for score in scores)
def test_compute_tfidf_empty_query_terms(self, sample_documents):
"""Test TF-IDF with empty query terms."""
scores = compute_tfidf(sample_documents, [])
# Should return scores for all documents
assert len(scores) == len(sample_documents)
# All scores should be zero since no query terms
assert all(score == 0.0 for score in scores)
def test_compute_tfidf_single_document(self, single_document):
"""Test TF-IDF with single document."""
query_terms = ["hello", "world"]
scores = compute_tfidf(single_document, query_terms)
assert len(scores) == 1
# With only one document, IDF = log(1/1) = 0, so TF-IDF score is always 0
# This is correct mathematical behavior - TF-IDF is designed to discriminate between documents
assert scores[0] == 0.0
def test_compute_tfidf_two_documents_positive_scores(self):
"""Test TF-IDF with two documents to ensure positive scores are possible."""
documents = ["hello world", "goodbye earth"]
query_terms = ["hello", "world"]
scores = compute_tfidf(documents, query_terms)
assert len(scores) == 2
# First document contains both terms, should have positive score
assert scores[0] > 0.0
# Second document contains neither term, should have zero score
assert scores[1] == 0.0
def test_compute_tfidf_no_documents(self):
"""Test TF-IDF with no documents."""
scores = compute_tfidf([], ["cat", "dog"])
assert scores == []
def test_compute_tfidf_term_frequency_calculation(self):
"""Test TF-IDF term frequency calculation."""
# Documents with different term frequencies for the same term
documents = ["rare word text", "rare rare word", "other content"]
query_terms = ["rare"]
scores = compute_tfidf(documents, query_terms)
# "rare" appears in documents 0 and 1, but with different frequencies
# Document 1 has higher TF (2/3 vs 1/3), so should score higher
assert scores[0] > 0.0 # Contains "rare" once
assert scores[1] > scores[0] # Contains "rare" twice, should score higher
assert scores[2] == 0.0 # Doesn't contain "rare"
def test_compute_tfidf_idf_calculation(self):
"""Test TF-IDF inverse document frequency calculation."""
# "rare" appears in only one document, "common" appears in both
documents = ["rare term", "common term", "common word"]
query_terms = ["rare", "common"]
scores = compute_tfidf(documents, query_terms)
# First document should have higher score due to rare term having higher IDF
assert scores[0] > scores[1] # rare term gets higher IDF
assert scores[0] > scores[2]
def test_compute_bm25_basic(self, sample_documents, query_terms):
"""Test basic BM25 computation."""
scores = compute_bm25(sample_documents, query_terms)
# Should return a score for each document
assert len(scores) == len(sample_documents)
# All scores should be floats
assert all(isinstance(score, float) for score in scores)
# First document contains "cat", should have non-zero score
assert scores[0] > 0.0
# Second document contains "dog", should have non-zero score
assert scores[1] > 0.0
# Fourth document contains neither term, should have zero score
assert scores[3] == 0.0
def test_compute_bm25_parameters(self, sample_documents, query_terms):
"""Test BM25 with different k1 and b parameters."""
# Test with default parameters
scores_default = compute_bm25(sample_documents, query_terms)
# Test with different k1
scores_k1 = compute_bm25(sample_documents, query_terms, k1=2.0)
# Test with different b
scores_b = compute_bm25(sample_documents, query_terms, b=0.5)
# Test with both different
scores_both = compute_bm25(sample_documents, query_terms, k1=2.0, b=0.5)
# All should return valid scores
assert len(scores_default) == len(sample_documents)
assert len(scores_k1) == len(sample_documents)
assert len(scores_b) == len(sample_documents)
assert len(scores_both) == len(sample_documents)
# Scores should be different with different parameters
assert scores_default != scores_k1
assert scores_default != scores_b
def test_compute_bm25_case_insensitive(self):
"""Test that BM25 computation is case insensitive."""
documents = ["The CAT sat", "the dog RAN", "CATS and DOGS"]
query_terms = ["cat", "DOG"]
scores = compute_bm25(documents, query_terms)
# First document should match "cat" (case insensitive)
assert scores[0] > 0.0
# Second document should match "dog" (case insensitive)
assert scores[1] > 0.0
def test_compute_bm25_empty_documents(self, empty_documents, query_terms):
"""Test BM25 with empty documents."""
scores = compute_bm25(empty_documents, query_terms)
# Should return scores for all documents
assert len(scores) == len(empty_documents)
# All scores should be zero since documents are empty
assert all(score == 0.0 for score in scores)
def test_compute_bm25_empty_query_terms(self, sample_documents):
"""Test BM25 with empty query terms."""
scores = compute_bm25(sample_documents, [])
# Should return scores for all documents
assert len(scores) == len(sample_documents)
# All scores should be zero since no query terms
assert all(score == 0.0 for score in scores)
def test_compute_bm25_single_document(self, single_document):
"""Test BM25 with single document."""
query_terms = ["hello", "world"]
scores = compute_bm25(single_document, query_terms)
assert len(scores) == 1
# With only one document, IDF = log(1/1) = 0, so BM25 score is always 0
# This is correct mathematical behavior - both TF-IDF and BM25 are designed to discriminate between documents
assert scores[0] == 0.0
def test_compute_bm25_two_documents_positive_scores(self):
"""Test BM25 with two documents to ensure positive scores are possible."""
documents = ["hello world", "goodbye earth"]
query_terms = ["hello", "world"]
scores = compute_bm25(documents, query_terms)
assert len(scores) == 2
# First document contains both terms, should have positive score
assert scores[0] > 0.0
# Second document contains neither term, should have zero score
assert scores[1] == 0.0
def test_compute_bm25_no_documents(self):
"""Test BM25 with no documents."""
scores = compute_bm25([], ["cat", "dog"])
assert scores == []
def test_compute_bm25_document_length_normalization(self):
"""Test BM25 document length normalization."""
# Test with documents where some terms appear in subset of documents
documents = [
"cat unique1", # Short document with unique term
"cat dog bird mouse elephant tiger lion bear wolf unique2", # Long document with unique term
"other content", # Document without query terms
]
query_terms = ["unique1", "unique2"]
scores = compute_bm25(documents, query_terms)
# Documents with unique terms should have positive scores
assert scores[0] > 0.0 # Contains "unique1"
assert scores[1] > 0.0 # Contains "unique2"
assert scores[2] == 0.0 # Contains neither term
# Sanity check: one score per document (the effect of length normalization is not asserted here)
assert len(scores) == 3
def test_compute_bm25_term_frequency_saturation(self):
"""Test BM25 term frequency saturation behavior."""
# Test with documents where term frequencies can be meaningfully compared
documents = [
"rare word text", # TF = 1 for "rare"
"rare rare word", # TF = 2 for "rare"
"rare rare rare rare rare word", # TF = 5 for "rare"
"other content", # No "rare" term
]
query_terms = ["rare"]
scores = compute_bm25(documents, query_terms)
# Documents with the term should have positive scores
assert scores[0] > 0.0 # TF=1
assert scores[1] > 0.0 # TF=2
assert scores[2] > 0.0 # TF=5
assert scores[3] == 0.0 # TF=0
# Scores should increase with term frequency, but with diminishing returns
assert scores[1] > scores[0] # TF=2 > TF=1
assert scores[2] > scores[1] # TF=5 > TF=2
# Both increments must be positive; the relative size of the two increments (saturation) is not asserted here
increase_1_to_2 = scores[1] - scores[0]
increase_2_to_5 = scores[2] - scores[1]
assert increase_1_to_2 > 0
assert increase_2_to_5 > 0
def test_compute_bm25_idf_calculation(self):
"""Test BM25 inverse document frequency calculation."""
# "rare" appears in only one document, "common" appears in multiple
documents = ["rare term", "common term", "common word"]
query_terms = ["rare", "common"]
scores = compute_bm25(documents, query_terms)
# First document should have higher score due to rare term having higher IDF
assert scores[0] > scores[1] # rare term gets higher IDF
assert scores[0] > scores[2]
def test_compute_bm25_zero_parameters(self, sample_documents, query_terms):
"""Test BM25 with edge case parameters."""
# Test with k1=0 (no term frequency scaling)
scores_k1_zero = compute_bm25(sample_documents, query_terms, k1=0.0)
assert len(scores_k1_zero) == len(sample_documents)
# Test with b=0 (no document length normalization)
scores_b_zero = compute_bm25(sample_documents, query_terms, b=0.0)
assert len(scores_b_zero) == len(sample_documents)
# Test with b=1 (full document length normalization)
scores_b_one = compute_bm25(sample_documents, query_terms, b=1.0)
assert len(scores_b_one) == len(sample_documents)
def test_tfidf_vs_bm25_comparison(self, sample_documents, query_terms):
"""Test that TF-IDF and BM25 produce different but related scores."""
tfidf_scores = compute_tfidf(sample_documents, query_terms)
bm25_scores = compute_bm25(sample_documents, query_terms)
# Both should return same number of scores
assert len(tfidf_scores) == len(bm25_scores) == len(sample_documents)
# For documents that match, both should be positive
for i in range(len(sample_documents)):
if tfidf_scores[i] > 0:
assert bm25_scores[i] > 0, f"Document {i} has TF-IDF score but zero BM25 score"
if bm25_scores[i] > 0:
assert tfidf_scores[i] > 0, f"Document {i} has BM25 score but zero TF-IDF score"
def test_compute_tfidf_special_characters(self):
"""Test TF-IDF with documents containing special characters."""
documents = ["hello, world!", "world... hello?", "no match here"]
query_terms = ["hello", "world"]
scores = compute_tfidf(documents, query_terms)
# Punctuation must not cause a failure; only the number of scores is checked here
assert len(scores) == 3
# Note: Current implementation does simple split(), so punctuation stays attached
# This tests the current behavior - may need updating if tokenization improves
def test_compute_bm25_special_characters(self):
"""Test BM25 with documents containing special characters."""
documents = ["hello, world!", "world... hello?", "no match here"]
query_terms = ["hello", "world"]
scores = compute_bm25(documents, query_terms)
# Punctuation must not cause a failure; only the number of scores is checked here
assert len(scores) == 3
# Same tokenization behavior as TF-IDF
def test_compute_tfidf_whitespace_handling(self):
"""Test TF-IDF with various whitespace scenarios."""
documents = [
" hello world ", # Extra spaces
"\thello\tworld\t", # Tabs
"hello\nworld", # Newlines
"", # Empty string
]
query_terms = ["hello", "world"]
scores = compute_tfidf(documents, query_terms)
assert len(scores) == 4
# First three should have positive scores (they contain the terms)
assert scores[0] > 0.0
assert scores[1] > 0.0
assert scores[2] > 0.0
# Last should be zero (empty document)
assert scores[3] == 0.0
def test_compute_bm25_whitespace_handling(self):
"""Test BM25 with various whitespace scenarios."""
documents = [
" hello world ", # Extra spaces
"\thello\tworld\t", # Tabs
"hello\nworld", # Newlines
"", # Empty string
]
query_terms = ["hello", "world"]
scores = compute_bm25(documents, query_terms)
assert len(scores) == 4
# First three should have positive scores (they contain the terms)
assert scores[0] > 0.0
assert scores[1] > 0.0
assert scores[2] > 0.0
# Last should be zero (empty document)
assert scores[3] == 0.0
def test_compute_tfidf_mathematical_properties(self):
"""Test mathematical properties of TF-IDF scores."""
documents = ["cat dog", "cat", "dog"]
query_terms = ["cat"]
scores = compute_tfidf(documents, query_terms)
# All scores should be non-negative
assert all(score >= 0.0 for score in scores)
# Documents containing the term should have positive scores
assert scores[0] > 0.0 # contains "cat"
assert scores[1] > 0.0 # contains "cat"
assert scores[2] == 0.0 # doesn't contain "cat"
def test_compute_bm25_mathematical_properties(self):
"""Test mathematical properties of BM25 scores."""
documents = ["cat dog", "cat", "dog"]
query_terms = ["cat"]
scores = compute_bm25(documents, query_terms)
# All scores should be non-negative
assert all(score >= 0.0 for score in scores)
# Documents containing the term should have positive scores
assert scores[0] > 0.0 # contains "cat"
assert scores[1] > 0.0 # contains "cat"
assert scores[2] == 0.0 # doesn't contain "cat"
def test_compute_tfidf_duplicate_terms_in_query(self):
"""Test TF-IDF with duplicate terms in query."""
documents = ["cat dog bird", "cat cat dog", "bird bird bird"]
query_terms = ["cat", "cat", "dog"] # "cat" appears twice
scores = compute_tfidf(documents, query_terms)
# Should handle duplicate query terms gracefully
assert len(scores) == 3
assert all(isinstance(score, float) for score in scores)
# First two documents should have positive scores
assert scores[0] > 0.0
assert scores[1] > 0.0
# Third document only contains "bird", so should have zero score
assert scores[2] == 0.0
def test_compute_bm25_duplicate_terms_in_query(self):
"""Test BM25 with duplicate terms in query."""
documents = ["cat dog bird", "cat cat dog", "bird bird bird"]
query_terms = ["cat", "cat", "dog"] # "cat" appears twice
scores = compute_bm25(documents, query_terms)
# Should handle duplicate query terms gracefully
assert len(scores) == 3
assert all(isinstance(score, float) for score in scores)
# First two documents should have positive scores
assert scores[0] > 0.0
assert scores[1] > 0.0
# Third document only contains "bird", so should have zero score
assert scores[2] == 0.0
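
The scoring tests above pin down a few behaviours: lowercase whitespace tokenization, a zero score for a single-document corpus, and BM25 gains that shrink as term frequency grows. The following is a minimal sketch of scorers that would satisfy those assertions, assuming IDF = log(N / df) for both functions. The names `tfidf_sketch` and `bm25_sketch` are hypothetical and this is not the PR's `compute_tfidf` or `compute_bm25`, whose source is not shown here.

```python
import math
from collections import Counter


def tfidf_sketch(documents, query_terms):
    """Hypothetical reference scorer: sum of TF * log(N / df) per query term."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in query_terms:
            term = term.lower()
            # Document frequency: number of documents containing the exact token
            df = sum(1 for toks in tokenized if term in toks)
            if df == 0 or not tokens:
                continue
            tf = counts[term] / len(tokens)
            # With a single document, log(1 / 1) == 0, so the score stays 0
            score += tf * math.log(n_docs / df)
        scores.append(score)
    return scores


def bm25_sketch(documents, query_terms, k1=1.2, b=0.75):
    """Hypothetical BM25 variant using the same log(N / df) IDF (an assumption)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs if n_docs else 0.0
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in query_terms:
            term = term.lower()
            df = sum(1 for toks in tokenized if term in toks)
            tf = counts[term]
            if df == 0 or tf == 0:
                continue
            idf = math.log(n_docs / df)
            # Length normalization: longer-than-average documents are penalized
            norm = 1 - b + b * (len(tokens) / avg_len)
            # Saturating term-frequency component
            score += idf * tf * (k1 + 1) / (tf + k1 * norm)
        scores.append(score)
    return scores
```

Under these assumptions, the saturation effect that `test_compute_bm25_term_frequency_saturation` describes can be checked directly by comparing `scores[1] - scores[0]` with `scores[2] - scores[1]` for that test's documents.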

@@ -0,0 +1,392 @@
import json
from pathlib import Path
from unittest.mock import MagicMock, patch
import pandas as pd
import pytest
from langflow.components.data.kb_ingest import KBIngestionComponent
from langflow.schema.data import Data
from tests.base import ComponentTestBaseWithoutClient
class TestKBIngestionComponent(ComponentTestBaseWithoutClient):
@pytest.fixture
def component_class(self):
"""Return the component class to test."""
return KBIngestionComponent
@pytest.fixture(autouse=True)
def mock_knowledge_base_path(self, tmp_path):
"""Mock the knowledge base root path directly."""
with patch("langflow.components.data.kb_ingest.KNOWLEDGE_BASES_ROOT_PATH", tmp_path):
yield
@pytest.fixture
def default_kwargs(self, tmp_path):
"""Return default kwargs for component instantiation."""
# Create a sample DataFrame
data_df = pd.DataFrame(
{"text": ["Sample text 1", "Sample text 2"], "title": ["Title 1", "Title 2"], "category": ["cat1", "cat2"]}
)
# Create column configuration
column_config = [
{"column_name": "text", "vectorize": True, "identifier": False},
{"column_name": "title", "vectorize": False, "identifier": False},
{"column_name": "category", "vectorize": False, "identifier": True},
]
# Create knowledge base directory
kb_name = "test_kb"
kb_path = tmp_path / kb_name
kb_path.mkdir(exist_ok=True)
# Create embedding metadata file
metadata = {
"embedding_provider": "HuggingFace",
"embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
"api_key": None,
"api_key_used": False,
"chunk_size": 1000,
"created_at": "2024-01-01T00:00:00Z",
}
(kb_path / "embedding_metadata.json").write_text(json.dumps(metadata))
return {
"knowledge_base": kb_name,
"input_df": data_df,
"column_config": column_config,
"chunk_size": 1000,
"kb_root_path": str(tmp_path),
"api_key": None,
"allow_duplicates": False,
"silent_errors": False,
}
@pytest.fixture
def file_names_mapping(self):
"""Return file names mapping for version testing."""
# This is a new component, so it doesn't exist in older versions
return []
def test_validate_column_config_valid(self, component_class, default_kwargs):
"""Test column configuration validation with valid config."""
component = component_class(**default_kwargs)
data_df = default_kwargs["input_df"]
config_list = component._validate_column_config(data_df)
assert len(config_list) == 3
assert config_list[0]["column_name"] == "text"
assert config_list[0]["vectorize"] is True
def test_validate_column_config_invalid_column(self, component_class, default_kwargs):
"""Test column configuration validation with invalid column name."""
# Modify column config to include non-existent column
invalid_config = [{"column_name": "nonexistent", "vectorize": True, "identifier": False}]
default_kwargs["column_config"] = invalid_config
component = component_class(**default_kwargs)
data_df = default_kwargs["input_df"]
with pytest.raises(ValueError, match="Column 'nonexistent' not found in DataFrame"):
component._validate_column_config(data_df)
def test_validate_column_config_silent_errors(self, component_class, default_kwargs):
"""Test column configuration validation with silent errors enabled."""
# Modify column config to include non-existent column
invalid_config = [{"column_name": "nonexistent", "vectorize": True, "identifier": False}]
default_kwargs["column_config"] = invalid_config
default_kwargs["silent_errors"] = True
component = component_class(**default_kwargs)
data_df = default_kwargs["input_df"]
# Should not raise exception with silent_errors=True
config_list = component._validate_column_config(data_df)
assert isinstance(config_list, list)
def test_get_embedding_provider(self, component_class, default_kwargs):
"""Test embedding provider detection."""
component = component_class(**default_kwargs)
# Test OpenAI provider
assert component._get_embedding_provider("text-embedding-ada-002") == "OpenAI"
# Test HuggingFace provider
assert component._get_embedding_provider("sentence-transformers/all-MiniLM-L6-v2") == "HuggingFace"
# Test Cohere provider
assert component._get_embedding_provider("embed-english-v3.0") == "Cohere"
# Test custom provider
assert component._get_embedding_provider("custom-model") == "Custom"
@patch("langchain_huggingface.HuggingFaceEmbeddings")
def test_build_embeddings_huggingface(self, mock_hf_embeddings, component_class, default_kwargs):
"""Test building HuggingFace embeddings."""
component = component_class(**default_kwargs)
mock_embeddings = MagicMock()
mock_hf_embeddings.return_value = mock_embeddings
result = component._build_embeddings("sentence-transformers/all-MiniLM-L6-v2", None)
mock_hf_embeddings.assert_called_once_with(model="sentence-transformers/all-MiniLM-L6-v2")
assert result == mock_embeddings
@patch("langchain_openai.OpenAIEmbeddings")
def test_build_embeddings_openai(self, mock_openai_embeddings, component_class, default_kwargs):
"""Test building OpenAI embeddings."""
component = component_class(**default_kwargs)
mock_embeddings = MagicMock()
mock_openai_embeddings.return_value = mock_embeddings
result = component._build_embeddings("text-embedding-ada-002", "test-api-key")
mock_openai_embeddings.assert_called_once_with(
model="text-embedding-ada-002", api_key="test-api-key", chunk_size=1000
)
assert result == mock_embeddings
def test_build_embeddings_openai_no_key(self, component_class, default_kwargs):
"""Test building OpenAI embeddings without API key raises error."""
component = component_class(**default_kwargs)
with pytest.raises(ValueError, match="OpenAI API key is required"):
component._build_embeddings("text-embedding-ada-002", None)
@patch("langchain_cohere.CohereEmbeddings")
def test_build_embeddings_cohere(self, mock_cohere_embeddings, component_class, default_kwargs):
"""Test building Cohere embeddings."""
component = component_class(**default_kwargs)
mock_embeddings = MagicMock()
mock_cohere_embeddings.return_value = mock_embeddings
result = component._build_embeddings("embed-english-v3.0", "test-api-key")
mock_cohere_embeddings.assert_called_once_with(model="embed-english-v3.0", cohere_api_key="test-api-key")
assert result == mock_embeddings
def test_build_embeddings_cohere_no_key(self, component_class, default_kwargs):
"""Test building Cohere embeddings without API key raises error."""
component = component_class(**default_kwargs)
with pytest.raises(ValueError, match="Cohere API key is required"):
component._build_embeddings("embed-english-v3.0", None)
def test_build_embeddings_custom_not_supported(self, component_class, default_kwargs):
"""Test building custom embeddings raises NotImplementedError."""
component = component_class(**default_kwargs)
with pytest.raises(NotImplementedError, match="Custom embedding models not yet supported"):
component._build_embeddings("custom-model", "test-key")
@patch("langflow.components.data.kb_ingest.get_settings_service")
@patch("langflow.components.data.kb_ingest.encrypt_api_key")
def test_build_embedding_metadata(self, mock_encrypt, mock_get_settings, component_class, default_kwargs):
"""Test building embedding metadata."""
component = component_class(**default_kwargs)
mock_settings = MagicMock()
mock_get_settings.return_value = mock_settings
mock_encrypt.return_value = "encrypted_key"
metadata = component._build_embedding_metadata("sentence-transformers/all-MiniLM-L6-v2", "test-key")
assert metadata["embedding_provider"] == "HuggingFace"
assert metadata["embedding_model"] == "sentence-transformers/all-MiniLM-L6-v2"
assert metadata["api_key"] == "encrypted_key"
assert metadata["api_key_used"] is True
assert metadata["chunk_size"] == 1000
assert "created_at" in metadata
def test_build_column_metadata(self, component_class, default_kwargs):
"""Test building column metadata."""
component = component_class(**default_kwargs)
data_df = default_kwargs["input_df"]
config_list = default_kwargs["column_config"]
metadata = component._build_column_metadata(config_list, data_df)
assert metadata["total_columns"] == 3
assert metadata["mapped_columns"] == 3
assert metadata["unmapped_columns"] == 0
assert len(metadata["columns"]) == 3
assert "text" in metadata["summary"]["vectorized_columns"]
assert "category" in metadata["summary"]["identifier_columns"]
def test_convert_df_to_data_objects(self, component_class, default_kwargs):
"""Test converting DataFrame to Data objects."""
component = component_class(**default_kwargs)
data_df = default_kwargs["input_df"]
config_list = default_kwargs["column_config"]
# Mock Chroma to avoid actual vector store operations
with patch("langflow.components.data.kb_ingest.Chroma") as mock_chroma:
mock_chroma_instance = MagicMock()
mock_chroma_instance.get.return_value = {"metadatas": []}
mock_chroma.return_value = mock_chroma_instance
data_objects = component._convert_df_to_data_objects(data_df, config_list)
assert len(data_objects) == 2
assert all(isinstance(obj, Data) for obj in data_objects)
# Check first data object
first_obj = data_objects[0]
assert "text" in first_obj.data
assert "title" in first_obj.data
assert "category" in first_obj.data
assert "_id" in first_obj.data
def test_convert_df_to_data_objects_no_duplicates(self, component_class, default_kwargs):
"""Test converting DataFrame to Data objects with duplicate prevention."""
default_kwargs["allow_duplicates"] = False
component = component_class(**default_kwargs)
data_df = default_kwargs["input_df"]
config_list = default_kwargs["column_config"]
# Mock Chroma with existing hash
with patch("langflow.components.data.kb_ingest.Chroma") as mock_chroma:
# Simulate existing document with same hash
existing_hash = "some_existing_hash"
mock_chroma_instance = MagicMock()
mock_chroma_instance.get.return_value = {"metadatas": [{"_id": existing_hash}]}
mock_chroma.return_value = mock_chroma_instance
# Mock hashlib to return the existing hash for first row
with patch("langflow.components.data.kb_ingest.hashlib.sha256") as mock_hash:
mock_hash_obj = MagicMock()
mock_hash_obj.hexdigest.side_effect = [existing_hash, "different_hash"]
mock_hash.return_value = mock_hash_obj
data_objects = component._convert_df_to_data_objects(data_df, config_list)
# Should only return one object (second row) since first is duplicate
assert len(data_objects) == 1
def test_is_valid_collection_name(self, component_class, default_kwargs):
"""Test collection name validation."""
component = component_class(**default_kwargs)
# Valid names
assert component.is_valid_collection_name("valid_name") is True
assert component.is_valid_collection_name("valid-name") is True
assert component.is_valid_collection_name("ValidName123") is True
# Invalid names
assert component.is_valid_collection_name("ab") is False # Too short
assert component.is_valid_collection_name("a" * 64) is False # Too long
assert component.is_valid_collection_name("_invalid") is False # Starts with underscore
assert component.is_valid_collection_name("invalid_") is False # Ends with underscore
assert component.is_valid_collection_name("invalid@name") is False # Invalid character
@patch("langflow.components.data.kb_ingest.json.loads")
@patch("langflow.components.data.kb_ingest.decrypt_api_key")
def test_build_kb_info_success(self, mock_decrypt, mock_json_loads, component_class, default_kwargs):
"""Test successful KB info building."""
component = component_class(**default_kwargs)
# Mock metadata loading
mock_json_loads.return_value = {
"embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
"api_key": "encrypted_key",
}
mock_decrypt.return_value = "decrypted_key"
# Mock vector store creation
with patch.object(component, "_create_vector_store"), patch.object(component, "_save_kb_files"):
result = component.build_kb_info()
assert isinstance(result, Data)
assert "kb_id" in result.data
assert "kb_name" in result.data
assert "rows" in result.data
assert result.data["rows"] == 2
def test_build_kb_info_with_silent_errors(self, component_class, default_kwargs):
"""Test KB info building with silent errors enabled."""
default_kwargs["silent_errors"] = True
component = component_class(**default_kwargs)
# Remove the metadata file to cause an error
kb_path = Path(default_kwargs["kb_root_path"]) / default_kwargs["knowledge_base"]
metadata_file = kb_path / "embedding_metadata.json"
if metadata_file.exists():
metadata_file.unlink()
# Should not raise exception with silent_errors=True
result = component.build_kb_info()
assert isinstance(result, Data)
assert "error" in result.data
def test_get_knowledge_bases(self, component_class, default_kwargs, tmp_path):
"""Test getting list of knowledge bases."""
component = component_class(**default_kwargs)
# Create additional test directories
(tmp_path / "kb1").mkdir()
(tmp_path / "kb2").mkdir()
(tmp_path / ".hidden").mkdir() # Should be ignored
kb_list = component._get_knowledge_bases()
assert "test_kb" in kb_list
assert "kb1" in kb_list
assert "kb2" in kb_list
assert ".hidden" not in kb_list
@patch("langflow.components.data.kb_ingest.Path.exists")
def test_get_knowledge_bases_no_path(self, mock_exists, component_class, default_kwargs):
"""Test getting knowledge bases when path doesn't exist."""
component = component_class(**default_kwargs)
mock_exists.return_value = False
kb_list = component._get_knowledge_bases()
assert kb_list == []
def test_update_build_config_new_kb(self, component_class, default_kwargs):
"""Test updating build config for new knowledge base creation."""
component = component_class(**default_kwargs)
build_config = {"knowledge_base": {"value": None, "options": []}}
field_value = {
"01_new_kb_name": "new_test_kb",
"02_embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
"03_api_key": None,
}
# Mock embedding validation
with (
patch.object(component, "_build_embeddings") as mock_build_emb,
patch.object(component, "_save_embedding_metadata"),
patch.object(component, "_get_knowledge_bases") as mock_get_kbs,
):
mock_embeddings = MagicMock()
mock_embeddings.embed_query.return_value = [0.1, 0.2, 0.3]
mock_build_emb.return_value = mock_embeddings
mock_get_kbs.return_value = ["new_test_kb"]
result = component.update_build_config(build_config, field_value, "knowledge_base")
assert result["knowledge_base"]["value"] == "new_test_kb"
assert "new_test_kb" in result["knowledge_base"]["options"]
def test_update_build_config_invalid_kb_name(self, component_class, default_kwargs):
"""Test updating build config with invalid KB name."""
component = component_class(**default_kwargs)
build_config = {"knowledge_base": {"value": None, "options": []}}
field_value = {
"01_new_kb_name": "invalid@name", # Invalid character
"02_embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
"03_api_key": None,
}
with pytest.raises(ValueError, match="Invalid knowledge base name"):
component.update_build_config(build_config, field_value, "knowledge_base")
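
The cases in `test_is_valid_collection_name` imply a naming rule: roughly 3 to 63 characters, alphanumeric at both ends, with only letters, digits, underscores, and hyphens in between. A minimal sketch of a validator that agrees with those cases follows. The function name `is_valid_collection_name_sketch` and the exact pattern are assumptions made for illustration; the component's real `is_valid_collection_name` may apply additional rules.

```python
import re

# Hypothetical pattern inferred from the test cases above:
# one alphanumeric, 1-61 characters from [A-Za-z0-9_-], one alphanumeric.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]{1,61}[A-Za-z0-9]")


def is_valid_collection_name_sketch(name: str) -> bool:
    """Return True if the whole name matches the assumed 3-63 character rule."""
    return _NAME_PATTERN.fullmatch(name) is not None
```

Using `fullmatch` rather than `match` keeps trailing characters, such as a newline, from slipping through.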

@@ -0,0 +1,368 @@
import contextlib
import json
from pathlib import Path
from unittest.mock import MagicMock, patch
import pytest
from langflow.components.data.kb_retrieval import KBRetrievalComponent
from tests.base import ComponentTestBaseWithoutClient
class TestKBRetrievalComponent(ComponentTestBaseWithoutClient):
@pytest.fixture
def component_class(self):
"""Return the component class to test."""
return KBRetrievalComponent
@pytest.fixture(autouse=True)
def mock_knowledge_base_path(self, tmp_path):
"""Mock the knowledge base root path directly."""
with patch("langflow.components.data.kb_retrieval.KNOWLEDGE_BASES_ROOT_PATH", tmp_path):
yield
@pytest.fixture
def default_kwargs(self, tmp_path):
"""Return default kwargs for component instantiation."""
# Create knowledge base directory structure
kb_name = "test_kb"
kb_path = tmp_path / kb_name
kb_path.mkdir(exist_ok=True)
# Create embedding metadata file
metadata = {
"embedding_provider": "HuggingFace",
"embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
"api_key": None,
"api_key_used": False,
"chunk_size": 1000,
"created_at": "2024-01-01T00:00:00Z",
}
(kb_path / "embedding_metadata.json").write_text(json.dumps(metadata))
return {
"knowledge_base": kb_name,
"kb_root_path": str(tmp_path),
"api_key": None,
"search_query": "",
"top_k": 5,
"include_embeddings": True,
}
@pytest.fixture
def file_names_mapping(self):
"""Return file names mapping for version testing."""
# This is a new component, so it doesn't exist in older versions
return []
def test_get_knowledge_bases(self, component_class, default_kwargs, tmp_path):
"""Test getting list of knowledge bases."""
component = component_class(**default_kwargs)
# Create additional test directories
(tmp_path / "kb1").mkdir()
(tmp_path / "kb2").mkdir()
(tmp_path / ".hidden").mkdir() # Should be ignored
kb_list = component._get_knowledge_bases()
assert "test_kb" in kb_list
assert "kb1" in kb_list
assert "kb2" in kb_list
assert ".hidden" not in kb_list
@patch("langflow.components.data.kb_retrieval.Path.exists")
def test_get_knowledge_bases_no_path(self, mock_exists, component_class, default_kwargs):
"""Test getting knowledge bases when path doesn't exist."""
component = component_class(**default_kwargs)
mock_exists.return_value = False
kb_list = component._get_knowledge_bases()
assert kb_list == []
def test_update_build_config(self, component_class, default_kwargs, tmp_path):
"""Test updating build configuration."""
component = component_class(**default_kwargs)
# Create additional KB directories
(tmp_path / "kb1").mkdir()
(tmp_path / "kb2").mkdir()
build_config = {"knowledge_base": {"value": "test_kb", "options": []}}
result = component.update_build_config(build_config, None, "knowledge_base")
assert "test_kb" in result["knowledge_base"]["options"]
assert "kb1" in result["knowledge_base"]["options"]
assert "kb2" in result["knowledge_base"]["options"]
def test_update_build_config_invalid_kb(self, component_class, default_kwargs):
"""Test updating build config when selected KB is not available."""
component = component_class(**default_kwargs)
build_config = {"knowledge_base": {"value": "nonexistent_kb", "options": ["test_kb"]}}
result = component.update_build_config(build_config, None, "knowledge_base")
assert result["knowledge_base"]["value"] is None
def test_get_kb_metadata_success(self, component_class, default_kwargs):
"""Test successful metadata loading."""
component = component_class(**default_kwargs)
kb_path = Path(default_kwargs["kb_root_path"]) / default_kwargs["knowledge_base"]
with patch("langflow.components.data.kb_retrieval.decrypt_api_key") as mock_decrypt:
mock_decrypt.return_value = "decrypted_key"
metadata = component._get_kb_metadata(kb_path)
assert metadata["embedding_provider"] == "HuggingFace"
assert metadata["embedding_model"] == "sentence-transformers/all-MiniLM-L6-v2"
assert "chunk_size" in metadata
def test_get_kb_metadata_no_file(self, component_class, default_kwargs, tmp_path):
"""Test metadata loading when file doesn't exist."""
component = component_class(**default_kwargs)
nonexistent_path = tmp_path / "nonexistent"
nonexistent_path.mkdir()
metadata = component._get_kb_metadata(nonexistent_path)
assert metadata == {}
def test_get_kb_metadata_json_error(self, component_class, default_kwargs, tmp_path):
"""Test metadata loading with invalid JSON."""
component = component_class(**default_kwargs)
kb_path = tmp_path / "invalid_json_kb"
kb_path.mkdir()
# Create invalid JSON file
(kb_path / "embedding_metadata.json").write_text("invalid json content")
metadata = component._get_kb_metadata(kb_path)
assert metadata == {}
def test_get_kb_metadata_decrypt_error(self, component_class, default_kwargs, tmp_path):
"""Test metadata loading with decryption error."""
component = component_class(**default_kwargs)
kb_path = tmp_path / "decrypt_error_kb"
kb_path.mkdir()
# Create metadata with encrypted key
metadata = {
"embedding_provider": "OpenAI",
"embedding_model": "text-embedding-ada-002",
"api_key": "encrypted_key",
"chunk_size": 1000,
}
(kb_path / "embedding_metadata.json").write_text(json.dumps(metadata))
with patch("langflow.components.data.kb_retrieval.decrypt_api_key") as mock_decrypt:
mock_decrypt.side_effect = ValueError("Decryption failed")
result = component._get_kb_metadata(kb_path)
assert result["api_key"] is None
@patch("langchain_huggingface.HuggingFaceEmbeddings")
def test_build_embeddings_huggingface(self, mock_hf_embeddings, component_class, default_kwargs):
"""Test building HuggingFace embeddings."""
component = component_class(**default_kwargs)
metadata = {
"embedding_provider": "HuggingFace",
"embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
"chunk_size": 1000,
}
mock_embeddings = MagicMock()
mock_hf_embeddings.return_value = mock_embeddings
result = component._build_embeddings(metadata)
mock_hf_embeddings.assert_called_once_with(model="sentence-transformers/all-MiniLM-L6-v2")
assert result == mock_embeddings
@patch("langchain_openai.OpenAIEmbeddings")
def test_build_embeddings_openai(self, mock_openai_embeddings, component_class, default_kwargs):
"""Test building OpenAI embeddings."""
component = component_class(**default_kwargs)
metadata = {
"embedding_provider": "OpenAI",
"embedding_model": "text-embedding-ada-002",
"api_key": "test-api-key",
"chunk_size": 1000,
}
mock_embeddings = MagicMock()
mock_openai_embeddings.return_value = mock_embeddings
result = component._build_embeddings(metadata)
mock_openai_embeddings.assert_called_once_with(
model="text-embedding-ada-002", api_key="test-api-key", chunk_size=1000
)
assert result == mock_embeddings
def test_build_embeddings_openai_no_key(self, component_class, default_kwargs):
"""Test building OpenAI embeddings without API key raises error."""
component = component_class(**default_kwargs)
metadata = {
"embedding_provider": "OpenAI",
"embedding_model": "text-embedding-ada-002",
"api_key": None,
"chunk_size": 1000,
}
with pytest.raises(ValueError, match="OpenAI API key is required"):
component._build_embeddings(metadata)
@patch("langchain_cohere.CohereEmbeddings")
def test_build_embeddings_cohere(self, mock_cohere_embeddings, component_class, default_kwargs):
"""Test building Cohere embeddings."""
component = component_class(**default_kwargs)
metadata = {
"embedding_provider": "Cohere",
"embedding_model": "embed-english-v3.0",
"api_key": "test-api-key",
"chunk_size": 1000,
}
mock_embeddings = MagicMock()
mock_cohere_embeddings.return_value = mock_embeddings
result = component._build_embeddings(metadata)
mock_cohere_embeddings.assert_called_once_with(model="embed-english-v3.0", cohere_api_key="test-api-key")
assert result == mock_embeddings
def test_build_embeddings_cohere_no_key(self, component_class, default_kwargs):
"""Test building Cohere embeddings without API key raises error."""
component = component_class(**default_kwargs)
metadata = {
"embedding_provider": "Cohere",
"embedding_model": "embed-english-v3.0",
"api_key": None,
"chunk_size": 1000,
}
with pytest.raises(ValueError, match="Cohere API key is required"):
component._build_embeddings(metadata)
def test_build_embeddings_custom_not_supported(self, component_class, default_kwargs):
"""Test building custom embeddings raises NotImplementedError."""
component = component_class(**default_kwargs)
metadata = {"embedding_provider": "Custom", "embedding_model": "custom-model", "api_key": "test-key"}
with pytest.raises(NotImplementedError, match="Custom embedding models not yet supported"):
component._build_embeddings(metadata)
def test_build_embeddings_unsupported_provider(self, component_class, default_kwargs):
"""Test building embeddings with unsupported provider raises NotImplementedError."""
component = component_class(**default_kwargs)
metadata = {"embedding_provider": "UnsupportedProvider", "embedding_model": "some-model", "api_key": "test-key"}
with pytest.raises(NotImplementedError, match="Embedding provider 'UnsupportedProvider' is not supported"):
component._build_embeddings(metadata)
def test_build_embeddings_with_user_api_key(self, component_class, default_kwargs):
"""Test that user-provided API key overrides stored one."""
# Create a mock secret input
mock_secret = MagicMock()
mock_secret.get_secret_value.return_value = "user-provided-key"
default_kwargs["api_key"] = mock_secret
component = component_class(**default_kwargs)
metadata = {
"embedding_provider": "OpenAI",
"embedding_model": "text-embedding-ada-002",
"api_key": "stored-key",
"chunk_size": 1000,
}
with patch("langchain_openai.OpenAIEmbeddings") as mock_openai:
mock_embeddings = MagicMock()
mock_openai.return_value = mock_embeddings
component._build_embeddings(metadata)
mock_openai.assert_called_once_with(
model="text-embedding-ada-002", api_key="user-provided-key", chunk_size=1000
)
def test_get_chroma_kb_data_no_metadata(self, component_class, default_kwargs, tmp_path):
"""Test retrieving data when metadata is missing."""
# Remove metadata file
kb_path = tmp_path / default_kwargs["knowledge_base"]
metadata_file = kb_path / "embedding_metadata.json"
if metadata_file.exists():
metadata_file.unlink()
component = component_class(**default_kwargs)
with pytest.raises(ValueError, match="Metadata not found for knowledge base"):
component.get_chroma_kb_data()
def test_get_chroma_kb_data_path_construction(self, component_class, default_kwargs):
"""Test that get_chroma_kb_data constructs the correct paths."""
component = component_class(**default_kwargs)
# Test that the component correctly builds the KB path
assert component.kb_root_path == default_kwargs["kb_root_path"]
assert component.knowledge_base == default_kwargs["knowledge_base"]
# Test that paths are correctly expanded
expanded_path = Path(component.kb_root_path).expanduser()
assert expanded_path.exists() # tmp_path should exist
# Verify method exists with correct parameters
assert hasattr(component, "get_chroma_kb_data")
assert hasattr(component, "search_query")
assert hasattr(component, "top_k")
assert hasattr(component, "include_embeddings")
def test_get_chroma_kb_data_method_exists(self, component_class, default_kwargs):
"""Test that get_chroma_kb_data method exists and can be called."""
component = component_class(**default_kwargs)
# Just verify the method exists and has the right signature
assert hasattr(component, "get_chroma_kb_data"), "Component should have get_chroma_kb_data method"
# Mock all external calls to avoid integration issues
with (
patch.object(component, "_get_kb_metadata") as mock_get_metadata,
patch.object(component, "_build_embeddings") as mock_build_embeddings,
patch("langchain_chroma.Chroma"),
):
mock_get_metadata.return_value = {"embedding_provider": "HuggingFace", "embedding_model": "test-model"}
mock_build_embeddings.return_value = MagicMock()
# This is a unit test focused on the component's internal logic
with contextlib.suppress(Exception):
component.get_chroma_kb_data()
# Verify internal methods were called
mock_get_metadata.assert_called_once()
mock_build_embeddings.assert_called_once()
def test_include_embeddings_parameter(self, component_class, default_kwargs):
"""Test that include_embeddings parameter is properly set."""
# Test with embeddings enabled
default_kwargs["include_embeddings"] = True
component = component_class(**default_kwargs)
assert component.include_embeddings is True
# Test with embeddings disabled
default_kwargs["include_embeddings"] = False
component = component_class(**default_kwargs)
assert component.include_embeddings is False


@ -7,10 +7,12 @@ module.exports = {
"\\.(css|less|scss|sass)$": "identity-obj-proxy",
},
setupFilesAfterEnv: ["<rootDir>/src/setupTests.ts"],
setupFiles: ["<rootDir>/jest.setup.js"],
testMatch: [
"<rootDir>/src/**/__tests__/**/*.{ts,tsx}",
"<rootDir>/src/**/__tests__/**/*.{test,spec}.{ts,tsx}",
"<rootDir>/src/**/*.{test,spec}.{ts,tsx}",
],
testPathIgnorePatterns: ["/node_modules/", "test-utils.tsx"],
transform: {
"^.+\\.(ts|tsx)$": "ts-jest",
},


@ -0,0 +1,38 @@
// Jest setup file to mock globals and Vite-specific syntax
// Mock import.meta
global.import = {
meta: {
env: {
CI: process.env.CI || false,
NODE_ENV: "test",
MODE: "test",
DEV: false,
PROD: false,
VITE_API_URL: "http://localhost:7860",
},
},
};
// Mock crypto for Node.js environment
if (typeof global.crypto === "undefined") {
const { webcrypto } = require("crypto");
global.crypto = webcrypto;
}
// Mock URL if not available
if (typeof global.URL === "undefined") {
global.URL = require("url").URL;
}
// Mock localStorage
const localStorageMock = {
getItem: jest.fn(),
setItem: jest.fn(),
removeItem: jest.fn(),
clear: jest.fn(),
};
global.localStorage = localStorageMock;
// Mock sessionStorage
global.sessionStorage = localStorageMock;
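
The setup above assigns the same `jest.fn()`-based object to both `localStorage` and `sessionStorage`, so `getItem` returns `undefined` and calls made against one store are recorded on the other. If a test needs values to round-trip, an in-memory store per global is one option. The following is a minimal sketch, not part of this PR; `createMemoryStorage` and the `MemoryStorage` interface are hypothetical names introduced for illustration.

```typescript
// Hypothetical helper (not in the PR): a small in-memory stand-in for the
// getItem/setItem/removeItem/clear subset of the Web Storage API.
interface MemoryStorage {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
  clear(): void;
}

function createMemoryStorage(): MemoryStorage {
  let store: Record<string, string> = {};
  return {
    // Web Storage returns null (not undefined) for missing keys.
    getItem: (key) =>
      Object.prototype.hasOwnProperty.call(store, key) ? store[key] : null,
    // Web Storage coerces stored values to strings.
    setItem: (key, value) => {
      store[key] = String(value);
    },
    removeItem: (key) => {
      delete store[key];
    },
    clear: () => {
      store = {};
    },
  };
}

// Separate instances keep the two stores independent of each other.
const localStorageSketch = createMemoryStorage();
const sessionStorageSketch = createMemoryStorage();
```

In a Jest setup file these instances could be assigned to `global.localStorage` and `global.sessionStorage`; whether that suits the existing tests depends on whether they assert on `jest.fn()` call counts.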


@ -1,5 +1,6 @@
import { useState } from "react";
import { mutateTemplate } from "@/CustomNodes/helpers/mutate-template";
import type { handleOnNewValueType } from "@/CustomNodes/hooks/use-handle-new-value";
import { ParameterRenderComponent } from "@/components/core/parameterRenderComponent";
import { Button } from "@/components/ui/button";
import {
@ -26,10 +27,6 @@ interface NodeDialogProps {
nodeClass: APIClassType;
}
interface ValueObject {
value: string;
}
export const NodeDialog: React.FC<NodeDialogProps> = ({
open,
onClose,
@ -44,6 +41,7 @@ export const NodeDialog: React.FC<NodeDialogProps> = ({
const nodes = useFlowStore((state) => state.nodes);
const setNode = useFlowStore((state) => state.setNode);
const setErrorData = useAlertStore((state) => state.setErrorData);
const setSuccessData = useAlertStore((state) => state.setSuccessData);
const postTemplateValue = usePostTemplateValue({
parameterId: name,
@ -71,14 +69,41 @@ export const NodeDialog: React.FC<NodeDialogProps> = ({
setIsLoading(false);
};
const updateFieldValue = (value: string | ValueObject, fieldKey: string) => {
const newValue = typeof value === "object" ? value.value : value;
const updateFieldValue = (
changes: Parameters<handleOnNewValueType>[0],
fieldKey: string,
) => {
// Handle both legacy string format and new object format
const newValue =
typeof changes === "object" && changes !== null ? changes.value : changes;
const targetNode = nodes.find((node) => node.id === nodeId);
if (!targetNode || !name) return;
// Update the main field value
targetNode.data.node.template[name].dialog_inputs.fields.data.node.template[
fieldKey
].value = newValue;
// Handle additional properties like load_from_db for InputGlobalComponent
if (typeof changes === "object" && changes !== null) {
const fieldTemplate =
targetNode.data.node.template[name].dialog_inputs.fields.data.node
.template[fieldKey];
// Update load_from_db if present (for InputGlobalComponent)
if ("load_from_db" in changes) {
fieldTemplate.load_from_db = changes.load_from_db;
}
// Handle any other properties that might be needed
Object.keys(changes).forEach((key) => {
if (key !== "value" && key in fieldTemplate) {
fieldTemplate[key] = changes[key];
}
});
}
setNode(nodeId, targetNode);
setFieldValues((prev) => ({ ...prev, [fieldKey]: newValue }));
@ -110,6 +135,48 @@ export const NodeDialog: React.FC<NodeDialogProps> = ({
onClose();
};
const handleSuccessCallback = () => {
// Check if this is a knowledge base creation
const isKnowledgeBaseCreation =
dialogNodeData?.display_name === "Create Knowledge" ||
dialogNodeData?.name === "create_knowledge_base" ||
(dialogNodeData?.description &&
dialogNodeData.description.toLowerCase().includes("knowledge"));
if (isKnowledgeBaseCreation) {
// Get the knowledge base name from field values
const knowledgeBaseName =
fieldValues["01_new_kb_name"] ||
fieldValues["new_kb_name"] ||
"Knowledge Base";
setSuccessData({
title: `Knowledge Base "${knowledgeBaseName}" created successfully!`,
});
}
// Only close dialog after success and delay for Astra database tracking
if (nodeId.toLowerCase().includes("astra") && name === "database_name") {
const {
cloud_provider: cloudProvider,
new_database_name: databaseName,
...otherFields
} = fieldValues;
track("Database Created", {
nodeId,
cloudProvider,
databaseName,
...otherFields,
});
setTimeout(() => {
handleCloseDialog();
}, 5000);
} else {
handleCloseDialog();
}
};
const handleSubmitDialog = async () => {
// Validate required fields first
const missingRequiredFields = Object.entries(dialogTemplate)
@ -143,27 +210,9 @@ export const NodeDialog: React.FC<NodeDialogProps> = ({
postTemplateValue,
handleErrorData,
name,
handleCloseDialog,
handleSuccessCallback,
nodeClass.tool_mode,
);
if (nodeId.toLowerCase().includes("astra") && name === "database_name") {
const {
cloud_provider: cloudProvider,
new_database_name: databaseName,
...otherFields
} = fieldValues;
track("Database Created", {
nodeId,
cloudProvider,
databaseName,
...otherFields,
});
}
setTimeout(() => {
handleCloseDialog();
}, 5000);
};
// Render
@ -198,8 +247,8 @@ export const NodeDialog: React.FC<NodeDialogProps> = ({
})}
</div>
<ParameterRenderComponent
handleOnNewValue={(value: string) =>
updateFieldValue(value, fieldKey)
handleOnNewValue={(changes) =>
updateFieldValue(changes, fieldKey)
}
name={fieldKey}
nodeId={nodeId}
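
The new `updateFieldValue` accepts either a bare string or an object carrying `value` plus extra properties such as `load_from_db`, and writes them into the field template in place. The merge rule can be sketched as a pure function, which is easier to test in isolation. This is an illustrative sketch only: `FieldTemplate`, `FieldChanges`, and `applyFieldChanges` are hypothetical names, and unlike the PR code this version returns a copy instead of mutating `targetNode`.

```typescript
// Hypothetical types for illustration; the real template shape lives in the codebase.
type FieldTemplate = Record<string, unknown>;
type FieldChanges = string | { value?: unknown; [key: string]: unknown };

function applyFieldChanges(
  template: FieldTemplate,
  changes: FieldChanges,
): FieldTemplate {
  // Legacy callers pass a bare string, which becomes the new value.
  if (typeof changes === "string") {
    return { ...template, value: changes };
  }
  const next: FieldTemplate = { ...template, value: changes.value };
  for (const key of Object.keys(changes)) {
    if (key === "value") continue;
    // load_from_db is always copied; any other key is copied only when the
    // template already defines it, mirroring the checks in updateFieldValue.
    if (key === "load_from_db" || key in template) {
      next[key] = changes[key];
    }
  }
  return next;
}
```

For example, `applyFieldChanges({ value: "", load_from_db: false }, { value: "MY_VAR", load_from_db: true })` yields a template with both properties updated, while keys the template does not know about are dropped.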


@ -1,6 +1,5 @@
import { PopoverAnchor } from "@radix-ui/react-popover";
import Fuse from "fuse.js";
import { cloneDeep } from "lodash";
import { type ChangeEvent, useEffect, useMemo, useRef, useState } from "react";
import NodeDialog from "@/CustomNodes/GenericNode/components/NodeDialogComponent";
import { mutateTemplate } from "@/CustomNodes/helpers/mutate-template";
@ -305,7 +304,9 @@ export default function Dropdown({
disabled ||
(Object.keys(validOptions).length === 0 &&
!combobox &&
!dialogInputs?.fields?.data?.node?.template)
!dialogInputs?.fields?.data?.node?.template &&
!hasRefreshButton &&
!dialogInputs?.fields)
}
variant="primary"
size="xs"
@ -489,41 +490,38 @@ export default function Dropdown({
<CommandSeparator />
{dialogInputs && dialogInputs?.fields && (
<CommandGroup className="p-0">
<CommandItem className="flex cursor-pointer items-center justify-start gap-2 truncate rounded-none py-2.5 text-xs font-semibold text-muted-foreground">
<Button
className="w-full"
unstyled
onClick={() => {
setOpenDialog(true);
}}
>
<div className="flex items-center gap-2 pl-1">
<ForwardedIconComponent
name="Plus"
className="h-3 w-3 text-primary"
/>
{`New ${firstWord}`}
</div>
</Button>
</CommandItem>
<CommandItem className="flex cursor-pointer items-center justify-start gap-2 truncate rounded-none py-2.5 text-xs font-semibold text-muted-foreground">
<Button
className="w-full"
unstyled
data-testid={`refresh-dropdown-list-${name}`}
onClick={() => {
handleRefreshButtonPress();
}}
>
<div className="flex items-center gap-2 pl-1">
<ForwardedIconComponent
name="RefreshCcw"
className={cn("refresh-icon h-3 w-3 text-primary")}
/>
Refresh list
</div>
</Button>
</CommandItem>
<Button
className="flex w-full cursor-pointer items-center justify-start gap-2 truncate rounded-none p-2.5 text-xs font-semibold text-muted-foreground hover:bg-muted hover:text-foreground"
unstyled
onClick={() => {
setOpenDialog(true);
}}
>
<div className="flex items-center gap-2 pl-1">
<ForwardedIconComponent
name="Plus"
className="h-3 w-3 text-primary"
/>
{`New ${firstWord}`}
</div>
</Button>
<Button
className="flex w-full cursor-pointer items-center justify-start gap-2 truncate rounded-none p-2.5 text-xs font-semibold text-muted-foreground hover:bg-muted hover:text-foreground"
unstyled
data-testid={`refresh-dropdown-list-${name}`}
onClick={() => {
handleRefreshButtonPress();
}}
>
<div className="flex items-center gap-2 pl-1">
<ForwardedIconComponent
name="RefreshCcw"
className={cn("refresh-icon h-3 w-3 text-primary")}
/>
Refresh list
</div>
</Button>
<NodeDialog
open={openDialog}
dialogInputs={dialogInputs}


@ -70,7 +70,7 @@ const SideBarFoldersButtonsComponent = ({
const currentFolder = pathname.split("/");
const urlWithoutPath =
pathname.split("/").length < (ENABLE_CUSTOM_PARAM ? 5 : 4);
const checkPathFiles = pathname.includes("files");
const checkPathFiles = pathname.includes("assets");
const checkPathName = (itemId: string) => {
if (urlWithoutPath && itemId === myCollectionId && !checkPathFiles) {
@ -354,6 +354,14 @@ const SideBarFoldersButtonsComponent = ({
});
};
const handleFilesNavigation = () => {
_navigate("/assets/files");
};
const handleKnowledgeNavigation = () => {
_navigate("/assets/knowledge-bases");
};
return (
<Sidebar
collapsible={isMobile ? "offcanvas" : "none"}
@ -469,10 +477,17 @@ const SideBarFoldersButtonsComponent = ({
<SidebarFooter className="border-t">
<div className="grid w-full items-center gap-2 p-2">
{/* TODO: Remove this on cleanup */}
{ENABLE_DATASTAX_LANGFLOW && <CustomStoreButton />}
{ENABLE_DATASTAX_LANGFLOW && <CustomStoreButton />}{" "}
<SidebarMenuButton
isActive={checkPathFiles}
onClick={() => handleFilesClick?.()}
onClick={handleKnowledgeNavigation}
size="md"
className="text-sm"
>
<ForwardedIconComponent name="Library" className="h-4 w-4" />
Knowledge
</SidebarMenuButton>
<SidebarMenuButton
onClick={handleFilesNavigation}
size="md"
className="text-sm"
>


@ -0,0 +1,82 @@
import { useCallback, useEffect, useMemo, useRef } from "react";
import { useGlobalVariablesStore } from "@/stores/globalVariablesStore/globalVariables";
import type { GlobalVariable } from "./types";
// Custom hook for managing global variable value existence
export const useGlobalVariableValue = (
value: string,
globalVariables: GlobalVariable[],
) => {
return useMemo(() => {
return (
globalVariables?.some((variable) => variable.name === value) ?? false
);
}, [globalVariables, value]);
};
// Custom hook for managing unavailable fields
export const useUnavailableField = (
displayName: string | undefined,
value: string,
) => {
const unavailableFields = useGlobalVariablesStore(
(state) => state.unavailableFields,
);
return useMemo(() => {
if (
displayName &&
unavailableFields &&
Object.keys(unavailableFields).includes(displayName) &&
value === ""
) {
return unavailableFields[displayName];
}
return null;
}, [unavailableFields, displayName, value]);
};
// Custom hook for handling initial load logic
export const useInitialLoad = (
disabled: boolean,
loadFromDb: boolean,
globalVariables: GlobalVariable[],
valueExists: boolean,
unavailableField: string | null,
handleOnNewValue: (
value: { value: string; load_from_db: boolean },
options?: { skipSnapshot: boolean },
) => void,
) => {
const initialLoadCompleted = useRef(false);
const handleOnNewValueRef = useRef(handleOnNewValue);
// Keep the latest handleOnNewValue reference
handleOnNewValueRef.current = handleOnNewValue;
// Handle database loading when value doesn't exist
useEffect(() => {
if (disabled || !loadFromDb || !globalVariables.length || valueExists) {
return;
}
handleOnNewValueRef.current(
{ value: "", load_from_db: false },
{ skipSnapshot: true },
);
}, [disabled, loadFromDb, globalVariables.length, valueExists]);
// Handle unavailable field initialization
useEffect(() => {
if (initialLoadCompleted.current || disabled || unavailableField === null) {
return;
}
handleOnNewValueRef.current(
{ value: unavailableField, load_from_db: true },
{ skipSnapshot: true },
);
initialLoadCompleted.current = true;
}, [unavailableField, disabled]);
};
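
The lookup inside `useUnavailableField` does not depend on React, so the rule it encodes can be written as a plain function and unit-tested without rendering a component. The sketch below restates that rule under the assumption that `unavailableFields` is a flat string-to-string map, as the `UnavailableFields` type in this PR suggests; `resolveUnavailableField` is a hypothetical name.

```typescript
// Hypothetical pure version of the lookup performed in useUnavailableField.
function resolveUnavailableField(
  displayName: string | undefined,
  value: string,
  unavailableFields: Record<string, string> | undefined,
): string | null {
  // A suggestion applies only when the field is named, the map exists,
  // and the user has not typed a value yet.
  if (!displayName || !unavailableFields || value !== "") {
    return null;
  }
  return Object.prototype.hasOwnProperty.call(unavailableFields, displayName)
    ? unavailableFields[displayName]
    : null;
}
```

The hook could then wrap this function in `useMemo`, keeping the memoization concern separate from the decision logic.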


@ -1,8 +1,6 @@
import { useEffect, useMemo, useRef } from "react";
import { useEffect } from "react";
import { useGetGlobalVariables } from "@/controllers/API/queries/variables";
import GeneralDeleteConfirmationModal from "@/shared/components/delete-confirmation-modal";
import { useGlobalVariablesStore } from "@/stores/globalVariablesStore/globalVariables";
import { cn } from "../../../../../utils/utils";
import ForwardedIconComponent from "../../../../common/genericIconComponent";
import { CommandItem } from "../../../../ui/command";
@ -10,6 +8,12 @@ import GlobalVariableModal from "../../../GlobalVariableModal/GlobalVariableModa
import { getPlaceholder } from "../../helpers/get-placeholder-disabled";
import type { InputGlobalComponentType, InputProps } from "../../types";
import InputComponent from "../inputComponent";
import {
useGlobalVariableValue,
useInitialLoad,
useUnavailableField,
} from "./hooks";
import type { GlobalVariable, GlobalVariableHandlers } from "./types";
export default function InputGlobalComponent({
display_name,
@ -25,70 +29,93 @@ export default function InputGlobalComponent({
hasRefreshButton = false,
}: InputProps<string, InputGlobalComponentType>): JSX.Element {
const { data: globalVariables } = useGetGlobalVariables();
const unavailableFields = useGlobalVariablesStore(
(state) => state.unavailableFields,
// Safely cast the data to our typed interface
const typedGlobalVariables: GlobalVariable[] = globalVariables ?? [];
const currentValue = value ?? "";
const isDisabled = disabled ?? false;
const loadFromDb = load_from_db ?? false;
// Extract complex logic into custom hooks
const valueExists = useGlobalVariableValue(
currentValue,
typedGlobalVariables,
);
const unavailableField = useUnavailableField(display_name, currentValue);
useInitialLoad(
isDisabled,
loadFromDb,
typedGlobalVariables,
valueExists,
unavailableField,
handleOnNewValue,
);
const initialLoadCompleted = useRef(false);
const valueExists = useMemo(() => {
return (
globalVariables?.some((variable) => variable.name === value) ?? false
);
}, [globalVariables, value]);
const unavailableField = useMemo(() => {
if (
display_name &&
unavailableFields &&
Object.keys(unavailableFields).includes(display_name) &&
value === ""
) {
return unavailableFields[display_name];
}
return null;
}, [unavailableFields, display_name]);
useMemo(() => {
if (disabled) {
return;
}
if (load_from_db && globalVariables && !valueExists) {
// Clean up when selected variable no longer exists
useEffect(() => {
if (loadFromDb && currentValue && !valueExists && !isDisabled) {
handleOnNewValue(
{ value: "", load_from_db: false },
{ skipSnapshot: true },
);
}
}, [
globalVariables,
unavailableFields,
disabled,
load_from_db,
valueExists,
unavailableField,
value,
handleOnNewValue,
]);
}, [loadFromDb, currentValue, valueExists, isDisabled, handleOnNewValue]);
useEffect(() => {
if (initialLoadCompleted.current || disabled || unavailableField === null) {
return;
}
// Create handlers object for better organization
const handlers: GlobalVariableHandlers = {
// Handler for deleting global variables
handleVariableDelete: (variableName: string) => {
if (value === variableName) {
handleOnNewValue({
value: "",
load_from_db: false,
});
}
},
handleOnNewValue(
{ value: unavailableField, load_from_db: true },
{ skipSnapshot: true },
);
// Handler for selecting a global variable
handleVariableSelect: (selectedValue: string) => {
handleOnNewValue({
value: selectedValue,
load_from_db: selectedValue !== "",
});
},
initialLoadCompleted.current = true;
}, [unavailableField, disabled, load_from_db, value, handleOnNewValue]);
// Handler for input changes
handleInputChange: (inputValue: string, skipSnapshot?: boolean) => {
handleOnNewValue(
{ value: inputValue, load_from_db: false },
{ skipSnapshot },
);
},
};
function handleDelete(key: string) {
if (value === key) {
handleOnNewValue({ value: "", load_from_db: load_from_db });
}
}
// Render add new variable button
const renderAddVariableButton = () => (
<GlobalVariableModal referenceField={display_name} disabled={disabled}>
<CommandItem value="doNotFilter-addNewVariable">
<ForwardedIconComponent
name="Plus"
className={cn("mr-2 h-4 w-4 text-primary")}
aria-hidden="true"
/>
<span>Add New Variable</span>
</CommandItem>
</GlobalVariableModal>
);
// Render delete button for each option
const renderDeleteButton = (option: string) => (
<GeneralDeleteConfirmationModal
option={option}
onConfirmDelete={() => handlers.handleVariableDelete(option)}
/>
);
// Extract options list for better readability
const variableOptions = typedGlobalVariables.map((variable) => variable.name);
const selectedOption = loadFromDb && valueExists ? currentValue : "";
return (
<InputComponent
@ -99,41 +126,15 @@ export default function InputGlobalComponent({
editNode={editNode}
disabled={disabled}
password={password ?? false}
value={value ?? ""}
options={globalVariables?.map((variable) => variable.name) ?? []}
optionsPlaceholder={"Global Variables"}
value={currentValue}
options={variableOptions}
optionsPlaceholder="Global Variables"
optionsIcon="Globe"
optionsButton={
<GlobalVariableModal referenceField={display_name} disabled={disabled}>
<CommandItem value="doNotFilter-addNewVariable">
<ForwardedIconComponent
name="Plus"
className={cn("mr-2 h-4 w-4 text-primary")}
aria-hidden="true"
/>
<span>Add New Variable</span>
</CommandItem>
</GlobalVariableModal>
}
optionButton={(option) => (
<GeneralDeleteConfirmationModal
option={option}
onConfirmDelete={() => handleDelete(option)}
/>
)}
selectedOption={load_from_db && valueExists ? value : ""}
setSelectedOption={(value) => {
handleOnNewValue({
value: value,
load_from_db: value !== "" ? true : false,
});
}}
onChange={(value, skipSnapshot) => {
handleOnNewValue(
{ value: value, load_from_db: false },
{ skipSnapshot },
);
}}
optionsButton={renderAddVariableButton()}
optionButton={renderDeleteButton}
selectedOption={selectedOption}
setSelectedOption={handlers.handleVariableSelect}
onChange={handlers.handleInputChange}
isToolMode={isToolMode}
hasRefreshButton={hasRefreshButton}
/>


@ -0,0 +1,14 @@
export interface GlobalVariable {
name: string;
// Add other properties as needed
}
export interface UnavailableFields {
[key: string]: string;
}
export interface GlobalVariableHandlers {
handleVariableDelete: (variableName: string) => void;
handleVariableSelect: (selectedValue: string) => void;
handleInputChange: (inputValue: string, skipSnapshot?: boolean) => void;
}


@ -18,6 +18,7 @@ export default function TableAutoCellRender({
colDef,
formatter,
api,
...props
}: CustomCellRender) {
function getCellType() {
let format: string = formatter ? formatter : typeof value;
@ -92,7 +93,12 @@ export default function TableAutoCellRender({
}}
editNode={true}
id={"toggle" + colDef?.colId + uniqueId()}
disabled={false}
disabled={
colDef?.cellRendererParams?.isSingleToggleColumn &&
colDef?.cellRendererParams?.checkSingleToggleEditable
? !colDef.cellRendererParams.checkSingleToggleEditable(props)
: false
}
/>
) : (
<Badge


@ -54,6 +54,53 @@ const TableComponent = forwardRef<
},
ref,
) => {
const isSingleToggleRowEditable = (
colField: string,
rowData: any,
currentRowValue: any,
) => {
try {
// Check if this is a single-toggle column (Vectorize or Identifier)
const isSingleToggleColumn =
colField === "Vectorize" ||
colField === "vectorize" ||
colField === "Identifier" ||
colField === "identifier";
if (!isSingleToggleColumn) return true;
// Safeguard: ensure we have rowData array
if (!props.rowData || !Array.isArray(props.rowData)) {
return true;
}
// Normalize the current value to boolean
const normalizedCurrentValue =
currentRowValue === true ||
currentRowValue === "true" ||
currentRowValue === 1;
// If current row is true, always allow editing (to turn it off)
if (normalizedCurrentValue) {
return true;
}
// If current row is false, only allow editing if no other row is true
const hasAnyTrue = props.rowData.some((row) => {
if (!row || typeof row !== "object") return false;
const value = row[colField];
const normalizedValue =
value === true || value === "true" || value === 1;
return normalizedValue;
});
return !hasAnyTrue;
} catch (error) {
// Default to editable if there's an error to avoid breaking functionality
return true;
}
};
const colDef = props.columnDefs
.filter((col) => !col.hide)
.map((col, index, filteredArray) => {
@ -92,10 +139,49 @@ const TableComponent = forwardRef<
props.editable.every((field) => typeof field === "string") &&
(props.editable as Array<string>).includes(newCol.field ?? ""))
) {
newCol = {
...newCol,
editable: true,
};
// Special handling for single-toggle columns (Vectorize and Identifier)
const isSingleToggleColumn =
newCol.field === "Vectorize" ||
newCol.field === "vectorize" ||
newCol.field === "Identifier" ||
newCol.field === "identifier";
if (isSingleToggleColumn) {
newCol = {
...newCol,
editable: (params) => {
const currentValue = params.data[params.colDef.field!];
return isSingleToggleRowEditable(
newCol.field!,
params.data,
currentValue,
);
},
cellRendererParams: {
...newCol.cellRendererParams,
isSingleToggleColumn: true,
singleToggleField: newCol.field,
checkSingleToggleEditable: (params) => {
try {
const fieldName = newCol.field!;
const currentValue = params?.data?.[fieldName];
return isSingleToggleRowEditable(
fieldName,
params?.data,
currentValue,
);
} catch (error) {
return false;
}
},
},
};
} else {
newCol = {
...newCol,
editable: true,
};
}
}
if (
Array.isArray(props.editable) &&
@ -109,11 +195,68 @@ const TableComponent = forwardRef<
}>
).find((field) => field.field === newCol.field);
if (field) {
newCol = {
...newCol,
editable: field.editableCell,
onCellValueChanged: (e) => field.onUpdate(e),
};
// Special handling for single-toggle columns (Vectorize and Identifier)
const isSingleToggleColumn =
newCol.field === "Vectorize" ||
newCol.field === "vectorize" ||
newCol.field === "Identifier" ||
newCol.field === "identifier";
if (isSingleToggleColumn) {
newCol = {
...newCol,
editable: (params) => {
const currentValue = params.data[params.colDef.field!];
return (
field.editableCell &&
isSingleToggleRowEditable(
newCol.field!,
params.data,
currentValue,
)
);
},
cellRendererParams: {
...newCol.cellRendererParams,
isSingleToggleColumn: true,
singleToggleField: newCol.field,
checkSingleToggleEditable: (params) => {
try {
const fieldName = newCol.field!;
const currentValue = params?.data?.[fieldName];
return (
field.editableCell &&
isSingleToggleRowEditable(
fieldName,
params?.data,
currentValue,
)
);
} catch (error) {
return false;
}
},
},
onCellValueChanged: (e) => {
field.onUpdate(e);
// Refresh grid to update editable state of other cells
setTimeout(() => {
if (
realRef.current?.api &&
!realRef.current.api.isDestroyed()
) {
realRef.current.api.refreshCells({ force: true });
}
}, 0);
},
};
} else {
newCol = {
...newCol,
editable: field.editableCell,
onCellValueChanged: (e) => field.onUpdate(e),
};
}
}
}
return newCol;
@ -253,6 +396,61 @@ const TableComponent = forwardRef<
}}
onGridReady={onGridReady}
onColumnMoved={onColumnMoved}
onCellValueChanged={(e) => {
// Handle single-toggle column changes (Vectorize and Identifier) to refresh grid editability
const isSingleToggleField =
e.colDef.field === "Vectorize" ||
e.colDef.field === "vectorize" ||
e.colDef.field === "Identifier" ||
e.colDef.field === "identifier";
if (isSingleToggleField) {
setTimeout(() => {
if (
realRef.current?.api &&
!realRef.current.api.isDestroyed()
) {
// Refresh all cells with force to update cell renderer params
if (e.colDef.field) {
realRef.current.api.refreshCells({
force: true,
columns: [e.colDef.field],
});
}
// Also refresh all other single-toggle column cells if they exist
const allSingleToggleColumns = realRef.current.api
.getColumns()
?.filter((col) => {
const field = col.getColDef().field;
return (
field === "Vectorize" ||
field === "vectorize" ||
field === "Identifier" ||
field === "identifier"
);
});
if (
allSingleToggleColumns &&
allSingleToggleColumns.length > 0
) {
const columnFields = allSingleToggleColumns
.map((col) => col.getColDef().field)
.filter((field): field is string => field !== undefined);
if (columnFields.length > 0) {
realRef.current.api.refreshCells({
force: true,
columns: columnFields,
});
}
}
}
}, 0);
}
// Call original onCellValueChanged if it exists
if (props.onCellValueChanged) {
props.onCellValueChanged(e);
}
}}
onStateUpdated={(e) => {
if (e.sources.some((source) => source.includes("column"))) {
localStorage.setItem(
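
The check for the four single-toggle field names is repeated in several places in this file, and the "at most one row may be on" rule is embedded in `isSingleToggleRowEditable`. A possible consolidation is sketched below. It is not the PR's code: `SINGLE_TOGGLE_FIELDS`, `isSingleToggleField`, `isTruthyToggle`, and `isRowEditable` are hypothetical names, and the grid-specific wiring (AG Grid params, `refreshCells`) is deliberately left out.

```typescript
// Hypothetical helpers that gather the repeated field-name checks in one place.
const SINGLE_TOGGLE_FIELDS = ["Vectorize", "vectorize", "Identifier", "identifier"];

function isSingleToggleField(field: string | undefined): boolean {
  return field !== undefined && SINGLE_TOGGLE_FIELDS.indexOf(field) !== -1;
}

// Same normalization as the PR: true, "true", and 1 count as "on".
function isTruthyToggle(value: unknown): boolean {
  return value === true || value === "true" || value === 1;
}

function isRowEditable(
  rows: Array<Record<string, unknown>>,
  field: string,
  currentValue: unknown,
): boolean {
  // Ordinary columns are never restricted by this rule.
  if (!isSingleToggleField(field)) return true;
  // A row that is already on can always be turned off.
  if (isTruthyToggle(currentValue)) return true;
  // A row that is off may be turned on only if no row is currently on.
  return !rows.some((row) => isTruthyToggle(row[field]));
}
```

With rows `[{ Vectorize: true }, { Vectorize: false }]`, the first row stays editable and the second is locked until the first is switched off.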


@ -29,6 +29,7 @@ export const URLs = {
PUBLIC_FLOW: `flows/public_flow`,
MCP: `mcp/project`,
MCP_SERVERS: `mcp/servers`,
KNOWLEDGE_BASES: `knowledge_bases`,
} as const;
// IMPORTANT: FOLDERS endpoint now points to 'projects' for backward compatibility

@@ -0,0 +1,39 @@
import type { UseMutationResult } from "@tanstack/react-query";
import type { useMutationFunctionType } from "@/types/api";
import { api } from "../../api";
import { getURL } from "../../helpers/constants";
import { UseRequestProcessor } from "../../services/request-processor";
interface DeleteKnowledgeBaseParams {
kb_name: string;
}
export const useDeleteKnowledgeBase: useMutationFunctionType<
DeleteKnowledgeBaseParams,
void
> = (params, options?) => {
const { mutate, queryClient } = UseRequestProcessor();
const deleteKnowledgeBaseFn = async (): Promise<any> => {
const response = await api.delete<any>(
`${getURL("KNOWLEDGE_BASES")}/${params.kb_name}`,
);
return response.data;
};
const mutation: UseMutationResult<any, any, void> = mutate(
["useDeleteKnowledgeBase"],
deleteKnowledgeBaseFn,
{
// Spread caller options first so the onSettled wrapper below is not overridden
...options,
onSettled: (data, error, variables, context) => {
queryClient.invalidateQueries({
queryKey: ["useGetKnowledgeBases"],
});
options?.onSettled?.(data, error, variables, context);
},
},
);
return mutation;
};

@@ -0,0 +1,38 @@
import type { UseMutationResult } from "@tanstack/react-query";
import type { useMutationFunctionType } from "@/types/api";
import { api } from "../../api";
import { getURL } from "../../helpers/constants";
import { UseRequestProcessor } from "../../services/request-processor";
interface DeleteKnowledgeBasesParams {
kb_names: string[];
}
export const useDeleteKnowledgeBases: useMutationFunctionType<
undefined,
DeleteKnowledgeBasesParams
> = (options?) => {
const { mutate, queryClient } = UseRequestProcessor();
const deleteKnowledgeBasesFn = async (
params: DeleteKnowledgeBasesParams,
): Promise<any> => {
const response = await api.delete<any>(`${getURL("KNOWLEDGE_BASES")}/`, {
data: { kb_names: params.kb_names },
});
return response.data;
};
const mutation: UseMutationResult<any, any, DeleteKnowledgeBasesParams> =
mutate(["useDeleteKnowledgeBases"], deleteKnowledgeBasesFn, {
// Spread caller options first so the onSettled wrapper below is not overridden
...options,
onSettled: (data, error, variables, context) => {
queryClient.invalidateQueries({
queryKey: ["useGetKnowledgeBases"],
});
options?.onSettled?.(data, error, variables, context);
},
});
return mutation;
};
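
Review note: the order in which caller options are merged with a hook's own `onSettled` decides whether the cache invalidation always runs. The following is a minimal sketch of the two merge orders, not the project's actual `UseRequestProcessor` code; the `MutationOptions` shape and both function names are hypothetical and deliberately simplified.

```typescript
// Hypothetical, simplified option shape; not the real React Query types.
interface MutationOptions {
  onSettled?: () => void;
}

// Caller options are spread first, then onSettled is defined, so the wrapper
// always runs: it invalidates and then delegates to the caller's handler.
function mergeSpreadFirst(
  invalidate: () => void,
  options?: MutationOptions,
): MutationOptions {
  return {
    ...options,
    onSettled: () => {
      invalidate();
      options?.onSettled?.();
    },
  };
}

// Caller options are spread last, so a caller-supplied onSettled replaces
// the wrapper and the invalidation step is skipped.
function mergeSpreadLast(
  invalidate: () => void,
  options?: MutationOptions,
): MutationOptions {
  return {
    onSettled: () => {
      invalidate();
      options?.onSettled?.();
    },
    ...options,
  };
}

// Record the order of calls for each variant.
const firstCalls: string[] = [];
mergeSpreadFirst(
  () => { firstCalls.push("invalidate"); },
  { onSettled: () => { firstCalls.push("caller"); } },
).onSettled?.();

const lastCalls: string[] = [];
mergeSpreadLast(
  () => { lastCalls.push("invalidate"); },
  { onSettled: () => { lastCalls.push("caller"); } },
).onSettled?.();
```

With a caller-supplied `onSettled`, the first variant records both the invalidation and the caller's handler, while the second records only the caller's handler.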

@@ -0,0 +1,40 @@
import type { UseQueryResult } from "@tanstack/react-query";
import type { useQueryFunctionType } from "@/types/api";
import { api } from "../../api";
import { getURL } from "../../helpers/constants";
import { UseRequestProcessor } from "../../services/request-processor";
export interface KnowledgeBaseInfo {
id: string;
name: string;
embedding_provider?: string;
embedding_model?: string;
size: number;
words: number;
characters: number;
chunks: number;
avg_chunk_size: number;
}
export const useGetKnowledgeBases: useQueryFunctionType<
undefined,
KnowledgeBaseInfo[]
> = (options?) => {
const { query } = UseRequestProcessor();
const getKnowledgeBasesFn = async (): Promise<KnowledgeBaseInfo[]> => {
const res = await api.get(`${getURL("KNOWLEDGE_BASES")}/`);
return res.data;
};
const queryResult: UseQueryResult<KnowledgeBaseInfo[], any> = query(
["useGetKnowledgeBases"],
getKnowledgeBasesFn,
{
refetchOnWindowFocus: false,
...options,
},
);
return queryResult;
};

@@ -15,5 +15,7 @@ export const ENABLE_VOICE_ASSISTANT = true;
export const ENABLE_IMAGE_ON_PLAYGROUND = false;
export const ENABLE_MCP = true;
export const ENABLE_MCP_NOTICE = false;
export const ENABLE_KNOWLEDGE_BASES = false;
export const ENABLE_MCP_COMPOSER =
process.env.LANGFLOW_FEATURE_MCP_COMPOSER === "true";

@@ -46,7 +46,9 @@ export default function DeleteConfirmationModal({
</DialogHeader>
<span className="pb-3 text-sm">
This will permanently delete the {description ?? "flow"}
{note ? " " + note : ""}.<br></br>This can't be undone.
{note ? " " + note : ""}.<br />
<br />
This can't be undone.
</span>
<DialogFooter>
<DialogClose asChild>

@@ -0,0 +1,446 @@
import type {
ColDef,
NewValueParams,
SelectionChangedEvent,
} from "ag-grid-community";
import type { AgGridReact } from "ag-grid-react";
import { useEffect, useMemo, useRef, useState } from "react";
import ForwardedIconComponent from "@/components/common/genericIconComponent";
import ShadTooltip from "@/components/common/shadTooltipComponent";
import CardsWrapComponent from "@/components/core/cardsWrapComponent";
import TableComponent from "@/components/core/parameterRenderComponent/components/tableComponent";
import { Button } from "@/components/ui/button";
import { Input } from "@/components/ui/input";
import Loading from "@/components/ui/loading";
import { useGetFilesV2 } from "@/controllers/API/queries/file-management";
import { useDeleteFilesV2 } from "@/controllers/API/queries/file-management/use-delete-files";
import { usePostRenameFileV2 } from "@/controllers/API/queries/file-management/use-put-rename-file";
import { useCustomHandleBulkFilesDownload } from "@/customization/hooks/use-custom-handle-bulk-files-download";
import { customPostUploadFileV2 } from "@/customization/hooks/use-custom-post-upload-file";
import useUploadFile from "@/hooks/files/use-upload-file";
import DeleteConfirmationModal from "@/modals/deleteConfirmationModal";
import FilesContextMenuComponent from "@/modals/fileManagerModal/components/filesContextMenuComponent";
import useAlertStore from "@/stores/alertStore";
import { formatFileSize } from "@/utils/stringManipulation";
import { FILE_ICONS } from "@/utils/styleUtils";
import { cn } from "@/utils/utils";
import { sortByDate } from "../../../utils/sort-flows";
import DragWrapComponent from "./dragWrapComponent";
interface FilesTabProps {
quickFilterText: string;
setQuickFilterText: (text: string) => void;
selectedFiles: any[];
setSelectedFiles: (files: any[]) => void;
quantitySelected: number;
setQuantitySelected: (quantity: number) => void;
isShiftPressed: boolean;
}
const FilesTab = ({
quickFilterText,
setQuickFilterText,
selectedFiles,
setSelectedFiles,
quantitySelected,
setQuantitySelected,
isShiftPressed,
}: FilesTabProps) => {
const tableRef = useRef<AgGridReact<any>>(null);
const { data: files } = useGetFilesV2();
const setErrorData = useAlertStore((state) => state.setErrorData);
const setSuccessData = useAlertStore((state) => state.setSuccessData);
const [isDownloading, setIsDownloading] = useState(false);
const { mutate: rename } = usePostRenameFileV2();
const { mutate: deleteFiles, isPending: isDeleting } = useDeleteFilesV2();
const { handleBulkDownload } = useCustomHandleBulkFilesDownload();
const handleRename = (params: NewValueParams<any, any>) => {
rename({
id: params.data.id,
name: params.newValue,
});
};
const handleOpenRename = (id: string, name: string) => {
if (tableRef.current) {
tableRef.current.api.startEditingCell({
rowIndex: files?.findIndex((file) => file.id === id) ?? 0,
colKey: "name",
});
}
};
const uploadFile = useUploadFile({ multiple: true });
const handleUpload = async (files?: File[]) => {
try {
const filesIds = await uploadFile({
files: files,
});
setSuccessData({
title: `File${filesIds.length > 1 ? "s" : ""} uploaded successfully`,
});
} catch (error: any) {
setErrorData({
title: "Error uploading file",
list: [error.message || "An error occurred while uploading the file"],
});
}
};
const { mutate: uploadFileDirect } = customPostUploadFileV2();
useEffect(() => {
if (files) {
setQuantitySelected(0);
setSelectedFiles([]);
}
}, [files, setQuantitySelected, setSelectedFiles]);
const handleSelectionChanged = (event: SelectionChangedEvent) => {
const selectedRows = event.api.getSelectedRows();
setSelectedFiles(selectedRows);
if (selectedRows.length > 0) {
setQuantitySelected(selectedRows.length);
} else {
setTimeout(() => {
setQuantitySelected(0);
}, 300);
}
};
const colDefs: ColDef[] = [
{
headerName: "Name",
field: "name",
flex: 2,
headerCheckboxSelection: true,
checkboxSelection: true,
editable: true,
filter: "agTextColumnFilter",
cellClass:
"cursor-text select-text group-[.no-select-cells]:cursor-default group-[.no-select-cells]:select-none",
cellRenderer: (params) => {
const type = params.data.path.split(".")[1]?.toLowerCase();
return (
<div className="flex items-center gap-4 font-medium">
{params.data.progress !== undefined &&
params.data.progress !== -1 ? (
<div className="flex h-6 items-center justify-center text-xs font-semibold text-muted-foreground">
{Math.round(params.data.progress * 100)}%
</div>
) : (
<div className="file-icon pointer-events-none relative">
<ForwardedIconComponent
name={FILE_ICONS[type]?.icon ?? "File"}
className={cn(
"-mx-[3px] h-6 w-6 shrink-0",
params.data.progress !== undefined
? "text-placeholder-foreground"
: (FILE_ICONS[type]?.color ?? undefined),
)}
/>
</div>
)}
<div
className={cn(
"flex items-center gap-2 text-sm font-medium",
params.data.progress !== undefined &&
params.data.progress === -1 &&
"pointer-events-none text-placeholder-foreground",
)}
>
{params.value}.{type}
</div>
{params.data.progress !== undefined &&
params.data.progress === -1 ? (
<span className="text-xs text-primary">
Upload failed,{" "}
<span
className="cursor-pointer text-accent-pink-foreground underline"
onClick={(e) => {
e.stopPropagation();
if (params.data.file) {
uploadFileDirect({ file: params.data.file });
}
}}
>
try again?
</span>
</span>
) : (
<></>
)}
</div>
);
},
},
{
headerName: "Type",
field: "path",
flex: 1,
filter: "agTextColumnFilter",
editable: false,
valueFormatter: (params) => {
return params.value.split(".")[1]?.toUpperCase();
},
cellClass:
"text-muted-foreground cursor-text select-text group-[.no-select-cells]:cursor-default group-[.no-select-cells]:select-none",
},
{
headerName: "Size",
field: "size",
flex: 1,
valueFormatter: (params) => {
return formatFileSize(params.value);
},
editable: false,
cellClass:
"text-muted-foreground cursor-text select-text group-[.no-select-cells]:cursor-default group-[.no-select-cells]:select-none",
},
{
headerName: "Modified",
field: "updated_at",
valueFormatter: (params) => {
return params.data.progress
? ""
: new Date(params.value + "Z").toLocaleString();
},
editable: false,
flex: 1,
resizable: false,
cellClass:
"text-muted-foreground cursor-text select-text group-[.no-select-cells]:cursor-default group-[.no-select-cells]:select-none",
},
{
maxWidth: 60,
editable: false,
resizable: false,
cellClass: "cursor-default",
cellRenderer: (params) => {
return (
<div className="flex h-full cursor-default items-center justify-center">
{!params.data.progress && (
<FilesContextMenuComponent
file={params.data}
handleRename={handleOpenRename}
>
<Button variant="ghost" size="iconMd">
<ForwardedIconComponent name="EllipsisVertical" />
</Button>
</FilesContextMenuComponent>
)}
</div>
);
},
},
];
const onFileDrop = async (e: React.DragEvent) => {
e.preventDefault();
e.stopPropagation();
const droppedFiles = Array.from(e.dataTransfer.files);
if (droppedFiles.length > 0) {
await handleUpload(droppedFiles);
}
};
const handleDownload = () => {
handleBulkDownload(
selectedFiles,
setSuccessData,
setErrorData,
setIsDownloading,
);
};
const handleDelete = () => {
deleteFiles(
{
ids: selectedFiles.map((file) => file.id),
},
{
onSuccess: (data) => {
setSuccessData({ title: data.message });
setQuantitySelected(0);
setSelectedFiles([]);
},
onError: (error) => {
setErrorData({
title: "Error deleting files",
list: [
error.message || "An error occurred while deleting the files",
],
});
},
},
);
};
const UploadButtonComponent = useMemo(() => {
return (
<ShadTooltip content="Upload File" side="bottom">
<Button
className="!px-3 md:!px-4 md:!pl-3.5"
onClick={async () => {
await handleUpload();
}}
id="upload-file-btn"
data-testid="upload-file-btn"
>
<ForwardedIconComponent
name="Plus"
aria-hidden="true"
className="h-4 w-4"
/>
<span className="hidden whitespace-nowrap font-semibold md:inline">
Upload Files
</span>
</Button>
</ShadTooltip>
);
}, []);
return (
<div className="flex h-full flex-col">
{files && files.length !== 0 ? (
<div className="flex justify-between">
<div className="flex w-full xl:w-5/12">
<Input
icon="Search"
data-testid="search-store-input"
type="text"
placeholder={`Search files...`}
className="mr-2 w-full"
value={quickFilterText || ""}
onChange={(event) => {
setQuickFilterText(event.target.value);
}}
/>
</div>
<div className="flex items-center gap-2">{UploadButtonComponent}</div>
</div>
) : (
<></>
)}
<div className="flex h-full flex-col py-4">
{!files || !Array.isArray(files) ? (
<div className="flex h-full w-full items-center justify-center">
<Loading />
</div>
) : files.length > 0 ? (
<DragWrapComponent onFileDrop={onFileDrop}>
<div className="relative h-full">
<TableComponent
rowHeight={45}
headerHeight={45}
cellSelection={false}
tableOptions={{
hide_options: true,
}}
suppressRowClickSelection={!isShiftPressed}
editable={[
{
field: "name",
onUpdate: handleRename,
editableCell: true,
},
]}
rowSelection="multiple"
onSelectionChanged={handleSelectionChanged}
columnDefs={colDefs}
rowData={[...files].sort((a, b) => {
return sortByDate(
a.updated_at ?? a.created_at,
b.updated_at ?? b.created_at,
);
})}
className={cn(
"ag-no-border group w-full",
isShiftPressed && quantitySelected > 0 && "no-select-cells",
)}
pagination
ref={tableRef}
quickFilterText={quickFilterText}
gridOptions={{
stopEditingWhenCellsLoseFocus: true,
ensureDomOrder: true,
colResizeDefault: "shift",
}}
/>
<div
className={cn(
"pointer-events-none absolute top-1.5 z-50 flex h-8 w-full transition-opacity",
selectedFiles.length > 0 ? "opacity-100" : "opacity-0",
)}
>
<div
className={cn(
"ml-12 flex h-full flex-1 items-center justify-between bg-background",
selectedFiles.length > 0
? "pointer-events-auto"
: "pointer-events-none",
)}
>
<span className="text-xs text-muted-foreground">
{quantitySelected} selected
</span>
<div className="flex items-center gap-2">
<Button
variant="outline"
size="iconMd"
onClick={handleDownload}
loading={isDownloading}
data-testid="bulk-download-btn"
>
<ForwardedIconComponent name="Download" />
</Button>
<DeleteConfirmationModal
onConfirm={handleDelete}
description={"file" + (quantitySelected > 1 ? "s" : "")}
>
<Button
variant="destructive"
size="iconMd"
className="px-2.5 !text-mmd"
loading={isDeleting}
data-testid="bulk-delete-btn"
>
<ForwardedIconComponent name="Trash2" />
Delete
</Button>
</DeleteConfirmationModal>
</div>
</div>
</div>
</div>
</DragWrapComponent>
) : (
<CardsWrapComponent
onFileDrop={onFileDrop}
dragMessage="Drop files to upload"
>
<div className="flex h-full w-full flex-col items-center justify-center gap-8 pb-8">
<div className="flex flex-col items-center gap-2">
<h3 className="text-2xl font-semibold">No files</h3>
<p className="text-lg text-secondary-foreground">
Upload files or import from your preferred cloud.
</p>
</div>
<div className="flex items-center gap-2">
{UploadButtonComponent}
</div>
</div>
</CardsWrapComponent>
)}
</div>
</div>
);
};
export default FilesTab;
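
Review note: the Name and Type columns derive the file type with `path.split(".")[1]`, which picks the segment after the first dot. Below is a minimal sketch of reading the segment after the last dot of the file name instead. `getFileExtension` is a hypothetical helper, not part of this PR, and the sketch assumes paths use `/` as the separator.

```typescript
// Hypothetical helper: returns the lower-cased text after the last dot of the
// file name, or undefined when there is no usable extension.
function getFileExtension(path: string): string | undefined {
  // Drop any directory part so dots in folder names are ignored.
  const fileName = path.slice(path.lastIndexOf("/") + 1);
  const lastDot = fileName.lastIndexOf(".");
  // No dot, a leading dot only (".env"), or a trailing dot: no extension.
  if (lastDot <= 0 || lastDot === fileName.length - 1) {
    return undefined;
  }
  return fileName.slice(lastDot + 1).toLowerCase();
}

const simple = getFileExtension("uploads/report.PDF");
const multiDot = getFileExtension("archive.tar.gz");
const noExtension = getFileExtension("README");
```

For a name such as `archive.tar.gz`, the first-dot approach yields `tar`, whereas this sketch yields `gz`. Which of the two the UI should show is a design choice for the authors.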

@@ -0,0 +1,68 @@
import ForwardedIconComponent from "@/components/common/genericIconComponent";
import { Button } from "@/components/ui/button";
import { Separator } from "@/components/ui/separator";
import type { KnowledgeBaseInfo } from "@/controllers/API/queries/knowledge-bases/use-get-knowledge-bases";
interface KnowledgeBaseDrawerProps {
isOpen: boolean;
onClose: () => void;
knowledgeBase: KnowledgeBaseInfo | null;
}
const KnowledgeBaseDrawer = ({
isOpen,
onClose,
knowledgeBase,
}: KnowledgeBaseDrawerProps) => {
if (!isOpen || !knowledgeBase) {
return null;
}
return (
<div className="flex h-full w-80 flex-col border-l bg-background">
<div className="flex items-center justify-between pt-4 px-4">
<h3 className="font-semibold">{knowledgeBase.name}</h3>
<Button variant="ghost" size="iconSm" onClick={onClose}>
<ForwardedIconComponent name="X" className="h-4 w-4" />
</Button>
</div>
<div className="flex-1 overflow-y-auto pt-3">
<div className="flex flex-col gap-4">
<div className="px-4">
<div className="text-sm text-muted-foreground">
No description available.
</div>
</div>
<Separator />
<div className="space-y-2 px-4">
<label className="text-sm font-medium">Embedding Provider</label>
<div className="flex items-center gap-2">
<div className="text-sm font-medium text-muted-foreground">
{knowledgeBase.embedding_model || "Unknown"}
</div>
</div>
</div>
<div className="space-y-3 px-4">
<h4 className="text-sm font-medium">Source Files</h4>
<div className="text-sm text-muted-foreground">
No source files available.
</div>
</div>
<div className="space-y-3 px-4">
<h4 className="text-sm font-medium">Linked Flows</h4>
<div className="text-sm text-muted-foreground">
No linked flows available.
</div>
</div>
</div>
</div>
</div>
);
};
export default KnowledgeBaseDrawer;

@@ -0,0 +1,63 @@
import { useParams } from "react-router-dom";
import ForwardedIconComponent from "@/components/common/genericIconComponent";
import { Button } from "@/components/ui/button";
import { useCustomNavigate } from "@/customization/hooks/use-custom-navigate";
import { track } from "@/customization/utils/analytics";
import useAddFlow from "@/hooks/flows/use-add-flow";
import useFlowsManagerStore from "@/stores/flowsManagerStore";
import { useFolderStore } from "@/stores/foldersStore";
import { updateIds } from "@/utils/reactflowUtils";
const KnowledgeBaseEmptyState = () => {
const examples = useFlowsManagerStore((state) => state.examples);
const addFlow = useAddFlow();
const navigate = useCustomNavigate();
const { folderId } = useParams();
const myCollectionId = useFolderStore((state) => state.myCollectionId);
const folderIdUrl = folderId ?? myCollectionId;
const handleCreateKnowledge = async () => {
const knowledgeBasesExample = examples.find(
(example) => example.name === "Knowledge Ingestion",
);
if (knowledgeBasesExample && knowledgeBasesExample.data) {
updateIds(knowledgeBasesExample.data);
addFlow({ flow: knowledgeBasesExample }).then((id) => {
navigate(`/flow/${id}/folder/${folderIdUrl}`);
});
track("New Flow Created", {
template: `${knowledgeBasesExample.name} Template`,
});
}
};
return (
<div className="flex h-full w-full flex-col items-center justify-center gap-8 pb-8">
<div className="flex flex-col items-center gap-2">
<h3 className="text-2xl font-semibold">No knowledge bases</h3>
<p className="text-lg text-secondary-foreground">
Create your first knowledge base to get started.
</p>
</div>
<div className="flex items-center gap-2">
<Button
onClick={handleCreateKnowledge}
className="!px-3 md:!px-4 md:!pl-3.5"
>
<ForwardedIconComponent
name="Plus"
aria-hidden="true"
className="h-4 w-4"
/>
<span className="whitespace-nowrap font-semibold">
Create Knowledge
</span>
</Button>
</div>
</div>
);
};
export default KnowledgeBaseEmptyState;

@@ -0,0 +1,97 @@
import ForwardedIconComponent from "@/components/common/genericIconComponent";
import { Button } from "@/components/ui/button";
import { useDeleteKnowledgeBases } from "@/controllers/API/queries/knowledge-bases/use-delete-knowledge-bases";
import DeleteConfirmationModal from "@/modals/deleteConfirmationModal";
import useAlertStore from "@/stores/alertStore";
import { cn } from "@/utils/utils";
interface KnowledgeBaseSelectionOverlayProps {
selectedFiles: any[];
quantitySelected: number;
onDelete?: () => void;
onClearSelection: () => void;
}
const KnowledgeBaseSelectionOverlay = ({
selectedFiles,
quantitySelected,
onDelete,
onClearSelection,
}: KnowledgeBaseSelectionOverlayProps) => {
const { setSuccessData, setErrorData } = useAlertStore((state) => ({
setSuccessData: state.setSuccessData,
setErrorData: state.setErrorData,
}));
const deleteMutation = useDeleteKnowledgeBases({
onSuccess: (data) => {
setSuccessData({
title: `${data.deleted_count} Knowledge Base(s) deleted successfully!`,
});
onClearSelection();
},
onError: (error: any) => {
setErrorData({
title: "Failed to delete knowledge bases",
list: [
error?.response?.data?.detail ||
error?.message ||
"An unknown error occurred",
],
});
onClearSelection();
},
});
const handleBulkDelete = () => {
if (onDelete) {
onDelete();
} else {
const knowledgeBaseIds = selectedFiles.map((file) => file.id);
if (knowledgeBaseIds.length > 0 && !deleteMutation.isPending) {
deleteMutation.mutate({ kb_names: knowledgeBaseIds });
}
}
};
const isVisible = selectedFiles.length > 0;
const pluralSuffix = quantitySelected > 1 ? "s" : "";
return (
<div
className={cn(
"pointer-events-none absolute top-1.5 z-50 flex h-8 w-full transition-opacity",
isVisible ? "opacity-100" : "opacity-0",
)}
>
<div
className={cn(
"ml-12 flex h-full flex-1 items-center justify-between bg-background",
isVisible ? "pointer-events-auto" : "pointer-events-none",
)}
>
<span className="text-xs text-muted-foreground">
{quantitySelected} selected
</span>
<div className="flex items-center gap-2">
<DeleteConfirmationModal
onConfirm={handleBulkDelete}
description={`knowledge base${pluralSuffix}`}
>
<Button
variant="destructive"
size="iconMd"
className="px-2.5 !text-mmd"
data-testid="bulk-delete-kb-btn"
>
<ForwardedIconComponent name="Trash2" />
Delete
</Button>
</DeleteConfirmationModal>
</div>
</div>
</div>
);
};
export default KnowledgeBaseSelectionOverlay;
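
Review note: the `onError` handlers for knowledge base deletion build their message from the same fallback chain (server `detail`, then `message`, then a fixed string). A minimal sketch of that chain as a shared helper follows. `ApiErrorLike` and `extractErrorMessage` are hypothetical names introduced for illustration and do not exist in this PR.

```typescript
// Hypothetical error shape covering only the fields the handlers read.
interface ApiErrorLike {
  response?: { data?: { detail?: string } };
  message?: string;
}

// Hypothetical helper: prefer the server-provided detail, then the generic
// error message, then a fixed fallback string.
function extractErrorMessage(error: ApiErrorLike | null | undefined): string {
  return (
    error?.response?.data?.detail ||
    error?.message ||
    "An unknown error occurred"
  );
}

const fromDetail = extractErrorMessage({
  response: { data: { detail: "Knowledge base not found" } },
  message: "Request failed",
});
const fromMessage = extractErrorMessage({ message: "Request failed" });
const fromFallback = extractErrorMessage(undefined);
```

A handler could then pass `list: [extractErrorMessage(error)]` to `setErrorData`, keeping the fallback wording in one place.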

@@ -0,0 +1,221 @@
import type {
NewValueParams,
RowClickedEvent,
SelectionChangedEvent,
} from "ag-grid-community";
import type { AgGridReact } from "ag-grid-react";
import { useRef, useState } from "react";
import TableComponent from "@/components/core/parameterRenderComponent/components/tableComponent";
import { Input } from "@/components/ui/input";
import Loading from "@/components/ui/loading";
import { useDeleteKnowledgeBase } from "@/controllers/API/queries/knowledge-bases/use-delete-knowledge-base";
import {
type KnowledgeBaseInfo,
useGetKnowledgeBases,
} from "@/controllers/API/queries/knowledge-bases/use-get-knowledge-bases";
import DeleteConfirmationModal from "@/modals/deleteConfirmationModal";
import useAlertStore from "@/stores/alertStore";
import { cn } from "@/utils/utils";
import { createKnowledgeBaseColumns } from "../config/knowledgeBaseColumns";
import KnowledgeBaseEmptyState from "./KnowledgeBaseEmptyState";
import KnowledgeBaseSelectionOverlay from "./KnowledgeBaseSelectionOverlay";
interface KnowledgeBasesTabProps {
quickFilterText: string;
setQuickFilterText: (text: string) => void;
selectedFiles: any[];
setSelectedFiles: (files: any[]) => void;
quantitySelected: number;
setQuantitySelected: (quantity: number) => void;
isShiftPressed: boolean;
onRowClick?: (knowledgeBase: KnowledgeBaseInfo) => void;
}
const KnowledgeBasesTab = ({
quickFilterText,
setQuickFilterText,
selectedFiles,
setSelectedFiles,
quantitySelected,
setQuantitySelected,
isShiftPressed,
onRowClick,
}: KnowledgeBasesTabProps) => {
const tableRef = useRef<AgGridReact<any>>(null);
const { setErrorData, setSuccessData } = useAlertStore((state) => ({
setErrorData: state.setErrorData,
setSuccessData: state.setSuccessData,
}));
const [isDeleteModalOpen, setIsDeleteModalOpen] = useState(false);
const [knowledgeBaseToDelete, setKnowledgeBaseToDelete] =
useState<KnowledgeBaseInfo | null>(null);
const { data: knowledgeBases, isLoading, error } = useGetKnowledgeBases();
const deleteKnowledgeBaseMutation = useDeleteKnowledgeBase(
{
kb_name: knowledgeBaseToDelete?.id || "",
},
{
onSuccess: () => {
setSuccessData({
title: `Knowledge Base "${knowledgeBaseToDelete?.name}" deleted successfully!`,
});
resetDeleteState();
},
onError: (error: any) => {
setErrorData({
title: "Failed to delete knowledge base",
list: [
error?.response?.data?.detail ||
error?.message ||
"An unknown error occurred",
],
});
resetDeleteState();
},
},
);
if (error) {
setErrorData({
title: "Failed to load knowledge bases",
list: [error?.message || "An unknown error occurred"],
});
}
const resetDeleteState = () => {
setKnowledgeBaseToDelete(null);
setIsDeleteModalOpen(false);
};
const handleRename = (params: NewValueParams<any, any>) => {
// Placeholder: no rename request is sent here; only a success notice is shown
setSuccessData({
title: "Knowledge Base renamed successfully!",
});
};
const handleDelete = (knowledgeBase: KnowledgeBaseInfo) => {
setKnowledgeBaseToDelete(knowledgeBase);
setIsDeleteModalOpen(true);
};
const confirmDelete = () => {
if (knowledgeBaseToDelete && !deleteKnowledgeBaseMutation.isPending) {
deleteKnowledgeBaseMutation.mutate();
}
};
const handleSelectionChange = (event: SelectionChangedEvent) => {
const selectedRows = event.api.getSelectedRows();
setSelectedFiles(selectedRows);
if (selectedRows.length > 0) {
setQuantitySelected(selectedRows.length);
} else {
setTimeout(() => {
setQuantitySelected(0);
}, 300);
}
};
const clearSelection = () => {
setQuantitySelected(0);
setSelectedFiles([]);
};
const handleRowClick = (event: RowClickedEvent) => {
const clickedElement = event.event?.target as HTMLElement;
if (clickedElement && !clickedElement.closest("button") && onRowClick) {
onRowClick(event.data);
}
};
const columnDefs = createKnowledgeBaseColumns(handleRename, handleDelete);
if (isLoading || !knowledgeBases || !Array.isArray(knowledgeBases)) {
return (
<div className="flex h-full w-full items-center justify-center">
<Loading />
</div>
);
}
if (knowledgeBases.length === 0) {
return <KnowledgeBaseEmptyState />;
}
return (
<div className="flex h-full flex-col pb-4">
<div className="flex justify-between">
<div className="flex w-full xl:w-5/12">
<Input
icon="Search"
data-testid="search-kb-input"
type="text"
placeholder="Search knowledge bases..."
className="mr-2 w-full"
value={quickFilterText || ""}
onChange={(event) => setQuickFilterText(event.target.value)}
/>
</div>
</div>
<div className="flex h-full flex-col pt-4">
<div className="relative h-full">
<TableComponent
rowHeight={45}
headerHeight={45}
cellSelection={false}
tableOptions={{
hide_options: true,
}}
suppressRowClickSelection={!isShiftPressed}
editable={[
{
field: "name",
onUpdate: handleRename,
editableCell: true,
},
]}
rowSelection="multiple"
onSelectionChanged={handleSelectionChange}
onRowClicked={handleRowClick}
columnDefs={columnDefs}
rowData={knowledgeBases}
className={cn(
"ag-no-border ag-knowledge-table group w-full",
isShiftPressed && quantitySelected > 0 && "no-select-cells",
)}
pagination
ref={tableRef}
quickFilterText={quickFilterText}
gridOptions={{
stopEditingWhenCellsLoseFocus: true,
ensureDomOrder: true,
colResizeDefault: "shift",
}}
/>
<KnowledgeBaseSelectionOverlay
selectedFiles={selectedFiles}
quantitySelected={quantitySelected}
onClearSelection={clearSelection}
/>
</div>
</div>
<DeleteConfirmationModal
open={isDeleteModalOpen}
setOpen={setIsDeleteModalOpen}
onConfirm={confirmDelete}
description={`knowledge base "${knowledgeBaseToDelete?.name || ""}"`}
note="This action cannot be undone"
>
<></>
</DeleteConfirmationModal>
</div>
);
};
export default KnowledgeBasesTab;

@@ -0,0 +1,163 @@
import { fireEvent, render, screen } from "@testing-library/react";
import React from "react";
// Mock the component to avoid complex dependency chains
jest.mock("../KnowledgeBaseDrawer", () => {
const MockKnowledgeBaseDrawer = ({ isOpen, onClose, knowledgeBase }: any) => {
if (!isOpen || !knowledgeBase) {
return null;
}
return (
<div
data-testid="knowledge-base-drawer"
className="w-80 border-l bg-background"
>
<div className="flex items-center justify-between p-4">
<h3>{knowledgeBase.name}</h3>
<button onClick={onClose} data-testid="close-button">
<span data-testid="icon-X">X</span>
</button>
</div>
<div className="p-4">
<div data-testid="description">No description available.</div>
<div data-testid="embedding-provider">
<label>Embedding Provider</label>
<div>{knowledgeBase.embedding_model || "Unknown"}</div>
</div>
<div data-testid="source-files">
<h4>Source Files</h4>
<div>No source files available.</div>
</div>
<div data-testid="linked-flows">
<h4>Linked Flows</h4>
<div>No linked flows available.</div>
</div>
</div>
</div>
);
};
MockKnowledgeBaseDrawer.displayName = "KnowledgeBaseDrawer";
return {
__esModule: true,
default: MockKnowledgeBaseDrawer,
};
});
const KnowledgeBaseDrawer = require("../KnowledgeBaseDrawer").default;
const mockKnowledgeBase = {
id: "kb-1",
name: "Test Knowledge Base",
embedding_provider: "OpenAI",
embedding_model: "text-embedding-ada-002",
size: 1024000,
words: 50000,
characters: 250000,
chunks: 100,
avg_chunk_size: 2500,
};
describe("KnowledgeBaseDrawer", () => {
const mockOnClose = jest.fn();
beforeEach(() => {
jest.clearAllMocks();
});
it("renders nothing when isOpen is false", () => {
const { container } = render(
<KnowledgeBaseDrawer
isOpen={false}
onClose={mockOnClose}
knowledgeBase={mockKnowledgeBase}
/>,
);
expect(container.firstChild).toBeNull();
});
it("renders nothing when knowledgeBase is null", () => {
const { container } = render(
<KnowledgeBaseDrawer
isOpen={true}
onClose={mockOnClose}
knowledgeBase={null}
/>,
);
expect(container.firstChild).toBeNull();
});
it("renders drawer when both isOpen is true and knowledgeBase is provided", () => {
render(
<KnowledgeBaseDrawer
isOpen={true}
onClose={mockOnClose}
knowledgeBase={mockKnowledgeBase}
/>,
);
expect(screen.getByTestId("knowledge-base-drawer")).toBeInTheDocument();
expect(screen.getByText("Test Knowledge Base")).toBeInTheDocument();
});
it("calls onClose when close button is clicked", () => {
render(
<KnowledgeBaseDrawer
isOpen={true}
onClose={mockOnClose}
knowledgeBase={mockKnowledgeBase}
/>,
);
const closeButton = screen.getByTestId("close-button");
fireEvent.click(closeButton);
expect(mockOnClose).toHaveBeenCalledTimes(1);
});
it("displays embedding model information", () => {
render(
<KnowledgeBaseDrawer
isOpen={true}
onClose={mockOnClose}
knowledgeBase={mockKnowledgeBase}
/>,
);
expect(screen.getByText("Embedding Provider")).toBeInTheDocument();
expect(screen.getByText("text-embedding-ada-002")).toBeInTheDocument();
});
it("displays Unknown for missing embedding model", () => {
const kbWithoutModel = {
...mockKnowledgeBase,
embedding_model: undefined,
};
render(
<KnowledgeBaseDrawer
isOpen={true}
onClose={mockOnClose}
knowledgeBase={kbWithoutModel}
/>,
);
expect(screen.getByText("Unknown")).toBeInTheDocument();
});
it("displays content sections", () => {
render(
<KnowledgeBaseDrawer
isOpen={true}
onClose={mockOnClose}
knowledgeBase={mockKnowledgeBase}
/>,
);
expect(screen.getByText("No description available.")).toBeInTheDocument();
expect(screen.getByText("Source Files")).toBeInTheDocument();
expect(screen.getByText("Linked Flows")).toBeInTheDocument();
});
});

@@ -0,0 +1,105 @@
import { QueryClient, QueryClientProvider } from "@tanstack/react-query";
import { fireEvent, render, screen, waitFor } from "@testing-library/react";
import React from "react";
import { BrowserRouter } from "react-router-dom";
// Mock all the dependencies to avoid complex imports
jest.mock("@/stores/flowsManagerStore", () => ({
__esModule: true,
default: jest.fn(),
}));
jest.mock("@/hooks/flows/use-add-flow", () => ({
__esModule: true,
default: jest.fn(),
}));
jest.mock("@/customization/hooks/use-custom-navigate", () => ({
useCustomNavigate: jest.fn(),
}));
jest.mock("@/stores/foldersStore", () => ({
useFolderStore: jest.fn(),
}));
jest.mock("@/customization/utils/analytics", () => ({
track: jest.fn(),
}));
jest.mock("@/utils/reactflowUtils", () => ({
updateIds: jest.fn(),
}));
// Mock the component itself to test in isolation
jest.mock("../KnowledgeBaseEmptyState", () => {
const MockKnowledgeBaseEmptyState = () => (
<div data-testid="knowledge-base-empty-state">
<h3>No knowledge bases</h3>
<p>Create your first knowledge base to get started.</p>
<button data-testid="create-knowledge-btn">Create Knowledge</button>
</div>
);
MockKnowledgeBaseEmptyState.displayName = "KnowledgeBaseEmptyState";
return {
__esModule: true,
default: MockKnowledgeBaseEmptyState,
};
});
const KnowledgeBaseEmptyState = require("../KnowledgeBaseEmptyState").default;
const createTestWrapper = () => {
const queryClient = new QueryClient({
defaultOptions: {
queries: { retry: false },
mutations: { retry: false },
},
});
return ({ children }: { children: React.ReactNode }) => (
<QueryClientProvider client={queryClient}>
<BrowserRouter>{children}</BrowserRouter>
</QueryClientProvider>
);
};
describe("KnowledgeBaseEmptyState", () => {
beforeEach(() => {
jest.clearAllMocks();
});
it("renders empty state message correctly", () => {
render(<KnowledgeBaseEmptyState />, { wrapper: createTestWrapper() });
expect(screen.getByText("No knowledge bases")).toBeInTheDocument();
expect(
screen.getByText("Create your first knowledge base to get started."),
).toBeInTheDocument();
});
it("renders create knowledge button", () => {
render(<KnowledgeBaseEmptyState />, { wrapper: createTestWrapper() });
const createButton = screen.getByTestId("create-knowledge-btn");
expect(createButton).toBeInTheDocument();
expect(createButton).toHaveTextContent("Create Knowledge");
});
it("handles create knowledge button click", () => {
render(<KnowledgeBaseEmptyState />, { wrapper: createTestWrapper() });
const createButton = screen.getByTestId("create-knowledge-btn");
fireEvent.click(createButton);
// The component is mocked, so this only verifies the click does not throw and the button stays rendered
expect(createButton).toBeInTheDocument();
});
it("renders with correct test id", () => {
render(<KnowledgeBaseEmptyState />, { wrapper: createTestWrapper() });
expect(
screen.getByTestId("knowledge-base-empty-state"),
).toBeInTheDocument();
});
});

@ -0,0 +1,173 @@
import { QueryClient, QueryClientProvider } from "@tanstack/react-query";
import { fireEvent, render, screen } from "@testing-library/react";
import React from "react";
// Mock the component to avoid complex dependency chains
jest.mock("../KnowledgeBaseSelectionOverlay", () => {
const MockKnowledgeBaseSelectionOverlay = ({
selectedFiles,
quantitySelected,
onClearSelection,
onDelete,
}: any) => {
const isVisible = selectedFiles.length > 0;
const pluralSuffix = quantitySelected > 1 ? "s" : "";
const handleDelete = () => {
if (onDelete) {
onDelete();
}
};
return (
<div
data-testid="selection-overlay"
className={isVisible ? "opacity-100" : "opacity-0"}
>
<span data-testid="selection-count">{quantitySelected} selected</span>
<button data-testid="bulk-delete-kb-btn" onClick={handleDelete}>
Delete
</button>
<button data-testid="clear-selection-btn" onClick={onClearSelection}>
Clear
</button>
<span data-testid="delete-description">
knowledge base{pluralSuffix}
</span>
</div>
);
};
MockKnowledgeBaseSelectionOverlay.displayName =
"KnowledgeBaseSelectionOverlay";
return {
__esModule: true,
default: MockKnowledgeBaseSelectionOverlay,
};
});
const KnowledgeBaseSelectionOverlay =
require("../KnowledgeBaseSelectionOverlay").default;
const createTestWrapper = () => {
const queryClient = new QueryClient({
defaultOptions: {
queries: { retry: false },
mutations: { retry: false },
},
});
return ({ children }: { children: React.ReactNode }) => (
<QueryClientProvider client={queryClient}>{children}</QueryClientProvider>
);
};
const mockSelectedFiles = [
{ id: "kb-1", name: "Knowledge Base 1" },
{ id: "kb-2", name: "Knowledge Base 2" },
];
describe("KnowledgeBaseSelectionOverlay", () => {
const mockOnClearSelection = jest.fn();
const mockOnDelete = jest.fn();
beforeEach(() => {
jest.clearAllMocks();
});
it("renders as invisible when no files are selected", () => {
render(
<KnowledgeBaseSelectionOverlay
selectedFiles={[]}
quantitySelected={0}
onClearSelection={mockOnClearSelection}
/>,
{ wrapper: createTestWrapper() },
);
const overlay = screen.getByTestId("selection-overlay");
expect(overlay).toHaveClass("opacity-0");
});
it("renders as visible when files are selected", () => {
render(
<KnowledgeBaseSelectionOverlay
selectedFiles={mockSelectedFiles}
quantitySelected={2}
onClearSelection={mockOnClearSelection}
/>,
{ wrapper: createTestWrapper() },
);
const overlay = screen.getByTestId("selection-overlay");
expect(overlay).toHaveClass("opacity-100");
});
it("displays correct selection count for single item", () => {
render(
<KnowledgeBaseSelectionOverlay
selectedFiles={[mockSelectedFiles[0]]}
quantitySelected={1}
onClearSelection={mockOnClearSelection}
/>,
{ wrapper: createTestWrapper() },
);
expect(screen.getByTestId("selection-count")).toHaveTextContent(
"1 selected",
);
expect(screen.getByTestId("delete-description")).toHaveTextContent(
"knowledge base",
);
});
it("displays correct selection count for multiple items", () => {
render(
<KnowledgeBaseSelectionOverlay
selectedFiles={mockSelectedFiles}
quantitySelected={2}
onClearSelection={mockOnClearSelection}
/>,
{ wrapper: createTestWrapper() },
);
expect(screen.getByTestId("selection-count")).toHaveTextContent(
"2 selected",
);
expect(screen.getByTestId("delete-description")).toHaveTextContent(
"knowledge bases",
);
});
it("calls custom onDelete when provided", () => {
render(
<KnowledgeBaseSelectionOverlay
selectedFiles={mockSelectedFiles}
quantitySelected={2}
onDelete={mockOnDelete}
onClearSelection={mockOnClearSelection}
/>,
{ wrapper: createTestWrapper() },
);
const deleteButton = screen.getByTestId("bulk-delete-kb-btn");
fireEvent.click(deleteButton);
expect(mockOnDelete).toHaveBeenCalledTimes(1);
});
it("calls onClearSelection when clear button is clicked", () => {
render(
<KnowledgeBaseSelectionOverlay
selectedFiles={mockSelectedFiles}
quantitySelected={2}
onClearSelection={mockOnClearSelection}
/>,
{ wrapper: createTestWrapper() },
);
const clearButton = screen.getByTestId("clear-selection-btn");
fireEvent.click(clearButton);
expect(mockOnClearSelection).toHaveBeenCalledTimes(1);
});
});

@ -0,0 +1,170 @@
import { QueryClient, QueryClientProvider } from "@tanstack/react-query";
import { fireEvent, render, screen } from "@testing-library/react";
import React from "react";
// Mock the component to avoid complex dependencies
jest.mock("../KnowledgeBasesTab", () => {
const MockKnowledgeBasesTab = ({
quickFilterText,
setQuickFilterText,
selectedFiles,
quantitySelected,
isShiftPressed,
onRowClick,
}: any) => (
<div data-testid="knowledge-bases-tab">
<input
data-testid="search-kb-input"
placeholder="Search knowledge bases..."
value={quickFilterText || ""}
onChange={(e) => setQuickFilterText?.(e.target.value)}
/>
<div data-testid="table-content">
<div>Mock Table</div>
<div data-testid="selected-count">
{selectedFiles?.length || 0} selected
</div>
<div data-testid="shift-pressed">
{isShiftPressed ? "Shift pressed" : "No shift"}
</div>
{onRowClick && (
<button
data-testid="mock-row-click"
onClick={() => onRowClick({ id: "kb-1", name: "Test KB" })}
>
Click Row
</button>
)}
</div>
</div>
);
MockKnowledgeBasesTab.displayName = "KnowledgeBasesTab";
return {
__esModule: true,
default: MockKnowledgeBasesTab,
};
});
const KnowledgeBasesTab = require("../KnowledgeBasesTab").default;
const createTestWrapper = () => {
const queryClient = new QueryClient({
defaultOptions: {
queries: { retry: false },
mutations: { retry: false },
},
});
return ({ children }: { children: React.ReactNode }) => (
<QueryClientProvider client={queryClient}>{children}</QueryClientProvider>
);
};
const defaultProps = {
quickFilterText: "",
setQuickFilterText: jest.fn(),
selectedFiles: [],
setSelectedFiles: jest.fn(),
quantitySelected: 0,
setQuantitySelected: jest.fn(),
isShiftPressed: false,
onRowClick: jest.fn(),
};
describe("KnowledgeBasesTab", () => {
beforeEach(() => {
jest.clearAllMocks();
});
it("renders search input with correct placeholder", () => {
render(<KnowledgeBasesTab {...defaultProps} />, {
wrapper: createTestWrapper(),
});
const searchInput = screen.getByTestId("search-kb-input");
expect(searchInput).toBeInTheDocument();
expect(searchInput).toHaveAttribute(
"placeholder",
"Search knowledge bases...",
);
});
it("handles search input changes", () => {
const mockSetQuickFilterText = jest.fn();
render(
<KnowledgeBasesTab
{...defaultProps}
setQuickFilterText={mockSetQuickFilterText}
/>,
{ wrapper: createTestWrapper() },
);
const searchInput = screen.getByTestId("search-kb-input");
fireEvent.change(searchInput, { target: { value: "test search" } });
expect(mockSetQuickFilterText).toHaveBeenCalledWith("test search");
});
it("displays search value in input", () => {
render(
<KnowledgeBasesTab {...defaultProps} quickFilterText="existing search" />,
{ wrapper: createTestWrapper() },
);
const searchInput = screen.getByTestId(
"search-kb-input",
) as HTMLInputElement;
expect(searchInput.value).toBe("existing search");
});
it("displays selected count", () => {
const selectedFiles = [{ id: "kb-1" }, { id: "kb-2" }];
render(
<KnowledgeBasesTab
{...defaultProps}
selectedFiles={selectedFiles}
quantitySelected={2}
/>,
{ wrapper: createTestWrapper() },
);
expect(screen.getByTestId("selected-count")).toHaveTextContent(
"2 selected",
);
});
it("displays shift key state", () => {
render(<KnowledgeBasesTab {...defaultProps} isShiftPressed={true} />, {
wrapper: createTestWrapper(),
});
expect(screen.getByTestId("shift-pressed")).toHaveTextContent(
"Shift pressed",
);
});
it("calls onRowClick when provided", () => {
const mockOnRowClick = jest.fn();
render(
<KnowledgeBasesTab {...defaultProps} onRowClick={mockOnRowClick} />,
{ wrapper: createTestWrapper() },
);
const rowButton = screen.getByTestId("mock-row-click");
fireEvent.click(rowButton);
expect(mockOnRowClick).toHaveBeenCalledWith({
id: "kb-1",
name: "Test KB",
});
});
it("renders table content", () => {
render(<KnowledgeBasesTab {...defaultProps} />, {
wrapper: createTestWrapper(),
});
expect(screen.getByTestId("table-content")).toBeInTheDocument();
expect(screen.getByText("Mock Table")).toBeInTheDocument();
});
});

@ -0,0 +1,126 @@
import { QueryClient, QueryClientProvider } from "@tanstack/react-query";
import React from "react";
import { BrowserRouter } from "react-router-dom";
import type { KnowledgeBaseInfo } from "@/controllers/API/queries/knowledge-bases/use-get-knowledge-bases";
/**
* Creates a test wrapper with React Query and Router providers
*/
export const createTestWrapper = () => {
const queryClient = new QueryClient({
defaultOptions: {
queries: { retry: false },
mutations: { retry: false },
},
});
return ({ children }: { children: React.ReactNode }) => (
<QueryClientProvider client={queryClient}>
<BrowserRouter>{children}</BrowserRouter>
</QueryClientProvider>
);
};
/**
* Mock knowledge base data for testing
*/
export const mockKnowledgeBase: KnowledgeBaseInfo = {
id: "kb-1",
name: "Test Knowledge Base",
embedding_provider: "OpenAI",
embedding_model: "text-embedding-ada-002",
size: 1024000,
words: 50000,
characters: 250000,
chunks: 100,
avg_chunk_size: 2500,
};
export const mockKnowledgeBaseList: KnowledgeBaseInfo[] = [
mockKnowledgeBase,
{
id: "kb-2",
name: "Second Knowledge Base",
embedding_provider: "Anthropic",
embedding_model: "claude-embedding",
size: 2048000,
words: 75000,
characters: 400000,
chunks: 150,
avg_chunk_size: 2666,
},
{
id: "kb-3",
name: "Third Knowledge Base",
embedding_model: undefined, // Test case for missing embedding model
size: 512000,
words: 25000,
characters: 125000,
chunks: 50,
avg_chunk_size: 2500,
},
];
/**
* Mock ForwardedIconComponent for consistent testing
*/
export const mockIconComponent = () => {
jest.mock("@/components/common/genericIconComponent", () => {
const MockedIcon = ({
name,
...props
}: {
name: string;
[key: string]: any;
}) => <span data-testid={`icon-${name}`} {...props} />;
MockedIcon.displayName = "ForwardedIconComponent";
return MockedIcon;
});
};
/**
* Mock TableComponent for testing components that use ag-grid
*/
export const mockTableComponent = () => {
jest.mock(
"@/components/core/parameterRenderComponent/components/tableComponent",
() => {
const MockTable = (props: any) => (
<div data-testid="mock-table" {...props}>
<div data-testid="table-content">Mock Table</div>
</div>
);
MockTable.displayName = "TableComponent";
return MockTable;
},
);
};
/**
* Common alert store mock setup
*/
export const setupAlertStoreMock = () => {
const mockSetSuccessData = jest.fn();
const mockSetErrorData = jest.fn();
return {
mockSetSuccessData,
mockSetErrorData,
mockAlertStore: {
setSuccessData: mockSetSuccessData,
setErrorData: mockSetErrorData,
},
};
};
/**
* Mock react-router-dom useParams hook
*/
export const mockUseParams = (
params: Record<string, string | undefined> = {},
) => {
jest.doMock("react-router-dom", () => ({
...jest.requireActual("react-router-dom"),
useParams: () => params,
}));
};

@ -0,0 +1,115 @@
import type { ColDef, NewValueParams } from "ag-grid-community";
import ForwardedIconComponent from "@/components/common/genericIconComponent";
import { Button } from "@/components/ui/button";
import { formatFileSize } from "@/utils/stringManipulation";
import {
formatAverageChunkSize,
formatNumber,
} from "../utils/knowledgeBaseUtils";
export const createKnowledgeBaseColumns = (
onRename?: (params: NewValueParams<any, any>) => void,
onDelete?: (knowledgeBase: any) => void,
): ColDef[] => {
const baseCellClass =
"text-muted-foreground cursor-pointer select-text group-[.no-select-cells]:cursor-default group-[.no-select-cells]:select-none";
return [
{
headerName: "Name",
field: "name",
flex: 2,
headerCheckboxSelection: true,
checkboxSelection: true,
editable: true,
filter: "agTextColumnFilter",
cellClass: baseCellClass,
cellRenderer: (params) => (
<div className="flex items-center gap-3 font-medium">
<div className="flex flex-col">
<div className="text-sm font-medium">{params.value}</div>
</div>
</div>
),
},
{
headerName: "Embedding Model",
field: "embedding_provider",
flex: 1.2,
filter: "agTextColumnFilter",
editable: false,
cellClass: baseCellClass,
tooltipValueGetter: (params) => params.data.embedding_model || "Unknown",
valueGetter: (params) => params.data.embedding_model || "Unknown",
},
{
headerName: "Size",
field: "size",
flex: 0.8,
valueFormatter: (params) => formatFileSize(params.value),
editable: false,
cellClass: baseCellClass,
},
{
headerName: "Words",
field: "words",
flex: 0.8,
editable: false,
cellClass: baseCellClass,
valueFormatter: (params) => formatNumber(params.value),
},
{
headerName: "Characters",
field: "characters",
flex: 1,
editable: false,
cellClass: baseCellClass,
valueFormatter: (params) => formatNumber(params.value),
},
{
headerName: "Chunks",
field: "chunks",
flex: 0.7,
editable: false,
cellClass: baseCellClass,
valueFormatter: (params) => formatNumber(params.value),
},
{
headerName: "Avg Chunks",
field: "avg_chunk_size",
flex: 1,
editable: false,
cellClass: baseCellClass,
valueFormatter: (params) => formatAverageChunkSize(params.value),
},
{
maxWidth: 60,
editable: false,
resizable: false,
cellClass: "cursor-default",
cellRenderer: (params) => {
const handleDeleteClick = () => {
if (onDelete) {
onDelete(params.data);
}
};
return (
<div className="flex h-full cursor-default items-center justify-center">
<Button
variant="ghost"
size="iconMd"
onClick={handleDeleteClick}
className="hover:bg-destructive/10"
>
<ForwardedIconComponent
name="Trash2"
className="h-4 w-4 text-destructive"
/>
</Button>
</div>
);
},
},
];
};
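The column factory above types its rows as `any`. A minimal sketch of a typed alternative for the embedding-model fallback, assuming the row shape matches the fields used in the test fixtures (`id`, `name`, `embedding_model`, and so on). `KnowledgeBaseRow` and `embeddingModelLabel` are hypothetical names introduced for illustration and are not part of this change:

```typescript
// Hypothetical row type: field names are taken from the mock data in the
// test utilities; which fields are optional is an assumption.
interface KnowledgeBaseRow {
  id: string;
  name: string;
  embedding_provider?: string;
  embedding_model?: string;
  size: number;
  words: number;
  characters: number;
  chunks: number;
  avg_chunk_size: number;
}

// Same fallback the column applies in valueGetter and tooltipValueGetter:
// a missing or empty model name is displayed as "Unknown".
const embeddingModelLabel = (row: KnowledgeBaseRow): string =>
  row.embedding_model || "Unknown";

const exampleRow: KnowledgeBaseRow = {
  id: "kb-3",
  name: "Third Knowledge Base",
  size: 512000,
  words: 25000,
  characters: 125000,
  chunks: 50,
  avg_chunk_size: 2500,
};

console.log(embeddingModelLabel(exampleRow)); // prints "Unknown"
```

Routing both getters through one helper like this would keep the cell value and its tooltip in sync if the fallback text ever changes.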

@ -1,43 +1,13 @@
import type {
ColDef,
NewValueParams,
SelectionChangedEvent,
} from "ag-grid-community";
import type { AgGridReact } from "ag-grid-react";
import { useEffect, useMemo, useRef, useState } from "react";
import { useEffect, useState } from "react";
import ForwardedIconComponent from "@/components/common/genericIconComponent";
import ShadTooltip from "@/components/common/shadTooltipComponent";
import CardsWrapComponent from "@/components/core/cardsWrapComponent";
import TableComponent from "@/components/core/parameterRenderComponent/components/tableComponent";
import { Button } from "@/components/ui/button";
import { Input } from "@/components/ui/input";
import Loading from "@/components/ui/loading";
import { SidebarTrigger } from "@/components/ui/sidebar";
import { useGetFilesV2 } from "@/controllers/API/queries/file-management";
import { useDeleteFilesV2 } from "@/controllers/API/queries/file-management/use-delete-files";
import { usePostRenameFileV2 } from "@/controllers/API/queries/file-management/use-put-rename-file";
import { useCustomHandleBulkFilesDownload } from "@/customization/hooks/use-custom-handle-bulk-files-download";
import { customPostUploadFileV2 } from "@/customization/hooks/use-custom-post-upload-file";
import useUploadFile from "@/hooks/files/use-upload-file";
import DeleteConfirmationModal from "@/modals/deleteConfirmationModal";
import FilesContextMenuComponent from "@/modals/fileManagerModal/components/filesContextMenuComponent";
import useAlertStore from "@/stores/alertStore";
import { formatFileSize } from "@/utils/stringManipulation";
import { FILE_ICONS } from "@/utils/styleUtils";
import { cn } from "@/utils/utils";
import { sortByDate } from "../../utils/sort-flows";
import DragWrapComponent from "./components/dragWrapComponent";
import FilesTab from "./components/FilesTab";
export const FilesPage = () => {
const tableRef = useRef<AgGridReact<any>>(null);
const { data: files } = useGetFilesV2();
const setErrorData = useAlertStore((state) => state.setErrorData);
const setSuccessData = useAlertStore((state) => state.setSuccessData);
const [selectedFiles, setSelectedFiles] = useState<any[]>([]);
const [quantitySelected, setQuantitySelected] = useState(0);
const [isShiftPressed, setIsShiftPressed] = useState(false);
const [isDownloading, setIsDownloading] = useState(false);
const [quickFilterText, setQuickFilterText] = useState("");
useEffect(() => {
const handleKeyDown = (e: KeyboardEvent) => {
@ -61,260 +31,16 @@ export const FilesPage = () => {
};
}, []);
const handleSelectionChanged = (event: SelectionChangedEvent) => {
const selectedRows = event.api.getSelectedRows();
setSelectedFiles(selectedRows);
if (selectedRows.length > 0) {
setQuantitySelected(selectedRows.length);
} else {
setTimeout(() => {
setQuantitySelected(0);
}, 300);
}
const tabProps = {
quickFilterText,
setQuickFilterText,
selectedFiles,
setSelectedFiles,
quantitySelected,
setQuantitySelected,
isShiftPressed,
};
const { mutate: rename } = usePostRenameFileV2();
const { mutate: deleteFiles, isPending: isDeleting } = useDeleteFilesV2();
const { handleBulkDownload } = useCustomHandleBulkFilesDownload();
const handleRename = (params: NewValueParams<any, any>) => {
rename({
id: params.data.id,
name: params.newValue,
});
};
const handleOpenRename = (id: string, name: string) => {
if (tableRef.current) {
tableRef.current.api.startEditingCell({
rowIndex: files?.findIndex((file) => file.id === id) ?? 0,
colKey: "name",
});
}
};
const uploadFile = useUploadFile({ multiple: true });
const handleUpload = async (files?: File[]) => {
try {
const filesIds = await uploadFile({
files: files,
});
setSuccessData({
title: `File${filesIds.length > 1 ? "s" : ""} uploaded successfully`,
});
} catch (error: any) {
setErrorData({
title: "Error uploading file",
list: [error.message || "An error occurred while uploading the file"],
});
}
};
const { mutate: uploadFileDirect } = customPostUploadFileV2();
useEffect(() => {
if (files) {
setQuantitySelected(0);
setSelectedFiles([]);
}
}, [files]);
const colDefs: ColDef[] = [
{
headerName: "Name",
field: "name",
flex: 2,
headerCheckboxSelection: true,
checkboxSelection: true,
editable: true,
filter: "agTextColumnFilter",
cellClass:
"cursor-text select-text group-[.no-select-cells]:cursor-default group-[.no-select-cells]:select-none",
cellRenderer: (params) => {
const type = params.data.path.split(".")[1]?.toLowerCase();
return (
<div className="flex items-center gap-4 font-medium">
{params.data.progress !== undefined &&
params.data.progress !== -1 ? (
<div className="flex h-6 items-center justify-center text-xs font-semibold text-muted-foreground">
{Math.round(params.data.progress * 100)}%
</div>
) : (
<div className="file-icon pointer-events-none relative">
<ForwardedIconComponent
name={FILE_ICONS[type]?.icon ?? "File"}
className={cn(
"-mx-[3px] h-6 w-6 shrink-0",
params.data.progress !== undefined
? "text-placeholder-foreground"
: (FILE_ICONS[type]?.color ?? undefined),
)}
/>
</div>
)}
<div
className={cn(
"flex items-center gap-2 text-sm font-medium",
params.data.progress !== undefined &&
params.data.progress === -1 &&
"pointer-events-none text-placeholder-foreground",
)}
>
{params.value}.{type}
</div>
{params.data.progress !== undefined &&
params.data.progress === -1 ? (
<span className="text-xs text-primary">
Upload failed,{" "}
<span
className="cursor-pointer text-accent-pink-foreground underline"
onClick={(e) => {
e.stopPropagation();
if (params.data.file) {
uploadFileDirect({ file: params.data.file });
}
}}
>
try again?
</span>
</span>
) : (
<></>
)}
</div>
);
}, //This column will be twice as wide as the others
}, //This column will be twice as wide as the others
{
headerName: "Type",
field: "path",
flex: 1,
filter: "agTextColumnFilter",
editable: false,
valueFormatter: (params) => {
return params.value.split(".")[1]?.toUpperCase();
},
cellClass:
"text-muted-foreground cursor-text select-text group-[.no-select-cells]:cursor-default group-[.no-select-cells]:select-none",
},
{
headerName: "Size",
field: "size",
flex: 1,
valueFormatter: (params) => {
return formatFileSize(params.value);
},
editable: false,
cellClass:
"text-muted-foreground cursor-text select-text group-[.no-select-cells]:cursor-default group-[.no-select-cells]:select-none",
},
{
headerName: "Modified",
field: "updated_at",
valueFormatter: (params) => {
return params.data.progress
? ""
: new Date(params.value + "Z").toLocaleString();
},
editable: false,
flex: 1,
resizable: false,
cellClass:
"text-muted-foreground cursor-text select-text group-[.no-select-cells]:cursor-default group-[.no-select-cells]:select-none",
},
{
maxWidth: 60,
editable: false,
resizable: false,
cellClass: "cursor-default",
cellRenderer: (params) => {
return (
<div className="flex h-full cursor-default items-center justify-center">
{!params.data.progress && (
<FilesContextMenuComponent
file={params.data}
handleRename={handleOpenRename}
>
<Button variant="ghost" size="iconMd">
<ForwardedIconComponent name="EllipsisVertical" />
</Button>
</FilesContextMenuComponent>
)}
</div>
);
},
},
];
const onFileDrop = async (e: React.DragEvent) => {
e.preventDefault;
e.stopPropagation();
const droppedFiles = Array.from(e.dataTransfer.files);
if (droppedFiles.length > 0) {
await handleUpload(droppedFiles);
}
};
const handleDownload = () => {
handleBulkDownload(
selectedFiles,
setSuccessData,
setErrorData,
setIsDownloading,
);
};
const handleDelete = () => {
deleteFiles(
{
ids: selectedFiles.map((file) => file.id),
},
{
onSuccess: (data) => {
setSuccessData({ title: data.message });
setQuantitySelected(0);
setSelectedFiles([]);
},
onError: (error) => {
setErrorData({
title: "Error deleting files",
list: [
error.message || "An error occurred while deleting the files",
],
});
},
},
);
};
const UploadButtonComponent = useMemo(() => {
return (
<ShadTooltip content="Upload File" side="bottom">
<Button
className="!px-3 md:!px-4 md:!pl-3.5"
onClick={async () => {
await handleUpload();
}}
id="upload-file-btn"
data-testid="upload-file-btn"
>
<ForwardedIconComponent
name="Plus"
aria-hidden="true"
className="h-4 w-4"
/>
<span className="hidden whitespace-nowrap font-semibold md:inline">
Upload
</span>
</Button>
</ShadTooltip>
);
}, [uploadFile]);
const [quickFilterText, setQuickFilterText] = useState("");
return (
<div
className="flex h-full w-full flex-col overflow-y-auto"
@ -338,149 +64,10 @@ export const FilesPage = () => {
</SidebarTrigger>
</div>
</div>
My Files
Files
</div>
{files && files.length !== 0 ? (
<div className="flex justify-between">
<div className="flex w-full xl:w-5/12">
<Input
icon="Search"
data-testid="search-store-input"
type="text"
placeholder={`Search files...`}
className="mr-2 w-full"
value={quickFilterText || ""}
onChange={(event) => {
setQuickFilterText(event.target.value);
}}
/>
</div>
<div className="flex items-center gap-2">
{UploadButtonComponent}
{/* <ImportButtonComponent /> */}
</div>
</div>
) : (
<></>
)}
<div className="flex h-full flex-col py-4">
{!files || !Array.isArray(files) ? (
<div className="flex h-full w-full items-center justify-center">
<Loading />
</div>
) : files.length > 0 ? (
<DragWrapComponent onFileDrop={onFileDrop}>
<div className="relative h-full">
<TableComponent
rowHeight={45}
headerHeight={45}
cellSelection={false}
tableOptions={{
hide_options: true,
}}
suppressRowClickSelection={!isShiftPressed}
editable={[
{
field: "name",
onUpdate: handleRename,
editableCell: true,
},
]}
rowSelection="multiple"
onSelectionChanged={handleSelectionChanged}
columnDefs={colDefs}
rowData={files.sort((a, b) => {
return sortByDate(
a.updated_at ?? a.created_at,
b.updated_at ?? b.created_at,
);
})}
className={cn(
"ag-no-border group w-full",
isShiftPressed &&
quantitySelected > 0 &&
"no-select-cells",
)}
pagination
ref={tableRef}
quickFilterText={quickFilterText}
gridOptions={{
stopEditingWhenCellsLoseFocus: true,
ensureDomOrder: true,
colResizeDefault: "shift",
}}
/>
<div
className={cn(
"pointer-events-none absolute top-1.5 z-50 flex h-8 w-full transition-opacity",
selectedFiles.length > 0 ? "opacity-100" : "opacity-0",
)}
>
<div
className={cn(
"ml-12 flex h-full flex-1 items-center justify-between bg-background",
selectedFiles.length > 0
? "pointer-events-auto"
: "pointer-events-none",
)}
>
<span className="text-xs text-muted-foreground">
{quantitySelected} selected
</span>
<div className="flex items-center gap-2">
<Button
variant="outline"
size="iconMd"
onClick={handleDownload}
loading={isDownloading}
data-testid="bulk-download-btn"
>
<ForwardedIconComponent name="Download" />
</Button>
<DeleteConfirmationModal
onConfirm={handleDelete}
description={
"file" + (quantitySelected > 1 ? "s" : "")
}
>
<Button
variant="destructive"
size="iconMd"
className="px-2.5 !text-mmd"
loading={isDeleting}
data-testid="bulk-delete-btn"
>
<ForwardedIconComponent name="Trash2" />
Delete
</Button>
</DeleteConfirmationModal>
</div>
</div>
</div>
</div>
</DragWrapComponent>
) : (
<CardsWrapComponent
onFileDrop={onFileDrop}
dragMessage="Drop files to upload"
>
<div className="flex h-full w-full flex-col items-center justify-center gap-8 pb-8">
<div className="flex flex-col items-center gap-2">
<h3 className="text-2xl font-semibold">No files</h3>
<p className="text-lg text-secondary-foreground">
Upload files or import from your preferred cloud.
</p>
</div>
<div className="flex items-center gap-2">
{UploadButtonComponent}
{/* <ImportButtonComponent /> */}
</div>
</div>
</CardsWrapComponent>
)}
<div className="flex h-full flex-col">
<FilesTab {...tabProps} />
</div>
</div>
</div>

@ -0,0 +1,73 @@
import { formatAverageChunkSize, formatNumber } from "../knowledgeBaseUtils";
describe("knowledgeBaseUtils", () => {
describe("formatNumber", () => {
it("formats numbers with commas for thousands", () => {
expect(formatNumber(1000)).toBe("1,000");
expect(formatNumber(1500)).toBe("1,500");
expect(formatNumber(10000)).toBe("10,000");
expect(formatNumber(100000)).toBe("100,000");
expect(formatNumber(1000000)).toBe("1,000,000");
});
it("handles numbers less than 1000 without commas", () => {
expect(formatNumber(0)).toBe("0");
expect(formatNumber(1)).toBe("1");
expect(formatNumber(99)).toBe("99");
expect(formatNumber(999)).toBe("999");
});
it("handles negative numbers", () => {
expect(formatNumber(-1000)).toBe("-1,000");
expect(formatNumber(-1500)).toBe("-1,500");
expect(formatNumber(-999)).toBe("-999");
});
it("handles decimal numbers by displaying them with decimals", () => {
expect(formatNumber(1000.5)).toBe("1,000.5");
expect(formatNumber(1999.9)).toBe("1,999.9");
expect(formatNumber(999.1)).toBe("999.1");
});
it("handles very large numbers", () => {
expect(formatNumber(1234567890)).toBe("1,234,567,890");
expect(formatNumber(987654321)).toBe("987,654,321");
});
});
describe("formatAverageChunkSize", () => {
it("formats average chunk size by rounding and formatting", () => {
expect(formatAverageChunkSize(1000.4)).toBe("1,000");
expect(formatAverageChunkSize(1000.6)).toBe("1,001");
expect(formatAverageChunkSize(2500)).toBe("2,500");
expect(formatAverageChunkSize(999.9)).toBe("1,000");
});
it("handles small decimal values", () => {
expect(formatAverageChunkSize(1.2)).toBe("1");
expect(formatAverageChunkSize(1.6)).toBe("2");
expect(formatAverageChunkSize(0.4)).toBe("0");
expect(formatAverageChunkSize(0.6)).toBe("1");
});
it("handles zero and negative values", () => {
expect(formatAverageChunkSize(0)).toBe("0");
expect(formatAverageChunkSize(-5.5)).toBe("-5");
expect(formatAverageChunkSize(-1000.4)).toBe("-1,000");
});
it("handles large decimal values", () => {
expect(formatAverageChunkSize(123456.7)).toBe("123,457");
expect(formatAverageChunkSize(999999.1)).toBe("999,999");
expect(formatAverageChunkSize(999999.9)).toBe("1,000,000");
});
it("handles edge cases", () => {
expect(formatAverageChunkSize(0.5)).toBe("1");
expect(formatAverageChunkSize(-0.5)).toBe("-0");
expect(formatAverageChunkSize(Number.MAX_SAFE_INTEGER)).toBe(
"9,007,199,254,740,991",
);
});
});
});

@ -0,0 +1,13 @@
/**
* Format a number with the runtime locale's grouping separators (commas in en-US)
*/
export const formatNumber = (num: number): string => {
return new Intl.NumberFormat().format(num);
};
/**
* Format average chunk size: round to the nearest integer, then add grouping separators (no unit is appended)
*/
export const formatAverageChunkSize = (avgChunkSize: number): string => {
return `${formatNumber(Math.round(avgChunkSize))}`;
};
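The tests for these helpers expect comma grouping (for example `"1,000"`), which `Intl.NumberFormat()` with no arguments produces only when the runtime's default locale groups digits that way. A minimal sketch of a locale-pinned variant, assuming `en-US` is the intended display locale. The names `formatNumberWithLocale` and `formatAverageChunkSizeWithLocale` are hypothetical and not part of this change:

```typescript
// Hypothetical variant (not in the PR): pin the locale so the output does not
// depend on the machine running the code. "en-US" is an assumed default.
const formatNumberWithLocale = (
  num: number,
  locale: string = "en-US",
): string => new Intl.NumberFormat(locale).format(num);

// Round to the nearest integer first, then apply the same grouping.
const formatAverageChunkSizeWithLocale = (
  avgChunkSize: number,
  locale: string = "en-US",
): string => formatNumberWithLocale(Math.round(avgChunkSize), locale);

console.log(formatNumberWithLocale(1234567)); // prints "1,234,567"
console.log(formatAverageChunkSizeWithLocale(999.9)); // prints "1,000"
```

Whether to pin the locale is a product decision: the version in this change follows the user's environment, while the sketch trades that for output that is stable across CI machines.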

@ -0,0 +1,244 @@
import { QueryClient, QueryClientProvider } from "@tanstack/react-query";
import { fireEvent, render, screen, waitFor } from "@testing-library/react";
import React from "react";
import { BrowserRouter } from "react-router-dom";
// Replace KnowledgePage with a stand-in that reimplements its shift-key and drawer state; the tests below exercise this stand-in
jest.mock("../index", () => {
const MockKnowledgePage = () => {
const [isShiftPressed, setIsShiftPressed] = React.useState(false);
const [isDrawerOpen, setIsDrawerOpen] = React.useState(false);
const [selectedKnowledgeBase, setSelectedKnowledgeBase] =
React.useState<any>(null);
React.useEffect(() => {
const handleKeyDown = (e: KeyboardEvent) => {
if (e.key === "Shift") {
setIsShiftPressed(true);
}
};
const handleKeyUp = (e: KeyboardEvent) => {
if (e.key === "Shift") {
setIsShiftPressed(false);
}
};
window.addEventListener("keydown", handleKeyDown);
window.addEventListener("keyup", handleKeyUp);
return () => {
window.removeEventListener("keydown", handleKeyDown);
window.removeEventListener("keyup", handleKeyUp);
};
}, []);
const handleRowClick = (knowledgeBase: any) => {
setSelectedKnowledgeBase(knowledgeBase);
setIsDrawerOpen(true);
};
const closeDrawer = () => {
setIsDrawerOpen(false);
setSelectedKnowledgeBase(null);
};
return (
<div className="flex h-full w-full" data-testid="cards-wrapper">
<div
className={`flex h-full w-full flex-col ${isDrawerOpen ? "mr-80" : ""}`}
>
<div className="flex h-full w-full flex-col xl:container">
<div className="flex flex-1 flex-col justify-start px-5 pt-10">
<div className="flex h-full flex-col justify-start">
<div
className="flex items-center pb-8 text-xl font-semibold"
data-testid="mainpage_title"
>
<button data-testid="sidebar-trigger">
<span data-testid="icon-PanelLeftOpen" />
</button>
Knowledge
</div>
<div className="flex h-full flex-col">
<div data-testid="knowledge-bases-tab">
<div>Quick Filter: </div>
<div>Selected Files: 0</div>
<div>Quantity Selected: 0</div>
<div>Shift Pressed: {isShiftPressed ? "Yes" : "No"}</div>
<button
data-testid="mock-row-click"
onClick={() =>
handleRowClick({ name: "Test Knowledge Base" })
}
>
Mock Row Click
</button>
</div>
</div>
</div>
</div>
</div>
</div>
{isDrawerOpen && (
<div className="fixed right-0 top-12 z-50 h-[calc(100vh-48px)]">
<div data-testid="knowledge-base-drawer">
<div>Drawer Open: Yes</div>
<div>Knowledge Base: {selectedKnowledgeBase?.name || "None"}</div>
<button data-testid="drawer-close" onClick={closeDrawer}>
Close Drawer
</button>
</div>
</div>
)}
{!isDrawerOpen && (
<div data-testid="knowledge-base-drawer">
<div>Drawer Open: No</div>
<div>Knowledge Base: None</div>
</div>
)}
</div>
);
};
MockKnowledgePage.displayName = "KnowledgePage";
return {
KnowledgePage: MockKnowledgePage,
};
});
const { KnowledgePage } = require("../index");
const createTestWrapper = () => {
const queryClient = new QueryClient({
defaultOptions: {
queries: { retry: false },
mutations: { retry: false },
},
});
return ({ children }: { children: React.ReactNode }) => (
<QueryClientProvider client={queryClient}>
<BrowserRouter>{children}</BrowserRouter>
</QueryClientProvider>
);
};
describe("KnowledgePage", () => {
beforeEach(() => {
jest.clearAllMocks();
});
it("renders page title correctly", () => {
render(<KnowledgePage />, { wrapper: createTestWrapper() });
expect(screen.getByTestId("mainpage_title")).toBeInTheDocument();
expect(screen.getByText("Knowledge")).toBeInTheDocument();
});
it("renders sidebar trigger", () => {
render(<KnowledgePage />, { wrapper: createTestWrapper() });
expect(screen.getByTestId("sidebar-trigger")).toBeInTheDocument();
expect(screen.getByTestId("icon-PanelLeftOpen")).toBeInTheDocument();
});
it("handles shift key press and release", async () => {
render(<KnowledgePage />, { wrapper: createTestWrapper() });
// Initially shift is not pressed
expect(screen.getByText("Shift Pressed: No")).toBeInTheDocument();
// Simulate shift key down
fireEvent.keyDown(window, { key: "Shift" });
await waitFor(() => {
expect(screen.getByText("Shift Pressed: Yes")).toBeInTheDocument();
});
// Simulate shift key up
fireEvent.keyUp(window, { key: "Shift" });
await waitFor(() => {
expect(screen.getByText("Shift Pressed: No")).toBeInTheDocument();
});
});
it("ignores non-shift key events", async () => {
render(<KnowledgePage />, { wrapper: createTestWrapper() });
expect(screen.getByText("Shift Pressed: No")).toBeInTheDocument();
// Simulate other key events
fireEvent.keyDown(window, { key: "Enter" });
fireEvent.keyUp(window, { key: "Enter" });
// Should still be false
expect(screen.getByText("Shift Pressed: No")).toBeInTheDocument();
});
it("initializes with drawer closed", () => {
render(<KnowledgePage />, { wrapper: createTestWrapper() });
expect(screen.getByText("Drawer Open: No")).toBeInTheDocument();
expect(screen.getByText("Knowledge Base: None")).toBeInTheDocument();
});
it("opens drawer when row is clicked", async () => {
render(<KnowledgePage />, { wrapper: createTestWrapper() });
// Initially drawer is closed
expect(screen.getByText("Drawer Open: No")).toBeInTheDocument();
// Click on a row
const rowClickButton = screen.getByTestId("mock-row-click");
fireEvent.click(rowClickButton);
await waitFor(() => {
expect(screen.getByText("Drawer Open: Yes")).toBeInTheDocument();
expect(
screen.getByText("Knowledge Base: Test Knowledge Base"),
).toBeInTheDocument();
});
});
it("closes drawer when close button is clicked", async () => {
render(<KnowledgePage />, { wrapper: createTestWrapper() });
// First open the drawer
const rowClickButton = screen.getByTestId("mock-row-click");
fireEvent.click(rowClickButton);
await waitFor(() => {
expect(screen.getByText("Drawer Open: Yes")).toBeInTheDocument();
});
// Now close the drawer
const closeButton = screen.getByTestId("drawer-close");
fireEvent.click(closeButton);
await waitFor(() => {
expect(screen.getByText("Drawer Open: No")).toBeInTheDocument();
expect(screen.getByText("Knowledge Base: None")).toBeInTheDocument();
});
});
it("adjusts layout when drawer is open", async () => {
render(<KnowledgePage />, { wrapper: createTestWrapper() });
const contentContainer = screen.getByTestId("cards-wrapper")
.firstChild as HTMLElement;
// Initially no margin adjustment
expect(contentContainer).not.toHaveClass("mr-80");
// Open drawer
const rowClickButton = screen.getByTestId("mock-row-click");
fireEvent.click(rowClickButton);
await waitFor(() => {
expect(contentContainer).toHaveClass("mr-80");
});
});
});
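
The two keyboard tests above ("handles shift key press and release" and "ignores non-shift key events") exercise a small state transition: only `Shift` events change the flag, and the event type decides the new value. A minimal sketch of that transition as a pure function follows. The names `nextShiftState` and `KeyEventType` are hypothetical and do not appear in the PR; the component itself uses `setIsShiftPressed` inside `keydown`/`keyup` listeners.

```typescript
// Hypothetical pure version of the keydown/keyup handlers; names are illustrative only.
type KeyEventType = "keydown" | "keyup";

function nextShiftState(
  current: boolean,
  eventType: KeyEventType,
  key: string,
): boolean {
  // Events for any key other than Shift leave the state untouched.
  if (key !== "Shift") return current;
  // Shift going down sets the flag; Shift coming up clears it.
  return eventType === "keydown";
}
```

Keeping the rule in a pure function like this would let it be checked without rendering the page or dispatching window events, though the mocked component test above covers the same behaviour end to end.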

@@ -0,0 +1,143 @@
import { useEffect, useRef, useState } from "react";
import ForwardedIconComponent from "@/components/common/genericIconComponent";
import { SidebarTrigger } from "@/components/ui/sidebar";
import type { KnowledgeBaseInfo } from "@/controllers/API/queries/knowledge-bases/use-get-knowledge-bases";
import KnowledgeBaseDrawer from "../filesPage/components/KnowledgeBaseDrawer";
import KnowledgeBasesTab from "../filesPage/components/KnowledgeBasesTab";
export const KnowledgePage = () => {
const [selectedKnowledgeBases, setSelectedKnowledgeBases] = useState<any[]>(
[],
);
const [selectionCount, setSelectionCount] = useState(0);
const [isShiftPressed, setIsShiftPressed] = useState(false);
const [searchText, setSearchText] = useState("");
const [isDrawerOpen, setIsDrawerOpen] = useState(false);
const [selectedKnowledgeBase, setSelectedKnowledgeBase] =
useState<KnowledgeBaseInfo | null>(null);
const drawerRef = useRef<HTMLDivElement>(null);
useEffect(() => {
const handleKeyDown = (e: KeyboardEvent) => {
if (e.key === "Shift") {
setIsShiftPressed(true);
}
};
const handleKeyUp = (e: KeyboardEvent) => {
if (e.key === "Shift") {
setIsShiftPressed(false);
}
};
window.addEventListener("keydown", handleKeyDown);
window.addEventListener("keyup", handleKeyUp);
return () => {
window.removeEventListener("keydown", handleKeyDown);
window.removeEventListener("keyup", handleKeyUp);
};
}, []);
useEffect(() => {
const handleClickOutside = (event: MouseEvent) => {
if (
isDrawerOpen &&
drawerRef.current &&
!drawerRef.current.contains(event.target as Node)
) {
const clickedElement = event.target as HTMLElement;
const isTableRowClick = clickedElement.closest(".ag-row");
if (!isTableRowClick) {
closeDrawer();
}
}
};
if (isDrawerOpen) {
document.addEventListener("mousedown", handleClickOutside);
}
return () => {
document.removeEventListener("mousedown", handleClickOutside);
};
}, [isDrawerOpen]);
const handleKnowledgeBaseSelect = (knowledgeBase: KnowledgeBaseInfo) => {
if (isDrawerOpen) {
closeDrawer();
} else {
setSelectedKnowledgeBase(knowledgeBase);
// setIsDrawerOpen(true);
}
};
const closeDrawer = () => {
setIsDrawerOpen(false);
setSelectedKnowledgeBase(null);
};
const tabProps = {
quickFilterText: searchText,
setQuickFilterText: setSearchText,
selectedFiles: selectedKnowledgeBases,
setSelectedFiles: setSelectedKnowledgeBases,
quantitySelected: selectionCount,
setQuantitySelected: setSelectionCount,
isShiftPressed,
onRowClick: handleKnowledgeBaseSelect,
};
return (
<div className="flex h-full w-full" data-testid="cards-wrapper">
<div
className={`flex h-full w-full flex-col overflow-y-auto transition-all duration-200 ${
isDrawerOpen ? "mr-80" : ""
}`}
>
<div className="flex h-full w-full flex-col xl:container">
<div className="flex flex-1 flex-col justify-start px-5 pt-10">
<div className="flex h-full flex-col justify-start">
<div
className="flex items-center pb-8 text-xl font-semibold"
data-testid="mainpage_title"
>
<div className="h-7 w-10 transition-all group-data-[open=true]/sidebar-wrapper:md:w-0 lg:hidden">
<div className="relative left-0 opacity-100 transition-all group-data-[open=true]/sidebar-wrapper:md:opacity-0">
<SidebarTrigger>
<ForwardedIconComponent
name="PanelLeftOpen"
aria-hidden="true"
/>
</SidebarTrigger>
</div>
</div>
Knowledge
</div>
<div className="flex h-full flex-col">
<KnowledgeBasesTab {...tabProps} />
</div>
</div>
</div>
</div>
</div>
{isDrawerOpen && (
<div
ref={drawerRef}
className="fixed right-0 top-12 z-50 h-[calc(100vh-48px)]"
>
<KnowledgeBaseDrawer
isOpen={isDrawerOpen}
onClose={closeDrawer}
knowledgeBase={selectedKnowledgeBase}
/>
</div>
)}
</div>
);
};
export default KnowledgePage;
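
The outside-click effect in `KnowledgePage` reduces to one decision rule: close the drawer only when it is open, the click landed outside the drawer element, and the click was not on a table row (`.ag-row`). A minimal sketch of that rule as a pure function follows. `shouldCloseDrawer` and `ClickContext` are hypothetical names, not part of the PR, and the sketch assumes the drawer element is mounted (the component additionally checks `drawerRef.current`).

```typescript
// Hypothetical helper, not part of the PR: isolates the outside-click decision
// so it can be reasoned about without a DOM.
interface ClickContext {
  isDrawerOpen: boolean;
  // Corresponds to drawerRef.current.contains(event.target) in the component.
  clickedInsideDrawer: boolean;
  // Corresponds to event.target.closest(".ag-row") being non-null.
  clickedOnTableRow: boolean;
}

function shouldCloseDrawer(ctx: ClickContext): boolean {
  // Nothing to close when the drawer is not open.
  if (!ctx.isDrawerOpen) return false;
  // Clicks inside the drawer must not dismiss it.
  if (ctx.clickedInsideDrawer) return false;
  // Row clicks are left to the row click handler instead of closing here.
  return !ctx.clickedOnTableRow;
}
```

Excluding row clicks matters because the row click handler decides what happens to the drawer on its own; closing it from the document listener as well would make the two handlers act on the same click.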

@@ -69,7 +69,7 @@ export default function CollectionPage(): JSX.Element {
setOpenDeleteFolderModal(true);
}}
handleFilesClick={() => {
-navigate("files");
+navigate("assets");
}}
/>
)}

@@ -26,6 +26,7 @@ import FlowPage from "./pages/FlowPage";
import LoginPage from "./pages/LoginPage";
import FilesPage from "./pages/MainPage/pages/filesPage";
import HomePage from "./pages/MainPage/pages/homePage";
import KnowledgePage from "./pages/MainPage/pages/knowledgePage";
import CollectionPage from "./pages/MainPage/pages/main-page";
import SettingsPage from "./pages/SettingsPage";
import ApiKeysPage from "./pages/SettingsPage/pages/ApiKeysPage";
@@ -82,7 +83,17 @@ const router = createBrowserRouter(
element={<CustomNavigate replace to={"flows"} />}
/>
{ENABLE_FILE_MANAGEMENT && (
-<Route path="files" element={<FilesPage />} />
+<Route path="assets">
+<Route
+index
+element={<CustomNavigate replace to="files" />}
+/>
+<Route path="files" element={<FilesPage />} />
+<Route
+path="knowledge-bases"
+element={<KnowledgePage />}
+/>
+</Route>
)}
<Route
path="flows/"
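
The hunk above replaces the flat `files` route with a nested `assets` route: the index entry redirects to `files`, and `files` and `knowledge-bases` render `FilesPage` and `KnowledgePage`. A rough model of how paths under that parent would resolve is sketched below. `resolveAssetsRoute` and `RouteResult` are hypothetical names, the model assumes `CustomNavigate` resolves `to="files"` relative to the `assets` route (as React Router's `Navigate` does), and it only applies when `ENABLE_FILE_MANAGEMENT` is true.

```typescript
// Hypothetical model of the nested "assets" routes; not code from the PR.
type RouteResult =
  | { kind: "redirect"; to: string }
  | { kind: "page"; name: "FilesPage" | "KnowledgePage" }
  | { kind: "no-match" };

function resolveAssetsRoute(path: string): RouteResult {
  // Drop empty segments so leading and trailing slashes do not matter.
  const segments = path.split("/").filter((segment) => segment.length > 0);
  if (segments[0] !== "assets" || segments.length > 2) {
    return { kind: "no-match" };
  }
  // Index route: redirect to the Files tab (assumed relative resolution).
  if (segments.length === 1) {
    return { kind: "redirect", to: "assets/files" };
  }
  if (segments[1] === "files") {
    return { kind: "page", name: "FilesPage" };
  }
  if (segments[1] === "knowledge-bases") {
    return { kind: "page", name: "KnowledgePage" };
  }
  return { kind: "no-match" };
}
```

Under this model, the `navigate("assets")` call changed in `CollectionPage` lands on the index route and is redirected to the Files tab, which matches the updated end-to-end expectation that the page title contains "Files".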

@@ -182,3 +182,13 @@
.ag-tool-mode .ag-layout-auto-height .ag-center-cols-viewport {
min-height: 0px !important;
}
/* Knowledge Base Table - Always show checkboxes */
.ag-knowledge-table .ag-selection-checkbox .ag-checkbox {
width: 32px !important;
opacity: 1 !important;
}
.ag-knowledge-table .ag-header-checkbox {
opacity: 1 !important;
}

@@ -624,7 +624,7 @@ test(
// Check if we're on the files page
await page.waitForSelector('[data-testid="mainpage_title"]');
const title = await page.getByTestId("mainpage_title");
-expect(await title.textContent()).toContain("My Files");
+expect(await title.textContent()).toContain("Files");
// Upload the PNG file
const fileChooserPromisePng = page.waitForEvent("filechooser");

@@ -30,7 +30,7 @@ test(
// Check if we're on the files page
await page.waitForSelector('[data-testid="mainpage_title"]');
const title = await page.getByTestId("mainpage_title");
-expect(await title.textContent()).toContain("My Files");
+expect(await title.textContent()).toContain("Files");
// Check for empty state when no files are present
const noFilesText = await page.getByText("No files");

uv.lock (generated)

@@ -1481,6 +1481,86 @@ toml = [
{ name = "tomli", marker = "python_full_version <= '3.11'" },
]
[[package]]
name = "cramjam"
version = "2.10.0"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/e9/dc/ccc87820b189e35323433e80de450bf2fb8826a5b64834c740e7d5e66ce2/cramjam-2.10.0.tar.gz", hash = "sha256:e821dd487384ae8004e977c3b13135ad6665ccf8c9874e68441cad1146e66d8a", size = 47801, upload-time = "2025-04-12T18:00:10.025Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/f0/83/3e5f558aebb0064b1d7b197869055118ee849ccc5d7a86520ba751a79cb9/cramjam-2.10.0-cp310-cp310-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:26c44f17938cf00a339899ce6ea7ba12af7b1210d707a80a7f14724fba39869b", size = 3514239, upload-time = "2025-04-12T17:56:47.464Z" },
{ url = "https://files.pythonhosted.org/packages/5d/34/de70de0a7e675d72d78b50f326451ea854f7f12608d3e093423bbe8fae1c/cramjam-2.10.0-cp310-cp310-macosx_10_12_x86_64.whl", hash = "sha256:ce208a3e4043b8ce89e5d90047da16882456ea395577b1ee07e8215dce7d7c91", size = 1841404, upload-time = "2025-04-12T17:56:50.396Z" },
{ url = "https://files.pythonhosted.org/packages/77/ae/5e12b524eb98c03a3c24c243c52894b633ee86c03c36c5e4b5d4738a6567/cramjam-2.10.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:2c24907c972aca7b56c8326307e15d78f56199852dda1e67e4e54c2672afede4", size = 1678655, upload-time = "2025-04-12T17:56:52.62Z" },
{ url = "https://files.pythonhosted.org/packages/3a/d7/5adbd0b7bb55c5e40356949417e61ac4f950d656a49a8697a08a8b01d724/cramjam-2.10.0-cp310-cp310-manylinux_2_12_i686.manylinux2010_i686.whl", hash = "sha256:f25db473667774725e4f34e738d644ffb205bf0bdc0e8146870a1104c5f42e4a", size = 2019539, upload-time = "2025-04-12T17:56:54.177Z" },
{ url = "https://files.pythonhosted.org/packages/db/c4/0cf4c9591b04a8e187df60defd920e3bb905b0db5a41d43e96213a0204d8/cramjam-2.10.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:51eb00c72d4a93e4a2ddcc751ba2a7a1318026247e80742866912ec82b39e5ce", size = 1752221, upload-time = "2025-04-12T17:56:56.379Z" },
{ url = "https://files.pythonhosted.org/packages/f5/ca/0d06de89c531b4acf9782775a1527d1d498dc13f7abaa427c665a17ce86f/cramjam-2.10.0-cp310-cp310-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:def47645b1b970fd97f063da852b0ddc4f5bdee9af8d5b718d9682c7b828d89d", size = 1848859, upload-time = "2025-04-12T17:56:57.987Z" },
{ url = "https://files.pythonhosted.org/packages/b8/2e/f7f04638bd26808b9f4d03e988de12a06ca5db4551897c780a756ce44384/cramjam-2.10.0-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:42dcd7c83104edae70004a8dc494e4e57de4940e3019e5d2cbec2830d5908a85", size = 2003282, upload-time = "2025-04-12T17:56:59.647Z" },
{ url = "https://files.pythonhosted.org/packages/83/06/e2048df7a8e1b05a089c25ca0ac1b17c7aa4108c8d6328bf1f74314701b7/cramjam-2.10.0-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:e0744e391ea8baf0ddea5a180b0aa71a6a302490c14d7a37add730bf0172c7c6", size = 2312472, upload-time = "2025-04-12T17:57:01.264Z" },
{ url = "https://files.pythonhosted.org/packages/aa/f5/5826951d6398d7f11baaef0ff15d510f7e90af2338af0a92d872adc51f70/cramjam-2.10.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:5018c7414047f640b126df02e9286a8da7cc620798cea2b39bac79731c2ee336", size = 1964217, upload-time = "2025-04-12T17:57:03.415Z" },
{ url = "https://files.pythonhosted.org/packages/fd/4c/9a1282c4650a1aba666947214a1437973757463e9c60994c497fb9cb5cf5/cramjam-2.10.0-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:4b201aacc7a06079b063cfbcf5efe78b1e65c7279b2828d06ffaa90a8316579d", size = 2022270, upload-time = "2025-04-12T17:57:05.082Z" },
{ url = "https://files.pythonhosted.org/packages/ac/e0/b78ab4ee7bcbd6116fdfe54cd771019bcc0d9039b81b070fe2780363c6f2/cramjam-2.10.0-cp310-cp310-musllinux_1_1_armv7l.whl", hash = "sha256:5264ac242697fbb1cfffa79d0153cbc4c088538bd99d60cfa374e8a8b83e2bb5", size = 2152240, upload-time = "2025-04-12T17:57:06.737Z" },
{ url = "https://files.pythonhosted.org/packages/94/0d/df2299892a7fa9b5d973111e81ee6772aaf27cc0489da41a34e66efe3cd5/cramjam-2.10.0-cp310-cp310-musllinux_1_1_i686.whl", hash = "sha256:e193918c81139361f3f45db19696d31847601f2c0e79a38618f34d7bff6ee704", size = 2164031, upload-time = "2025-04-12T17:57:08.319Z" },
{ url = "https://files.pythonhosted.org/packages/ee/39/67cc689fcba789076890c980472a40653749d91a8dc3165a8913a84f5670/cramjam-2.10.0-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:22a7ab05c62b0a71fcd6db4274af1508c5ea039a43fb143ac50a62f86e6f32f7", size = 2134442, upload-time = "2025-04-12T17:57:09.892Z" },
{ url = "https://files.pythonhosted.org/packages/85/4c/cd4bc9f05d76a127372b991e819b9eefd05a296adfc4f99ba0471033b528/cramjam-2.10.0-cp310-cp310-win32.whl", hash = "sha256:2464bdf0e2432e0f07a834f48c16022cd7f4648ed18badf52c32c13d6722518c", size = 1598011, upload-time = "2025-04-12T17:57:11.978Z" },
{ url = "https://files.pythonhosted.org/packages/4f/73/8ea115e1bcda57de7793211bd6b425bddffecd79a6b6d6a424ceaeed52bf/cramjam-2.10.0-cp310-cp310-win_amd64.whl", hash = "sha256:73b6ffc8ffe6546462ccc7e34ca3acd9eb3984e1232645f498544a7eab6b8aca", size = 1700050, upload-time = "2025-04-12T17:57:14.266Z" },
{ url = "https://files.pythonhosted.org/packages/15/a3/493dd4a4791ae14e4011d5fe7082a7aca8d31255f5cb50f930ede68561ce/cramjam-2.10.0-cp311-cp311-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:fb73ee9616e3efd2cf3857b019c66f9bf287bb47139ea48425850da2ae508670", size = 3514540, upload-time = "2025-04-12T17:57:15.956Z" },
{ url = "https://files.pythonhosted.org/packages/7a/26/22a5f8d408a0799b960ffcfa97f28c851e5800a904ef69988c3816819f79/cramjam-2.10.0-cp311-cp311-macosx_10_12_x86_64.whl", hash = "sha256:acef0e2c4d9f38428721a0ec878dee3fb73a35e640593d99c9803457dbb65214", size = 1841685, upload-time = "2025-04-12T17:57:18.201Z" },
{ url = "https://files.pythonhosted.org/packages/33/e8/76d0ae48c64007542b5563ae81712cf1c571f0bbbab45b778112e61c92b7/cramjam-2.10.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:5b21b1672814ecce88f1da76635f0483d2d877d4cb8998db3692792f46279bf1", size = 1678629, upload-time = "2025-04-12T17:57:19.912Z" },
{ url = "https://files.pythonhosted.org/packages/61/a1/cf686e49740404b8a336e8134c5c22a0c2de64f918db0081b80d01682b5f/cramjam-2.10.0-cp311-cp311-manylinux_2_12_i686.manylinux2010_i686.whl", hash = "sha256:7699d61c712bc77907c48fe63a21fffa03c4dd70401e1d14e368af031fde7c21", size = 2019846, upload-time = "2025-04-12T17:57:21.543Z" },
{ url = "https://files.pythonhosted.org/packages/f1/f7/91b3bd99d903567ca2fd76fc600b4ce08a85e6c4800fc94f505ef9cf486e/cramjam-2.10.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:3484f1595eef64cefed05804d7ec8a88695f89086c49b086634e44c16f3d4769", size = 1752196, upload-time = "2025-04-12T17:57:23.34Z" },
{ url = "https://files.pythonhosted.org/packages/0d/b4/3c9f9f32197c0ad7b33cc99bdf786c2bd4ccf97fdb82b07b6b211c896744/cramjam-2.10.0-cp311-cp311-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:38fba4594dd0e2b7423ef403039e63774086ebb0696d9060db20093f18a2f43e", size = 1849188, upload-time = "2025-04-12T17:57:25.009Z" },
{ url = "https://files.pythonhosted.org/packages/93/f6/9b35acb94bcab5e2089a1ff4268a3b40cd640b4200e82a4d5bf419e6a64e/cramjam-2.10.0-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:b07fe3e48c881a75a11f722e1d5b052173b5e7c78b22518f659b8c9b4ac4c937", size = 2003528, upload-time = "2025-04-12T17:57:27.224Z" },
{ url = "https://files.pythonhosted.org/packages/13/4e/0c92d0c2ac978d1a95d6ff00095e5abbaeba766b5ff531d9700212db480e/cramjam-2.10.0-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:3596b6ceaf85f872c1e56295c6ec80bb15fdd71e7ed9e0e5c3e654563dcc40a2", size = 2311664, upload-time = "2025-04-12T17:57:30.335Z" },
{ url = "https://files.pythonhosted.org/packages/84/ed/1db09adb133c569afd98b3f507ff372a39c3c7947cd0c42e161b5e6e13aa/cramjam-2.10.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e1c03360c1760f8608dc5ce1ddd7e5491180765360cae8104b428d5f86fbe1b9", size = 1964336, upload-time = "2025-04-12T17:57:32.023Z" },
{ url = "https://files.pythonhosted.org/packages/94/52/f7a45ba637a53bdde08fa98440341d04d7395de27a33dfd51b1211e35677/cramjam-2.10.0-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:3e0b70fe7796b63b87cb7ebfaad0ebaca7574fdf177311952f74b8bda6522fb8", size = 2022247, upload-time = "2025-04-12T17:57:34.334Z" },
{ url = "https://files.pythonhosted.org/packages/92/13/b2f101f98adbb1134d5f3a6ffd5859f88de705325e7eeeea8d57b0c106cd/cramjam-2.10.0-cp311-cp311-musllinux_1_1_armv7l.whl", hash = "sha256:d61a21e4153589bd53ffe71b553f93f2afbc8fb7baf63c91a83c933347473083", size = 2152365, upload-time = "2025-04-12T17:57:35.988Z" },
{ url = "https://files.pythonhosted.org/packages/19/62/85fe4091085a2d0cbe1c6271aad8f678434680fbedc9ab9fb694186c6551/cramjam-2.10.0-cp311-cp311-musllinux_1_1_i686.whl", hash = "sha256:91ab85752a08dc875a05742cfda0234d7a70fadda07dd0b0582cfe991911f332", size = 2164416, upload-time = "2025-04-12T17:57:37.906Z" },
{ url = "https://files.pythonhosted.org/packages/63/3c/039bbde86826d13c6d328de70fed824cd7c2ab830d0c8b3fbdf4f61fc4e4/cramjam-2.10.0-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:c6afff7e9da53afb8d11eae27a20ee5709e2943b39af6c949b38424d0f271569", size = 2134635, upload-time = "2025-04-12T17:57:39.708Z" },
{ url = "https://files.pythonhosted.org/packages/ee/69/77703decb6b354bed28adcf81b423e0085ce816a80102f1e395c81b68cf6/cramjam-2.10.0-cp311-cp311-win32.whl", hash = "sha256:adf484b06063134ae604d4fc826d942af7e751c9d0b2fcab5bf1058a8ebe242b", size = 1598155, upload-time = "2025-04-12T17:57:41.896Z" },
{ url = "https://files.pythonhosted.org/packages/00/ba/6e7ba6bbc6bde49b62ddcbc0a670ae099d99bf5c7c5bfc3b1134aa9e2de7/cramjam-2.10.0-cp311-cp311-win_amd64.whl", hash = "sha256:9e20ebea6ec77232cd12e4084c8be6d03534dc5f3d027d365b32766beafce6c3", size = 1700119, upload-time = "2025-04-12T17:57:43.659Z" },
{ url = "https://files.pythonhosted.org/packages/00/50/09b2cdeee0e757a902cb25559783b0d81aeea2b055034de55f57db64152f/cramjam-2.10.0-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:0acb17e3681138b48300b27d3409742c81d5734ec39c650a60a764c135197840", size = 3503057, upload-time = "2025-04-12T17:57:45.698Z" },
{ url = "https://files.pythonhosted.org/packages/66/53/6baa9ef73833bd609df07c4334dccb3f7d2d43c4750f5fffadc878dbc2c9/cramjam-2.10.0-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:647553c44cf6b5ce2d9b56e743cc1eab886940d776b36438183e807bb5a7a42b", size = 1836184, upload-time = "2025-04-12T17:57:47.391Z" },
{ url = "https://files.pythonhosted.org/packages/b9/53/514dbdda46c5ce2d32f7d92d2aa570c7b47f78d7cc6fd79ee3db4ac2dd2a/cramjam-2.10.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:5c52805c7ccb533fe42d3d36c91d237c97c3b6551cd6b32f98b79eeb30d0f139", size = 1674041, upload-time = "2025-04-12T17:57:49.229Z" },
{ url = "https://files.pythonhosted.org/packages/fc/b8/07b88ee64f548ccd6d7f49589b8e5dffb5526e56572acee1a19fbd74cd5a/cramjam-2.10.0-cp312-cp312-manylinux_2_12_i686.manylinux2010_i686.whl", hash = "sha256:337ceb50bde7708b2a4068f3000625c23ceb1b2497edce2e21fd08ef58549170", size = 2020058, upload-time = "2025-04-12T17:57:51.128Z" },
{ url = "https://files.pythonhosted.org/packages/ab/bc/6ffdb375a7699751ea6341704b56050c8df428485e8363962cd6a87d3ab8/cramjam-2.10.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:1c071765bdd5eefa3b2157a61e84d72e161b63f95eb702a0133fee293800a619", size = 1747828, upload-time = "2025-04-12T17:57:54.223Z" },
{ url = "https://files.pythonhosted.org/packages/4e/46/45e7eb96960fbbf30b280142488b61afd7092a2430414f2539c72adf292e/cramjam-2.10.0-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:8b40d46d2aa566f8e3def953279cce0191e47364b453cda492db12a84dd97f78", size = 1850669, upload-time = "2025-04-12T17:57:56.308Z" },
{ url = "https://files.pythonhosted.org/packages/ba/46/0ff7c54a9e649ad092bbbcaa21ae2535d8f53687c04836421bd4f930d780/cramjam-2.10.0-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:4c7bab3703babb93c9dd4444ac9797d01ec46cf521e247d3319bfb292414d053", size = 1998309, upload-time = "2025-04-12T17:57:58.763Z" },
{ url = "https://files.pythonhosted.org/packages/1d/16/387beef4365f86ce3a45812d93e9ce230a2d7cd4ff0d81f7aad84a55d0d5/cramjam-2.10.0-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:ba19308b8e19cdaadfbf47142f52b705d2cbfb8edd84a8271573e50fa7fa022d", size = 2361331, upload-time = "2025-04-12T17:58:00.42Z" },
{ url = "https://files.pythonhosted.org/packages/6f/5e/2d9fa4d310c9fa7b1db0ba9f27ea64f2975810bb18ba64f2c13e5e5728c9/cramjam-2.10.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:de3e4be5aa71b73c2640c9b86e435ec033592f7f79787937f8342259106a63ae", size = 1962253, upload-time = "2025-04-12T17:58:02.674Z" },
{ url = "https://files.pythonhosted.org/packages/a7/e7/00debcc4589b6b4a2b6d7a1d523eb09683f7a3cfea9d0a1f67ab20e9f36e/cramjam-2.10.0-cp312-cp312-musllinux_1_1_aarch64.whl", hash = "sha256:11c5ef0c70d6bdd8e1d8afed8b0430709b22decc3865eb6c0656aa00117a7b3d", size = 2016921, upload-time = "2025-04-12T17:58:04.283Z" },
{ url = "https://files.pythonhosted.org/packages/af/d1/c62de1b4630108fa4da62ec579d9925171013cad195b44e4b49e58ee1d38/cramjam-2.10.0-cp312-cp312-musllinux_1_1_armv7l.whl", hash = "sha256:86b29e349064821ceeb14d60d01a11a0788f94e73ed4b3a5c3f9fac7aa4e2cd7", size = 2152996, upload-time = "2025-04-12T17:58:05.957Z" },
{ url = "https://files.pythonhosted.org/packages/1d/c2/429af269a0146f6fe54993e9cb41a35b1c231387307480ec84c641bd3629/cramjam-2.10.0-cp312-cp312-musllinux_1_1_i686.whl", hash = "sha256:2c7008bb54bdc5d130c0e8581925dfcbdc6f0a4d2051de7a153bfced9a31910f", size = 2163476, upload-time = "2025-04-12T17:58:07.579Z" },
{ url = "https://files.pythonhosted.org/packages/2f/6d/0534780537175dd09aa4322119ab919acddfda404771b9e61b0bad00a955/cramjam-2.10.0-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:3a94fe7024137ed8bf200308000d106874afe52ff203f852f43b3547eddfa10e", size = 2132883, upload-time = "2025-04-12T17:58:09.141Z" },
{ url = "https://files.pythonhosted.org/packages/5d/2d/990b77c8257ff30ec5cf75fc110248f00a236dd8180410362ed6a32846ad/cramjam-2.10.0-cp312-cp312-win32.whl", hash = "sha256:ce11be5722c9d433c5e1eb3980f16eb7d80828b9614f089e28f4f1724fc8973f", size = 1597254, upload-time = "2025-04-12T17:58:10.728Z" },
{ url = "https://files.pythonhosted.org/packages/26/c7/baf6b960403313f9df3217f7b8039bb2e403559c95641e23a0b0056283c2/cramjam-2.10.0-cp312-cp312-win_amd64.whl", hash = "sha256:a01e89e99ba066dfa2df40fe99a2371565f4a3adc6811a73c8019d9929a312e8", size = 1699580, upload-time = "2025-04-12T17:58:12.586Z" },
{ url = "https://files.pythonhosted.org/packages/cc/9e/40ecf165dd9fd177c85d1d7b8614036865f15f39d116cf2c96dc84a3eb8a/cramjam-2.10.0-cp313-cp313-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:8bb0b6aaaa5f37091e05d756a3337faf0ddcffe8a68dbe8a710731b0d555ec8f", size = 3502800, upload-time = "2025-04-12T17:58:14.286Z" },
{ url = "https://files.pythonhosted.org/packages/af/63/83c7dbe9078ff7e9d8c449913a46a40ae8b9c260f2ec885a0249f00dd763/cramjam-2.10.0-cp313-cp313-macosx_10_12_x86_64.whl", hash = "sha256:27b2625c0840b9a5522eba30b165940084391762492e03b9d640fca5074016ae", size = 1835841, upload-time = "2025-04-12T17:58:15.986Z" },
{ url = "https://files.pythonhosted.org/packages/d0/bd/d5f9bdd562d4387ca7e1dcfc5121297cba0623e696882bf7cfd343fae88d/cramjam-2.10.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:4ba90f7b8f986934f33aad8cc029cf7c74842d3ecd5eda71f7531330d38a8dc4", size = 1673882, upload-time = "2025-04-12T17:58:17.725Z" },
{ url = "https://files.pythonhosted.org/packages/30/ac/198378091434078efb9e25b69a142de1203bf2e54a674f15d6048221a13e/cramjam-2.10.0-cp313-cp313-manylinux_2_12_i686.manylinux2010_i686.whl", hash = "sha256:6655d04942f7c02087a6bba4bdc8d88961aa8ddf3fb9a05b3bad06d2d1ca321b", size = 2019844, upload-time = "2025-04-12T17:58:19.987Z" },
{ url = "https://files.pythonhosted.org/packages/5c/63/ab625cd743cd1950e0b8a1922b5599ee9109085dcb55dad30a3d1751a8ab/cramjam-2.10.0-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:7dda9be2caf067ac21c4aa63497833e0984908b66849c07aaa42b1cfa93f5e1c", size = 1747573, upload-time = "2025-04-12T17:58:22.172Z" },
{ url = "https://files.pythonhosted.org/packages/fe/c9/d17f6d5fc9e619298b98c86cfca2b728945b05135b0cc16be8e6305e00cb/cramjam-2.10.0-cp313-cp313-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:afa36aa006d7692718fce427ecb276211918447f806f80c19096a627f5122e3d", size = 1850318, upload-time = "2025-04-12T17:58:23.988Z" },
{ url = "https://files.pythonhosted.org/packages/60/83/9e35fcd2a373c30251088d4abfb87312a51bc39a0c15f5eda5099888f6fd/cramjam-2.10.0-cp313-cp313-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:d46fd5a9e8eb5d56eccc6191a55e3e1e2b3ab24b19ab87563a2299a39c855fd7", size = 1997907, upload-time = "2025-04-12T17:58:26.336Z" },
{ url = "https://files.pythonhosted.org/packages/e5/5d/c0999ebd3c829b50b93f57fbc478c6a31d7b785789d14221b5962631a610/cramjam-2.10.0-cp313-cp313-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:e3012564760394dff89e7a10c5a244f8885cd155aec07bdbe2d6dc46be398614", size = 2361103, upload-time = "2025-04-12T17:58:29.38Z" },
{ url = "https://files.pythonhosted.org/packages/58/2c/866a73d33ea0950a3ea6e12d5d6f15abc8d5b5e2302c5e4aa9bd7c6d5179/cramjam-2.10.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e2d216ed4aca2090eabdd354204ae55ed3e13333d1a5b271981543696e634672", size = 1961830, upload-time = "2025-04-12T17:58:31.11Z" },
{ url = "https://files.pythonhosted.org/packages/70/2b/4f91b3d36d2b7288c8d180b0debce092357d41ca02bd3649f49354180613/cramjam-2.10.0-cp313-cp313-musllinux_1_1_aarch64.whl", hash = "sha256:44c2660ee7c4c269646955e4e40c2693f803fbad12398bb31b2ad00cfc6027b8", size = 2016782, upload-time = "2025-04-12T17:58:33.383Z" },
{ url = "https://files.pythonhosted.org/packages/90/99/cff347c3279b99e3e9e1bc249319ec391c7cedb1bdc288929d4310bdd6f0/cramjam-2.10.0-cp313-cp313-musllinux_1_1_armv7l.whl", hash = "sha256:636a48e2d01fe8d7955e9523efd2f8efce55a0221f3b5d5b4bdf37c7ff056bf1", size = 2152536, upload-time = "2025-04-12T17:58:35.879Z" },
{ url = "https://files.pythonhosted.org/packages/c3/36/2f4353217477d017300676545cfa7bef8e55a1fa818b4fb97c2ab6d7bfd4/cramjam-2.10.0-cp313-cp313-musllinux_1_1_i686.whl", hash = "sha256:44c15f6117031a84497433b5f55d30ee72d438fdcba9778fec0c5ca5d416aa96", size = 2162962, upload-time = "2025-04-12T17:58:38.403Z" },
{ url = "https://files.pythonhosted.org/packages/ed/d2/808533ea5d8cccfa2bd272dc9900fa47d6cb93a6d0b2b18bcc23b0962a08/cramjam-2.10.0-cp313-cp313-musllinux_1_1_x86_64.whl", hash = "sha256:76e4e42f2ecf1aca0a710adaa23000a192efb81a2aee3bcc16761f1777f08a74", size = 2132699, upload-time = "2025-04-12T17:58:40.374Z" },
{ url = "https://files.pythonhosted.org/packages/f9/18/f8a96e4e2448196ce39be0684053e48b2920a2f6b8467b43cc8be62476aa/cramjam-2.10.0-cp313-cp313-win32.whl", hash = "sha256:5b34f4678d386c64d3be402fdf67f75e8f1869627ea2ec4decd43e828d3b6fba", size = 1597001, upload-time = "2025-04-12T17:58:42.201Z" },
{ url = "https://files.pythonhosted.org/packages/dc/4f/d90e9a8379452e3882e4d937ca566a5286eea98811571a7da0277959253e/cramjam-2.10.0-cp313-cp313-win_amd64.whl", hash = "sha256:88754dd516f0e2f4dd242880b8e760dc854e917315a17fe3fc626475bea9b252", size = 1699339, upload-time = "2025-04-12T17:58:44.227Z" },
{ url = "https://files.pythonhosted.org/packages/db/37/96e3b41fa2e2ca8924ec8ec53ed152c7cef1b6507ee676035a9d6e4da01c/cramjam-2.10.0-pp310-pypy310_pp73-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:77192bc1a9897ecd91cf977a5d5f990373e35a8d028c9141c8c3d3680a4a4cd7", size = 3539602, upload-time = "2025-04-12T17:59:45.59Z" },
{ url = "https://files.pythonhosted.org/packages/48/2e/5c102cda83b38f10e6021ede32915270bd2ae5c6b0f704d42b5cdef17802/cramjam-2.10.0-pp310-pypy310_pp73-macosx_10_12_x86_64.whl", hash = "sha256:50b59e981f219d6840ac43cda8e885aff1457944ddbabaa16ac047690bfd6ad1", size = 1855894, upload-time = "2025-04-12T17:59:48.011Z" },
{ url = "https://files.pythonhosted.org/packages/e5/be/21e0a88a28d8fbfdc7d33eb78ff7ef31e5f1a67f86538607b01a25017512/cramjam-2.10.0-pp310-pypy310_pp73-macosx_11_0_arm64.whl", hash = "sha256:d84581c869d279fab437182d5db2b590d44975084e8d50b164947f7aaa2c5f25", size = 1684764, upload-time = "2025-04-12T17:59:49.763Z" },
{ url = "https://files.pythonhosted.org/packages/aa/4e/cb3f28b36aa9391c31b66b5c47d3b47e469e337f7a660cabf72adc57c37d/cramjam-2.10.0-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:04f54bea9ce39c440d1ac6901fe4d647f9218dd5cd8fe903c6fe9c42bf5e1f3b", size = 1761657, upload-time = "2025-04-12T17:59:51.64Z" },
{ url = "https://files.pythonhosted.org/packages/1c/ba/0c7309f22708301ce617f1b24e7d74691909385ab5c34f72683c41f98414/cramjam-2.10.0-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:cddd12ee5a2ef4100478db7f5563a9cdb8bc0a067fbd8ccd1ecdc446d2e6a41a", size = 1975717, upload-time = "2025-04-12T17:59:53.957Z" },
{ url = "https://files.pythonhosted.org/packages/02/2f/125ad8ba5482aca1704ac3510a4d8d7f9224b206060b974c4a1ac50962ec/cramjam-2.10.0-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:35bcecff38648908a4833928a892a1e7a32611171785bef27015107426bc1d9d", size = 1706860, upload-time = "2025-04-12T17:59:55.79Z" },
{ url = "https://files.pythonhosted.org/packages/5d/c9/03eae05fc36540ea92c1b136c727937bd82fd9a1f20986ac7c10191e9d40/cramjam-2.10.0-pp311-pypy311_pp73-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:1e826469cfbb6dcd5b967591e52855073267835229674cfa3d327088805855da", size = 3539823, upload-time = "2025-04-12T17:59:57.75Z" },
{ url = "https://files.pythonhosted.org/packages/de/34/e1066303c9dc9b6c9c8e5f820e277afa1c135ded170eb2190419af1e5df6/cramjam-2.10.0-pp311-pypy311_pp73-macosx_10_12_x86_64.whl", hash = "sha256:1a200b74220dcd80c2bb99e3bfe1cdb1e4ed0f5c071959f4316abd65f9ef1e39", size = 1856103, upload-time = "2025-04-12T17:59:59.794Z" },
{ url = "https://files.pythonhosted.org/packages/81/dd/edc1207ebe09e2f1bb8a1e46dfba039bbc14f1875deed5f21f1002c3c51d/cramjam-2.10.0-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:2e419b65538786fc1f0cf776612262d4bf6c9449983d3fc0d0acfd86594fe551", size = 1684791, upload-time = "2025-04-12T18:00:01.747Z" },
{ url = "https://files.pythonhosted.org/packages/64/47/53dbc9070c54001f96972ddf7eba168340114593eb891fe89dfd816ffc73/cramjam-2.10.0-pp311-pypy311_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:bf1321a40da930edeff418d561dfb03e6d59d5b8ab5cbab1c4b03ff0aa4c6d21", size = 1761774, upload-time = "2025-04-12T18:00:04.164Z" },
{ url = "https://files.pythonhosted.org/packages/5e/23/ce7688d7fe92e870cf64001db5c396d778056d48b5384d387e0263e5133c/cramjam-2.10.0-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a04376601c8f9714fb3a6a0a1699b85aab665d9d952a2a31fb37cf70e1be1fba", size = 1975809, upload-time = "2025-04-12T18:00:05.987Z" },
{ url = "https://files.pythonhosted.org/packages/50/58/da5ada423f010318958db6de98c188afa915e31f5ad4ac072c2e73563a53/cramjam-2.10.0-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:2c1eb6e6c3d5c1cc3f7c7f8a52e034340a3c454641f019687fa94077c05da5c2", size = 1707057, upload-time = "2025-04-12T18:00:08.118Z" },
]
[[package]]
name = "crosshair-tool"
version = "0.0.94"
@ -2404,6 +2484,53 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/fe/84/9c2917a70ed570ddbfd1d32ac23200c1d011e36c332e59950d2f6d204941/fastavro-1.11.1-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:1bc2824e9969c04ab6263d269a1e0e5d40b9bd16ade6b70c29d6ffbc4f3cc102", size = 3387171, upload-time = "2025-05-18T04:55:32.531Z" },
]
[[package]]
name = "fastparquet"
version = "2024.11.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "cramjam" },
{ name = "fsspec" },
{ name = "numpy" },
{ name = "packaging" },
{ name = "pandas" },
]
sdist = { url = "https://files.pythonhosted.org/packages/b4/66/862da14f5fde4eff2cedc0f51a8dc34ba145088e5041b45b2d57ac54f922/fastparquet-2024.11.0.tar.gz", hash = "sha256:e3b1fc73fd3e1b70b0de254bae7feb890436cb67e99458b88cb9bd3cc44db419", size = 467192, upload-time = "2024-11-15T19:30:10.413Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/3d/56/476f5b83476a256489879b78513bee737691a80905e246a2daa30ebcc362/fastparquet-2024.11.0-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:60ccf587410f0979105e17036df61bb60e1c2b81880dc91895cdb4ee65b71e7f", size = 910272, upload-time = "2024-11-12T20:37:19.594Z" },
{ url = "https://files.pythonhosted.org/packages/3b/ad/4ce73440df874479f7205fe5445090f71ed4e9bd77fdb3b740253ce82703/fastparquet-2024.11.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:a5ad5fc14b0567e700bea3cd528a0bd45a6f9371370b49de8889fb3d10a6574a", size = 684095, upload-time = "2024-11-12T20:37:22.957Z" },
{ url = "https://files.pythonhosted.org/packages/20/37/c3164261d6183d529a59afef2749821b262c8581d837faa91043837c6f76/fastparquet-2024.11.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0b74333914f454344458dab9d1432fda9b70d62e28dc7acb1512d937ef1424ee", size = 1700355, upload-time = "2024-11-12T20:37:25.792Z" },
{ url = "https://files.pythonhosted.org/packages/e6/95/cf4b175c22160ec21e4664830763bfaa80b2cf05133ef854c3f436d01c16/fastparquet-2024.11.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:41d1610130b5cb1ce36467766191c5418cba8631e2bfe3affffaf13f9be4e7a8", size = 1714663, upload-time = "2024-11-12T20:37:28.369Z" },
{ url = "https://files.pythonhosted.org/packages/2c/31/b6c8cdb6d5df964a192e4e8c8ecd979718afb9ca7e2dc9243a4368b370e9/fastparquet-2024.11.0-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:d281edd625c33628ba028d3221180283d6161bc5ceb55eae1f0ca1678f864f26", size = 1666729, upload-time = "2024-11-12T20:37:30.243Z" },
{ url = "https://files.pythonhosted.org/packages/31/e5/8a0575c46a7973849f8f2a88af16618b9c7efe98f249f03e3e3de69c2b86/fastparquet-2024.11.0-cp310-cp310-musllinux_1_2_i686.whl", hash = "sha256:fa56b19a29008c34cfe8831e810f770080debcbffc69aabd1df4d47572181f9c", size = 1741669, upload-time = "2024-11-12T20:37:32.067Z" },
{ url = "https://files.pythonhosted.org/packages/bb/6a/669f8c9cf2fc6e30c9353832f870e5a2e170b458d12c5080837f742d963d/fastparquet-2024.11.0-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:5914ecfa766b7763201b9f49d832a5e89c2dccad470ca4f9c9b228d9a8349756", size = 1782359, upload-time = "2024-11-12T20:37:33.806Z" },
{ url = "https://files.pythonhosted.org/packages/70/c0/1374cb43924739f4542e39d972481c1f4c7dd96808a1947450808e4e7df7/fastparquet-2024.11.0-cp310-cp310-win_amd64.whl", hash = "sha256:561202e8f0e859ccc1aa77c4aaad1d7901b2d50fd6f624ca018bae4c3c7a62ce", size = 670700, upload-time = "2024-11-12T20:37:35.312Z" },
{ url = "https://files.pythonhosted.org/packages/7c/51/e0d6e702523ac923ede6c05e240f4a02533ccf2cea9fec7a43491078e920/fastparquet-2024.11.0-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:374cdfa745aa7d5188430528d5841cf823eb9ad16df72ad6dadd898ccccce3be", size = 909934, upload-time = "2024-11-12T20:37:37.049Z" },
{ url = "https://files.pythonhosted.org/packages/0a/c8/5c0fb644c19a8d80b2ae4d8aa7d90c2d85d0bd4a948c5c700bea5c2802ea/fastparquet-2024.11.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:4c8401bfd86cccaf0ab7c0ade58c91ae19317ff6092e1d4ad96c2178197d8124", size = 683844, upload-time = "2024-11-12T20:37:38.456Z" },
{ url = "https://files.pythonhosted.org/packages/33/4a/1e532fd1a0d4d8af7ffc7e3a8106c0bcd13ed914a93a61e299b3832dd3d2/fastparquet-2024.11.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:f9cca4c6b5969df5561c13786f9d116300db1ec22c7941e237cfca4ce602f59b", size = 1791698, upload-time = "2024-11-12T20:37:41.101Z" },
{ url = "https://files.pythonhosted.org/packages/8d/e8/e1ede861bea68394a755d8be1aa2e2d60a3b9f6b551bfd56aeca74987e2e/fastparquet-2024.11.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:9a9387e77ac608d8978774caaf1e19de67eaa1386806e514dcb19f741b19cfe5", size = 1804289, upload-time = "2024-11-12T20:37:43.08Z" },
{ url = "https://files.pythonhosted.org/packages/4f/1e/957090cccaede805583ca3f3e46e2762d0f9bf8860ecbce65197e47d84c1/fastparquet-2024.11.0-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:6595d3771b3d587a31137e985f751b4d599d5c8e9af9c4858e373fdf5c3f8720", size = 1753638, upload-time = "2024-11-12T20:37:45.498Z" },
{ url = "https://files.pythonhosted.org/packages/85/72/344787c685fd1531f07ae712a855a7c34d13deaa26c3fd4a9231bea7dbab/fastparquet-2024.11.0-cp311-cp311-musllinux_1_2_i686.whl", hash = "sha256:053695c2f730b78a2d3925df7cd5c6444d6c1560076af907993361cc7accf3e2", size = 1814407, upload-time = "2024-11-12T20:37:47.25Z" },
{ url = "https://files.pythonhosted.org/packages/6c/ec/ab9d5685f776a1965797eb68c4364c72edf57cd35beed2df49b34425d1df/fastparquet-2024.11.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:0a52eecc6270ae15f0d51347c3f762703dd667ca486f127dc0a21e7e59856ae5", size = 1874462, upload-time = "2024-11-12T20:37:49.755Z" },
{ url = "https://files.pythonhosted.org/packages/90/4f/7a4ea9a7ddf0a3409873f0787f355806f9e0b73f42f2acecacdd9a8eff0a/fastparquet-2024.11.0-cp311-cp311-win_amd64.whl", hash = "sha256:e29ff7a367fafa57c6896fb6abc84126e2466811aefd3e4ad4070b9e18820e54", size = 671023, upload-time = "2024-11-12T20:37:51.461Z" },
{ url = "https://files.pythonhosted.org/packages/08/76/068ac7ec9b4fc783be21a75a6a90b8c0654da4d46934d969e524ce287787/fastparquet-2024.11.0-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:dbad4b014782bd38b58b8e9f514fe958cfa7a6c4e187859232d29fd5c5ddd849", size = 915968, upload-time = "2024-11-12T20:37:52.861Z" },
{ url = "https://files.pythonhosted.org/packages/c7/9e/6d3b4188ad64ed51173263c07109a5f18f9c84a44fa39ab524fca7420cda/fastparquet-2024.11.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:403d31109d398b6be7ce84fa3483fc277c6a23f0b321348c0a505eb098a041cb", size = 685399, upload-time = "2024-11-12T20:37:54.899Z" },
{ url = "https://files.pythonhosted.org/packages/8f/6c/809220bc9fbe83d107df2d664c3fb62fb81867be8f5218ac66c2e6b6a358/fastparquet-2024.11.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:cbbb9057a26acf0abad7adf58781ee357258b7708ee44a289e3bee97e2f55d42", size = 1758557, upload-time = "2024-11-12T20:37:56.553Z" },
{ url = "https://files.pythonhosted.org/packages/e0/2c/b3b3e6ca2e531484289024138cd4709c22512b3fe68066d7f9849da4a76c/fastparquet-2024.11.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:63e0e416e25c15daa174aad8ba991c2e9e5b0dc347e5aed5562124261400f87b", size = 1781052, upload-time = "2024-11-12T20:37:58.339Z" },
{ url = "https://files.pythonhosted.org/packages/21/fe/97ed45092d0311c013996dae633122b7a51c5d9fe8dcbc2c840dc491201e/fastparquet-2024.11.0-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:0e2d7f02f57231e6c86d26e9ea71953737202f20e948790e5d4db6d6a1a150dc", size = 1715797, upload-time = "2024-11-12T20:38:00.694Z" },
{ url = "https://files.pythonhosted.org/packages/24/df/02fa6aee6c0d53d1563b5bc22097076c609c4c5baa47056b0b4bed456fcf/fastparquet-2024.11.0-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:fbe4468146b633d8f09d7b196fea0547f213cb5ce5f76e9d1beb29eaa9593a93", size = 1795682, upload-time = "2024-11-12T20:38:02.38Z" },
{ url = "https://files.pythonhosted.org/packages/b0/25/f4f87557589e1923ee0e3bebbc84f08b7c56962bf90f51b116ddc54f2c9f/fastparquet-2024.11.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:29d5c718817bcd765fc519b17f759cad4945974421ecc1931d3bdc3e05e57fa9", size = 1857842, upload-time = "2024-11-12T20:38:04.196Z" },
{ url = "https://files.pythonhosted.org/packages/b1/f9/98cd0c39115879be1044d59c9b76e8292776e99bb93565bf990078fd11c4/fastparquet-2024.11.0-cp312-cp312-win_amd64.whl", hash = "sha256:74a0b3c40ab373442c0fda96b75a36e88745d8b138fcc3a6143e04682cbbb8ca", size = 673269, upload-time = "2024-12-11T21:22:48.073Z" },
{ url = "https://files.pythonhosted.org/packages/47/e3/e7db38704be5db787270d43dde895eaa1a825ab25dc245e71df70860ec12/fastparquet-2024.11.0-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:59e5c5b51083d5b82572cdb7aed0346e3181e3ac9d2e45759da2e804bdafa7ee", size = 912523, upload-time = "2024-11-12T20:38:06.003Z" },
{ url = "https://files.pythonhosted.org/packages/d3/66/e3387c99293dae441634e7724acaa425b27de19a00ee3d546775dace54a9/fastparquet-2024.11.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:bdadf7b6bad789125b823bfc5b0a719ba5c4a2ef965f973702d3ea89cff057f6", size = 683779, upload-time = "2024-11-12T20:38:07.442Z" },
{ url = "https://files.pythonhosted.org/packages/0a/21/d112d0573d086b578bf04302a502e9a7605ea8f1244a7b8577cd945eec78/fastparquet-2024.11.0-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:46b2db02fc2a1507939d35441c8ab211d53afd75d82eec9767d1c3656402859b", size = 1751113, upload-time = "2024-11-12T20:38:09.36Z" },
{ url = "https://files.pythonhosted.org/packages/6b/a7/040507cee3a7798954e8fdbca21d2dbc532774b02b882d902b8a4a6849ef/fastparquet-2024.11.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a3afdef2895c9f459135a00a7ed3ceafebfbce918a9e7b5d550e4fae39c1b64d", size = 1780496, upload-time = "2024-11-12T20:38:11.022Z" },
{ url = "https://files.pythonhosted.org/packages/bc/75/d0d9f7533d780ec167eede16ad88073ee71696150511126c31940e7f73aa/fastparquet-2024.11.0-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:36b5c9bd2ffaaa26ff45d59a6cefe58503dd748e0c7fad80dd905749da0f2b9e", size = 1713608, upload-time = "2024-11-12T20:38:12.848Z" },
{ url = "https://files.pythonhosted.org/packages/30/fa/1d95bc86e45e80669c4f374b2ca26a9e5895a1011bb05d6341b4a7414693/fastparquet-2024.11.0-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:6b7df5d3b61a19d76e209fe8d3133759af1c139e04ebc6d43f3cc2d8045ef338", size = 1792779, upload-time = "2024-11-12T20:38:14.5Z" },
{ url = "https://files.pythonhosted.org/packages/13/3d/c076beeb926c79593374c04662a9422a76650eef17cd1c8e10951340764a/fastparquet-2024.11.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:8b35823ac7a194134e5f82fa4a9659e42e8f9ad1f2d22a55fbb7b9e4053aabbb", size = 1851322, upload-time = "2024-11-12T20:38:16.231Z" },
{ url = "https://files.pythonhosted.org/packages/09/5a/1d0d47e64816002824d4a876644e8c65540fa23f91b701f0daa726931545/fastparquet-2024.11.0-cp313-cp313-win_amd64.whl", hash = "sha256:d20632964e65530374ff7cddd42cc06aa0a1388934903693d6d22592a5ba827b", size = 673266, upload-time = "2024-11-12T20:38:17.661Z" },
]
[[package]]
name = "filelock"
version = "3.18.0"
@ -3501,7 +3628,7 @@ wheels = [
[[package]]
name = "huggingface-hub"
version = "0.33.0"
version = "0.33.5"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "filelock" },
@ -3513,9 +3640,9 @@ dependencies = [
{ name = "tqdm" },
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/91/8a/1362d565fefabaa4185cf3ae842a98dbc5b35146f5694f7080f043a6952f/huggingface_hub-0.33.0.tar.gz", hash = "sha256:aa31f70d29439d00ff7a33837c03f1f9dd83971ce4e29ad664d63ffb17d3bb97", size = 426179, upload-time = "2025-06-11T17:08:07.913Z" }
sdist = { url = "https://files.pythonhosted.org/packages/02/16/5716d03e2b48bcc8e32d9b18ed7e55d2ae52e3d5df146cced9fe0581b5ff/huggingface_hub-0.33.5.tar.gz", hash = "sha256:814097e475646d170c44be4c38f7d381ccc4539156a5ac62a54f53aaf1602ed8", size = 427075, upload-time = "2025-07-24T12:30:31.449Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/33/fb/53587a89fbc00799e4179796f51b3ad713c5de6bb680b2becb6d37c94649/huggingface_hub-0.33.0-py3-none-any.whl", hash = "sha256:e8668875b40c68f9929150d99727d39e5ebb8a05a98e4191b908dc7ded9074b3", size = 514799, upload-time = "2025-06-11T17:08:05.757Z" },
{ url = "https://files.pythonhosted.org/packages/33/d5/d9e9b75d8dc9cf125fff16fb0cd51d864a29e8b46b6880d8808940989405/huggingface_hub-0.33.5-py3-none-any.whl", hash = "sha256:29b4e64982c2064006021af297e1b17d44c85a8aaf90a0d7efeff7e7d2426296", size = 515705, upload-time = "2025-07-24T12:30:29.55Z" },
]
[package.optional-dependencies]
@ -4367,7 +4494,7 @@ wheels = [
[[package]]
name = "langchain-core"
version = "0.3.66"
version = "0.3.72"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "jsonpatch" },
@ -4378,9 +4505,9 @@ dependencies = [
{ name = "tenacity" },
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/f0/63/470aa84393bad5d51749417af58522a691174f8b2d05843f5633d473faa0/langchain_core-0.3.66.tar.gz", hash = "sha256:350c92e792ec1401f4b740d759b95f297710a50de29e1be9fbfff8676ef62117", size = 560102, upload-time = "2025-06-20T22:08:19.532Z" }
sdist = { url = "https://files.pythonhosted.org/packages/8b/49/7568baeb96a57d3218cb5f1f113b142063679088fd3a0d0cae1feb0b3d36/langchain_core-0.3.72.tar.gz", hash = "sha256:4de3828909b3d7910c313242ab07b241294650f5cb6eac17738dd3638b1cd7de", size = 567227, upload-time = "2025-07-24T00:40:08.5Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/c0/c3/8080431fd7567a340d3a42e36c0bb3970a8d00d5e27bf3ca2103b3b55996/langchain_core-0.3.66-py3-none-any.whl", hash = "sha256:65cd6c3659afa4f91de7aa681397a0c53ff9282425c281e53646dd7faf16099e", size = 438874, upload-time = "2025-06-20T22:08:17.52Z" },
{ url = "https://files.pythonhosted.org/packages/6e/7d/9f75023c478e3b854d67da31d721e39f0eb30ae969ec6e755430cb1c0fb5/langchain_core-0.3.72-py3-none-any.whl", hash = "sha256:9fa15d390600eb6b6544397a7aa84be9564939b6adf7a2b091179ea30405b240", size = 442806, upload-time = "2025-07-24T00:40:06.994Z" },
]
[[package]]
@ -4505,6 +4632,20 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/e3/ed/bf857c2857a7aa1f9b1d436c668a3f1a4071cb2bb6f1d247f98f1ebb3f0a/langchain_groq-0.2.1-py3-none-any.whl", hash = "sha256:98d282fd9d7d99b0f55de0a1daea2d5d350ef697e3cb5e97de06aeba4eca8679", size = 14331, upload-time = "2024-10-31T18:34:35.211Z" },
]
[[package]]
name = "langchain-huggingface"
version = "0.3.1"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "huggingface-hub" },
{ name = "langchain-core" },
{ name = "tokenizers" },
]
sdist = { url = "https://files.pythonhosted.org/packages/3f/15/f832ae485707bf52f9a8f055db389850de06c46bc6e3e4420a0ef105fbbf/langchain_huggingface-0.3.1.tar.gz", hash = "sha256:0a145534ce65b5a723c8562c456100a92513bbbf212e6d8c93fdbae174b41341", size = 25154, upload-time = "2025-07-22T17:22:26.77Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/bf/26/7c5d4b4d3e1a7385863acc49fb6f96c55ccf941a750991d18e3f6a69a14a/langchain_huggingface-0.3.1-py3-none-any.whl", hash = "sha256:de10a692dc812885696fbaab607d28ac86b833b0f305bccd5d82d60336b07b7d", size = 27609, upload-time = "2025-07-22T17:22:25.282Z" },
]
[[package]]
name = "langchain-ibm"
version = "0.3.12"
@ -4715,6 +4856,7 @@ dependencies = [
{ name = "fake-useragent" },
{ name = "fastavro", version = "1.9.7", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.13'" },
{ name = "fastavro", version = "1.11.1", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.13'" },
{ name = "fastparquet" },
{ name = "filelock" },
{ name = "gassist", marker = "sys_platform == 'win32'" },
{ name = "gitpython" },
@ -4741,6 +4883,7 @@ dependencies = [
{ name = "langchain-google-vertexai" },
{ name = "langchain-graph-retriever" },
{ name = "langchain-groq" },
{ name = "langchain-huggingface" },
{ name = "langchain-ibm" },
{ name = "langchain-milvus" },
{ name = "langchain-mistralai" },
@ -4912,6 +5055,7 @@ requires-dist = [
{ name = "fake-useragent", specifier = "==1.5.1" },
{ name = "fastavro", marker = "python_full_version < '3.13'", specifier = "==1.9.7" },
{ name = "fastavro", marker = "python_full_version >= '3.13'", specifier = ">=1.9.8" },
{ name = "fastparquet", specifier = ">=2024.11.0" },
{ name = "filelock", specifier = ">=3.18.0" },
{ name = "gassist", marker = "sys_platform == 'win32'", specifier = ">=0.0.1" },
{ name = "gitpython", specifier = "==3.1.43" },
@ -4938,6 +5082,7 @@ requires-dist = [
{ name = "langchain-google-vertexai", specifier = "==2.0.7" },
{ name = "langchain-graph-retriever", specifier = "==0.6.1" },
{ name = "langchain-groq", specifier = "==0.2.1" },
{ name = "langchain-huggingface", specifier = "==0.3.1" },
{ name = "langchain-ibm", specifier = ">=0.3.8" },
{ name = "langchain-milvus", specifier = "==0.1.7" },
{ name = "langchain-mistralai", specifier = "==0.2.3" },