feat: add StructuredOutput component (#4024)

* Add utility functions to build Pydantic models from schema definitions

* Add unit tests for build_model_from_schema function in test_base_model.py

- Implement various test cases to validate the functionality of build_model_from_schema.
- Test cases cover scenarios such as handling valid and empty schemas, managing unknown field types, and processing schemas with missing optional keys.
- Ensure proper handling of nested list and dict types, and verify the function's efficiency with large schemas.
- Confirm that the function raises exceptions for invalid input and handles duplicate field names correctly.
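
The type-resolution behavior these tests exercise can be sketched in plain Python. This mirrors the type mapping that `build_model_from_schema` relies on in this PR's `helpers/base_model.py`; it is an illustrative sketch, not the langflow code itself:

```python
from typing import Any

# Mirrors the schema-string-to-Python-type mapping used by the PR's helper.
TYPE_MAPPING: dict[str, Any] = {
    "str": str,
    "int": int,
    "float": float,
    "bool": bool,
    "boolean": bool,
    "list": list[Any],
    "dict": dict[str, Any],
    "number": float,
    "text": str,
}


def get_type_annotation(type_str: str, *, multiple: bool) -> Any:
    # Unknown type strings raise ValueError rather than silently defaulting.
    try:
        base_type = TYPE_MAPPING[type_str]
    except KeyError as e:
        raise ValueError(f"Invalid type: {type_str}") from e
    # "multiple" wraps the base type in a list annotation.
    return list[base_type] if multiple else base_type


print(get_type_annotation("str", multiple=False))  # <class 'str'>
print(get_type_annotation("int", multiple=True))   # list[int]
```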

* Refactor tests in `test_base_model.py` to improve type handling and error checking

* Refactor output schema handling to use TableInput and build_model_from_schema

* Update OpenAI model components and hierarchical crew setup

- Refactor `OpenAIModelComponent` to use `TableInput` for `output_schema` and integrate `build_model_from_schema`.
- Modify `HierarchicalCrewComponent` to use unpacking for base inputs.
- Ensure consistent import statements across JSON files.
- Improve error handling and logging for vector store operations.

* Add chat result model with message building and execution logic

- Implement `build_messages_and_runnable` to construct message lists and configure runnable models.
- Add `get_chat_result` to execute language models with input messages, supporting streaming and custom configurations.
- Handle exceptions with optional custom error messages.

* Add "table" to DIRECT_TYPES in constants.py

* Add support for DataFrame input validation in TableInput class

* Add StructuredOutputComponent for generating structured outputs from language models

* Enhance structured output component with improved input descriptions and schema naming

* Convert DataFrame to list of dictionaries in TableInput validation
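
For readers unfamiliar with the pandas idiom, `DataFrame.to_dict(orient="records")` turns column-oriented data into one dict per row. A stdlib sketch of the same transformation (illustrative only; `TableInput` calls the pandas method directly):

```python
# Column-oriented data, as a DataFrame would hold it.
columns = {"name": ["field1", "field2"], "type": ["str", "int"]}

# zip(*columns.values()) yields one tuple per row; pairing each with the
# column names produces the "records" form.
records = [dict(zip(columns, row)) for row in zip(*columns.values())]
print(records)
# [{'name': 'field1', 'type': 'str'}, {'name': 'field2', 'type': 'int'}]
```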

* Remove pandas dependency and refactor schema handling in structured_output.py

* Remove 'default' field from structured output schema and update field initialization

* Add 'number' and 'text' types to type mapping and remove default value from field creation

* Enhance error handling in structured output building process

* Improve error message for non-BaseModel output in structured_output.py

* Add unit tests for StructuredOutputComponent in helpers module

- Implement various test cases to ensure correct functionality of StructuredOutputComponent.
- Test successful structured output generation, handling of unsupported language models, and correct output model building.
- Validate handling of multiple outputs, empty and invalid output schemas, and nested schemas.
- Include tests for large input values and invalid language model configurations.

* Update description for StructuredOutputComponent to clarify functionality

* Add default values and error handling for structured output in helpers

* Remove unused 'method' parameter from 'with_structured_output' in MockLanguageModel

* refactor: rename test_base_model.py to test_base_model_from_schema.py

Rename the test_base_model.py file to test_base_model_from_schema.py to better reflect its purpose of testing the build_model_from_schema function. This change improves code clarity and maintainability.

* Add type ignore comments to suppress type checking errors

* Add Generic typing to StructuredOutputComponent and fix method call

* Revert "Refactor output schema handling to use TableInput and build_model_from_schema"

This reverts commit 2e84a8608689bcfb519dc589d3eeef852784f3e4.

* Deprecate JSON mode in OpenAIModel output schema documentation

* Remove unused Generic import and add type ignore comment in StructuredOutputComponent

* Refactor OpenAI model components and deprecate output schema

- Refactored `OpenAIModelComponent` to use `operator.ior` and `functools.reduce` for converting `output_schema` to a dictionary.
- Deprecated the `output_schema` field, updating its info to reflect the deprecation.
- Simplified the `_docs_to_data` method in `SplitTextComponent` for better readability.
- Updated import statements and removed unused imports across multiple JSON files.

* Add specific type ignore comments and update exception types in backend code

Commit 2be7c56939, authored by Gabriel Luiz Freitas Almeida on 2024-10-15 18:41:42 -03:00, committed by GitHub.
19 changed files with 693 additions and 57 deletions

View file

@@ -0,0 +1,76 @@
import warnings

from langchain_core.messages import BaseMessage, HumanMessage, SystemMessage

from langflow.field_typing.constants import LanguageModel
from langflow.schema.message import Message


def build_messages_and_runnable(
    input_value: str | Message, system_message: str | None, original_runnable: LanguageModel
) -> tuple[list[BaseMessage], LanguageModel]:
    messages: list[BaseMessage] = []
    system_message_added = False
    runnable = original_runnable
    if input_value:
        if isinstance(input_value, Message):
            with warnings.catch_warnings():
                warnings.simplefilter("ignore")
                if "prompt" in input_value:
                    prompt = input_value.load_lc_prompt()
                    if system_message:
                        prompt.messages = [
                            SystemMessage(content=system_message),
                            *prompt.messages,  # type: ignore[has-type]
                        ]
                        system_message_added = True
                    runnable = prompt | runnable
                else:
                    messages.append(input_value.to_lc_message())
        else:
            messages.append(HumanMessage(content=input_value))

    if system_message and not system_message_added:
        messages.insert(0, SystemMessage(content=system_message))

    return messages, runnable


def get_chat_result(
    runnable: LanguageModel,
    input_value: str | Message,
    system_message: str | None = None,
    config: dict | None = None,
    *,
    stream: bool = False,
):
    if not input_value and not system_message:
        msg = "The message you want to send to the model is empty."
        raise ValueError(msg)

    messages, runnable = build_messages_and_runnable(
        input_value=input_value, system_message=system_message, original_runnable=runnable
    )

    inputs: list | dict = messages or {}
    try:
        if config and config.get("output_parser") is not None:
            runnable = runnable | config["output_parser"]
        if config:
            runnable = runnable.with_config(
                {
                    "run_name": config.get("display_name", ""),
                    "project_name": config.get("get_project_name", lambda: "")(),
                    "callbacks": config.get("get_langchain_callbacks", list)(),
                }
            )
        if stream:
            return runnable.stream(inputs)
        message = runnable.invoke(inputs)
        return message.content if hasattr(message, "content") else message
    except Exception as e:
        if config and config.get("_get_exception_message") and (message := config["_get_exception_message"](e)):
            raise ValueError(message) from e
        raise

View file

@@ -0,0 +1,111 @@
from typing import cast

from pydantic import BaseModel, Field, create_model

from langflow.base.models.chat_result import get_chat_result
from langflow.custom import Component
from langflow.field_typing.constants import LanguageModel
from langflow.helpers.base_model import build_model_from_schema
from langflow.io import BoolInput, HandleInput, MessageTextInput, Output, StrInput, TableInput
from langflow.schema.data import Data


class StructuredOutputComponent(Component):
    display_name = "Structured Output"
    description = (
        "Transforms LLM responses into **structured data formats**. Ideal for extracting specific information "
        "or creating consistent outputs."
    )

    inputs = [
        HandleInput(
            name="llm",
            display_name="Language Model",
            info="The language model to use to generate the structured output.",
            input_types=["LanguageModel"],
        ),
        MessageTextInput(name="input_value", display_name="Input message"),
        StrInput(
            name="schema_name",
            display_name="Schema Name",
            info="Provide a name for the output data schema.",
        ),
        TableInput(
            name="output_schema",
            display_name="Output Schema",
            info="Define the structure and data types for the model's output.",
            table_schema=[
                {
                    "name": "name",
                    "display_name": "Name",
                    "type": "str",
                    "description": "Specify the name of the output field.",
                },
                {
                    "name": "description",
                    "display_name": "Description",
                    "type": "str",
                    "description": "Describe the purpose of the output field.",
                },
                {
                    "name": "type",
                    "display_name": "Type",
                    "type": "str",
                    "description": (
                        "Indicate the data type of the output field (e.g., str, int, float, bool, list, dict)."
                    ),
                    "default": "text",
                },
                {
                    "name": "multiple",
                    "display_name": "Multiple",
                    "type": "boolean",
                    "description": "Set to True if this output field should be a list of the specified type.",
                    "default": "False",
                },
            ],
        ),
        BoolInput(
            name="multiple",
            display_name="Generate Multiple",
            info="Set to True if the model should generate a list of outputs instead of a single output.",
        ),
    ]

    outputs = [
        Output(name="structured_output", display_name="Structured Output", method="build_structured_output"),
    ]

    def build_structured_output(self) -> Data:
        if not hasattr(self.llm, "with_structured_output"):
            msg = "Language model does not support structured output."
            raise TypeError(msg)
        if not self.output_schema:
            msg = "Output schema cannot be empty"
            raise ValueError(msg)
        _output_model = build_model_from_schema(self.output_schema)
        if self.multiple:
            output_model = create_model(
                self.schema_name,
                objects=(list[_output_model], Field(description=f"A list of {self.schema_name}.")),  # type: ignore[valid-type]
            )
        else:
            output_model = _output_model
        try:
            llm_with_structured_output = cast(LanguageModel, self.llm).with_structured_output(schema=output_model)  # type: ignore[valid-type, attr-defined]
        except NotImplementedError as exc:
            msg = f"{self.llm.__class__.__name__} does not support structured output."
            raise TypeError(msg) from exc
        config_dict = {
            "run_name": self.display_name,
            "project_name": self.get_project_name(),
            "callbacks": self.get_langchain_callbacks(),
        }
        output = get_chat_result(runnable=llm_with_structured_output, input_value=self.input_value, config=config_dict)
        if isinstance(output, BaseModel):
            output_dict = output.model_dump()
        else:
            msg = f"Output should be a Pydantic BaseModel, got {type(output)} ({output})"
            raise TypeError(msg)
        return Data(data=output_dict)

View file

@@ -8,15 +8,7 @@ from langflow.base.models.model import LCModelComponent
 from langflow.base.models.openai_constants import OPENAI_MODEL_NAMES
 from langflow.field_typing import LanguageModel
 from langflow.field_typing.range_spec import RangeSpec
-from langflow.inputs import (
-    BoolInput,
-    DictInput,
-    DropdownInput,
-    FloatInput,
-    IntInput,
-    SecretStrInput,
-    StrInput,
-)
+from langflow.inputs import BoolInput, DictInput, DropdownInput, FloatInput, IntInput, SecretStrInput, StrInput
 from langflow.inputs.inputs import HandleInput
@@ -49,7 +41,7 @@ class OpenAIModelComponent(LCModelComponent):
             advanced=True,
             info="The schema for the Output of the model. "
             "You must pass the word JSON in the prompt. "
-            "If left blank, JSON mode will be disabled.",
+            "If left blank, JSON mode will be disabled. [DEPRECATED]",
         ),
         DropdownInput(
             name="model_name",

View file

@@ -1,6 +1,71 @@
from typing import Any, TypedDict

from pydantic import BaseModel as PydanticBaseModel
from pydantic import ConfigDict, Field, create_model

TRUE_VALUES = ["true", "1", "t", "y", "yes"]


class SchemaField(TypedDict):
    name: str
    type: str
    description: str
    multiple: bool


class BaseModel(PydanticBaseModel):
    model_config = ConfigDict(populate_by_name=True)


def _get_type_annotation(type_str: str, *, multiple: bool) -> type:
    type_mapping = {
        "str": str,
        "int": int,
        "float": float,
        "bool": bool,
        "boolean": bool,
        "list": list[Any],
        "dict": dict[str, Any],
        "number": float,
        "text": str,
    }
    try:
        base_type = type_mapping[type_str]
    except KeyError as e:
        msg = f"Invalid type: {type_str}"
        raise ValueError(msg) from e
    if multiple:
        return list[base_type]  # type: ignore[valid-type]
    return base_type  # type: ignore[return-value]


def build_model_from_schema(schema: list[SchemaField]) -> type[PydanticBaseModel]:
    fields = {}
    for field in schema:
        field_name = field["name"]
        field_type_str = field["type"]
        description = field.get("description", "")
        multiple = field.get("multiple", False)
        multiple = coalesce_bool(multiple)
        field_type_annotation = _get_type_annotation(field_type_str, multiple=multiple)
        fields[field_name] = (field_type_annotation, Field(description=description))
    return create_model("OutputModel", **fields)


def coalesce_bool(value: Any) -> bool:
    """Coalesces the given value into a boolean.

    Args:
        value (Any): The value to be coalesced.

    Returns:
        bool: The coalesced boolean value.
    """
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        return value.lower() in TRUE_VALUES
    if isinstance(value, int):
        return bool(value)
    return False

(Nine file diffs suppressed because one or more lines are too long.)

View file

@@ -2,6 +2,7 @@ import warnings
 from collections.abc import AsyncIterator, Iterator
 from typing import Any, get_args

+from pandas import DataFrame
 from pydantic import Field, field_validator

 from langflow.inputs.validators import CoalesceBool
@@ -34,6 +35,8 @@ class TableInput(BaseInputMixin, MetadataTraceMixin, TableMixin, ListableInputMi
     @classmethod
     def validate_value(cls, v: Any, _info):
         # Check if value is a list of dicts
+        if isinstance(v, DataFrame):
+            v = v.to_dict(orient="records")
         if not isinstance(v, list):
             msg = f"TableInput value must be a list of dictionaries or Data. Value '{v}' is not a list."
             raise ValueError(msg)  # noqa: TRY004

View file

@@ -2,7 +2,7 @@ from enum import Enum

 from pydantic import BaseModel, ConfigDict, Field, field_validator, model_validator

-VALID_TYPES = ["date", "number", "text", "json", "integer", "int", "float", "str", "string"]
+VALID_TYPES = ["date", "number", "text", "json", "integer", "int", "float", "str", "string", "boolean"]


 class FormatterType(str, Enum):
@@ -10,6 +10,7 @@ class FormatterType(str, Enum):
     text = "text"
     number = "number"
     json = "json"
+    boolean = "boolean"


 class Column(BaseModel):

View file

@@ -52,17 +52,7 @@ def python_function(text: str) -> str:

 PYTHON_BASIC_TYPES = [str, bool, int, float, tuple, list, dict, set]

-DIRECT_TYPES = [
-    "str",
-    "bool",
-    "dict",
-    "int",
-    "float",
-    "Any",
-    "prompt",
-    "code",
-    "NestedDict",
-]
+DIRECT_TYPES = ["str", "bool", "dict", "int", "float", "Any", "prompt", "code", "NestedDict", "table"]

 LOADERS_INFO: list[dict[str, Any]] = [

View file

@@ -0,0 +1,240 @@
from unittest.mock import MagicMock, patch

import pytest
from pydantic import BaseModel

from langflow.components.helpers.structured_output import StructuredOutputComponent
from langflow.schema.data import Data


@pytest.fixture
def client():
    pass


class TestStructuredOutputComponent:
    # Ensure that the structured output is successfully generated with the correct
    # BaseModel instance returned by the mocked get_chat_result
    def test_successful_structured_output_generation_with_patch_with_config(self):
        class MockLanguageModel:
            def with_structured_output(self, schema):
                return self

            def with_config(self, config):
                return self

            def invoke(self, inputs):
                return self

        def mock_get_chat_result(runnable, input_value, config):
            class MockBaseModel(BaseModel):
                def model_dump(self):
                    return {"field": "value"}

            return MockBaseModel()

        component = StructuredOutputComponent(
            llm=MockLanguageModel(),
            input_value="Test input",
            schema_name="TestSchema",
            output_schema=[{"name": "field", "type": "str", "description": "A test field"}],
            multiple=False,
        )

        with patch("langflow.components.helpers.structured_output.get_chat_result", mock_get_chat_result):
            result = component.build_structured_output()
            assert isinstance(result, Data)
            assert result.data == {"field": "value"}

    # Raises TypeError when the language model does not support structured output
    def test_raises_value_error_for_unsupported_language_model(self):
        # Mocking an incompatible language model
        class MockLanguageModel:
            pass

        # Creating an instance of StructuredOutputComponent
        component = StructuredOutputComponent(
            llm=MockLanguageModel(),
            input_value="Test input",
            schema_name="TestSchema",
            output_schema=[{"name": "field", "type": "str", "description": "A test field"}],
            multiple=False,
        )

        with pytest.raises(TypeError, match="Language model does not support structured output."):
            component.build_structured_output()

    # Correctly builds the output model from the provided schema
    def test_correctly_builds_output_model(self):
        from langflow.helpers.base_model import build_model_from_schema
        from langflow.inputs.inputs import TableInput

        # Setup
        component = StructuredOutputComponent()
        schema = [
            {
                "name": "name",
                "display_name": "Name",
                "type": "str",
                "description": "Specify the name of the output field.",
            },
            {
                "name": "description",
                "display_name": "Description",
                "type": "str",
                "description": "Describe the purpose of the output field.",
            },
            {
                "name": "type",
                "display_name": "Type",
                "type": "str",
                "description": (
                    "Indicate the data type of the output field (e.g., str, int, float, bool, list, dict)."
                ),
            },
            {
                "name": "multiple",
                "display_name": "Multiple",
                "type": "boolean",
                "description": "Set to True if this output field should be a list of the specified type.",
            },
        ]
        component.output_schema = TableInput(name="output_schema", display_name="Output Schema", table_schema=schema)

        # Assertion
        output_model = build_model_from_schema(schema)
        assert isinstance(output_model, type)

    # Properly handles multiple outputs when 'multiple' is set to True
    def test_handles_multiple_outputs(self):
        from langflow.helpers.base_model import build_model_from_schema
        from langflow.inputs.inputs import TableInput

        # Setup
        component = StructuredOutputComponent()
        schema = [
            {
                "name": "name",
                "display_name": "Name",
                "type": "str",
                "description": "Specify the name of the output field.",
            },
            {
                "name": "description",
                "display_name": "Description",
                "type": "str",
                "description": "Describe the purpose of the output field.",
            },
            {
                "name": "type",
                "display_name": "Type",
                "type": "str",
                "description": (
                    "Indicate the data type of the output field (e.g., str, int, float, bool, list, dict)."
                ),
            },
            {
                "name": "multiple",
                "display_name": "Multiple",
                "type": "boolean",
                "description": "Set to True if this output field should be a list of the specified type.",
            },
        ]
        component.output_schema = TableInput(name="output_schema", display_name="Output Schema", table_schema=schema)
        component.multiple = True

        # Assertion
        output_model = build_model_from_schema(schema)
        assert isinstance(output_model, type)

    def test_empty_output_schema(self):
        component = StructuredOutputComponent(
            llm=MagicMock(),
            input_value="Test input",
            schema_name="EmptySchema",
            output_schema=[],
            multiple=False,
        )

        with pytest.raises(ValueError, match="Output schema cannot be empty"):
            component.build_structured_output()

    def test_invalid_output_schema_type(self):
        component = StructuredOutputComponent(
            llm=MagicMock(),
            input_value="Test input",
            schema_name="InvalidSchema",
            output_schema=[{"name": "field", "type": "invalid_type", "description": "Invalid field"}],
            multiple=False,
        )

        with pytest.raises(ValueError, match="Invalid type: invalid_type"):
            component.build_structured_output()

    @patch("langflow.components.helpers.structured_output.get_chat_result")
    def test_nested_output_schema(self, mock_get_chat_result):
        class ChildModel(BaseModel):
            child: str = "value"

        class ParentModel(BaseModel):
            parent: ChildModel = ChildModel()

        mock_llm = MagicMock()
        mock_llm.with_structured_output.return_value = mock_llm
        mock_get_chat_result.return_value = ParentModel(parent=ChildModel(child="value"))

        component = StructuredOutputComponent(
            llm=mock_llm,
            input_value="Test input",
            schema_name="NestedSchema",
            output_schema=[
                {
                    "name": "parent",
                    "type": "dict",
                    "description": "Parent field",
                    "fields": [{"name": "child", "type": "str", "description": "Child field"}],
                }
            ],
            multiple=False,
        )

        result = component.build_structured_output()
        assert isinstance(result, Data)
        assert result.data == {"parent": {"child": "value"}}

    @patch("langflow.components.helpers.structured_output.get_chat_result")
    def test_large_input_value(self, mock_get_chat_result):
        large_input = "Test input " * 1000

        class MockBaseModel(BaseModel):
            field: str = "value"

        mock_get_chat_result.return_value = MockBaseModel(field="value")

        component = StructuredOutputComponent(
            llm=MagicMock(),
            input_value=large_input,
            schema_name="LargeInputSchema",
            output_schema=[{"name": "field", "type": "str", "description": "A test field"}],
            multiple=False,
        )

        result = component.build_structured_output()
        assert isinstance(result, Data)
        assert result.data == {"field": "value"}
        mock_get_chat_result.assert_called_once()

    def test_invalid_llm_config(self):
        component = StructuredOutputComponent(
            llm="invalid_llm",  # Not a proper LLM instance
            input_value="Test input",
            schema_name="InvalidLLMSchema",
            output_schema=[{"name": "field", "type": "str", "description": "A test field"}],
            multiple=False,
        )

        with pytest.raises(TypeError, match="Language model does not support structured output."):
            component.build_structured_output()

View file

@@ -0,0 +1,160 @@
# Generated by qodo Gen
from typing import Any

import pytest
from pydantic import BaseModel
from pydantic_core import PydanticUndefined

from langflow.helpers.base_model import build_model_from_schema


class TestBuildModelFromSchema:
    # Successfully creates a Pydantic model from a valid schema
    def test_create_model_from_valid_schema(self):
        schema = [
            {"name": "field1", "type": "str", "default": "default_value", "description": "A string field"},
            {"name": "field2", "type": "int", "default": 0, "description": "An integer field"},
            {"name": "field3", "type": "bool", "default": False, "description": "A boolean field"},
        ]
        model = build_model_from_schema(schema)
        instance = model(field1="test", field2=123, field3=True)
        assert instance.field1 == "test"
        assert instance.field2 == 123
        assert instance.field3 is True

    # Handles empty schema gracefully without errors
    def test_handle_empty_schema(self):
        schema = []
        model = build_model_from_schema(schema)
        instance = model()
        assert instance is not None

    # Ensure the model created from schema has the expected attributes by checking on an instance
    def test_handles_multiple_fields_fixed_with_instance_check(self):
        schema = [
            {"name": "field1", "type": "str", "default": "default_value1"},
            {"name": "field2", "type": "int", "default": 42},
            {"name": "field3", "type": "list", "default": [1, 2, 3]},
            {"name": "field4", "type": "dict", "default": {"key": "value"}},
        ]
        model = build_model_from_schema(schema)
        model_instance = model(field1="test", field2=123, field3=[1, 2, 3], field4={"key": "value"})
        assert issubclass(model, BaseModel)
        assert hasattr(model_instance, "field1")
        assert hasattr(model_instance, "field2")
        assert hasattr(model_instance, "field3")
        assert hasattr(model_instance, "field4")

    # Correctly accesses descriptions using the recommended fix
    def test_correctly_accesses_descriptions_recommended_fix(self):
        schema = [
            {"name": "field1", "type": "str", "default": "default_value1", "description": "Description for field1"},
            {"name": "field2", "type": "int", "default": 42, "description": "Description for field2"},
            {"name": "field3", "type": "list", "default": [1, 2, 3], "description": "Description for field3"},
            {"name": "field4", "type": "dict", "default": {"key": "value"}, "description": "Description for field4"},
        ]
        model = build_model_from_schema(schema)
        assert model.model_fields["field1"].description == "Description for field1"
        assert model.model_fields["field2"].description == "Description for field2"
        assert model.model_fields["field3"].description == "Description for field3"
        assert model.model_fields["field4"].description == "Description for field4"

    # Supports both single and multiple type annotations
    def test_supports_single_and_multiple_type_annotations(self):
        schema = [
            {"name": "field1", "type": "str", "default": "default_value1", "description": "Description 1"},
            {"name": "field2", "type": "list", "default": [1, 2, 3], "description": "Description 2", "multiple": True},
            {"name": "field3", "type": "int", "default": 100, "description": "Description 3"},
        ]
        model_type = build_model_from_schema(schema)
        assert issubclass(model_type, BaseModel)

    # Raises ValueError for unknown field types instead of defaulting to Any
    def test_manages_unknown_field_types(self):
        schema = [
            {"name": "field1", "type": "str", "default": "default_value1"},
            {"name": "field2", "type": "unknown_type", "default": "default_value2"},
        ]
        with pytest.raises(ValueError):
            build_model_from_schema(schema)

    # Confirms that the function raises a specific exception for invalid input
    def test_raises_error_for_invalid_input_different_exception_with_specific_exception(self):
        with pytest.raises(ValueError):
            schema = [{"name": "field1", "type": "invalid_type", "default": "default_value"}]
            build_model_from_schema(schema)

    # Processes schemas with missing optional keys like description or multiple
    def test_process_schema_missing_optional_keys_updated(self):
        schema = [
            {"name": "field1", "type": "str", "default": "default_value1"},
            {"name": "field2", "type": "int", "default": 0, "description": "Field 2 description"},
            {"name": "field3", "type": "list", "default": [], "multiple": True},
            {"name": "field4", "type": "dict", "default": {}, "description": "Field 4 description", "multiple": True},
        ]
        result_model = build_model_from_schema(schema)
        assert result_model.__annotations__["field1"] == str  # noqa: E721
        assert result_model.model_fields["field1"].description == ""
        assert result_model.__annotations__["field2"] == int  # noqa: E721
        assert result_model.model_fields["field2"].description == "Field 2 description"
        assert result_model.__annotations__["field3"] == list[list[Any]]
        assert result_model.model_fields["field3"].description == ""
        assert result_model.__annotations__["field4"] == list[dict[str, Any]]
        assert result_model.model_fields["field4"].description == "Field 4 description"

    # Deals with schemas containing fields with None as default values
    def test_schema_fields_with_none_default(self):
        schema = [
            {"name": "field1", "type": "str", "default": None, "description": "Field 1 description"},
            {"name": "field2", "type": "int", "default": None, "description": "Field 2 description"},
            {"name": "field3", "type": "list", "default": None, "description": "Field 3 description", "multiple": True},
        ]
        model = build_model_from_schema(schema)
        assert model.model_fields["field1"].default is PydanticUndefined
        assert model.model_fields["field2"].default is PydanticUndefined
        assert model.model_fields["field3"].default is PydanticUndefined

    # Checks for proper handling of nested list and dict types
    def test_nested_list_and_dict_types_handling(self):
        schema = [
            {"name": "field1", "type": "list", "default": [], "description": "list field", "multiple": True},
            {"name": "field2", "type": "dict", "default": {}, "description": "Dict field"},
        ]
        model_type = build_model_from_schema(schema)
        assert issubclass(model_type, BaseModel)

    # Verifies that the function can handle large schemas efficiently
    def test_handle_large_schemas_efficiently(self):
        schema = [
            {"name": "field1", "type": "str", "default": "default_value1", "description": "Description 1"},
            {"name": "field2", "type": "int", "default": 100, "description": "Description 2"},
            {"name": "field3", "type": "list", "default": [1, 2, 3], "description": "Description 3", "multiple": True},
            {"name": "field4", "type": "dict", "default": {"key": "value"}, "description": "Description 4"},
        ]
        model_type = build_model_from_schema(schema)
        assert issubclass(model_type, BaseModel)

    # Ensures that the function returns a valid Pydantic model class
    def test_returns_valid_model_class(self):
        schema = [
            {"name": "field1", "type": "str", "default": "default_value1", "description": "Description for field1"},
            {"name": "field2", "type": "int", "default": 42, "description": "Description for field2", "multiple": True},
        ]
        model_class = build_model_from_schema(schema)
        assert issubclass(model_class, BaseModel)

    # Validates that the last occurrence of a duplicate field name defines the type in the schema
    def test_no_duplicate_field_names_fixed_fixed(self):
        schema = [
            {"name": "field1", "type": "str", "default": "default_value1"},
            {"name": "field2", "type": "int", "default": 0},
            {"name": "field1", "type": "float", "default": 0.0},  # Duplicate field name
        ]
        model = build_model_from_schema(schema)
        assert model.__annotations__["field1"] == float  # noqa: E721
        assert model.__annotations__["field2"] == int  # noqa: E721