🐛 fix(loading.py): refactor instantiate_textsplitter function to improve readability and remove unnecessary code

The condition that decides how the `text_splitter` object is built has been simplified, and `separator_type` is now removed from the `params` dictionary, if present, before the splitter class is instantiated directly. In the language-specific branch, the `separator_type` value is now converted to a member of the `Language` enum from the `langchain.text_splitter` module and passed as `language` to `from_language`, instead of being passed through as a raw value. This ensures the `text_splitter` object is constructed with the arguments it expects before `split_documents` is called.
Gabriel Luiz Freitas Almeida 2023-06-28 18:16:13 -03:00
commit 36212884e4

@@ -221,14 +221,17 @@ def instantiate_textsplitter(
         ) from exc
     if (
-        "separator_type" in params
-        and params["separator_type"] == "Text"
-        or "separator_type" not in params
-    ):
+        "separator_type" in params and params["separator_type"] == "Text"
+    ) or "separator_type" not in params:
+        params.pop("separator_type", None)
         text_splitter = class_object(**params)
     else:
-        params["language"] = params.pop("separator_type", None)
+        from langchain.text_splitter import Language
+        language = params.pop("separator_type", None)
+        params["language"] = Language(language)
         params.pop("separators", None)
         text_splitter = class_object.from_language(**params)
     return text_splitter.split_documents(documents)
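
The branching introduced by this commit can be tried in isolation with the sketch below. It is not the project's code: `FakeSplitter`, `build_splitter`, and the two-member `Language` enum are hypothetical stand-ins, used so the example runs without `langchain` installed. It assumes only that the real `Language` is a string-valued enum, so that `Language("python")` looks a member up by value. The sketch also folds the two-part condition into a single `pop` with a default of `"Text"`, which should behave the same way as the condition in the diff.

```python
from enum import Enum


class Language(str, Enum):
    """Stand-in for langchain.text_splitter.Language (assumed string-valued)."""

    PYTHON = "python"
    MARKDOWN = "markdown"


class FakeSplitter:
    """Hypothetical splitter that only records how it was constructed."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs
        self.built_via = "constructor"

    @classmethod
    def from_language(cls, language, **kwargs):
        splitter = cls(language=language, **kwargs)
        splitter.built_via = "from_language"
        return splitter


def build_splitter(class_object, params):
    """Mirror the branch in the diff, on a copy of params."""
    params = dict(params)  # avoid mutating the caller's dictionary
    # A missing key is treated like "Text", matching the original condition.
    separator_type = params.pop("separator_type", "Text")
    if separator_type == "Text":
        return class_object(**params)
    # Lookup by value; raises ValueError for an unknown language string.
    params["language"] = Language(separator_type)
    params.pop("separators", None)
    return class_object.from_language(**params)


plain = build_splitter(FakeSplitter, {"separator_type": "Text", "chunk_size": 100})
coded = build_splitter(
    FakeSplitter,
    {"separator_type": "python", "separators": ["\n"], "chunk_size": 50},
)
print(plain.built_via, plain.kwargs)
print(coded.built_via, coded.kwargs)
```

Unlike the code in the diff, the sketch copies `params` before popping keys, which keeps the caller's dictionary unchanged; whether that matters in `instantiate_textsplitter` depends on how `params` is used afterwards.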