With PyPDFLoader, you can load a PDF file using pypdf and split it into chunks at the character level.
You can read more about the PyPDFLoader{.internal-link target=_blank} in the LangChain documentation.
⛓️LangFlow example
Download Flow{: .md-button download="Py_pdf_loader"}
File path:
Download PDF{: .md-button download="example.pdf"}
CharacterTextSplitter implements splitting text based on characters.
Text splitters operate as follows:
- Split the text into small, meaningful chunks (usually sentences).
- Combine these small chunks into larger ones until they reach a certain size (measured by a function).
- Once a chunk reaches the desired size, make it its own piece of text and start a new chunk with some overlap to maintain context.
Separator used:
.
Chunk size used:
2000
Chunk overlap used:
200
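The splitting steps described above can be sketched in plain Python. This is a minimal illustration, not LangChain's implementation: the function name `split_text` and its parameter names are assumptions made for this example, with defaults matching the separator, chunk size, and chunk overlap used in the flow.

```python
def split_text(text, separator=".", chunk_size=2000, chunk_overlap=200):
    """Greedy character-based splitter (illustrative sketch only)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    # Step 1: split the text into small pieces on the separator.
    pieces = [p.strip() for p in text.split(separator) if p.strip()]

    chunks = []
    current = []  # pieces that make up the chunk being built

    for piece in pieces:
        # Step 2: keep combining pieces until the next one would overflow.
        if current and len(separator.join(current + [piece])) > chunk_size:
            # Step 3: emit the finished chunk ...
            chunks.append(separator.join(current))
            # ... and carry trailing pieces over as overlap, dropping pieces
            # from the front until the overlap budget and chunk size are met.
            while current and (
                len(separator.join(current)) > chunk_overlap
                or len(separator.join(current + [piece])) > chunk_size
            ):
                current.pop(0)
        # Note: a single piece longer than chunk_size is kept whole.
        current.append(piece)

    if current:
        chunks.append(separator.join(current))
    return chunks


# Small values make the overlap easy to see.
print(split_text("aaaa.bbbb.cccc.dddd", chunk_size=10, chunk_overlap=5))
```

With the small values in the usage line, each chunk shares one piece with its neighbor, which is the role the overlap of 200 characters plays in the flow.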
OpenAIEmbeddings is a wrapper around OpenAI Embeddings{.internal-link target=_blank} models. Make sure to get the API key from the LLM provider, in this case OpenAI{.internal-link target=_blank}.
Chroma vector databases can be used, through a wrapper, as vector stores for semantic search or for selecting examples.
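To make the idea of semantic search over a vector store concrete, here is a toy sketch in plain Python. It does not use Chroma or OpenAI: the `embed` function (simple word counts) stands in for a real embedding model, and `ToyVectorStore` is a hypothetical class invented for this example. Only the overall pattern (embed the texts, embed the query, rank by similarity) reflects how a vector store is used.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store: keeps each text next to its vector."""

    def __init__(self, texts):
        self.entries = [(t, embed(t)) for t in texts]

    def similarity_search(self, query, k=1):
        # Embed the query, then return the k texts most similar to it.
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore([
    "pdf files are split into chunks",
    "embeddings map text to vectors",
    "agents call tools",
])
print(store.similarity_search("how are pdf files split", k=1))
```

A real embedding model captures meaning rather than exact word overlap, so it can match a query to a chunk that shares no words with it; the word-count stand-in here only keeps the example self-contained.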
A VectorStoreInfo sets information about the vector store, such as its name and description.
Name used:
example
Description used:
USENIX Example Paper.
For the example, we used OpenAI as the LLM, but you can use any LLM that has an API. Make sure to get the API key from the LLM provider. For example, OpenAI{.internal-link target=_blank} requires you to create an account to get your API key.
Check out the OpenAI{.internal-link target=_blank} documentation to learn more about the API and the options available in the node.
The VectorStoreAgent is an agent designed to retrieve information from one or more vector stores, either with or without sources.



