vectordb

Low-Level Vector Database Connection Management.

This module provides low-level functionality for managing connections to a vector database. It handles the connection to, and communication with, the database so that users can perform vectorization operations.

Warning

Currently only Milvus is supported.

Attributes

VectorTrimStrategyType module-attribute

VectorTrimStrategyType = Literal['chunk', 'truncate']

When performing a bulk vectorization operation, there are multiple strategies for trimming text that is too long:

  • chunk: Split the text into chunks that fit into the embedding.
  • truncate: Truncate the text to fit into the embedding.
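
The difference between the two strategies can be illustrated with a minimal sketch. The helper `trim_text` and its `max_chars` parameter are hypothetical and not part of this module; a character limit is used only as a simple stand-in for whatever limit the embedding actually imposes.

```python
from typing import List, Literal

VectorTrimStrategyType = Literal["chunk", "truncate"]


def trim_text(
    text: str, max_chars: int, strategy: VectorTrimStrategyType
) -> List[str]:
    """Hypothetical helper: return the piece(s) of `text` to embed.

    Assumption: the limit is a character count. The real module may
    measure length differently (for example, in tokens).
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if strategy == "truncate":
        # Keep only the leading part that fits; the remainder is dropped.
        return [text[:max_chars]]
    if strategy == "chunk":
        # Split into consecutive pieces so that no text is lost.
        pieces = [text[i : i + max_chars] for i in range(0, len(text), max_chars)]
        return pieces or [""]
    raise ValueError(f"unknown trim strategy: {strategy!r}")


truncated = trim_text("abcdefghij", 4, "truncate")
chunked = trim_text("abcdefghij", 4, "chunk")
```

With `truncate` a single shortened text is kept, while `chunk` yields several pieces from one input, so one text may correspond to more than one vector.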

Classes

MemoryVectorMetadata

Bases: BaseDataModel

Memory metadata associated with the vector store.

Attributes

collection_pkid instance-attribute
collection_pkid: str
memory_type instance-attribute
memory_type: str
pkid class-attribute instance-attribute
pkid: str = Field('', description='Memory RDBMS pkid')

VectorizeError

Bases: Exception

Exception raised when an error occurs during vectorization.

Functions

bulk_vectorize

bulk_vectorize(
    texts: List[str],
    metadatas: Union[
        List[Dict], List[MemoryVectorMetadata]
    ],
    ids: List[str],
    vectordb_ref: VectorStore,
    trim_strategy: VectorTrimStrategyType | None = None,
) -> Future

Vectorizes a collection of texts in bulk.

Parameters:

  • texts (List[str]) –

    The list of texts to be vectorized.

  • metadatas (Union[List[Dict], List[MemoryVectorMetadata]]) –

    The list of metadata associated with each text.

  • ids (List[str]) –

    The list of IDs corresponding to each text.

  • vectordb_ref (VectorStore) –

    The reference to the vector store.

  • trim_strategy (VectorTrimStrategyType | None, default: None) –

    The strategy used to trim texts that are too long, either 'chunk' or 'truncate'. See VectorTrimStrategyType.

Returns:

  • Future –

    A future object representing the asynchronous computation of vectorizing the collection.
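
As a rough illustration of the calling pattern, the sketch below submits work to a thread pool and waits on the returned future. It assumes the returned object behaves like a `concurrent.futures.Future`, which this page does not confirm. `fake_worker` and `bulk_vectorize_sketch` are hypothetical stand-ins; no real vector store is involved.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict, List

# Assumption: a single background thread is enough for this illustration.
_executor = ThreadPoolExecutor(max_workers=1)


def fake_worker(texts: List[str], metadatas: List[Dict], ids: List[str]) -> List[str]:
    """Hypothetical worker: pretends to store vectors and returns the ids."""
    return list(ids)


def bulk_vectorize_sketch(
    texts: List[str], metadatas: List[Dict], ids: List[str]
) -> Future:
    """Hypothetical stand-in: schedule the worker and return immediately."""
    return _executor.submit(fake_worker, texts, metadatas, ids)


future = bulk_vectorize_sketch(["a", "b"], [{}, {}], ["id-1", "id-2"])
# The caller can continue with other work and collect the result later.
result = future.result(timeout=5)
```

Calling `result()` blocks until the work completes and re-raises any exception raised in the background task, so errors should be handled at that point.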

normalize_collection_name

normalize_collection_name(*args) -> str

Normalize the collection name by joining the arguments with ‘__’, replacing whitespace with ‘_’, and removing any characters that are not alphanumeric, hyphen, period, or underscore.

Parameters:

  • *args

    Variable number of arguments representing the collection name parts.

Returns:

  • str –

    The normalized collection name.
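
The described behavior can be approximated with two regular-expression substitutions. This is a sketch written from the description above, not the module's actual code. It assumes that each whitespace character becomes its own underscore and that "alphanumeric" means ASCII letters and digits; the name `normalize_collection_name_sketch` is hypothetical.

```python
import re


def normalize_collection_name_sketch(*args) -> str:
    """Hypothetical re-implementation based on the documented behavior."""
    # Join the parts with a double underscore.
    joined = "__".join(str(part) for part in args)
    # Assumption: every whitespace character maps to one underscore.
    underscored = re.sub(r"\s", "_", joined)
    # Assumption: only ASCII letters and digits count as alphanumeric.
    return re.sub(r"[^A-Za-z0-9._-]", "", underscored)


name = normalize_collection_name_sketch("My Team", "docs/v1.0")
```

Here `name` is `My_Team__docsv1.0`: the space becomes an underscore and the slash is removed.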

vectorize_collection_worker

vectorize_collection_worker(
    texts: List[str],
    metadatas: Union[
        List[Dict], List[MemoryVectorMetadata]
    ],
    ids: List[str],
    vectordb_ref: VectorStore,
    trim_strategy: VectorTrimStrategyType | None = None,
    **kwargs
) -> List[str]

Vectorizes a collection of texts using the specified VectorStore.

Texts that are too long to fit into the vectorizer are chunked automatically.

Parameters:

  • texts (List[str]) –

    The list of texts to be vectorized.

  • metadatas (Union[List[Dict], List[MemoryVectorMetadata]]) –

    The list of metadata associated with each text. If an element in the list is an instance of MemoryVectorMetadata, its model_dump() method will be called to obtain the metadata dictionary.

  • ids (List[str]) –

    The list of IDs associated with each text.

  • vectordb_ref (VectorStore) –

    The VectorStore object used for vectorization.

  • trim_strategy (VectorTrimStrategyType | None, default: None) –

    The strategy used to trim texts that are too long, either 'chunk' or 'truncate'. See VectorTrimStrategyType.

  • **kwargs

    Additional keyword arguments.

Returns:

  • List[str] –

    The list of vector IDs generated by the VectorStore.

Raises:

  • ValueError

    If the lengths of texts, metadatas, and ids are not equal.
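
The input handling described above (the length check and the conversion of metadata objects to dictionaries) might look roughly like the following. `FakeMetadata` is a hypothetical dataclass stand-in for MemoryVectorMetadata, and `prepare_metadatas` is a hypothetical helper; the real worker may validate and convert its inputs differently.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Union


@dataclass
class FakeMetadata:
    """Hypothetical stand-in for MemoryVectorMetadata (not the real class)."""

    collection_pkid: str
    memory_type: str
    pkid: str = ""

    def model_dump(self) -> Dict[str, Any]:
        # Mimics the model_dump() method mentioned in the parameter docs.
        return asdict(self)


def prepare_metadatas(
    texts: List[str],
    metadatas: List[Union[Dict[str, Any], FakeMetadata]],
    ids: List[str],
) -> List[Dict[str, Any]]:
    """Hypothetical helper: validate lengths and return plain dictionaries."""
    if not (len(texts) == len(metadatas) == len(ids)):
        raise ValueError("texts, metadatas, and ids must have the same length")
    # Dictionaries pass through unchanged; model objects are dumped to dicts.
    return [m if isinstance(m, dict) else m.model_dump() for m in metadatas]


prepared = prepare_metadatas(
    ["some text", "more text"],
    [FakeMetadata("c1", "note"), {"source": "manual"}],
    ["id-1", "id-2"],
)
```

Checking the lengths before any conversion means mismatched inputs fail early, before anything is sent to the vector store.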