corpus_reader_settings

Corpus reader settings module

Classes

CorpusReaderSettings

Bases: BaseDataModel

Corpus reader configuration settings

Attributes:

source (str) – Source Identifier

default_namespace (str) – Default namespace

default_user_name (str) – Default User

default_agent_name (str) – Default Agent

system_message_map (Dict[str, str]) – System Message Map

new_collection_vectordb (str) – New Collection VectorDB Name

session_defaults (SessionResourceSettings) – Session Defaults

splitter (SplitterSettings) – Splitter Settings

dynamic_session (DynamicSessionSettings) – Dynamic Session Settings

resume_info_path (str | None) – Resume Info Path

unbind_other_collections (bool) – Unbind Other Collections

chunk_retry_limit (int) – Chunk Retry Limit

Attributes

chunk_retry_limit `class-attribute` `instance-attribute`

chunk_retry_limit: int = Field(
    default=3,
    title="Chunk Retry Limit",
    description="Number of times to retry a chunk operation that fails to process",
)
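
The retry behaviour this setting controls can be sketched roughly as follows. This is an illustration only, not the framework's implementation: `process_with_retries` and `flaky_chunk_operation` are hypothetical names, and the sketch assumes the limit counts retries after the first attempt, so the operation runs at most `chunk_retry_limit + 1` times.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def process_with_retries(operation: Callable[[], T], chunk_retry_limit: int = 3) -> T:
    """Run `operation`, retrying up to `chunk_retry_limit` times after a failure.

    Hypothetical helper. Assumption: the limit counts retries, so the
    operation is attempted at most `chunk_retry_limit + 1` times in total.
    """
    retries_used = 0
    while True:
        try:
            return operation()
        except Exception:
            if retries_used >= chunk_retry_limit:
                raise  # Retries exhausted: surface the last error.
            retries_used += 1


# Usage: a stand-in chunk operation that fails twice before succeeding.
calls = {"count": 0}


def flaky_chunk_operation() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return "processed"


result = process_with_retries(flaky_chunk_operation, chunk_retry_limit=3)
```

Whether the real reader counts the initial attempt toward the limit is not stated in this reference, so treat the off-by-one choice above as an assumption.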

default_agent_name `class-attribute` `instance-attribute`

default_agent_name: str = Field(
    ...,
    title="Default Agent",
    description="Agent that is selected when one is not provided via the CLI",
)

default_namespace `class-attribute` `instance-attribute`

default_namespace: str = Field(
    "default",
    title="Default namespace",
    description="Namespace that is selected when one is not provided via the CLI",
)

default_user_name `class-attribute` `instance-attribute`

default_user_name: str = Field(
    ...,
    title="Default User",
    description="User that is selected when one is not provided via the CLI",
)

dynamic_session `class-attribute` `instance-attribute`

dynamic_session: DynamicSessionSettings = Field(
    default_factory=DynamicSessionSettings,
    title="Dynamic Session Settings",
    description=__doc__,
)

new_collection_vectordb `class-attribute` `instance-attribute`

new_collection_vectordb: str = Field(
    default="default",
    title="New Collection VectorDB Name",
    description="When auto-creating new memory collections, this is the name/reference to the vectordb configuration. See settings.yaml for a list of available VectorDB configurations.",
)

resume_info_path `class-attribute` `instance-attribute`

resume_info_path: str | None = Field(
    default=None,
    title="Resume Info Path",
    description="Path to a file where resume information is written when an error is raised during the corpus read process.",
)

session_defaults `class-attribute` `instance-attribute`

session_defaults: SessionResourceSettings = Field(
    default_factory=SessionResourceSettings,
    title="Session Defaults",
    description="Session settings to use for corpus reading operations",
)

source `class-attribute` `instance-attribute`

source: str = Field(
    ...,
    title="Source Identifier",
    description="A freeform string that is added to the Eleanor Framework header to identify the source system when using the chat API",
)

splitter `class-attribute` `instance-attribute`

splitter: SplitterSettings = Field(
    default_factory=SplitterSettings,
    title="Splitter Settings",
    description=__doc__,
)

system_message_map `class-attribute` `instance-attribute`

system_message_map: Dict[str, str] = Field(
    default_factory=dict,
    title="System Message Map",
    description="Top-level system message to use. This dictionary is keyed by namespace name, which by convention is the same as the model name on the Eleanor Framework chat API. This allows users to map a specific system message to each LLM.",
)
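
A minimal sketch of how a per-namespace lookup against this map might work. `resolve_system_message` and the namespace names are hypothetical, and returning `None` for an unmapped namespace is an assumption rather than documented behaviour.

```python
from typing import Dict, Optional


def resolve_system_message(
    system_message_map: Dict[str, str], namespace: str
) -> Optional[str]:
    """Return the system message mapped to `namespace`, or None if unmapped."""
    return system_message_map.get(namespace)


# Hypothetical namespaces; by convention they would match model names.
system_message_map = {
    "model-a": "You are a concise assistant.",
    "model-b": "You are a detailed technical assistant.",
}

message_a = resolve_system_message(system_message_map, "model-a")
message_missing = resolve_system_message(system_message_map, "model-c")
```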

unbind_other_collections `class-attribute` `instance-attribute`

unbind_other_collections: bool = Field(
    default=False,
    title="Unbind Other Collections",
    description="When true, all other memory collections will be unbound from the agent to prevent conceptual cross-over. This may or may not be desirable depending on the use case.",
)

DynamicSessionSettings

Bases: BaseDataModel

Dynamic session settings configuration.

Attributes:

enabled (bool) – Flag that when true enables dynamic session creation

score_dividend (int) – Score dividend

spacy_model (str) – SpaCy model to use when determining chunk information density

Attributes

enabled `class-attribute` `instance-attribute`

enabled: bool = Field(
    default=True, title="Enabled", description=""
)

score_dividend `class-attribute` `instance-attribute`

score_dividend: int = Field(
    default=10, gt=0, title="Score Dividend", description=""
)

spacy_model `class-attribute` `instance-attribute`

spacy_model: str = Field(
    default="",
    title="SpaCy Model",
    description="SpaCy model to use when determining chunk information density",
)

SplitterSettings

Bases: BaseDataModel

Corpus splitter configuration settings

Attributes:

sample_packing_enabled (bool) – Flag that when true enables sample packing

sample_packing_tokenizer (str) – Reference to an LLM tokenizer to use for sample packing. This tokenizer must be attached to an LLM that is loaded by the framework. Required whenever sample packing is enabled.

splitter_kwargs (KwargsModel) – Additional keyword arguments to pass to the splitter when creating new sessions. This is applicable whether or not sample packing is enabled.

Attributes

sample_packing_enabled `class-attribute` `instance-attribute`

sample_packing_enabled: bool = Field(
    default=False,
    title="Sample Packing Enabled",
    description="Flag that when true enables sample packing",
)

sample_packing_tokenizer `class-attribute` `instance-attribute`

sample_packing_tokenizer: str = Field(
    default="",
    title="Sample Packing Tokenizer",
    description="Reference to an LLM tokenizer to use for sample packing. This tokenizer must be attached to an LLM that is loaded by the framework. Required whenever sample packing is enabled.",
)

splitter_kwargs `class-attribute` `instance-attribute`

splitter_kwargs: KwargsModel = Field(
    default_factory=KwargsModel,
    title="Splitter Kwargs",
    description="Additional keyword arguments to pass to the splitter when creating new sessions. This is applicable whether or not sample packing is enabled.",
)
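
The rule that a tokenizer is required whenever sample packing is enabled can be illustrated with a small stand-in. `SplitterSettingsSketch` is a hypothetical dataclass that mirrors the documented fields and defaults; it is not the real `SplitterSettings`, and this reference does not say whether the real model enforces the rule at construction time.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class SplitterSettingsSketch:
    """Hypothetical stand-in mirroring the documented fields and defaults."""

    sample_packing_enabled: bool = False
    sample_packing_tokenizer: str = ""
    splitter_kwargs: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Assumption: an empty tokenizer reference is invalid once packing is on.
        if self.sample_packing_enabled and not self.sample_packing_tokenizer:
            raise ValueError(
                "sample_packing_tokenizer is required when "
                "sample_packing_enabled is true"
            )


# Defaults are valid because sample packing is disabled.
default_settings = SplitterSettingsSketch()

# Enabling packing with a tokenizer reference (hypothetical name) is valid.
packed_settings = SplitterSettingsSketch(
    sample_packing_enabled=True,
    sample_packing_tokenizer="my-tokenizer",
)
```

Enabling sample packing while leaving the tokenizer empty raises `ValueError` in this sketch, which surfaces the misconfiguration before any corpus reading starts.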
