corpus_reader_settings

Corpus reader settings module

Classes

CorpusReaderSettings

Bases: BaseDataModel

Corpus reader configuration settings

Attributes

chunk_retry_limit class-attribute instance-attribute
chunk_retry_limit: int = Field(
    default=3,
    title="Chunk Retry Limit",
    description="Number of times to retry a chunk operation that fails to process",
)
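
The retry semantics described above can be illustrated with a small stdlib-only sketch. It assumes that `chunk_retry_limit` counts retries after the first failed attempt, so a limit of 3 allows up to four attempts in total; the source does not confirm this. The helper name `process_with_retries` and the `flaky_chunk` operation are hypothetical and not part of the module.

```python
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


def process_with_retries(operation: Callable[[], T], chunk_retry_limit: int = 3) -> T:
    """Run `operation`, retrying up to `chunk_retry_limit` times after a failure.

    Hypothetical helper: the real corpus reader may count attempts differently.
    """
    retries_used = 0
    while True:
        try:
            return operation()
        except Exception:
            # Give up and re-raise once the retry budget is exhausted.
            if retries_used >= chunk_retry_limit:
                raise
            retries_used += 1


# Hypothetical chunk operation that fails twice before succeeding.
calls: Dict[str, int] = {"count": 0}


def flaky_chunk() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return "processed"


result = process_with_retries(flaky_chunk, chunk_retry_limit=3)
```

Under this reading, a setting of `0` would disable retries entirely and surface the first error.
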
default_agent_name class-attribute instance-attribute
default_agent_name: str = Field(
    ...,
    title="Default Agent",
    description="Agent that is selected when one is not provided via the CLI",
)
default_namespace class-attribute instance-attribute
default_namespace: str = Field(
    "default",
    title="Default namespace",
    description="Namespace that is selected when one is not provided via the CLI",
)
default_user_name class-attribute instance-attribute
default_user_name: str = Field(
    ...,
    title="Default User",
    description="User that is selected when one is not provided via the CLI",
)
dynamic_session class-attribute instance-attribute
dynamic_session: DynamicSessionSettings = Field(
    default_factory=DynamicSessionSettings,
    title="Dynamic Session Settings",
    description=__doc__,
)
new_collection_vectordb class-attribute instance-attribute
new_collection_vectordb: str = Field(
    default="default",
    title="New Collection VectorDB Name",
    description="When auto-creating new memory collections, this is the name/reference to the vectordb configuration. See settings.yaml for a list of available VectorDB configurations.",
)
resume_info_path class-attribute instance-attribute
resume_info_path: str | None = Field(
    default=None,
    title="Resume Info Path",
    description="Path to a file where resume information is written when an error is raised during the corpus read process.",
)
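
As a rough illustration of how a resume file might be produced, the sketch below writes a small JSON document when processing a chunk raises. The JSON layout (`failed_chunk_index`, `error`) and the function `read_corpus` are assumptions made for this example; the source does not describe the actual resume file format.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Iterable, Optional


def read_corpus(
    chunks: Iterable[str],
    process: Callable[[str], None],
    resume_info_path: Optional[str] = None,
) -> None:
    """Process chunks in order; on error, record where to resume, then re-raise."""
    for index, chunk in enumerate(chunks):
        try:
            process(chunk)
        except Exception as exc:
            # Only write resume information when a path is configured.
            if resume_info_path is not None:
                info = {"failed_chunk_index": index, "error": str(exc)}
                Path(resume_info_path).write_text(json.dumps(info), encoding="utf-8")
            raise


# Hypothetical processor that fails on the second chunk.
def fail_on_second(chunk: str) -> None:
    if chunk == "b":
        raise ValueError("cannot process chunk")


with tempfile.TemporaryDirectory() as tmp:
    path = str(Path(tmp) / "resume.json")
    try:
        read_corpus(["a", "b", "c"], fail_on_second, resume_info_path=path)
    except ValueError:
        pass
    saved = json.loads(Path(path).read_text(encoding="utf-8"))
```

With the default of `None`, this sketch writes nothing and the error simply propagates.
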
session_defaults class-attribute instance-attribute
session_defaults: SessionResourceSettings = Field(
    default_factory=SessionResourceSettings,
    title="Session Defaults",
    description="Session settings to use for corpus reading operations",
)
source class-attribute instance-attribute
source: str = Field(
    ...,
    title="Source Identifier",
    description="A freeform string that is added to the Eleanor Framework header to identify the source system when using the chat API",
)
splitter class-attribute instance-attribute
splitter: SplitterSettings = Field(
    default_factory=SplitterSettings,
    title="Splitter Settings",
    description=__doc__,
)
system_message_map class-attribute instance-attribute
system_message_map: Dict[str, str] = Field(
    default_factory=dict,
    title="System Message Map",
    description="Top-level system message to use. This dictionary is keyed by the namespace name, which by convention is the same as the model name on the Eleanor Framework chat API. This allows users to map a specific system message to each LLM.",
)
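
The per-namespace lookup described above might look like the following sketch. The namespace keys, the message text, and the `resolve_system_message` helper are all hypothetical. Returning `None` for an unmapped namespace is an assumption, since the source does not say what happens when a key is missing.

```python
from typing import Dict, Optional


def resolve_system_message(
    system_message_map: Dict[str, str], namespace: str
) -> Optional[str]:
    """Return the system message mapped to `namespace`, or None if unmapped."""
    return system_message_map.get(namespace)


# Hypothetical configuration: keys are namespace names, which by convention
# match the model names exposed on the chat API.
system_message_map = {
    "model-a": "You are a careful archivist.",
    "model-b": "You are a concise summarizer.",
}

message = resolve_system_message(system_message_map, "model-a")
missing = resolve_system_message(system_message_map, "unknown")
```
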
unbind_other_collections class-attribute instance-attribute
unbind_other_collections: bool = Field(
    default=False,
    title="Unbind Other Collections",
    description="When true, all other memory collections will be unbound from the agent to prevent conceptual cross-over. This may or may not be desirable depending on the use case.",
)

DynamicSessionSettings

Bases: BaseDataModel

Dynamic session settings configuration.

Attributes:

  • enabled (bool) –

    Flag that when true enables dynamic session creation

  • score_dividend (int) –

    Score dividend

  • spacy_model (str) –

    SpaCy model to use when determining chunk information density

Attributes

enabled class-attribute instance-attribute
enabled: bool = Field(
    default=True, title="Enabled", description=""
)
score_dividend class-attribute instance-attribute
score_dividend: int = Field(
    default=10, gt=0, title="Score Dividend", description=""
)
spacy_model class-attribute instance-attribute
spacy_model: str = Field(
    default="",
    title="SpaCy Model",
    description="SpaCy model to use when determining chunk information density",
)

SplitterSettings

Bases: BaseDataModel

Corpus splitter configuration settings

Attributes:

  • sample_packing_enabled (bool) –

    Flag that when true enables sample packing

  • sample_packing_tokenizer (str) –

    Reference to an LLM tokenizer to use for sample packing. This tokenizer must be attached to an LLM that is loaded by the framework. Required whenever sample packing is enabled.

  • splitter_kwargs (KwargsModel) –

    Additional keyword arguments to pass to the splitter when creating new sessions. This is applicable whether or not sample packing is enabled.

Attributes

sample_packing_enabled class-attribute instance-attribute
sample_packing_enabled: bool = Field(
    default=False,
    title="Sample Packing Enabled",
    description="Flag that when true enables sample packing",
)
sample_packing_tokenizer class-attribute instance-attribute
sample_packing_tokenizer: str = Field(
    default="",
    title="Sample Packing Tokenizer",
    description="Reference to an LLM tokenizer to use for sample packing. This tokenizer must be attached to an LLM that is loaded by the framework. Required whenever sample packing is enabled.",
)
splitter_kwargs class-attribute instance-attribute
splitter_kwargs: KwargsModel = Field(
    default_factory=KwargsModel,
    title="Splitter Kwargs",
    description="Additional keyword arguments to pass to the splitter when creating new sessions. This is applicable whether or not sample packing is enabled.",
)
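
The rule that a tokenizer is required whenever sample packing is enabled can be sketched as a cross-field check. The example uses a stdlib dataclass as a stand-in for the real model, so `SplitterSettingsSketch` is hypothetical, as are the `my-llm` tokenizer reference and the `chunk_size` keyword. Whether the actual `SplitterSettings` class enforces this rule at construction time is not stated in the source.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class SplitterSettingsSketch:
    """Stand-in for SplitterSettings with the same field names and defaults."""

    sample_packing_enabled: bool = False
    sample_packing_tokenizer: str = ""
    splitter_kwargs: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Assumed rule: an empty tokenizer reference is invalid once packing is on.
        if self.sample_packing_enabled and not self.sample_packing_tokenizer:
            raise ValueError(
                "sample_packing_tokenizer is required when sample_packing_enabled is true"
            )


# Valid: packing enabled with a tokenizer reference; kwargs are passed through.
ok = SplitterSettingsSketch(
    sample_packing_enabled=True,
    sample_packing_tokenizer="my-llm",
    splitter_kwargs={"chunk_size": 512},
)

# Invalid: packing enabled without a tokenizer reference.
error: Optional[str] = None
try:
    SplitterSettingsSketch(sample_packing_enabled=True)
except ValueError as exc:
    error = str(exc)
```

Because `splitter_kwargs` applies whether or not sample packing is enabled, the sketch leaves it out of the check.
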