
API

The API layer in the EleanorAI Framework performs the following functions:

  • Exposing the service layer as a versioned, HTTP-accessible endpoint suitable for external client integrations.
  • Mapping internal service calls to the externally-facing, strongly-versioned data transfer objects (DTOs).
  • Managing internal resource dependencies such as the execution thread pool.
  • Applying internal middleware components such as authentication and authorization.

API Use Cases

From an implementation perspective, the API layer is responsible for handling the use cases described below.

Request/Response Service Invocation

This use case involves invoking a service method in response to an API request and returning the result to the client. The operation is expected to complete within the request lifecycle.

Example

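# Invoke AgentService.get_agent through the job_pool executor, then map the
# result to the versioned response DTO before returning it to the client.
response = await async_run_service(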
    service_cls=AgentService,
    executor=job_pool,
    method=AgentService.get_agent,
    method_kwargs={"agent_pkid": agent_pkid},
    dto_mapper=lambda x: dto.AgentGetResponseV1(
        agent=dto.AgentV1.from_orm_model(x)
    ),
)

API Task Submission

This use case involves triggering a long-running operation via the API, such as re-voicing a memory collection. The operation may involve significant computation or database interactions that exceed the usual request lifecycle. Clients do not hold the HTTP connection open; instead, they are given a task_id to track the operation's progress.

Example

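# The response DTO carries the task_id that the client uses to track progress.
response = dto.NewTaskResponseV1()

# Submit the long-running service method under that task_id.
run_background_service(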
    service_cls=ReVoiceService,
    executor=job_pool,
    method=ReVoiceService.re_voice,
    task_id=response.task_id,
    method_kwargs={
        "agent_pkid": agent_pkid,
        "source_collection_pkid": collection_pkid,
        "enable_source_memory": request.source_memory_settings.enabled,
        "source_top_k_vectors": request.source_memory_settings.top_k_vectors,
        "source_relevance_alpha": request.source_memory_settings.relevance_alpha,
        "source_importance_alpha": request.source_memory_settings.importance_alpha,
        "source_recency_alpha": request.source_memory_settings.recency_alpha,
        "source_min_score": request.source_memory_settings.min_score,
        "source_max_memories": request.source_memory_settings.max_memories,
        "source_max_memory_strategy": request.source_memory_settings.max_memory_strategy,
        "enable_agent_memory": request.agent_memory_settings.enabled,
        "agent_top_k_vectors": request.agent_memory_settings.top_k_vectors,
        "agent_relevance_alpha": request.agent_memory_settings.relevance_alpha,
        "agent_importance_alpha": request.agent_memory_settings.importance_alpha,
        "agent_recency_alpha": request.agent_memory_settings.recency_alpha,
        "agent_min_score": request.agent_memory_settings.min_score,
        "agent_max_memories": request.agent_memory_settings.max_memories,
        "agent_max_memory_strategy": request.agent_memory_settings.max_memory_strategy,
    },
)
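
The source does not show how run_background_service is implemented. As a rough illustration of the task submission pattern described above, the following is a minimal sketch that assumes an in-memory status registry and a thread pool. The names submit_background, get_status, and re_voice_stub are hypothetical and are not part of the EleanorAI Framework.

```python
import threading
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable

# Hypothetical in-memory registry; a real system would persist task state.
_task_status: dict[str, str] = {}
_status_lock = threading.Lock()


def _set_status(task_id: str, status: str) -> None:
    with _status_lock:
        _task_status[task_id] = status


def get_status(task_id: str) -> str:
    """Return the last recorded status, or "unknown" for an unseen task_id."""
    with _status_lock:
        return _task_status.get(task_id, "unknown")


def submit_background(
    executor: ThreadPoolExecutor,
    task_id: str,
    func: Callable[..., Any],
    **kwargs: Any,
) -> Future:
    """Queue func on the pool and record its progress under task_id."""
    _set_status(task_id, "pending")

    def _runner() -> None:
        _set_status(task_id, "running")
        try:
            func(**kwargs)
        except Exception:
            _set_status(task_id, "failed")
        else:
            _set_status(task_id, "succeeded")

    return executor.submit(_runner)


def re_voice_stub(collection_pkid: int) -> None:
    """Stand-in for a long-running service method (hypothetical)."""


pool = ThreadPoolExecutor(max_workers=2)
task_id = str(uuid.uuid4())
future = submit_background(pool, task_id, re_voice_stub, collection_pkid=42)
future.result()  # demo only: an API handler would return task_id immediately
print(get_status(task_id))
pool.shutdown()
```

In an API handler, the call would return as soon as the job is queued, and a separate status endpoint would read the registry by task_id.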

Service Invocation Design

This section describes how services are invoked from a FastAPI application. It focuses on the challenges of managing long-running operations and background tasks, and on how the EleanorAI Framework addresses them with eleanor.backend.service.async_run_service and eleanor.backend.service.run_background_service. It covers database session management, service lifecycle, and the specific needs of long-running and compute-intensive operations.

Why This Approach?

FastAPI is designed to handle high-throughput, low-latency API requests by leveraging asynchronous programming. However, certain operations, such as complex computations or long-running tasks, do not fit well into FastAPI's default request-response lifecycle. This design addresses the problem with a pattern for running service methods that provides:

  1. Long-lived operations that can exceed the usual request lifecycle.
  2. Background processing that does not block the main application thread.
  3. RDBMS session management that avoids connection issues and session expirations during database interactions.
  4. An API-agnostic service layer that is completely independent of the API layer, allowing for easy testing, expansion, and reuse. In practice, service layer implementations are synchronous and have zero knowledge of the API layer and DTOs.
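
The pattern above can be sketched as follows. This is not the framework's async_run_service, whose implementation is not shown here; it is a minimal illustration that assumes the helper runs a synchronous service method on a thread pool, gives it a session of its own, and maps the result to a DTO before the session closes. FakeSession, GreetingService, GreetingResponseV1, and run_service_sketch are hypothetical names.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Any, Callable


class FakeSession:
    """Stand-in for a database session (hypothetical)."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


class GreetingService:
    """Synchronous service with no knowledge of the API layer or DTOs."""

    def __init__(self, session: FakeSession) -> None:
        self.session = session

    def greet(self, name: str) -> dict:
        return {"message": f"hello {name}"}


@dataclass
class GreetingResponseV1:
    """Versioned response DTO (hypothetical)."""

    message: str


async def run_service_sketch(
    service_cls: type,
    executor: ThreadPoolExecutor,
    method: Callable[..., Any],
    method_kwargs: dict,
    dto_mapper: Callable[[Any], Any],
) -> Any:
    def _call() -> Any:
        session = FakeSession()  # the session lives only for this call
        try:
            service = service_cls(session)
            result = method(service, **method_kwargs)
            # Map while the session is still open, so the caller never
            # touches objects that belong to a closed session.
            return dto_mapper(result)
        finally:
            session.close()

    # Run the blocking call on the pool so the event loop stays free.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, _call)


async def main() -> GreetingResponseV1:
    with ThreadPoolExecutor(max_workers=2) as pool:
        return await run_service_sketch(
            service_cls=GreetingService,
            executor=pool,
            method=GreetingService.greet,
            method_kwargs={"name": "Ada"},
            dto_mapper=lambda r: GreetingResponseV1(message=r["message"]),
        )


response = asyncio.run(main())
print(response)
```

Keeping the service synchronous and passing the DTO mapper in from the outside is what lets the service layer stay unaware of the API layer in this sketch.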

Core Challenges

FastAPI typically manages database sessions tied to the lifecycle of a single HTTP request using dependency injection. While this works well for short-lived operations, it can lead to challenges in long-running operations:

  • Expired Sessions: SQLAlchemy sessions are designed to be short-lived. Using a session across a long-running operation or thread can lead to session expiration, making it difficult to interact with the database.
  • Blocking Behavior: Running a long-running or compute-intensive operation inside the request handler, with the session tied to the FastAPI request, can block the main event loop and reduce the API's responsiveness to new requests.
  • Detached Objects: Accessing SQLAlchemy objects after the session is closed can lead to errors due to detached objects.
  • Thread Local Scope: The synchronous service layer benefits from SQLAlchemy's thread-local scope, but thread-local scoping is generally not compatible with asynchronous programming.
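
The thread-local point can be demonstrated without a database. The sketch below uses only the standard library and a hypothetical current_scope helper that mimics a thread-local registry: coroutines running on the event loop share one thread and therefore one scope, while worker threads each get their own.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor

_local = threading.local()


def current_scope() -> object:
    """Return this thread's scope object, creating it on first use."""
    if not hasattr(_local, "scope"):
        _local.scope = object()
    return _local.scope


async def handler() -> object:
    await asyncio.sleep(0)  # yield to the event loop, as a real handler would
    return current_scope()


async def coroutine_scopes() -> list:
    # Both handlers run on the event loop thread.
    return await asyncio.gather(handler(), handler())


def thread_scopes() -> list:
    # The barrier forces the two jobs onto two different worker threads.
    barrier = threading.Barrier(2)

    def worker() -> object:
        barrier.wait(timeout=5)
        return current_scope()

    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(worker) for _ in range(2)]
        return [f.result() for f in futures]


coro_a, coro_b = asyncio.run(coroutine_scopes())
thread_a, thread_b = thread_scopes()

print(coro_a is coro_b)          # concurrent coroutines share one scope
print(thread_a is not thread_b)  # worker threads are isolated from each other
```

With a real thread-local session registry, the first case would mean that two concurrent requests share one session, which is why running the synchronous service layer on worker threads is the safer fit.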

FastAPI Naming Approach

Use the following naming conventions for FastAPI:

{SERVICE}_

SERVICE + [PATH NOUNS] + [VERB]

Examples:

  • /checkmate/tasks/{task_id}/invoke => checkmate-task-invoke
  • /checkmate/tasks/{task_id}/status => checkmate-task-status
  • /checkmate/tasks/{task_id}/cancel => checkmate-task-cancel
  • /checkmate/tasks/cancel => checkmate-tasks-cancel
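
The examples above suggest a mechanical mapping from path to name. The following sketch encodes one reading of it, which is an assumption inferred from the four examples rather than a documented rule: path parameters are dropped, a noun that is immediately followed by a path parameter is singularized by removing a trailing "s", and the remaining segments are joined with hyphens. The function name operation_name is hypothetical.

```python
import re

# Matches a path parameter segment such as "{task_id}".
_PARAM = re.compile(r"^\{.+\}$")


def operation_name(path: str) -> str:
    """Derive a hyphenated name from a route path (inferred convention)."""
    segments = [s for s in path.strip("/").split("/") if s]
    parts = []
    for index, segment in enumerate(segments):
        if _PARAM.match(segment):
            continue  # path parameters do not appear in the name
        following = segments[index + 1] if index + 1 < len(segments) else ""
        # A collection noun addressed by ID refers to a single resource.
        if _PARAM.match(following) and segment.endswith("s"):
            segment = segment[:-1]
        parts.append(segment)
    return "-".join(parts)


for example in (
    "/checkmate/tasks/{task_id}/invoke",
    "/checkmate/tasks/{task_id}/status",
    "/checkmate/tasks/{task_id}/cancel",
    "/checkmate/tasks/cancel",
):
    print(example, "=>", operation_name(example))
```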

Method Table

| Method | Default | Supported Action(s) | Recommended HTTP Status |
|---|---|---|---|
| GET | Get or list | Get, List, Analyze, Export | 200 (OK), 202 (Accepted) |
| POST | N/A | Search | 200 (OK), 202 (Accepted) |
| PUT | Create | Create | 201 (Created) |
| PATCH | N/A | Update, Add, Bind, Unbind, Undelete, Rebuild | 200 (OK), 204 (No content) |
| DELETE | | HardDelete, SoftDelete, Truncate | 204 (No content) |

| HTTP Status Code | Description |
|---|---|
| 200 (OK) | Request was processed successfully, results returned |
| 201 (Created) | Request was processed successfully, resource created |
| 202 (Accepted) | Request was processed successfully, CheckMate task created and standard CheckMate task response is returned |
| 204 (No content) | Request was processed successfully, no content returned |
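
If the conventions in the method table need to be checked programmatically, for example in a route linting test, they can be expressed as data. The sketch below is one possible encoding; METHOD_TABLE and is_recommended are hypothetical names, and the DELETE default is left as None because the table does not state one.

```python
from http import HTTPStatus

# Direct transcription of the method table; None marks a value the table omits.
METHOD_TABLE = {
    "GET": {
        "default": "Get or list",
        "actions": ["Get", "List", "Analyze", "Export"],
        "statuses": [HTTPStatus.OK, HTTPStatus.ACCEPTED],
    },
    "POST": {
        "default": "N/A",
        "actions": ["Search"],
        "statuses": [HTTPStatus.OK, HTTPStatus.ACCEPTED],
    },
    "PUT": {
        "default": "Create",
        "actions": ["Create"],
        "statuses": [HTTPStatus.CREATED],
    },
    "PATCH": {
        "default": "N/A",
        "actions": ["Update", "Add", "Bind", "Unbind", "Undelete", "Rebuild"],
        "statuses": [HTTPStatus.OK, HTTPStatus.NO_CONTENT],
    },
    "DELETE": {
        "default": None,
        "actions": ["HardDelete", "SoftDelete", "Truncate"],
        "statuses": [HTTPStatus.NO_CONTENT],
    },
}


def is_recommended(method: str, status: int) -> bool:
    """Return True if status is a recommended code for the HTTP method."""
    entry = METHOD_TABLE[method.upper()]
    # HTTPStatus members are integers, so a plain membership test works.
    return status in entry["statuses"]


print(is_recommended("PUT", 201))
print(is_recommended("DELETE", 200))
```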

Metrics