Data Management

This guide explains how external data is represented, indexed, and accessed by models during inference.

Context Sources

Context sources are external data collections attached to a model. They are queried during inference to ground responses in structured or unstructured data.

Context sources are read-only at inference time and do not mutate model state.

Supported Source Types

  • Text Documents

    Unstructured text entries, suited to short reference material such as FAQs, descriptions, or instructions.

  • PDF Documents

    Text extracted from uploaded PDF files and indexed for semantic search during inference.

  • Frames (Structured Data)

    Frames store structured records defined by a schema. Each record is validated against the schema before indexing.

    • CSV and spreadsheet-style records
    • JSON objects with defined fields
    • Optional image field indexing

  • External Sheets

    Periodically synchronized tabular data sources. Records are refreshed according to the configured sync policy.

  • Relational Databases

    Live connections to relational databases. Queries are executed at inference time using validated connection credentials.

Schema and Indexing

Structured sources require a schema definition. Schemas define field names, types, and validation rules.

During ingestion, records that do not conform to the schema are rejected.
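
The rejection step can be sketched roughly as follows. This is a minimal illustration, not the platform's schema format: the Field class, the validate and ingest functions, and the sample fields are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Field:
    """One schema field (hypothetical shape): name, expected type, optional rule."""
    name: str
    py_type: type
    required: bool = True
    rule: Optional[Callable[[Any], bool]] = None


def validate(record, schema):
    """Return a list of problems; an empty list means the record conforms."""
    errors = []
    for field in schema:
        if field.name not in record:
            if field.required:
                errors.append(f"missing field: {field.name}")
            continue
        value = record[field.name]
        if not isinstance(value, field.py_type):
            errors.append(f"wrong type for {field.name}")
        elif field.rule is not None and not field.rule(value):
            errors.append(f"rule failed for {field.name}")
    return errors


def ingest(records, schema):
    """Split records into accepted ones and rejected ones with their reasons."""
    accepted, rejected = [], []
    for record in records:
        errors = validate(record, schema)
        if errors:
            rejected.append((record, errors))
        else:
            accepted.append(record)
    return accepted, rejected


schema = [
    Field("sku", str),
    Field("price", float, rule=lambda v: v >= 0),
    Field("notes", str, required=False),
]
records = [
    {"sku": "A-1", "price": 9.5},   # conforms
    {"sku": "A-2", "price": -1.0},  # violates the price rule
    {"price": 3.0},                 # missing a required field
]
accepted, rejected = ingest(records, schema)
```

Only the accepted records would go on to indexing. Keeping the reasons alongside each rejected record makes it easier to report why an upload was refused.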

Indexed fields are used to retrieve relevant records during inference.

Usage During Inference

During inference, the model may query attached context sources to retrieve relevant information.

Retrieved records are used to augment model reasoning, but context sources themselves are never modified.
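
A toy version of this flow is sketched below. It ranks records by plain token overlap on their indexed fields, which is a deliberate simplification and not the retrieval method used by the platform. The function names build_index, retrieve, and augment_prompt, along with the sample records, are hypothetical.

```python
from types import MappingProxyType


def build_index(records, indexed_fields):
    """Freeze records and precompute the token set of each record's indexed fields."""
    frozen = tuple(MappingProxyType(dict(r)) for r in records)
    tokens = tuple(
        frozenset(
            word
            for f in indexed_fields
            for word in str(r.get(f, "")).lower().split()
        )
        for r in frozen
    )
    return frozen, tokens


def retrieve(query, index, top_k=2):
    """Return up to top_k records ranked by token overlap with the query."""
    frozen, tokens = index
    q = set(query.lower().split())
    scored = [(len(q & t), i) for i, t in enumerate(tokens)]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [frozen[i] for _, i in ranked[:top_k]]


def augment_prompt(question, hits):
    """Prepend retrieved records to the question as grounding context."""
    context = "\n".join(f"- {dict(h)}" for h in hits)
    return f"Context:\n{context}\n\nQuestion: {question}"


records = [
    {"title": "Return policy", "body": "Items can be returned within 30 days"},
    {"title": "Shipping times", "body": "Orders ship within 2 business days"},
]
index = build_index(records, ["title", "body"])
hits = retrieve("what is the return policy", index)
prompt = augment_prompt("what is the return policy", hits)

# Retrieved records are read-only views; attempts to modify them fail.
try:
    hits[0]["title"] = "changed"
    mutated = True
except TypeError:
    mutated = False
```

Wrapping each record in MappingProxyType mirrors the read-only guarantee: the inference path can read retrieved records but cannot write back to the source.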

Constraints

  • Context sources are scoped to a model.
  • Context is read-only during inference.
  • Schema changes require re-indexing.
  • Database connections must be valid at execution time.
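
One possible way to detect that a schema change calls for re-indexing is to store a fingerprint of the schema alongside the index and compare it later. The sketch below assumes a schema can be expressed as a JSON-serializable dictionary. The schema_fingerprint and needs_reindex functions are hypothetical and not part of any documented API.

```python
import hashlib
import json


def schema_fingerprint(schema):
    """Hash a canonical JSON form of the schema so key order does not matter."""
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_reindex(current_schema, indexed_fingerprint):
    """True when the schema differs from the one the index was built with."""
    return schema_fingerprint(current_schema) != indexed_fingerprint


schema_v1 = {"sku": "string", "price": "number"}
indexed_with = schema_fingerprint(schema_v1)

# Same fields in a different order: no re-index needed.
same_schema = {"price": "number", "sku": "string"}
unchanged = needs_reindex(same_schema, indexed_with)

# A new field changes the fingerprint: re-index needed.
schema_v2 = {"sku": "string", "price": "number", "color": "string"}
changed = needs_reindex(schema_v2, indexed_with)
```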