Core Concepts
Ground is built around a few key concepts that work together to provide grounded retrieval.
Sources
A source is a repository, documentation site, or PDF that Ground indexes. Each source has:
- Type: repo (Git), docs (documentation), or pdf (uploaded PDF)
- Format: For docs, either html (web pages) or openapi (API specs)
- URL: The location to fetch content from
- Status: pending, syncing, synced, or error
Sources are indexed asynchronously via sync jobs. After creating a source, you must trigger a sync to index its content.
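As an illustration of the fields above, here is a minimal Python sketch of a source record. The class and field names (`Source`, `SourceType`, `DocsFormat`, `SourceStatus`) are assumptions made for this example, not Ground's actual API. Only the enumerated values come from the list above. The initial `pending` status and the rule that rejects a format on non-docs sources are also assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SourceType(str, Enum):
    REPO = "repo"   # Git repository
    DOCS = "docs"   # documentation site
    PDF = "pdf"     # uploaded PDF


class DocsFormat(str, Enum):
    HTML = "html"        # web pages
    OPENAPI = "openapi"  # API specs


class SourceStatus(str, Enum):
    PENDING = "pending"
    SYNCING = "syncing"
    SYNCED = "synced"
    ERROR = "error"


@dataclass
class Source:
    """Hypothetical in-memory model of a source; not Ground's real schema."""
    type: SourceType
    url: str
    format: Optional[DocsFormat] = None
    status: SourceStatus = SourceStatus.PENDING  # assumed initial status

    def __post_init__(self) -> None:
        # Assumption: format is meaningful only for docs sources.
        if self.format is not None and self.type is not SourceType.DOCS:
            raise ValueError("format applies only to docs sources")


spec = Source(
    type=SourceType.DOCS,
    url="https://example.com/openapi.json",  # placeholder URL
    format=DocsFormat.OPENAPI,
)
print(spec.status.value)  # prints "pending" (no sync has been triggered yet)
```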
Jobs
A job represents a sync operation that fetches, parses, chunks, embeds, and indexes content from a source. Jobs progress through stages:
- queued → Waiting to be processed
- fetch → Downloading content from the source
- parse → Extracting text from files/pages
- chunk → Splitting content into searchable chunks
- embed → Generating vector embeddings
- index → Storing chunks in the database
- finalize → Updating source metadata
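The stage order above can be modeled as a simple linear progression. The following sketch is illustrative only: the helper names `next_stage` and `progress` are hypothetical, and treating progress as a position in the list is an assumption, not something Ground is documented to report.

```python
from typing import Optional

# Stage names and their order are taken from the list above.
STAGES = ["queued", "fetch", "parse", "chunk", "embed", "index", "finalize"]


def next_stage(current: str) -> Optional[str]:
    """Return the stage that follows `current`, or None after the last stage."""
    position = STAGES.index(current)  # raises ValueError for an unknown stage
    if position + 1 < len(STAGES):
        return STAGES[position + 1]
    return None


def progress(current: str) -> float:
    """Rough completion fraction: 0.0 at queued, 1.0 at finalize."""
    return STAGES.index(current) / (len(STAGES) - 1)


# Walk a job through every stage in order.
stage: Optional[str] = "queued"
while stage is not None:
    print(f"{stage}: {progress(stage):.0%}")
    stage = next_stage(stage)
```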
Chunks
A chunk is a piece of indexed content with:
- Content: The actual text
- Embedding: Vector representation for semantic search
- Metadata: Path, language, line numbers, version reference
- Extra metadata: For OpenAPI chunks, includes method, path, operation ID
Search
Ground uses hybrid search combining:
- Vector similarity: Finds semantically similar content
- Full-text search: Matches keywords and phrases
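This page does not say how the two result sets are merged. One common way to combine a vector ranking with a full-text ranking is reciprocal rank fusion (RRF), sketched below as an example of the general idea rather than as Ground's actual method. The function name, the chunk IDs, and the constant `k = 60` are assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge ranked lists of chunk IDs; earlier positions contribute more."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by fused score, highest first; break ties by ID for determinism.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))


vector_hits = ["c3", "c1", "c7"]    # hypothetical vector-similarity ranking
keyword_hits = ["c1", "c9", "c3"]   # hypothetical full-text ranking
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)  # ['c1', 'c3', 'c9', 'c7']
```

Chunks that appear in both rankings (`c1`, `c3`) rise to the top because each list contributes to their fused score.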
Citations
Every search result includes a citation with:
- Source name and ID
- File path or URL
- Symbol (function/class name or section heading)
- Line numbers (for code)
- Version reference (commit SHA or doc version)
- Language/chunk type
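To make the citation fields concrete, the sketch below models them as a small record and renders a one-line reference. The `Citation` class, the `render` helper, the output format, and all sample values are hypothetical. Ground's real response shape may differ.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Citation:
    """Hypothetical citation record mirroring the fields listed above."""
    source_name: str
    source_id: str
    location: str                            # file path or URL
    symbol: Optional[str] = None             # function/class name or section heading
    lines: Optional[Tuple[int, int]] = None  # start and end line, code only
    version: Optional[str] = None            # commit SHA or doc version
    kind: Optional[str] = None               # language or chunk type


def render(citation: Citation) -> str:
    """Format a citation as a single human-readable reference string."""
    text = f"{citation.source_name}: {citation.location}"
    if citation.lines is not None:
        start, end = citation.lines
        text += f"#L{start}-L{end}"
    if citation.symbol:
        text += f" ({citation.symbol})"
    if citation.version:
        text += f" @ {citation.version[:7]}"  # shorten long commit SHAs
    return text


example = Citation(
    source_name="backend",
    source_id="src_123",
    location="app/auth.py",
    symbol="verify_token",
    lines=(10, 42),
    version="0123456789abcdef",
    kind="python",
)
print(render(example))  # backend: app/auth.py#L10-L42 (verify_token) @ 0123456
```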
Freshness & Staleness
Ground tracks how recent each source’s content is:
- Freshness: Days since last successful sync
- Staleness: A source is stale when its freshness exceeds the configured staleness budget
- Warnings: Stale results include warnings in the response
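The freshness and staleness rules above reduce to a date comparison. The sketch below shows one way to compute them. The function names, the warning text, and the strict "greater than" reading of "exceeds" are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def freshness_days(last_synced_at: datetime, now: datetime) -> int:
    """Whole days elapsed since the last successful sync."""
    return (now - last_synced_at).days


def staleness_warnings(source_name: str, last_synced_at: datetime,
                       budget_days: int, now: datetime) -> List[str]:
    """Return a warning when freshness exceeds the staleness budget."""
    age = freshness_days(last_synced_at, now)
    if age > budget_days:
        return [f"{source_name} was last synced {age} days ago "
                f"(budget: {budget_days} days)"]
    return []


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
fresh = staleness_warnings("docs", now - timedelta(days=3), 7, now)
stale = staleness_warnings("repo", now - timedelta(days=10), 7, now)
print(fresh)  # []
print(stale)  # ['repo was last synced 10 days ago (budget: 7 days)']
```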
Trust Policy
The trust policy controls search behavior:
- Staleness budget: How many days before content is considered stale
- Source priorities: Weights for different source types (e.g., OpenAPI higher for API questions)
- Refusal thresholds: Minimum evidence count/score to answer
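A rough sketch of how these three settings might fit together follows. Everything here is an assumption for illustration: the `TrustPolicy` shape, the default values, the weight table, and the rule that multiplies each score by its source priority before counting evidence. Ground's actual policy format and refusal logic are not specified on this page.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TrustPolicy:
    """Hypothetical policy shape; defaults are illustrative, not Ground's."""
    staleness_budget_days: int = 30
    source_priorities: Dict[str, float] = field(
        default_factory=lambda: {"openapi": 1.5, "repo": 1.0, "html": 1.0, "pdf": 0.8}
    )
    min_evidence_count: int = 2
    min_score: float = 0.5


def should_refuse(policy: TrustPolicy, results: List[dict]) -> bool:
    """Refuse when too few results clear the weighted score threshold."""
    strong = 0
    for result in results:
        weight = policy.source_priorities.get(result["kind"], 1.0)
        if result["score"] * weight >= policy.min_score:
            strong += 1
    return strong < policy.min_evidence_count


policy = TrustPolicy()
results = [
    {"kind": "openapi", "score": 0.4},  # weighted above 0.5, counts as evidence
    {"kind": "pdf", "score": 0.6},      # weighted below 0.5, does not count
]
print(should_refuse(policy, results))  # True: only one result counts
```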
Conflicts
When multiple sources define the same thing differently (e.g., the same API endpoint with different schemas), Ground detects and surfaces the conflict.
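The conflict detection described under Conflicts can be approximated by grouping definitions by endpoint and flagging any endpoint with more than one distinct schema. This is a simplified sketch under stated assumptions: the `find_conflicts` function, the dictionary keys, and the use of plain string comparison for schemas are all hypothetical, and Ground's real detection logic is not documented here.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Endpoint = Tuple[str, str]  # (HTTP method, path)


def find_conflicts(definitions: List[dict]) -> Dict[Endpoint, List[str]]:
    """Map each endpoint defined with differing schemas to its source names."""
    schemas: Dict[Endpoint, Set[str]] = defaultdict(set)
    sources: Dict[Endpoint, List[str]] = defaultdict(list)
    for item in definitions:
        key = (item["method"].upper(), item["path"])
        schemas[key].add(item["schema"])
        sources[key].append(item["source"])
    # An endpoint conflicts when more than one distinct schema was seen.
    return {key: sources[key] for key in schemas if len(schemas[key]) > 1}


definitions = [
    {"source": "api-v1", "method": "get", "path": "/users", "schema": "{id: int}"},
    {"source": "wiki", "method": "GET", "path": "/users", "schema": "{id: string}"},
    {"source": "api-v1", "method": "post", "path": "/users", "schema": "{name: string}"},
]
print(find_conflicts(definitions))  # {('GET', '/users'): ['api-v1', 'wiki']}
```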