How the AI picks an article
When an AI agent in Atender — a Capability inside an Agent Stack, a Specialist Agent, Sidekick, the web chat, or the voice agent — needs to answer a question, it asks the Knowledge Base retrieval engine for the most relevant articles. This reference is a precise account of how that engine ranks them.
The same engine powers the public help center’s search box. Optimizing for one optimizes for both.
The high-level shape
A query arrives. The engine returns a ranked list of articles. In between:
- The query is embedded into a vector.
- Candidate chunks (not articles) are retrieved by two parallel paths — keyword and vector — over the published, non-archived articles in the tenant.
- Chunks from both paths are fused with Reciprocal Rank Fusion (RRF) so a chunk that ranks high on either path floats up.
- Optionally, a reranker re-scores the top candidates (feature-flagged).
- Optionally, an “outcome weight” boosts chunks from articles that have historically resolved similar conversations (feature-flagged).
- Chunks are deduplicated to the article level — each article appears at most once, represented by its best-scoring chunk.
- The top N articles are returned, with similarity scores and the matched chunk’s heading trail.
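The dedup step at the end of the pipeline can be sketched as follows. This is a minimal illustration, not the engine's code; field names like `articleId` and `headingTrail` are assumptions about the shape of a scored chunk.

```typescript
// Each article keeps only its best-scoring chunk, so an article
// appears at most once in the final ranked list.
interface ScoredChunk {
  articleId: string;
  score: number;
  headingTrail: string[]; // heading path of the matched chunk
}

function dedupeToArticles(chunks: ScoredChunk[]): ScoredChunk[] {
  const best = new Map<string, ScoredChunk>();
  for (const c of chunks) {
    const current = best.get(c.articleId);
    if (!current || c.score > current.score) best.set(c.articleId, c);
  }
  // Sort the surviving representatives best-first.
  return [...best.values()].sort((a, b) => b.score - a.score);
}
```

Because the representative is the best chunk, a long article's weaker chunks can never push it down; they simply stop mattering once a stronger sibling exists.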
What chunks are
Articles are split into small passages of a few hundred words. Each passage is embedded separately. A 4,000-word reference can have 10–15 chunks. Why: a long article covers many topics; one big embedding dilutes each topic’s signal. Chunk-level embeddings let “How do I refund?” match a refund chunk inside a Billing FAQ without being drowned out by the credit-card chunks.
The chunker is versioned. When the chunker is improved, articles are re-chunked in the background.
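A toy word-count chunker makes the arithmetic concrete. The real chunker's boundary rules (and its versioning scheme) are not specified here; this sketch only assumes passages capped at a few hundred words.

```typescript
// Split an article into passages of at most `maxWords` words.
// Each passage would then be embedded separately.
function chunkArticle(text: string, maxWords = 300): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += maxWords) {
    chunks.push(words.slice(i, i + maxWords).join(" "));
  }
  return chunks;
}

// A 4,000-word article at ~300 words per chunk yields
// ceil(4000 / 300) = 14 chunks, in the 10–15 range cited above.
const chunkCount = chunkArticle("word ".repeat(4000)).length; // 14
```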
The two retrieval paths
| Path | Matches on | Strength | Examples |
| --- | --- | --- | --- |
| Keyword (Postgres tsvector) | Literal words, common stems | Catches exact terms | error codes, product names, brand names |
| Vector (pgvector cosine similarity) | Semantic meaning | Catches paraphrases | “duplicate charge” matches “charged twice” |
Each path returns its own ranked list of chunks. RRF combines them by rank position, not by raw score, which makes the fusion stable even when the two paths use different scoring scales.
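Rank-based fusion is easy to sketch. The snippet below assumes the textbook RRF formula, score(d) = Σ 1/(k + rank), with the common default k = 60; the engine's actual constant is not stated in this document.

```typescript
type RankedList = string[]; // chunk IDs, best first

// Fuse several ranked lists by rank position only, ignoring raw scores,
// so differing score scales between paths cannot skew the result.
function rrfFuse(lists: RankedList[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, i) => {
      const rank = i + 1; // 1-based rank within this list
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// A chunk that appears in both lists outranks chunks that
// appear near the top of only one:
const fused = rrfFuse([
  ["refund-steps", "billing-faq"],   // keyword path
  ["charged-twice", "refund-steps"], // vector path
]);
```

This is why a chunk that ranks high on either path "floats up": every list it appears in adds to its fused score.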
Filters applied before ranking
Before any ranking happens, the candidate set is filtered:
- Tenant — Only the calling tenant’s articles.
- Status = published — Drafts, needs-review, and archived articles are invisible to retrieval.
- Embedding present — Articles whose embeddings haven’t finished generating are excluded until indexing completes.
- Category scope (optional) — Some callers pass `fullCategoryIds` and `partialCategoryIds` to scope retrieval to a subset of categories.
- Language (optional) — Customers in non-default languages prefer chunks in their language, with fallback to the default.
- Market (optional) — Customers browsing under a market prefer chunks tagged to that market, with fallback to global.
Roles do not appear in this list. KB roles control who sees what when browsing the public help center; the retrieval engine ignores them.
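The mandatory filters can be expressed as a single predicate. This is a sketch under assumed field names (`tenantId`, `status`, `hasEmbedding`, `categoryId`); the document specifies the conditions, not the schema.

```typescript
interface ChunkMeta {
  tenantId: string;
  status: "draft" | "needs_review" | "published" | "archived";
  hasEmbedding: boolean;
  categoryId?: string;
}

// Applied before any ranking: a chunk that fails here is never scored.
function isCandidate(
  chunk: ChunkMeta,
  callingTenantId: string,
  categoryScope?: Set<string>, // optional caller-supplied scope
): boolean {
  if (chunk.tenantId !== callingTenantId) return false; // tenant isolation
  if (chunk.status !== "published") return false;       // drafts/archived invisible
  if (!chunk.hasEmbedding) return false;                // indexing not finished
  if (categoryScope) {
    if (!chunk.categoryId || !categoryScope.has(chunk.categoryId)) return false;
  }
  return true; // note: KB roles are deliberately never checked here
}
```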
What does not affect ranking
- Article role assignments. Roles are browsing-visibility tags. The retrieval engine doesn’t read them.
- Tags. Tags are filterable in the in-app editor and on the public site, but they don’t change retrieval scores.
- `difficulty` / `estimatedMinutes`. Informational metadata; doesn’t change ranking.
- `views` counter. Tracked but not used as a retrieval signal today.
- `lastReviewedAt`. Tracked but not used as a retrieval signal today.
If you need to control which articles the AI retrieves, see Can I control which articles the AI uses?.
What boosts a chunk
The signals that do affect ranking, in rough order of impact:
- Embedding similarity — vector cosine distance between the query and the chunk. The dominant signal.
- Keyword overlap — how well the chunk’s tsvector matches the query terms.
- Outcome weight (when enabled) — chunks whose parent article has resolved similar past conversations get a small boost (default weight: 0.15).
- Reranker score (when enabled) — a re-scoring model rereads the top candidates and reorders them.
Title and summary are weighted higher in the keyword path. The keywords field is included in the keyword path, not the vector path — use it for synonyms and error codes you want matched without polluting the prose.
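One way these signals could compose is sketched below. The only number the document confirms is the default outcome weight (0.15); treating the boost as proportional to the fused score, and letting the reranker's score replace the fused score when the flag is on, are both assumptions for illustration.

```typescript
interface RankedCandidate {
  fusedScore: number;       // from RRF over the keyword + vector ranks
  resolvedSimilar: boolean; // parent article resolved similar past conversations
  rerankerScore?: number;   // present only when the reranker flag is on
}

const OUTCOME_WEIGHT = 0.15; // default per the docs; boost shape is assumed

function finalScore(c: RankedCandidate): number {
  let score = c.fusedScore;
  if (c.resolvedSimilar) {
    score += OUTCOME_WEIGHT * c.fusedScore; // small historical-outcome boost
  }
  if (c.rerankerScore !== undefined) {
    score = c.rerankerScore; // assumed: reranker re-scores top candidates outright
  }
  return score;
}
```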
The two retrieval modes
- Legacy — Pure vector similarity over chunks. Simpler, slightly less accurate on ambiguous queries. — Default when the chunked-search feature flag is off.
- Chunked + RRF — Keyword + vector fused via RRF, with optional reranker and outcome weighting. — Enabled per tenant. The current default for most tenants.
The chunked path is the design Atender invests in. The legacy path is preserved for compatibility and to compare results during gradual rollouts.
Six places retrieval is called from
The same retrieval function powers every AI surface that needs to look something up:
- Specialist agent’s `search_documentation` tool — the AI agent inside a conversation looks up articles to answer a customer question.
- Voice agent’s `search_kb` tool — the same, but in a phone call. Limited to 5 results and short excerpts so the agent can speak the answer naturally.
- Sidekick’s suggested-answer retrieval — when an agent is replying to a conversation, Sidekick offers relevant articles to insert.
- Self-Learning’s gap detection — when the system looks for cases where no good answer exists, it retrieves what it would have answered with.
- Internal `/api/kb/search-semantic` — used by the in-app editor’s search.
- Public `POST /api/public/kb/:tenantId/search` — the search box on your public help center, where market filters and language overlays apply.
A chunk that ranks well for one of these will rank well for all six.
Practical takeaways for authoring
- Write the customer’s words. Embeddings catch paraphrases, but a chunk that uses the customer’s exact phrasing wins on both paths simultaneously.
- Lead with the answer. The first chunk of an article is often the most-retrieved chunk. Don’t bury the lede.
- One topic per article. Two topics in one article means one chunk per topic, but the article ranks at most once — the topic with weaker chunks loses.
- Use the keywords field. Synonyms, error codes, common misspellings. The keyword path needs these to match a customer who phrases the query differently than your prose.
- Keep articles current. Stale articles don’t carry an explicit penalty in retrieval, but outdated content erodes the AI’s accuracy. The quarterly review pass is the cheapest fix.