What happens when I push the same article twice?

A naive POST creates a duplicate with a -1 slug suffix. A push script that looks up the slug first and PATCHes when it exists stays idempotent. The v1 API doesn't deduplicate on its own.

Updated May 18, 2026 · 3 min read

Two answers, depending on whether you’re using a thoughtful push script or just POSTing in a loop.

With a push script that handles idempotency

This is the normal case. A well-written push script:

Lists existing articles to build a slug → article-ID map.
For each article it’s about to push:
    If the slug already exists, PATCH the existing article by ID.
    If the slug doesn’t exist, POST to create.
Re-running the script is safe — the second run finds every article it created on the first run and updates them in place.

Result: pushing the same article twice updates it. No duplicates, no errors, no extra rows in the database. This is the pattern in the recipe for GitHub Actions and the pattern Atender’s own help center uses.

If the article content hasn’t changed since the last push, the PATCH is harmless: it writes the same values to the same fields. Embeddings regenerate only when content-bearing fields (title, summary, body, keywords) have changed.

With a naive POST

If your script just calls POST /api/v1/kb/articles without checking for existing slugs first, the v1 API does what you’d expect a naive REST API to do:

First push — creates the article with the slug you asked for. Returns 201.
Second push with the same slug — creates a NEW article with the slug suffixed -1. Returns 201.
Third push — creates a third article with the slug suffixed -2. Returns 201.

You end up with three articles, three slugs (update-payment-method, update-payment-method-1, update-payment-method-2), three rows in the database, and three entries in retrieval. A customer searching sees whichever one ranks highest.
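To make the suffixing behavior concrete, here is a small in-memory model of it. This is only a sketch of the responses described above, not the server's implementation; the FakeKb class and its post method are hypothetical names invented for the illustration.

```python
class FakeKb:
    """Toy in-memory stand-in for POST /api/v1/kb/articles (illustration only)."""

    def __init__(self):
        self.articles = {}  # slug -> article body

    def post(self, slug, body):
        # Mimic the documented behavior: never overwrite, suffix on collision.
        final_slug, n = slug, 0
        while final_slug in self.articles:
            n += 1
            final_slug = f"{slug}-{n}"
        self.articles[final_slug] = body
        return 201, final_slug


kb = FakeKb()
for _ in range(3):
    status, slug = kb.post("update-payment-method", "How to update your card")
    print(status, slug)
```

Every call reports success, which is why a naive loop gives no signal that anything went wrong.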

The v1 POST /articles endpoint is intentionally a create, not an upsert. The behavior is consistent — but the consequence is that idempotency is a property of the caller, not the API.

How to recover from accidental duplicates

If you ran a naive POST loop and now have duplicates:

List your articles. Identify slugs that look like <base>-1, <base>-2, etc.
Confirm each suffix is a duplicate of the base (open both in the editor and check). Sometimes a tenant legitimately has version-1 and version-2 articles.
Delete the duplicates via DELETE /api/v1/kb/articles/<id> or in the in-app editor.

The Atender push script ships with a duplicate finder for exactly this case. Use it as a pattern if you’re writing your own: match the regex ^(.+)-(\d+)$ against the slug list and treat a suffixed entry as a deletion candidate only when its base slug also exists. Confirm each candidate as in step 2 before deleting it.
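A minimal sketch of that matching step, assuming you already have the list of slugs in memory. The function name find_duplicate_candidates is hypothetical and this is not the code that ships with the Atender push script; it only reports candidates and deletes nothing.

```python
import re

SUFFIXED = re.compile(r"^(.+)-(\d+)$")


def find_duplicate_candidates(slugs):
    """Return {base_slug: [suffixed slugs]} where the base slug also exists."""
    known = set(slugs)
    candidates = {}
    for slug in slugs:
        match = SUFFIXED.match(slug)
        if match and match.group(1) in known:
            candidates.setdefault(match.group(1), []).append(slug)
    return candidates


slugs = [
    "update-payment-method",
    "update-payment-method-1",
    "update-payment-method-2",
    "release-notes-2",  # no "release-notes" base, so not flagged
]
print(find_duplicate_candidates(slugs))
```

Requiring the base slug to exist filters out slugs that merely end in a number, but a flagged pair can still be two legitimate articles, so a manual check stays part of the process.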

Why this design?

The naive answer to “should POST upsert?” is yes, but real systems get caught on the edges:

Slug collisions across tenants are not collisions. Slugs are unique per tenant, not globally. An upsert by slug is fine within a tenant, but ambiguous across tenants if anyone ever shares a key.
The upsert key is debatable. Is it slug? Title? customMetadata.sourcePath? Different tools want different keys.
The semantics of POST are well known: 201 means created, so the caller knows the article didn’t exist before. An upsert obscures whether a previous version was overwritten.

Putting idempotency in the client lets each pipeline pick the semantics that suit it.
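As an illustration of what picking the semantics can look like, the sketch below builds the lookup map with a caller-supplied key function. The helper names (index_by, by_slug, by_source_path) and the sample article are hypothetical; the only assumption about the data is that listed articles expose id, slug, and a customMetadata.sourcePath your pipeline wrote earlier.

```python
def index_by(articles, key_fn):
    """Map each article's upsert key to its ID, using a caller-chosen key."""
    return {key_fn(a): a["id"] for a in articles}


def by_slug(article):
    return article["slug"]


def by_source_path(article):
    # Assumes the pipeline stored sourcePath under customMetadata when pushing.
    return article["customMetadata"]["sourcePath"]


# Sample data for illustration only.
articles = [
    {
        "id": "a1",
        "slug": "update-payment-method",
        "customMetadata": {"sourcePath": "docs/billing/update-card.md"},
    },
]
print(index_by(articles, by_slug))
print(index_by(articles, by_source_path))
```

A docs-as-code pipeline might prefer the source path so that renaming a slug updates the article instead of creating a new one; a hand-maintained script might be happy keying on slug.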

A safer pattern

If you’re writing the script:

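# Sketch: assumes `session` is an authenticated requests.Session, BASE is the KB API root
# (ending in /api/v1/kb), list_articles() returns dicts with "slug" and "id",
# and each item in articles_to_push is a dict that doubles as the request payload.
existing = {a["slug"]: a["id"] for a in list_articles()}
for article in articles_to_push:
    slug = article["slug"]
    if slug in existing:
        resp = session.patch(f"{BASE}/articles/{existing[slug]}", json=article)
    else:
        resp = session.post(f"{BASE}/articles", json=article)
    resp.raise_for_status()  # stop on the first HTTP error

That’s the idempotency pattern in a handful of lines. Wrap it in retries and fuller error handling for production.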
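One way to add retries is a small wrapper with exponential backoff. This is a generic sketch using only the standard library; with_retries, its parameters, and the flaky demo function are hypothetical names, and the delays are arbitrary defaults rather than recommended values.

```python
import time


def with_retries(call, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run call(); on an exception, wait and retry, up to `attempts` tries in total."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))  # backoff: 0.5s, then 1s, ...


# Demo: a call that fails twice, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


print(with_retries(flaky, sleep=lambda seconds: None))  # skip real sleeping in the demo
```

Retrying a PATCH is safe because it targets an existing ID. Retrying a POST needs more care: if the first request succeeded but its response was lost, a blind retry creates a suffixed duplicate, so it is worth re-checking the slug before retrying a create.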

What about content that hasn’t changed?

A PATCH with identical content is a no-op for the customer-facing experience:

The article body in the database is overwritten with the same string.
Embeddings normally stay as they are: the trigger checks whether content-bearing fields (title, summary, body, keywords) changed, and identical text usually skips regeneration.
The public help center serves the same article it was already serving.
lastReviewedAt is not bumped — that field has its own dedicated endpoint, /articles/:id/mark-reviewed.

If you want to bump lastReviewedAt after confirming an article is still accurate (the quarterly review use case), call the dedicated endpoint rather than re-PATCHing the body.