AI in customer service

A chatbot that knows your products, processes and documents, not just stock answers. Three setup tiers, from a hosted FAQ bot to a RAG pipeline with a local model, with an honest assessment of risk, data protection and escalation.

Most chatbots are dressed-up FAQ lists. They spot keywords and serve canned answers, and the moment a question doesn't fit the schema, “please contact our support” pops up. That frustrates customers and relieves no one.

A RAG system (Retrieval-Augmented Generation) works differently: it searches your actual documents, product descriptions, manuals and knowledge articles, then phrases an answer in natural language with visible source attribution. The hallucination risk drops significantly because the model is instructed to answer only from your sources. It doesn't disappear, which is why escalation logic is part of the architecture.

Prerequisite in all tiers: maintained knowledge sources. A system is only as good as the state of the documents it answers from. So every project starts with the question “what knowledge do we have, how current is it, who maintains it” — not with the technology choice.

Three setup tiers

Which tier fits depends on three factors: sensitivity of the content, available upkeep capacity and request volume.

Tier 1

Hosted FAQ bot with sources

Tool mix

  • SaaS chatbot with RAG function (e.g. Intercom Fin, Tidio Lyro, Crisp, HubSpot AI) — the providers host vector index and language model, the bot only gets your curated sources
  • Knowledge sources: FAQ collection, product PDFs, help-centre articles — centrally maintained, one place as single source of truth
  • Answers with source attribution (“this answer comes from article X”) — the customer sees where the knowledge comes from
  • Escalation to live chat or ticket system as soon as the bot needs a third attempt or the customer's sentiment turns negative
  • Standard reports in the SaaS dashboard: response rate, escalation rate, most-asked questions

Fit

SMBs with a manageable set of questions, classic product info, without strict data-protection requirements beyond the SaaS DPA. Fastest path into production.

Effort & cost

Setup 2–5 days. Running cost approx. €40–250/month depending on bot provider and volume. Knowledge-source upkeep as an internal task.

Trade-off

Answers and embeddings run through the SaaS provider, usually under a GDPR DPA but rarely hosted locally in the EU. Unproblematic for customer service that involves no detailed personal data; the wrong tier for regulated industries or confidential matters.

Tier 2

Self-hosted RAG with frontier LLM

Tool mix

  • Your own RAG pipeline on your own infrastructure: document ingest, chunking, embeddings, vector store (e.g. PostgreSQL with pgvector or Qdrant)
  • Language model as an API from frontier providers (Claude, GPT, Gemini) with a GDPR DPA: answers are generated externally, while the document store and index stay with you
  • n8n workflows for ingest of new documents, regular reindexing, logging of conversations
  • Chat widget on the website, integration with email mailbox or ticket system (Zammad, Freshdesk, Zendesk)
  • Escalation logic with conversation summary for staff — the human takes over with full context
  • Weekly AI report: which questions stayed unanswered, where sources are missing, which topic clusters are emerging

Fit

SMBs with medium volume (50–500 requests per day), maintained knowledge base and a person responsible for ongoing upkeep. Good balance between answer quality and control.

Effort & cost

Setup 8–15 days. Running cost approx. €60–200/month (API calls, hosting of database and workflows). Scales with request volume.

Trade-off

Frontier means: on every answer, the question and the retrieved source snippets go to the LLM provider. Acceptable for most SMBs with a DPA, usually not for customer conversations involving health, legal or banking data. Answer quality is very good; costs scale with volume.

Tier 3

Full-self-hosted with local model

Tool mix

  • Tier 2 in full scope, but language model local: Llama 3, Qwen 2.5, Mistral or similar on your own GPU server or on-premise
  • Embedding model also local (e.g. BGE, E5, or a German model)
  • Optional knowledge-graph component: relations between documents, products, customers — for structured answers, not just full text
  • Audit log of every conversation: what was asked, which sources retrieved, which answer generated — for compliance and quality assurance
  • Optional: two-stage answer generation (draft by the local model, staff approval before sending on sensitive topics)

Fit

Industries with strict data-protection requirements (healthcare, law firms, tax advisors, insurers, public sector) or cases where competitive data shouldn't go to external APIs.

Effort & cost

Setup 15–30 days, hardware investment or server rental (€150–500/month) on top. Answer quality depends on the chosen model — usually good for standard questions, weaker than frontier on complex phrasing.

Trade-off

In 2026 local models have a noticeable quality gap to frontier models, especially on nuanced phrasing and multilingual service. They are very good for structured FAQ answers; for complex consulting the escalation threshold has to sit lower.
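The audit log from the Tier 3 tool mix can be as simple as one JSON line per answered question. The following is a minimal sketch, not a compliance-grade implementation; the function name `audit_record` and the field names are assumptions chosen for illustration.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def audit_record(question: str, source_ids: list, answer: str,
                 now: Optional[datetime] = None) -> str:
    """Serialise one conversation turn as a single JSON line."""
    now = now or datetime.now(timezone.utc)
    record = {
        "timestamp": now.isoformat(),   # when the answer was generated
        "question": question,           # what was asked
        "sources": source_ids,          # which sources were retrieved
        "answer": answer,               # which answer was generated
    }
    # ensure_ascii=False keeps umlauts and other non-ASCII text readable
    return json.dumps(record, ensure_ascii=False)


line = audit_record(
    "Wie lange ist die Rückgabefrist?",
    ["returns-policy#2"],               # hypothetical chunk ID
    "30 Tage ab Lieferung.",
)
```

Appending such lines to a file or table gives reviewers the three facts the tier description names: question, retrieved sources, generated answer.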

What your team should understand

A RAG chatbot isn't a “set it up once, it runs” system. Six competency areas have to be anchored in the team so the system doesn't quietly drift:

Understand RAG architecture

Ingest, chunking, embedding, retrieval, re-ranking, generation: the six stages where answer quality emerges or breaks. What a good chunk size is, and when hybrid search (semantic + full-text) brings better hits.
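Two of these ideas fit into a few lines: chunking with an overlapping window, and merging a semantic and a full-text result list. The sketch below uses reciprocal rank fusion as one common way to combine rankings; the page does not say which fusion method is used, and the window sizes, the constant `k = 60` and the chunk IDs are assumptions for illustration.

```python
from collections import defaultdict


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list:
    """Split text into windows of `size` words that overlap by `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def reciprocal_rank_fusion(rankings: list, k: int = 60) -> list:
    """Merge ranked lists of chunk IDs; higher ranks contribute more."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by fused score, ties broken by ID for a stable order
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


semantic_hits = ["c3", "c1", "c7"]   # hypothetical vector-search result
fulltext_hits = ["c1", "c9", "c3"]   # hypothetical keyword-search result
fused = reciprocal_rank_fusion([semantic_hits, fulltext_hits])
```

Chunks that appear in both lists (`c1`, `c3`) end up ahead of chunks found by only one search, which is the effect hybrid search is meant to have.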

Maintain the knowledge base

Which sources work for RAG (structured texts, maintained FAQs, current product descriptions) and which don't (outdated PDFs, contradictory versions, internal notes without context). Why the knowledge source caps answer quality, not the model.

Prompt design for service answers

How tone, source obligation and hallucination protection get written into the system prompt. Why “only answer if the source supports it” does more than “be friendly”.
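As an illustration of what such a prompt can look like, here is a minimal sketch that assembles tone, source obligation and a fallback instruction into one string. The wording of the rules and the function name `build_system_prompt` are assumptions, not a tested production prompt.

```python
def build_system_prompt(brand_tone: str, sources: list) -> str:
    """Combine tone, source obligation and the retrieved sources into one prompt."""
    source_block = "\n\n".join(
        f"[{i}] {s['title']}\n{s['text']}"
        for i, s in enumerate(sources, start=1)
    )
    return (
        f"You are a customer service assistant. Tone: {brand_tone}.\n"
        # Source obligation: the model may only use the numbered sources
        "Answer only from the numbered sources below and cite them as [n].\n"
        # Hallucination protection: an explicit way out instead of guessing
        "If the sources do not support an answer, say so and offer a "
        "handover to a human colleague. Never guess prices, deadlines "
        "or contract terms.\n\n"
        f"Sources:\n{source_block}"
    )


prompt = build_system_prompt(
    "warm and direct",
    [{"title": "Returns policy", "text": "Items can be returned within 30 days."}],
)
```

The tone takes one line; the source rule and the fallback take the rest, which mirrors where the leverage sits.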

Escalation and handover

When the bot has to hand off (sentiment, third attempt, sensitive topic, explicit wish). How handover with conversation summary works so the human doesn't have to start from scratch.
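The four triggers can be written down as one small decision function. This is a sketch under assumptions: the sentiment scale from -1 to 1, the threshold of -0.4 and the topic list are placeholders that would be calibrated in the pilot.

```python
from dataclasses import dataclass
from typing import Optional

SENSITIVE_TOPICS = {"complaint", "contract", "legal"}  # assumed list


@dataclass
class TurnState:
    attempts: int          # answer attempts on the current question
    sentiment: float       # assumed scale: -1 (angry) to 1 (happy)
    topic: str
    asked_for_human: bool


def escalation_reason(state: TurnState,
                      sentiment_floor: float = -0.4) -> Optional[str]:
    """Return why the bot must hand off, or None if it may keep answering."""
    if state.asked_for_human:
        return "explicit wish"
    if state.topic in SENSITIVE_TOPICS:
        return "sensitive topic"
    if state.attempts >= 3:
        return "third attempt"
    if state.sentiment <= sentiment_floor:
        return "sentiment"
    return None


reason = escalation_reason(TurnState(3, 0.1, "shipping", False))
```

Returning the reason rather than a plain boolean lets the handover summary tell staff why the conversation arrived on their desk.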

Data protection and logging

What may be logged, what must be anonymised, how deletion and access requests get implemented technically. When a notice about AI usage in chat is mandatory (EU AI Act, Art. 50).
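A first anonymisation step before logging can be sketched with two regular expressions. The patterns below are deliberately naive assumptions: they catch common e-mail and phone formats, will also mask long digit runs such as order numbers, and do not replace a proper data-protection review.

```python
import re

# Simple patterns for illustration; real deployments need broader coverage
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s/()-]{6,}\d")


def redact(text: str) -> str:
    """Replace e-mail addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[email]", text)
    return PHONE.sub("[phone]", text)


logged = redact("Mail me at anna@example.com or call +49 170 1234567.")
```

Names, addresses and free-text health details are not covered by patterns like these and need separate handling.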

Analysis and knowledge gaps

How to read from conversations what's missing: topic clusters the bot doesn't cover, sources that contradict each other, phrasings that always trigger escalation. That feeds the next iteration of the knowledge base.

What gets automated

Eight steps the pipeline takes over in running operations — most can be implemented in each of the three tiers, some only from tier 2:

Document ingest

New PDFs, Markdown or HTML articles are automatically detected, chunked, embedded and added to the index — no manual upload step needed.
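Automatic detection can be as simple as comparing content hashes against the last run. The sketch below assumes a local folder as the source and an in-memory dictionary as the state store; in a real setup the state would live in the database, and the changed files would then go through chunking and embedding.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of the file content, used to notice edits."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_changed(folder: Path, seen: dict,
                 suffixes: tuple = (".md", ".html", ".pdf")) -> list:
    """Return new or modified documents and remember their digests in `seen`."""
    changed = []
    for path in sorted(folder.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        digest = file_digest(path)
        if seen.get(str(path)) != digest:
            seen[str(path)] = digest   # update state so the next run skips it
            changed.append(path)
    return changed
```

Calling `find_changed` on a schedule returns only the files that need re-indexing, so unchanged documents are not embedded again.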

Answer generation with source attribution

Every answer visibly references the sources it was assembled from. The customer sees the evidence, and an unsupported claim is much easier for a human to notice.
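One way to enforce this is to refuse to send any answer that has no sufficiently relevant source behind it. A minimal sketch, assuming retrieval returns a similarity score between 0 and 1 and that 0.5 is a sensible cut-off (both are assumptions to calibrate):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hit:
    title: str      # title of the source document
    score: float    # assumed similarity score, 0 to 1


def attach_sources(answer: str, hits: list,
                   min_score: float = 0.5) -> Optional[str]:
    """Append source titles to the answer, or return None if nothing supports it."""
    used = [h for h in hits if h.score >= min_score]
    if not used:
        return None  # caller should escalate instead of answering
    refs = "; ".join(h.title for h in used)
    return f"{answer}\n\nSources: {refs}"


reply = attach_sources(
    "You can return items within 30 days.",
    [Hit("Returns policy", 0.82), Hit("Shipping FAQ", 0.31)],
)
```

A `None` result is the signal for the escalation path rather than a reason to let the model answer freely.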

Language detection

Question in German, English or Turkish? The bot answers automatically in the language of the request, provided the knowledge sources contain something relevant.

Sentiment-triggered escalation

Frustration, complaint signals or explicit staff requests immediately trigger handover to a human — with conversation summary for whoever takes over.

Ticket creation on knowledge gaps

If the bot finds no answer in the sources, a ticket is created in the service system and the question is added to the weekly report — the gap becomes visible instead of fading away.
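The gap logic can be sketched as a small tracker that records every question whose best retrieval score stays below a threshold. The class name, the threshold and the in-memory dictionary are assumptions; a real setup would call the ticket system's API at the marked point.

```python
from dataclasses import dataclass, field


@dataclass
class GapTracker:
    min_score: float = 0.5                       # assumed similarity cut-off
    tickets: dict = field(default_factory=dict)  # question -> times asked

    def record_if_gap(self, question: str, best_score: float) -> bool:
        """Record the question as a knowledge gap if no source was good enough."""
        if best_score >= self.min_score:
            return False
        # Normalise case and whitespace so repeats count on the same ticket
        key = " ".join(question.lower().split())
        self.tickets[key] = self.tickets.get(key, 0) + 1
        # Placeholder: here a ticket would be created in the service system
        return True


tracker = GapTracker()
tracker.record_if_gap("Do you ship to Norway?", 0.2)
tracker.record_if_gap("do you ship  to norway?", 0.3)
```

Counting repeats per question gives the weekly report a ranking of which gaps to close first.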

Conversation summary for staff

On handover, the human gets a three-line summary of the dialog so far plus relevant customer data if linked — no repeat frustration for the customer.

Weekly topic report

Which question clusters appeared, which sources are missing, where the answers were hesitant? A narrative analysis instead of bare numbers.
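The numbers behind such a report can come from a simple count over the logged conversations; the narrative part would be written on top of them. A sketch, assuming each logged conversation carries a `topic` label and an `answered` flag (both field names are hypothetical):

```python
from collections import Counter


def weekly_topic_report(conversations: list, top_n: int = 3) -> list:
    """Summarise the most frequent topics and how often they went unanswered."""
    topics = Counter(c["topic"] for c in conversations)
    gaps = Counter(c["topic"] for c in conversations if not c["answered"])
    return [
        f"{topic}: requests={total}, unanswered={gaps[topic]}"
        for topic, total in topics.most_common(top_n)
    ]


report = weekly_topic_report([
    {"topic": "shipping", "answered": True},
    {"topic": "shipping", "answered": False},
    {"topic": "returns", "answered": True},
])
```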

Auto-routing by topic

Sales, support, complaint, return — the bot allocates when it escalates, and the ticket lands with the right person or department.
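Routing can start with a keyword table before a classifier is worth the effort. The sketch below is a stand-in for whatever topic detection the bot actually uses; the queue names follow the four categories above, while the keywords are assumptions.

```python
# Checked in order; the first matching queue wins
ROUTES = {
    "return": ("return", "refund", "send back"),
    "complaint": ("complaint", "unacceptable", "disappointed"),
    "sales": ("quote", "pricing", "offer", "buy"),
}


def route(message: str, default: str = "support") -> str:
    """Pick the queue whose keywords appear in the message, else the default."""
    text = message.lower()
    for queue, keywords in ROUTES.items():
        if any(keyword in text for keyword in keywords):
            return queue
    return default


queue = route("I want to send back my order")
```

Plain substring matching misfires on ambiguous wording, so the default queue should be one where a human can re-route quickly.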

What stays manual on purpose

Customer service is relationship work. These six points belong in human hands — a pipeline can support them, but not replace them:

Brand voice and tone

How the bot sounds (matter-of-fact, warm, direct, distant) is a brand decision, not an algorithm question. It is anchored in the system prompt and checked in sample conversations.

Curation of the knowledge sources

Which documents go in and which don't, what counts as binding, how contradictory sources are resolved: that's editorial work, not a technical task.

Define escalation rules

Which topics the bot may answer at all, when it must hand off immediately (contract details, complaints, sensitive topics) — that's a business decision.

Spot-check answer quality

Read 20–30 real conversations every week, check answers against sources, follow up on escalations. Without this discipline, the bot decays into a black box.

Handle sensitive cases personally

Complaints, contract details, individual decisions — handover to a human isn't failure, it's a feature. The bot isn't the relationship carrier.

Keep the knowledge base current

Product changes, price updates, new processes — someone in the team is responsible for keeping the sources current. A system with stale sources is worse than none.

How the build runs

Getting from the first audit of the knowledge sources to full self-operation usually takes 8–14 weeks, depending on tier, upkeep state of the sources and integration depth:

1

Inventory

Which requests currently land in the service inbox? Which knowledge sources exist (FAQ, help centre, manuals, internal wikis)? Where are they maintained, where stale?

2

Use-case cut

Which questions the bot is allowed to answer and which not — a list with clear boundaries. Advisory, complaint and contract topics typically go straight to humans, product info can go to the bot.

3

Choose the setup tier

Hosted, frontier or full-self-hosted — depending on data-protection requirements, volume, existing tech stack and budget. Reasoned recommendation, you decide.

4

Build the knowledge base

Collect sources, remove duplicates and contradictions, define the chunking strategy, produce embeddings, build the index. This phase caps answer quality long-term, so it's done thoroughly.

5

Configure the bot

Write the system prompt (tone, source obligation, escalation rules), test sample conversations, add the AI usage notice, build the handover logic.

6

Integration

Chat widget on the website, integration with the service inbox or ticket system, linking with customer data where sensible and GDPR-compliant.

7

Training & hands-on handover

3–4-hour workshop with the service team: maintain the knowledge base, analyse conversations, adjust escalation rules, interpret weekly reports.

8

Guided pilot month

First 4 weeks with weekly sparring: review 20–30 real conversations together, close knowledge gaps, adjust tone, calibrate escalation thresholds.

Effort and investment depend on the chosen tier and the upkeep state of the knowledge sources — a concrete estimate comes after the source audit and as part of the pricing overview.

Ready for the next step?

Free intro call, no strings attached. In 30 minutes you'll know whether and how AI can help your business.
