Hallucination spike
“Hallucinations” cannot be caught by a single rule, but you can count surrogate signals:
- the share of answers that do not parse as the expected JSON schema;
- logprobs / confidence_score below a threshold;
- the judge LLM gives 0 more often than usual;
- a regex sanity check fails (“the model named a non-existent DB field”);
- links in the response that return 404.
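As a rough illustration of two of the signals above (schema parsing and the regex sanity check), the sketch below turns one raw response into a small signals dict. The required keys, the list of known fields, and the `users.<name>` naming pattern are all assumptions made for the example, not part of the recipe; replace them with your own schema and conventions.

```python
import json
import re

# Hypothetical schema: the keys every answer is expected to contain.
REQUIRED_KEYS = {"answer", "sources"}
# Hypothetical list of columns that really exist in the database.
KNOWN_FIELDS = {"id", "email", "created_at"}
# Assumes the model refers to fields as users.<name>; adjust to your own format.
FIELD_RE = re.compile(r"\busers\.([a-z_]+)\b")


def collect_signals(raw_text):
    """Return surrogate signals for one raw model response."""
    try:
        data = json.loads(raw_text)
        # "Parsed" here means: valid JSON object that has all required keys.
        parsed = isinstance(data, dict) and REQUIRED_KEYS.issubset(data)
    except json.JSONDecodeError:
        parsed = False
    # Fields the model mentioned that are not in the known list.
    mentioned = set(FIELD_RE.findall(raw_text))
    unknown = sorted(mentioned - KNOWN_FIELDS)
    return {"parsed": parsed, "unknown_fields": unknown}
```

The `parsed` key is the one `observe` reads; `unknown_fields` is an extra signal that you could fold into the score in the same way.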
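import os
import statistics
from collections import deque

import requests

# Sliding window of the last 200 per-answer suspicion scores (in-process only).
WIN = deque(maxlen=200)


def push(title, message, priority):
    # Send an alert to the notification endpoint configured via environment variables.
    requests.post(
        f"{os.environ['NOTIFLY_URL']}/message",
        params={"token": os.environ["NOTIFLY_TOKEN"]},
        json={"title": title, "message": message, "priority": priority},
        timeout=5,
    )


def observe(answer_dict):
    # Count how many surrogate signals fired for this answer.
    score = 0
    if not answer_dict.get("parsed"):
        score += 1
    if answer_dict.get("logprob_mean", 0) < -2:
        score += 1
    if answer_dict.get("judge", 1) == 0:
        score += 1
    WIN.append(score)
    # Wait for at least 50 answers before computing a rate.
    if len(WIN) >= 50:
        # Share of answers in the window with at least one signal.
        rate = statistics.mean([1 if x else 0 for x in WIN])
        if rate > 0.20:
            push("👻 Hallucination spike",
                 f"Suspicious answers: {int(rate * 100)}% over a window of {len(WIN)}",
                 priority=8)

Store this counter in Redis so that it survives restarts and covers all instances.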
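One possible shape for the Redis-backed counter is a capped list shared by all instances. This is a minimal sketch, not a definitive implementation: the key name and the `record_score` helper are hypothetical, and `client` is assumed to be a redis-py client (or anything exposing the same `lpush`, `ltrim` and `lrange` methods).

```python
# Hypothetical key name; all instances must use the same one.
WINDOW_KEY = "hallucination:window"
WINDOW_SIZE = 200  # same window size as the in-process version
MIN_SAMPLES = 50   # do not compute a rate on fewer answers than this


def record_score(client, score):
    """Push one score into the shared window and return the suspicious rate.

    Returns None until MIN_SAMPLES answers have been recorded.
    """
    client.lpush(WINDOW_KEY, int(score))
    # Keep only the newest WINDOW_SIZE entries.
    client.ltrim(WINDOW_KEY, 0, WINDOW_SIZE - 1)
    # redis-py returns bytes by default; int() accepts them.
    scores = [int(x) for x in client.lrange(WINDOW_KEY, 0, -1)]
    if len(scores) < MIN_SAMPLES:
        return None
    return sum(1 for s in scores if s > 0) / len(scores)
```

With redis-py installed, the client could be created with `redis.Redis.from_url(...)` and the returned rate compared against the same 0.20 threshold before calling `push`. The three commands are not atomic as written; wrapping them in a pipeline would be a natural next step if several instances write at once.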
Related recipes
- Eval / quality regression: formal evaluation.
- Vector DB / RAG: a common cause of hallucinations (context wasn’t found).