Custom cloud function for integrity checks
Most recipes in the section “Solo development with AI” boil down to a single pattern:
“Once every N minutes (or on request), run a small piece of code in the cloud that checks something, and if the result is not what was expected, send me a push.”
In Yandex Cloud this takes a single Cloud Function with a timer trigger. Here is a universal skeleton that you can copy and adapt for:
- a synthetic user (login to the app, search, place an order);
- end-to-end RAG verification (request → embeddings → Qdrant → LLM → answer);
- verifying idempotency of a webhook handler;
- reconciling counters in YDB and Postgres between services;
- checking that a cron job ran and placed a file in S3.
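As one illustration of adapting the pattern, the webhook idempotency item from the list could be sketched roughly as follows. This is a minimal sketch, not part of the skeleton: `send` and `count_effects` are hypothetical hooks into your own service, the event shape is invented, and `FakeWebhook` exists only so the example runs without a network.

```python
import uuid
from typing import Callable


def check_webhook_idempotency(
    send: Callable[[dict], int],
    count_effects: Callable[[str], int],
) -> None:
    """Deliver one event twice and assert that it was applied exactly once.

    send(event) delivers the event and returns the HTTP status code.
    count_effects(event_id) returns how many records the event produced.
    Both callables are hypothetical hooks into your own service.
    """
    event = {"id": f"canary-{uuid.uuid4()}", "type": "canary.ping"}
    first = send(event)
    second = send(event)  # deliberate duplicate delivery
    assert first == 200, f"first delivery: {first}"
    assert second == 200, f"duplicate delivery: {second}"
    applied = count_effects(event["id"])
    assert applied == 1, f"event applied {applied} times"


class FakeWebhook:
    """In-memory stand-in for a webhook handler, used only for the demo."""

    def __init__(self, dedupe: bool = True) -> None:
        self.dedupe = dedupe
        self.applied = []  # ids of events that produced a side effect

    def send(self, event: dict) -> int:
        # A deduplicating handler ignores an id it has already applied.
        if not (self.dedupe and event["id"] in self.applied):
            self.applied.append(event["id"])
        return 200

    def count(self, event_id: str) -> int:
        return self.applied.count(event_id)


if __name__ == "__main__":
    good = FakeWebhook(dedupe=True)
    check_webhook_idempotency(good.send, good.count)  # passes silently

    bad = FakeWebhook(dedupe=False)
    try:
        check_webhook_idempotency(bad.send, bad.count)
    except AssertionError as e:
        print(e)  # prints: event applied 2 times
```

In a real check, `send` would wrap a `requests.post` to the webhook URL and `count_effects` would query an internal counter endpoint, in the same style as the checks in the skeleton.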
Skeleton: a single file index.py
    import json
    import os
    import time
    import traceback

    import requests

    NOTIFLY_URL = os.environ["NOTIFLY_URL"]
    NOTIFLY_TOKEN = os.environ["NOTIFLY_TOKEN"]
    APP_BASE = os.environ.get("APP_BASE", "https://app.example.com")

    def notify(title, msg, prio=8):
        requests.post(f"{NOTIFLY_URL}/message",
                      params={"token": NOTIFLY_TOKEN},
                      json={"title": title, "message": msg, "priority": prio},
                      timeout=5)

    # --- set of checks ----------------------------------------------------
    def check_login():
        r = requests.post(f"{APP_BASE}/api/login",
                          json={"user": "synthetic@example.com",
                                "pass": os.environ["CANARY_PASS"]},
                          timeout=10)
        assert r.status_code == 200, f"login: {r.status_code}"
        assert r.json().get("token"), "no token in response"

    def check_rag_query():
        t0 = time.time()
        r = requests.post(f"{APP_BASE}/api/ask",
                          json={"q": "What is our refund policy?"},
                          timeout=20)
        assert r.status_code == 200, f"ask: {r.status_code}"
        body = r.json()
        assert "refund" in body.get("answer", "").lower(), "answer drifted"
        assert (time.time() - t0) < 8, "rag too slow"

    def check_db_counters():
        # example: the number of orders for yesterday should match
        # the number of records in the replica
        a = int(requests.get(f"{APP_BASE}/internal/orders/yesterday/count",
                             headers={"X-Probe": os.environ["PROBE_KEY"]},
                             timeout=5).text)
        b = int(requests.get(f"{APP_BASE}/internal/replica/orders/yesterday/count",
                             headers={"X-Probe": os.environ["PROBE_KEY"]},
                             timeout=5).text)
        assert a == b, f"counter drift: app={a} replica={b}"

    CHECKS = {
        "login": check_login,
        "rag-query": check_rag_query,
        "db-counters": check_db_counters,
    }

    # --- entrypoint --------------------------------------------------------
    def handler(event, context):
        """
        YC Function entrypoint. Invoked by the timer trigger or manually (yc fn invoke).
        The set of checks can be narrowed with a payload: {"checks": ["login"]}
        """
        payload = event if isinstance(event, dict) else {}
        if isinstance(event, dict) and "body" in event:  # HTTP call
            try:
                payload = json.loads(event["body"] or "{}")
            except Exception:
                payload = {}
        if not isinstance(payload, dict):  # e.g. the body was a JSON list
            payload = {}

        selected = payload.get("checks") or list(CHECKS)
        failed = []

        for name in selected:
            fn = CHECKS.get(name)
            if not fn:
                continue  # unknown check names are skipped silently
            try:
                fn()
            except Exception as e:
                failed.append((name, e, traceback.format_exc()))

        if failed:
            body = "\n\n".join(f"❌ {n}: {type(e).__name__}: {e}" for n, e, _ in failed)
            # the keyword must match the signature notify(title, msg, prio=...)
            notify(f"🩺 integrity: {len(failed)} fail", body, prio=9)

        return {"statusCode": 200, "body": json.dumps({
            "ok": len(failed) == 0,
            "failed": [n for n, _, _ in failed],
            "checked": selected,
        })}

Deployment
    cd integrity-check
    zip -qr ../fn.zip .

    yc serverless function version create \
      --function-name notifly-integrity \
      --runtime python311 \
      --entrypoint index.handler \
      --memory 256MB --execution-timeout 30s \
      --environment NOTIFLY_URL=...,NOTIFLY_TOKEN=...,APP_BASE=...,CANARY_PASS=...,PROBE_KEY=... \
      --source-path ../fn.zip

Timer trigger:

    yc serverless trigger create timer \
      --name notifly-integrity-timer \
      --cron-expression '* * * * ? *' \
      --invoke-function-name notifly-integrity \
      --invoke-function-service-account-id <sa-id>

After that, the function is invoked once per minute, and any failure produces a push to Notifly.
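One thing worth checking against the deployment settings: the function is created with `--execution-timeout 30s`, while the request timeouts in the skeleton (10 + 20 + 5 + 5 seconds) can add up to roughly 40 seconds or more in the worst case. If the platform stops the function first, no push is sent at all. A possible guard, sketched here under the assumption that checks run sequentially, is to stop starting new checks once a time budget is spent. `run_with_budget` and its parameters are illustrative names, not part of the skeleton.

```python
import time
from typing import Callable, Dict, List, Tuple


def run_with_budget(
    checks: Dict[str, Callable[[], None]],
    budget_s: float = 25.0,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[Tuple[str, Exception]], List[str]]:
    """Run checks in order, but start no new check once the budget is spent.

    Returns (failed, skipped). A check that is already running is not
    interrupted, so budget_s should leave headroom below the platform timeout.
    """
    deadline = clock() + budget_s
    failed: List[Tuple[str, Exception]] = []
    skipped: List[str] = []
    for name, fn in checks.items():
        if clock() >= deadline:
            skipped.append(name)  # report it instead of risking a hard timeout
            continue
        try:
            fn()
        except Exception as e:
            failed.append((name, e))
    return failed, skipped


if __name__ == "__main__":
    now = [0.0]  # fake clock so the demo runs instantly

    def slow() -> None:
        now[0] += 30.0  # pretends to take 30 seconds

    def broken() -> None:
        raise AssertionError("login: 500")

    failed, skipped = run_with_budget(
        {"broken": broken, "slow": slow, "late": broken},
        budget_s=25.0,
        clock=lambda: now[0],
    )
    print([n for n, _ in failed], skipped)  # prints: ['broken'] ['late']
```

Skipped checks would then be listed in the push next to the failed ones, so a slow dependency shows up as an alert rather than as silence. Tightening the individual request timeouts or raising the execution timeout are simpler alternatives.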
When to invoke manually from your application
The same endpoint can be called from your service at suspicious moments: after a deployment, after a migration, after a bulk data import. The request runs the checks immediately, without waiting for the next timer tick:
    import requests

    # iam_token: an IAM token for an account that is allowed to invoke the function
    requests.post(
        "https://functions.yandexcloud.net/d4e.../",
        json={"checks": ["login", "rag-query"]},
        headers={"Authorization": f"Bearer {iam_token}"},
        timeout=35,  # slightly above the function's 30s execution timeout
    )

Why this is better than “just curl /health”
- Real business-logic checks, not just “the process is alive”.
- One central file containing all your “knowledge” about what should work. Convenient for a solo developer — it’s both tests and monitoring.
- Runs from the cloud — doesn’t depend on your laptop, VPN, or cafe Wi‑Fi.
- Cheap: with the cron expression * * * * ? * this is about 43,200 invocations in a 30-day month (44,640 in a 31-day month) at 1–2 seconds each, a load the Cloud Functions free tier is designed for.
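The cost estimate in the last bullet can be reproduced with a few lines of arithmetic. The sketch below assumes usage is metered in invocations and in memory multiplied by execution time, and takes the 256 MB figure from the deployment command; it deliberately contains no free-tier limits, so compare the results with the current Yandex Cloud pricing page.

```python
MEMORY_GB = 256 / 1024        # --memory 256MB from the deployment command
RUNS_PER_DAY = 60 * 24        # cron '* * * * ? *' fires once per minute


def monthly_usage(days: int, seconds_per_run: float) -> tuple:
    """Return (invocations, GB-hours) for one month of per-minute runs."""
    invocations = RUNS_PER_DAY * days
    gb_hours = invocations * seconds_per_run * MEMORY_GB / 3600
    return invocations, gb_hours


if __name__ == "__main__":
    for days in (30, 31):
        for seconds in (1, 2):
            n, gbh = monthly_usage(days, seconds)
            print(f"{days} days, {seconds}s per run: "
                  f"{n} invocations, {gbh:.1f} GB-hours")
```

For a 30-day month at 2 seconds per run this gives 43,200 invocations and 6.0 GB-hours.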
What to put in the alert text
- the name of the failed check;
- exception class + the first lines of the traceback;
- timestamp when the failure started (for deduplication if it fails 60 times in a row);
- link to the function dashboard (https://console.yandex.cloud/.../functions/...).
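The four items above could be assembled along these lines. This is only a sketch: `track_failure` and `format_alert` are hypothetical helpers, and the `state` mapping is assumed to live in some storage that survives between invocations (a function instance does not reliably keep memory from one run to the next), with a plain dict standing in for it here.

```python
import traceback
from datetime import datetime, timezone
from typing import MutableMapping, Optional, Tuple


def track_failure(
    state: MutableMapping[str, datetime],
    name: str,
    failed: bool,
    now: datetime,
) -> Tuple[bool, Optional[datetime]]:
    """Return (should_push, started_at) for one check result.

    `state` maps a check name to the time its current failure streak began.
    """
    if not failed:
        state.pop(name, None)      # recovered: forget the streak
        return False, None
    if name in state:
        return False, state[name]  # same streak: already pushed once
    state[name] = now              # first failure of a new streak
    return True, now


def format_alert(name, exc, tb_text, started_at, dashboard_url, tb_lines=4):
    """Build the push text: check name, exception, start time, traceback, link."""
    tb_head = "\n".join(tb_text.strip().splitlines()[:tb_lines])
    return (
        f"check: {name}\n"
        f"error: {type(exc).__name__}: {exc}\n"
        f"failing since: {started_at.isoformat()}\n"
        f"traceback:\n{tb_head}\n"
        f"dashboard: {dashboard_url}"
    )


if __name__ == "__main__":
    state = {}  # stand-in for persistent storage
    try:
        raise AssertionError("login: 500")
    except AssertionError as e:
        exc, tb_text = e, traceback.format_exc()
    push, since = track_failure(state, "login", True, datetime.now(timezone.utc))
    if push:
        print(format_alert("login", exc, tb_text, since,
                           "https://console.yandex.cloud/.../functions/..."))
```

With this shape, a check that fails 60 times in a row produces one push carrying the time the streak began, and a new push only after the check has recovered and failed again.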