Cold-start serverless

Cold starts are the invisible enemy of serverless: in the first minute of the day, right after a release, or after a long idle period, the user waits 5–10 seconds for a response instead of 200 ms. To avoid learning about this from complaints, run a scheduled probe that calls the function when its container is likely to be cold and measures time to first byte (TTFB).

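import os
import time
import uuid

import requests  # third-party: pip install requests

URL = os.environ["FN_URL"]

def handler(event, context):
    # A unique query parameter defeats CDN/gateway caching so the request
    # actually reaches the function. It does not force a new instance: the
    # container is cold only if the function was idle or just redeployed.
    t0 = time.perf_counter()
    # stream=True returns as soon as the headers arrive, which approximates TTFB.
    with requests.get(URL, params={"probe": uuid.uuid4().hex},
                      timeout=15, stream=True):
        ms = int((time.perf_counter() - t0) * 1000)
    if ms > 3000:
        push("🥶 Serverless cold-start", f"{ms} ms to first response", 8 if ms > 8000 else 5)
    return {"statusCode": 200, "body": str(ms)}

def push(title, message, priority):
    # Post the alert to the notification service configured in NOTIFLY_URL.
    requests.post(f"{os.environ['NOTIFLY_URL']}/message",
                  params={"token": os.environ["NOTIFLY_TOKEN"]},
                  json={"title": title, "message": message, "priority": priority},
                  timeout=5)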

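Run it every 5–15 minutes from a timer trigger. Keep in mind that frequent calls can themselves keep an instance warm on many platforms, so slow results are most likely right after a release or an idle period. It is useful to duplicate the probe for each region and each endpoint, and to put that name into the alert text.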
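One way to cover several regions or endpoints from a single function is sketched below. It is only an illustration under assumptions: the `PROBE_TARGETS` variable, the helper names `load_targets`, `build_alert` and `probe_all`, and the idea of treating a failed request as a top-priority alert are not part of the original recipe. The timing function is passed in as `measure`, so the same logic can be checked without touching the network.

```python
import json

WARN_MS = 3000   # same thresholds as the single-endpoint probe
CRIT_MS = 8000

def load_targets(raw):
    """Parse a JSON object mapping a label (region or endpoint name) to a URL.

    `raw` would typically come from a hypothetical PROBE_TARGETS env variable.
    """
    targets = json.loads(raw)
    if not isinstance(targets, dict) or not targets:
        raise ValueError("targets must be a non-empty JSON object")
    return targets

def build_alert(name, ms):
    """Return (title, message, priority) for a slow or failed probe, else None."""
    title = "🥶 Serverless cold-start"
    if ms is None:  # assumption: `measure` returns None on timeout or error
        return (title, f"[{name}] no response within the timeout", 8)
    if ms > WARN_MS:
        priority = 8 if ms > CRIT_MS else 5
        return (title, f"[{name}] {ms} ms to first response", priority)
    return None

def probe_all(targets, measure):
    """Probe every target with `measure(url)`; return ({name: ms}, [alerts])."""
    results, alerts = {}, []
    for name, url in targets.items():
        ms = measure(url)
        results[name] = ms
        alert = build_alert(name, ms)
        if alert is not None:
            alerts.append(alert)
    return results, alerts
```

In the handler, `measure` would wrap the timed `requests.get` call from the probe, and each returned tuple can be sent with `push(*alert)`, so every alert carries the name of the region or endpoint that was slow.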

LLM latency degradation — the other half of the latency chain.
Custom cloud function integrity check — the common skeleton.

Related recipes