Image generation moderation

More and more image models ship with built-in moderation; a typical rejection is an HTTP 400 with an error code such as content_policy_violation, which is the code the OpenAI example below checks for. As with safety and prompt-injection failures, it is important to catch these rejections and tell "our bug" apart from an "abuse attempt".

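import os

import openai
import requests


def safe_generate(prompt: str, user_id: str):
    try:
        return openai.images.generate(model="dall-e-3", prompt=prompt, n=1)
    except openai.BadRequestError as e:
        # Recent SDK versions expose the API error code as e.code; some payloads
        # nest it under body["error"], so check both.
        body = getattr(e, "body", None)
        err = body.get("error") if isinstance(body, dict) else None
        code = getattr(e, "code", None) or (err.get("code") if isinstance(err, dict) else None)
        if code == "content_policy_violation":
            # Moderation refusal: notify, then re-raise so the caller still sees the error.
            push("🚫 Image moderation",
                 f"User: {user_id}\nPrompt:\n{prompt[:600]}",
                 priority=7)
        raise


def push(title: str, message: str, priority: int):
    # Post a notification to the endpoint configured via NOTIFLY_URL / NOTIFLY_TOKEN.
    requests.post(f"{os.environ['NOTIFLY_URL']}/message",
                  params={"token": os.environ["NOTIFLY_TOKEN"]},
                  json={"title": title, "message": message, "priority": priority},
                  timeout=5)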

If a single user_id triggers 5 or more refusals within an hour, raise a separate alert at priority=10 (potential abuse case).
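
One way to implement the hourly threshold is a per-user sliding window of refusal timestamps. The sketch below is a minimal in-memory version, not a definitive implementation. The names RefusalTracker and record are hypothetical, and it assumes a single process; with several workers the timestamps would need to live in a shared store instead.

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 3600  # one hour, as in the rule above
THRESHOLD = 5          # refusals per window that trigger the abuse alert


class RefusalTracker:
    """Hypothetical in-memory sliding-window counter of refusals per user."""

    def __init__(self, window=WINDOW_SECONDS, threshold=THRESHOLD):
        self.window = window
        self.threshold = threshold
        self._events = defaultdict(deque)  # user_id -> refusal timestamps

    def record(self, user_id, now=None):
        """Record one refusal; return True when the user reaches the threshold."""
        now = time.time() if now is None else now
        events = self._events[user_id]
        events.append(now)
        # Drop refusals that have aged out of the window.
        while events and events[0] <= now - self.window:
            events.popleft()
        if len(events) >= self.threshold:
            # Reset so the alert fires once per burst rather than on every refusal.
            events.clear()
            return True
        return False
```

Inside the content_policy_violation branch of safe_generate, this could be wired in with a module-level tracker = RefusalTracker() and a check such as: if tracker.record(user_id): push("Possible image abuse", f"User: {user_id}", 10). Clearing the window after an alert is a design choice to avoid repeated priority=10 notifications for the same burst; drop that line if every refusal past the threshold should alert.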