Image generation moderation
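More and more image models ship with built-in moderation; a typical rejection is
an HTTP 400 response carrying a safety or content-policy error code
(content_policy_violation in the example below). As with safety and prompt-injection
handling, it’s important to catch these rejections and to distinguish “our bug”
(a malformed request) from an “abuse attempt” (a prompt the provider refused).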
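```python
import os

import openai
import requests


def safe_generate(prompt: str, user_id: str):
    try:
        return openai.images.generate(model="dall-e-3", prompt=prompt, n=1)
    except openai.BadRequestError as e:
        # The SDK may expose the error code as e.code or inside e.body
        # (possibly nested under "error"), so check each location.
        body = getattr(e, "body", None)
        body = body if isinstance(body, dict) else {}
        err = body.get("error") or body
        code = getattr(e, "code", None) or (err.get("code") if isinstance(err, dict) else None)
        if code == "content_policy_violation":
            # Moderation refusal rather than a bug on our side: notify.
            push("🚫 Image moderation", f"User: {user_id}\nPrompt:\n{prompt[:600]}", priority=7)
        raise  # every 400 still propagates to the caller


def push(title: str, message: str, priority: int) -> None:
    """Send a notification; parameter names match the keyword call above."""
    requests.post(
        f"{os.environ['NOTIFLY_URL']}/message",
        params={"token": os.environ["NOTIFLY_TOKEN"]},
        json={"title": title, "message": message, "priority": priority},
        timeout=5,
    )
```

If a single user_id accumulates 5 or more refusals within an hour, raise a separate alert at priority=10 (a potential abuse case).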