The first thing that broke when I shipped GeminiOmni

Apr 29, 2026

I shipped the first public version of GeminiOmni on a Wednesday afternoon. It was a stripped-down landing page with a single working tool — an early version of the Nano Banana 2 image editor — and a button that said "Try it free, no sign-up."

The site went up at 14:42 Berlin time. By 15:05, my Google Cloud billing dashboard was screaming.

This post is the full breakdown of what went wrong, what it cost, and what I changed before the end of the same workday. If you're building anything that touches a paid AI API, treat this as a free lesson — most people don't write about the embarrassing parts.

The actual sequence

Here's the timeline from my own incident log (yes, I keep one — recommend it):

  • 14:42 Deploy succeeded. Site live at geminiomni-ai.com. Posted a quick demo to a small Mastodon instance and went to make a coffee.
  • 14:58 Came back to a Slack notification from Google Cloud: "Your bill has exceeded $5 for the current cycle." This was alarming because the site had been live for sixteen minutes.
  • 15:05 Logged into the Gemini API console. Saw 412 image generation requests in the last 20 minutes, from 38 distinct IPs across three continents.
  • 15:08 Pulled the site offline. Killed the API key. Generated a new one I wouldn't use yet.
  • 15:42 Started rewriting the architecture.
  • 18:30 Shipped v2 with the fix. Brought the site back online.
  • 23:00 Total bill for the four-hour misadventure: $84.30.

For context, I'd budgeted $200 for the entire first month's API spend. I burned 42% of it in twenty-six minutes because of a mistake I should have known better than to make.

What I had done

The first version of the image editor was wired up like this: a Next.js page with a React component that took the user's prompt and the uploaded image, then called Google's Gemini Image API directly from the browser using the @google/genai client and an API key embedded in NEXT_PUBLIC_GEMINI_API_KEY.

Yes. I put a paid API key in NEXT_PUBLIC_*. Which Next.js dutifully bundles into the client JavaScript. Which is visible to anyone who opens DevTools.

In my defense — and this defense is not very good — the original prototype was a private localhost experiment. I'd wired the key as a client-side env var because I was iterating fast and didn't want to bother setting up an API route. When I prepared the public deploy, I changed approximately forty things and the key location wasn't one of them.

The mistake wasn't unique to me. Half a dozen people I respect have done variations of this in 2024-2025. It's the most common preventable AI cost incident I've seen.

How it got scraped that fast

The detail that surprised me was how quickly the requests started coming in. I'd posted to a Mastodon instance with maybe 600 followers, none of whom would have done this maliciously. The traffic had to be coming from somewhere else.

Here's what I figured out from the request logs:

  1. Within five minutes of going live, an automated crawler hit the site, extracted the bundled NEXT_PUBLIC_GEMINI_API_KEY from the JavaScript, and published it to a key-trading channel I didn't know existed.
  2. Within ten minutes, that key was being used by approximately 38 different IPs to generate images that had nothing to do with my product. Most of the prompts I sampled in the logs were generic ("a red sports car", "anime girl in a forest"). A few were testing the model's limits with adversarial prompts.
  3. By minute fifteen, somebody was running a tight loop generating images as fast as Google would allow.

This is the part I want indie builders to understand: there are crawlers whose entire job is to scrape new sites for API keys. They watch certificate transparency logs for newly issued SSL certificates, hit the new domains, parse the JavaScript bundles, and publish discovered keys within minutes. You don't have to be famous. You don't have to be on a popular subreddit. You just have to be online with a leaked key.

The architecture I shipped that afternoon

The fix was structural, not cosmetic. I rewrote the data flow so that no API key is ever sent to the browser, and that architecture is what GeminiOmni runs today:

Browser (no keys, ever)
   ↓ POST /api/ai/generate { prompt, imageFile }
Next.js API route (server-side)
   ↓ envConfigs.gemini_api_key (kept on server)
Google Gemini API

Three concrete changes:

1. Move the API key to a server-only env var. Renamed NEXT_PUBLIC_GEMINI_API_KEY to GEMINI_API_KEY. Next.js only inlines env vars prefixed with NEXT_PUBLIC_ into the client bundle; everything else stays on the server. Belt and suspenders: I added a startup check that crashes the app if any env var name containing "API_KEY" or "SECRET" starts with NEXT_PUBLIC_.

2. Create an /api/ai/* proxy. Every model call goes through a single server-side endpoint. The browser sends the prompt and any user-uploaded files; the server attaches the API key, makes the actual Gemini API call, and returns just the result. This sounds obvious but it's the entire fix.

3. Add a per-IP rate limit on the proxy. Even with the key safe, I didn't want one user spamming Free-tier generations to drain credits. The proxy enforces 10 requests per IP per minute on the free tier and 60 per minute on Pro+. This wouldn't have prevented the original incident — the attackers were rotating IPs — but it makes the next class of abuse much harder.

The total code change was about 180 lines added and 90 removed. It took me 2 hours 48 minutes from starting the rewrite at 15:42 to "site live again with new architecture." Not great, but not catastrophic.

What I'd do differently next time

The honest answer is: never ship a public AI app with a client-side API call to a paid service. Not for "just a prototype." Not for "MVP testing." Not for "a quick demo to friends." The threat model isn't malicious users. It's automated infrastructure that already exists for exactly this scenario.

A few more lessons that fell out of the incident:

Pre-flight check the bundle. Before shipping, run grep -r "NEXT_PUBLIC_" .next/static/ and read what's in there. If anything looks like a credential, fix it before going live. I now have this as a pre-deploy git hook.

Set a hard daily cap. Google Cloud lets you cap your Gemini API at a daily dollar amount. I set mine to $50 for the first month and $200 after. If the cap is hit, the API returns errors and I get an email — much better than letting it run to $5,000 in a single bad night.

Use the lowest-cost model that ships. I was burning $0.151 per Nano Banana 2 4K edit on a Free-tier surface. Even before the attack, my unit economics were fragile. I now route Free-tier generations through Nano Banana 2K at $0.045 per edit and keep 4K for Pro+. The 3.4× cost reduction would have made the $84 bill closer to $25.

Cap the prompt length. One of the abusers was sending massive prompts with embedded image data, which Gemini bills per token. The proxy now hard-caps incoming prompt size at 50KB. A legitimate user has never bumped this; an abuser hits it in the first request.

The receipts

For anyone who wants to verify I'm not making this up: my Gemini API console for April 28 shows 412 requests from 14:42 to 15:08, with the spend report showing $83.96 in image generation charges and $0.34 in input tokens. I emailed Google Cloud support that evening and they declined to refund — fair enough, the key was legitimately mine and I'd authorized its use. They suggested exactly the architecture I'd already shipped.

The key takeaway, written large enough that I can read it from across the room:

AI API keys are bearer credentials. They go on the server. They never go anywhere else.

If you take one thing from this post, take that.

— Lena

Lena Hoffmann