How I stopped shipping my Anthropic API key to thousands of phones
When I added an AI feature to my app, I did it the way every tutorial shows you: call the Anthropic API straight from the React Native client. It worked beautifully in the simulator. It also meant my API key was sitting inside the app bundle, ready for anyone with ten minutes and a proxy tool to lift.
Mobile apps are not a safe place for secrets. Anything that ships to the device — environment variables, “obfuscated” constants, values pulled at runtime — can be read by a motivated stranger. With an LLM key, that's not just embarrassing; it's your bill, and potentially someone else's abuse on your account.
The fix: a thin proxy you own
The pattern is boring, and that's the point. Instead of the app talking to Anthropic directly, it talks to a tiny serverless function you control. The key lives on the server. The app authenticates to your proxy, the proxy adds the real key, forwards the request, and streams the response back.
App → your Vercel proxy (holds the key) → Anthropic (streams back)
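To make the flow concrete, here is a minimal sketch of what such a proxy handler can look like. It assumes a runtime with the Web-standard Request, Response and fetch globals (Node 18+ or a Vercel function). The names handleChat, upstreamHeaders and isValidToken are made up for illustration, and the fetch implementation is injectable only so the handler can be exercised without touching the network. The endpoint URL and the x-api-key and anthropic-version headers are the ones Anthropic's Messages API expects.

```typescript
const ANTHROPIC_URL = "https://api.anthropic.com/v1/messages";

// Headers the proxy sends upstream. The real key is attached here,
// on the server, and never appears in the app bundle.
function upstreamHeaders(apiKey: string): Record<string, string> {
  return {
    "content-type": "application/json",
    "x-api-key": apiKey,
    "anthropic-version": "2023-06-01",
  };
}

type FetchLike = (url: string, init: RequestInit) => Promise<Response>;

// Hypothetical handler: checks the app's session token, forwards the
// request body to Anthropic, and hands the response stream straight back.
async function handleChat(
  req: Request,
  apiKey: string,
  isValidToken: (token: string) => boolean, // assumption: supplied by your auth layer
  fetchImpl: FetchLike = fetch,
): Promise<Response> {
  if (req.method !== "POST") {
    return new Response("Method not allowed", { status: 405 });
  }

  const auth = req.headers.get("authorization") ?? "";
  const token = auth.startsWith("Bearer ") ? auth.slice(7) : "";
  if (!isValidToken(token)) {
    return new Response("Unauthorized", { status: 401 });
  }

  const body = await req.json();
  const upstream = await fetchImpl(ANTHROPIC_URL, {
    method: "POST",
    headers: upstreamHeaders(apiKey),
    // Force streaming so the app receives tokens as they are generated.
    body: JSON.stringify({ ...body, stream: true }),
  });

  // Pass the upstream body through without buffering it.
  return new Response(upstream.body, {
    status: upstream.status,
    headers: {
      "content-type": upstream.headers.get("content-type") ?? "text/event-stream",
    },
  });
}
```

In a real deployment the key would come from a server-side environment variable, and you would probably want to whitelist which fields of the body the app is allowed to set (model, max_tokens and so on) rather than forwarding it wholesale as this sketch does.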
Three pieces make it real: a proxy endpoint that forwards requests and streams responses, a session issuer that hands each install a short-lived token, and a small client helper so the app side stays a one-liner like useAnthropicChat().
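The session issuer is the least obvious of the three, so here is one way a short-lived token could work. This is a sketch under my own assumptions, not a description of any particular library: the token is a base64url-encoded JSON payload plus an HMAC-SHA256 signature, and issueToken, verifyToken and the installId field are illustrative names. It uses only Node's built-in crypto module.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumed token format: base64url(JSON payload) + "." + base64url(HMAC-SHA256).
interface SessionPayload {
  installId: string; // hypothetical per-install identifier sent by the app
  exp: number; // expiry time, milliseconds since the Unix epoch
}

function sign(data: string, secret: string): string {
  return createHmac("sha256", secret).update(data).digest("base64url");
}

// Issue a token that is valid for ttlMs milliseconds from `now`.
function issueToken(
  installId: string,
  secret: string,
  ttlMs: number,
  now: number = Date.now(),
): string {
  const payload: SessionPayload = { installId, exp: now + ttlMs };
  const encoded = Buffer.from(JSON.stringify(payload)).toString("base64url");
  return `${encoded}.${sign(encoded, secret)}`;
}

// Return the payload if the signature matches and the token has not expired,
// otherwise null.
function verifyToken(
  token: string,
  secret: string,
  now: number = Date.now(),
): SessionPayload | null {
  const [encoded, signature] = token.split(".");
  if (!encoded || !signature) return null;

  const expected = Buffer.from(sign(encoded, secret));
  const actual = Buffer.from(signature);
  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) {
    return null;
  }

  try {
    const payload = JSON.parse(
      Buffer.from(encoded, "base64url").toString("utf8"),
    ) as SessionPayload;
    return payload.exp > now ? payload : null;
  } catch {
    return null;
  }
}
```

The signing secret stays on the server alongside the API key. A leaked token is still a problem, but it expires on its own and it is not your Anthropic key.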
The part everyone gets wrong: streaming
It's easy to write a proxy that returns the whole response as one blob. It's much nicer, and much harder, to stream tokens as they arrive, so your chat UI feels alive instead of frozen. That means parsing server-sent events on the way through, including the genuinely annoying case where a single SSE event gets split across two network chunks and your naïve parser drops half a word.
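The usual cure for the split-event problem is to buffer: append each incoming chunk to a string, and only emit an event once its terminating blank line has arrived. Below is a minimal sketch of that idea. SseParser is an illustrative name, and the parser is simplified: it handles the event and data fields with LF or CRLF line endings, and ignores the rest of the SSE format (id, retry, comments, bare CR line endings).

```typescript
interface SseEvent {
  event: string;
  data: string;
}

class SseParser {
  // Holds any trailing text that does not yet form a complete event.
  private buffer = "";

  // Feed one decoded text chunk. Returns only the events completed so far;
  // a partial event stays in the buffer until the rest of it arrives.
  push(chunk: string): SseEvent[] {
    // Normalise on the whole buffer so a CRLF split across chunks still matches.
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");

    const events: SseEvent[] = [];
    let boundary = this.buffer.indexOf("\n\n");
    while (boundary !== -1) {
      const raw = this.buffer.slice(0, boundary);
      this.buffer = this.buffer.slice(boundary + 2);

      let event = "message"; // default event type when no event field is sent
      const data: string[] = [];
      for (const line of raw.split("\n")) {
        if (line.startsWith("event:")) {
          event = line.slice(6).trim();
        } else if (line.startsWith("data:")) {
          // Drop the single optional space after the colon.
          data.push(line.slice(5).replace(/^ /, ""));
        }
      }
      if (data.length > 0) {
        events.push({ event, data: data.join("\n") });
      }
      boundary = this.buffer.indexOf("\n\n");
    }
    return events;
  }
}

// Usage: an event whose JSON is cut in half mid-word.
const demo = new SseParser();
demo.push('event: content_block_delta\ndata: {"text":"Hel'); // returns []
demo.push('lo"}\n\n'); // returns one event with data '{"text":"Hello"}'
```

The same trap exists one level down: a multi-byte UTF-8 character can be split across two byte chunks. If you are reading raw bytes, decoding them with a single TextDecoder and `decode(bytes, { stream: true })` before calling push keeps those characters intact.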
Add rate limiting so one bad actor can't run up your bill, and install-based auth so you know which device is calling, and you've got something that's actually production-ready rather than demo-ready.
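Rate limiting can start out very simple. The sketch below is a fixed-window counter keyed by install ID, and FixedWindowLimiter is an illustrative name. One loud assumption: it keeps its counts in process memory, which is fine for showing the logic but not enough on serverless, where instances come and go and do not share state. A real deployment would keep the counters in a shared store.

```typescript
interface WindowState {
  start: number; // when the current window opened, ms since epoch
  count: number; // requests seen in the current window
}

class FixedWindowLimiter {
  private windows = new Map<string, WindowState>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request may proceed, false if the key is over its limit.
  allow(key: string, now: number = Date.now()): boolean {
    const current = this.windows.get(key);

    // No window yet, or the old one has expired: open a fresh window.
    if (!current || now - current.start >= this.windowMs) {
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }

    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}

// Usage: at most 20 requests per minute per install (numbers are arbitrary).
const limiter = new FixedWindowLimiter(20, 60_000);
const allowed = limiter.allow("install-1");
```

In the proxy, the key would be the install ID taken from the verified session token, and a false result would map to an HTTP 429 response.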
Why I turned it into a boilerplate
This took me the better part of a week to get right the first time — debugging streaming chunk-splitting at midnight with a newborn asleep next door. And it's the exact kind of plumbing that's near-identical across every app and that nobody should have to rebuild from scratch.
So I extracted it, stripped out everything app-specific, wrote a test suite and a deploy guide, and put it up for the price of a takeaway. If you're about to wire an LLM into a React Native app, this is the boring-but-essential layer, done.