Private AI vs. public LLMs: a CFO’s data-risk checklist
When sending data to OpenAI is fine, when it isn’t, and what running GPT-4-class models inside your own infrastructure actually looks like.
The real risk isn’t the model — it’s where the data goes
With a public API, every prompt — contracts, case files, customer data — leaves your perimeter for a third party. For healthcare, finance, legal, or government that isn’t a technical footnote; it’s a compliance problem (data residency, sector regulation). The CFO’s question isn’t "how good is the model?" but "who touches my data, and under which jurisdiction?"
Open models are already enough for most of the work
For information extraction, summarization, Q&A over documents, and text generation, high-parameter open-weights models — families like Llama (Meta) and Qwen (Alibaba) — reach practical parity with top proprietary models on most enterprise tasks. They run inside your infrastructure — on-prem or a dedicated VPC with data residency in Mexico — exposing an OpenAI-compatible API, so existing code is reused unchanged.
The economics flip at scale
Accessible enterprise hardware — from a single high-end GPU server to multi-GPU configurations — handles dozens to hundreds of concurrent users depending on model size. At high inference volume, owning the hardware typically amortizes within one to two years versus accumulated per-token API costs, while removing the data leakage risk entirely.
The checklist
Is your data regulated or under NDA? Is data residency a requirement? Is your inference volume high and steady? Do you need fine-tuning on your own data without it leaving? If you answer "yes" to two or more, private AI stops being a luxury and becomes the default.