Architecture

On-premise AI for schools: data sovereignty, predictable cost, and the Mac mini case study

Updated May 2026 · 7 min read
AU$2K
Mac mini M4 Pro cost — amortises in 3–5 months at a 600-student school
~30W
Mac mini power draw under AI load vs 600W+ for a GPU rig
6–12mo
Typical break-even vs cloud AI API at school scale
$0
Per-query cost after hardware purchase

In this article

  1. The cloud economics problem for schools
  2. Why the Mac mini is the right hardware for school AI
  3. Unified memory architecture: the technical advantage
  4. Cost comparison: cloud vs on-premise at scale
  5. Data sovereignty: the procurement argument cloud AI cannot make
  6. How IndiLearn deploys on the school's Mac mini

The standard pitch for AI in education is a cloud subscription. Monthly per-student fees. API costs that scale with usage. Privacy policies that require lawyers to interpret. Terms of service that change without notice.

IndiLearn's architecture is a deliberate rejection of that model. Every school that deploys IndiLearn in its full on-premise configuration gets a Mac mini in its server room, runs all AI inference on its own network, pays nothing per query, and holds a structural guarantee that no student data ever leaves the building.

This is not primarily a cost argument, although the cost case is strong. It is a procurement argument. Cloud-based AI education tools cannot credibly tell a school principal or DET procurement officer that student data is safe, because the structural reality of cloud inference makes that claim untenable. On-premise inference makes it true by design.

The cloud economics problem for schools

Cloud AI APIs are billed per token — per unit of text processed. For an education platform, this creates a cost structure that scales with exactly what you want to scale: student use. Every lesson, every student submission, every feedback generation is a cost event. The more the platform is used, the higher the bill.

For IndiLearn's feedback platform, a realistic estimate at Sonnet-level pricing with prompt caching is AU$6–12 per student per year in API costs — for 100 feedback calls across a ten-week unit, plus theme synthesis. At that rate, a 600-student school spends AU$3,600–7,200 annually in API fees alone, every year, forever, scaling as students use it more.
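
As a sanity check on that range, the arithmetic can be written out directly. A back-of-envelope sketch in TypeScript; every figure in it (per-token prices, token counts per call, units per year, exchange rate) is an illustrative assumption rather than measured IndiLearn usage:

```typescript
// Back-of-envelope cloud API cost per student per year.
// All inputs are assumptions for illustration, not measured usage.
const USD_PER_M_INPUT = 3.0;   // Sonnet-class input price, USD per million tokens (assumed)
const USD_PER_M_OUTPUT = 15.0; // Sonnet-class output price, USD per million tokens (assumed)
const AUD_PER_USD = 1.5;       // assumed exchange rate

const inputTokensPerCall = 2_000; // rubric + submission, net of prompt caching (assumed)
const outputTokensPerCall = 500;  // generated feedback (assumed)
const callsPerUnit = 100;         // feedback calls across a ten-week unit (from the estimate above)
const unitsPerYear = 4;           // assumed school year of four ten-week units

const usdPerCall =
  (inputTokensPerCall / 1e6) * USD_PER_M_INPUT +
  (outputTokensPerCall / 1e6) * USD_PER_M_OUTPUT;

const audPerStudentPerYear = usdPerCall * callsPerUnit * unitsPerYear * AUD_PER_USD;
console.log(audPerStudentPerYear.toFixed(2)); // ≈ 8.10, inside the AU$6–12 band
```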

That is before platform fees, support costs, or the compliance overhead of managing a cloud data processor relationship under the Australian Privacy Principles.

Why the Mac mini is the right hardware for school AI

The conventional assumption for AI inference hardware is a GPU server. High-VRAM NVIDIA cards, dedicated cooling, server rack, specialist installation. This is the right answer for some use cases. It is the wrong answer for a primary school's server room.

Apple's M-series Mac mini is the right answer for school AI for three reasons: cost (AU$2,000–3,000, paid once), power (roughly 30W under AI load, against 600W+ for a GPU rig), and form factor (a small, near-silent box that needs no rack, no dedicated cooling, and no specialist installation).

Unified memory architecture: the technical advantage

The reason the Mac mini can run large language models that previously required expensive GPU hardware is Apple Silicon's unified memory architecture. In traditional systems, the CPU and GPU have separate memory pools. For LLM inference, the model weights must be loaded into GPU VRAM — which is limited, expensive, and the primary constraint on which models can run on a given machine.

Apple Silicon eliminates this distinction. The CPU, GPU, and Neural Engine share a single memory pool. A Mac mini M4 Pro with 48GB of unified memory can devote the bulk of that pool to model weights (macOS reserves a portion for the system), which puts it in the territory of a dedicated 48GB GPU. No consumer card offers VRAM on that scale at anything close to the Mac mini's price point.

What 48GB unified memory runs

A Mac mini M4 Pro with 48GB unified memory runs Llama 3.1 70B in 4-bit quantisation (roughly 40GB of weights), a model that rivals cloud API quality for structured generation tasks like feedback writing and content generation. For IndiLearn's specific use cases, where the inference tasks are bounded (feedback against a rubric, decodable word generation within a grapheme set), even smaller models provide the quality required.
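
The sizing arithmetic behind that claim is worth making explicit. A sketch, assuming a typical 4-bit quantisation format that averages about 4.5 bits per weight once scales and metadata are included:

```typescript
// Approximate weight footprint of a quantised model, in gigabytes.
// Real quantised files (e.g. GGUF) carry metadata beyond this raw estimate.
function weightFootprintGB(paramCount: number, bitsPerWeight: number): number {
  return (paramCount * bitsPerWeight) / 8 / 1e9; // bits → bytes → GB
}

// Llama 3.1 70B at ~4.5 effective bits per weight (typical 4-bit quant).
console.log(weightFootprintGB(70e9, 4.5).toFixed(1)); // ≈ 39.4 GB

// That leaves several GB of the 48GB pool for the OS, KV cache, and the
// application: tight but workable on the Mac mini, and out of reach for
// any single consumer GPU.
```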

Cost comparison: cloud vs on-premise at scale

School size      Cloud API cost / year   Mac mini cost (once)   Break-even
200 students     AU$1,200–2,400          AU$2,000–3,000         12–24 months
400 students     AU$2,400–4,800          AU$2,000–3,000         6–12 months
600 students     AU$3,600–7,200          AU$2,000–3,000         3–5 months
1,000 students   AU$6,000–12,000         AU$2,000–3,000         2–4 months

The hardware cost does not increase with student count. The cloud cost scales linearly. For any school with more than approximately 200 students using the platform meaningfully, the on-premise model delivers cost savings within the first year.
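
The break-even column is a single division: one-off hardware cost over the monthly cloud spend it displaces. A sketch using midpoints of the ranges in the table:

```typescript
// Months until one-off hardware spend matches cumulative cloud API fees.
function breakEvenMonths(hardwareAud: number, cloudAudPerYear: number): number {
  return hardwareAud / (cloudAudPerYear / 12);
}

// Midpoint hardware cost (AU$2,500) against midpoint annual API spend.
console.log(breakEvenMonths(2_500, 3_600).toFixed(1)); // 400 students → ≈ 8.3 months
console.log(breakEvenMonths(2_500, 9_000).toFixed(1)); // 1,000 students → ≈ 3.3 months
```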

Data sovereignty: the procurement argument cloud AI cannot make

Cost is the obvious argument. The more important one is data sovereignty.

When a school runs AI inference on a cloud API, the following things are structurally true regardless of what the vendor's privacy policy says: student data leaves the school network, is processed on servers the school does not control, is subject to the laws of the jurisdiction where those servers are located, and is handled under terms of service that can change at any time.

On-premise inference makes none of this true. The data is processed on a machine the school owns, on a network the school controls, subject to Australian law, under terms that cannot change because there is no third-party cloud provider in the loop.

The DET procurement context

Queensland DET and other state education departments require specific permission for student data to be processed by external platforms. For a cloud AI tool, every student interaction is technically a data transmission to an external processor. Schools must obtain and manage consent, assess vendor compliance, and monitor usage. For an on-premise tool, there is no external transmission — and therefore no consent barrier, no vendor assessment, and no monitoring overhead. The procurement conversation is structurally simpler.

How IndiLearn deploys on the school's Mac mini

IndiLearn's on-site deployment uses Ollama as the local inference server. Ollama exposes an OpenAI-compatible API at the local network address, meaning IndiLearn's application code communicates with the local model in exactly the same way it would communicate with a cloud API. The privacy guarantee is architectural, not application-level — no code change is required to move from cloud to on-premise.
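
Concretely, that means pointing a standard OpenAI client at the Mac mini instead of the cloud. A minimal sketch, assuming the official openai npm client; the hostname is hypothetical:

```typescript
import OpenAI from "openai";

// Same client code as a cloud deployment; only the base URL differs.
// "macmini.school.internal" is a hypothetical hostname for the school's machine.
const client = new OpenAI({
  baseURL: "http://macmini.school.internal:11434/v1", // Ollama's OpenAI-compatible endpoint
  apiKey: "ollama", // required by the client library, ignored by Ollama
});

const completion = await client.chat.completions.create({
  model: "llama3.1:70b", // must match a model already pulled onto the Mac mini
  messages: [
    { role: "system", content: "You write feedback against the supplied rubric." },
    { role: "user", content: "Rubric: ...\n\nStudent submission: ..." },
  ],
});

console.log(completion.choices[0].message.content);
```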

The CoachingProvider abstraction in IndiLearn's codebase ensures this. The application never sees whether inference is running locally or in the cloud — it calls the provider interface and receives a response. The school's Mac mini configuration swaps one implementation for another without the teacher, student, or application being aware of the change.
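
In outline, the pattern looks like the sketch below. Only the CoachingProvider name comes from IndiLearn's codebase; the method signature, class names, and endpoints are illustrative:

```typescript
// Illustrative sketch of the provider abstraction described above.
interface CoachingProvider {
  generateFeedback(submission: string, rubric: string): Promise<string>;
}

// Shared helper: both providers speak the same OpenAI-style wire protocol.
async function chatCompletion(
  baseUrl: string, model: string, submission: string, rubric: string,
): Promise<string> {
  const res = await fetch(`${baseUrl}/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model,
      messages: [
        { role: "system", content: `Write feedback against this rubric:\n${rubric}` },
        { role: "user", content: submission },
      ],
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}

class CloudProvider implements CoachingProvider {
  generateFeedback(submission: string, rubric: string): Promise<string> {
    return chatCompletion("https://api.cloud-vendor.example/v1", "sonnet-class-model", submission, rubric);
  }
}

class LocalOllamaProvider implements CoachingProvider {
  generateFeedback(submission: string, rubric: string): Promise<string> {
    // Same protocol, different host: the swap is pure configuration.
    return chatCompletion("http://macmini.school.internal:11434/v1", "llama3.1:70b", submission, rubric);
  }
}

// Deployment config selects the implementation; callers never know which is active.
function makeProvider(onPremise: boolean): CoachingProvider {
  return onPremise ? new LocalOllamaProvider() : new CloudProvider();
}
```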

What this means for a principal

A principal considering IndiLearn's on-premise configuration can tell the school community, genuinely and verifiably: no student data from this tool leaves our network. Not "our vendor says it doesn't." Not "we believe they comply." The data physically cannot leave — the processing happens on our hardware, on our network, under our control. That is a procurement statement ChatGPT cannot make.

Your school's AI. On your hardware. Under your control.

Register your school's interest for the on-premise pilot in 2026.
