Privacy

Why student data must stay on-site: the case against cloud AI in schools

IndiLearn · Education Technology · Updated May 2026 · 5 min read

2024: Australian GenAI in Schools Framework implemented Term 1
$1M: Federal investment to update student privacy principles for AI
100%: IndiLearn data stays within the school network, always
0: Cloud API calls made with student data

In this article

  1. Why children's data is uniquely at risk
  2. Student work and intellectual property
  3. The structural problem with cloud AI in schools
  4. What the Australian framework actually requires
  5. Why consent forms don't fix the problem
  6. On-site inference: the only architectural solution
  7. How to evaluate any AI tool before bringing it to your school

In 2024, researchers at UNSW revealed that photographs of Australian children — scraped from YouTube, Flickr, and school newsletters without consent — had been used to train AI models to recognise what children look like. The data had been out in the world for years. Nobody knew. Nobody had agreed to it.

In the same period, a criminal group known as Radiant targeted a London childcare chain, claiming to have stolen names, addresses, photographs and personal information of 8,000 children. The data was used to extort bitcoin ransoms from parents and providers.

These are not hypothetical risks. They are documented outcomes of student data leaving the environments where it was created. And yet schools are being sold cloud-based AI tools — where student data is the input — at an accelerating rate.

Why children's data is uniquely at risk

Children's data has a specific property that makes it more valuable to bad actors than adult data: it has no history. No credit record. No criminal record. No financial footprint. Identity theft targeting a child can go completely undetected until the child turns 18 and applies for a bank account, a rental lease, or their first job — and is declined.

This delayed discovery window is exactly what makes children's data worth stealing. The victim doesn't know for years. By the time the damage surfaces, the data has been resold many times and the trail is cold.

Real example

A 2024 UNSW study found that photos of Australian children were scraped from public sources — YouTube, Flickr, school newsletters — and used to train AI systems without consent. The data was used to teach models what children look like. How that capability is deployed in future is unknown. The photos cannot be removed from the models.

For a primary school running a cloud AI tool, the data leaving the network might include: a child's first name and class, a photograph of their handwriting (including emerging literacy level), their teacher's feedback annotations, their performance trajectory across a unit. In isolation each piece seems innocuous. Combined and persistent, they constitute a profile of a real child.

Student work and intellectual property

Data privacy is not only about personal information. When students submit work to a cloud-based AI platform, their writing, drawings, and handwriting samples may also constitute intellectual property — created by the student, owned by the student, and potentially used by the platform without explicit consent.

Every time a document, image, or audio file is uploaded to a cloud service, the intellectual property embedded in that content is subject to the platform's terms of service. Most cloud AI providers include clauses that allow them to use uploaded content to improve their models. A student's handwriting sample, a teacher's rubric, a school's proprietary curriculum documents — once uploaded, all of these enter a system designed to learn from them.

The IP problem nobody mentions

Schools invest significantly in developing quality rubrics, assessment frameworks, and curriculum documents. Uploading these to cloud AI tools may transfer rights to that intellectual property to the platform provider. IndiLearn's on-site architecture means school IP — including student work and teacher-created materials — never leaves the building. It cannot be used to train external models. It stays yours.

IndiLearn's position is unambiguous: all student data and all intellectual property created within IndiLearn's tools belongs to the school and the student. It is processed on school hardware, stored on school infrastructure, and is never used to train any AI model — internal or external. This is a design commitment, not a terms-of-service clause.

The structural problem with cloud AI in schools

Cloud AI tools process data on servers that the school does not control, in jurisdictions that may not apply Australian law, by companies whose business models depend on learning from data inputs.

This is not a criticism of any specific vendor. It is the structural reality of how cloud inference works. When a student's work sample is sent to a cloud API, the following things are true: the work is processed on servers the school does not control; it may be stored in a jurisdiction where Australian privacy law does not apply; and it enters a system operated by a company whose business model depends on learning from data inputs.

Policy note

Victorian DET policy explicitly prohibits staff and students from loading student names, reports, personal histories, or contact details into AI tools. The policy notes that "content may be used and reused by the platform and its users, which may constitute a privacy breach." Most cloud AI tools are built around exactly this model.

What the Australian framework actually requires

The Australian Framework for Generative AI in Schools, implemented from Term 1 2024, was developed with representatives from every state and territory. The framework's section on student data opens by identifying privacy as the key concern for Education Ministers. The federal government committed $1 million specifically to update privacy principles for AI use in schools.

The framework requires schools to prioritise tools that limit the unnecessary collection or processing of personal information by default. It directs schools to monitor tool use to ensure personal information is not uploaded to tools that share data with third parties or use it to train models.

The intent is clear. The implementation is hard, because the tools on offer are almost universally cloud-based. A school trying to comply is forced to make a judgment call on every platform — one that requires reviewing lengthy terms of service documents that change without notice.

Requirement | Cloud AI tool | IndiLearn on-site
Data stays within school network | No — leaves school by design | Yes — structurally guaranteed
No third-party data sharing | Depends on terms of service | Yes — no external calls
Not used for model training | Requires explicit opt-out | Yes — inference only, no training
Compliant with Australian Privacy Principles | Requires vendor assessment | Yes — data never subject to foreign law
Consent barrier for students without parental permission | Creates classroom management problem | None — no external transmission

Why consent forms don't fix the problem

The standard response to student data privacy concerns in school technology is: get consent. If parents sign a permission form, the school is covered.

This misunderstands two things. First, consent addresses legal exposure, not structural risk. A signed consent form does not change the fact that student data is now on a cloud server outside Australia, subject to whoever has access to that server. It just means the school disclosed that this was going to happen.

Second, consent creates a classroom management problem. In a typical Queensland primary class, a small number of students will not have parental consent for a given digital platform. The teacher must now teach around those students — managing who can use the tool and who cannot, in real time, during a lesson. This is precisely the kind of administrative friction that burns teachers out.

The DET QLD position

Queensland's Department of Education requires that specific permission exists before children access platforms that hold data. When even one child in a class does not have that consent, teachers must manage exclusions during the lesson. On-site AI removes this barrier entirely — there is no platform to consent to, because no student data leaves the school.

On-site inference: the only architectural solution

The only way to structurally guarantee that student data does not leave the school network is to run AI inference inside the school network. This is what IndiLearn does.

Every IndiLearn tool processes student data on a Mac mini server located within the school's physical premises. The model runs locally. The student's work is processed locally. The feedback is generated locally. Nothing is transmitted externally.

This is not a new concept. Local inference has been technically feasible for years. Apple's M-series chips — the same architecture in a Mac mini — are purpose-built for neural network inference and can run models that rival cloud performance at a fraction of the cost and with zero data exposure. A Mac mini with 32GB unified memory runs models that would have required a dedicated GPU rack five years ago.
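
To make the feasibility claim concrete, here is a back-of-envelope sketch. The memory footprint of a quantised model is roughly parameter count × bits per weight ÷ 8; the 20% runtime overhead allowance and the 8-billion-parameter example are illustrative assumptions, not IndiLearn's actual model specifications.

```python
def model_memory_gb(params_billions: float, bits_per_weight: int, overhead: float = 1.2) -> float:
    """Approximate RAM needed to hold a quantised model.

    `overhead` covers the KV cache, activations and runtime buffers --
    a rough 20% allowance (an assumption, not a measurement).
    """
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

# An 8B-parameter model quantised to 4 bits per weight
# fits comfortably within 32 GB of unified memory:
print(round(model_memory_gb(8, 4), 1))
```

At roughly 5 GB for an 8B model at 4-bit quantisation, a 32 GB Mac mini has headroom for a larger model plus the operating system and application stack.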

The economics are straightforward. A 600-student school pays approximately AU$2,000–3,000 for the hardware. That hardware runs every AI query for every student, every lesson, for the life of the device — with zero per-query cost. Cloud AI at comparable quality would cost AU$6–12 per student per year in API fees alone, which compounds with student count and lesson frequency.
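
The break-even point follows directly from those figures. The sketch below uses assumed midpoints of the ranges quoted above (AU$2,500 hardware, AU$9 per student per year); actual pricing varies.

```python
def breakeven_years(hardware_cost: float, students: int, cloud_fee_per_student: float) -> float:
    """Years of cloud API fees needed to equal a one-off hardware purchase."""
    annual_cloud_cost = students * cloud_fee_per_student
    return hardware_cost / annual_cloud_cost

# Midpoints of the figures in the article:
# AU$2,500 hardware, 600 students, AU$9/student/year in API fees.
print(round(breakeven_years(2500, 600, 9), 2))
```

At these assumed midpoints the hardware pays for itself in under six months, after which every query is free; cloud per-query fees continue for as long as the tool is used.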

What on-site means for teachers

No consent forms for AI tools. No excluded students. No platform terms to review. No concern about what happens to a student's work after it leaves the classroom. The teacher gives a task, the student submits, the system responds — and every piece of that interaction stays inside the building.

How to evaluate any AI tool before bringing it to your school

Not all schools are ready to deploy on-site inference hardware. If your school is evaluating cloud-based AI tools, here is the minimum evaluation you should conduct before going to teachers or parents.

  1. Where does inference run? Ask the vendor directly. If inference runs on their servers, student data is leaving your network.
  2. What data is transmitted? Map every student data point the tool touches. Each one that travels to external servers is a potential exposure.
  3. Does the vendor train on customer data? Read the terms of service, not the marketing copy. Look for explicit opt-out requirements, not default protections.
  4. Where are the servers located? Data stored outside Australia is subject to foreign law. Servers in the United States are subject to US legal process, including disclosure requests that are not transparent to Australian institutions.
  5. What happens to the data if the vendor is acquired or fails? Student data persisting in a failed vendor's systems — or in the systems of an acquirer with different privacy policies — is a real risk that few schools assess.
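
The five questions above can be encoded as a simple pass/fail screen run against a vendor's written answers. This is a sketch: the field names and the all-or-nothing pass rule are my assumptions, not part of any official framework.

```python
from dataclasses import dataclass

@dataclass
class VendorAnswers:
    inference_on_premises: bool   # Q1: does inference run inside the school network?
    data_points_transmitted: int  # Q2: how many student data points leave the network?
    trains_on_customer_data: bool # Q3: is customer data used for model training?
    servers_in_australia: bool    # Q4: are servers subject to Australian law?
    has_data_exit_plan: bool      # Q5: contractual deletion on acquisition or failure?

def passes_screen(v: VendorAnswers) -> bool:
    """All-or-nothing screen: any single failure is a structural exposure."""
    return (v.inference_on_premises
            and v.data_points_transmitted == 0
            and not v.trains_on_customer_data
            and v.servers_in_australia
            and v.has_data_exit_plan)

# Illustrative answer sets, not real vendor assessments:
typical_cloud = VendorAnswers(False, 12, True, False, False)
on_site = VendorAnswers(True, 0, False, True, True)
print(passes_screen(typical_cloud), passes_screen(on_site))  # False True
```

The value of writing the screen down this way is that it forces a binary answer to each question before a pilot begins, rather than after terms of service have been accepted.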

IndiLearn keeps student data exactly where it belongs.

Our on-site architecture is a structural guarantee, not a policy commitment. Register your school's interest for pilot access.

Register your school

The bottom line

Student data privacy is not a compliance checkbox. It is the condition under which trust between schools, families, and technology providers is possible. Cloud AI tools that process student work on external servers are structurally incompatible with that trust — regardless of their privacy policies, their consent forms, or their intentions.

IndiLearn's on-site architecture is not a premium feature. It is the baseline requirement for any AI tool that belongs in a school. We built it that way from the first line of code.
