
Local AI Is Becoming the Default — Start With Your Notes

MNMNOTE
Tags: local-ai, on-device-ai, privacy, plain-text, notes, ownership

Local AI is becoming the default. The principle behind the shift is simple: a software feature should not require shipping your data to a vendor's server to work. The model can run on the device you already own. And the easiest place to start is the data already sitting there — your notes, in plain text, never uploaded.

For most of the past three years the assumption ran the other way. Intelligence meant a round trip: your words left your machine, reached a server farm, and came back as a suggestion. That assumption is now breaking.

An essay arguing that local AI needs to be the norm was submitted to Hacker News in May 2026 and stood at 1,903 points a month later, the kind of front-page attention reserved for ideas whose time has arrived 1. Months before that, in September 2025, a major platform had already shipped an on-device language-model API to every developer who wanted one 2. The argument and the infrastructure arrived together.

What most people still believe about AI

Most people assume AI is something that happens elsewhere. You type; the text travels to a cloud server; an answer returns. Under that model, every smart feature is a data-export event — your draft, your meeting note, your half-formed idea, sent to a company you have to trust. The convenience is real. So is the cost.

This is the default that shipped with the first wave of generative tools, and it became a habit. "One of the current trends in modern software is for developers to slap an API call to OpenAI or Anthropic for features within their app," wrote Cyrus, the engineer whose essay touched the nerve 3.

The pattern is fast to build. It is also why a note app you trusted with your private thoughts may quietly forward them to a third party the moment you press its "AI" button.

The habit was reasonable while it lasted. Three years ago a capable model could not fit on a phone or a laptop, so renting one over the network was the only way to ship an intelligent feature. Convenience and capability pointed the same direction, and the architecture followed. The default is shifting now because the premise it rested on — that the model has to live somewhere else — has stopped being true.

Why the round trip is the problem

The round trip is fragile by design. It needs a network, a vendor account, and a server that stays online and stays honest. "This laziness is creating a generation of software that is fragile, invades your privacy, and fundamentally broken," Cyrus argues 4. Each external call is a dependency on someone else's uptime, pricing, and data policy.

Consider what the round trip actually moves. To summarize your note, the cloud approach uploads the note. To suggest an edit, it uploads the draft. The thing you wanted help with is the thing that leaves.

That is acceptable for a public web search and unacceptable for a private journal, and most apps draw no line between the two. The fragility is not hypothetical: vendors change terms, deprecate endpoints, and add "we store your content" footnotes between one release and the next.

There is a quieter cost beneath the privacy one. A feature built on a remote call only works when the network is up, the vendor's account is funded, and the endpoint has not been retired. Your ability to summarize your own notes becomes contingent on a company you do not control.

The local version inverts that dependency. The model is yours for as long as the device runs — offline on a plane, in a basement, or ten years from now after the vendor has moved on. Resilience, not only privacy, is what the device side buys you.

What is actually happening: the model moves to the device

The correction is already underway, and it is institutional rather than fringe. On September 29, 2025, Apple shipped its Foundation Models framework: an on-device language model handed to every app developer who wants one. A trillion-dollar platform building local inference into its operating systems is what becoming the default actually looks like.

Apple's own description states the trade plainly. The framework "allows developers to create new intelligence features that protect users' privacy and are available offline, all while using AI inference that is free of cost" 5. Privacy, offline capability, and no per-call fee — the properties the cloud round trip cannot offer — are now the platform's defaults, not a niche choice a developer has to fight for.

It is not one company, either. Hacker News, the developer commons where infrastructure trends surface early, has carried a multi-year, accelerating run of local-first projects. A privacy-respecting dictation tool reached 591 points in August 2025 6. A local-first assistant written in Rust drew 331 points in February 2026 7. AMD's open-source model server, built to run on a laptop's own GPU and NPU, reached 572 points in April 2026 8.

None of those numbers is a survey statistic. Taken together, they are the texture of a developer audience that increasingly wants the model on the machine rather than behind an API. Cyrus describes what the device-side version looks like in practice: a summary "generated on-device using Apple's local model APIs. No server detours. No prompt or user logs" 9. The feature still works. Nothing leaves the laptop to make it work.

The honest limit: on-device is smaller, not bigger

On-device models are smaller than the frontier systems that run in data centers, and pretending otherwise would be dishonest. A model that fits on a laptop will not match a cluster's raw capability on the hardest reasoning tasks. The case for local is not that it wins on size.

The case is that most everyday tasks do not need the largest model. Summarize this note. Rephrase this sentence. Pull the action items out of these minutes. For work like that, a local model is good enough — and it is the only version that does the work without exporting the work. "'AI everywhere' is not the goal. Useful software is the goal," Cyrus writes 10. The point was never the biggest model. It was the useful, honest one.

Where to start: the data already on your device

Start with the files you already own. Notes are the natural first home for local AI for a structural reason: a plain-text note already sits on your device, already in a format a model reads, already yours. There is nothing to export and nothing to upload. The data and the model share one place, so the round trip never begins.
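As a rough illustration of why plain text needs no export step, the Python sketch below reads Markdown notes from a folder and hands each one to a model callable. The folder layout, the function names, and the `local_model` callable are all assumptions made for this example, not the API of any tool named in this article; substitute whatever on-device runtime you actually use.

```python
from pathlib import Path
from typing import Callable


def load_notes(folder: Path) -> dict[str, str]:
    """Read every Markdown file under `folder` into memory as plain text.

    Keys are paths relative to `folder`, so notes in subfolders do not collide.
    """
    return {
        path.relative_to(folder).as_posix(): path.read_text(encoding="utf-8")
        for path in sorted(folder.rglob("*.md"))
    }


def build_summary_prompt(title: str, body: str) -> str:
    """Turn one note into a prompt. The note text is used as-is, with no conversion."""
    return f"Summarize the note titled '{title}' in two sentences:\n\n{body}"


def summarize_notes(folder: Path, local_model: Callable[[str], str]) -> dict[str, str]:
    """Summarize each note with a caller-supplied model.

    `local_model` is a hypothetical stand-in for a model running on your own
    hardware. This function only reads local files and calls that callable;
    it performs no network access itself.
    """
    return {
        name: local_model(build_summary_prompt(name, text))
        for name, text in load_notes(folder).items()
    }
```

Calling `summarize_notes(Path("notes"), my_model)` with any function that maps a prompt string to a reply string would return one summary per note. The point of the sketch is the shape of the data flow: the file is opened, the text goes straight into the prompt, and nothing in between requires an upload.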

Three concrete moves:

  1. Keep your notes in plain text (Markdown). A format the machine reads directly needs no conversion before a local model can act on it. It is also the format most likely to outlive any single app. "Apps are ephemeral, but your files have a chance to last," writes Steph Ango, CEO of Obsidian 11.
  2. Prefer features that run on your hardware. When a tool offers "AI," ask the one question that matters: does my text leave the device? Offline-capable and on-device answers are the ones that keep the round trip from ever starting. For the step-by-step version, see how to run private AI on your own notes with no cloud.
  3. Treat upload as a deliberate choice, not a default. Send data to a cloud model when the task genuinely needs the largest model — and know, each time, that you chose to.
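The third move can be expressed as a small routing rule. The sketch below is one possible shape for it, assuming two hypothetical callables, `local_model` and `cloud_model`, that each map a prompt to a reply; the `Router` class and the `allow_upload` flag are illustrative names, not part of any real library.

```python
from dataclasses import dataclass
from typing import Callable

Model = Callable[[str], str]


@dataclass(frozen=True)
class Router:
    """Send prompts to the on-device model unless the caller explicitly opts in to upload."""

    local_model: Model   # hypothetical: runs on your own hardware
    cloud_model: Model   # hypothetical: sends the prompt to a remote service

    def run(self, prompt: str, *, allow_upload: bool = False) -> tuple[str, str]:
        """Return (backend_used, answer).

        `allow_upload` is keyword-only and defaults to False, so text leaves
        the device only when the call site says so in plain sight.
        """
        if allow_upload:
            return "cloud", self.cloud_model(prompt)
        return "local", self.local_model(prompt)
```

Making the flag keyword-only is the design choice that matters here: every call that uploads has to spell out `allow_upload=True`, which makes the decision visible in code review and easy to search for.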

The self-hosting community has wanted this for years. The r/LocalLLaMA forum has been a persistent home for people running models on their own machines, swapping setups and benchmarks, long before the platform vendors caught up. What changed in late 2025 is that the largest of those vendors finally agreed, and shipped the on-device model as a default rather than a workaround. The principle stopped being a hobbyist preference and became the direction of the tools themselves.

Frequently asked questions

Why should I run AI locally?

Running AI locally keeps your data on your own device. There is no upload, no vendor account, and no dependency on a server staying online or honest. It also works offline. Apple's on-device framework is described by Apple as protecting "users' privacy" and "available offline" 5 — privacy and resilience are the point, not a side effect.

Can I use AI on my notes without uploading them to the cloud?

Yes. A plain-text note already lives on your device in a format a local model can read directly, so an on-device model can summarize or edit it with nothing leaving your machine. Vendors now ship on-device model APIs — Apple's Foundation Models framework, for instance — that make exactly this possible 5.

What is on-device AI, or local AI?

On-device AI is a model that runs on your own hardware instead of a remote server, so the input never leaves for a data center. Apple now ships such a model to developers, describing the inference as "available offline" and "free of cost" 5. AMD ships an open-source server that runs models on a laptop's own GPU and NPU 8.

Is local AI as good as cloud AI?

Not on raw capability. On-device models are smaller than the frontier systems in data centers and lose on the hardest reasoning tasks. But most everyday work, like summarizing or rephrasing, does not need the largest model. For those tasks a local model is good enough, and it is the only one that does the work without exporting it.

What is the best way to run AI on my own files?

Start with files that are already plain text on your device, because they need no export and no upload before a local model can read them. Then prefer tools whose AI features run on your hardware. The format and the location matter more than any specific tool ranking — own the file first, choose the model second.

Why does plain text matter for local AI?

A plain-text note is directly readable by a model with no conversion step, and it is portable across tools and durable across years. That is the same reason it survives app shutdowns. The file-over-app principle and the local-AI principle point the same direction: own the data, and the tooling becomes replaceable 11.

The default is shifting from "send your data out" to "keep the model in." Your notes are already plain text and already on your device, which makes them the first thing you can give a model without giving anything away.


To keep AI close to your own words, mnmnote.com is a browser-based markdown editor whose notes stay on your device — open Markdown you can hand to whatever model you trust, with nothing uploaded by default.

Footnotes

  1. "Local AI needs to be the norm," Hacker News (submitted by cylo, 2026-05-10), item 48085821, 1,903 points as of 2026-06-08. https://news.ycombinator.com/item?id=48085821 (accessed 2026-06-08).

  2. "Apple's Foundation Models framework unlocks new intelligent app experiences for developers," Apple Newsroom, 2025-09-29. https://www.apple.com/newsroom/2025/09/apples-foundation-models-framework-unlocks-new-intelligent-app-experiences/ (accessed 2026-06-08).

  3. Cyrus, "Local AI Needs to be the Norm," Unix Foo, 2025-12-30. https://unix.foo/posts/local-ai-needs-to-be-norm/ (accessed 2026-06-08).

  4. Cyrus, "Local AI Needs to be the Norm," Unix Foo, 2025-12-30. https://unix.foo/posts/local-ai-needs-to-be-norm/ (accessed 2026-06-08).

  5. "Apple's Foundation Models framework unlocks new intelligent app experiences for developers," Apple Newsroom, 2025-09-29. https://www.apple.com/newsroom/2025/09/apples-foundation-models-framework-unlocks-new-intelligent-app-experiences/ (accessed 2026-06-08).

  6. "Show HN: Whispering – Open-source, local-first dictation you can trust," Hacker News, 2025-08-18, item 44942731, 591 points. https://news.ycombinator.com/item?id=44942731 (accessed 2026-06-08).

  7. "Show HN: LocalGPT – A local-first AI assistant in Rust with persistent memory," Hacker News, 2026-02-08, item 46930391, 331 points. https://news.ycombinator.com/item?id=46930391 (accessed 2026-06-08).

  8. "Lemonade by AMD: a fast and open source local LLM server using GPU and NPU," Hacker News, 2026-04-02, item 47612724, 572 points. https://news.ycombinator.com/item?id=47612724 (accessed 2026-06-08).

  9. Cyrus, "Local AI Needs to be the Norm," Unix Foo, 2025-12-30. https://unix.foo/posts/local-ai-needs-to-be-norm/ (accessed 2026-06-08).

  10. Cyrus, "Local AI Needs to be the Norm," Unix Foo, 2025-12-30. https://unix.foo/posts/local-ai-needs-to-be-norm/ (accessed 2026-06-08).

  11. Steph Ango, "File over app," stephango.com. https://stephango.com/file-over-app (accessed 2026-06-08).