For a decade, cloud-first was the default pitch. But the truth in 2025 is simple: your computer is already powerful enough for most of what people send to the cloud—documents, photos, search, note-taking, even many AI tasks. The rest is mostly bandwidth theater.
Today’s “average” laptops ship with serious accelerators
- Windows (Copilot+ class): Microsoft’s baseline for new AI PCs calls for an NPU capable of 40+ TOPS. Snapdragon X platforms hit ~45 TOPS, while AMD’s Ryzen AI 300 series advertises up to 50 TOPS.
- Apple silicon: Apple’s M4 Neural Engine is rated at up to 38 TOPS—and Apple publicly demonstrates on-device LLM inference at practical speeds on Mac.
Storage is the giveaway: local is hundreds of times faster than the internet
Consumer NVMe drives now sustain 7,000–15,000 MB/s (PCIe 4.0 → 5.0). Compare that with a solid home connection of roughly 200–240 Mbps (≈25–30 MB/s) and you see why local wins on raw throughput.
- Samsung 990 PRO (PCIe 4.0): up to 7,450 MB/s.
- Crucial T700 (PCIe 5.0): up to 12,400 MB/s reads / 11,800 MB/s writes (and newer Gen5 controllers approach ~14,900 MB/s).
- Median fixed-broadband download speeds in large markets (the U.S., for example) sit in the low hundreds of Mbps.
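A quick back-of-envelope check of the gap, using the figures from the text (illustrative spec-sheet numbers, not benchmarks):

```python
# Local NVMe vs. home broadband throughput, using the numbers above.
MBPS_PER_MBYTE = 8  # 1 MB/s = 8 Mbps

nvme_mb_s = 7_450                  # Samsung 990 PRO class, sequential read
broadband_mbps = 240               # a solid home connection
broadband_mb_s = broadband_mbps / MBPS_PER_MBYTE  # = 30 MB/s

ratio = nvme_mb_s / broadband_mb_s
print(f"Local NVMe is ~{ratio:.0f}x faster than the link")  # ~248x
```

Even against a gigabit connection (~125 MB/s), a Gen 5 drive keeps a roughly 100x edge.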
Latency and privacy: edge computing isn’t hype—it’s physics
The farther your data travels, the more latency and cost you incur. Edge computing’s entire premise is to run work near the source—on your device—reducing round trips and bandwidth.
On-device ML frameworks make this explicit: running a model strictly on the device removes the need for a network connection and helps keep personal data private. That’s local speed plus privacy by default.
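The "it's physics" claim is easy to make concrete: light in optical fiber travels at roughly two-thirds of c, so distance alone puts a hard floor under round-trip latency. A sketch with a hypothetical 1,500 km one-way distance to a data center:

```python
# Minimum round-trip time imposed by signal propagation in fiber.
# The 1,500 km distance is a hypothetical but typical figure.
SPEED_IN_FIBER_M_S = 2.0e8        # light travels at ~2/3 c in fiber
distance_m = 1_500 * 1_000        # one-way distance to the data center

min_rtt_ms = 2 * distance_m / SPEED_IN_FIBER_M_S * 1_000
print(f"Physics floor on round-trip latency: {min_rtt_ms:.0f} ms")  # 15 ms
```

That 15 ms is the floor before any routing, queuing, or server time is added; on-device, the same hop is effectively zero.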
On-device AI is here—and practical
- Windows: ONNX Runtime/Windows ML target NPUs directly on Copilot+ PCs for on-device inference.
- Apple: Apple engineers have shown 8B-class models running locally on Mac with real-time decode rates—no cloud calls required.
- Android: Gemini Nano is designed to run on device via Android’s AICore for low-latency features that still work offline.
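Why 8B-class models are practical on a laptop comes down to memory bandwidth: autoregressive decoding reads roughly all the weights once per generated token, so bandwidth divided by model size bounds tokens per second. A rough sketch; the 120 GB/s figure is our assumption, in the ballpark of recent laptop unified-memory bandwidth:

```python
# Rough tokens/sec ceiling for a memory-bandwidth-bound decoder.
# Assumes every weight is read once per generated token.
params = 8e9                      # 8B-parameter model
bytes_per_param = 0.5             # 4-bit quantized weights
model_bytes = params * bytes_per_param          # 4 GB of weights

mem_bandwidth_gb_s = 120          # assumed laptop unified-memory bandwidth
tokens_per_s = mem_bandwidth_gb_s * 1e9 / model_bytes
print(f"~{tokens_per_s:.0f} tokens/s upper bound")  # ~30 tokens/s
```

Tens of tokens per second is faster than most people read, which is why local decode feels real-time.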
The cost angle: bandwidth isn’t free
Even before compute, cloud egress adds up. Common data-transfer-out rates hover around $0.09/GB for many regions/services. For bandwidth-heavy apps, local processing sidesteps that ongoing tax.
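The "ongoing tax" is easy to quantify at that $0.09/GB rate; the 500 GB/month figure below is a made-up example workload:

```python
# Recurring egress cost at a common $0.09/GB data-transfer-out rate.
EGRESS_USD_PER_GB = 0.09

def monthly_egress_cost(gb_per_month: float) -> float:
    """Dollars per month just to move data out of the cloud."""
    return gb_per_month * EGRESS_USD_PER_GB

# e.g. a photo app pulling down 500 GB of originals per month:
print(f"${monthly_egress_cost(500):.2f}/month")  # $45.00/month
```

Local processing pays that bill once, in hardware you already own.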
Most people use a fraction of their machine
Usage data shows the bulk of online activity is web, email, messaging, social, and video—tasks that barely touch modern CPUs/NPUs. Translation: our hardware is idling while we ship simple work to distant servers.
When should you choose local over cloud?
- Throughput-bound tasks: Photo libraries, notes, docs, mail—your NVMe is hundreds of times faster than your ISP. Keep them local.
- Latency-sensitive tasks: Real-time transcription, summarization, autofill, or vision where milliseconds matter. Run them on the device.
- Privacy-critical data: Client files, personal content, model memories. On-device processing keeps data off the wire by default.
- Cost control: If bandwidth or egress would become your bill, keep the loop local.
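The checklist above can be captured as a tiny routing function. The flag names are ours, purely illustrative; real systems would weigh these criteria rather than OR them:

```python
# Toy router encoding the local-vs-cloud checklist above.
def run_locally(throughput_bound: bool,
                latency_sensitive: bool,
                privacy_critical: bool,
                egress_costly: bool,
                needs_multi_user_sync: bool = False) -> bool:
    """Prefer the device unless the task genuinely needs the cloud."""
    if needs_multi_user_sync:
        return False  # shared state is where the cloud earns its keep
    return (throughput_bound or latency_sensitive
            or privacy_critical or egress_costly)

# Real-time transcription: latency-sensitive and privacy-critical.
print(run_locally(False, True, True, False))  # True
```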
Where cloud still shines
Cloud remains great for multi-user sync, heavy multi-tenant jobs, or training gigantic models. But for daily computing—and a growing slice of AI—local gives you speed, privacy, and control without subscriptions.
Clairos’ stance
We design for local by default: installable apps, readable storage, and on-device AI where it makes sense. Network features are explicit and optional. Because when data is sovereign and intelligence is local, your computer stops being a thin client—and becomes yours again.