Run open-source LLMs locally — no network, no API key, no telemetry.
Best evidence tier. Signup tested end-to-end by xmr.club curator — deposit + withdrawal + edge cases. No-KYC posture verified at retail volume. Last_verified within 12 months.
Full rubric + 7-step verification walkthrough at /methodology.
Local-first LLM runtime that pulls quantised open-source model weights and exposes them through an OpenAI-compatible API on `localhost`. Listed at Grade A because Ollama is the canonical "I want OpenAI-quality output without sending my prompts anywhere" answer for the ~80% of users who don't want to compile `llama.cpp` themselves — MIT-licensed, no account anywhere in the flow, no telemetry on the inference path, and the inference itself never leaves your machine. The strongest privacy posture available in this directory because there is no operator to trust on the data path.
What it is. Ollama is a desktop + server application that wraps the `llama.cpp` inference engine in a clean CLI (`ollama run <model>`, `ollama serve`), a model registry (`ollama.com/library` hosts quantised GGUF weights for ~80 popular open-source models — Llama, Mistral, DeepSeek, Qwen, Phi, Gemma, Mixtral, and many more), and an OpenAI-compatible HTTP API on `localhost:11434/v1`. You install it once (~600 MB binary), `ollama pull llama3` downloads the weights (~4-40 GB depending on model size), and `ollama run llama3` drops you into a chat REPL — or you point any OpenAI-SDK consumer (LangChain, Continue.dev, Cursor, Aider, the `openai` Python SDK) at the local endpoint and it works without code changes.
Background. Ollama was started in 2023 by Jeffrey Morgan and Michael Chiang as a Mac-first project (Apple Silicon's unified memory makes consumer-grade LLM inference unusually tractable). It expanded to Linux + Windows within months and now runs on CPU, NVIDIA CUDA, AMD ROCm, and Apple Metal, picking the best available accelerator automatically. The team operates Ollama, Inc. (a Delaware C-corp, SF-based) with venture backing — but the runtime is fully open-source under MIT with the codebase at `github.com/ollama/ollama`, and the company's business model is enterprise support / on-prem deployment, not the consumer CLI.
The registry at `ollama.com/library` is the project's centralised distribution surface — analogous to Docker Hub for model weights. You can also point Ollama at any GGUF file on disk via a `Modelfile` (the project's compact spec for declaring a model + system prompt + parameters), so air-gapped or fully self-hosted registry workflows are first-class.
What you trust.
Operational specs.
Operator philosophy. Jeffrey Morgan's framing in conference talks is "local inference is the default, not the fallback" — the team's design choices consistently favour latency + privacy over feature completeness on the hosted side. The Modelfile + GGUF approach makes Ollama functionally a packaging layer over `llama.cpp`, which means the project's value depreciates if the hosted-LLM economy gets cheaper / more private (good thing) and accretes if local hardware gets faster (also good thing). The Ollama, Inc. enterprise side is decoupled from the open-source runtime — the CLI doesn't degrade if you don't pay, and there's no "free tier" rate limit (because there's no server to limit).
Grade rationale. Grade A reflects: the strongest privacy posture available (inference is local, no operator on the data path, no account to compromise), open-source under permissive MIT licence (forkable + auditable), named-operator accountability without operator dependency (Ollama, Inc. + Jeffrey Morgan publicly identified, but the runtime keeps working if they vanish — switching to `llama.cpp` directly is the equivalent of changing a wrapper), broad hardware support (every consumer accelerator + CPU fallback), rich model library (~80 open-source models, all the post-2024 frontier-grade open releases), OpenAI-compatible API surface (works as a drop-in for any existing tool), kycnot.me corroboration on the no-KYC posture, no major incident or trust-erosion thread in r/LocalLLaMA / r/MachineLearning / GitHub issues in the last 12 months, and deliberate refusal to add usage telemetry. Last verified 2026-05-26.
Useful when:
Caveats:
Free · MIT · runtime + model weights local
Sourced from operator pages — verify identity via more than one channel before trusting time-sensitive instructions.
.onion mirror listed 2026-05-26 (<90d) No community reviews yet. Be the first below.
Honest, brand-neutral feedback welcome. A curator approves before it appears here. No JS required.
Silence censorship. Protect your privacy and bypass restrictions with Xeovo VPN. No email required.
Long-running no-KYC aggregator. XMR-friendly, Tor mirror, broad coin support.
Mobile + desktop multi-coin wallet (XMR, BTC, LTC, ETH) with in-app swap + CakePay.
Non-custodial cross-chain swap router with refund-on-refusal AML policy and multi-destination split swaps. No
Two-year-old no-account instant swap — in-house test swap settled in 3 minutes (0–1 conf), Trocador A privacy