NobodyWho

Local-first LLM inference for your apps

Run open-weight language models directly inside your software. Streaming chat, tool calling, structured output, embeddings, and RAG — all offline with GPU acceleration. No servers, no API keys, no Docker. Built on llama.cpp.

New to local LLMs?

Start here if you are new to running language models locally. These guides cover the core concepts — what models are, how to pick one, and how quantization works.

LLM Basics

Model Selection

Choose your binding

Python

Batteries-included bindings with sync and async APIs.

pip install nobodywho

Swift

Native Swift package for macOS and iOS apps.

Swift Package Manager

React Native

Drop-in module for React Native on Android and iOS.

npm install react-native-nobodywho

Flutter

Cross-platform plugin for Flutter on mobile and desktop.

flutter pub add nobodywho

Godot

GDExtension for Godot 4.x game projects.

Asset Library or GitHub release