Sentience: 60+ AI Tools in One Local Desktop App

The Problem With Modern AI Tooling Nobody Talks About

If you've spent any time building with AI APIs, you already know the frustration. You find a model you love, you wire up your tools, you get everything humming — and then you want to switch providers. Suddenly you're rewriting dispatchers, reformatting schemas, and debugging header mismatches at midnight. The AI ecosystem has a vendor lock-in problem, and most developers have quietly accepted it as the cost of doing business.

That's exactly the problem Sentience was built to solve. Sentience is a PySide6 desktop AI assistant that runs entirely on your local machine, ships with its own browser, its own email client, its own voice controller, and exposes over 60 tool functions to whichever model you happen to be using that day. The goal was simple: build the kind of AI desktop experience that combines the polished UX of tools like Cursor with the breadth of capability you'd expect from a full-featured AI agent platform — without ever forcing you to commit to a single AI provider.

What Makes a Local AI Desktop App Worth Building

The appeal of a fully local AI desktop application goes beyond privacy, though that's certainly part of it. When your AI assistant runs on your machine, you close the laptop lid and your session is still there. There's no cloud state to sync, no session timeout to fight, no API gateway adding latency between your thought and the model's response. For developers and power users who spend their days inside AI-assisted workflows, that kind of stability matters enormously.

Sentience takes this premise seriously. Rather than being a thin wrapper around a single API, it's a genuine desktop environment for AI-assisted work. The built-in browser means the model can navigate the web without a separate tool server. The email client integration means you can ask your assistant to draft, send, and organize messages without copy-pasting between windows. The voice controller means your hands-free prompting is a first-class feature, not an afterthought. And underneath all of that sits a tool layer exposing 60+ discrete functions the model can call at will.

The Real Engineering Challenge: One Tool Schema, Four Providers

Building the tools themselves wasn't the hard part. Most of the 60+ functions in Sentience are straightforward — file operations, browser controls, email actions, system utilities. The hard part was making all of those tool schemas work identically across four fundamentally different AI providers: Groq, OpenAI, Anthropic, and a local Ollama instance.

OpenAI's /v1/chat/completions format has quietly become a de facto industry standard. Groq implements it. Ollama implements it. LocalAI implements it. That means three out of four target providers can share a single HTTP call structure and a single tool schema format — if you're willing to treat the OpenAI tool specification as your ground truth. For most multi-provider builds, that's exactly what happens: developers write to the OpenAI spec and quietly exclude Anthropic, or they maintain two entirely separate codepaths and double their maintenance burden.

Sentience chose a third option: a thin adapter layer that keeps one unified tool list and one unified dispatcher, regardless of which provider is active.

Why Anthropic Requires Special Handling

Anthropic's Messages API diverges from the OpenAI standard in several meaningful ways. Rather than embedding the system prompt as an entry in the message array, it uses a dedicated top-level system field. Authentication uses x-api-key instead of the familiar Authorization: Bearer header pattern. Every request must include an anthropic-version: 2023-06-01 header or the API will reject it. And on the response side, tool interactions use a distinct tool_use and tool_result content block format that looks nothing like what OpenAI returns.

None of these differences are insurmountable individually. The challenge is that they compound. If you're building a system where the user can switch between Claude and GPT-4o mid-session, you need to handle all of these divergences gracefully and invisibly. The user shouldn't need to think about which model they're on. The tool calls should work. The responses should feel the same. The dispatcher should never need to know which provider generated the output it's processing.

The Adapter Pattern That Makes It Work

The solution in Sentience is a provider adapter layer that sits between the unified tool schema and the actual API calls. When a request goes to Anthropic, the adapter rewrites the message structure, injects the correct headers, and converts the tool schema format before the request leaves the application. When the response comes back, the adapter normalizes the tool_use content blocks back into the standard format the dispatcher expects. From the perspective of the rest of the application, Anthropic looks exactly like OpenAI.

This approach keeps the codebase dramatically simpler than maintaining parallel execution paths. There's one tool list. There's one dispatcher. There's one place where tool results get processed and fed back into the conversation. The only provider-specific code lives inside the adapter, and the adapter's entire job is to speak the dialect of whatever API it's wrapping.

Why This Architecture Matters for the Future of Local AI

The broader lesson from Sentience's architecture is one the AI tooling ecosystem hasn't fully absorbed yet. As more developers build serious, production-grade AI applications, the question of provider portability is going to become increasingly important. Models improve rapidly. Pricing shifts. A provider that's the obvious choice today may not be the obvious choice in six months. Applications that are tightly coupled to a single provider's API format will face expensive rewrites every time the landscape changes.

Building around a unified schema with thin provider adapters is more work upfront, but it pays compounding dividends. You can benchmark models against each other on real workloads. You can failover between providers without user-facing disruption. You can support local models through Ollama for sensitive tasks and cloud models for tasks where raw capability matters, all within the same application and the same session.

What Sentience Demonstrates About Local AI Desktop Apps

Sentience is a proof of concept for a class of AI applications that the industry hasn't fully explored: fully local, fully featured, genuinely multi-provider AI desktop environments. It demonstrates that 60+ tool functions can be exposed to a model without writing provider-specific dispatchers. It demonstrates that Anthropic's API and OpenAI's API can coexist in the same application without the user ever seeing the seams. And it demonstrates that running your AI assistant entirely on your own machine doesn't mean sacrificing capability or flexibility.

For developers building in this space, the architecture is worth studying. The hard parts of multi-provider AI tooling are rarely the tools themselves — they're the plumbing that makes those tools work everywhere, all the time, regardless of which model happens to be answering today. Getting that plumbing right is what separates a demo from a desktop app you can actually close the lid on.