For the curious (and the engineers)

Under the hood.

A quick tour of what Huduko actually is — the models, the search pipeline, the storage, and what "fully local" means in practice on your Mac.

Get on Mac App Store ← Back to home

Architecture

One binary. A sidecar.
SQLite in the middle.

Huduko is a Tauri 2 desktop app with a frozen Python backend as a sidecar. The two processes talk over HTTP on localhost. Everything else — models, embeddings, index — lives on disk under ~/.huduko/.

┌─────────────────────────────────────────────────────────────────┐
│  Tauri Shell (Rust)  ←→  React + Tailwind UI                    │
│                              ↕  HTTP fetch                       │
│         FastAPI server  (localhost:8765)                         │
│                              ↕                                   │
│   SearchEngine  ←──────────  SQLite + sqlite‑vec + FTS5          │
│        ↑  ↑                                                      │
│   mxbai  CLIP    (Apple MLX, Metal GPU)                          │
│        ↑                                                         │
│   IngestionQueue  ←  FileWatcher (watchdog)                      │
└─────────────────────────────────────────────────────────────────┘

The Tauri shell spawns the backend binary on launch and kills it on quit. The backend is a PyInstaller --onefile bundle — a single aarch64 executable that contains Python, MLX, ONNX Runtime, PyMuPDF, Tesseract, and the model code. Shipped as a single DMG, signed with a Developer ID certificate and notarized by Apple.

How a query becomes a result.

Huduko runs two independent retrieval strategies in parallel and merges their rankings. Rank-based fusion means no weights to tune by hand.

Dual-encode

mxbai encodes the query (with an asymmetric query prefix) → 1024-d. CLIP's text encoder encodes the same query → 512-d. Both run locally on the Apple Silicon GPU via MLX.

Hybrid retrieve

Vector KNN (sqlite-vec) and BM25 (SQLite FTS5) run in parallel against each table. Each strategy over-fetches by a configurable multiplier so the merger has real candidates to work with.

RRF merge

Reciprocal Rank Fusion: each candidate's final score is Σ 1 / (k + rank_i) across strategies (k=60). Parameter-free, works well across keyword-heavy and semantic-heavy queries alike.

Group & rank

Documents are chunked during ingestion, so a single file can match multiple times. Results are grouped by path — only the best-ranked chunk per document surfaces.

Why RRF and not weighted-sum hybrid? Weighted sums (e.g., 0.7 * cosine + 0.3 * bm25) need per-query tuning — keyword-heavy queries reward BM25, semantic-heavy queries reward vectors, and the right weight depends on the query. RRF only cares about rank position, not score scale, so it's robust across both regimes without configuration.

Cross-modal search for free. CLIP maps text and images into the same 512-d space. A query like "golden retriever in the snow" is encoded by CLIP's text tower and matched directly against image embeddings — no filename, no tag, no caption required.

Ingestion

How files become searchable.

A background pipeline keeps the index in sync with your filesystem. It collapses rapid saves, parses by extension, and batches embeddings through the GPU.

Watch

watchdog observes your configured directories using FSEvents on macOS. Create, modify, delete, and move events all fire callbacks into the queue.

Debounce

An asyncio.Queue with a 1s debounce collapses rapid saves. Editors that write-and-rename don't generate duplicate index jobs.

Parse & chunk

Routed by extension: PyMuPDF for PDFs (with OCR fallback on image-only pages), python-docx for Word, openpyxl for Excel, python-pptx for PowerPoint, plain-text readers for code/markdown. Text is chunked to ~400 words with 50-word overlap.

Embed & upsert

Chunks batched through the text encoder (mxbai, 32 per batch by default); images through CLIP (16 per batch). Embeddings upserted into SQLite alongside FTS5 full-text entries in a single transaction.

Embedding models

Two models.
Two spaces. Zero cloud.

Huduko ships with two embedding models — one for text, one for images — and runs both on your Mac's Apple Silicon GPU via Apple MLX. Total on-disk weight: ~1.2 GB.

mxbai‑embed‑large‑v1

1024-d text embeddings

Best-in-class English retrieval at its size on MTEB. Asymmetric usage: a query prefix ("Represent this sentence…") is prepended at query time but not at document ingest time — this matches how the model was trained.

Runs on Apple MLX on M-series Macs. On Windows, fastembed via ONNX Runtime / DirectML. 639 MB on disk (fp16, MLX).

CLIP ViT‑B/32 (LAION‑2B)

512-d shared text+image space

CLIP maps natural-language descriptions and image pixels into the same vector space. Trained on LAION-2B (2 billion image-text pairs). A text query is embedded once and matched against image embeddings — no filename, OCR, or caption required for basic visual search.

Runs on Apple MLX on M-series Macs. On Windows, fastembed via ONNX Runtime. 578 MB on disk (fp16, MLX).

Why MLX? On Apple Silicon, MLX gives us Metal GPU execution with dramatically better cold-start and memory behavior than PyTorch/MPS. Model load is fast, inference is stable, and the memory footprint is small enough to coexist with whatever else is running on your Mac.

Both models are downloaded from Hugging Face on first launch, cached under ~/.huduko/models/, and never fetched again unless you wipe the cache.

Storage

Why SQLite
(and not a vector DB).

One file at ~/.huduko/db/huduko.db holds everything: document text, image metadata, Apple Photos entries, 1024-d text vectors, 512-d image vectors, and the full-text index. Opened with any SQLite client.

sqlite-vec

A tiny C extension (<1 MB) that adds KNN over float32 vectors. Ships as a virtual table alongside the normal SQL tables. Contrast with LanceDB's 314 MB bundle.

FTS5

SQLite's built-in full-text search with BM25 scoring. No extension needed. Same row-id space as the vector index, so updates stay atomic.

One transaction

A file's text, vectors, and FTS5 entry all commit in a single SQLite transaction. No eventual consistency, no vector/keyword drift.

Why not Pinecone, Weaviate, Qdrant, or LanceDB? Every one of those assumes either a server to run or a heavier embedded library than we want shipping inside a desktop app. SQLite + sqlite-vec + FTS5 is smaller, simpler, and maps cleanly to Huduko's single-user, single-node model. If you outgrow this (you probably won't — the format scales comfortably to millions of rows on consumer hardware), swapping out the storage layer is a localized change.

Privacy, in practice

What "local" actually means.

Plenty of apps say "local." Here's the concrete version.

✓ Models download once from Hugging Face on first launch, then live in ~/.huduko/models/. Huduko works offline from that point on — airplane mode friendly.
✓ Index lives at ~/.huduko/db/huduko.db. Unencrypted — we rely on your Mac's device encryption (FileVault) as the right layer for at-rest protection.
✓ No accounts, no servers. The app does not authenticate, sync, or back up anywhere.
✓ Telemetry is opt-out and anonymous. Device ID is a SHA-256 of hardware attributes (not per-user). Event types are allowlisted server-side. Transport is Python's stdlib urllib — no third-party analytics SDKs in the binary.
✓ Signed and notarized by Apple with a Developer ID certificate, running with the hardened runtime enabled.
✓ Not sandboxed — intentionally. watchdog needs filesystem-wide read access to observe your Documents, Pictures, and Desktop; the Mac App Sandbox requires Security-Scoped Bookmarks for every folder, which is a UX regression we're not willing to take pre-1.0. Mac App Store submission is a separate track.

Stack

The full tech stack.

No AI frameworks, no orchestration layers, no cloud SDKs. Boring, concrete components that each do one thing.

Desktop shell	Tauri 2 (Rust) — single-binary cross-platform, spawns the backend sidecar and kills it on quit
UI	React + Tailwind — dark / light / system theme via CSS variables
Local API	FastAPI + uvicorn on `localhost:8765` — async, no auth (loopback only)
Backend bundle	PyInstaller `--onefile` — ~148 MB aarch64 binary, Developer ID signed
Text embeddings	mxbai-embed-large-v1 — 1024-d, Apple MLX on Apple Silicon
Image embeddings	CLIP ViT-B/32 (LAION-2B) — 512-d, Apple MLX on Apple Silicon
File watcher	watchdog — FSEvents on macOS, ReadDirectoryChangesW on Windows
Document parsing	PyMuPDF, python-docx, openpyxl, python-pptx; Tesseract OCR fallback for image-only PDF pages
Vector store	SQLite + sqlite-vec — <1 MB C extension, virtual table for KNN
Full-text index	SQLite FTS5 — BM25 scoring, built into SQLite
Photo enrichment (macOS)	PhotoKit, Vision (labels + OCR), CLGeocoder (GPS → place)
Telemetry transport	stdlib urllib → Vercel serverless → Neon Postgres. Zero pip deps.

Try it

The fastest way to see
what this actually does.

$9.99 on the Mac App Store. One-time purchase, runs locally, and you can uninstall with a single drag. No accounts, no signups.