Bonfyre

What's under the hood

This isn't a wrapper around someone else's tooling. Every piece is built from scratch in C.

Static C binaries

Each is 20–60 KB (except the quantizer at 227 KB). No runtime dependencies. No containers. No Node.js. Just copy the binary and run it.

Shared library modules

Crypto, compression, threading, JSON parsing, Unicode, networking, config management, containers, CLI — all in one 180 KB library. Zero external dependencies beyond libc.

~2.1 MB

Total disk footprint

The entire backend — auth, payments, search, transcription, CMS, API gateway, compression, pipeline engine — fits in less space than a single JPEG photo.

5–8 ms

Per pipeline stage

Audio in, invoice out. Each processing stage completes in single-digit milliseconds. Re-runs skip unchanged stages entirely via content-addressed caching.

4×

AI model compression

3-bit quantization with 0.9999+ cosine similarity. AI models that required $10K cloud GPUs now run on a $500 laptop. 15 models published on HuggingFace.

167

Tests pass

Full test suite, all reproducible from source. Every benchmark number on this site comes from scripts in the repo that you can run yourself.

Bonfyre's binaries are organized into four layers: Substrate (ingest, normalize, segment), Transform (transcribe, summarize, score, embed), Surface (format, render, deliver), and Value (price, invoice, pay, prove). Each operator declares whether it's cacheable, reversible, and idempotent — so the pipeline can parallelize, skip, and retry safely.

What changed since the last public push

The stack is not just bigger. The shape of the product has changed in useful ways.

Network observation became a product surface

capture → recipe

BonfyreWire now fingerprints devices, emits canonical artifacts, and generates stitch-compatible recipes from owned or authorized captures. It is not just packet accounting anymore.

CLI became more operationally truthful

health-aware

`bonfyre list --health`, `bonfyre doctor sync-subcommands`, workflow listing, recipe browsing, and layer registry inspection now make the runtime easier to inspect and trust after updates.

Speech moved from feature to system

stacked

The repo now carries a deeper speech investigation story: architecture, production integration, quickstart, hypothesis engine, and discovery pipeline work, not just a standalone transcription binary.

Pages and runtime are converging

repo-aligned

The public docs, Pages workflows, and runtime packaging are being pointed at `bonfyre-oss` so the repo, the site, and the automation describe the same product instead of different snapshots.

What That Means In Practice

A week ago the public story was still mostly compression, transcription, and backend footprint. The newer repo shape adds a more operational Bonfyre: captures become artifacts, artifacts become recipes, recipes become execution plans, and the CLI now exposes more of the health and registry state needed to work with that system confidently.

bonfyre wire ingest-pcap capture.pcap --dumb-device --root layeros/state
bonfyre wire probe <capture_id> --root layeros/state
bonfyre wire artifacts <capture_id> --root layeros/state
bonfyre wire recipe <capture_id> --root layeros/state > recipe.json
bonfyre stitch plan recipe.json

What Bonfyre replaces

Real cost and complexity savings. These are real numbers.

vs Strapi (content management)

1,742× smaller

500 MB install → 287 KB binary. 400 npm packages → 0 dependencies. Cold start 2 min → instant. Dynamic schemas, token auth, REST API, full-text search — backed by a 26-module C runtime with its own SIMD JSON parser, BLAKE2b crypto, LZ4 compression, and work-stealing thread pool.

vs Deepgram / OpenAI (transcription)

$0 / minute

Cloud transcription at $0.006/min adds up fast. Bonfyre runs HCP-enhanced Whisper locally: speaker segmentation, filler detection, hallucination filtering via Hyperscan pattern fusion (>1 GB/s), and quality scoring with OpenSMILE eGeMAPSv02 (88 audio features). Visible metrics, not a black box. See live proof →

vs Pinecone (search)

$0 / month

$70–250/mo hosted vector search → local NEON SIMD cosine similarity in 5 ms. Embeddings stored in zero-copy LMDB (pointer casts into mmap — no deserialization). BM25 text ranking for hybrid results. Runs on your machine, scales to millions of documents.

vs Twilio (phone)

No per-call billing

SaaS vendor lock-in → local communications edge. bonfyre-tel handles calls, SMS, MMS, routing, and identity/event state; bonfyre-moq covers browser-native live media; fountain-coded swarm delivery helps move artifacts across unreliable links. No recurring fees, no per-minute markup.

vs full SaaS stack

$0 / month total

Auth + billing + API gateway + CMS + vector search + OpenAI-compatible proxy + local orchestration + phone edge + relay: typically multiple vendors and glue services. Bonfyre keeps those surfaces in one stack, backed by the same shared runtime, for ~2.1 MB on disk.

vs cloud AI APIs

100% private

Your data never leaves your device. You can still point existing AI clients at a local OpenAI-compatible endpoint through bonfyre-proxy, but the work stays on your hardware and can run offline when the path allows it.

vs LiveKit / Twilio Media (real-time relay)

$0 / stream

bonfyre-moq is a pure C WebTransport/MoQ relay for private realtime audio and media fan-out. No SaaS relay bill, no Node.js in production, no per-stream markup. It fits the same local-backend story as the rest of Bonfyre instead of pushing you into a separate realtime vendor stack. Details →

Real solutions across industries

These workflows weren't possible before — they needed expensive cloud GPUs or huge servers. Now they run on your laptop.

⚖️ Legal: Private Client Calls → Organized Briefs

Record a client intake call. Bonfyre transcribes it locally (HCP Whisper with speaker segmentation), identifies key issues via BM25 ranking, scores transcript quality, and packages everything into a structured brief with content-addressed artifacts. Your client's data never touches the cloud — BLAKE2b hashing proves provenance.

Previously required: $400/month in cloud APIs

🏥 Healthcare: Local SOAP Notes

Patient conversations transcribed and summarized into structured SOAP notes — all on-premise. HIPAA-compliant by default because nothing leaves the building. XSalsa20-Poly1305 encryption at rest, Ed25519 signing for audit trails. An 8 GB device runs the entire stack: transcription, summarization, quality scoring, and delivery.

Previously required: Cloud subscription + compliance overhead

🎙️ Podcast → Monetized Asset

Drop in a raw episode → get a transcript (speaker-segmented), show notes, social media clips, and a checkout page with dynamic pricing — all generated locally in under 5 minutes. The pipeline: bonfyre-intake → bonfyre-transcribe → bonfyre-summarize → bonfyre-format → bonfyre-offer. Each stage is content-addressed; re-runs skip unchanged work.

Previously required: GPT-4 API + Descript + manual work

🎬 Video Production: Longer, Cheaper

AI video generation models that used to need $10K+ cloud GPUs now run on a $2,000 workstation. Generate longer sequences without running out of memory — Bonfyre's compression makes the difference.

Previously required: Cloud GPU rental at $3–5/hour

🌍 Field Research: Record → Archive → Search

Record interviews in the field with no internet. Bonfyre transcribes on-device, embeds documents for SIMD vector search, creates searchable archives with zero-copy LMDB caching, and publishes a static site — all from a MacBook. Fountain codes handle lossy syncs back to base.

Previously required: Cloud transcription + database hosting

🏫 Education: Lectures → Course Materials

Record a 90-minute lecture. Bonfyre transcribes it, generates structured notes, and produces a study guide — all locally. Student data never goes to any third party.

Previously required: Otter.ai subscription + manual editing

🏭 Manufacturing: Inspection Without Cloud

Run AI quality inspection on the factory floor with no internet required. A small device handles the entire model + logging + API. Fits in a 10-watt power envelope.

Previously required: Cloud GPU + internet connection

🛰️ Defense / Remote: Air-Gapped AI

Full AI inference on hardware with no internet connection. Bonfyre's compression means large AI models now fit on portable, field-deployable devices.

Previously required: Specialized military hardware or satellite uplink

Real-Time Media, Still In The Bonfyre Style

bonfyre-moq is the realtime version of the same idea as the rest of Bonfyre: a small C binary you can run yourself. It terminates WebTransport, speaks MoQ, records stream events to SQLite, and gives you a private relay primitive for live audio, agent calls, and browser-native media flows.

Previously required: LiveKit/Twilio-style media infrastructure or a custom relay team

Technical details →

The common pattern

Calls, files, notes, recordings, repos, and live streams come in → Bonfyre processes them locally → organized output comes out as summaries, briefs, archives, pages, audio, search surfaces, and delivery flows. No cloud dependency. No recurring vendor chain.

What You Have	What You Can Run
Raspberry Pi / 8 GB device	Small AI models, local transcription, full pipeline processing
MacBook / 16 GB laptop	Speech transcription + AI inference + video processing — all running together
MacBook Pro / 64 GB Mac	Large AI models (14B parameters), full video generation, 32K-token context windows
Cloud GPU (T4 / 16 GB)	Everything a 64 GB Mac can do, plus batch processing and multi-model workflows
Workstation GPU (RTX 4090+)	Full video generation, multi-second sequences, all models simultaneously

Pick the one that matches your problem

Each is a standalone starting point — you don't need to understand the whole system.

Content Management

Replace bloated CMS platforms with a 287 KB binary. Dynamic schemas, token auth, REST API, full-text search — 1,742× smaller than Strapi with zero npm dependencies. Repo →

Local Transcription

HCP-enhanced Whisper runs locally: speaker segmentation, filler/hallucination filtering, quality scoring (88 audio features via eGeMAPSv02). No cloud, no API keys, no per-minute charges. Live proof → Repo →

Audio → Invoice Pipeline

Drop in audio → transcript → summary → quality score → pricing → packaged deliverable. 5–8 ms per stage. Content-addressed caching skips unchanged stages on re-runs. Repo →

Smart Search

NEON SIMD cosine similarity + BM25 text ranking. Zero-copy LMDB storage (pointer casts into mmap). Replace $250/mo Pinecone with local 5 ms queries. Repo →

Self-Hosted Backend

Auth, metering, API gateway, rate limiting, telephony, proxy, orchestration, work-stealing thread pools, compression, config, and CLI surfaces — all backed by the same shared runtime instead of a pile of separate services. Repo →

AI Model Compression

3-bit quantization via E8 lattice snap + μ-law warp + 16D RVQ. Cosine 0.9999+. 15 models published on HuggingFace. Run on a $500 laptop what used to require a $10K GPU. Repo →

OpenAI Drop-In Replacement

53 KB binary. Set OPENAI_API_BASE=http://localhost:8787. Same endpoints, local processing, $0 cost. Good when you want existing tools to keep working while Bonfyre handles transcription, briefing, and local model-facing flows underneath.

Machine-Only Orchestration

bonfyre-orchestrate is not a chat wrapper. It is a planner layer that scores pathways, stores policy memory, and decides when extra Bonfyre blocks like narrate, render, emit, or pack should be used to improve the result.

Long-Context AI

KV cache compression: 4× more context tokens in the same VRAM. 9 optimization passes. Works with both nn.Linear and FPQ layers. Process entire 90-minute lectures in a single pass.

Verified Narration

bonfyre-narrate turns briefs, updates, and changelogs into audio with inline feature extraction, fidelity scoring, and refinement. It is not just “TTS added later” — it is a proof-minded output layer in the same stack.

Private Realtime Relay

Run a private WebTransport/MoQ relay in C for browser-native audio, live agent sessions, and local media fan-out. Same Bonfyre values: local-first, low overhead, SQLite logs, no recurring relay vendor tax. Docs →

Communications And Delivery

Use SIP where it fits, MoQ where it fits, and swarm delivery where it fits. Bonfyre now has a broader transport story: telephony, browser-native relay, QUIC-aware artifact movement, and metered peer distribution in one family.

WordPress users

Keep using WordPress for the front end. Let Bonfyre handle the heavy lifting behind the scenes.

1. Podcast → blog posts

Turn episode audio into draft posts, summaries, and quotes automatically.

2. Smart site search

Search by meaning, not just keywords. Replace bloated search plugins.

3. Auto article briefs

Create editorial summaries from long transcripts or notes.

4. Premium member gateway

Back premium content tiers without plugin sprawl.

5. Lead magnet generator

Produce PDFs, EPUBs, and guides from your WordPress content.

6. Knowledge base search

Index docs, FAQs, and help content for fast retrieval.

7. Client portal backend

WordPress handles presentation. Bonfyre handles auth, metering, and deliverables.

8. Call recordings → CRM notes

Raw call audio into organized, quality-scored client packets.

9. Auto-tagging archives

Enrich old content with topics, categories, and clusters.

10. Content repurposing

Turn one post into snippets, email copy, and social-ready assets.

11. Research library

Semantic index + artifact pipeline for PDFs and transcripts.

12. Proposal automation

Quoting and billing from proof bundles to invoices.

13. Voice memo publishing

Upload voice notes, publish cleaned, summarized versions.

14. Local AI features

Transcription and search without cloud APIs or vendor lock-in.

15. Fast static publishing

WordPress as editor, Bonfyre to emit alternate outputs and feeds.

Replace plugin sprawl

Typical WordPress

Yoast SEO — $99/yr
MemberPress — $179/yr
SearchWP — $99/yr
WP All Import — $99/yr
Gravity Forms — $59/yr
WooCommerce + 8 add-ons
Deepgram/Otter — $/min
Zapier — $49/mo

53 binaries in the stack + 26-module runtime — $0/month
~2.1 MB total on disk
Auth + billing + metering
Local transcription
Smart search
Multi-format output
Dynamic pricing
Zero vendor lock-in

Good fit

Better run events, cleaner communication, faster promo cycles.

Try this

Post-event notes uploaded → event summary, top issues, social post ideas, sponsor recap.

Browse all recipes →

What's under the hood

What changed since the last public push

What Bonfyre replaces

Real solutions across industries

Does my hardware work?

Pick the one that matches your problem

Content Management

Local Transcription

Audio → Invoice Pipeline

Smart Search

Self-Hosted Backend

AI Model Compression

OpenAI Drop-In Replacement

Machine-Only Orchestration

Long-Context AI

Verified Narration

Private Realtime Relay

Communications And Delivery

WordPress users

1. Podcast → blog posts

2. Smart site search

3. Auto article briefs

4. Premium member gateway

5. Lead magnet generator

6. Knowledge base search

7. Client portal backend

8. Call recordings → CRM notes

9. Auto-tagging archives

10. Content repurposing

11. Research library

12. Proposal automation

13. Voice memo publishing

14. Local AI features

15. Fast static publishing

Replace plugin sprawl

Typical WordPress

Bonfyre

Good fit

Real-world recipes

🏢 Property Managers

🍺 Bars & Nightlife

🍕 Restaurants

✂️ Salons & Barbershops

🏋️ Gyms & Fitness Studios

🔧 Local Service Businesses

🏠 Real Estate Teams

🛡️ Insurance Agencies

⚖️ Law Offices

🏥 Medical & Dental Offices

💚 Nonprofits

📚 Schools & Training Orgs

⛪ Churches & Faith Communities

🏛️ Museums & Historical Groups

💼 Agencies & Consultants

🎧 Clubs & Event Venues

20 live apps running right now

Recent additions

Open source. MIT licensed. $0/month.