/sys/research

[ The Lab ]

The domain of Applied Intelligence. Engineering builds the microscopes. Research looks through them.

~ ~ ~ ◈ ~ ~ ~

Research & Foundations

Experimental architectures, protocols, and rigorous frameworks.

AI Safety Compass

Research | AI Safety | Meta-Alignment | Literature Synthesis

Research paper exploring a gap: do models actually believe what their creators say about safety? I call this "meta-alignment"—not whether models do safe things, but whether they've internalized their lab's safety philosophy. Surveyed 10 frontier models with 40 questions derived from 70+ papers. Interactive tool included.

[ GitHub ] [ Paper ] [ Live Demo ]

crystallize

Research Tooling | Reproducibility | Experimental Rigor

A framework that makes data science experiments reproducible. Jupyter notebooks hide state: you can't tell what order the cells ran in or what values the variables held at the time. Crystallize treats each experiment as an immutable record with automatic statistical checks. Long-term goal: infrastructure that lets AI agents run their own experiments.

[ GitHub ] [ Docs ]
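
To make the "immutable record" idea concrete, here is a minimal sketch in plain Python. It is not Crystallize's actual API: `ExperimentRecord`, `run_experiment`, and `noisy_score` are hypothetical names chosen for illustration, and the statistical checks are left out. It only shows how freezing parameters and results together, then hashing them, removes the hidden-state problem.

```python
import hashlib
import json
import random
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class ExperimentRecord:
    name: str
    params: tuple   # sorted (key, value) pairs, so ordering is stable
    results: tuple  # one metric value per replicate

    def fingerprint(self) -> str:
        """Content hash: identical inputs and outputs give the same ID."""
        payload = json.dumps([self.name, self.params, self.results])
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_experiment(name, params, fn, replicates=5):
    """Run fn once per seed and freeze everything into a record."""
    results = tuple(fn(seed=seed, **params) for seed in range(replicates))
    return ExperimentRecord(name, tuple(sorted(params.items())), results)


def noisy_score(seed, lr):
    """Hypothetical metric: a seeded, noisy function of the learning rate."""
    rng = random.Random(seed)
    return lr * 10 + rng.gauss(0, 0.1)


record = run_experiment("lr-sweep", {"lr": 0.1}, noisy_score, replicates=3)
print(record.fingerprint()[:12], record.results)
```

Because every replicate is seeded and the record is frozen, rerunning the same experiment yields the same fingerprint, which is the property a notebook with out-of-order cells cannot guarantee.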

Bias in Embedding-Based Hiring

Research Leadership | AI Fairness | Mentorship | Ethics

Mentored an intern through AI fairness research. Designed a reading curriculum building from language model basics through AI ethics and embedding bias. Structured methodology: hypotheses → experiments → paper draft. Result: unpublished paper investigating gender bias in AI-powered hiring systems.

Backprop Paper Replication

Paper Implementation | MLX | Foundations | From Scratch

Implemented backpropagation from scratch following the original 1986 paper by Rumelhart, Hinton, and Williams. Hand-derived gradients using the chain rule (∂E/∂w via ∂E/∂y → ∂E/∂x), implemented momentum updates (Δw(t) = -ε ∂E/∂w + α Δw(t-1)), and built an MLP on Apple MLX. No AI assistance, just the paper and framework docs. Includes weight matrix evolution visualizations.

[ GitHub ]
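
The momentum rule quoted above, Δw(t) = -ε ∂E/∂w + α Δw(t-1), can be sketched on a single weight. This is an illustration in plain Python, not the MLX code from the repository: the toy error E(w) = (w - 3)^2 and the values ε = 0.1, α = 0.9 are assumptions picked so the behavior is easy to follow.

```python
def grad_E(w):
    """Gradient of the toy error E(w) = (w - 3)^2 (an assumed example)."""
    return 2.0 * (w - 3.0)


def momentum_step(w, prev_delta, eps=0.1, alpha=0.9):
    """One update: delta_w(t) = -eps * dE/dw + alpha * delta_w(t-1)."""
    delta = -eps * grad_E(w) + alpha * prev_delta
    return w + delta, delta


w, delta = 0.0, 0.0  # start away from the minimum, with no prior step
for _ in range(500):
    w, delta = momentum_step(w, delta)

print(w)  # approaches the minimum at w = 3
```

The α term carries part of the previous step forward, so the weight overshoots and oscillates around the minimum before settling, rather than creeping toward it monotonically as plain gradient descent would with the same ε.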

agent-tokens

Protocol | Intent | Policy | Agent Systems

A protocol for declaring agent intent at the HTTP layer. The problem: when an agent makes a request, origins can't tell if it matches what the user actually wanted. User says "check weather," agent calls the bank API—how does the bank know to block it? Agent Tokens let agents declare their allowed scope upfront so middleware can enforce policy automatically.

[ Website ] [ Spec ]
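
A rough sketch of the enforcement step on the origin side. The header name `Agent-Scope`, the comma-separated scope format, and the function names are all hypothetical placeholders for illustration; the actual wire format is defined in the spec, not here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentDeclaration:
    scopes: frozenset  # scopes the agent declared for this user task


def parse_declaration(headers):
    """Read the (hypothetical) Agent-Scope header into a set of scopes."""
    raw = headers.get("Agent-Scope", "")
    scopes = frozenset(s.strip() for s in raw.split(",") if s.strip())
    return AgentDeclaration(scopes)


def enforce(headers, required_scope):
    """Return an HTTP status: 200 if the route's scope was declared, else 403."""
    declared = parse_declaration(headers)
    return 200 if required_scope in declared.scopes else 403


# The user asked for weather, so the agent declared only weather access.
request_headers = {"Agent-Scope": "weather:read"}

print(enforce(request_headers, "weather:read"))   # 200
print(enforce(request_headers, "bank:transfer"))  # 403
```

The point of the design is that the bank never has to guess at user intent: the request either carries a declaration covering the route or it is rejected by middleware before reaching application code.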

Golden Gate Qwen

Interpretability | Sparse Autoencoder | Feature Steering | Mechanistic Interpretability

Minimal replication of Anthropic's Golden Gate Claude on consumer hardware. Trains a Sparse Autoencoder on Qwen2.5-1.5B, discovers interpretable features, and steers model behavior — all on an RTX 3070 Ti. Demonstrates that mechanistic interpretability research is accessible beyond frontier-scale compute.

[ GitHub ]
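
The steering step can be sketched with toy numbers. This is a simplified illustration in plain Python, not the repository's code: it assumes a standard ReLU sparse autoencoder and steers by adding a scaled decoder direction to the activation, which is one common way to do it. The 2-dimensional weights are made up so the effect is visible by eye.

```python
def matvec(matrix, vec):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def sae_encode(x, w_enc, b_enc, b_dec):
    """Feature activations: ReLU(W_enc @ (x - b_dec) + b_enc)."""
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre = [p + b for p, b in zip(matvec(w_enc, centered), b_enc)]
    return [max(0.0, p) for p in pre]


def steer(x, w_dec, feature_idx, strength):
    """Push the activation along one feature's decoder direction."""
    direction = w_dec[feature_idx]  # one row per feature (assumed layout)
    return [xi + strength * di for xi, di in zip(x, direction)]


# Toy SAE: 2 features over a 2-dimensional activation (made-up weights).
w_enc = [[1.0, 0.0], [0.0, 1.0]]
w_dec = [[1.0, 0.0], [0.0, 1.0]]
b_enc = [0.0, 0.0]
b_dec = [0.0, 0.0]

x = [1.0, 0.0]                          # original activation
steered = steer(x, w_dec, feature_idx=1, strength=5.0)

print(sae_encode(x, w_enc, b_enc, b_dec))        # [1.0, 0.0]
print(sae_encode(steered, w_enc, b_enc, b_dec))  # [1.0, 5.0]
```

In a real model the same addition would be applied to the hidden activations at the layer the autoencoder was trained on, during generation, so that the chosen feature stays active regardless of the prompt.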

ContextWars

Interpretability | Adversarial | Model Fingerprinting | MLX

Adversarial token convergence experiments on MLX. Pits language models against each other to reveal training strata — solo mode converges in 2 iterations, adversarial mode never does. Under pressure, models collapse to their most defensible tokens, revealing composition invisible in normal evaluation.

[ GitHub ]

Synapse

Workflow Demo | Fast Weights | Crystallize + LLM

A proof-of-concept testing a new research workflow: use Crystallize to structure experiments, then let an LLM help implement hypotheses rapidly. Built in an evening to validate the loop. The insight wasn't the model—it was proving the workflow enables fast iteration.

[ GitHub ]

~ ~ ~

Applied Engineering

Production systems, tools, and deployed products.

Kern

Systems Architecture | FastAPI | Kafka | Distributed Systems

Event-driven ML service architecture enabling long-running agentic workflows. Designed around distributed systems constraints: Kafka for asynchronous processing (avoiding HTTP timeouts), Redis for large results (working around Pusher payload limits), and a dual mode serving both the internal app and an external API. 525K requests/month in production.
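
The request flow can be sketched with in-memory stand-ins. This is not Kern's code: a `queue.Queue` plays the role of a Kafka topic, a dict plays the role of the Redis result store, and `submit`, `worker_drain`, and `fetch` are hypothetical names. It only illustrates why the HTTP request never has to wait for the long-running work.

```python
import queue
import uuid

jobs = queue.Queue()  # stand-in for a Kafka topic
results = {}          # stand-in for a Redis result store


def submit(payload):
    """Enqueue the job and return an ID immediately (no blocking on the work)."""
    job_id = str(uuid.uuid4())
    jobs.put((job_id, payload))
    return job_id


def worker_drain(handler):
    """Consumer side: process queued jobs and store each result by job ID."""
    while not jobs.empty():
        job_id, payload = jobs.get()
        results[job_id] = handler(payload)


def fetch(job_id):
    """Client side: returns None until the worker has stored a result."""
    return results.get(job_id)


job_id = submit({"n": 3})
print(fetch(job_id))                      # None: accepted but not yet processed
worker_drain(lambda payload: payload["n"] * 2)
print(fetch(job_id))                      # 6
```

Storing the result under the job ID, rather than pushing it to the client directly, is what lets a small notification carry only the ID while the full payload is fetched separately.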

Bloomdesk

Product | AI Pipeline | SaaS

Addressing the "Translation Gap" between users and engineers. An intelligent pipeline that uses LLMs to convert vague bug reports into structured, information-dense technical tickets.

[ View Project ]

resume-mcp

MCP | Cloudflare Workers | API

Your identity as an API endpoint. An MCP server that lets AI agents query your professional profile with structured tools instead of scraping HTML.

[ GitHub ] [ Live Endpoint ]

DeltaTask

MCP | Python | Obsidian | SQLite

An MCP server that enables AI assistants to manage tasks in Obsidian. Bridges the gap between conversational AI and personal knowledge management.

[ GitHub ]

gcomm

Rust | Ollama | DevTools

A Rust CLI for seamless communication with Ollama models. Fast, ergonomic, and designed for developer workflows.

[ GitHub ] [ crates.io ]