Leo Mari Cuizon · AI Systems Operator · QA · LLM Evaluation · Workflow Design
Mobile apps · Web · LLM output evaluation · AI workflow systems · Edge cases · iOS & Android
About
I started in customer success, moved into QA and system validation, and have spent the last year building AI-assisted products to understand how they break.
I work across QA, AI output evaluation, and structured execution — using AI tools throughout my process for testing, debugging, analysis, and workflow design. I've shipped multiple product experiments not to launch startups, but to stay close to the systems I test.
I don't specialize in one stack. I specialize in understanding how a system is supposed to behave, finding where it doesn't, and documenting it clearly enough that someone else can act on it.
What I Do
How I Work With AI
I design systems around AI tools — routing logic, source-of-truth rules, feedback loops — rather than using them ad hoc.
Projects
Personal experiments — not polished products. Each one was a reason to get closer to a real failure mode.
Deterministic job evaluation system that scores remote listings against a candidate profile — without AI ranking. Transparent, rule-based logic produces explainable outputs. Built a multi-GPT workflow to manage the build: a source-of-truth Hub, a specialist GPT for architecture decisions, and Codex for narrow implementation tasks with explicit constraints on what each tool could decide.
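A minimal sketch of what rule-based, explainable scoring can look like. It is an illustration only: the rule set, point values, and field names below are hypothetical and are not taken from the project itself.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    # Hypothetical candidate profile fields, chosen for illustration.
    skills: set[str]
    min_salary: int
    remote_only: bool = True


@dataclass
class Listing:
    # Hypothetical listing fields, chosen for illustration.
    title: str
    skills: set[str]
    salary: int
    remote: bool


@dataclass
class Evaluation:
    score: int = 0
    reasons: list[str] = field(default_factory=list)

    def add(self, points: int, reason: str) -> None:
        # Every score change is recorded with the rule that caused it.
        self.score += points
        self.reasons.append(f"{points:+d} {reason}")


def evaluate(listing: Listing, profile: Profile) -> Evaluation:
    """Apply fixed rules in a fixed order; no model call, no randomness."""
    result = Evaluation()
    if profile.remote_only and not listing.remote:
        result.add(-50, "listing is not remote")
    # Sort the matches so the explanation text is stable across runs.
    matched = sorted(listing.skills & profile.skills)
    if matched:
        result.add(10 * len(matched), "skills matched: " + ", ".join(matched))
    if listing.salary >= profile.min_salary:
        result.add(20, "salary meets minimum")
    else:
        result.add(-20, "salary below minimum")
    return result
```

Because the rules are plain conditionals, the same listing and profile always produce the same score, and the `reasons` list doubles as the explanation shown to the user.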
Visit project
Offline-first AI notes PWA iterated across 47+ versions. Used as a regression testing ground every time a feature was added or removed. Caught a critical auth failure caused by iOS Safari's Intelligent Tracking Prevention blocking Supabase session persistence on PWA reinstall, isolated the caching conflict, and documented the fix path.
Visit project
2.5D endless runner PWA built as a testing ground for continuous-state systems: collision detection, mobile control behavior, obstacle generation edge cases, state resets on game death, and performance under sustained loops. It also made object pooling, garbage collection pressure, and service worker behavior under offline conditions testable in a way most apps don't expose.
Visit project
What You Get
Documented outputs, not activity summaries. Here's what that looks like in practice.
Sample Bug Report
Real report, sanitized client details
LLM Evaluation
Real evaluation, sanitized client details
Mini Case Study
AI-assisted debugging & documentation
Problem
Users were logged out every time the PWA was refreshed or reinstalled on iOS Safari.
Tested
Service worker caching strategy, Supabase auth token storage, ITP cookie behavior across iOS versions.
Output
Surfaced a caching conflict blocking session persistence. Documented the issue and fix path with AI assistance.
Contact
Send me what you're building, what's breaking, or what you need validated. I'll return a clear issue list, evaluation notes, or workflow documentation — depending on what's needed. Available for 1–2 projects at a time, async-first, remote only.