Dr. Ines Wakabayashi

Author

Dr. Ines Wakabayashi is BuilderProof's lead methodologist. She designs the test rigs and scoring rubrics behind every benchmark, following a decade in reproducible-systems research.

Deploy quality

Deploy-quality benchmark: SEO, accessibility and performance audits (June 2026)

We audited the deployed output of seven AI app builders with Lighthouse, axe-core and a structured SEO checklist, testing the production build rather than the in-editor preview. Performance was the strongest dimension across the board; accessibility was the weakest, with colour-contrast and form-label failures common. Lovable and v0 led overall, but no builder shipped a clean accessibility pass out of the box. This page reports per-dimension scores and the specific failures that recur, so you know what to fix after export.

11 min read

Speed

Speed-to-first-paint across AI app builders (June 2026)

We timed two clocks for each builder: speed-to-first-paint (prompt to first rendered preview) and time-to-working-app (prompt to all acceptance checks passing). v0 by Vercel was fastest to first paint at a median 9 seconds; Base44 and Bolt.new followed. The ranking shifts for time-to-working-app, where full-app builders pay an upfront cost but reach a runnable result with fewer manual edits. We report medians of five cold runs on a fixed network profile, with the full distribution and caveats below.
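
The "median of five cold runs" summary described above can be sketched in a few lines. This is a minimal illustration and not BuilderProof's actual tooling: the function name summarize_runs and the timings are made up for the example and are not measured results.

```python
from statistics import median


def summarize_runs(timings_s):
    """Summarize cold-run timings (in seconds) by median, minimum and maximum."""
    if not timings_s:
        raise ValueError("need at least one timing")
    return {
        "median": median(timings_s),
        "min": min(timings_s),
        "max": max(timings_s),
    }


# Made-up timings for a hypothetical builder; these are NOT measured data.
example_runs = [11.2, 10.4, 13.9, 10.9, 12.5]
summary = summarize_runs(example_runs)
print(summary)
```

Reporting the median rather than the mean keeps a single slow cold run from dominating the headline figure, and the minimum and maximum give a rough sense of the spread.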

10 min read

Output quality

Benchmarking output quality across 7 AI app builders (June 2026)

We gave seven AI app builders one identical brief and scored the output on visual fidelity, code structure and functional correctness using a published, double-rated rubric. Lovable led on overall output quality (93/100), with v0 close behind on component fidelity and Bolt.new strong on framework breadth. Differences were largest in code structure, not visuals: every builder produced something that looked right, but maintainability and correctness diverged sharply. This page documents the brief, the rubric and the per-builder results, with every figure sourced.

11 min read