How a Data Product With 138,000 Verified Records Launched in One Week and Scales to 50 States
At a Glance
ContractorLicensePro is a contractor verification platform serving homeowners and licensed contractors across California and Texas, with a clear path to all 50 states. At launch: 138,665 verified records, one week from concept to live, $0 per month to host. The business model is an embeddable verification badge that creates a compounding backlink engine through contractor adoption, requiring zero outreach spend to grow.
Challenge
The Problem
Hiring a contractor starts with trust. And trust, right now, has no infrastructure.
A homeowner who wants to verify a contractor's license today visits a state board website. In California, that is one board. In Texas, a different board depending on trade. Each board operates independently, with its own search interface, its own data fields, and its own definition of what "active" means on a license record. There is no unified search. No cross-state lookup. No score that synthesizes license status, disciplinary history, bonding, and insurance into something a non-expert can read in 30 seconds.
The data exists. That is not the problem. State boards publish contractor records as a public service. The problem is that the data lives in silos, behind aging interfaces built for regulatory staff rather than homeowners making $40,000 renovation decisions. Verifying one contractor across two states requires two site visits, two different search flows, and familiarity with how each board formats its records.
The market has tried to fill this. Review platforms like HomeAdvisor and Angi surface star ratings and written reviews. But reviews are not license verification. A contractor with a 4.8 rating and an expired license is still a contractor with an expired license. What was missing: a single destination that pulls official state board data, normalizes it across states, computes a trust score from multiple license signals, surfaces disciplinary history in plain language, and makes the result searchable and embeddable.
Licensed contractors suffer the same gap from the other side. The ones who maintain active licenses, carry proper bonding, and have clean disciplinary records have no straightforward way to prove that to prospective clients during the early stages of a job inquiry. No verifiable credential to display alongside a quote or a portfolio link. The market treats licensed and unlicensed contractors identically at the discovery stage, which is where hiring decisions begin.
Government data is technically accessible. Practically, it is not.
Solution
The System
ContractorLicensePro collects official state board records, processes them through a trust scoring pipeline, generates contractor profiles at web scale, and makes those profiles searchable, filterable, and embeddable.
Data Collection Pipeline
Data collection runs through a Python pipeline with two branches. The California branch scrapes the state licensing board directly, navigating the board's search interface programmatically and extracting license records: license number, status, issue date, expiration date, trade classifications, bonding amount, insurance status, and disciplinary history entries. The Texas branch uses bulk file downloads from the relevant state boards, which publish structured data files rather than requiring scraper-level interaction. Both branches normalize their output into a shared schema. A GitHub Actions workflow runs this pipeline on a monthly cron. When a new state is added, its scraper branch maps that state's fields into the same schema. The page generation layer does not change.
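The normalization step can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the schema fields, the raw column names (`LicenseNo`, `LICENSE_NUMBER`, and so on), the date formats, and the status vocabulary are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional

# Hypothetical mapping from board-specific wording to one shared vocabulary.
STATUS_MAP = {
    "active": "active",
    "current": "active",
    "expired": "expired",
    "suspended": "suspended",
}


def normalize_status(raw: str) -> str:
    """Map a board's status text onto the shared vocabulary, or 'unknown'."""
    return STATUS_MAP.get(raw.strip().lower(), "unknown")


@dataclass(frozen=True)
class ContractorRecord:
    """Shared schema that every state branch maps into (illustrative fields)."""
    state: str
    license_number: str
    status: str
    expiration_date: Optional[date]
    classifications: tuple


def from_california(row: dict) -> ContractorRecord:
    # Assumed shape of a scraped row: ISO dates, pipe-separated classifications.
    expires = row.get("ExpireDate")
    return ContractorRecord(
        state="CA",
        license_number=row["LicenseNo"].strip(),
        status=normalize_status(row["Status"]),
        expiration_date=date.fromisoformat(expires) if expires else None,
        classifications=tuple(
            c.strip() for c in row["Classifications"].split("|") if c.strip()
        ),
    )


def from_texas(row: dict) -> ContractorRecord:
    # Assumed shape of a bulk-file row: MM/DD/YYYY dates, one license type per row.
    expires = row.get("EXPIRATION")
    return ContractorRecord(
        state="TX",
        license_number=row["LICENSE_NUMBER"].strip(),
        status=normalize_status(row["LICENSE_STATUS"]),
        expiration_date=(
            datetime.strptime(expires, "%m/%d/%Y").date() if expires else None
        ),
        classifications=(row["LICENSE_TYPE"].strip(),),
    )
```

Because both functions return the same record type, anything downstream of this step can stay unaware of which state a record came from.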
Trust Score Algorithm
Each contractor receives a score from 0 to 100, computed from six license signals: active license status, expiration proximity, disciplinary record, bond status, insurance status, and classification coverage. The signals are weighted by consumer risk, so a disciplinary action carries more weight than a license expiring eight months from now. The score converts a set of bureaucratic data fields into a single readable number that shows which contractors have the strongest compliance profile in their trade and location.
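A weighted score of this kind could look like the sketch below. The six signal names come from the description above; the specific weights, the 0-to-1 scale for each signal, and the one-year expiration horizon are assumptions chosen for illustration, not the platform's published values.

```python
# Hypothetical weights summing to 100; the real weighting is not published here.
WEIGHTS = {
    "active_license": 30,
    "disciplinary_record": 25,
    "bond_status": 15,
    "insurance_status": 15,
    "expiration_proximity": 10,
    "classification_coverage": 5,
}


def expiration_signal(days_until_expiry: int, horizon_days: int = 365) -> float:
    """1.0 when expiry is at least horizon_days away, falling linearly to 0.0."""
    return max(0.0, min(1.0, days_until_expiry / horizon_days))


def trust_score(signals: dict) -> int:
    """Combine six signals, each in [0, 1], into a 0-100 score."""
    missing = WEIGHTS.keys() - signals.keys()
    if missing:
        raise ValueError(f"missing signals: {sorted(missing)}")
    total = sum(
        weight * max(0.0, min(1.0, signals[name]))
        for name, weight in WEIGHTS.items()
    )
    return round(total)


# A clean record expiring in about eight months versus a disciplined one.
clean = dict.fromkeys(WEIGHTS, 1.0) | {"expiration_proximity": expiration_signal(240)}
disciplined = dict.fromkeys(WEIGHTS, 1.0) | {"disciplinary_record": 0.0}
print(trust_score(clean), trust_score(disciplined))
```

Under these assumed weights the contractor with a near-term expiration still scores well above the one with a disciplinary action, which matches the risk ordering described above.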
Dual Database Architecture
At build time, a SQLite database containing all 138,665 contractor records is committed directly to the git repository. Astro 5 reads this database during the build process and generates a static HTML page for every contractor profile, every city browse page, every state listing, and every trade classification page. The pages exist as files before any user requests them. Cloudflare Pages hosts them for free. The runtime layer handles what the static layer cannot: live search queries, lead capture, badge click tracking, and real-time contractor status lookups via Cloudflare D1 at the edge. Cloudflare Workers serve the search and lead capture APIs at zero cost for up to 100,000 requests per day, the free-tier limit. FTS5 full-text search runs in both databases, so free-text queries return results quickly at either layer.
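The FTS5 side of this can be sketched with Python's built-in `sqlite3` module, assuming the local SQLite build includes FTS5 (most do). The table name, column names, and sample rows are hypothetical. User input is quoted token by token before it reaches `MATCH`, since raw text such as `C-39` would otherwise be parsed as FTS5 query syntax.

```python
import sqlite3

# Hypothetical sample rows: (business_name, city, classification, license_number)
SAMPLE_ROWS = [
    ("Acme Roofing Inc", "Sacramento", "C-39 Roofing", "CA-1001"),
    ("Lone Star Electric", "Austin", "Electrical Contractor", "TX-2002"),
    ("Bay Roofing & Solar", "Oakland", "C-39 Roofing", "CA-1003"),
]


def build_index(rows) -> sqlite3.Connection:
    """Create an in-memory FTS5 table and load contractor rows into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE contractor_fts USING fts5("
        "business_name, city, classification, license_number UNINDEXED)"
    )
    conn.executemany("INSERT INTO contractor_fts VALUES (?, ?, ?, ?)", rows)
    return conn


def to_match_expr(text: str) -> str:
    """Quote each whitespace-separated token so it is treated as a literal phrase."""
    return " ".join('"' + t.replace('"', '""') + '"' for t in text.split())


def search(conn: sqlite3.Connection, text: str, limit: int = 10) -> list:
    """Return (license_number, business_name, city) for rows matching all tokens."""
    expr = to_match_expr(text)
    if not expr:
        return []
    cursor = conn.execute(
        "SELECT license_number, business_name, city FROM contractor_fts "
        "WHERE contractor_fts MATCH ? ORDER BY rank LIMIT ?",
        (expr, limit),
    )
    return cursor.fetchall()


conn = build_index(SAMPLE_ROWS)
print(search(conn, "roofing oakland"))
```

The same `CREATE VIRTUAL TABLE` and `MATCH` statements would apply to any SQLite-compatible store that ships FTS5; only the connection layer differs.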
Embeddable Badge Growth Engine
Licensed contractors with verified profiles can embed a "Verified on ContractorLicensePro" badge on their own websites, pointing back to their profile page. Every active badge is a dofollow backlink. Scale that across thousands of licensed contractors and the backlink profile grows through contractor adoption rather than outreach spend. The Chrome extension planned for the next phase enables real-time license lookup at the point where a homeowner is actively researching a contractor, not after they have already made a decision.
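As an illustration of what a contractor would paste into their site, the sketch below generates a badge snippet. The profile URL pattern, the badge image path, and the image dimensions are assumptions; only the domain and the badge wording come from this case study. The link carries no `rel` attribute, which is what makes it a followed link.

```python
from html import escape
from urllib.parse import quote

BASE_URL = "https://contractorlicensepro.com"
BADGE_TEXT = "Verified on ContractorLicensePro"


def badge_embed_html(state: str, license_number: str) -> str:
    """Build the HTML a contractor embeds; URL paths here are hypothetical."""
    # Percent-encode each path segment so unusual license numbers stay valid.
    path = f"/contractor/{quote(state.lower(), safe='')}/{quote(license_number, safe='')}"
    profile_url = escape(BASE_URL + path, quote=True)
    image_url = escape(BASE_URL + "/badge.svg", quote=True)
    alt_text = escape(BADGE_TEXT, quote=True)
    # No rel="nofollow": search engines treat the link as a normal backlink.
    return (
        f'<a href="{profile_url}">'
        f'<img src="{image_url}" alt="{alt_text}" width="180" height="60">'
        f"</a>"
    )


print(badge_embed_html("CA", "1001234"))
```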
Results
The Impact
ContractorLicensePro launched at contractorlicensepro.com in one week from concept. At launch: 138,665 verified contractor records across California and Texas, each with license status, Trust Score, and disciplinary history. 975 state-board-check pages pre-generated and indexed. Static sitemaps submitted to search engines at deploy.
Comparable data products have taken three to four months to reach a first indexed page. The one-week timeline is a direct output of the Astro 5 build architecture, which generates all static pages in a single build pass, combined with the Cloudflare deployment pipeline, which has no infrastructure provisioning delay.
The scaling path is mechanical. Each additional US state requires one scraper branch in the Python pipeline and one monthly refresh cycle. Twelve more states bring national coverage for the most regulated contractor trades. The badge outreach program creates a compounding backlink channel: contractors embed the badge, the badge creates backlinks, backlinks improve search ranking, better ranking increases contractor discovery, and discovery gives more contractors a reason to adopt the badge.
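One way to keep the per-state cost at a single branch is a small registry that the monthly refresh iterates over. The sketch below assumes that structure; the decorator name, the record shape, and the placeholder rows are hypothetical and stand in for real scraping or file-download code.

```python
from typing import Callable, Iterable

Record = dict
StateBranch = Callable[[], Iterable[Record]]

# Registry of state code -> branch function; adding a state adds one entry.
STATE_BRANCHES: dict = {}


def state_branch(code: str):
    """Decorator that registers a function as the collector for one state."""
    def register(fn: StateBranch) -> StateBranch:
        if code in STATE_BRANCHES:
            raise ValueError(f"branch for {code} already registered")
        STATE_BRANCHES[code] = fn
        return fn
    return register


@state_branch("CA")
def california() -> Iterable[Record]:
    # Placeholder for the scraping branch.
    yield {"license_number": "CA-1001", "status": "active"}


@state_branch("TX")
def texas() -> Iterable[Record]:
    # Placeholder for the bulk-download branch.
    yield {"license_number": "TX-2002", "status": "active"}


def run_refresh() -> list:
    """Run every registered branch and tag each record with its state code."""
    records = []
    for code, branch in sorted(STATE_BRANCHES.items()):
        for record in branch():
            records.append({**record, "state": code})
    return records


print(len(run_refresh()), "records from", sorted(STATE_BRANCHES))
```

With this layout, a new state is one decorated function, and `run_refresh` and everything after it stay unchanged.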
The hosting cost at 50 states and 500,000-plus contractor records remains $0 per month on the current Cloudflare free tier. That number does not change because the architecture does not change. The cost of serving static pages does not grow with traffic.