Data Documentation

The Serava Database

How we source, structure, enrich, and score 6,071,556 acquisition-ready businesses across 45 industries and 6 countries.

6.1M+

Companies

45

Industries

6

Countries

25+

Data Sources

97.7%

No website

the off-market majority

1,048,269

US motor carriers

499,163 with a named owner

FMCSA

968,197

Healthcare providers

965,372 named — 99.7%

NPPES

881,321

UK companies

773,099 beneficial owners + age

Companies House

659,451

US registered entities

358,544 with a named owner

Secretaries of State

399,526

NZ companies

388,818 named directors — 97.3%

NZ Companies Office

330,972

Licensed contractors

88,638 WA+OR: owner, phone, tenure

State licence boards

Owner path

Own one of these businesses?

Buyers are actively searching this market. Check buyer interest privately for your business.

Anonymous buyer-demand proof

We are seeing active demand for companies like this.

Dental practices

owner-operated, owner-led or associate-supported

Ontario

General dentistry, specialty, or multi-provider clinic

Production mix, hygiene base, payer mix, associate coverage, and lease terms are clear

Buyer names are not disclosed without permission; this proof uses anonymized demand category, size, geography, and criteria only.

Check buyer interest privately

Geographic Coverage

Serava maps English-speaking markets where lower-middle-market acquisition activity is highest. Each country is sourced from multiple independent datasets to improve target-map coverage.

🇺🇸

United States

All 50 states, OSM + SBA + SAM + TDLR sources

🇨🇦

Canada

All provinces, OSM + ODBus + Montréal registry

🇬🇧

United Kingdom

England, Scotland, Wales, N. Ireland via Companies House + OSM

🇦🇺

Australia

All states and territories via OSM

🇮🇪

Ireland

All counties via OSM

🇳🇿

New Zealand

399,526 companies from the NZ Companies Register — 388,818 with named directors (97.3%)

Industry Coverage

The 19 core verticals shown here (of 45 tracked) are chosen for their prevalence in PE and search fund deal flow: fragmented, recurring-revenue businesses with owner-operated characteristics. All industries are active in all 6 covered countries.

🌡️ HVAC

🔧 Plumbing

⚡ Electrical

💻 Managed IT / MSP

🐛 Pest Control

🌿 Landscaping

🏭 Manufacturing

🏠 Roofing

🧹 Janitorial / Cleaning

🔩 Auto Repair

🖌️ Painting

🏊 Pool and Spa

🔐 Security Services

🧱 Concrete and Masonry

📦 Software Services

⚖️ Legal and Professional

👥 Staffing

🏢 Facility Management

🏗️ Property Management

Data Sources

Records originate from public registries, active license sources, filings, open map data, and business websites. Sources are free or budget-gated, and fields should be treated as signals to review before outreach.

🚚

FMCSA Motor Carrier Census

Open Data

The Federal Motor Carrier Safety Administration publishes a census of every registered US motor carrier. Serava imports 1,048,269 active carrier records, 499,163 of which carry a named owner or principal on file, making trucking and logistics one of the deepest owner-named verticals in the database. Each record arrives with the carrier's operating address and phone straight from the federal filing.

Fields captured

Legal and DBA name
Named owner / principal
Physical address
Phone number
Fleet size (power units)

🏥

NPPES Healthcare Provider Registry

Open Data

The National Plan and Provider Enumeration System is the US government registry behind every NPI number. Serava imports 968,197 healthcare provider records across dental, veterinary, optometry, physical therapy, chiropractic, and more, and 99.7% of them carry a named owner or authorized official. Because the name comes from the provider's own federal enumeration filing, it is registry-grade rather than scraped.

Fields captured

Practice / organization name
Named owner or authorized official
Practice address
Phone number
Provider taxonomy (specialty)

🏛️

US Secretaries of State Corporate Filings

Open Data

State Secretary of State offices publish the corporate registrations behind every legally formed entity. Serava imports 659,451 registered US entities from these filings, 358,544 of them with a named owner, officer, or organizer on record. These registrations capture legal formation data — entity type, registration date, and the people behind the company — that never appears on a website.

Fields captured

Legal entity name
Named owner / officer
Registered address
Entity type
Registration date

📋

State Licence and Dental Boards

Open Data

State licensing boards publish the licensees behind regulated trades and practices. Serava imports 330,972 licensed contractors from state licence boards, including 88,638 in Washington and Oregon where the board file carries the owner's name, phone, and licence tenure, plus 55,309 dental practice owners matched from state dental board licensee records. Licence tenure is one of the strongest owner-tenure signals in the scoring model.

Fields captured

Licensee / business name
Named owner (where published)
Phone number
Licence tenure
Licence status

🌐

OpenStreetMap / Overpass API

Open Data

A baseline open-map layer for visible local operators. Serava uses targeted Overpass API queries where OSM tagging is dense enough to be useful, then supplements thin markets with official registries, licenses, filings, and demand-led source imports. OSM is not treated as a complete source for niche categories.

Fields captured

Business name
Address and coordinates
Phone and website (when available)
Opening hours
Industry tags
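
A targeted Overpass query of the kind described above might be assembled as in the sketch below. The industry-to-tag mapping, the ISO 3166-2 area filter, and the function name are illustrative assumptions, not Serava's actual query set.

```python
# Hypothetical industry-to-OSM-tag mapping; the real tag set is not documented here.
INDUSTRY_TAGS = {
    "plumbing": ("craft", "plumber"),
    "electrical": ("craft", "electrician"),
    "hvac": ("craft", "hvac"),
}


def build_overpass_query(industry: str, iso_region: str, timeout_s: int = 60) -> str:
    """Return an Overpass QL query for one industry in one region.

    iso_region is an ISO 3166-2 code such as "US-TX".
    """
    key, value = INDUSTRY_TAGS[industry]
    return (
        f"[out:json][timeout:{timeout_s}];\n"
        # Resolve the region to an Overpass area, then search inside it.
        f'area["ISO3166-2"="{iso_region}"]->.region;\n'
        # nwr = nodes, ways, and relations carrying the tag.
        f'nwr["{key}"="{value}"](area.region);\n'
        # "center" yields one coordinate per way or relation.
        "out center tags;"
    )


print(build_overpass_query("plumbing", "US-TX"))
```

One query per industry-region pair keeps each request small, which is consistent with running many narrow queries rather than a single broad one.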

🇬🇧

UK Companies House Bulk Data

Open Data

Companies House publishes a full bulk CSV export of all registered UK companies, updated monthly. Serava streams this file, filters for active companies incorporated at least 3 years ago with a valid postcode, maps 32 SIC 2007 industry codes to our 19 industry types, and geocodes each company using the postcodes.io API. Each import cycle adds 100,000 to 300,000 UK businesses, with legal registration data unavailable in OSM.

Fields captured

Company name and registration number
Registered address
SIC industry code
Incorporation date
Company status
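
The filter described above (active, incorporated at least 3 years ago, postcode present, SIC code mapped) can be sketched per row as follows. The column names are assumed to follow the bulk CSV header, the three-entry SIC mapping is a hypothetical subset of the 32 codes, and the non-empty postcode check stands in for real postcode validation. Geocoding is not shown.

```python
from datetime import date, datetime
from typing import Optional

# Hypothetical subset of the SIC 2007 -> industry mapping.
SIC_TO_INDUSTRY = {
    "43210": "Electrical",
    "43220": "Plumbing",
    "81300": "Landscaping",
}


def years_since(d: date, today: date) -> int:
    """Whole years elapsed between d and today."""
    return today.year - d.year - ((today.month, today.day) < (d.month, d.day))


def classify_row(row: dict, today: date) -> Optional[str]:
    """Return the mapped industry, or None if the row fails any filter."""
    if row.get("CompanyStatus") != "Active":
        return None
    # Stand-in check: a real pipeline would validate the postcode format.
    if not (row.get("RegAddress.PostCode") or "").strip():
        return None
    try:
        incorporated = datetime.strptime(row["IncorporationDate"], "%d/%m/%Y").date()
    except (KeyError, ValueError):
        return None
    if years_since(incorporated, today) < 3:
        return None
    # Assumed SIC text shape: "43220 - Plumbing, heat and air-conditioning installation".
    sic_code = (row.get("SICCode.SicText_1") or "").split(" - ")[0].strip()
    return SIC_TO_INDUSTRY.get(sic_code)


example = {
    "CompanyStatus": "Active",
    "RegAddress.PostCode": "SW1A 1AA",
    "IncorporationDate": "01/02/2015",
    "SICCode.SicText_1": "43220 - Plumbing, heat and air-conditioning installation",
}
print(classify_row(example, date(2024, 1, 1)))
```

Because each row is judged independently, the same function works whether the file is streamed line by line with `csv.DictReader` or loaded in batches.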

🇳🇿

NZ Companies Office Bulk Data

Open Data

The New Zealand Companies Office publishes the official Companies Register as monthly bulk data. Serava imports 399,526 NZ companies from this register, 388,818 of them (97.3%) with a named director sourced from the register's director records — the founding director on file for each company — plus 62,048 registered websites. Coverage spans every NZ region from Auckland to Otago and refreshes automatically each month from the official feed.

Fields captured

Company name and NZBN
Named director (founding)
Registered region
Website (where filed)
Registration date

🇨🇦

Statistics Canada Business Register (ODBus)

Open Data

The Open Database of Businesses (ODBus) is Statistics Canada's public extract of the National Business Register. It contains business names, addresses, NAICS industry codes, and employee size class for the majority of Canadian employer businesses. Serava imports this dataset and maps Canadian NAICS codes to our industry taxonomy, supplementing OSM coverage for Canadian provinces including Quebec and the Maritime provinces where OSM tagging density is lower.

Fields captured

Business name
Province and city
NAICS industry code
Employee size band
Business type

🏙️

Ville de Montréal Commercial Registry

Open Data

The City of Montréal publishes a structured open-data file of all commercially registered businesses operating within the city limits. This registry captures small operators that may not appear in national datasets, adding granular coverage for one of Canada's largest commercial markets. The dataset is imported directly from Montréal's open data portal and geocoded to precise coordinates.

Fields captured

Business name
Commercial address
Business category
Registration date

🇺🇸

SBA 7(a) FOIA Loan Data

Open Data

The US Small Business Administration releases historical 7(a) loan data under FOIA. Each record contains the borrower's business name, city, state, NAICS code, loan amount, and the number of jobs retained. Serava cross-references this against existing database entries to append employee estimates and revenue proxies. Because SBA loans skew toward established, owner-operated businesses, this dataset is a strong signal of acquisition-relevant companies.

Fields captured

Borrower name and location
NAICS code
Loan amount (revenue proxy)
Jobs retained (employee proxy)
Approval date

🏛️

SAM.gov Federal Contractor Registry

Open Data

SAM.gov is the US government's System for Award Management, containing businesses registered to receive federal contracts. These registrations can include detailed NAICS codes, employee counts, annual revenue, and owner demographic information. Serava cross-references SAM records to enrich matching companies with self-reported employee and revenue signals.

Fields captured

Legal business name
NAICS codes
Annual revenue (self-reported)
Employee count (self-reported)
Owner demographics

🇺🇸

Texas Department of Licensing and Regulation

Open Data

TDLR publishes daily CSV license files for Texas contractors, facilities, towing and vehicle storage operators, professional employer organizations, EV supply providers, mold companies, and related licensed businesses. Serava imports active Texas business-level records, maps them into the acquisition taxonomy, and uses ZIP or county-level coordinates depending on the detail available in each file.

Fields captured

License number and type
Business or facility name
Address or county
Phone number
License status and expiration

Enrichment Layer

Raw registry data tells you a business may exist and where it appears to operate. Enrichment adds contact signals, size clues, source context, and reputation data when available. Paid sources stay budget-gated and demand-led, with lower-cost public imports and website crawls used first.

Diffbot AI Knowledge Graph

Budget-gated machine extraction from public web signals. Diffbot is reserved for demand-led queues, paid target maps, or reviewed expansion work rather than broad database enrichment.

Fields appended

Owner / principal name
Revenue estimate
Employee count
Founding year
Business description

Google Places API

Optional paid contact and reputation signals from Google's business index. Matches are made by name and coordinates when budget allows; because this can be expensive, Serava runs it only on high-intent target maps rather than the whole dataset.

Fields appended

Phone signal
Website URL
Google star rating
Review count
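
Matching by name and coordinates could look like the sketch below. It assumes candidate places have already been fetched and reduced to a name plus latitude and longitude, so no API call is shown. The 150 m radius and 0.85 name-similarity threshold are illustrative values, not documented settings.

```python
import math
from difflib import SequenceMatcher


def haversine_m(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance in metres between two WGS84 points."""
    r = 6_371_000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def is_match(company: dict, candidate: dict,
             max_distance_m: float = 150.0, min_name_ratio: float = 0.85) -> bool:
    """True when the names are similar enough and the points are close enough."""
    name_ratio = SequenceMatcher(
        None, company["name"].lower(), candidate["name"].lower()
    ).ratio()
    distance = haversine_m(company["lat"], company["lng"],
                           candidate["lat"], candidate["lng"])
    return name_ratio >= min_name_ratio and distance <= max_distance_m


company = {"name": "Acme Plumbing", "lat": 43.6532, "lng": -79.3832}
candidate = {"name": "ACME Plumbing", "lat": 43.6533, "lng": -79.3833}
print(is_match(company, candidate))
```

Requiring both conditions guards against two failure modes: a same-named franchise across town, and an unrelated business in the same building.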

Wikidata SPARQL

Free SPARQL queries against Wikidata's structured knowledge graph, subject to the public query service's timeouts and rate limits. Useful for notable companies where founder, CEO, employee count, and revenue are recorded as structured facts. Serava queries Wikidata for any company whose name matches a known Wikidata entity.

Fields appended

Founder and CEO name
Official website
Employee count
Founding year
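
A lookup for the fields listed above might be built as in this sketch. The property IDs are standard Wikidata properties (P112 founded by, P169 chief executive officer, P856 official website, P1128 employees, P571 inception). The exact-label match and the function name are assumptions about how name matching could work, not a description of Serava's query.

```python
def build_company_query(label: str, lang: str = "en") -> str:
    """Return a SPARQL query that looks up a company by its exact label."""
    # Escape backslashes and quotes so the label is a valid SPARQL string literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return f"""
SELECT ?company ?founderLabel ?ceoLabel ?website ?employees ?inception WHERE {{
  ?company rdfs:label "{escaped}"@{lang} .
  OPTIONAL {{ ?company wdt:P112 ?founder . }}
  OPTIONAL {{ ?company wdt:P169 ?ceo . }}
  OPTIONAL {{ ?company wdt:P856 ?website . }}
  OPTIONAL {{ ?company wdt:P1128 ?employees . }}
  OPTIONAL {{ ?company wdt:P571 ?inception . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{lang}" . }}
}}
LIMIT 5
""".strip()


print(build_company_query("Acme Ltd"))
```

Every field sits in its own `OPTIONAL` block so that an entity missing, say, an employee count still returns its founder and website.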

ProPublica Nonprofit Explorer

IRS Form 990 data for US nonprofit organizations, associations, and foundations. ProPublica's API provides total revenue and total assets as reported on the most recent 990 filing. This is particularly useful for industry associations, healthcare nonprofits, and trade schools that appear in the home services and healthcare categories.

Fields appended

Total revenue (Form 990)
Total assets
Tax year

Evidence Grading

Every company is graded by how much verified registry data stands behind it. Verified datapoints count toward the grade; modeled estimates never inflate it.

1,200+

Platinum

48,000+

Gold

114,000+

Silver

4.5M+

Bronze

1.3M

Modeled

Acquisition Fit Score

Every mapped company receives a composite acquisition fit score between 0 and 100. The score is designed to surface businesses with traits that may matter to acquisition buyers. Higher scores rank to the top of every mandate search by default.

Owner Tenure (30 pts)

How many years the business has been under the same owner. Longer tenure correlates with accumulated equity, retirement motivation, and clean operational history.

Years in Business (25 pts)

Total operating age of the business. Older businesses have demonstrated survival through multiple economic cycles and carry lower go-forward risk.

Revenue Estimate (20 pts)

When available from Diffbot, SBA, or SAM sources, revenue is scored against the mandate's target range. Businesses near the centre of the range score highest.

Contact Completeness (15 pts)

Businesses with available phone, website, contact-page, or Google Places signals score higher. This can correlate with active operations and reachability for outreach.

Review Quality (10 pts)

Google star rating and review count signal customer satisfaction and market presence. A high rating backed by a high review volume suggests a stable customer base.

Deduplication Methodology

Because multiple data sources cover overlapping geographies, deduplication is applied at the database layer. Each company is identified by a composite key of its normalized name, rounded latitude, and rounded longitude. When the same business appears in both OSM and a national registry, the record is merged rather than duplicated, and any additional fields from the second source are appended to the existing record.

-- Unique index prevents double-counting across sources (index name is illustrative)

CREATE UNIQUE INDEX idx_companies_dedup ON companies (name_norm, lat_round, lng_round);

-- Conflicting rows are silently ignored; the first insert wins (SQLite syntax)

INSERT OR IGNORE INTO companies (...) VALUES (...);
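
On its own, `INSERT OR IGNORE` keeps the first row and discards whatever the second source carried. One way to get the merge behaviour described in the prose is an upsert that fills only the fields still empty. The sketch below assumes SQLite 3.24 or later; the table columns, the index name, the name-normalisation rule, and the three-decimal rounding are all hypothetical.

```python
import re
import sqlite3


def dedup_key(name: str, lat: float, lng: float, precision: int = 3):
    """Composite key: normalised name plus rounded coordinates."""
    name_norm = re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()
    return name_norm, round(lat, precision), round(lng, precision)


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (
    name_norm TEXT NOT NULL,
    lat_round REAL NOT NULL,
    lng_round REAL NOT NULL,
    phone TEXT,
    reg_number TEXT
);
CREATE UNIQUE INDEX idx_companies_dedup
    ON companies (name_norm, lat_round, lng_round);
""")

# On a key conflict, keep existing values and fill only the NULL fields.
UPSERT = """
INSERT INTO companies (name_norm, lat_round, lng_round, phone, reg_number)
VALUES (?, ?, ?, ?, ?)
ON CONFLICT (name_norm, lat_round, lng_round) DO UPDATE SET
    phone = COALESCE(companies.phone, excluded.phone),
    reg_number = COALESCE(companies.reg_number, excluded.reg_number)
"""


def upsert(name, lat, lng, phone=None, reg_number=None):
    conn.execute(UPSERT, (*dedup_key(name, lat, lng), phone, reg_number))


# The same business arriving from two sources with slightly different details.
upsert("Acme Plumbing Ltd.", 51.50741, -0.12776, phone="020 7946 0000")
upsert("ACME PLUMBING LTD", 51.50739, -0.12781, reg_number="01234567")
print(conn.execute("SELECT * FROM companies").fetchall())
```

Rounded coordinates are a coarse key: two records for one business that fall on opposite sides of a rounding boundary will not collide, so a fuzzy second pass may still be needed.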

Update Cadence

The target map is not a static snapshot. Every company is continuously re-graded as verified data lands, registry re-imports follow each source's release schedule, and the enrichment layer is run against higher-intent companies first.

OpenStreetMap (Overpass API)

~1,820 queries per run across all industry-region combinations

Per import cycle

UK Companies House

Aligned to Companies House monthly bulk data release

Monthly

Statistics Canada ODBus

ODBus is updated by Statistics Canada on an annual cycle

Annual

Ville de Montréal Registry

Open data portal updates periodically

Quarterly

SBA Loan Data

FOIA release follows US fiscal year

Annual

SAM.gov

Cross-referenced on enrichment runs

On-demand

Texas Department of Licensing and Regulation

Runs with master registry import cycles and can be triggered manually from admin

Per import cycle

Enrichment (websites, Google, Diffbot, Wikidata)

Website crawls and public sources run first; paid sources require explicit confirmation on high-intent queues

Demand-led

Build a Target Map

Request access to build a mandate and map the market against your acquisition criteria.

Request Access

Free deal map · no sign-in

Put the database to work on your thesis

Describe what you are looking for in plain English and instantly see how many owner-led businesses match across 6M companies, then get your free deal map.

Find your targets free