Data Documentation

The Serava Database

How we source, structure, enrich, and score 5M+ acquisition-ready businesses across 19 industries and 6 countries.

5M+ Companies
19 Industries
6 Countries
20+ Data Sources
Owner path

Own one of these businesses?

Buyers are actively searching this market. Check buyer interest privately for your business.

Anonymous buyer-demand proof

We are seeing active demand for companies like this.

Dental practices
$750K-$6M revenue, owner-led or associate-supported
Ontario
General dentistry, specialty, or multi-provider clinic
Production mix, hygiene base, payer mix, associate coverage, and lease terms are clear

Buyer names are not disclosed without permission; this proof uses anonymized demand category, size, geography, and criteria only.

Check buyer interest privately

Geographic Coverage

Serava maps English-speaking markets where lower-middle-market acquisition activity is highest. Each country is sourced from multiple independent datasets to improve target-map coverage.

🇺🇸 United States: All 50 states, OSM + SBA + SAM + TDLR sources
🇨🇦 Canada: All provinces, OSM + ODBus + Montréal registry
🇬🇧 United Kingdom: England, Scotland, Wales, N. Ireland via Companies House + OSM
🇦🇺 Australia: All states and territories via OSM
🇮🇪 Ireland: All counties via OSM
🇳🇿 New Zealand: All regions via OSM

Industry Coverage

Serava covers 19 industry verticals chosen for their prevalence in PE and search fund deal flow: fragmented, recurring-revenue businesses with owner-operated characteristics. All industries are active in all 6 covered countries.

🌡️HVAC
🔧Plumbing
Electrical
💻Managed IT / MSP
🐛Pest Control
🌿Landscaping
🏭Manufacturing
🏠Roofing
🧹Janitorial / Cleaning
🔩Auto Repair
🖌️Painting
🏊Pool and Spa
🔐Security Services
🧱Concrete and Masonry
📦Software Services
⚖️Legal and Professional
👥Staffing
🏢Facility Management
🏗️Property Management

Data Sources

Records originate from public registries, active license sources, filings, open map data, and business websites. Sources are free or budget-gated, and fields should be treated as signals to review before outreach.

🌐

OpenStreetMap / Overpass API

Open Data

A baseline open-map layer for visible local operators. Serava uses targeted Overpass API queries where OSM tagging is dense enough to be useful, then supplements thin markets with official registries, licenses, filings, and demand-led source imports. OSM is not treated as a complete source for niche categories.

Fields captured
  • Business name
  • Address and coordinates
  • Phone and website (when available)
  • Opening hours
  • Industry tags
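
As a rough illustration of a targeted Overpass API query, the sketch below builds an Overpass QL string for one industry inside one bounding box. The industry-to-tag mapping, the function names, and the bounding box are assumptions made for the example; the tags Serava actually queries are not documented here.

```python
from urllib.parse import urlencode

# Illustrative industry-to-tag mapping (an assumption, not Serava's actual tag list).
OSM_TAGS = {
    "plumbing": ("craft", "plumber"),
    "electrical": ("craft", "electrician"),
    "hvac": ("craft", "hvac"),
}


def build_overpass_query(industry, bbox, timeout=60):
    """Return Overpass QL selecting nodes and ways for one industry in a bounding box.

    bbox is (south, west, north, east) in decimal degrees.
    """
    key, value = OSM_TAGS[industry]
    south, west, north, east = bbox
    area = f"({south},{west},{north},{east})"
    return (
        f"[out:json][timeout:{timeout}];"
        "("
        f'node["{key}"="{value}"]{area};'
        f'way["{key}"="{value}"]{area};'
        ");"
        "out center;"
    )


def encode_request_body(query):
    """Form-encode the query for a POST to an Overpass interpreter endpoint."""
    return urlencode({"data": query}).encode("utf-8")


# Arbitrary example bounding box.
query = build_overpass_query("plumbing", (43.58, -79.64, 43.86, -79.12))
print(query)
```

Running one query per industry and region keeps each request small, which is one plausible reason a full run involves many separate queries.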
🇬🇧

UK Companies House Bulk Data

Open Data

Companies House publishes a bulk CSV export of all registered UK companies, updated monthly. Serava streams this file, keeps active companies incorporated at least 3 years ago with a valid postcode, maps 32 SIC 2007 industry codes to our 19 industry types, and geocodes each company using the postcodes.io API. Each import cycle adds 100,000 to 300,000 UK businesses, along with legal registration data that OSM does not carry.

Fields captured
  • Company name and registration number
  • Registered address
  • SIC industry code
  • Incorporation date
  • Company status
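
The filtering step described above might look roughly like the sketch below. The column names are assumptions about the bulk file layout, the SIC mapping is a three-code illustrative subset rather than the full 32-code map, the postcode check is a loose shape test, and the sample rows are made up. Geocoding is left out.

```python
import csv
import io
import re
from datetime import date, datetime

# Illustrative subset of a SIC 2007 -> industry mapping (assumption).
SIC_TO_INDUSTRY = {
    "43220": "plumbing",     # Plumbing, heat and air-conditioning installation
    "43210": "electrical",   # Electrical installation
    "81300": "landscaping",  # Landscape service activities
}

# Loose UK postcode shape check, not a full validator.
POSTCODE_RE = re.compile(r"^[A-Z]{1,2}\d[A-Z\d]?\s*\d[A-Z]{2}$")


def years_ago(today, years):
    """Return the date `years` before `today`, mapping 29 Feb to 28 Feb when needed."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        return today.replace(year=today.year - years, day=28)


def filter_companies(rows, today, min_age_years=3):
    """Yield active, old-enough companies with a plausible postcode and a mapped SIC code."""
    cutoff = years_ago(today, min_age_years)
    for raw in rows:
        # Header names may carry stray spaces, so strip keys and values.
        row = {k.strip(): (v or "").strip() for k, v in raw.items() if k}
        if row.get("CompanyStatus") != "Active":
            continue
        try:
            incorporated = datetime.strptime(row.get("IncorporationDate", ""), "%d/%m/%Y").date()
        except ValueError:
            continue
        if incorporated > cutoff:
            continue
        postcode = row.get("RegAddress.PostCode", "").upper()
        if not POSTCODE_RE.match(postcode):
            continue
        industry = SIC_TO_INDUSTRY.get(row.get("SICCode.SicText_1", "")[:5])
        if industry:
            yield {
                "name": row["CompanyName"],
                "number": row["CompanyNumber"],
                "postcode": postcode,
                "industry": industry,
            }


# Made-up sample rows for illustration only.
SAMPLE_CSV = (
    'CompanyName, CompanyNumber,RegAddress.PostCode,CompanyStatus,IncorporationDate,SICCode.SicText_1\n'
    'ACME PLUMBING LTD,01234567,LS1 4AP,Active,12/03/2015,"43220 - Plumbing, heat and air-conditioning installation"\n'
    'NEWCO ELECTRICAL LTD,07654321,M1 1AE,Active,01/06/2024,43210 - Electrical installation\n'
    'OLD GARDENS LTD,00999999,EH1 1YZ,Liquidation,05/05/2001,81300 - Landscape service activities\n'
)
kept = list(filter_companies(csv.DictReader(io.StringIO(SAMPLE_CSV)), today=date(2025, 1, 1)))
print(kept)
```

Because `filter_companies` is a generator over any row iterator, the same function can be fed from a streamed file without loading the whole export into memory.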
🇨🇦

Statistics Canada Business Register (ODBus)

Open Data

The Open Database of Businesses (ODBus) is Statistics Canada's public extract of the National Business Register. It contains business names, addresses, NAICS industry codes, and employee size class for the majority of Canadian employer businesses. Serava imports this dataset and maps Canadian NAICS codes to our industry taxonomy, supplementing OSM coverage for Canadian provinces including Quebec and the Maritime provinces where OSM tagging density is lower.

Fields captured
  • Business name
  • Province and city
  • NAICS industry code
  • Employee size band
  • Business type
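
One simple way to map NAICS codes onto an industry taxonomy is longest-prefix matching, sketched below. The prefixes and industry labels are illustrative assumptions; the mapping Serava uses is not published in this document.

```python
# Illustrative NAICS prefixes (assumption). Longer prefixes are more specific.
NAICS_PREFIX_TO_INDUSTRY = {
    "238210": "electrical",    # Electrical contractors
    "238220": "plumbing",      # Plumbing, heating and air-conditioning contractors
    "561710": "pest_control",  # Exterminating and pest control services
    "561730": "landscaping",   # Landscaping services
    "5613": "staffing",        # Employment services
    "5415": "managed_it",      # Computer systems design and related services
}


def map_naics(code):
    """Return the industry for the longest matching NAICS prefix, or None."""
    digits = "".join(ch for ch in str(code) if ch.isdigit())
    for length in range(len(digits), 1, -1):
        industry = NAICS_PREFIX_TO_INDUSTRY.get(digits[:length])
        if industry:
            return industry
    return None


print(map_naics("561320"))  # matched through the 4-digit prefix 5613
print(map_naics(238220))
print(map_naics("722511"))  # no mapping in this illustrative table
```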
🏙️

Ville de Montréal Commercial Registry

Open Data

The City of Montréal publishes a structured open-data file of all commercially registered businesses operating within the city limits. This registry captures small operators that may not appear in national datasets, adding granular coverage for one of Canada's largest commercial markets. The dataset is imported directly from Montréal's open data portal and geocoded to precise coordinates.

Fields captured
  • Business name
  • Commercial address
  • Business category
  • Registration date
🇺🇸

SBA 7(a) FOIA Loan Data

Open Data

The US Small Business Administration releases historical 7(a) loan data under FOIA. Each record contains the borrower's business name, city, state, NAICS code, loan amount, and the number of jobs retained. Serava cross-references this against existing database entries to append employee estimates and revenue proxies. Because SBA loans skew toward established, owner-operated businesses, this dataset is a strong signal of acquisition-relevant companies.

Fields captured
  • Borrower name and location
  • NAICS code
  • Loan amount (revenue proxy)
  • Jobs retained (employee proxy)
  • Approval date
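
A minimal sketch of the cross-referencing step, assuming a loose join on normalized name, city, and state. The field names (`borrower_name`, `jobs_retained`, `employee_proxy`, and so on), the suffix list, and the sample records are hypothetical, and a production matcher would likely need fuzzier comparison.

```python
import re


def match_key(name, city, state):
    """Build a loose join key: name without punctuation or common suffixes, plus city and state."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    cleaned = re.sub(r"\b(llc|inc|corp|co|ltd)\b", "", cleaned)
    return (" ".join(cleaned.split()), city.strip().lower(), state.strip().upper())


def append_sba_signals(companies, sba_loans):
    """Attach proxies from the most recent matching loan to each company dict."""
    loans_by_key = {}
    for loan in sba_loans:
        key = match_key(loan["borrower_name"], loan["city"], loan["state"])
        current = loans_by_key.get(key)
        # ISO date strings compare correctly as plain strings.
        if current is None or loan["approval_date"] > current["approval_date"]:
            loans_by_key[key] = loan
    for company in companies:
        loan = loans_by_key.get(match_key(company["name"], company["city"], company["state"]))
        if loan:
            # setdefault keeps any value already present from another source.
            company.setdefault("employee_proxy", loan["jobs_retained"])
            company.setdefault("sba_loan_amount", loan["loan_amount"])
    return companies


# Made-up records for illustration only.
companies = [
    {"name": "Bobs HVAC", "city": "Austin", "state": "tx"},
    {"name": "Other Co", "city": "Dallas", "state": "TX"},
]
loans = [
    {"borrower_name": "Bob's HVAC, LLC", "city": "AUSTIN", "state": "TX",
     "approval_date": "2016-04-01", "jobs_retained": 6, "loan_amount": 150000},
    {"borrower_name": "Bob's HVAC, LLC", "city": "AUSTIN", "state": "TX",
     "approval_date": "2019-09-15", "jobs_retained": 9, "loan_amount": 250000},
]
print(append_sba_signals(companies, loans))
```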
🏛️

SAM.gov Federal Contractor Registry

Open Data

SAM.gov is the US government's System for Award Management, containing businesses registered to receive federal contracts. These registrations can include detailed NAICS codes, employee counts, annual revenue, and owner demographic information. Serava cross-references SAM records to enrich matching companies with self-reported employee and revenue signals.

Fields captured
  • Legal business name
  • NAICS codes
  • Annual revenue (self-reported)
  • Employee count (self-reported)
  • Owner demographics
🇺🇸

Texas Department of Licensing and Regulation

Open Data

TDLR publishes daily CSV license files for Texas contractors, facilities, towing and vehicle storage operators, professional employer organizations, EV supply providers, mold companies, and related licensed businesses. Serava imports active Texas business-level records, maps them into the acquisition taxonomy, and uses ZIP or county-level coordinates depending on the detail available in each file.

Fields captured
  • License number and type
  • Business or facility name
  • Address or county
  • Phone number
  • License status and expiration

Enrichment Layer

Raw registry data tells you a business may exist and where it appears to operate. Enrichment adds contact signals, size clues, source context, and reputation data when available. Paid sources stay budget-gated and demand-led, with lower-cost public imports and website crawls used first.

1

Diffbot AI Knowledge Graph

Budget-gated machine extraction from public web signals. Diffbot is reserved for demand-led queues, paid target maps, or reviewed expansion work rather than broad database enrichment.

Fields appended
  • Owner / principal name
  • Revenue estimate
  • Employee count
  • Founding year
  • Business description
2

Google Places API

Optional paid contact and reputation signals from Google's business index. Matches are made by name and coordinates when budget allows; because this can be expensive, Serava runs it only on high-intent target maps rather than the whole dataset.

Fields appended
  • Phone signal
  • Website URL
  • Google star rating
  • Review count
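
Matching by name and coordinates could be sketched as below: candidates are first limited by great-circle distance, then ranked by name similarity. The thresholds, the field names, and the sample records are assumptions for illustration, and no Google API call is made here; `candidates` stands in for whatever a place lookup returned.

```python
import math
from difflib import SequenceMatcher


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two points."""
    radius_m = 6_371_000
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


def name_similarity(a, b):
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def best_match(company, candidates, max_distance_m=150, min_similarity=0.8):
    """Return the nearby candidate with the most similar name, or None."""
    best, best_score = None, 0.0
    for cand in candidates:
        distance = haversine_m(company["lat"], company["lng"], cand["lat"], cand["lng"])
        if distance > max_distance_m:
            continue
        score = name_similarity(company["name"], cand["name"])
        if score >= min_similarity and score > best_score:
            best, best_score = cand, score
    return best


# Made-up records for illustration only.
company = {"name": "Northside Plumbing", "lat": 43.6500, "lng": -79.3800}
candidates = [
    {"name": "Northside Plumbing Inc", "lat": 43.6501, "lng": -79.3801, "rating": 4.6},
    {"name": "Northside Plumbing", "lat": 44.0000, "lng": -79.3800, "rating": 3.9},
    {"name": "Joe's Pizza", "lat": 43.6500, "lng": -79.3800, "rating": 4.2},
]
print(best_match(company, candidates))
```

Requiring both proximity and name similarity avoids attaching a rating from a same-named business in another city or from an unrelated neighbour at the same address.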
3

Wikidata SPARQL

Free SPARQL queries against Wikidata's structured knowledge graph, subject to the public query service's rate and timeout limits. Useful for notable companies where founder, CEO, employee count, and revenue are recorded as structured facts. Serava queries Wikidata for any company whose name matches a known Wikidata entity.

Fields appended
  • Founder and CEO name
  • Official website
  • Employee count
  • Founding year
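
For illustration, a lookup of the fields listed above might be expressed as the SPARQL built below. This is a sketch under assumptions: it matches on an exact label, restricts results to instances of business (Q4830453) or its subclasses, and the function name and parameters are hypothetical. The string is only constructed here, not sent to the query service.

```python
def build_company_query(name, language="en", limit=5):
    """Build a SPARQL query for a company by exact label.

    Properties used: P112 founded by, P169 chief executive officer,
    P856 official website, P1128 employees, P571 inception.
    """
    # Escape backslashes and double quotes so the name is a valid SPARQL string literal.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return f"""SELECT ?company ?founderLabel ?ceoLabel ?website ?employees ?inception WHERE {{
  ?company rdfs:label "{escaped}"@{language} ;
           wdt:P31/wdt:P279* wd:Q4830453 .
  OPTIONAL {{ ?company wdt:P112 ?founder . }}
  OPTIONAL {{ ?company wdt:P169 ?ceo . }}
  OPTIONAL {{ ?company wdt:P856 ?website . }}
  OPTIONAL {{ ?company wdt:P1128 ?employees . }}
  OPTIONAL {{ ?company wdt:P571 ?inception . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}" . }}
}}
LIMIT {limit}"""


print(build_company_query('Acme "Pro" Ltd'))
```

Each field sits in its own OPTIONAL clause so that a company missing one fact, such as a recorded CEO, still returns its other values.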
4

ProPublica Nonprofit Explorer

IRS Form 990 data for US nonprofit organizations, associations, and foundations. ProPublica's API provides total revenue and total assets as reported on the most recent 990 filing. This is particularly useful for industry associations, healthcare nonprofits, and trade schools that appear in the home services and healthcare categories.

Fields appended
  • Total revenue (Form 990)
  • Total assets
  • Tax year

Acquisition Fit Score

Every mapped company receives a composite acquisition fit score between 0 and 100. The score is designed to surface businesses with traits that may matter to acquisition buyers. Higher scores rank to the top of every mandate search by default.

Owner Tenure (30 pts)

How many years the business has been under the same owner. Longer tenure tends to correlate with accumulated equity, retirement motivation, and clean operational history.

Years in Business (25 pts)

Total operating age of the business. Older businesses have demonstrated survival through multiple economic cycles and tend to carry lower go-forward risk.

Revenue Estimate (20 pts)

When available from Diffbot, SBA, or SAM sources, revenue is scored against the mandate's target range. Businesses near the centre of the range score highest.

Contact Completeness (15 pts)

Businesses with available phone, website, contact-page, or Google Places signals score higher. This can correlate with active operations and reachability for outreach.

Review Quality (10 pts)

Google star rating and review count signal customer satisfaction and market presence. A high rating combined with a high review count suggests a stable customer base.
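
To make the weighting concrete, the sketch below combines the five components into a 0 to 100 score using the point values above. Only the weights come from this document; the caps (20 years of tenure, 30 years in business, 100 reviews), the linear falloff for revenue, and all field names are assumptions chosen for illustration.

```python
WEIGHTS = {
    "owner_tenure": 30,
    "years_in_business": 25,
    "revenue": 20,
    "contact": 15,
    "reviews": 10,
}


def capped_ratio(value, cap):
    """Scale value into [0, 1], treating missing values as 0."""
    if value is None:
        return 0.0
    return max(0.0, min(value / cap, 1.0))


def revenue_fit(revenue, low, high):
    """1.0 at the centre of [low, high], falling linearly to 0.0 at the edges and beyond."""
    if revenue is None or high <= low:
        return 0.0
    centre = (low + high) / 2
    half_width = (high - low) / 2
    return max(0.0, 1.0 - abs(revenue - centre) / half_width)


def contact_fit(company):
    """Fraction of contact signals that are present."""
    signals = ("phone", "website", "contact_page", "google_place_id")
    return sum(1 for s in signals if company.get(s)) / len(signals)


def review_fit(rating, count):
    """Combine rating (out of 5) and review volume (capped at 100) into [0, 1]."""
    if not rating or not count:
        return 0.0
    return capped_ratio(rating, 5.0) * capped_ratio(count, 100)


def acquisition_fit_score(company, revenue_range):
    """Return a composite score between 0 and 100."""
    low, high = revenue_range
    parts = {
        "owner_tenure": capped_ratio(company.get("owner_tenure_years"), 20),
        "years_in_business": capped_ratio(company.get("years_in_business"), 30),
        "revenue": revenue_fit(company.get("revenue"), low, high),
        "contact": contact_fit(company),
        "reviews": review_fit(company.get("rating"), company.get("review_count")),
    }
    return round(sum(WEIGHTS[k] * parts[k] for k in WEIGHTS), 1)


# Made-up company for illustration only.
example = {
    "owner_tenure_years": 10, "years_in_business": 15, "revenue": 2_000_000,
    "phone": "555-0100", "website": "https://example.com",
    "rating": 4.0, "review_count": 50,
}
print(acquisition_fit_score(example, (1_000_000, 3_000_000)))
```

Missing inputs contribute zero rather than a neutral value in this sketch, so sparsely enriched records rank lower until more signals arrive.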

Deduplication Methodology

Because multiple data sources cover overlapping geographies, deduplication is applied at the database layer. Each company is identified by a composite key of its normalized name, rounded latitude, and rounded longitude. When the same business appears in both OSM and a national registry, the record is merged rather than duplicated, and any additional fields from the second source are appended to the existing record.

-- Unique index prevents double-counting across sources (index name is illustrative)
CREATE UNIQUE INDEX IF NOT EXISTS idx_companies_dedup
  ON companies (name_norm, lat_round, lng_round);
-- A conflicting insert is skipped, so the first-inserted row is kept
INSERT OR IGNORE INTO companies (...) VALUES (...);
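
The composite-key approach can be tried end to end with Python's built-in `sqlite3` module. The sketch below is one possible way to combine the skipped insert with the field merge described above: an `INSERT OR IGNORE` followed by an `UPDATE` that fills only empty columns. The table layout, the three-decimal rounding, the normalization rule, and the function names are assumptions, not Serava's actual schema.

```python
import re
import sqlite3


def normalize_name(name):
    """Lowercase, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return " ".join(cleaned.split())


def upsert_company(conn, name, lat, lng, phone=None, website=None, precision=3):
    """Insert a company, or fill empty fields on the existing row with the same key."""
    key = (normalize_name(name), round(lat, precision), round(lng, precision))
    # Skipped when the composite key already exists, so the first row is kept.
    conn.execute(
        "INSERT OR IGNORE INTO companies (name_norm, lat_round, lng_round, name) "
        "VALUES (?, ?, ?, ?)",
        (*key, name),
    )
    # COALESCE keeps an existing value and only fills columns that are still NULL.
    conn.execute(
        "UPDATE companies SET phone = COALESCE(phone, ?), website = COALESCE(website, ?) "
        "WHERE name_norm = ? AND lat_round = ? AND lng_round = ?",
        (phone, website, *key),
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE companies (name TEXT, name_norm TEXT, lat_round REAL, "
    "lng_round REAL, phone TEXT, website TEXT)"
)
conn.execute(
    "CREATE UNIQUE INDEX idx_companies_dedup ON companies (name_norm, lat_round, lng_round)"
)

# The same made-up business arriving from two sources with slightly different details.
upsert_company(conn, "Acme Plumbing", 43.65012, -79.38004, website="https://acme.example")
upsert_company(conn, "ACME Plumbing!", 43.65049, -79.38041, phone="555-0100")
rows = conn.execute("SELECT name, phone, website FROM companies").fetchall()
print(rows)
```

One limitation of rounding is that two records for the same business can fall on opposite sides of a rounding boundary and escape the key, so a coarser precision or a follow-up proximity check may be worth considering.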

Update Cadence

The target map is not a static snapshot. Data sources refresh on independent schedules, and the enrichment layer is run against higher-intent companies first.

Source | Notes | Cadence
OpenStreetMap (Overpass API) | ~1,820 queries per run across all industry-region combinations | Every 30 days
UK Companies House | Aligned to Companies House monthly bulk data release | Monthly
Statistics Canada ODBus | ODBus is updated by Statistics Canada on an annual cycle | Annual
Ville de Montréal Registry | Open data portal updates periodically | Quarterly
SBA Loan Data | FOIA release follows US fiscal year | Annual
SAM.gov | Cross-referenced on enrichment runs | On-demand
Texas Department of Licensing and Regulation | Runs after each 30-day master database scrape and can be triggered manually from admin | Every Build All cycle
Enrichment (websites, Google, Diffbot, Wikidata) | Website crawls and public sources run first; paid sources require explicit confirmation on high-intent queues | Demand-led
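
As a small sketch of how independent schedules could be checked, the snippet below lists which scheduled sources are due on a given day. The interval values approximate the cadences above (30 days for monthly, 90 for quarterly, 365 for annual), and the source keys, function name, and dates are hypothetical. On-demand and demand-led sources are left out because they are not time-triggered.

```python
from datetime import date, timedelta

# Approximate refresh intervals in days (assumption based on the stated cadences).
REFRESH_DAYS = {
    "osm": 30,
    "companies_house": 30,
    "odbus": 365,
    "montreal_registry": 90,
    "sba_loans": 365,
}


def due_sources(last_run, today):
    """Return sources never run, or last run at least one interval ago."""
    due = []
    for source, days in REFRESH_DAYS.items():
        previous = last_run.get(source)
        if previous is None or today - previous >= timedelta(days=days):
            due.append(source)
    return due


# Made-up run history for illustration only.
last_run = {
    "osm": date(2025, 1, 1),
    "companies_house": date(2025, 1, 20),
    "odbus": date(2024, 6, 1),
    "montreal_registry": date(2024, 10, 1),
}
print(due_sources(last_run, date(2025, 2, 5)))
```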

Build a Target Map

Request access to build a mandate and map the market against your acquisition criteria.

Request Access