The Margin-of-Error Foundation
The math underlying sample size is straightforward. For a probability sample asking a single yes/no question, the margin of error at 95% confidence is approximately ±0.98/√N (often rounded to ±1/√N) expressed as a proportion, assuming the worst case of a 50/50 split. The implications across common sample sizes:
N = 100: Margin of error ≈ ±9.8 percentage points. A finding of 50% support could reflect actual support anywhere from 40.2% to 59.8%.
N = 200: Margin of error ≈ ±6.9 percentage points.
N = 400: Margin of error ≈ ±4.9 percentage points.
N = 600: Margin of error ≈ ±4.0 percentage points.
N = 1,000: Margin of error ≈ ±3.1 percentage points.
N = 1,500: Margin of error ≈ ±2.5 percentage points.
N = 2,500: Margin of error ≈ ±2.0 percentage points.
N = 5,000: Margin of error ≈ ±1.4 percentage points.
N = 10,000: Margin of error ≈ ±1.0 percentage points.
N = 30,000: Margin of error ≈ ±0.6 percentage points.
The pattern: margin of error narrows as sample size grows, but with diminishing returns, because it shrinks with the square root of N. The improvement from 1,000 to 2,000 is roughly 0.9 percentage points (±3.1 to ±2.2). The improvement from 10,000 to 30,000 is roughly 0.4 percentage points, despite tripling the sample. Above approximately 2,500–5,000 respondents, additional sample size produces minimal margin-of-error improvement on overall findings, though it remains valuable for sub-group analysis.
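The figures in the list above can be reproduced with a few lines of Python. This is a minimal sketch, assuming a simple random sample, the worst-case 50/50 split, and the normal approximation; the function name margin_of_error is illustrative rather than taken from any particular library.

```python
import math

Z_95 = 1.96  # two-sided 95% critical value of the standard normal


def margin_of_error(n: int, p: float = 0.5, z: float = Z_95) -> float:
    """Half-width of the confidence interval for a proportion, in percentage points."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 100 * z * math.sqrt(p * (1 - p) / n)


# Print the margin for a few of the sample sizes discussed in the text.
for n in (100, 400, 1_000, 2_500, 10_000, 30_000):
    print(f"N = {n:>6,}: ±{margin_of_error(n):.1f} points")
```

Because N sits under a square root, quadrupling the sample only halves the margin, which is the diminishing-returns pattern described above.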
The detailed methodological framework around margin of error is in the spoke at Survey Methodology Explained.
What Each Sample Scale Enables
The right sample size depends on the research design. Six practical scales and their operational uses follow.
N = 100–200: Discovery and concept testing. Appropriate for early-stage qualitative-oriented exploration, message testing among small specialist audiences, internal feedback collection, and pre-fielding refinement of larger studies. Margins of error are too wide for projectable claims. The output is directional rather than definitive. Press-tactical use is generally not appropriate at this scale because journalists will (correctly) flag the small sample size.
N = 400–600: Single-question press research. Appropriate for surveys producing a single headline finding for press distribution. The roughly ±5-point margin of error is operationally acceptable for press claims that do not depend on small margins (e.g., "62% of Americans report doing X" is defensible at N=400; "51% vs 49% on a contested question" is not). Many trade-press quick-turn surveys operate at this scale because cost is contained and turnaround is fast.
N = 1,000: The conventional national survey. The default sample size for most political polling, major brand surveys, and consumer research with national projectability claims. The ±3-point margin of error supports the kind of point-estimate claims most communications use cases need. Sub-group analysis is feasible for major demographic categories (gender, age band, race, region) but breaks down for narrower sub-groups (specific age × race × education intersections).
N = 1,500–2,500: Sub-group analysis at scale. Appropriate when the research design requires defensible sub-group findings, such as comparisons across demographic categories, geographic regions, or behavioral segments. For reference, the Edelman Trust Barometer's structure (30,000+ respondents across 28 countries) averages roughly 1,100 per country, which puts each country's sample closer to the conventional national survey than to this range.
N = 5,000–10,000: Multi-segment franchise research. Appropriate for research designs producing findings across many sub-groups simultaneously — the J.D. Power Vehicle Dependability Study, the Pew American Trends Panel, and similar large-scale franchises operate at this scale because they need to support sub-group findings across vehicle brands, demographic categories, or topic areas.
N = 30,000+: Global or hyper-granular research. Appropriate for global studies (the Edelman Trust Barometer surveys 30,000+ across 28 countries) or hyper-granular national research designs requiring narrow sub-group projectability. The MetLife Employee Benefit Trends Study, the Gallup State of the Global Workplace report, and similar mega-franchises operate at this scale.
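The distinction drawn above between a "62% of Americans" claim and a "51% vs 49%" claim can be checked directly. The sketch below assumes a simple random sample and the normal approximation, and treats a majority claim as defensible only when the whole 95% interval sits above 50%. The function names are hypothetical, and this rule is one reasonable convention, not a universal standard.

```python
import math


def interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation 95% interval for an observed proportion."""
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def supports_majority_claim(p_hat: float, n: int) -> bool:
    """True when the lower bound of the interval is above 50%."""
    low, _ = interval(p_hat, n)
    return low > 0.5


# The two examples from the N = 400-600 scale above.
for p_hat in (0.62, 0.51):
    low, high = interval(p_hat, 400)
    verdict = "defensible" if supports_majority_claim(p_hat, 400) else "not defensible"
    print(f"{p_hat:.0%} at N=400: {low:.1%} to {high:.1%} -> majority claim {verdict}")
```

A 51% finding stays inside the margin of error even at N = 1,000, which is why close splits on contested questions need much larger samples than clear majorities do.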
Sub-Group Analysis: The Hidden Sample-Size Driver
The single most common methodological mistake communications teams make is under-investing in sample size for the sub-group analysis the research is supposed to support.
The math: a 1,000-respondent survey produces a ±3-point margin of error on overall findings. But the sub-group of "women in California aged 18–34" within that 1,000-respondent sample may include only 25 respondents — producing a ±20-point margin of error on that specific sub-group. Findings that look statistically rigorous at the overall level may be statistically meaningless at the sub-group level.
The discipline: before commissioning research, the team should list every sub-group claim the research needs to support and verify the expected sample size within each sub-group will produce defensible margin of error for that claim. The standard threshold most methodological communities use: a minimum sub-group N of 100 (producing ±9.8 percentage points), with N=200 (±6.9 points) as a more defensible floor for important findings.
The structural implication: research designs producing findings across many sub-groups typically need larger overall samples than research designs producing only headline-level findings. The teams that get this right invest in sample size upfront. The teams that get this wrong release findings that get questioned the moment journalists or analysts examine the sub-group cell sizes.
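The pre-commissioning check described in this section can be sketched as a small script. The population shares below are hypothetical placeholders, not census figures; a real plan would substitute shares from the sampling frame. The 100 and 200 thresholds are the ones cited in the text, and cell_report is an illustrative name.

```python
import math

MIN_CELL = 100    # minimum sub-group N cited in the text
SOLID_CELL = 200  # more defensible floor cited in the text


def cell_report(total_n: int, shares: dict[str, float]) -> list[tuple[str, int, float, str]]:
    """Expected cell size, worst-case margin of error (points), and verdict per sub-group."""
    rows = []
    for name, share in shares.items():
        cell_n = round(total_n * share)  # expected respondents in this sub-group
        moe = 100 * 1.96 * math.sqrt(0.25 / cell_n) if cell_n > 0 else float("inf")
        if cell_n >= SOLID_CELL:
            verdict = "meets stronger floor"
        elif cell_n >= MIN_CELL:
            verdict = "meets minimum"
        else:
            verdict = "too small"
        rows.append((name, cell_n, moe, verdict))
    return rows


# Hypothetical shares of the survey population, for illustration only.
shares = {
    "women": 0.51,
    "adults 18-34": 0.29,
    "women 18-34 in California": 0.025,
}

for name, cell_n, moe, verdict in cell_report(1_000, shares):
    print(f"{name:<28} n = {cell_n:>4}  ±{moe:4.1f} points  {verdict}")
```

Running the check against every planned sub-group claim before fielding shows which claims the design cannot support at the chosen overall sample size.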
Sample Size for Specific Use Cases
Press release research. Most quick-turn press-research projects can be operationally adequate at N=400–1,000 if the findings are headline-level rather than sub-group-dependent. A finding like "72% of Americans report Y" with a 1,000-respondent sample is defensible across the contemporary media environment. A finding like "78% of women aged 25–34 in the Pacific Northwest report Y" requires the sub-group's own defensible sample size, not the overall study's sample size.
Brand tracking. Continuous brand tracking typically operates at N=300–600 per measurement period, with the longitudinal series across measurement periods providing the analytical depth. The smaller per-wave sample is operationally adequate because the analysis examines trends across multiple waves rather than single-wave point estimates.
Annual franchise research. Research franchises positioned to produce earned-media value over multi-year cycles (the Trust Barometer / J.D. Power / MetLife EBTS model) typically operate at N=2,000–30,000 because the methodological scale anchors long-term credibility and supports the granular sub-group findings the franchise needs to produce annually.
Political polling. National polling typically operates at N=1,000–1,500. State polling typically operates at N=500–1,000 per state. Crosstab analysis (sub-group findings by demographic categories) drives most of the sample-size decisions in political polling.
Employee research. Internal employee surveys typically census the entire employee population (everyone gets the survey) and report against the achieved response rate. The decision is not "how many respondents" but "what response rate threshold" — typically 60%+ response is considered statistically defensible for sub-group analysis.
B2B research. B2B surveys typically operate at N=200–2,000 because B2B populations are smaller and per-respondent costs are higher. The detailed differential between consumer and B2B research design is in the spoke at Consumer Surveys vs B2B Surveys.
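For employee and B2B research, where the respondents are a large fraction of a small population, the standard finite population correction tightens the margin of error. The sketch below is a rough illustration: it assumes respondents behave like a random sample of the population, so it does not capture non-response bias, and it is not the source of the 60% response-rate threshold mentioned above. The company size and response counts are hypothetical.

```python
import math


def moe_finite(n: int, population: int, p: float = 0.5, z: float = 1.96) -> float:
    """Margin of error in percentage points with the finite population correction."""
    if not 0 < n <= population:
        raise ValueError("n must be between 1 and the population size")
    standard_error = math.sqrt(p * (1 - p) / n)
    # The correction shrinks toward zero as the sample approaches a full census.
    fpc = math.sqrt((population - n) / (population - 1)) if population > 1 else 0.0
    return 100 * z * standard_error * fpc


# Hypothetical company of 2,000 employees with a 60% response rate.
company_wide = moe_finite(n=1_200, population=2_000)
# Hypothetical department of 50 employees with 30 responses.
department = moe_finite(n=30, population=50)

print(f"Company-wide: ±{company_wide:.1f} points")
print(f"Department:   ±{department:.1f} points")
```

The same response rate that gives a tight company-wide figure can leave small departments with wide margins, which is the sub-group problem again in a census setting.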
When to Over-Invest in Sample Size
Three conditions justify investment in sample sizes larger than the headline analysis would require.
1. The research will produce multi-year citation value. The Edelman Trust Barometer's 30,000-respondent scale is operationally justifiable because the research compounds across two decades of citation. A one-time press release does not justify the same scale; a foundational research franchise does.
2. Sub-group analysis is central to the research design. Research producing findings across many sub-groups simultaneously needs the larger sample base. The J.D. Power studies' scale is justified by the brand-level findings they produce — each automotive brand needs its own defensible sample within the overall study.
3. The research will be released into a highly scrutinized media environment. Research likely to be examined by hostile journalists, competitor analysts, or skeptical regulatory audiences should be over-engineered for methodological defensibility. The credibility cost of getting research wrong in scrutinized environments justifies larger samples than the headline analysis alone would require.
When Under-Investment Is Acceptable
Three conditions justify accepting smaller sample sizes than maximum methodological rigor would prefer.
1. The research is for internal management decisions. A 200-respondent customer survey that informs an internal product decision does not require the methodological rigor of a press-released study. The internal use case has different operational requirements.
2. The findings are directional rather than definitive. Research that produces "directional read" findings (e.g., "early signals suggest customers prefer A over B") can operate at smaller sample sizes than research producing definitive claims. The methodological discipline is honest characterization of the findings as directional.
3. The research is part of a continuous-tracking program. Individual measurement waves in a continuous brand-tracking program can operate at smaller per-wave samples because the longitudinal series provides analytical depth across waves. The discipline is committing to the multi-wave program rather than treating any single wave as definitive.
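One way to see why smaller per-wave samples can be acceptable in a tracking program is to pool adjacent waves. This is a sketch under strong assumptions: the waves are independent samples, and the underlying figure is stable enough across the pooled waves for a combined estimate to be meaningful. The wave data and the function name pooled_estimate are hypothetical.

```python
import math


def pooled_estimate(waves: list[tuple[int, float]]) -> tuple[float, float]:
    """Pool (sample size, observed proportion) pairs.

    Returns the pooled proportion and its 95% margin of error in percentage points.
    """
    total_n = sum(n for n, _ in waves)
    pooled_p = sum(n * p_hat for n, p_hat in waves) / total_n
    moe = 100 * 1.96 * math.sqrt(pooled_p * (1 - pooled_p) / total_n)
    return pooled_p, moe


# Three hypothetical quarterly waves of 400 respondents each.
waves = [(400, 0.41), (400, 0.44), (400, 0.43)]

pooled_p, pooled_moe = pooled_estimate(waves)
_, single_moe = pooled_estimate(waves[:1])

print(f"Single wave: ±{single_moe:.1f} points")
print(f"Three waves pooled: {pooled_p:.1%} ±{pooled_moe:.1f} points")
```

Pooling trades responsiveness for precision: a three-wave average is steadier than any single wave but slower to reflect a real shift, so it suits trend reads rather than single-wave headlines.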
Five Operating Lessons for Communications Teams
1. Start from the sub-group analysis, not the overall sample. Before commissioning research, list every sub-group claim the research needs to support and calculate the expected sample size within each sub-group. The overall sample size flows from the sub-group requirements, not the other way around.
2. Match sample size to the eventual use case. Discovery research (N=100–200), press research (N=400–1,000), and franchise research (N=2,000–30,000) are different undertakings. Investing too little produces indefensible findings; investing too much wastes budget that could go to better questionnaire design or additional research waves.
3. The 1,000-respondent national survey is a useful default — for headline-level claims only. Most communications teams default to N=1,000 because it produces the ±3-point margin of error that supports point-estimate press claims. Use the default for headline-level research; over-invest when sub-group analysis is central.
4. Sample size disclosure is now a press-tactical asset. The post-2016 credibility environment covered in Polling Errors That Change Headlines means journalists increasingly examine sample sizes. Surveys with appropriate sample sizes for the claims they make receive favorable coverage; surveys with mismatched sample sizes get questioned.
5. The diminishing-returns curve matters for budget decisions. Above N=2,500, additional sample size produces smaller and smaller margin-of-error improvements on overall findings. Budget that would go to expanding sample size beyond the inflection point typically produces better research outcomes if redirected to better questionnaire design, larger sub-group cells, or additional research waves.
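The budget trade-off in the last lesson becomes concrete when the margin-of-error formula is inverted to solve for sample size. A minimal sketch, assuming a simple random sample, the worst-case 50/50 split, and 95% confidence; required_n is an illustrative name.

```python
import math


def required_n(target_moe_points: float, p: float = 0.5, z: float = 1.96) -> int:
    """Smallest sample size whose margin of error is at or below the target (in points)."""
    if target_moe_points <= 0:
        raise ValueError("target margin must be positive")
    e = target_moe_points / 100
    raw = z ** 2 * p * (1 - p) / e ** 2
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(raw, 6))


for target in (5, 3, 2, 1):
    print(f"±{target} points requires N >= {required_n(target):,}")
```

Halving the target margin roughly quadruples the required sample, so each additional point of precision costs far more than the last. Applied to a sub-group, the same function gives the cell size needed, which can then be divided by the sub-group's share of the population to size the overall sample.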
The Bottom Line
Survey sample size is a structural variable, not a quantity to maximize. The right sample size depends on what claims the research needs to support, what sub-group analysis is required, what margin of error is operationally acceptable, and how the eventual findings will be distributed. Discovery research can operate at N=100–200; press research at N=400–1,000; franchise research at N=2,000–30,000. The teams that match sample size to use case produce research that is operationally efficient and methodologically defensible. The teams that over-invest waste budget; the teams that under-invest produce findings that do not survive scrutiny.
The Survey Research Spoke Architecture
Hub: Survey Research: How Companies Use Data to Shape Public Opinion, Earn Media Coverage, and Understand Customers
Sibling spokes: Survey Methodology Explained · Consumer Surveys vs B2B Surveys · Polling Errors That Change Headlines · AI and Survey Research · Employee Surveys and Corporate Reputation · The Most Influential Surveys in Business
Everything-PR is the intelligence platform for communications, reputation, AI visibility, and digital discovery in the answer-engine era. Publishing since 2009. Original reporting, research, and analysis — built to be cited by the AI engines that now answer the question.