Friday 22 October 2010

The significance of n

Rarely does one letter stir up as much passion as ‘n’ does in statistical research. Sample size is given considerable attention: a study with tens of thousands of responses is valued highly and considered absolutely reliable, while a study with a few hundred respondents may be viewed with suspicion and its results (conveniently) dismissed as nonsense or as the product of a poorly selected sample. Whether the population consists of a few hundred units or hundreds of thousands, or whether statistically significant results were achieved, does not seem to matter in these perceptions.
When cross-tabulating and analyzing small subsets of the collected data, more data is obviously needed, but even then relevant results can be found with a much smaller n than is usually imagined. At the other extreme, when comparing groups of more than 500 respondents each, statistical tests lose much of their usefulness: even fairly small differences turn out to be statistically significant. It should not be forgotten, however, that such a difference might not be relevant in any way, and interpreting it might even be misleading. Reliable does not equal valid or relevant.
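To illustrate how statistical significance depends on n, here is a minimal sketch of a two-proportion z-test in Python. The figures are made up for illustration (50 % versus 45 % agreement in two equally sized groups), and the function name is hypothetical rather than taken from any particular library. It assumes simple random samples and the usual normal approximation.

```python
import math

def two_proportion_z_test(p1, n1, p2, n2):
    """Return (z, two-sided p-value) for the difference between two sample proportions."""
    # Pooled proportion under the null hypothesis of no difference
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    # Standard error of the difference between the two proportions
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Same five-point difference (illustrative numbers), different group sizes
for n in (100, 500, 5000):
    z, p = two_proportion_z_test(0.50, n, 0.45, n)
    print(f"n per group = {n:>5}: z = {z:.2f}, p = {p:.4f}")
```

The observed difference is identical in every row; only the p-value shrinks as n grows. Whether a five-point difference matters for the decision at hand is a separate question that the test cannot answer.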

What is a relevant sample size?
Perhaps sample size should be approached with a slightly more open mind than just from the viewpoint of scientific accuracy. A researcher should always consider what is relevant for the research at hand. Of course, it is understandable that in medical research, for example, or when monitoring the manufacturing process of a large factory, margins of error should be extremely small: even small errors can cause great damage to health or business. However, when researching inherently imprecise phenomena, such as customer behavior, the accuracy of the results simply does not carry the same importance. When looking at people's intentions, opinions and other business issues, it is more important to focus on the relationships between different factors, on observed trends and on changes in these.
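To put rough numbers on the margin of error, the following sketch uses the standard formula for a proportion with an optional finite population correction. It assumes simple random sampling, a 95 % confidence level (z = 1.96) and the worst-case proportion p = 0.5. The function name and the example sizes are illustrative assumptions.

```python
import math

def margin_of_error(n, population=None, p=0.5, z=1.96):
    """Approximate margin of error for a proportion estimated from n respondents."""
    if population is not None and n > population:
        raise ValueError("sample cannot be larger than the population")
    # Standard error of a proportion under simple random sampling
    se = math.sqrt(p * (1 - p) / n)
    if population is not None and population > 1:
        # Finite population correction: matters when n is a large share of the population
        se *= math.sqrt((population - n) / (population - 1))
    return z * se

# Illustrative comparison: 300 respondents drawn from populations of different sizes
for size in (1_000, 100_000, None):
    label = "very large" if size is None else str(size)
    print(f"population {label:>10}: +/- {100 * margin_of_error(300, size):.1f} points")
```

Under these assumptions the same 300 respondents give a tighter estimate when they make up a large share of a small population, which is why n cannot be judged without knowing the population size.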

Sample size vs. question formulation
What makes criticism of a small sample size particularly interesting is that question formulation and survey design are very rarely discussed. These, however, have much greater significance for accurate results than the number of respondents. I will probably write more about this topic later on. The representativeness of the respondents and the size of the population are also commonly ignored. Studies can be found where the number of respondents even exceeds the size of the relevant target group. When the questions are relevant to marketing decision makers, there is just no sense in targeting the survey at anyone somehow involved with the marketing function just to get more respondents.

Conclusion
My simple hint to a researcher who wants relevant results: focus more on what questions you ask, how you ask them and who is responding to your survey than on how many respondents you manage to lure. Accurate targeting also reduces people's perceived "burden of questionnaires" and in the long term will improve all of our chances of getting answers to the questions we feel are important.


Written by Petteri Pohto
Research Director, M.Sc. (Econ.)
Scan Survey Service