BASIC CONCEPTS OF SURVEY RESEARCH
Debdulal Dutta Roy
Indian Statistical Institute, Kolkata
Venue: Andhra University
Date: 16.2.2014
What is a survey?
• It is an investigation of the characteristics of
a given population by collecting data from a sample of that
population and estimating those characteristics through the systematic use of statistical
methodology or technique.
What is a sample survey?
•
A sample survey is a survey in
which only a portion of the population, not the whole population, is surveyed.
Sample
•
It is a subset of a frame where
elements are selected based on a randomized process with a known probability of
selection.
•
Representative sample: A
representative sample is one that has all the important characteristics of the
population from which it is drawn.
What is a sampling frame?
It
is a list of all members of a population used as a basis for sampling.
What is sampling?
It
is the research strategy of collecting data from a part of population with a
view to drawing inferences about the whole.
Sample size
The
number of sampling units which are to be included in the sample.
Sampling unit
It
is one of the units into which an aggregate is divided for the purpose of
sampling, each unit being regarded as individual and indivisible when the selection is made.
Sampling fraction
The
ratio of the sample size to the population size.
Probability sampling
•
In probability sampling, each
population element has a known and nonzero probability of being selected.
Selection probabilities arise from the use of a randomized procedure, such as
random number tables.
•
It requires the existence of a
sampling frame from which the sample can be drawn. A major advantage is that
statistical theory can be employed to derive the properties of the sample
estimators. Bias in sample selection is also avoided.
Non-probability sampling
•
It is any form of sampling that fails
to meet the conditions for probability sampling.
Non-probability sampling techniques
•
Haphazard,
convenience or accidental sampling:
The sampled elements are chosen for convenience or haphazardly, with the
purpose of making inference about some general population. Examples include a
sample of volunteers, street corner interviews, pull-out questionnaires in a
magazine.
•
Judgment or
purposive sampling or expert choice: Sampled units are selected carefully to provide a 'representative
sample'. This is possible when the expert has a good deal of information about the
population elements. Rather than relying on random choice, each
unit is selected purposefully.
•
Quota
sampling: The researcher has quotas of respondents of different types to
interview. For example, an interviewer may be required to interview 7 men under 55 years and 5
men 55 years or older.
Probability sampling
•
Before I can explain the various
probability methods we have to define some basic terms. These are:
•
N = the number
of cases in the sampling frame
•
n = the number
of cases in the sample
•
NCn = the number
of combinations (subsets) of n from N
•
f = n/N = the
sampling fraction
•
That's it. With those terms defined
we can begin to define the different probability sampling methods.
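The terms above can be computed directly. The following is a minimal sketch in Python; the values N = 1200 and n = 100 are illustrative assumptions, not part of the definitions.

```python
import math

N = 1200  # number of cases in the sampling frame (illustrative value)
n = 100   # number of cases in the sample (illustrative value)

# NCn: number of distinct subsets of size n that can be drawn from N
num_possible_samples = math.comb(N, n)

# f = n / N: the sampling fraction
f = n / N

print(f"sampling fraction f = {f:.4f}")
print(f"NCn is a number with {len(str(num_possible_samples))} digits")
```

Even for a modest frame, NCn is very large, which is why samples are drawn with a random procedure rather than by listing all possible subsets.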
Simple Random Sampling
•
This is the simplest form of
probability sampling. To select a simple random sample you need to make a
numbered list of all the units in the population from which you want to draw a
sample or use an already existing one (sampling frame).
•
Objective: To select n units out of N such that each of the NCn
possible samples has an equal chance of being selected.
NCn = the number of combinations (subsets) of n from N
•
Procedure: Use a table of random numbers, a computer random number
generator, or a mechanical device to select the sample.
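The procedure can be sketched with a computer random number generator. This is only an illustration: the function name simple_random_sample and the 20-unit frame are assumptions made for the example.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every subset of size n is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical sampling frame: a numbered list of 20 units
frame = list(range(1, 21))
sample = simple_random_sample(frame, 5, seed=42)
print(sample)
```

Sampling without replacement is used here so that no unit can appear twice in the sample.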
Systematic sampling
•
Selection: In systematic sampling, sampled units are chosen at
regular intervals from the sampling frame. For this method we randomly select a
number to tell us where to start selecting individuals from the list.
•
Example: A systematic
sample is to be selected from 1,200 students at a school. The sample size
selected is 100. The sampling fraction is 100/1200 = 1/12. The sampling interval is
therefore 12. The number of the first student to be included in the sample is
chosen randomly, for example, by blindly picking one out of 12 pieces of paper,
numbered 1 to 12. If number 6 is picked, then every twelfth student will be
included in the sample, starting with student number 6, until 100 students are
selected. The numbers selected would be 6, 18, 30, 42, etc.
•
Advantage: Systematic
sampling is usually less time-consuming and easier to perform than simple
random sampling.
•
Disadvantage: However, there is a risk of bias, as the sampling interval
may coincide with a systematic variation in the sampling frame. For instance,
if we want to select a random sample of days on which to count clinic attendance,
systematic sampling with a sampling interval of 7 days would be inappropriate,
as all study days would fall on the same day of the week, which might, for
example, be a market day.
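The school example can be sketched in Python as follows. The function name systematic_sample is an assumption for illustration, and the sketch assumes that N is an exact multiple of n.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return the 1-based positions selected by systematic sampling.

    Assumes N is an exact multiple of n, so the interval k = N // n
    covers the whole frame.
    """
    k = N // n  # sampling interval
    if start is None:
        # random start between 1 and k, like picking one of k pieces of paper
        start = random.Random(seed).randint(1, k)
    return [start + i * k for i in range(n)]

# 1,200 students, sample of 100, and number 6 picked as the starting point
positions = systematic_sample(1200, 100, start=6)
print(positions[:4])
```

Leaving start unset lets the function pick the starting point at random, which is the only random step in this method.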
Stratified sampling
•
Assumption: This technique
requires classifying the total population into strata. The sampling frame is divided into strata
on the assumption that stratum membership has a significant effect on the dependent
variable.
•
Requirement: It requires a
large sample. Stratified sampling is only possible when we know what proportion
of the study population belongs to each group we are interested in. An
advantage of stratified sampling is that it is possible to take a relatively
large sample from a small group in the study population. This makes it possible
to get a sample that is big enough to enable researchers to draw valid
conclusions about a relatively small group without having to collect an
unnecessarily large (and hence expensive) sample of the other, larger groups.
However, in doing so, unequal sampling fractions are used and it is important
to correct for this when generalizing our findings to the whole study
population.
•
Example: A survey is
conducted on self-medication practices in a district comprising 20,000
households, of which 20% are urban and 80% rural. It is suspected that in urban
areas self-medication is less common due to the vicinity of health centres. A
decision is made to include 100 urban households (out of 4,000, which gives a 1
in 40 sample) and 200 rural households (out of 16,000, which gives a 1 in 80
sample). This allows for a good comparison between urban and rural
self-medication practices. Because we know the sampling fraction for both
strata, the rates for self-medication for all the district households can be
calculated.
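The correction for unequal sampling fractions can be sketched as a weighted estimate. The stratum and sample sizes come from the example above; the counts of self-medicating households (30 urban, 100 rural) are hypothetical values invented purely for illustration.

```python
# Stratum sizes (N) and sample sizes (n) are taken from the example;
# the self_medicating counts are hypothetical.
strata = {
    "urban": {"N": 4000, "n": 100, "self_medicating": 30},
    "rural": {"N": 16000, "n": 200, "self_medicating": 100},
}

def weighted_rate(strata):
    """District rate: each stratum's sample rate is weighted by its population size."""
    total_N = sum(s["N"] for s in strata.values())
    estimated_total = sum(s["N"] * s["self_medicating"] / s["n"] for s in strata.values())
    return estimated_total / total_N

def unweighted_rate(strata):
    """Naive rate that ignores the unequal sampling fractions."""
    total_n = sum(s["n"] for s in strata.values())
    return sum(s["self_medicating"] for s in strata.values()) / total_n

print(f"weighted:   {weighted_rate(strata):.3f}")
print(f"unweighted: {unweighted_rate(strata):.3f}")
```

With these hypothetical counts the naive rate understates the district rate, because urban households are over-represented in the sample (1 in 40 against 1 in 80).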
Proportional Stratified Random Sampling
•
Proportional
or quota random sampling involves dividing your population into
homogeneous subgroups and then taking a simple random sample in each subgroup.
In more formal terms:
•
Objective: Divide the population into non-overlapping groups (i.e.,
strata) of sizes N1, N2, N3, ..., Ni, such that N1 + N2 + N3 + ... + Ni = N. Then
draw a simple random sample with sampling fraction f = n/N in each stratum.
•
Advantage: It
represents not only the overall population, but also key subgroups of the
population, especially small minority groups.
•
Strata
properties: Each stratum should be internally homogeneous;
otherwise the precision of the estimates suffers.
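The objective above can be sketched as follows. The function name proportional_stratified_sample, the strata labels and the household identifiers are assumptions made for illustration; the same sampling fraction f is applied in every stratum.

```python
import random

def proportional_stratified_sample(strata, f, seed=None):
    """Take a simple random sample with the same fraction f from every stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        n_h = round(f * len(units))  # stratum sample size, proportional to stratum size
        sample[name] = rng.sample(units, n_h)
    return sample

# Hypothetical frame: 4,000 urban and 16,000 rural household identifiers
strata = {
    "urban": [f"U{i}" for i in range(4000)],
    "rural": [f"R{i}" for i in range(16000)],
}
sample = proportional_stratified_sample(strata, f=300 / 20000, seed=0)
print({name: len(units) for name, units in sample.items()})
```

Because the fraction is the same everywhere, each stratum's share of the sample matches its share of the population, so no reweighting is needed when generalizing.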
Cluster (Area) Random Sampling
•
Use: When
we have to sample a population that is dispersed across a wide geographic region,
a simple random sample would require covering a lot of ground geographically in order to get
to each of the sampled units. Cluster sampling addresses this problem.
•
Advantage: It reduces
the researcher's travelling time, cost and
effort.
•
Steps:
–
divide population into clusters
(usually along geographic boundaries)
–
randomly sample clusters
–
measure all units within
sampled clusters
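The three steps can be sketched as below. The function name cluster_sample and the village and household identifiers are hypothetical and serve only to illustrate the steps.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly choose whole clusters, then include every unit in the chosen clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)  # randomly sample clusters
    units = [unit for c in chosen for unit in clusters[c]]  # measure all units within them
    return chosen, units

# Step 1: a population already divided into clusters (8 hypothetical villages, 10 households each)
clusters = {
    f"village_{i}": [f"v{i}_hh{j}" for j in range(10)]
    for i in range(1, 9)
}
chosen, units = cluster_sample(clusters, 3, seed=7)
print(chosen, len(units))
```

Only the chosen villages need to be visited, which is where the saving in travel comes from.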
Multi-Stage Sampling
•
It combines different sampling
methods in successive stages.
•
Example: consider the idea of sampling New York
State residents for face-to-face interviews. Clearly we would want to do some
type of cluster sampling as the first stage of the process. We might sample
townships or census tracts throughout the state. But in cluster sampling we
would then go on to measure everyone in the clusters we select. Even if we are
sampling census tracts we may not be able to measure everyone who is in
the census tract. So, we might set up a stratified sampling process within the
clusters. In this case, we would have a two-stage sampling process with
stratified samples within cluster samples. Or, consider the problem of sampling
students in grade schools. We might begin with a national sample of school
districts stratified by economics and educational level. Within selected
districts, we might do a simple random sample of schools. Within schools, we
might do a simple random sample of classes or grades. And, within classes, we
might even do a simple random sample of students. In this case, we have three
or four stages in the sampling process and we use both stratified and simple
random sampling. By combining different sampling methods we are able to achieve
a rich variety of probabilistic sampling methods that can be used in a wide range
of social research contexts.
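A two-stage version of this idea can be sketched as follows: clusters are sampled first, and then a simple random sample is drawn inside each selected cluster. The function name two_stage_sample and the school and student identifiers are hypothetical; a real design could add further stages or stratify within clusters.

```python
import random

def two_stage_sample(clusters, num_clusters, units_per_cluster, seed=None):
    """Stage 1: randomly sample clusters. Stage 2: simple random sample within each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)
    return {
        c: rng.sample(clusters[c], min(units_per_cluster, len(clusters[c])))
        for c in chosen
    }

# Hypothetical frame: 6 schools with 40 students each
schools = {
    f"school_{i}": [f"s{i}_student{j}" for j in range(40)]
    for i in range(1, 7)
}
sample = two_stage_sample(schools, num_clusters=2, units_per_cluster=5, seed=3)
print({school: len(students) for school, students in sample.items()})
```

Unlike plain cluster sampling, not everyone in a selected cluster is measured, which keeps fieldwork manageable when clusters are large.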
Summary
•
There are two types of sampling
techniques – probabilistic and non-probabilistic.
•
Probabilistic sampling requires that every
population element has a known, nonzero probability of selection, whereas non-probabilistic sampling does not meet this condition.