Saturday, February 15, 2014


Debdulal Dutta Roy
Indian Statistical Institute, Kolkata
Venue: Andhra University
Date: 16.2. 2014

What is a survey?

       A survey is an investigation of the characteristics of a given population: data are collected from a sample of that population, and the population's characteristics are estimated through the systematic use of statistical methods.

What is a sample survey?

       A sample survey is a survey in which only a portion of the population, and not the whole population, is surveyed.


       The sample is a subset of a frame whose elements are selected by a randomized process with a known probability of selection.

        Representative sample: A representative sample is one that has all the important characteristics of the population from which it is drawn.

What is a sampling frame?

It is a list of all members of a population used as a basis for sampling.

What is sampling?

It is the research strategy of collecting data from a part of the population with a view to drawing inferences about the whole.

Sample size

The number of sampling units to be included in the sample.

Sampling unit

It is one of the units into which an aggregate is divided for the purpose of sampling, each unit being regarded as individual and indivisible when the selection is made.

Sampling fraction

The ratio of the sample size to the population size.

Probability sampling

       In probability sampling, each population element has a known and nonzero probability of being selected. Selection probabilities arise from the use of a randomized procedure, such as random number tables.

       It requires the existence of a sampling frame from which the sample can be drawn. The major advantage is that statistical theory can be employed to derive the properties of the sample estimators. Bias in sample selection is also avoided.

Non-probability sampling

       It is any form of sampling that fails to meet the conditions for probability sampling.

Non-probability sampling techniques

         Haphazard, convenience or accidental sampling: The sampled elements are chosen for convenience or haphazardly, with the purpose of making inference about some general population. Examples include a sample of volunteers, street corner interviews, pull-out questionnaires in a magazine.

         Judgment or purposive sampling or expert choice: Sampled units are selected carefully to provide a ‘representative sample’. This is possible when the expert has a good deal of information about the population elements. For example, rather than relying on random choice, each individual is selected purposefully.

         Quota sampling: The researcher has quotas of respondents of different types to interview. For example, an interviewer may be required to interview 7 men under 55 years and 5 men 55 years or older.
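
         The quota example above can be sketched in code. This is only an illustration: the function names (quota_sample, age_group) and the stream of passers-by are hypothetical, and it assumes the interviewer simply accepts people in the order they arrive until each quota is full. No randomization is involved, which is why this is a non-probability method.

```python
def quota_sample(respondents, quotas, classify):
    """Accept respondents in arrival order until every quota is filled."""
    remaining = dict(quotas)  # copy so the caller's quotas are not modified
    selected = []
    for person in respondents:
        group = classify(person)
        if remaining.get(group, 0) > 0:
            selected.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas are met, stop interviewing
    return selected


def age_group(person):
    """Classify a respondent into one of the two quota groups."""
    return "under_55" if person["age"] < 55 else "55_or_older"


# Quotas from the example: 7 men under 55 and 5 men aged 55 or older.
quotas = {"under_55": 7, "55_or_older": 5}

# Hypothetical stream of men met by the interviewer, ages 20 to 79.
passers_by = [{"age": a} for a in range(20, 80)]

selected = quota_sample(passers_by, quotas, age_group)
print(len(selected))  # 12
```

         Who ends up in the sample depends entirely on who happens to arrive first, so selection probabilities are unknown.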

Probability sampling

        Before explaining the various probability sampling methods, we have to define some basic terms. These are:

        N = the number of cases in the sampling frame

        n = the number of cases in the sample

        NCn = the number of combinations (subsets) of n from N

        f = n/N = the sampling fraction

        That's it. With those terms defined we can begin to define the different probability sampling methods.
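
        As a small worked illustration of these terms, the sketch below computes f and NCn for made-up values of N and n (the numbers 10 and 3 are assumptions chosen only to keep the output readable).

```python
from math import comb

N = 10  # number of cases in the sampling frame (illustrative value)
n = 3   # number of cases in the sample (illustrative value)

f = n / N               # sampling fraction
n_subsets = comb(N, n)  # NCn: number of distinct subsets of size n

print(f)          # 0.3
print(n_subsets)  # 120
```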

Simple Random Sampling

        This is the simplest form of probability sampling. To select a simple random sample you need to make a numbered list of all the units in the population from which you want to draw a sample or use an already existing one (sampling frame).

        Objective: To select n units out of N such that each of the NCn possible samples has an equal chance of being selected.


        Procedure: Use a table of random numbers, a computer random number generator, or a mechanical device to select the sample.
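
        A computer random number generator can do this in a few lines. The sketch below is one possible way, using Python's standard random module; the frame of 1,200 numbered units and the seed are assumptions made for illustration.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every subset of size n is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(frame, n)


# Hypothetical sampling frame: units numbered 1 to 1200.
frame = list(range(1, 1201))
sample = simple_random_sample(frame, 100, seed=42)
print(len(sample))  # 100
```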

Systematic sampling

       Selection: In systematic sampling, sampled units are chosen at regular intervals from the sampling frame. For this method we randomly select a number to tell us where to start selecting individuals from the list.

       Example: A systematic sample is to be selected from 1,200 students at a school. The sample size is 100. The sampling fraction is 100/1200, or 1 in 12, so the sampling interval is 1200/100 = 12. The number of the first student to be included in the sample is chosen randomly, for example, by blindly picking one out of 12 pieces of paper, numbered 1 to 12. If number 6 is picked, then every twelfth student will be included in the sample, starting with student number 6, until 100 students are selected. The numbers selected would be 6, 18, 30, 42, etc.
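
       The school example can be reproduced with a short sketch. The function name systematic_sample is hypothetical, and the code assumes N is an exact multiple of n, as it is here (1200 / 100 = 12).

```python
import random


def systematic_sample(N, n, start=None, seed=None):
    """Select every k-th unit from a frame numbered 1..N, where k = N // n."""
    k = N // n  # sampling interval
    if start is None:
        # Random start between 1 and k, like drawing one of k slips of paper.
        start = random.Random(seed).randint(1, k)
    return [start + i * k for i in range(n)]


# Start fixed at 6 to match the example in the text.
sample = systematic_sample(1200, 100, start=6)
print(sample[:4])  # [6, 18, 30, 42]
```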

       Advantage:  Systematic sampling is usually less time-consuming and easier to perform than simple random sampling.

       Disadvantage: However, there is a risk of bias, as the sampling interval may coincide with a systematic variation in the sampling frame. For instance, if we want to select a random sample of days on which to count clinic attendance, systematic sampling with a sampling interval of 7 days would be inappropriate, as all study days would fall on the same day of the week, which might, for example, be a market day.

Stratified sampling

          Assumption: This technique requires classifying the total population into strata. The sampling frame is divided into strata on the assumption that stratum membership has a significant effect on the dependent variable.

          Requirement: It requires a large sample. Stratified sampling is only possible when we know what proportion of the study population belongs to each group we are interested in. An advantage of stratified sampling is that it is possible to take a relatively large sample from a small group in the study population. This makes it possible to get a sample that is big enough to enable researchers to draw valid conclusions about a relatively small group without having to collect an unnecessarily large (and hence expensive) sample of the other, larger groups. However, in doing so, unequal sampling fractions are used and it is important to correct for this when generalizing our findings to the whole study population.

          Example:  A survey is conducted on self-medication practices in a district comprising 20,000 households, of which 20% are urban and 80% rural. It is suspected that in urban areas self-medication is less common due to the vicinity of health centres. A decision is made to include 100 urban households (out of 4,000, which gives a 1 in 40 sample) and 200 rural households (out of 16,000, which gives a 1 in 80 sample). This allows for a good comparison between urban and rural self-medication practices. Because we know the sampling fraction for both strata, the rates for self-medication for all the district households can be calculated. 
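
          The correction for unequal sampling fractions can be sketched as a weighted average. The stratum sizes and sample sizes below come from the example; the counts of self-medicating households (30 and 90) are invented purely for illustration and are not survey results.

```python
# Stratum sizes (N) and sample sizes (n) from the example above.
# The "self_medicating" counts are made-up numbers for illustration only.
strata = {
    "urban": {"N": 4000, "n": 100, "self_medicating": 30},
    "rural": {"N": 16000, "n": 200, "self_medicating": 90},
}


def weighted_rate(strata):
    """District rate: each stratum rate weighted by its share of the population."""
    total_N = sum(s["N"] for s in strata.values())
    return sum(
        (s["N"] / total_N) * (s["self_medicating"] / s["n"])
        for s in strata.values()
    )


def pooled_rate(strata):
    """Naive rate that ignores the unequal sampling fractions."""
    cases = sum(s["self_medicating"] for s in strata.values())
    sampled = sum(s["n"] for s in strata.values())
    return cases / sampled


print(round(weighted_rate(strata), 2))  # 0.42
print(round(pooled_rate(strata), 2))    # 0.4
```

          With these invented counts the naive pooled rate understates the district rate, because urban households are over-represented in the sample (1 in 40 against 1 in 80).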

Proportional Stratified Random Sampling

         Proportional or quota random sampling involves dividing your population into homogeneous subgroups and then taking a simple random sample in each subgroup. In more formal terms:

         Objective: Divide the population into non-overlapping groups (i.e., strata) N1, N2, N3, ... Ni, such that N1 + N2 + N3 + ... + Ni = N. Then draw a simple random sample with the same sampling fraction f = n/N in each stratum.

         Advantage: It represents not only the overall population, but also key subgroups of the population, especially small minority groups.

         Strata properties: Each stratum should be internally homogeneous; otherwise much of the gain in statistical precision is lost.
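
         A minimal sketch of proportional allocation follows. The function name and the two strata (urban and rural unit IDs) are hypothetical; stratum sample sizes are rounded, so in general they may not add up exactly to n.

```python
import random


def proportional_stratified_sample(strata, n, seed=None):
    """Simple random sample within each stratum, using the same fraction n/N."""
    rng = random.Random(seed)
    N = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        n_h = round(n * len(units) / N)  # stratum share of the total sample
        sample[name] = rng.sample(units, n_h)
    return sample


# Hypothetical frame: 4,000 urban and 16,000 rural unit IDs.
strata = {
    "urban": list(range(0, 4000)),
    "rural": list(range(4000, 20000)),
}
sample = proportional_stratified_sample(strata, 300, seed=1)
print({name: len(units) for name, units in sample.items()})
# {'urban': 60, 'rural': 240}
```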

Cluster (Area) Random Sampling

         Use: When we have to sample a population that is dispersed across a wide geographic region. With simple random sampling you would have to cover a lot of ground geographically in order to get to each of the units you sampled.

         Advantage: It reduces the researcher's travelling time, money and energy.


         Procedure:

      Divide the population into clusters (usually along geographic boundaries).

      Randomly sample clusters.

      Measure all units within the sampled clusters.
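
         The three steps above can be sketched as follows. The cluster names and household IDs are invented for illustration, and the function name cluster_sample is hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters, then take every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample the clusters
    units = [u for name in chosen for u in clusters[name]]  # all units in each
    return chosen, units


# Hypothetical population already divided into four geographic clusters.
clusters = {
    "village_A": [1, 2, 3],
    "village_B": [4, 5],
    "village_C": [6, 7, 8, 9],
    "village_D": [10],
}
chosen, units = cluster_sample(clusters, 2, seed=1)
print(len(chosen))  # 2
```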

 Multi-Stage Sampling

         It is a combination of two or more sampling methods applied in successive stages.

         Example: consider the idea of sampling New York State residents for face-to-face interviews. Clearly we would want to do some type of cluster sampling as the first stage of the process. We might sample townships or census tracts throughout the state. But in cluster sampling we would then go on to measure everyone in the clusters we select. Even if we are sampling census tracts we may not be able to measure everyone who is in the census tract. So, we might set up a stratified sampling process within the clusters. In this case, we would have a two-stage sampling process with stratified samples within cluster samples. Or, consider the problem of sampling students in grade schools. We might begin with a national sample of school districts stratified by economics and educational level. Within selected districts, we might do a simple random sample of schools. Within schools, we might do a simple random sample of classes or grades. And, within classes, we might even do a simple random sample of students. In this case, we have three or four stages in the sampling process and we use both stratified and simple random sampling. By combining different sampling methods we are able to achieve a rich variety of probabilistic sampling methods that can be used in a wide range of social research contexts.
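
         A two-stage design can be sketched by sampling clusters first and then sampling units inside each selected cluster. For simplicity this sketch uses a simple random sample at the second stage rather than the stratified second stage described in the example; the tract names, household IDs and function name are all hypothetical.

```python
import random


def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: sample clusters. Stage 2: simple random sample within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }


# Hypothetical frame: 20 census tracts with 50 household IDs each.
clusters = {
    f"tract_{i}": list(range(i * 50, (i + 1) * 50))
    for i in range(20)
}
sample = two_stage_sample(clusters, n_clusters=3, n_per_cluster=10, seed=7)
print({name: len(units) for name, units in sample.items()})
```

         Unlike cluster sampling, only part of each selected cluster is measured, which is what makes large clusters such as census tracts workable.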


       There are two types of sampling techniques: probabilistic and non-probabilistic.

       Probabilistic sampling rests on random selection with known, nonzero selection probabilities, which is what allows statistical theory to be applied to the sample estimators; non-probability sampling offers no such basis.
