Debdulal Dutta Roy
Psychology Research Unit
Indian Statistical Institute
203, B. T. Road
Kolkata - 700108
A. CHARACTERISTICS OF A GOOD QUESTIONNAIRE
1. It is made of a set of questions or items having good discrimination power
2. It measures response consistency over time and across its internal structure
3. It measures what it intends to measure
4. It is free from subjectivity: there is uniformity in instructions, scoring and evaluation
5. It has norms for the assessment of individual differences
B. QUESTIONNAIRE IN EPIDEMIOLOGY
1. A questionnaire is a device for assessing individual differences in responses through a set of questions. A good questionnaire must be reliable and valid.
2. The question is the unit of the questionnaire. A questionnaire measures some construct, i.e., the abstraction of a concept. The set of questions measures some characteristics of that construct.
3. A questionnaire is of two types: unidimensional and multidimensional. Sometimes a construct is defined by multiple dimensions; in that case, the questionnaire becomes multidimensional. It is unidimensional when the construct is measured with a single characteristic. For example, socio-economic status may be measured with only level of income (unidimensional), or with housing conditions besides level of income (multidimensional).
4. A questionnaire can be administered in paper-and-pencil format or through computer-assisted administration. In the paper-and-pencil format, a supportive interview can be provided when the respondent cannot follow the meaning of the questions.
5. Each item includes two things: the item stem and the response categories.
6. The number of items in the questionnaire depends on the scope of the construct, its dimensionality, the acceptable error probability, respondent characteristics and the time allocated for data collection.
7. One item should not measure multiple constructs. Therefore, the item stem should be a simple sentence with a single finite verb.
8. For respondents with lower intelligence, or with a highly inhibitive temperament, the item stem should not be complex. It should be easy to understand. The researcher can use interrogative sentences with yes and no response categories.
9. In biomedical research, the questionnaire is useful in epidemiological surveys. Three types of data can be provided: descriptive epidemiology, analytical epidemiology and evaluation epidemiology.
9.1 Descriptive epidemiology: It determines the distribution of a disease. It describes the health problem, its frequency, those affected, where, and when. The events of interest are defined in terms of the time period, the place and the population at risk.
9.2 Analytical epidemiology: It compares those who are ill with those who are not in order to identify the risk of disease or protective factors (determinants of a disease). It examines how the event (illness, death, malnutrition, injury) is caused (e.g. environmental and behavioural factors) and why it is continuing. Standard mathematical and statistical procedures are used.
Example: Investigating an outbreak of an unknown disease in a displaced population settlement.
9.3 Evaluation epidemiology: It examines the relevance, effectiveness and impact of different programme activities in relation to the health of the affected populations.
Example: Evaluating a malaria control programme for displaced populations.
Reliability
Reliability refers to the consistency of scores obtained by the same persons when re-examined with the same questionnaire on different occasions, or with different sets of equivalent items, or under other variable examining conditions (Anastasi, 1990). It indicates the extent to which individual differences in questionnaire scores are attributable to "true" differences in the characteristics under consideration and the extent to which they are attributable to chance errors.
The reliability of a questionnaire is given by the proportion of true variance, resulting from the characteristic under consideration, to the total variance, which also contains error variance resulting from factors irrelevant to the present situation. Four principal techniques are used for measuring the reliability of questionnaire scores:
Test–retest reliability:
Reliability is tested by repeating the identical questionnaire on a second occasion. In this technique, the error variance may result in part from uncontrolled testing conditions, such as extreme changes in weather, sudden noises and other distractions. To some extent, however, it arises from changes in the condition of the questionnaire takers themselves, such as illness, emotional strain, worry, recent experiences of a pleasant or unpleasant nature and the like. Pearson's product-moment correlation coefficient can be used to assess test–retest reliability when the sample size is large.
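As a minimal sketch (not from the source), the test–retest coefficient is simply the Pearson product-moment correlation between total scores from the two administrations; the scores below are hypothetical:

import numpy as np

# Hypothetical total scores of the same ten respondents on two occasions
first = np.array([22, 30, 25, 28, 35, 18, 27, 31, 24, 29], dtype=float)
second = np.array([24, 29, 27, 27, 36, 20, 25, 33, 23, 30], dtype=float)

# Test-retest reliability: Pearson product-moment correlation coefficient
r_tt = np.corrcoef(first, second)[0, 1]
print(f"Test-retest reliability: {r_tt:.3f}")

The same computation serves alternate-form reliability below, with the second administration replaced by scores on the parallel form.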
Alternate-form reliability:
Instead of repeating the same questionnaire on a second occasion, a parallel form having the same characteristics as the original form is administered in a successive session. The error variance in this case represents fluctuations in performance from one set of items to another. Under this condition, the reliability coefficient becomes an index of equivalence of the two forms of the questionnaire. This method is satisfactory when sufficient time has intervened between the administrations of the two forms to weaken or eliminate memory and practice effects. In developing alternate forms, care must be exercised to match the materials for content, difficulty and form; and precautions must be taken not to have the items in the two forms too similar. If possible, an interval of at least two to four weeks should be allowed between the administrations of the two forms.
Split-half reliability:
In this method, the questionnaire is divided into two equivalent halves, the correlation between scores on the two halves is computed, and the reliability coefficient of the whole questionnaire is estimated with the Spearman-Brown prophecy formula:
Reliability of the whole test = (2 × reliability coefficient of the half-test) / (1 + reliability coefficient of the half-test)
Any difference between a person's scores on the two halves represents the error variance. This type of reliability coefficient is sometimes called a coefficient of internal consistency, since only a single administration of a single form is required. The split-half method is employed when it is neither possible to construct parallel forms of the test nor advisable to repeat the test itself. This method has a few advantages: (a) data are collected on one occasion and (b) it provides a good assessment of internal consistency. This method cannot be used when the sample items in the two halves are not correlated. It is also not applicable where the statements are arranged in order of difficulty.
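A minimal sketch (not from the source) of the split-half procedure, using a hypothetical item-response matrix and an odd-even split:

import numpy as np

# Hypothetical item responses: rows = respondents, columns = items
X = np.array([
    [4, 3, 5, 4, 2, 3, 4, 5],
    [2, 2, 3, 2, 1, 2, 3, 2],
    [5, 4, 4, 5, 4, 5, 4, 4],
    [3, 3, 2, 3, 2, 3, 3, 3],
    [1, 2, 1, 2, 1, 1, 2, 1],
    [4, 5, 4, 4, 3, 4, 5, 4],
], dtype=float)

# Odd-even split: sum the odd-numbered and even-numbered items separately
odd_half = X[:, 0::2].sum(axis=1)
even_half = X[:, 1::2].sum(axis=1)

# Correlation between the two half-test scores
r_half = np.corrcoef(odd_half, even_half)[0, 1]

# Spearman-Brown prophecy formula for the full-length questionnaire
r_full = (2 * r_half) / (1 + r_half)
print(f"Half-test correlation: {r_half:.3f}")
print(f"Spearman-Brown corrected reliability: {r_full:.3f}")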
Rational equivalence:
This technique is applicable when responses are binary in nature, such as Yes/No. It stresses the inter-correlation coefficients of the items in the questionnaire and the correlation coefficients of the items with the questionnaire as a whole. It utilizes a single administration of a single form and is based on the consistency of responses to all items in the questionnaire (inter-item consistency). This inter-item consistency is influenced by two sources of error variance: (i) content sampling and (ii) heterogeneity of the behavior domain sampled. The formula is given below:
rtt = (n / (n − 1)) × ((s²t − Σpq) / s²t)
in which,
rtt = reliability coefficient of the whole test
n = number of items
st = the SD of the total scores
p = proportion of the group giving 'yes' responses to an item
q = (1 − p) = the proportion of the group giving 'no' responses
Σpq = the sum of the pq products over all items
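A minimal sketch (not from the source) of this computation, on a hypothetical matrix of yes/no responses coded 1/0:

import numpy as np

# Hypothetical binary (yes = 1 / no = 0) responses: rows = respondents, cols = items
X = np.array([
    [1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 0, 1, 1, 1],
    [1, 1, 0, 0, 1],
], dtype=float)

n = X.shape[1]                    # number of items
p = X.mean(axis=0)                # proportion of 'yes' responses per item
q = 1.0 - p                       # proportion of 'no' responses per item
s2_t = X.sum(axis=1).var(ddof=0)  # variance of the total scores

# Rational-equivalence (Kuder-Richardson) reliability of the whole test
r_tt = (n / (n - 1)) * ((s2_t - np.sum(p * q)) / s2_t)
print(f"Rational equivalence reliability: {r_tt:.3f}")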
When questionnaire responses are not dichotomous but multiple in nature, instead of rational equivalence the useful method is Cronbach's coefficient alpha (Cronbach, 1951).
Cronbach's coefficient alpha
Perhaps it is the most pervasive of the internal consistency indices. If all items are perfectly reliable and measure the same thing (true score), then coefficient alpha is equal to 1. The formula of alpha is given below:
α = (k / (k − 1)) × [1 − (Σs²i / s²sum)]
in which,
α = coefficient alpha
k = number of items
s²i = the variances of the k individual items
s²sum = the variance of the sum of all items
Alpha varies with the inter-correlations among the items. Dutta Roy (2000) noted an increase in the alpha value when the item-total correlation coefficients were high and significant. This suggests that alpha denotes the internal structure of the test.
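A minimal sketch (not from the source) of the alpha computation, on hypothetical Likert-type responses:

import numpy as np

def cronbach_alpha(X):
    """Coefficient alpha for an items matrix X (rows = respondents, cols = items)."""
    X = np.asarray(X, dtype=float)
    k = X.shape[1]
    item_vars = X.var(axis=0, ddof=1)      # variances of the k individual items
    total_var = X.sum(axis=1).var(ddof=1)  # variance of the sum of all items
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 5-point responses of six respondents to four items
X = [[4, 3, 5, 4],
     [2, 2, 3, 2],
     [5, 4, 4, 5],
     [3, 3, 2, 3],
     [1, 2, 1, 2],
     [4, 5, 4, 4]]
print(f"Cronbach's alpha: {cronbach_alpha(X):.3f}")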
Validity
Content validity: It involves systematic examination of the questionnaire content to determine whether it covers a representative sample of the behavior domain to be measured. Content validity is built into a questionnaire from the outset through the choice of appropriate items. The preparation of items is preceded by a thorough and systematic examination of relevant materials as well as by consultation with subject-matter experts.
Construct validity: A construct is an abstraction consisting of a set of propositions about its relationship to other variables – other constructs or directly observable behaviour. The extent to which the questionnaire measures the theoretical construct for which it has been developed is called construct validity. There are different techniques for assessing construct validity, such as factorial, convergent and divergent validity.
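For factorial validity, item loadings on extracted factors can be examined. Below is a rough sketch (not from the source) that uses principal components of the inter-item correlation matrix as a simple stand-in for factor extraction; the item data are simulated:

import numpy as np

# Simulated item responses: rows = respondents, cols = items
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
X[:, :3] += rng.normal(size=(100, 1))  # items 1-3 share one common factor
X[:, 3:] += rng.normal(size=(100, 1))  # items 4-6 share another common factor

R = np.corrcoef(X, rowvar=False)       # inter-item correlation matrix
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]      # sort components by variance explained
loadings = eigvecs[:, order[:2]] * np.sqrt(eigvals[order[:2]])

print(np.round(loadings, 2))           # item loadings on the first two components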
Convergent validity: In factorial validity, the correlation of items with extracted factors is studied; this is done within the measures of the questionnaire itself. In convergent validity, by contrast, whether the extracted construct is valid is tested with another measure assessing the same construct.
Divergent validity: It indicates that the results obtained by this instrument do not correlate too strongly with measurements of a similar but distinct trait. For example, a questionnaire measuring global work satisfaction should relate more closely to other general work satisfaction scales than to measures of specific facets of job satisfaction. Instead of correlation, ANOVA can be used to assess the divergent or discriminative validity of a questionnaire.
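As a hedged illustration (not from the source), convergent and divergent validity can be screened by comparing the questionnaire's correlation with a measure of the same construct against its correlation with a measure of a distinct trait; all scores below are hypothetical:

import numpy as np

# Hypothetical total scores for the same respondents on three measures
new_scale = np.array([22, 30, 25, 28, 35, 18, 27, 31, 24, 29], dtype=float)
same_construct = np.array([24, 31, 26, 27, 36, 19, 26, 32, 25, 30], dtype=float)  # convergent check
distinct_trait = np.array([12, 15, 20, 11, 14, 19, 13, 16, 18, 12], dtype=float)  # divergent check

r_conv = np.corrcoef(new_scale, same_construct)[0, 1]
r_div = np.corrcoef(new_scale, distinct_trait)[0, 1]

# Evidence of construct validity: high convergent, low divergent correlation
print(f"Convergent r: {r_conv:.3f}  Divergent r: {r_div:.3f}")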
Criterion-related validity: It indicates the effectiveness of a questionnaire in predicting an individual's performance in specified activities. Performance on the questionnaire is checked against a criterion, a direct and independent measure of that which the questionnaire is designed to predict. The criterion measure against which the questionnaire is validated may be obtained at approximately the same time as the questionnaire scores or after a stated interval. On the basis of these time relations between criterion and questionnaire, the 1985 Testing Standards differentiate between predictive and concurrent validation.
Predictive validity: It means prediction from the questionnaire to any criterion situation or, in the more limited sense, prediction over a time interval. It describes how closely scores on a questionnaire correspond (correlate) with behavior as measured in other contexts in the future.
Source: Dutta Roy, D. (2009). Principles of Questionnaire Development with Empirical Studies. http://www.amazon.in