Wednesday, March 12, 2014

Business Research Methods (Modules 5 and 6), IIM Shillong

Lecture notes on Business Research Methods (Modules 5 and 6), IIM Shillong
D. Dutta Roy
Indian Statistical Institute
Kolkata - 700108


Module 5: Multivariate Data Analysis (6 Classes)
Covariance, Correlation, Factor Analysis, Cluster Analysis, Discriminant Analysis, Multiple Regression, Limited Dependent Variable, Longitudinal Data Analysis

Module 6: Report Writing (1 Class)
Types of report, research report, Harvard system of referencing, bibliography, footnote

Multivariate Techniques

Q 1: What is multivariate analysis?
Ans.: Multivariate analytical techniques are widely applied to industrial problems and are also used in market research. Modern business data are complex and volatile, so multivariate techniques are helpful for business decision making. Multivariate analysis refers to all statistical methods that simultaneously analyze multiple measurements on each individual or object under investigation. Any simultaneous analysis of more than two variables can loosely be considered multivariate analysis. Multivariate techniques are extensions of univariate analysis (analysis of a single variable's distribution) and bivariate analysis (cross-classification, correlation, analysis of variance and simple regression, which analyze two variables).

Q 2: What is variate?
Ans.: The variate is the building block of multivariate analysis. A variate is a linear combination of variables with empirically determined weights. The variables are specified by the researcher, while the weights are determined by the multivariate technique to meet a specific objective. A variate of n weighted variables (X1 to Xn) can be stated mathematically as:
Variate value = W1X1 + W2X2 + W3X3 + … + WnXn
where Xn is the nth observed variable and Wn is its weight as determined by the multivariate technique. The result is a single value representing the combination of the entire set of variables that best achieves the objective of the specific multivariate analysis.
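As a minimal sketch of the formula above, the variate value for one respondent is just a weighted sum. The function name, the weights and the scores below are hypothetical; in a real analysis the weights come from the multivariate technique, not from the researcher.

```python
def variate_value(weights, values):
    """Return W1*X1 + W2*X2 + ... + Wn*Xn for one individual or object."""
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    return sum(w * x for w, x in zip(weights, values))


# Hypothetical weights (normally estimated by the technique) and
# one respondent's observed scores on three variables.
weights = [0.5, 0.25, 0.25]
scores = [4, 2, 6]

print(variate_value(weights, scores))  # 0.5*4 + 0.25*2 + 0.25*6 = 4.0
```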

Q 3: What are the different types of multivariate techniques?
Ans.: Multivariate analysis is an ever-expanding set of techniques for data analysis. Among the more established techniques are:

i. Factor Analysis
ii. Multiple Regression Analysis and Multiple Correlation
iii. Multiple Discriminant Analysis
iv. Multivariate Analysis of Variance and Covariance
v. Cluster Analysis
vi. Longitudinal Data Analysis

Q 4: What is Factor Analysis?
Ans.: Factor Analysis is a statistical approach that is used to analyze inter-relationships among a large number of variables and to explain these variables in terms of their common underlying dimensions (factors). The objective is to find a way of condensing the information contained in a number of original variables into a smaller set of variates (factors) with a minimum loss of information.
SPSS format: Analyze > Data reduction > Factor analysis
Select variables
Extraction > Principal component analysis
Tick Correlation matrix (provided the data are metric and correlated)
Tick Unrotated factor solution
Tick Scree plot
Tick Extract eigenvalues over 1
Continue > Factor analysis rotation > Method
Select Varimax, provided independent factors are needed
Display: rotated solution, loading plot(s)
Continue
Factor analysis options: Sorted by size
OK.

Syntax:
GET
  FILE='C:\Users\ddroy\Downloads\Final_Dataset_BRM_Multivariate.sav'.
DATASET NAME DataSet1 WINDOW=FRONT.
FACTOR
  /VARIABLES SelfAwakening Emotionalcontrol Systematic SelfInsulatingLess Fearless Cleanliness NoWorkFamilyConflict NiskamPrinciple
   Challenging SelfUnderstanding Doubtless Freefromfearoffailure Resolute Active
  /MISSING LISTWISE
  /ANALYSIS SelfAwakening Emotionalcontrol Systematic SelfInsulatingLess Fearless Cleanliness NoWorkFamilyConflict NiskamPrinciple
   Challenging SelfUnderstanding Doubtless Freefromfearoffailure Resolute Active
  /PRINT INITIAL EXTRACTION ROTATION
  /FORMAT SORT
  /PLOT ROTATION
  /CRITERIA MINEIGEN(1) ITERATE(25)
  /EXTRACTION PC
  /CRITERIA ITERATE(25)
  /ROTATION VARIMAX
  /METHOD=CORRELATION.  
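
The "eigenvalues over 1" rule used in the syntax above (MINEIGEN(1) with principal component extraction) can be illustrated outside SPSS. The following is a rough Python sketch, not the SPSS algorithm: it finds the eigenvalues of a small correlation matrix with the classical Jacobi rotation method and keeps the components whose eigenvalue exceeds 1. The 4 x 4 correlation matrix is invented for illustration and has nothing to do with the lecture dataset.

```python
import math


def jacobi_eigenvalues(matrix, tol=1e-12, max_rotations=1000):
    """Eigenvalues of a symmetric matrix via classical Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_rotations):
        # Locate the largest off-diagonal element.
        p, q, largest = 0, 1, 0.0
        for i in range(n):
            for j in range(i + 1, n):
                if abs(a[i][j]) > largest:
                    p, q, largest = i, j, abs(a[i][j])
        if largest < tol:
            break
        # Rotation angle that zeroes a[p][q].
        theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
        c, s = math.cos(theta), math.sin(theta)
        for k in range(n):  # update columns p and q
            akp, akq = a[k][p], a[k][q]
            a[k][p] = c * akp - s * akq
            a[k][q] = s * akp + c * akq
        for k in range(n):  # update rows p and q
            apk, aqk = a[p][k], a[q][k]
            a[p][k] = c * apk - s * aqk
            a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Hypothetical correlation matrix: variables 1-2 and 3-4 form two groups.
correlations = [
    [1.0, 0.6, 0.1, 0.1],
    [0.6, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.5],
    [0.1, 0.1, 0.5, 1.0],
]

eigenvalues = jacobi_eigenvalues(correlations)
retained = [value for value in eigenvalues if value > 1.0]  # eigenvalue > 1 rule
explained = 100 * sum(retained) / len(correlations)

print("Eigenvalues:", [round(v, 3) for v in eigenvalues])
print("Components retained:", len(retained))
print("Percent of total variance explained:", round(explained, 1))
```

Because the eigenvalues of a correlation matrix sum to the number of variables, dividing the retained eigenvalues by that number gives the share of total variance accounted for by the extracted components.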



Q 5: What is multiple regression analysis?
Ans.: Multiple regression analysis is the appropriate method when the research problem involves a single metric dependent variable presumed to be related to two or more metric independent variables. Its objective is to predict changes in the dependent variable in response to changes in the independent variables, most often by the statistical rule of least squares. It is used to predict the magnitude of the dependent variable. For example, a business researcher can predict the change in a company's sales from information on its advertising expenditure, the number of salespeople and the number of stores carrying its products.
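
The least squares rule mentioned above can be sketched in a few lines of Python. This is an illustrative sketch only: it solves the normal equations by Gaussian elimination, and the sales figures are made up (generated without noise from sales = 10 + 2 x advertising + 3 x salespeople) so that the recovered coefficients are easy to check.

```python
def solve_linear_system(a, b):
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_least_squares(predictors, y):
    """Return [intercept, b1, b2, ...] minimising the sum of squared errors."""
    design = [[1.0] + list(row) for row in predictors]  # add intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: (advertising spend, number of salespeople) and sales.
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
sales = [18, 17, 28, 27, 35]

coefficients = fit_least_squares(predictors, sales)
print([round(c, 6) for c in coefficients])  # close to [10, 2, 3]

# Predicted sales for advertising = 6 and 4 salespeople.
prediction = coefficients[0] + coefficients[1] * 6 + coefficients[2] * 4
print(round(prediction, 6))
```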


Q 6: What is Multiple Discriminant Analysis?
Ans.: Multiple discriminant analysis is a multivariate technique used to understand group differences and to predict the likelihood that an entity belongs to a particular class or group, based on several metric independent variables. For example, discriminant analysis might be used to distinguish successful entrepreneurs from unsuccessful ones according to their demographic and psychographic profiles.
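
For the two-group case, the idea can be sketched with Fisher's linear discriminant: the weights are chosen so that the weighted score separates the group means as much as possible relative to the within-group scatter. The Python sketch below assumes exactly two groups, two predictors and equal prior probabilities; the scores for "successful" and "unsuccessful" entrepreneurs are invented, and this is not the procedure SPSS runs.

```python
def mean_vector(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter_matrix(rows, mean):
    """Sum of squares and cross-products around the group mean (2 x 2)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fit_discriminant(group_a, group_b):
    """Return discriminant weights and the cutoff score between the groups."""
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    sa, sb = scatter_matrix(group_a, ma), scatter_matrix(group_b, mb)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]  # pooled
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    # weights = inverse(sw) * (mean_a - mean_b), using the 2 x 2 inverse formula
    w = [(sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
         (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det]
    # cutoff = discriminant score of the midpoint between the group means
    cutoff = sum(w[i] * (ma[i] + mb[i]) / 2 for i in range(2))
    return w, cutoff


def classify(w, cutoff, case, labels=("successful", "unsuccessful")):
    score = w[0] * case[0] + w[1] * case[1]
    return labels[0] if score > cutoff else labels[1]


# Hypothetical (experience score, risk-taking score) for each entrepreneur.
successful = [(8, 7), (9, 6), (7, 8), (8, 9)]
unsuccessful = [(3, 4), (2, 3), (4, 2), (3, 3)]

weights, cutoff = fit_discriminant(successful, unsuccessful)

# Hit ratio: percentage of known cases that are classified correctly.
hits = sum(classify(weights, cutoff, c) == "successful" for c in successful)
hits += sum(classify(weights, cutoff, c) == "unsuccessful" for c in unsuccessful)
hit_ratio = 100 * hits / (len(successful) + len(unsuccessful))

print("Weights:", [round(v, 3) for v in weights])
print("Hit ratio (%):", hit_ratio)
print("New case (7, 7):", classify(weights, cutoff, (7, 7)))
```

The hit ratio computed here is the same kind of figure as a "percentage of correct classifications" reported for a discriminant analysis.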

Q 7: What is Multivariate Analysis of Variance?
Ans.: Multivariate Analysis of Variance is a statistical technique that can be used to simultaneously explore the relationship between several categorical independent variables and two or more metric dependent variables. It represents an extension of univariate analysis of variance.

Q 8: What is Multivariate Analysis of Covariance?
Ans.: Multivariate Analysis of Covariance is a multivariate statistical technique that removes the effect of any uncontrolled metric independent variables (covariates) on the dependent variables. It is similar to bivariate partial correlation, in which the effect of a third variable is removed from the correlation.
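
The partial correlation analogy can be made concrete with the standard first-order formula, which removes a third variable z from the correlation between x and y. The sketch below is only an illustration of that formula; the three correlation values are hypothetical.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical correlations: x and y look related (0.5), but both are
# strongly related to z (0.6 and 0.8).
print(round(partial_correlation(0.5, 0.6, 0.8), 4))  # 0.0417
```

In this made-up case most of the apparent x-y relationship disappears once z is controlled, which is the same logic by which a covariate is adjusted for before group differences are tested.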

Q 9: What is Cluster Analysis?
Ans.: Cluster analysis is an analytical technique for developing meaningful subgroups of individuals or objects. Here the objective is to classify a sample of entities (individuals or objects) into a small number of mutually exclusive groups based on the similarities among the entities.
Cluster analysis involves three steps. The first is the measurement of some form of similarity or association among the entities to determine how many groups really exist in the sample. The second is the actual clustering process, whereby entities are partitioned into groups (clusters). The final step is to profile the persons or variables in each cluster to determine its composition.
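
The three steps can be sketched with a small agglomerative (hierarchical) procedure in Python. This is a simplified illustration under stated assumptions: similarity is measured by Euclidean distance, clusters are merged by single linkage (nearest neighbour), and the number of clusters is fixed at two in advance rather than determined from the data. The six data points are invented.

```python
import math


def euclidean(a, b):
    """Step 1: distance between two entities (smaller = more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(points, n_clusters):
    """Step 2: merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # start: one entity each
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single linkage: distance between the closest members
                d = min(euclidean(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


def profile(points, cluster):
    """Step 3: describe a cluster by the mean of each variable."""
    n = len(cluster)
    return [sum(points[i][k] for i in cluster) / n
            for k in range(len(points[0]))]


# Hypothetical scores of six respondents on two variables.
points = [(1, 1), (1.5, 2), (1, 2), (8, 8), (9, 8), (8.5, 9)]

clusters = agglomerate(points, n_clusters=2)
print(clusters)  # [[0, 1, 2], [3, 4, 5]]
for members in clusters:
    print(members, [round(m, 2) for m in profile(points, members)])
```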

Q 10: What are the steps of examining data in multivariate data analysis?
Ans.:
Step I: Examining the shape of the distribution: The stem-and-leaf diagram shows the general shape of the distribution and also displays the actual data values.
Step II: Examining the relationship between two or more variables: The researcher can use a scatterplot matrix of the metric variables to explore the relationships among them. A scatterplot matrix shows the pairwise relationships among all the variables, their correlation coefficients and the histogram of each variable.
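Step III: Examining outliers: Outliers are observations whose values differ markedly from the rest of the data. Many multivariate statistics, such as means, variances and correlations, are sensitive to such extreme values, so outliers should be identified and examined before the analysis.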
Step IV: Analysis of the missing data: There are some strategies to deal with missing data.
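  1. Complete case approach (listwise deletion): only the cases with complete data are used.
  2. Deletion of cases or variables: the researcher determines the extent of missing data for each case and variable, and then deletes the cases or variables with excessive missing data.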
  3. Replacement of the missing data with estimated values based on other information available in the sample. For example, mean substitution, regression imputation etc.
Step V: Assessing homoscedasticity: Homoscedasticity is an assumption related primarily to dependence relationships between variables. It refers to the assumption that the dependent variables exhibit equal levels of variance across the range of the predictor variables. Levene's test is the most common statistical test for assessing whether the variances of a single metric variable are equal across any number of groups. In case of heteroscedasticity (inequality of variances), a simple remedy is data transformation.
Step VI: Examining linearity: Almost all multivariate techniques, including multiple regression and factor analysis, are based on correlational measures of association. Because a correlation coefficient captures only the linear association between variables, the relationships should be checked for nonlinearity, for example with scatterplots.
Step VII: Incorporating non-metric data with dummy variables: Non-metric variables can be represented by dichotomous variables known as dummy variables, which act as replacement variables. A dummy variable is a dichotomous variable that represents one category of a non-metric independent variable. For example, gender has two categories, female and male, so two dummy variables can be defined: X1 takes the value 1 for females and 0 for males, and X2 takes the value 1 for males and 0 for females. Because X2 = 1 - X1, only one of the two is entered in a model that has an intercept; in general, a variable with k categories needs k - 1 dummy variables.
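
Two of the steps above, mean substitution for missing data and dummy coding of a non-metric variable, can be sketched in Python as follows. The function names and the small data lists are hypothetical. The dummy coding here keeps one fewer dummy than there are categories, measured against a chosen reference category, which is the usual practice when the model includes an intercept because the omitted dummy is fully determined by the others.

```python
def mean_substitute(values):
    """Replace missing values (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def dummy_code(categories, reference):
    """Return one 0/1 dummy variable per category except the reference."""
    levels = sorted(set(categories) - {reference})
    return {level: [1 if c == level else 0 for c in categories]
            for level in levels}


# Hypothetical ages with one missing response.
ages = [20, None, 30, 40]
print(mean_substitute(ages))  # [20, 30.0, 30, 40]

# Hypothetical gender responses; "male" is the reference category.
gender = ["female", "male", "female"]
print(dummy_code(gender, reference="male"))  # {'female': [1, 0, 1]}
```

Mean substitution is simple but shrinks the variance of the variable, so it is best treated as a quick remedy when only a small share of values is missing.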

ASSIGNMENTS TO STUDENTS OF SEVEN GROUPS FOR BOTH SESSION 1 AND SESSION 2.

1) Factor analysis of Set 1 and Set 2.
2) MANOVA for male and female respondents in Set 1 and Set 2.
3) Comparison of the factor scores of males and females in Set 1 and Set 2.
4) Multiple regression to predict the total score of Set 1 and of Set 2 from the respective values.
5) Multiple discriminant analysis to predict male and female respondents from the respective set of values in Set 1 and Set 2.
6) Hierarchical cluster analysis of the Set 1 and Set 2 values.
7) Cronbach's alpha for the set of values in Set 1 and Set 2. Are the alpha values different for males and females?

Ans. to 1: There are 5 factors in Set 1 and 6 in Set 2, accounting for 53.47% and 61.5% of the total variance, respectively.

Ans. to 2: 

Ans. to 3: There are no significant mean differences between males and females on the 6 extracted factors of Set 2.

Ans. to 5: Using a linear combination of the values in each set to predict gender, 74.8% of cases in Set 1 and 69.6% in Set 2 were correctly classified. For Set 1, the structure coefficients show that male (n=92) and female (n=59) respondents differ in challenging, cleanliness and self-understanding; females had the higher coefficient on cleanliness. For Set 2, only pleasure differs by gender, with males having the higher structure coefficient.
