Sunday, December 17, 2017

Discriminant function analysis: Pre-conference workshop, NAOP, 2017

ANOVA


> sex=factor(c(1,2,1,2,1,2,1,2,1,2,1,2,1))
> age=c(10,12,10,13,10,12,13,12,21,31,13,14,15)
> boxplot(age~sex)
> ana=aov(age~sex)
> summary(ana)
            Df Sum Sq Mean Sq F value Pr(>F)
sex          1   20.6   20.58   0.595  0.457
Residuals   11  380.2   34.56
# sex must be a factor for TukeyHSD; with more than two groups it gives all pairwise comparisons
> TukeyHSD(ana)
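
With only two groups, TukeyHSD returns a single comparison. A minimal sketch of the multi-group case follows, assuming the built-in PlantGrowth data (not part of the workshop material) as a stand-in for a factor with three levels:

```r
# PlantGrowth ships with base R: 30 plant weights in three groups
# (ctrl, trt1, trt2). It is an assumed stand-in for a 3-level factor.
fit <- aov(weight ~ group, data = PlantGrowth)
summary(fit)

# Pairwise differences between all group levels,
# with confidence intervals and adjusted p-values
tuk <- TukeyHSD(fit)
print(tuk)
```

Three levels give three pairwise rows (trt1-ctrl, trt2-ctrl, trt2-trt1) in tuk$group.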


MANOVA

2X2 factorial MANOVA

> mtcars
> x=mtcars

> head(x)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

> manova1=manova(cbind(x$mpg,x$disp,x$hp,x$drat)~x$cyl)
> summary(manova1,test="Wilks")
          Df   Wilks approx F num Df den Df    Pr(>F)    
x$cyl      1 0.12459    47.43      4     27 7.882e-12 ***
Residuals 30                                             
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
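
The call above uses a single predictor, cyl, entered as a number (hence Df = 1). A 2X2 factorial layout could be sketched as below, assuming the two-level variables vs and am from mtcars as the factors; this choice is an illustration, not the workshop's own example:

```r
# Two-level factors from the built-in mtcars data (assumed stand-ins):
# vs = engine shape (0/1), am = transmission type (0/1)
d <- mtcars
d$vs <- factor(d$vs)
d$am <- factor(d$am)

# Main effects of vs and am, plus their interaction, on four outcomes
fit2 <- manova(cbind(mpg, disp, hp, drat) ~ vs * am, data = d)
s <- summary(fit2, test = "Wilks")
print(s)
```

The summary then has one row each for vs, am and vs:am, followed by the residuals.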


CORRELATIONS
> plot(mtcars$mpg,mtcars$disp)
> y=cor(mtcars, use="complete.obs", method="pearson")
> plot(x$mpg,x$disp,pch=16,cex=1.3,col="blue")

> lm(x$mpg~x$disp)
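
The correlation matrix y is stored but not inspected above. A short sketch of how one pair (mpg and disp) might be examined, and how it ties to the regression; the object names r, ct, fit and r2 are illustrative:

```r
# Pearson correlation between mpg and disp in the built-in mtcars data
r <- cor(mtcars$mpg, mtcars$disp)

# Test of the null hypothesis that the correlation is zero
ct <- cor.test(mtcars$mpg, mtcars$disp, method = "pearson")

# With a single predictor, R-squared of the regression equals r squared
fit <- lm(mpg ~ disp, data = mtcars)
r2 <- summary(fit)$r.squared

print(r)
print(ct$p.value)
print(all.equal(r^2, r2))
```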

REGRESSION LINE

> height <- c(176, 154, 138, 196, 132, 176, 181, 169, 150, 175)
> bodymass <- c(82, 49, 53, 112, 47, 69, 77, 71, 62, 78)
> plot(bodymass, height)
> plot(bodymass, height, pch = 16, cex = 1.3, col = "blue", main = "HEIGHT PLOTTED AGAINST BODY MASS", xlab = "BODY MASS (kg)", ylab = "HEIGHT (cm)")
> lm(height ~ bodymass)

Call:
lm(formula = height ~ bodymass)

Coefficients:
(Intercept)     bodymass  
    98.0054       0.9528  

> abline(98.0054, 0.9528)
> abline(lm(height ~ bodymass))
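
Instead of typing the coefficients by hand, they can be pulled from the fitted model and used for prediction. A minimal sketch with the same data; the 70 kg input is an illustrative value:

```r
height   <- c(176, 154, 138, 196, 132, 176, 181, 169, 150, 175)
bodymass <- c(82, 49, 53, 112, 47, 69, 77, 71, 62, 78)

fit <- lm(height ~ bodymass)

# Intercept and slope, taken from the model object
b <- coef(fit)
print(round(b, 4))

# Predicted height (cm) at a body mass of 70 kg
pred <- predict(fit, newdata = data.frame(bodymass = 70))
print(pred)
```

Since 70 kg is the mean body mass in this sample, the prediction equals the mean height, 164.7 cm, because a least-squares line passes through the point of means.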

REF=http://www.theanalysisfactor.com/linear-models-r-plotting-regression-lines/



https://www.youtube.com/watch?v=Z5WKQr4H4Xk




DISCRIMINANT FUNCTION ANALYSIS


>install.packages("MASS")
Installing package into ‘C:/Users/cssc/Documents/R/win-library/3.4’
(as ‘lib’ is unspecified)
trying URL 'https://cran.rstudio.com/bin/windows/contrib/3.4/MASS_7.3-47.zip'
Content type 'application/zip' length 1171307 bytes (1.1 MB)
downloaded 1.1 MB

package ‘MASS’ successfully unpacked and MD5 sums checked

The downloaded binary packages are in
        C:\Users\cssc\AppData\Local\Temp\RtmpKqK8MD\downloaded_packages
> library(MASS)
Warning message:
package ‘MASS’ was built under R version 3.4.3
> ldf <- lda(CODE~CLEAN_TO+SAFETY_T+COMFORT+ADEQ_TOT+EXPL_TOT+RELB_TOT+EASY_TOT+EQ_OPP_T+WILL_TOT, data=infra)
> ldf
Call:
lda(CODE ~ CLEAN_TO + SAFETY_T + COMFORT + ADEQ_TOT + EXPL_TOT +
    RELB_TOT + EASY_TOT + EQ_OPP_T + WILL_TOT, data = infra)

Prior probabilities of groups:
        3         4
0.5214724 0.4785276

Group means:
  CLEAN_TO SAFETY_T  COMFORT ADEQ_TOT EXPL_TOT RELB_TOT EASY_TOT EQ_OPP_T
3 3.482353 4.082353 3.600000 9.635294 9.529412 4.423529 5.905882 4.258824
4 2.692308 4.179487 2.987179 9.102564 8.423077 3.423077 5.589744 3.064103
  WILL_TOT
3 7.223529
4 6.102564

Coefficients of linear discriminants:
                  LD1
CLEAN_TO -0.292921993
SAFETY_T  0.404075174
COMFORT  -0.684190835
ADEQ_TOT  0.065820028
EXPL_TOT -0.158518134
RELB_TOT -0.874906156
EASY_TOT  0.006923607
EQ_OPP_T -0.244198968
WILL_TOT  0.062651845
> infra.ldf.p <- predict(ldf, newdata=infra[,c(5,6,7,8,9,10,11,12,13)])$class
> infra.ldf.p
  [1] 3 3 3 3 3 3 4 3 4 3 4 4 4 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 [39] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 3 3
 [77] 3 3 3 3 4 3 3 3 3 3 4 3 4 3 4 4 3 4 4 4 3 4 4 4 3 4 4 4 4 4 3 4 4 4 4 3 4 4
[115] 4 4 3 3 4 4 3 3 4 4 4 4 4 4 4 4 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
[153] 4 3 3 4 4 4 4 4 4 4 4
Levels: 3 4
> table(infra.ldf.p,infra[,3])

infra.ldf.p  3  4
          3 76 16
          4  9 62
> plot(infra[,c(5,6,7,8,9,10,11,12,13)],col=infra[,3])
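
The classification table can be summarised as a hit rate. The sketch below rebuilds the table from the printed counts, since the infra data are not included here. It assumes that rows are predicted groups and columns are actual groups (that is, that column 3 of infra holds CODE); the names conf, accuracy and per_group are illustrative:

```r
# Counts copied from the printed table: rows = predicted, columns = actual
conf <- matrix(c(76, 9, 16, 62), nrow = 2,
               dimnames = list(predicted = c("3", "4"),
                               actual    = c("3", "4")))

# Overall proportion of cases on the diagonal (correctly classified)
accuracy <- sum(diag(conf)) / sum(conf)

# Proportion correctly classified within each actual group
per_group <- diag(conf) / colSums(conf)

print(round(accuracy, 4))   # 0.8466
print(round(per_group, 4))  # 3: 0.8941, 4: 0.7949
```

These rates come from classifying the same cases that were used to fit the function, so they are likely to be optimistic; lda(..., CV=TRUE) in MASS gives leave-one-out predictions as an alternative.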




