Waikato Environment for Knowledge Analysis = W E K A
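# Read the country data; the first row holds the column names.
# (Named country rather than c to avoid masking the base c() function.)
country <- read.csv("C:/5772/country.csv", header = TRUE)
# Keep countries with female life expectancy of 70 or more, and only three columns
lifeExpFemale70plus <- subset(country, lifeexpf >= 70, select = c("lifeexpf", "region", "birthrate"))
# Open the help page for subset()
?subset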
October 2014: Quiz #1 data set... country.csv Comma Separated Values .csv format.
Study Guide/Outline: Quiz One and outline of readings from DSUR for homework assigned on 09/24/2015.
DSUR - Readings from Discovering Statistics Using R, the fall of 2013 textbook.
Needed for doing the country.csv R HOMEWORK assignment (due Oct 1st, 2015).
Monday, Sep 14th: Verzani Section 13 Regression Analysis... R code not covered on Thursday, September 10th. NEW and VIP: Assignment One modification (possibly a welcome change).
Assignment One email: email090315.txt...
Pages 83-84 of Verzani - 13.1 and 13.2
Simplify usage of lm with simple.lm from UsingR package.
UsingR package and simple.lm() and predict() examples - the class #6 topic from 09/10/2015 Thursday.
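The lm() and predict() steps that simple.lm() simplifies can be sketched in base R as follows. This is only an illustration: it uses the built-in cars data set as a stand-in (an assumption, not the data used in class), and it does not call simple.lm() itself so that it runs without installing UsingR.

```r
# Fit a simple linear regression: stopping distance (dist) on speed.
# 'cars' ships with base R; it is a stand-in, not the class data set.
fit <- lm(dist ~ speed, data = cars)

# Intercept and slope of the fitted line
coefs <- coef(fit)
print(coefs)

# Predict the response at new values of the independent variable.
# newdata must use the same column name as the predictor in the formula.
new_speeds <- data.frame(speed = c(10, 15, 20))
pred <- predict(fit, newdata = new_speeds)
print(pred)

# The same predictions computed by hand: intercept + slope * x
by_hand <- coefs[["(Intercept)"]] + coefs[["speed"]] * new_speeds$speed
print(by_hand)
```

predict() only applies the fitted intercept and slope to the new x values, which is why the hand calculation gives the same numbers.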
# Introduction to ggplot2 and the mpg dataset (from the ggplot2 library)
install.packages("ggplot2")
library(ggplot2)

# Look at the data from the ggplot2 library that we're going to use - miles per gallon
?mpg
head(mpg)
str(mpg)
names(mpg)

# Basic scatterplot
qplot(displ, hwy, data = mpg)

# Add an additional variable with aesthetics: colour, shape, size
qplot(displ, hwy, data = mpg, colour = class)
qplot(displ, hwy, data = mpg, colour = cyl)
qplot(displ, hwy, data = mpg, shape = factor(cyl))
qplot(displ, hwy, data = mpg, shape = factor(cyl), colour = factor(cyl))

# Add an additional variable with faceting
qplot(displ, hwy, data = mpg)
qplot(displ, hwy, data = mpg) + facet_grid(. ~ cyl)
qplot(displ, hwy, data = mpg) + facet_grid(drv ~ .)
qplot(displ, hwy, data = mpg) + facet_grid(drv ~ cyl)
qplot(displ, hwy, data = mpg) + facet_wrap(~ class)

# Deal with overplotting by using JITTER
qplot(cty, hwy, data = mpg)
qplot(cty, hwy, data = mpg, geom = "jitter")
qplot(cty, hwy, data = mpg, geom = "jitter", colour = year)
qplot(cty, hwy, data = mpg, geom = "jitter", colour = class)

# Note: On 09/11/Thursday
# We did NOT do the following two R qplots
# with the added very smooth GEOM method lm (linear model)
qplot(cty, hwy, data = mpg) + geom_smooth(method = "lm")
qplot(cty, hwy, data = mpg, geom = "jitter", colour = class) + geom_smooth(method = "lm")

# Reordering + boxplots
qplot(class, hwy, data = mpg)
qplot(reorder(class, hwy), hwy, data = mpg)
qplot(reorder(class, hwy), hwy, data = mpg, geom = "jitter")
qplot(reorder(class, hwy), hwy, data = mpg, geom = "boxplot")
qplot(reorder(class, hwy), hwy, data = mpg, geom = c("jitter", "boxplot"))
Crosstabs and CHI SQUARE, recoding variables from interval to ordinal (birthrate and female life expectancy for 15 and for 122 countries), linear regression, scatter plots. The best fitting linear regression line goes through the point that is the mean for the DV and the mean for the IV. DV = dependent variable = y. IV = independent variable = x.
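Two of these ideas can be checked with a short sketch in base R. It uses the built-in mtcars data as a stand-in for country.csv (an assumption), and the cut point of 20 mpg is an arbitrary choice made only for illustration.

```r
# mtcars ships with base R; it stands in for country.csv here.
x <- mtcars$wt    # IV = independent variable
y <- mtcars$mpg   # DV = dependent variable
fit <- lm(y ~ x)

# The fitted line passes through (mean of x, mean of y):
# the predicted y at mean(x) equals mean(y).
y_at_mean_x <- predict(fit, newdata = data.frame(x = mean(x)))
print(c(predicted = unname(y_at_mean_x), mean_y = mean(y)))

# Recode an interval variable into an ordinal one with cut().
# The break at 20 mpg is an illustrative choice, not a standard.
mpg_level <- cut(mtcars$mpg,
                 breaks = c(-Inf, 20, Inf),
                 labels = c("low", "high"))

# Crosstab of the recoded variable against transmission type (am),
# followed by a chi-square test of independence on that table.
tab <- table(mpg_level, am = mtcars$am)
print(tab)
result <- chisq.test(tab)
print(result)
```

For a 2 x 2 table, chisq.test() applies Yates' continuity correction by default; pass correct = FALSE to get the uncorrected statistic.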
SPSS Statistics Essential Training with Barton Poulson. This is another lynda.uni.edu resource. (5 hours and 5 minutes).
In this course, author Barton Poulson takes a practical, visual, and non-mathematical approach to the basics of statistical concepts and data analysis in SPSS, the statistical package for business, government, research, and academic organizations. From importing spreadsheets to creating regression models to exporting presentation graphics, this course covers all the basics, with an emphasis on clarity, interpretation, communicability, and application.
Up and Running with R with Barton Poulson. (2 hours 25 minutes).
Join author Barton Poulson as he introduces the R statistical processing language, including how to install R on your computer, read data from SPSS and spreadsheets, and use packages for advanced R functions. The course continues with examples on how to create charts and plots, check statistical assumptions and the reliability of your data, look for data outliers, and use other data analysis tools. Finally, learn how to get charts and tables out of R and share your results with presentations and web pages.
http://www.stat.wisc.edu/~st571-1/...
ggplot2.org...
Final Exam period: 3-4:50 pm on Tuesday, December 15th.