Answers for WPA06 of Basic data and decision analysis in R, taught at the University of Konstanz in Winter 2017/2018.
To complete and submit these exercises, please remember and do the following:
Your WPAs should be written as scripts of commented code (as .Rmd files) and submitted as reproducible documents that combine text with code (in .html or .pdf formats). A simple .Rmd template is provided here. (Alternatively, open a plain R script and save it as LastnameFirstname_WPA##_yymmdd.R.)
Also enter the current assignment (e.g., WPA06), your name, and the current date at the top of your document. When working on a task, always indicate which task you are answering with appropriate comments.
Complete as many exercises as you can by Wednesday (23:59).
Submit your script or output file (including all code) to the appropriate folder on Ilias.
A. In Class
Here are some warm-up exercises that review some important skills from previous chapters and practice the basic concepts of the current chapter:
Preparations
0. The following steps prepare the current session by opening an R project, creating a new .Rmd file, and compiling it into an .html output file:
0a. Open your R project from last week (called RCourse or something similar), which contains some files and at least two subfolders (data and R).
0b. Create a new R Markdown (.Rmd) script and save it as LastFirst_WPA06_yymmdd.Rmd (with an appropriate header) in your project directory.
0c. You need the latest version of the yarrr package (v0.1.2) in this WPA. Install the package from CRAN with install.packages() if you haven’t already done so.
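For example, the following one-liner installs the package from CRAN (run it once in the console, rather than in a chunk that is evaluated on every knit):
install.packages("yarrr") # only needed once per machine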
0d. Insert a code chunk and load the rmarkdown, knitr, and yarrr packages. (Hint: It’s always a good idea to name code chunks and load all required packages with library() at the beginning of your document.)
library(rmarkdown)
library(knitr)
library(yarrr)
0e. Save the original graphic settings into an object opar. (Hint: This allows you to restore them later by evaluating par(opar). Use the options message = FALSE and warning = FALSE for this code chunk.)
opar <- par() # saves original (default) par settings
par(opar) # restores original (default) par settings
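For instance, a chunk header along the following lines names the chunk and sets both options (the chunk name settings is just an illustrative choice):
```{r settings, message = FALSE, warning = FALSE}
opar <- par() # saves original (default) par settings
```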
0f. Make sure that you can create an .html output file by “knitting” your current document.
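Knitting is easiest via the Knit button in RStudio, but you can also render from the R console; a sketch (assuming the file name from 0b):
rmarkdown::render("LastFirst_WPA06_yymmdd.Rmd") # creates LastFirst_WPA06_yymmdd.html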
Testing statistical hypotheses
In the following exercises, you will test simple statistical hypotheses about differences or relationships between one or two samples. The type of test depends on the number of samples, as well as the measurement characteristics and distributions of the data.
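As a rough orientation (a schematic sketch that uses the mtcars data purely for illustration, and no substitute for checking test assumptions), these tests map onto data situations as follows:
# One interval-scaled sample vs. a fixed reference value: 1-sample t-test
t.test(mtcars$mpg, mu = 20)
# Two samples (a numeric variable split by a binary factor): 2-sample t-test
t.test(mpg ~ am, data = mtcars)
# Two numeric variables: correlation test
cor.test(~ mpg + hp, data = mtcars)
# Frequency counts in categories: chi-square test
chisq.test(table(mtcars$cyl))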
Reporting test results
Please report your answers to all hypothesis test questions in proper American Pirate Association (APA) style.
- Reporting test parameters:
- chi-square test: \(\chi^{2}\)(df) = XXX, \(p\) = YYY
- t-test: \(t\)(df) = XXX, \(p\) = YYY
- correlation test: \(r\) = XXX, \(t\)(df) = YYY, \(p\) = ZZZ
- If a \(p\)-value is less than \(\alpha = .01\), write only \(p < .01^{*}\).
Example
Here is a question, a statistical test, and an answer in the appropriate (APA) format:
Question: Do pirates with headbands have different numbers of tattoos than those not wearing headbands?
We can find out whether this is the case by using a 2-sample \(t\)-test on the pirates data set:
library(yarrr) # for the pirates data set
t.test(tattoos ~ headband, data = pirates)
#>
#> Welch Two Sample t-test
#>
#> data: tattoos by headband
#> t = -19.313, df = 146.73, p-value < 2.2e-16
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> -5.878101 -4.786803
#> sample estimates:
#> mean in group no mean in group yes
#> 4.699115 10.031567
Answer: Pirates with headbands have significantly more tattoos on average than those who do not wear headbands (10.0 vs. 4.7, respectively): \(t(146.73) = -19.31\), \(p < .01^{*}\).
T-tests
1. The Nile data set available in R lists 100 measurements of the annual flow of the river Nile (in \(10^{8} m^{3}\)), at Aswan, from 1871 to 1970. Inspect this data set first.
str(Nile)
#> Time-Series [1:100] from 1871 to 1970: 1120 1160 963 1210 1160 1160 813 1230 1370 1140 ...
# head(Nile)
# ?Nile
1a. Does the annual flow differ from 1000 (\(10^{8} m^{3}\)) when considering the entire measurement period (1871–1970)? (Hint: Comparing the mean of an interval-scaled variable to a reference value calls for a 1-sample \(t\)-test.)
# t-test 1a:
t.test(x = Nile,  # all data (1871 - 1970)
       mu = 1000) # Null hypothesis: Mean is 1000
#>
#> One Sample t-test
#>
#> data: Nile
#> t = -4.7658, df = 99, p-value = 6.461e-06
#> alternative hypothesis: true mean is not equal to 1000
#> 95 percent confidence interval:
#> 885.7716 952.9284
#> sample estimates:
#> mean of x
#> 919.35
Answer: Yes, the average annual flow of 919.4 is significantly lower than 1000 (\(10^{8} m^{3}\)): \(t(99) = -4.77\), \(p < .01^{*}\).
1b. Someone claims that (a) the flow before the year 1898 is higher than 1000 (\(10^{8} m^{3}\)) and (b) the flow from the year 1898 onwards is lower than 1000 (\(10^{8} m^{3}\)). Test both of these claims (with two 1-sample \(t\)-tests on parts of the data). (Hint: Determine the appropriate ranges of the Nile vector to conduct 1-sample \(t\)-tests on them.)
# At which position is the year 1898 in this vector?
years <- 1871:1970 # make a vector of the range of years

# (a) Using numerical indexing:
my.year <- 1898
ix.yr <- which(years == my.year) # at which position in years is 1898?
data.pre.change <- Nile[1:(ix.yr - 1)]
data.post.change <- Nile[ix.yr:length(Nile)]

# (b) Using logical indexing:
data.pre.change <- Nile[years < 1898]   # use Booleans to indicate the desired range
data.post.change <- Nile[years >= 1898]
# t-test 1b: Using data before 1898:
test.1 <- t.test(x = data.pre.change, # use only data before 1898
                 mu = 1000)           # Null hypothesis: Mean is 1000
test.1
#>
#> One Sample t-test
#>
#> data: data.pre.change
#> t = 3.689, df = 26, p-value = 0.001046
#> alternative hypothesis: true mean is not equal to 1000
#> 95 percent confidence interval:
#> 1043.247 1152.086
#> sample estimates:
#> mean of x
#> 1097.667
# t-test 1b: Using data from 1898 onwards:
test.2 <- t.test(x = data.post.change, # use only data from 1898 onwards
                 mu = 1000)            # Null hypothesis: Mean is 1000
test.2
#>
#> One Sample t-test
#>
#> data: data.post.change
#> t = -9.8383, df = 72, p-value = 5.854e-15
#> alternative hypothesis: true mean is not equal to 1000
#> 95 percent confidence interval:
#> 823.6923 883.1022
#> sample estimates:
#> mean of x
#> 853.3973
# To report results in APA format:
yarrr::apa(test.1)
#> [1] "mean = 1097.67, t(26) = 3.69, p < 0.01 (2-tailed)"
yarrr::apa(test.2)
#> [1] "mean = 853.4, t(72) = -9.84, p < 0.01 (2-tailed)"
Answer: Both claims are true:
Before the year 1898, the average annual flow of 1097.7 was significantly higher than 1000: \(t(26) = 3.69\), \(p < .01^{*}\).
From the year 1898 onwards, the average annual flow of 853.4 was significantly lower than 1000: \(t(72) = -9.84\), \(p < .01^{*}\).
1c. Directly test whether the average annual flow before 1898 and from 1898 onwards differed systematically. (Hint: This requires a 2-sample \(t\)-test on the Nile ranges from above.)
# t-test 1c:
test.3 <- t.test(x = data.pre.change,  # only data before 1898
                 y = data.post.change) # only data from 1898 onwards
test.3 # print the test results
#>
#> Welch Two Sample t-test
#>
#> data: data.pre.change and data.post.change
#> t = 8.0404, df = 43.506, p-value = 3.824e-10
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> 183.0224 305.5164
#> sample estimates:
#> mean of x mean of y
#> 1097.6667 853.3973
apa(test.3) # print results in APA format
#> [1] "mean difference = -244.27, t(43.51) = 8.04, p < 0.01 (2-tailed)"
Answer: Yes, the flow was significantly reduced (with the same means as reported above): \(t(43.5) = 8.04\), \(p < .01^{*}\).
1d. Plot the annual Nile flow data, drawing a vertical line at the year 1898, and horizontal lines for the averages before 1898 and from the year 1898 onwards.
plot(Nile, lwd = 2, col = "gray50")
mn.pre <- mean(data.pre.change)
mn.post <- mean(data.post.change)
abline(v = my.year, lty = 2, lwd = 2, col = "steelblue3")
abline(h = mn.pre, lty = 3, lwd = 2, col = "green3")
abline(h = mn.post, lty = 3, lwd = 2, col = "red3")
# Add text labels:
text(x = my.year + 4,
     y = max(Nile),
     labels = paste(my.year),
     col = "steelblue3") # match the color of the vertical line
text(x = 1886,
     y = max(Nile),
     labels = paste0("before ", my.year, ": ", round(mn.pre, 0)),
     col = "green3")
text(x = 1940,
     y = min(Nile) + 100,
     labels = paste0("after ", my.year, ": ", round(mn.post, 0)),
     col = "red3")
Correlation tests
2. The mtcars data set available in R records fuel consumption (in mpg) and 10 further variables of 32 cars (from the 1974 issue of Motor Trend magazine). We’ll use these data to test some hypotheses about the correlation \(r\) between two numeric variables.
str(mtcars)
#> 'data.frame': 32 obs. of 11 variables:
#> $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
#> $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
#> $ disp: num 160 160 108 258 360 ...
#> $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
#> $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
#> $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
#> $ qsec: num 16.5 17 18.6 19.4 17 ...
#> $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
#> $ am : num 1 1 1 0 0 0 0 0 0 0 ...
#> $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
#> $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
# head(mtcars)
# ?mtcars
2a. Is there a systematic relationship between a car’s fuel consumption (mpg) and its gross horsepower (hp)? (Hint: Use a correlation test with the vector notation on columns of mtcars to test this relationship.)
my.test <- cor.test(x = mtcars$hp,
                    y = mtcars$mpg)
my.test
#>
#> Pearson's product-moment correlation
#>
#> data: mtcars$hp and mtcars$mpg
#> t = -6.7424, df = 30, p-value = 1.788e-07
#> alternative hypothesis: true correlation is not equal to 0
#> 95 percent confidence interval:
#> -0.8852686 -0.5860994
#> sample estimates:
#> cor
#> -0.7761684
# Note that you can print the results in APA format with the apa() function of the `yarrr` package:
# library(yarrr)
yarrr::apa(my.test)
#> [1] "r = -0.78, t(30) = -6.74, p < 0.01 (2-tailed)"
Answer: Yes, there is a significant negative correlation between fuel consumption and horsepower: r = -0.78, t(30) = -6.74, p < 0.01 (2-tailed).
2b. Is there a systematic relationship between a car’s weight (wt) and its displacement (disp)? (Hint: Use a correlation test with the formula notation to test this relationship.)
cor.test(formula = ~ wt + disp,
         data = mtcars)
#>
#> Pearson's product-moment correlation
#>
#> data: wt and disp
#> t = 10.576, df = 30, p-value = 1.222e-11
#> alternative hypothesis: true correlation is not equal to 0
#> 95 percent confidence interval:
#> 0.7811586 0.9442902
#> sample estimates:
#> cor
#> 0.8879799
Answer: Yes, there is a significant positive correlation between a car’s weight and displacement: r = 0.89, t(30) = 10.58, p < 0.01 (2-tailed).
2c. Does the systematic relationship between a car’s fuel consumption (mpg) and its gross horsepower (hp) still hold when only considering cars with automatic transmission (am == 0) and more than four cylinders (cyl > 4)?
cor.test(formula = ~ hp + mpg,
         data = mtcars,
         subset = (am == 0 & cyl > 4))
#>
#> Pearson's product-moment correlation
#>
#> data: hp and mpg
#> t = -3.7496, df = 14, p-value = 0.002155
#> alternative hypothesis: true correlation is not equal to 0
#> 95 percent confidence interval:
#> -0.8909390 -0.3268242
#> sample estimates:
#> cor
#> -0.7078515
Answer: Yes, there is still a significant negative correlation between fuel consumption and horsepower for the subset of cars with automatic transmission and more than four cylinders: r = -0.71, t(14) = -3.75, p < 0.01 (2-tailed).
2d. Plot the relationship of Exercise 2a. and add a linear regression line and a label stating the value of the correlation.
plot(x = mtcars$hp,
     y = mtcars$mpg,
     pch = 21,
     bg = yarrr::transparent("orange", .50),
     col = yarrr::transparent("red4", .25),
     ylim = c(0, max(mtcars$mpg)),
     main = "MPG by horse power",
     xlab = "Horse power",
     ylab = "Fuel consumption (mpg)")
# Add regression line:
abline(lm(mtcars$mpg ~ mtcars$hp),
       col = "steelblue", lty = 3, lwd = 2)
# Add correlation test label:
text(x = 290, y = 8,
     labels = paste0("cor = ", round(cor(mtcars$mpg, mtcars$hp), 2)),
     col = "steelblue")
Chi-square tests
3. The mtcars data set can also be used to test some hypotheses about the equal distribution of (the frequency of) instances in categories (with \(\chi^{2}\)-tests).
3a. What are the frequencies of cars with automatic vs. manual transmission (am)? What are the frequencies of cars with different numbers of cylinders (cyl)? Are both of these categories distributed equally or unequally? (Hints: Obtain two table()s and conduct two corresponding 1-sample \(\chi^{2}\)-tests to find out.)
# 1 sample: transmission
t1a <- table(mtcars$am)
t1a
#>
#> 0 1
#> 19 13
# One-sample chi-square test:
chisq.test(x = t1a)
#>
#> Chi-squared test for given probabilities
#>
#> data: t1a
#> X-squared = 1.125, df = 1, p-value = 0.2888
Answer: There are 19 cars with automatic transmission and 13 cars with manual transmission, but we cannot reject the hypothesis that both types of transmission occur with equal frequency: \(\chi^{2}(1) = 1.13\), \(p = .29\).
# 1 sample: cylinders
t1b <- table(mtcars$cyl)
t1b
#>
#> 4 6 8
#> 11 7 14
# One-sample chi-square test:
chisq.test(x = t1b)
#>
#> Chi-squared test for given probabilities
#>
#> data: t1b
#> X-squared = 2.3125, df = 2, p-value = 0.3147
Answer: There are 11 (7, 14) cars with 4 (6, 8) cylinders, respectively, but we cannot reject the hypothesis that cars with these numbers of cylinders occur with equal frequency: \(\chi^{2}(2) = 2.31\), \(p = .31\).
3b. Do the frequencies of cars with automatic vs. manual transmission (am) vary as a function of the number of cylinders (cyl)? (Hints: Obtain a two-way table() and a corresponding 2-sample \(\chi^{2}\)-test to find out.)
# 2 samples:
t2 <- table(mtcars$am, mtcars$cyl)
t2
#>
#> 4 6 8
#> 0 3 4 12
#> 1 8 3 2
# Two-sample chi-square test:
chisq.test(x = t2)
#>
#> Pearson's Chi-squared test
#>
#> data: t2
#> X-squared = 8.7407, df = 2, p-value = 0.01265
Answer: The following table shows the frequency of cars by type of transmission and their number of cylinders:
| Transmission | 4 cyl | 6 cyl | 8 cyl |
|---|---|---|---|
| automatic (0) | 3 | 4 | 12 |
| manual (1) | 8 | 3 | 2 |
We can reject the hypothesis that this distribution is equal: \(\chi^{2}(2) = 8.74\), \(p < .05^{*}\) (or \(p = .013\)). It looks like cars with automatic transmission tend to have more cylinders, whereas cars with manual transmission tend to have fewer. (However, due to the low counts in some cells, this interpretation has to be treated with caution.)
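Given the small expected cell counts, one way to double-check this result is Fisher’s exact test (a sketch using base R’s fisher.test(), which does not rely on the \(\chi^{2}\) approximation):
fisher.test(t2) # exact test for count data in a two-way table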
Checkpoint 1
At this point you have completed all basic exercises. This is good, but additional practice will deepen your understanding, so please carry on…
B. At Home:
A Student Survey
In this part, we will analyze data from a fictional survey of 100 students.
The data are located in a tab-delimited text file at http://Rpository.com/down/data/WPA06.txt.
Data description
The data file has 100 rows and 12 columns. The columns contain variables that are defined as follows:
- sex (string): A string indicating the sex of the participant (“m” = male, “f” = female).
- age (integer): An integer indicating the age of the participant.
- major (string): A string indicating the participant’s major.
- haircolor (string): The participant’s hair color.
- iq (integer): The participant’s score on an IQ test.
- country (string): The participant’s country of origin.
- logic (numeric): Amount of time it took the participant to complete a logic problem (smaller is better).
- siblings (integer): The participant’s number of siblings.
- multitasking (integer): The participant’s score on a multitasking task (higher is better).
- partners (integer): The participant’s number of sexual partners (so far).
- marijuana (binary): Has the participant ever tried marijuana? (“0” = no, “1” = yes).
- risk (binary): Would the participant play a gamble with a 50% chance of losing EUR 20 and a 50% chance of earning EUR 20? (“0” means the participant would not play the gamble, “1” means he or she would agree to play).
Data loading and preparation
4a. The data file is available online at http://Rpository.com/down/data/WPA06.txt. Load this data into R as a new object called wpa6.df. (Hint: Use read.table() and note that the file contains a header row and is tab-delimited.)
# Read data into a data frame wpa6.df:
# (Alternative source: http://nathanieldphillips.com/wp-content/uploads/2016/11/wpa6.txt)
wpa6.df <- read.table(file = "http://Rpository.com/down/data/WPA06.txt",
                      header = TRUE, # there is a header row
                      sep = "\t")    # data are tab-delimited
4b. Save the data as a comma-delimited text file (entitled wpa6.txt) into your local data folder of your project’s working directory. (Hint: Use write.table() with appropriate arguments.)
# Write wpa6.df to a comma-delimited text file in your local data folder:
write.table(wpa6.df,                # object to be written
            file = "data/wpa6.txt", # save wpa6.txt into the data folder
            sep = ",")              # save in comma-delimited format
4c. Use the head(), str(), and View() functions to inspect the dataset and make sure that it was loaded correctly. If the data don’t look correct (i.e., if there isn’t a header row plus 100 rows and 12 columns), you probably didn’t load it correctly.
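For example (View() opens an interactive data viewer, so it is best left commented out when knitting):
head(wpa6.df) # shows the first 6 rows
str(wpa6.df)  # should report 100 obs. of 12 variables
# View(wpa6.df) # spreadsheet-like view (interactive use only)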
T-test(s)
5. Let’s first test some simple hypotheses about the means of groups or populations.
5a. The average IQ in the general population is 100. Do the participants of our survey have an IQ different from the general population? (Hint: Answer this question with a one-sample t-test.)
t.test(x = wpa6.df$iq,
       mu = 100)
#>
#> One Sample t-test
#>
#> data: wpa6.df$iq
#> t = 20.847, df = 99, p-value < 2.2e-16
#> alternative hypothesis: true mean is not equal to 100
#> 95 percent confidence interval:
#> 109.0935 111.0065
#> sample estimates:
#> mean of x
#> 110.05
Answer: Yes, participants’ IQ values differ from the general population. More specifically, they have significantly higher IQ values than the general population: \(t(99) = 20.85\), \(p < .01^{*}\).
5b. The friend of your mother’s hairdresser’s cousin claims that students have 2.5 siblings on average. Test this claim with a one-sample t-test.
t.test(x = wpa6.df$siblings,
       mu = 2.5)
#>
#> One Sample t-test
#>
#> data: wpa6.df$siblings
#> t = -0.9083, df = 99, p-value = 0.3659
#> alternative hypothesis: true mean is not equal to 2.5
#> 95 percent confidence interval:
#> 2.181545 2.618455
#> sample estimates:
#> mean of x
#> 2.4
Answer: The data do not reject the person’s claim — that is, the mean number of siblings of the students in our survey is not significantly different from 2.5: \(t(99) = -0.91\), \(p = 0.37\). (Note that this does not mean that the claim is confirmed.)
5c. Do students that have smoked marijuana have different IQ levels than those who have never smoked marijuana? (Hint: Test this hypothesis with a two-sample t-test. You can either use the vector or the formula notation for a t-test.)
t.test(formula = iq ~ marijuana,
       data = wpa6.df)
#>
#> Welch Two Sample t-test
#>
#> data: iq by marijuana
#> t = -0.72499, df = 96.078, p-value = 0.4702
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> -2.610103 1.213545
#> sample estimates:
#> mean in group 0 mean in group 1
#> 109.6939 110.3922
Answer: We do not have evidence that IQ values vary as a function of marijuana use: \(t(96.08) = -0.72\), \(p = 0.47\).
Correlation test(s)
6. Does some numeric variable vary as a function of another one?
6a. Do students with higher multitasking skills tend to have more romantic partners than those with lower multitasking skills? (Hint: Test this with a correlation test.)
cor.test(formula = ~ multitasking + partners,
         data = wpa6.df,
         alternative = "greater")
#>
#> Pearson's product-moment correlation
#>
#> data: multitasking and partners
#> t = -0.79444, df = 98, p-value = 0.7856
#> alternative hypothesis: true correlation is greater than 0
#> 95 percent confidence interval:
#> -0.2422602 1.0000000
#> sample estimates:
#> cor
#> -0.07999296
Answer: We do not have evidence that people with higher multitasking skills tend to have more romantic partners than those with lower skills: \(t(98) = -0.79\), \(p = 0.79\).
6b. Do people with higher IQs perform better (i.e., faster) on the logic test? (Hint: Answer this question with a correlation test.)
cor.test(formula = ~ iq + logic,
         data = wpa6.df,
         alternative = "less")
#>
#> Pearson's product-moment correlation
#>
#> data: iq and logic
#> t = 1.2065, df = 98, p-value = 0.8847
#> alternative hypothesis: true correlation is less than 0
#> 95 percent confidence interval:
#> -1.0000000 0.2808334
#> sample estimates:
#> cor
#> 0.1209815
Answer: We do not have evidence that people with higher IQs perform faster on the logic test: \(t(98) = 1.21\), \(p = 0.88\).
Chi-square test(s)
7. Are the instances of categorical variables equally or unequally distributed?
7a. Are some majors more popular than others? (Hint: Answer this question with a one-sample chi-square test, but also check the frequencies.)
chisq.test(table(wpa6.df$major))
#>
#> Chi-squared test for given probabilities
#>
#> data: table(wpa6.df$major)
#> X-squared = 130.96, df = 3, p-value < 2.2e-16
table(wpa6.df$major)
#>
#> biology economics psychology sociology
#> 3 15 74 8
Answer: We do find significant evidence that some majors are more popular than others: \(\chi^{2}(3) = 130.96\), \(p < .01^{*}\). (Psychology seems to be the most popular major in our sample.)
7b. In general, were the students in this survey more likely to take a risk than not? (Hint: Answer this question with a one-sample chi-square test, but also check the frequencies.)
chisq.test(table(wpa6.df$risk))
#>
#> Chi-squared test for given probabilities
#>
#> data: table(wpa6.df$risk)
#> X-squared = 17.64, df = 1, p-value = 2.669e-05
# But look at the direction!!!
table(wpa6.df$risk)
#>
#> 0 1
#> 71 29
Answer: No! We do not find significant evidence that people are more likely to take a risk than not. In fact, we find evidence that they were significantly less likely to take the risk than to decline it: \(\chi^{2}(1) = 17.64\), \(p < .01^{*}\).
7c. Is there a relationship between hair color and students’ academic major? (Hint: Answer this with a two-sample chi-square test, but also check the frequencies.)
table(wpa6.df$major, wpa6.df$haircolor)
#>
#> black blonde brown red
#> biology 0 0 3 0
#> economics 2 5 7 1
#> psychology 15 20 31 8
#> sociology 1 3 4 0
chisq.test(table(wpa6.df$major, wpa6.df$haircolor))
#>
#> Pearson's Chi-squared test
#>
#> data: table(wpa6.df$major, wpa6.df$haircolor)
#> X-squared = 5.9227, df = 9, p-value = 0.7476
Answer: We do not find significant evidence for a relationship between hair color and major: \(\chi^{2}(9) = 5.92\), \(p = 0.75\).
You pick the test!
8. In the following exercises it’s up to you to select an appropriate test.
8a. Is there a relationship between whether a student has ever smoked marijuana and his/her decision to accept or reject the risky gamble?
chisq.test(table(wpa6.df$marijuana, wpa6.df$risk))
#>
#> Pearson's Chi-squared test with Yates' continuity correction
#>
#> data: table(wpa6.df$marijuana, wpa6.df$risk)
#> X-squared = 0.56829, df = 1, p-value = 0.4509
Answer: We do not find significant evidence that marijuana use is related to selecting the risky gamble: \(\chi^{2}(1) = 0.57\), \(p = 0.45\).
8b. Do males and females have different numbers of sexual partners on average?
t.test(partners ~ sex,
       data = wpa6.df)
#>
#> Welch Two Sample t-test
#>
#> data: partners by sex
#> t = 0.9219, df = 93.765, p-value = 0.3589
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> -0.7615073 2.0815073
#> sample estimates:
#> mean in group f mean in group m
#> 7.54 6.88
Answer: We do not find significant evidence that males and females have a different number of sexual partners on average: \(t(93.77) = 0.92\), \(p = 0.36\).
8c. Do males and females differ in how likely they are to have smoked marijuana?
chisq.test(table(wpa6.df$marijuana, wpa6.df$sex))
#>
#> Pearson's Chi-squared test with Yates' continuity correction
#>
#> data: table(wpa6.df$marijuana, wpa6.df$sex)
#> X-squared = 0.16006, df = 1, p-value = 0.6891
Answer: We do not find significant evidence that males and females differ in how likely they are to have smoked marijuana: \(\chi^{2}(1) = 0.16\), \(p = 0.69\).
8d. Do people who have smoked marijuana have different logic scores on average than those who never have smoked marijuana?
t.test(logic ~ marijuana, data = wpa6.df)
#>
#> Welch Two Sample t-test
#>
#> data: logic by marijuana
#> t = 0.95997, df = 90.88, p-value = 0.3396
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> -0.4373672 1.2554465
#> sample estimates:
#> mean in group 0 mean in group 1
#> 9.945510 9.536471
Answer: We do not find significant evidence that those who have smoked marijuana have different logic scores on average than those who have not smoked marijuana: \(t(90.88) = 0.96\), \(p = 0.34\).
8e. Do people with higher IQ scores tend to perform better on the logic test than those with lower IQ scores?
cor.test(~ iq + logic,
         data = wpa6.df,
         alternative = "less")
#>
#> Pearson's product-moment correlation
#>
#> data: iq and logic
#> t = 1.2065, df = 98, p-value = 0.8847
#> alternative hypothesis: true correlation is less than 0
#> 95 percent confidence interval:
#> -1.0000000 0.2808334
#> sample estimates:
#> cor
#> 0.1209815
Answer: We do not find significant evidence that people with higher IQ scores tend to perform better on the logic test than those with lower IQ scores: \(t(98) = 1.21\), \(p = 0.88\).
Checkpoint 2
If you got this far you’re doing great, but don’t give up just yet…
More complicated tests
9. The following exercises typically require you to first select an appropriate subset of the data.
9a. Are Swiss students more likely than not to have tried marijuana? (Hint: Use a one-sample chi-square test on a suitable subset of the data, but also check the frequencies.)
mar.swi <- wpa6.df$marijuana[wpa6.df$country == "switzerland"]
table(mar.swi) # to see frequencies
#> mar.swi
#> 0 1
#> 8 12
chisq.test(table(mar.swi)) # test the frequency table, not the raw 0/1 vector
#>
#> Chi-squared test for given probabilities
#>
#> data: table(mar.swi)
#> X-squared = 0.8, df = 1, p-value = 0.3711
Answer: Although more Swiss students in our sample have tried marijuana than not (12 vs. 8), this difference is not significant, so we do not find evidence that Swiss students were more likely than not to have tried marijuana: \(\chi^{2}(1) = 0.8\), \(p = .37\).
9b. Does the relationship between IQ scores and multitasking performance differ between males and females? (Hint: Test this by conducting two separate tests — one for males and one for females. Do your conclusions differ?)
# For males:
cor.test(formula = ~ iq + multitasking,
         data = subset(wpa6.df, sex == "m"))
#>
#> Pearson's product-moment correlation
#>
#> data: iq and multitasking
#> t = 0.40605, df = 48, p-value = 0.6865
#> alternative hypothesis: true correlation is not equal to 0
#> 95 percent confidence interval:
#> -0.2234787 0.3314582
#> sample estimates:
#> cor
#> 0.05850849
# For females:
cor.test(formula = ~ iq + multitasking,
         data = subset(wpa6.df, sex == "f"))
#>
#> Pearson's product-moment correlation
#>
#> data: iq and multitasking
#> t = -1.45, df = 48, p-value = 0.1536
#> alternative hypothesis: true correlation is not equal to 0
#> 95 percent confidence interval:
#> -0.45713553 0.07793781
#> sample estimates:
#> cor
#> -0.204854
Answer: While the raw correlations for males and females differ (\(r = .06\) vs. \(r = -.20\), respectively), we do not find a significant correlation between IQ scores and multitasking performance for either males or females: \(t(48) = .41\), \(p = 0.69\) and \(t(48) = -1.45\), \(p = 0.15\), respectively.
9c. Do the IQ scores of people with brown hair differ from those with blonde hair? (Hint: This is a two-sample t-test that requires you to use subset() to tell R which two groups you want to compare.)
t.test(iq ~ haircolor,
       data = subset(wpa6.df, haircolor %in% c("brown", "blonde")))
#>
#> Welch Two Sample t-test
#>
#> data: iq by haircolor
#> t = 0.50574, df = 59.053, p-value = 0.6149
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> -1.757470 2.946358
#> sample estimates:
#> mean in group blonde mean in group brown
#> 109.7500 109.1556
Answer: We do not find a significant difference in IQ scores between blondes and brown-haired people: \(t(59.05) = 0.51\), \(p = 0.61\).
9d. Only for men from Germany, is there a relationship between age and IQ?
cor.test(formula = ~ age + iq,
         data = subset(wpa6.df, country == "germany"))
#>
#> Pearson's product-moment correlation
#>
#> data: age and iq
#> t = 1.1545, df = 58, p-value = 0.253
#> alternative hypothesis: true correlation is not equal to 0
#> 95 percent confidence interval:
#> -0.1081605 0.3890006
#> sample estimates:
#> cor
#> 0.1498806
Answer: We do not find a significant relationship between age and IQ for students from Germany: \(t(58) = 1.15\), \(p = .25\). (Note that the test above subsets on country only; to strictly consider only men, also filter on sex, e.g., subset(wpa6.df, country == "germany" & sex == "m"), and rerun the test.)
9e. Considering only people who chose the risky gamble, do people that have smoked marijuana have more sexual partners than those who have never smoked marijuana?
t.test(formula = partners ~ marijuana,
       data = subset(wpa6.df, risk == 1))
#>
#> Welch Two Sample t-test
#>
#> data: partners by marijuana
#> t = 0.87669, df = 25.529, p-value = 0.3888
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> -1.307151 3.248327
#> sample estimates:
#> mean in group 0 mean in group 1
#> 7.500000 6.529412
Answer: No, for the people who chose the risky gamble, we do not find evidence for a relationship between marijuana use and their average number of sexual partners: \(t(25.53) = 0.88\), \(p = 0.39\).
9f. Considering only people who chose the risky gamble and have never tried marijuana, is there a relationship between IQ scores and performance on the logic test?
cor.test(formula = ~ iq + logic,
         data = subset(wpa6.df, risk == 1 & marijuana == 0))
#>
#> Pearson's product-moment correlation
#>
#> data: iq and logic
#> t = 1.4249, df = 10, p-value = 0.1846
#> alternative hypothesis: true correlation is not equal to 0
#> 95 percent confidence interval:
#> -0.2134024 0.7968450
#> sample estimates:
#> cor
#> 0.4108122
Answer: We do not find evidence for a relationship between IQ scores and logic test performance for those who chose the risky gamble and have not tried marijuana: \(t(10) = 1.42\), \(p = 0.18\).
Checkpoint 3
If you got this far you’re doing an amazing job — well done! Enjoy the following challenge…
C. Challenges
Anscombe’s quartet
10. In the next few exercises, we’ll explore Anscombe’s famous data quartet. This dataset illustrates the dangers of interpreting statistical tests (like a correlation test) without first plotting the data!
10a. Run the following code to create the anscombe.df dataframe. This dataframe contains 4 datasets (x1 and y1, x2 and y2, x3 and y3, and x4 and y4):
# Just COPY, PASTE, and RUN this code:
anscombe.df <- data.frame(x1 = c(10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5),
                          y1 = c(8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 4.68),
                          x2 = c(10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5),
                          y2 = c(9.14, 8.14, 8.74, 8.77, 9.26, 8.1, 6.13, 3.1, 9.13, 7.26, 4.74),
                          x3 = c(10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5),
                          y3 = c(7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73),
                          x4 = c(8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8),
                          y4 = c(6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.5, 5.56, 7.91, 6.89))
10b. Calculate the four correlations between \(x1\) and \(y1\), \(x2\) and \(y2\), \(x3\) and \(y3\), and \(x4\) and \(y4\) separately (i.e., what is the correlation between \(x1\) and \(y1\)? Next, what is the correlation between \(x2\) and \(y2\)? etc.). What can you report about the correlation values for each test?
cor.test(~x1 + y1, data = anscombe.df) # 1
#>
#> Pearson's product-moment correlation
#>
#> data: x1 and y1
#> t = 4.4844, df = 9, p-value = 0.001523
#> alternative hypothesis: true correlation is not equal to 0
#> 95 percent confidence interval:
#> 0.4612713 0.9549196
#> sample estimates:
#> cor
#> 0.8311601
cor.test(~x2 + y2, data = anscombe.df) # 2
#>
#> Pearson's product-moment correlation
#>
#> data: x2 and y2
#> t = 4.2386, df = 9, p-value = 0.002179
#> alternative hypothesis: true correlation is not equal to 0
#> 95 percent confidence interval:
#> 0.4239389 0.9506402
#> sample estimates:
#> cor
#> 0.8162365
cor.test(~x3 + y3, data = anscombe.df) # 3
#>
#> Pearson's product-moment correlation
#>
#> data: x3 and y3
#> t = 4.2394, df = 9, p-value = 0.002176
#> alternative hypothesis: true correlation is not equal to 0
#> 95 percent confidence interval:
#> 0.4240623 0.9506547
#> sample estimates:
#> cor
#> 0.8162867
cor.test(~x4 + y4, data = anscombe.df) # 4
#>
#> Pearson's product-moment correlation
#>
#> data: x4 and y4
#> t = 4.243, df = 9, p-value = 0.002165
#> alternative hypothesis: true correlation is not equal to 0
#> 95 percent confidence interval:
#> 0.4246394 0.9507224
#> sample estimates:
#> cor
#> 0.8165214
Answer: The correlations are all statistically significant (\(p < .01^{*}\)) and all of almost the same magnitude (\(r \approx .82\) to \(.83\)).
10c. Now run the following code to generate a scatterplot of each data pair. What do you find?
# Just COPY, PASTE, and RUN this code:
# Plot the famous Anscombe quartet:
par(mfrow = c(2, 2)) # create a 2 x 2 plotting grid

for (i in 1:4) { # loop over the 4 datasets

  # Assign x and y for the current value of i:
  if (i == 1) {x <- anscombe.df$x1; y <- anscombe.df$y1}
  if (i == 2) {x <- anscombe.df$x2; y <- anscombe.df$y2}
  if (i == 3) {x <- anscombe.df$x3; y <- anscombe.df$y3}
  if (i == 4) {x <- anscombe.df$x4; y <- anscombe.df$y4}

  # Create the corresponding plot:
  plot(x = x, y = y, pch = 21,
       main = paste("Anscombe", i, sep = " "),
       bg = "orange", col = "red4",
       xlim = c(0, 20), ylim = c(0, 15))

  # Add regression line:
  abline(lm(y ~ x, data = data.frame(y, x)),
         col = "steelblue", lty = 3, lwd = 2)

  # Add correlation test text:
  text(x = 3, y = 12,
       labels = paste0("cor = ", round(cor(x, y), 2)),
       col = "steelblue")
}

par(mfrow = c(1, 1)) # reset plotting grid
Answer: Despite their statistical similarities (in test results), the four data sets are very different.
Conclusion: What you can see in the four scatterplots is the famous Anscombe’s quartet, a dataset designed to show how important it is to always plot your data before running a statistical test on it. You can learn more about it on the corresponding Wikipedia page.
Also, see blog posts on The Datasaurus Dozen for similar examples that illustrate the importance of visualizing data.
That’s it – hope you enjoyed working on this assignment!
[WPA06_answers.Rmd updated on 2017-12-07 21:34:54 by hn.]