Answers for WPA03 of Basic data and decision analysis in R, taught at the University of Konstanz in Winter 2017/2018.

To complete and submit these exercises, please remember and do the following:

Your WPAs can be written and submitted either as scripts of commented code (as .R or .Rmd files) or as reproducible documents that combine text with code (in .html or .pdf formats).
- A simple .Rmd template is provided here.
- Alternatively, open a plain R script and save it as LastnameFirstname_WPA##_yymmdd.R.
Also enter the current assignment (e.g., WPA03), your name, and the current date at the top of your document. When working on a task, always indicate which task you are answering with appropriate comments.

Here is an example of how your file JamesJane_WPA03_171113.R could look:

# Assignment: WPA 03
# Name: Jane, James
# Date: 2017 Nov 13
# ~~~~~~~~~~~~~~~~~~~~~~~~~~
# A. In Class

# Combining vectors to matrices and data frames:

# Exercise 1:
a <- letters[1:3] # define some vector

# ...

Complete as many exercises as you can by Wednesday (23:59).
Submit your script or output file (including all code) to the appropriate folder on Ilias.

A. In Class

Here are some warm-up exercises that review the basic concepts of the current chapters:

Combining vectors to matrices and data frames

1. Define these vectors in your R session to complete the following exercises:

# Define some vectors:
a <- letters[1:3]
b <- letters[4:6]
c <- letters[7:9]
x <- 1:3
y <- 4:6
z <- 7:9

1a. Define the following matrices without using the matrix() command. (Hint: Use cbind() and rbind() to combine the vectors defined above.)

m1
#>      x   y   z   a   b   c  
#> [1,] "1" "4" "7" "a" "d" "g"
#> [2,] "2" "5" "8" "b" "e" "h"
#> [3,] "3" "6" "9" "c" "f" "i"

m2
#>   [,1] [,2] [,3]
#> a "a"  "b"  "c" 
#> b "d"  "e"  "f" 
#> c "g"  "h"  "i" 
#> x "1"  "2"  "3" 
#> y "4"  "5"  "6" 
#> z "7"  "8"  "9"

m3
#>   a   b   c              
#> x "a" "d" "g" "1" "2" "3"
#> y "b" "e" "h" "4" "5" "6"
#> z "c" "f" "i" "7" "8" "9"

m1 <- cbind(x, y, z, a, b, c)

m2 <- rbind(a, b, c, x, y, z)

m3 <- cbind(cbind(a, b, c), rbind(x, y, z))

1b. What is the type of these three matrices? (Use the typeof() function to compare the types of the vectors and matrices.)

typeof(a)  # ==> "character"
#> [1] "character"
typeof(x)  # ==> "integer"
#> [1] "integer"
typeof(m3) # ==> "character"
#> [1] "character"
# Answer: Matrices containing a mix of "character" and "integer" elements are of type "character".
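
The coercion behind this answer can be illustrated with a minimal sketch. The vectors num, lgl and chr are toy examples (not part of the exercise): a matrix stores a single type, so R converts all elements to the most general type present.

```r
# Toy vectors (hypothetical, for illustration only):
num <- c(1.5, 2, 3)
lgl <- c(TRUE, FALSE, TRUE)
chr <- c("a", "b", "c")

m.nl <- cbind(num, lgl) # double + logical:   all elements become double
m.nc <- cbind(num, chr) # double + character: all elements become character

typeof(m.nl) # "double"
typeof(m.nc) # "character"

# Numbers that were stored as characters can be converted back:
back <- as.numeric(m.nc[, "num"])
back
```

Data frames avoid this conversion, as each column keeps its own type.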

1c. Re-create the matrices m1 and m2 without using the cbind() and rbind() commands and without using the vectors defined above. (Hint: Use matrix() and define the data of each matrix as a new vector. Note that the names of rows or columns may vary depending on your construction method, but ensure that all matrix elements match the original ones.)

m1b <- matrix(data = c(1:9, letters[1:9]),
              nrow = 3,
              ncol = 6,
              byrow = FALSE)

m2b <- matrix(data = c(letters[1:9], 1:9),
              nrow = 6,
              ncol = 3,
              byrow = TRUE)

1d. Re-create matrix m3 without using the rbind() command. (Hint: Use matrix() to create two 3x3 matrices and combine them with cbind().)

m3a <- matrix(data = letters[1:9],
              nrow = 3,
              ncol = 3,
              byrow = FALSE)

m3b <- matrix(data = c(1:9),
              nrow = 3,
              ncol = 3,
              byrow = TRUE)

m3d <- cbind(m3a, m3b)
m3d
#>      [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] "a"  "d"  "g"  "1"  "2"  "3" 
#> [2,] "b"  "e"  "h"  "4"  "5"  "6" 
#> [3,] "c"  "f"  "i"  "7"  "8"  "9"

# Note that m3 and m3d have different row and column names.  However, 
m3d == m3 # verifies that they share the same elements.
#>      a    b    c               
#> x TRUE TRUE TRUE TRUE TRUE TRUE
#> y TRUE TRUE TRUE TRUE TRUE TRUE
#> z TRUE TRUE TRUE TRUE TRUE TRUE

2. This exercise addresses some details of the cbind() and rbind() functions:

2a. Assume that vector x changed from 1:3 to 1:9, but y and z remained unchanged. Predict, check and explain the result of rbind(x, y, z) and cbind(x, y, z).

x <- 1:9

rbind(x, y, z) # elements of y and z are recycled (to match the length of x):
#>   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
#> x    1    2    3    4    5    6    7    8    9
#> y    4    5    6    4    5    6    4    5    6
#> z    7    8    9    7    8    9    7    8    9

cbind(x, y, z) # elements of y and z are recycled (to match the length of x):
#>       x y z
#>  [1,] 1 4 7
#>  [2,] 2 5 8
#>  [3,] 3 6 9
#>  [4,] 4 4 7
#>  [5,] 5 5 8
#>  [6,] 6 6 9
#>  [7,] 7 4 7
#>  [8,] 8 5 8
#>  [9,] 9 6 9
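
Recycling is silent here because the length of x (9) is a multiple of the lengths of y and z (3). The following sketch uses hypothetical vectors (long, short, odd) to show what happens otherwise: R still recycles, but issues a warning.

```r
long  <- 1:6
short <- 1:3 # length 3 divides 6: recycled silently
odd   <- 1:4 # length 4 does not divide 6: recycled with a warning

m.ok <- rbind(long, short)
m.ok

# Catch the warning to inspect its message:
msg <- tryCatch(rbind(long, odd),
                warning = function(w) conditionMessage(w))
msg
```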

2b. Describe the differences between cbind(a, z) and data.frame(a, z) (assuming the definitions above).

mt <- cbind(a, z) # yields a matrix of characters (due to combining characters and integers in a matrix):
mt
#>      a   z  
#> [1,] "a" "7"
#> [2,] "b" "8"
#> [3,] "c" "9"
typeof(mt)
#> [1] "character"

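df <- data.frame(a, z) # yields a data frame containing 1 column of characters and 1 column of numbers:
typeof(df$a) # "integer", as R versions before 4.0.0 convert strings into factors (stored as integers) by default; 
             # R >= 4.0.0 keeps them as "character".
#> [1] "integer"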
typeof(df$z)
#> [1] "integer"

2c. Assuming the following definitions of df1 and df2, describe and explain the results of cbind(df1, df2) and rbind(df1, df2).

df1 <- data.frame(cbind(y, z))
# df1

df2 <- data.frame(cbind(y + 100, z + 100))
# df2

df3 <- cbind(df1, df2)
df3  # combines 2 data frames into 1 (but note names(df3)[c(3, 4)]):

df4 <- rbind(df1, df2) # returns an error, as dfs to be combined by rbind() 
                       # must have the same variable names.
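
To see why rbind() fails, it helps to inspect the column names. This is a small self-contained sketch that repeats the definitions above and uses tryCatch() to store the error message instead of stopping the script:

```r
y <- 4:6
z <- 7:9
df1 <- data.frame(cbind(y, z))
df2 <- data.frame(cbind(y + 100, z + 100))

names(df1) # "y" "z"
names(df2) # "X1" "X2" (default names, as y + 100 and z + 100 are expressions)

nm3 <- names(cbind(df1, df2)) # cbind() keeps all 4 names
nm3

# rbind() requires matching names; store the error message instead of stopping:
err <- tryCatch(rbind(df1, df2),
                error = function(e) conditionMessage(e))
err
```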

2d. Change (the column names of) df2 to make rbind(df1, df2) possible. (Hint: Use the names() function to get and set a data frame’s column names.)

names(df2) <- names(df1) # copy names of df1 to df2

df4 <- rbind(df1, df2)
df4
#>     y   z
#> 1   4   7
#> 2   5   8
#> 3   6   9
#> 4 104 107
#> 5 105 108
#> 6 106 109

Working with data frames

3. In this exercise, we’ll explore the in-built data frame InsectSprays.

3a. Familiarize yourself with this data frame by using the ?, View(), head(), str() and summary() commands. (Hint: Copy the original InsectSprays into a new data frame called df to leave the original unchanged and facilitate your typing.)

# ?InsectSprays # provides useful background info on data.

df <- InsectSprays # copy data (to leave original unchanged).

### Explore df:
# head(df)
# str(df)
summary(df)
#>      count       spray 
#>  Min.   : 0.00   A:12  
#>  1st Qu.: 3.00   B:12  
#>  Median : 7.00   C:12  
#>  Mean   : 9.50   D:12  
#>  3rd Qu.:14.25   E:12  
#>  Max.   :26.00   F:12

3b. How many cases of observations (rows) and variables (columns) does this data frame contain?

nrow(df) # number of rows/cases: 
#> [1] 72
ncol(df) # number of columns/variables: 
#> [1] 2
# OR: 
dim(df) # number of rows and columns
#> [1] 72  2

3c. Compute the mean and the median of the count variable. What do their values suggest about the distribution of count values?

mean(df$count)
#> [1] 9.5
median(df$count)
#> [1] 7

# A median below the mean suggests a right-skewed distribution: 
# Many low values and a few high values that pull the mean upwards.
hist(df$count) # shows the distribution of count values.
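
The relation between mean and median can be checked on a toy vector (hypothetical values, not taken from InsectSprays), in which a single large value raises the mean but leaves the median unchanged:

```r
counts <- c(1, 1, 2, 2, 3, 21) # mostly low values, one high value

avg <- mean(counts)   # 5
med <- median(counts) # 2

avg > med # TRUE: the high value pulls the mean above the median
```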

3d. How many cases were treated with each of the insect sprays?

table(df$spray) # => 12 cases in each of 6 sprays.
#> 
#>  A  B  C  D  E  F 
#> 12 12 12 12 12 12
# Note that summary(df) has already told us this (above).

3e. Add a new variable hi.avg to df that indicates whether a count is higher than the average (or mean()) count.

df$hi.avg <- df$count > mean(df$count)
# OR (in 2 steps): 
# hi.avg <- df$count > mean(df$count)
# df <- cbind(df, hi.avg)

3f. Change the name of the count variable to insect.count (by using logical indexing).

names(df)[names(df) == "count"] <- "insect.count"

3g. Save the initial and the final 10 rows of df into new data frames and combine them into a data frame first.final.10.

first.10 <- df[1:10, ]

n <- nrow(df)
final.10 <- df[(n - 9):n, ]

first.final.10 <- rbind(first.10, final.10)
# first.final.10

3h. Save all cases that were treated with spray A that are lower than the overall average and all cases that were treated with spray F that are higher than the overall average in separate data frames (using the subset() command). Which of them has a higher average insect count? (Hint: Proceed in multiple steps:

create new subsets of df that contain the cases that you want to compare,
compute and compare the means of these new subsets.)

# a: Get 2 subsets that meet the above conditions:
A.lo <- subset(df, df$spray == "A" & df$hi.avg == FALSE)
F.hi <- subset(df, df$spray == "F" & df$hi.avg == TRUE)

# b: Compute and compare their averages:
mean(A.lo$insect.count) # average insect.count of A (BELOW the overall average)
#> [1] 7
mean(F.hi$insect.count) # average insect.count of F (ABOVE the overall average)
#> [1] 17.36364
mean(A.lo$insect.count) < mean(F.hi$insect.count) # A(..) < F(...)
#> [1] TRUE

3i. Is the average insect count of all cases that were treated with spray C and are above average (within spray C) smaller or larger than the average insect count of all cases that were treated with spray F and are below average (within spray F)? (Hint: Proceed in multiple steps:

identify subsets of df that contain only cases of spray C and F,
compute the averages of these subsets,
create new subsets of df that contain the cases that you want to compare,
compute and compare the means of these new subsets.)

# a: Get subsets C and F:
set.C <- subset(df, df$spray == "C")
set.F <- subset(df, df$spray == "F")

# b: Determine their averages:
avg.C <- mean(set.C$insect.count)
avg.F <- mean(set.F$insect.count)

# c: Get subsets that meet the above conditions: 
C.hi <- subset(df, df$spray == "C" & df$insect.count > avg.C)
F.lo <- subset(df, df$spray == "F" & df$insect.count < avg.F)

# c' alternatively:
C.hi.2 <- subset(set.C, set.C$insect.count > avg.C)
F.lo.2 <- subset(set.F, set.F$insect.count < avg.F)

# verify that c and c' yield the same results:
all.equal(C.hi, C.hi.2) # ==> TRUE
#> [1] TRUE
all.equal(F.lo, F.lo.2) # ==> TRUE
#> [1] TRUE

# d: Compute their averages:
mean(C.hi$insect.count) # average insect.count of C (ABOVE the average of C)
#> [1] 4.25
mean(F.lo$insect.count) # average insect.count of F (BELOW the average of F)
#> [1] 12.75

# Compare averages: 
mean(C.hi$insect.count) < mean(F.lo$insect.count) # still C(...) < F(...)
#> [1] TRUE

All this can be done in far fewer lines of code, of course, but separating tasks into different steps (and corresponding objects) typically makes it clearer.
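
As one possible shorter version (a sketch, not the course's reference solution), tapply() computes a mean per group and ave() returns the group mean for every row. The column name spray.avg is an assumption introduced here for illustration.

```r
df <- InsectSprays
names(df)[names(df) == "count"] <- "insect.count"

# Mean insect count per spray:
avg.spray <- tapply(df$insect.count, df$spray, mean)
avg.spray

# Mean of each row's own spray group (ave() uses mean() by default):
df$spray.avg <- ave(df$insect.count, df$spray)

C.hi <- mean(df$insect.count[df$spray == "C" & df$insect.count > df$spray.avg])
F.lo <- mean(df$insect.count[df$spray == "F" & df$insect.count < df$spray.avg])

C.hi < F.lo # same comparison as in 3i
```

The shorter version avoids intermediate objects, but makes it harder to inspect each step.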

Saving and loading data

4. Save your data frame df as a comma-delimited text file named myInsectSprays.csv (into a subfolder data) and then re-load this file into a new data frame df2.

write.table(df, file = "data/myInsectSprays.csv", sep = ",")
# OR: 
# write.csv(df, file = "data/myInsectSprays.csv")

df2 <- read.table(file = "data/myInsectSprays.csv", header = TRUE, sep = ",")
# OR: 
# df2 <- read.csv(file = "data/myInsectSprays.csv", header = TRUE, sep = ",")
# (Note: read.csv2() is intended for files that use ";" as separator and "," as decimal mark.)
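
The round trip can be tried without a data folder. This sketch writes to a temporary file (tempfile() is an assumption made to keep the example self-contained) and uses row.names = FALSE, so that the file contains only the actual columns:

```r
df <- head(InsectSprays, 5) # small example data

tmp <- tempfile(fileext = ".csv") # temporary file instead of data/myInsectSprays.csv
write.csv(df, file = tmp, row.names = FALSE)

df2 <- read.csv(file = tmp)

names(df2)                     # "count" "spray"
all.equal(df$count, df2$count) # TRUE: the numeric values survive the round trip

unlink(tmp) # remove the temporary file
```

Whether spray is re-read as a factor or as a character vector depends on the R version (and on the stringsAsFactors argument).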

Checkpoint 1

At this point you have completed all warm-up exercises. This is good, but please keep carrying on…

B. At Home

Priming Study

In a provocative paper, Bargh, Chen and Burrows (1996) sought to test whether or not priming people with trait concepts would trigger trait-consistent behavior. In one study, they primed participants with either neutral words (e.g., bat, cookie, pen), or with words related to an elderly stereotype (e.g., wise, stubborn, old). They then, unbeknownst to the participants, used a stopwatch to record how long it took the participants to walk down a hallway at the conclusion of an experiment. They predicted that participants primed with words related to the elderly would walk slower than those primed with neutral words.

In the following, we will analyze fake data corresponding to this cover story.

Dataset description

Our simulated study data has 3 primary independent variables:

prime: What kind of primes was the participant given? There were 2 conditions: neutral means neutral primes, elderly means elderly primes;
prime.duration: For how much time (in minutes) were primes shown to participants? There were 4 conditions: 1, 5, 10, or 30 minutes;
grandparents: Did the participant have a close relationship with their grandparents? yes means yes, no means no, none means that they had no relationship with their grandparents.

There was one primary dependent variable:

walk: For how long (in seconds) did participants walk down the hallway?

There were 4 additional variables that characterise each participant:

id: The order in which participants completed the study;
age: Participants’ age;
gender: Participants’ gender;
attention: Was an attention check passed? 0 indicates a failed, 1 a passed attention check.

Project management

5a. Start a new R-project called RCourse (or similar). Then (either within RStudio or in a file manager outside of R), navigate to the location of your RCourse project, and add two folders named R and data.

# ok!

5b. Open a new R script called LastFirst_WPA03_161114.R (if First is your first and Last is your last name), and save it into the R folder.

# ok!

5c. When beginning a new project, it’s always a good idea to remove all previous assignments and objects from your workspace. (Hint: Check the help file of the rm() function to obtain the correct command.)

rm(list = ls()) # clean all (without warning).

5d. A text file containing the data (called WPA03_priming.txt) is available at http://Rpository.com/down/data/WPA03_priming.txt. Right-click the link and save the data file into the data folder. (Note that this data file is not the original data, but freshly simulated data from 2017.)

# Ok, I saved this file into my "data" folder. Now what?

5e. Use read.table() to load the data into a new R object called priming. (Hint: Note that the text file is tab-delimited and contains a header row, so be sure to include the sep = "\t" and header = TRUE arguments.)

priming <- read.table(file = "data/WPA03_priming.txt",
                      sep = "\t",
                      header = TRUE)

Viewing and naming data

6a. Explore the data or specific variables using View(), head(), str(), and summary(). How many cases (rows) and variables (columns) does priming contain?

# View(priming)
head(priming)
str(priming)
# summary(priming)
dim(priming) # ==> 450 rows and 8 columns

6b. Obtain and study the names of priming with names(). Those aren’t very useful, are they? Change the names to more informative values. (Hint: Make your life easy by using the same naming scheme as in the dataset description above.)

names(priming) <- c("id", "gender", "age", "attention", "prime", "prime.duration", "grandparents", "walk")

Applying functions to variables

7a. What was the mean age of the participants?

mean(priming$age)
#> [1] 21.98889

7b. How many participants were there from each gender?

table(priming$gender)
#> 
#>   f   m 
#> 215 235

7c. What was the median walking time?

median(priming$walk)
#> [1] 33.7

7d. What percentage of participants passed the attention check? (Hint: To calculate a percentage from a binary [0, 1] variable, use mean().)

mean(priming$attention) * 100
#> [1] 82.22222
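
Why mean() works here: The mean of a variable that only contains 0 and 1 is the sum of the 1s divided by the number of values, which is the proportion of 1s. A sketch with a hypothetical vector passed:

```r
passed <- c(1, 0, 1, 1) # toy attention checks: 3 of 4 passed

pct <- mean(passed) * 100
pct # 75

# Equivalent: count the 1s and divide by the number of values:
pct.2 <- sum(passed == 1) / length(passed) * 100
pct.2
```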

7e. Walking time is currently in seconds. Add a new column to the dataframe called walking.m that shows the walking time in minutes rather than seconds (rounded to the nearest 2 decimals).

priming$walking.m <- round(priming$walk / 60, 2)

Indexing and subsetting data frames

Hint: Many of the following problems are best solved by splitting your answers into two steps:

Step 1: Index or subset the original data and store it as a new object with a new name.
Step 2: Calculate the appropriate summary statistic using the new, subsetted object that you just created.

8a. What were the genders of the first 10 participants (i.e., the first 10 rows)?

# in 2 steps:
priming.10 <- subset(priming, subset = id <= 10) # id <= 10 (rather than id < 10) includes the 10th participant
priming.10$gender
#>  [1] f f m f f f f m f f
#> Levels: f m
# OR in 1 step: 
priming$gender[1:10]
#>  [1] f f m f f f f m f f
#> Levels: f m

8b. Show all the data for the 50th participant (row).

subset(priming, subset = id == 50)
#>    id gender age attention prime prime.duration grandparents walk
#> 50 50      f  22         1  asdf              5         none   37
#>    walking.m
#> 50      0.62
# OR:
priming[50, ]
#>    id gender age attention prime prime.duration grandparents walk
#> 50 50      f  22         1  asdf              5         none   37
#>    walking.m
#> 50      0.62

8c. What was the mean walking time for the elderly prime condition?

# in 2 steps:
priming.e1 <- subset(priming, subset = prime == "elderly")
priming.e2 <- priming[priming$prime == "elderly", ] # alternative solution

mean(priming.e1$walk)
#> [1] 40.3704
mean(priming.e2$walk)
#> [1] 40.3704

# OR in 1 step:
mean(priming$walk[priming$prime == "elderly"])
#> [1] 40.3704
mean(priming[priming$prime == "elderly", ]$walk) # alternative solution
#> [1] 40.3704

8d. What was the mean walking time for the neutral prime condition?

# in 2 steps:
priming.n <- subset(priming, subset = prime == "neutral")
mean(priming.n$walk)
#> [1] 28.77652

# OR in 1 step:
mean(priming$walk[priming$prime == "neutral"])
#> [1] 28.77652
mean(priming[priming$prime == "neutral", ]$walk) # alternative solution
#> [1] 28.77652

8e. What was the mean walking time for participants less than 23 years old?

# in 2 steps:
priming.23 <- subset(priming, subset = age < 23)
mean(priming.23$walk)
#> [1] 31.45991

# OR in 1 step:
mean(priming$walk[priming$age < 23])
#> [1] 31.45991

8f. What was the mean walking time for females with a close relationship with their grandparents?

# in 2 steps:
dat <- subset(priming, subset = gender == "f" & grandparents == "yes")
mean(dat$walk)
#> [1] 39.10852

# OR in 1 step:
mean(priming$walk[priming$gender == "f" & priming$grandparents == "yes"])
#> [1] 39.10852

8g. What was the mean walking time for males over 24 years old without a close relationship with their grandparents?

# in 2 steps:
dat <- subset(priming,
              subset = (gender == "m") & 
                (age > 24) & 
                (grandparents %in% c("no", "none")))
mean(dat$walk)
#> [1] 34.6

Checkpoint 2

At this point you are doing very well. Try to hang on for a few more difficult tasks…

Creating new data frames

9a. Create a new data frame called priming.att that only includes rows where participants passed the attention check. (Hint: Use logical indexing or subset().)

priming.att <- subset(priming, subset = attention == 1)
str(priming.att)
#> 'data.frame':    370 obs. of  9 variables:
#>  $ id            : int  1 2 3 4 5 6 7 8 9 10 ...
#>  $ gender        : Factor w/ 2 levels "f","m": 1 1 2 1 1 1 1 2 1 1 ...
#>  $ age           : int  22 23 22 21 23 22 23 21 22 22 ...
#>  $ attention     : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ prime         : Factor w/ 3 levels "asdf","elderly",..: 1 3 3 1 3 3 2 2 1 1 ...
#>  $ prime.duration: int  1 1 5 30 10 10 1 10 5 10 ...
#>  $ grandparents  : Factor w/ 3 levels "no","none","yes": 3 2 3 3 2 2 1 1 2 2 ...
#>  $ walk          : num  43.3 34.8 28.6 31.1 37.4 32.2 41.2 39.3 44 32.1 ...
#>  $ walking.m     : num  0.72 0.58 0.48 0.52 0.62 0.54 0.69 0.65 0.73 0.54 ...

9b. Some of the data do not make sense and must be mistaken. For example, some walking times are negative, some prime values are incorrect, and some prime.duration values were not part of the original study plan. This should be fixed before carrying out further analyses.
Create a new data frame called priming.c (for ‘priming clean’) that only includes rows with valid values for each column. Do this by looking for strange values in each column, and by comparing them with the original dataset description. Additionally, only include participants who passed the attention check. Here’s a skeleton of how your code should look:

# Create priming.c, a subset of the original priming data: 
# (Replace __ with the appropriate values.)
priming.c <- subset(x = priming,
                    subset = gender %in% c(__) & 
                             age > __ &
                             attention == __ &
                             prime %in% __ &
                             prime.duration %in% __ &
                             grandparents %in% __ &
                             walk > __)

# Solution: Create priming.c, a subset of the original priming data 
# (with all __ replaced by the appropriate values):
priming.c <- subset(x = priming,
                    subset = gender %in% c("m", "f") & 
                             age > 17 &
                             attention == 1 &
                             prime %in% c("elderly", "neutral") &
                             prime.duration %in% c(1, 5, 10, 30) &
                             grandparents %in% c("yes", "no", "none") &
                             walk > 0)
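
The valid values in such a filter are typically found by inspecting each column first. The following sketch uses a small hypothetical data frame toy (not the priming data) to show how table(), range() and setdiff() reveal implausible values before subsetting:

```r
# Hypothetical toy data with 3 kinds of invalid values:
toy <- data.frame(prime = c("elderly", "neutral", "asdf", "elderly"),
                  prime.duration = c(1, 5, 10, 15),
                  walk = c(40.2, -3.1, 29.8, 41.0),
                  stringsAsFactors = FALSE)

table(toy$prime) # reveals the unexpected value "asdf"
range(toy$walk)  # reveals a negative walking time

valid.dur <- c(1, 5, 10, 30) # durations named in the dataset description
bad.dur <- setdiff(unique(toy$prime.duration), valid.dur)
bad.dur # 15 was not part of the study plan

toy.c <- subset(toy,
                subset = prime %in% c("elderly", "neutral") &
                  prime.duration %in% valid.dur &
                  walk > 0)
nrow(toy.c) # 1 row remains
```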

9c. How many participants gave valid data and passed the attention check? (Hint: Use the result from your previous answer.)

nrow(priming.c) # ==> 213 participants remain.
#> [1] 213

9d. Of those participants who gave valid data and passed the attention check, what were the mean walking times of those given the elderly prime and of those given the neutral prime? (Calculate these separately.)

priming.c.eld <- subset(priming.c, subset = prime == "elderly")
priming.c.neu <- subset(priming.c, subset = prime == "neutral")

mean(priming.c.eld$walk)
#> [1] 43.59912
mean(priming.c.neu$walk)
#> [1] 30.46768

Saving and loading data

10a. Save your two dataframe objects priming and priming.c in an .RData file called priming.RData in the data folder of your project.

save(priming, priming.c, file = "data/priming.RData")

10b. Save your priming.c object as a tab-delimited text file called priming_clean.txt in the data folder of your project.

write.table(priming.c, file = "data/priming_clean.txt", sep = "\t")

10c. Clean your workspace by running the appropriate rm() command again.

rm(list = ls()) # clean all (without warning).

10d. Re-load your two data frame objects using load().

load(file = "data/priming.RData")
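
The save() and load() cycle can be sketched with two toy objects and a temporary file (a.obj, b.obj and the use of tempfile() are assumptions for illustration). Note that load() restores objects under their original names and invisibly returns these names:

```r
a.obj <- 1:3
b.obj <- data.frame(x = c("u", "v"), stringsAsFactors = FALSE)

tmp <- tempfile(fileext = ".RData") # temporary file instead of data/priming.RData
save(a.obj, b.obj, file = tmp)

rm(a.obj, b.obj) # remove only these 2 objects (rather than the entire workspace)

loaded <- load(file = tmp) # restores both objects and returns their names
loaded

exists("a.obj") # TRUE: the object is back in the workspace

unlink(tmp) # remove the temporary file
```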

11. A colleague of yours asks for access to the data, but is only interested in the data from females who experienced the neutral prime.

11a. Create a dataframe called priming.f that only includes these data. Additionally, do not include the id column as this could be used to identify the participants.

priming.f <- subset(priming, 
                    subset = (gender == "f") & (prime == "neutral"), # females with neutral prime
                    select = c("gender", "age", "attention", "prime", "prime.duration", "grandparents", "walk"))
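
Instead of listing all columns to keep, the select argument of subset() also accepts a negative selection that drops columns. A sketch with a hypothetical data frame toy:

```r
# Hypothetical toy data:
toy <- data.frame(id = 1:4,
                  gender = c("f", "m", "f", "f"),
                  prime = c("neutral", "neutral", "elderly", "neutral"),
                  walk = c(30.1, 28.4, 41.7, 29.9),
                  stringsAsFactors = FALSE)

toy.f <- subset(toy,
                subset = gender == "f" & prime == "neutral",
                select = -id) # keep all columns except id

names(toy.f) # "gender" "prime" "walk"
nrow(toy.f)  # 2
```

Dropping columns by name is shorter, but listing the columns to keep guards against accidentally passing on columns that were added later.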

11b. Save your priming.f object as a tab-delimited text file called priming_females.txt in the data folder of your project.

write.table(priming.f, file = "data/priming_females.txt", sep = "\t")

11c. Save your entire workspace to an .RData file called priming_ws.RData in the data folder of your project.

save.image(file = "data/priming_ws.RData")

Checkpoint 3

At this point you are doing great, well done! If you are curious, perhaps you also enjoy the following challenges?

C. Challenges

12. Use your cleaned dataframe (priming.c) for the following exercises.

12a. Did the effect of priming condition (neutral vs. elderly) on walking times differ between the first 100 and the last 100 participants? (Hint: Given a total of \(n\) participants (rows), you can find the id of the \(100\)th participant and the id of the \((n-99)\)th participant, who is the first of the last 100. Then use these id values to determine subsets of priming.c.)

# A) Determine id of first and last 100 participants:
n.c <- nrow(priming.c) # number of participants (rows) in priming.c
id.f100 <- priming.c$id[100]      # id of the 100th participant
id.l100 <- priming.c$id[n.c - 99] # id of the 1st of the 100 last participants

# B) First 100 participants: 
neutral.f100 <- subset(priming.c, id <= id.f100 & prime == "neutral")
elderly.f100 <- subset(priming.c, id <= id.f100 & prime == "elderly")

nrow(neutral.f100) + nrow(elderly.f100) # check if sum = 100
#> [1] 100

# Difference between conditions in first 100:
mean(elderly.f100$walk) - mean(neutral.f100$walk)
#> [1] 11.17937

# C) Last 100 participants:
neutral.l100 <- subset(priming.c, id >= id.l100 & prime == "neutral")
elderly.l100 <- subset(priming.c, id >= id.l100 & prime == "elderly")

nrow(neutral.l100) + nrow(elderly.l100) # check if sum = 100
#> [1] 100

# Difference between conditions in last 100
mean(elderly.l100$walk) - mean(neutral.l100$walk)
#> [1] 15.00224

# Answer: The results seem different: Among the last 100 participants, those with an 
#         elderly prime took 15.0 sec longer than their peers with a neutral prime, 
#         whereas this difference was only 11.2 sec among the first 100 participants.
#         Could the effect increase over the duration of the study?

12b. Due to a computer error, the data from every participant with an even id number is invalid. Remove these data from your priming.c dataframe.

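priming.c <- priming.c[priming.c$id %% 2 == 1, ] # keep only rows with odd id values
# Note: Filtering by id %in% seq(1, n.c, 2) would also drop all odd ids above n.c 
#       (the number of rows in priming.c). The outputs shown for 12c were computed 
#       with that narrower filter and can differ from what the line above yields.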
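
Odd numbers can be identified with the modulo operator %%, which returns the remainder of a division. The sketch below uses hypothetical id values to contrast this with a seq()-based filter, which only works if the sequence extends to the largest id:

```r
ids <- c(3, 8, 120, 215, 301, 448) # hypothetical ids with gaps
n <- length(ids)                   # 6 values, but ids range up to 448

odd.mod <- ids[ids %% 2 == 1] # remainder 1 when dividing by 2
odd.mod # 3 215 301

odd.seq.n <- ids[ids %in% seq(1, n, 2)] # compares only with 1, 3, 5
odd.seq.n # 3

odd.seq.max <- ids[ids %in% seq(1, max(ids), 2)] # sequence up to the largest id
odd.seq.max # 3 215 301
```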

12c. Do you find evidence that a participant’s relationship with their grandparents affects how they responded to the primes?

# "No" relationship conditions:
neutral.no <- subset(priming.c, 
                     subset = grandparents == "no" & prime == "neutral")

elderly.no <- subset(priming.c, 
                     subset = grandparents == "no" & prime == "elderly")

# Condition effect for grandparents == "no":
mean(elderly.no$walk) - mean(neutral.no$walk)
#> [1] 10.81786

# "yes" relationship conditions:
neutral.yes <- subset(priming.c, 
                      subset = grandparents == "yes" & prime == "neutral")

elderly.yes <- subset(priming.c, 
                      subset = grandparents == "yes" & prime == "elderly")

# Condition effect for grandparents == "yes":
mean(elderly.yes$walk) - mean(neutral.yes$walk)
#> [1] 14.26875

# "none" relationship conditions:
neutral.none <- subset(priming.c, 
                       subset = grandparents == "none" & prime == "neutral")

elderly.none <- subset(priming.c, 
                       subset = grandparents == "none" & prime == "elderly")

# Condition effect for grandparents == "none":
mean(elderly.none$walk) - mean(neutral.none$walk)
#> [1] -3.766667

# Answer: It appears that the effect was strongest for participants  
#         with a close relationship with their grandparents, 
#         and around zero (and a trend in the opposite direction) 
#         for those with no relationship with their grandparents.

# Note: A pirateplot provides a descriptive overview:
yarrr::pirateplot(formula = walk ~ prime + grandparents,
                  data = priming.c,
                  main = "Walking time by prime and relationship to grandparents",
                  theme = 2,
                  pal = c("orange3", "steelblue3"), 
                  back.col = gray(.99),
                  gl.col = gray(.20)
                  )
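
The six subsets above can also be summarised in a single table of means. This is a sketch using tapply() with two grouping variables on a hypothetical data frame toy (the values are made up and do not reproduce the results above):

```r
# Hypothetical toy data with 1 case per cell:
toy <- data.frame(prime = rep(c("elderly", "neutral"), times = 3),
                  grandparents = rep(c("no", "yes", "none"), each = 2),
                  walk = c(40, 30, 45, 31, 33, 35),
                  stringsAsFactors = FALSE)

# Mean walking time for each combination of grandparents (rows) and prime (columns):
avg <- tapply(toy$walk, list(toy$grandparents, toy$prime), mean)
avg

# Condition effect (elderly - neutral) for each level of grandparents:
effect <- avg[, "elderly"] - avg[, "neutral"]
effect
```

With the real data, priming.c would take the place of toy.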

That’s it – hope you enjoyed working on this assignment!

[WPA03_answers.Rmd updated on 2017-11-16 18:43:24 by hn.]

WPA03: Matrices and data frames, managing the workspace (Answers)

Hansjörg Neth, SPDS, uni.kn

2017 Nov 16
