Answers for WPA02 of Basic data and decision analysis in R, taught at the University of Konstanz in Winter 2017/2018.


Instructions

To complete and submit these exercises, please remember and do the following:

  1. Your WPAs can be written and submitted either as scripts of commented code (as .R or .Rmd files) or as reproducible documents that combine text with code (in .html or .pdf formats).

    • A simple .Rmd template is provided here.

    • Alternatively, open a plain R script and save it as LastnameFirstname_WPA##_yymmdd.R.

  2. Also enter the current assignment (e.g., WPA02), your name, and the current date at the top of your document. When working on a task, always indicate which task you are answering with appropriate comments.

Here is an example of how your file JillsomeJack_WPA01_161106.Rmd could look:

# Assignment: WPA 02
# Name: Jackson, Jill
# Date: 2017 Nov 06
# ~~~~~~~~~~~~~~~~~~~~~~~~~~
# A. In Class

# Numerical indexing:

# Exercise 1:
a <- letters[1:3] # a vector of the 1st 3 letters of the alphabet
rev(rev(a))       # reverse vector a twice

# Exercise 1a: 
# ...
  3. Complete as many exercises as you can by Wednesday (23:59).

  4. Submit your script or output file (including all code) to the appropriate folder on Ilias.


A. In Class

Here are some warm-up exercises that repeat the basic concepts of the current chapter:

Numerical indexing

1a. Use numerical indexing to print the 18th, 9th, 19th, 3rd, 15th, 15th, and 12th letters of the alphabet. (Hint: The letters of the alphabet are stored in a vector letters.)

ix <- c(18, 9, 19, 3, 15, 15, 12)
s1a <- letters[ix]
s1a
#> [1] "r" "i" "s" "c" "o" "o" "l"

1b. Use numerical indexing to print the alphabet (a, b, c, etc.) except for its 18th, 9th, 19th, 3rd, 15th, 15th, and 12th letters. (Hint: Use your solution to 1a.)

s1b <- letters[c(-18, -9, -19, -3, -15, -15, -12)] # OR: 
s1b.alt <- letters[-ix]

# Check for identity: 
all.equal(s1b, s1b.alt) # ==> TRUE
#> [1] TRUE
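
As a brief sketch of why both variants agree: negative indices drop the elements at those positions, and a repeated negative index (here, 15) has no additional effect. The name kept is used for illustration only.

```r
ix <- c(18, 9, 19, 3, 15, 15, 12)

kept <- letters[-ix]                    # drop the letters at the indexed positions
length(kept)                            # 26 letters minus 6 distinct positions
#> [1] 20

identical(kept, letters[-unique(ix)])   # the duplicated index changes nothing
#> [1] TRUE

any(kept %in% letters[ix])              # none of the dropped letters remain
#> [1] FALSE
```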

1c. Use numerical indexing to print every 5th letter of the alphabet.

letters[seq(from = 1, to = length(letters), by = 5)]
#> [1] "a" "f" "k" "p" "u" "z"

1d. Use numerical indexing to print the alphabet in reversed order (without using the function rev()).

z.to.a <- letters[length(letters):1]
z.to.a
#>  [1] "z" "y" "x" "w" "v" "u" "t" "s" "r" "q" "p" "o" "n" "m" "l" "k" "j"
#> [18] "i" "h" "g" "f" "e" "d" "c" "b" "a"

# Check that z.to.a yields the same vector as rev(letters):
all.equal(z.to.a, rev(letters)) # ==> TRUE
#> [1] TRUE

1e. Use numerical indexing to print the last 10 letters of the alphabet (in reversed order).

letters[length(letters):(length(letters) - 9)]
#>  [1] "z" "y" "x" "w" "v" "u" "t" "s" "r" "q"

1f. Use numerical indexing to print all odd numbers between 1 and 100.

x <- 1:100
odds <- x[seq(from = 1, to = 100, by = 2)] # numerical
odds
#>  [1]  1  3  5  7  9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45
#> [24] 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83 85 87 89 91
#> [47] 93 95 97 99

Logical indexing

2a. Use logical indexing to print all odd numbers between 1 and 100. (Hint: \(x\) is an odd number iff the remainder of dividing \(x\) by 2 is 1: x %% 2 == 1.)

x <- 1:100
odds.2 <- x[x %% 2 == 1] # logical 
odds.2
#>  [1]  1  3  5  7  9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45
#> [24] 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83 85 87 89 91
#> [47] 93 95 97 99

# Check that both yield the same solution:
all.equal(odds, odds.2) # ==> TRUE
#> [1] TRUE

3b. Use logical indexing to print all numbers between 1000 and 2000 that are divisible by 17.

v <- 1000:2000
v[v %% 17 == 0]
#>  [1] 1003 1020 1037 1054 1071 1088 1105 1122 1139 1156 1173 1190 1207 1224
#> [15] 1241 1258 1275 1292 1309 1326 1343 1360 1377 1394 1411 1428 1445 1462
#> [29] 1479 1496 1513 1530 1547 1564 1581 1598 1615 1632 1649 1666 1683 1700
#> [43] 1717 1734 1751 1768 1785 1802 1819 1836 1853 1870 1887 1904 1921 1938
#> [57] 1955 1972 1989

3c. Use logical indexing to print all numbers between 1 and 100 that are not multiples of 2 or 3.

x <- 1:100
x[(x %% 2 != 0) & (x %% 3 != 0)]
#>  [1]  1  5  7 11 13 17 19 23 25 29 31 35 37 41 43 47 49 53 55 59 61 65 67
#> [24] 71 73 77 79 83 85 89 91 95 97
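
The condition above can also be read as a negation: "not (a multiple of 2 or a multiple of 3)". The following sketch (with illustrative object names a and b) checks that both formulations select the same numbers:

```r
x <- 1:100

a <- x[(x %% 2 != 0) & (x %% 3 != 0)]      # not a multiple of 2 AND not a multiple of 3
b <- x[!((x %% 2 == 0) | (x %% 3 == 0))]   # NOT (a multiple of 2 OR a multiple of 3)

identical(a, b)
#> [1] TRUE
length(a)
#> [1] 33
```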

Indexing of pets and people

4. The following vectors describe a group of people and their beloved pets:

name <- c("Anna", "Boris", "Cloe", "David", "Emma", "Fred", "Gundula", "Heidi", "Ian", "John", "Ken", "Ludmilla", "Mary", "Nathan")
age <- c(21, 22, 29, 27, 22, 19, 26, 17, NA, 28, 35, 21, 23, 23)
pet <- c("cat", "dog", "cat", "rabbit", "dog", "cat", "cat", "horse", "dog", "hamster", "snake", "cat", "guinea pig", "nintendo")

Copy this data into your environment to do the following exercises.

4a. What are the names of the people younger than 21? (Solve this once without and once with using which().)

name[age < 21]
#> [1] "Fred"  "Heidi" NA
# OR: 
name[which(age < 21)]
#> [1] "Fred"  "Heidi"
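
The two solutions differ because of the missing age value. A minimal sketch on toy data (the vectors toy.age and toy.name are hypothetical and only serve to illustrate the mechanism):

```r
toy.age  <- c(21, 19, NA, 17)
toy.name <- c("Anna", "Fred", "Ian", "Heidi")

toy.age < 21                    # comparing NA yields NA
#> [1] FALSE  TRUE    NA  TRUE

toy.name[toy.age < 21]          # an NA index returns an NA element
#> [1] "Fred"  NA      "Heidi"

which(toy.age < 21)             # which() keeps only the positions that are TRUE
#> [1] 2 4

toy.name[which(toy.age < 21)]   # so the NA disappears from the result
#> [1] "Fred"  "Heidi"
```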

4b. Who has a horse as a pet?

name[pet == "horse"]
#> [1] "Heidi"
# OR: 
name[which(pet == "horse")]
#> [1] "Heidi"

4c. Who has neither a cat nor a dog as a pet? What pets do these people have instead? (Try solving this once with and once without using the %in% operator.)

name[pet != "cat" & pet != "dog"]
#> [1] "David"  "Heidi"  "John"   "Ken"    "Mary"   "Nathan"
# OR: 
name[! pet %in% c("cat", "dog")]
#> [1] "David"  "Heidi"  "John"   "Ken"    "Mary"   "Nathan"

pet[pet != "cat" & pet != "dog"]
#> [1] "rabbit"     "horse"      "hamster"    "snake"      "guinea pig"
#> [6] "nintendo"
# OR: 
pet[! pet %in% c("cat", "dog")]
#> [1] "rabbit"     "horse"      "hamster"    "snake"      "guinea pig"
#> [6] "nintendo"
# OR: 
pet[name %in% name[pet != "cat" & pet != "dog"]]
#> [1] "rabbit"     "horse"      "hamster"    "snake"      "guinea pig"
#> [6] "nintendo"

4d. Which cat owners are older than 21?

name[pet == "cat" & age > 21]
#> [1] "Cloe"    "Gundula"

4e. What is the average (or mean) age of all dog owners? (Exclude any people with unknown age from this analysis.)

mean(age[pet == "dog" & !is.na(age)])
#> [1] 22

4f. How many people own a rodent as a pet? (Hint: Create a vector of all rodents first. Then determine the number of rodent owners, by using (a) logical or (b) numerical indexing, without and with using which().)

rodent <- c("hamster", "mouse", "guinea pig", "rabbit")

# a) logical indexing: 
sum(pet %in% rodent) 
#> [1] 3

# b) numerical indexing: 
length(which(pet %in% rodent)) 
#> [1] 3

4g. Print a list of all unique pets (once using unique() and once using duplicated()).

unique(pet)
#> [1] "cat"        "dog"        "rabbit"     "horse"      "hamster"   
#> [6] "snake"      "guinea pig" "nintendo"
pet[!duplicated(pet)]
#> [1] "cat"        "dog"        "rabbit"     "horse"      "hamster"   
#> [6] "snake"      "guinea pig" "nintendo"

Changing vector values

5a. Nathan broke his Nintendo and got a fish as a pet. Change the pet vector accordingly.

pet[name == "Nathan"] <- "fish"

5b. Ian’s previously unknown age is 38. Adjust the age vector accordingly and check how this changes the average age of dog owners.

age[name == "Ian"] <- 38
# OR:
age[is.na(age)] <- 38

mean(age[pet == "dog" & !is.na(age)]) # increased to
#> [1] 27.33333

5c. It turns out that the age of all cat owners was wrong and they actually are 3 years older. Change the age vector accordingly.

age[pet == "cat"] <- age[pet == "cat"] + 3

5d. According to new regulations by the Ministry of Pet Ownership (MoPO), a person under the age of 18 may only own a hamster. Thus, the two people currently owning a horse and a hamster agree to swap their pets. Determine their names, change the pet vector accordingly, and verify that the new assignment is legal.

name[pet == "hamster"]
#> [1] "John"
name[pet == "horse"]
#> [1] "Heidi"

# Get indices of the two pets to be swapped:
i.johns.hamster <- which(pet == "hamster")
i.heidis.horse  <- which(pet == "horse")

# Change the pet vector twice: 
pet[i.johns.hamster] <- "horse"
pet[i.heidis.horse]  <- "hamster"

# Verify the legality of pet ownership:
pet[age < 18]
#> [1] "hamster"
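
The indices are stored before any value is changed because replacing "hamster" with "horse" first and then searching for "horse" would match both people. As an alternative sketch, the swap can be done in a single assignment, since R evaluates the right-hand side before assigning. The vector toy.pet is hypothetical:

```r
toy.pet <- c("cat", "horse", "dog", "hamster")

i <- which(toy.pet == "horse")
j <- which(toy.pet == "hamster")

# Swap both values at once (the right-hand side is evaluated first):
toy.pet[c(i, j)] <- toy.pet[c(j, i)]
toy.pet
#> [1] "cat"     "hamster" "dog"     "horse"
```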

Checkpoint 1

At this point, you have completed all warm-up exercises. This is good, but please keep going…


B. At Home

Bar Survey

The following vectors contain (fictional) data from a survey of 200 people at one of two bars in Konstanz (Casba and Klimperkasten) last Friday night at 1:30am. Each person was asked their age and which brand of perfume (cologne) they were wearing. After the person had answered these questions, a (very busy) researcher recorded how long each person spent talking to other people at the bar. All data is stored in the following 6 vector objects:

  • age: The participant’s age (in years)
  • bar: Which bar the person went to (casba or klimperkasten)
  • cologne: Which cologne the person wore (gio or calvinklein)
  • gender: The person’s gender (male or female)
  • id: An ID code indicating the participant in the form x.n, where x is the name of the bar the participant was at, and n is a random indexing number
  • talk.time: The amount of time the person spent talking to other people (in minutes)

Thankfully, you don’t need to type in the collected data yourself. The objects are stored in an RData file online.

A. Load the vectors into your current R session by running the following code:

load(file = url("http://Rpository.com/down/data/WPA02.RData"))

B. The str() function provides basic information about the structure of objects. Familiarize yourself with the objects (bar, id, gender, age, cologne, and talk.time) by running the str() function on each of the 6 vectors. Of what types are they? (Note that this information is also provided in the ‘Environment’ tab of RStudio.)

Reviewing the data

6a. Get the average, minimum and maximum age of people and describe the distribution of the age values. (Hint: Use hist() to plot a histogram.)

mean(age)
#> [1] 33.075
range(age) # gets min() and max():
#> [1] 17 82
hist(age)

6b. How many people were of each gender? (Hint: Use table())

table(gender)
#> gender
#>   f   m 
#> 100 100

6c. What was the average time a person spent talking? (Hint: Compute the mean of talk.time.)

mean(talk.time)
#> [1] 165.16

6d. What was the standard deviation of the talking times?

sd(talk.time)
#> [1] 46.5709

6e. Create talk.time.z as a \(z\)-score transformation of talk.time and verify that the mean and standard deviation of talk.time.z are as expected. (Hint: The \(z\)-score of \(x\) is defined as \(\frac{x - mean(x)}{sd(x)}\).)

# z-Transformation: 
talk.time.z <- (talk.time - mean(talk.time)) / sd(talk.time)

# Verification: 
mean(talk.time.z) # should approximate 0:
#> [1] 8.612685e-17
sd(talk.time.z)  # should approximate 1:
#> [1] 1
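
Because of floating-point arithmetic, the mean of a z-transformed vector is typically a tiny number rather than exactly 0. The following sketch (on hypothetical toy values) compares the manual transformation with base R's scale() and uses all.equal() to check the results within a tolerance:

```r
x <- c(120, 95, 180, 210, 60)            # toy talking times (hypothetical)

z.manual <- (x - mean(x)) / sd(x)        # manual z-transformation
z.scale  <- as.vector(scale(x))          # scale() returns a matrix, so convert to a vector

all.equal(z.manual, z.scale)
#> [1] TRUE

# Compare with a tolerance rather than with ==:
isTRUE(all.equal(mean(z.manual), 0))
#> [1] TRUE
isTRUE(all.equal(sd(z.manual), 1))
#> [1] TRUE
```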

Numerical indexing

7a. What was the value of the very first talk.time?

talk.time[1]
#> [1] 281

7b. Of what gender were the first ten participants?

gender[1:10]
#>  [1] "f" "f" "f" "m" "f" "m" "m" "f" "f" "m"
table(gender[1:10])
#> 
#> f m 
#> 6 4

7c. Which brand of cologne were the 10th through 20th participants wearing?

cologne[10:20]
#>  [1] "gio"         "calvinklein" "gio"         "calvinklein" "calvinklein"
#>  [6] "calvinklein" "gio"         "gio"         "gio"         "calvinklein"
#> [11] "gio"

7d. Which bar did the last participant go to? (Hint: Don’t write the indexing number directly; instead, index the vector using the length() function with the appropriate argument.)

bar[length(bar)]
#> [1] "casba"

Logical indexing (1 variable)

8a. How many people went to Casba? How many went to Klimperkasten?

sum(bar == "casba")
#> [1] 100
sum(bar == "klimperkasten")
#> [1] 100

8b. What percentage of all people went to Casba? (Hint: Use mean() combined with a logical vector)

mean(bar == "casba") * 100 
#> [1] 50
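
This works because R treats TRUE as 1 and FALSE as 0 in arithmetic, so the sum of a logical vector is a count and its mean is a proportion. A small sketch with a hypothetical vector toy.bar:

```r
toy.bar <- c("casba", "klimperkasten", "casba", "casba")

at.casba <- toy.bar == "casba"   # logical vector
sum(at.casba)                    # number of TRUE values
#> [1] 3
mean(at.casba) * 100             # proportion of TRUE values, as a percentage
#> [1] 75
```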

8c. How many people were less than 18 years old? How many were over 50?

sum(age < 18)
#> [1] 2
sum(age > 50)
#> [1] 15

8d. What percentage of people was at least 20 but not older than 30?

mean(age >= 20 & age <= 30) * 100
#> [1] 47

8e. How many people wore Gio? How many people wore Calvin Klein?

sum(cologne == "gio")
#> [1] 100
sum(cologne == "calvinklein")
#> [1] 100

8f. How many people talked to others for less than an hour?

sum(talk.time < 60)
#> [1] 7

8g. What percentage of talking times were longer than three hours?

mean(talk.time > 180) * 100
#> [1] 31.5

8h. What percentage of talking times were longer than 20 minutes but less than one hour?

mean(talk.time > 20 & talk.time < 60) * 100
#> [1] 3

Logical indexing (2 variables)

9a. What were the IDs of all people who went to Casba?

id[bar == "casba"]
#>  [1] "cb.7"  "cb.16" "cb.79" "cb.5"  "cb.25" "cb.62" "cb.4"  "cb.64"
#>  [9] "cb.21" "cb.6"  "cb.28" "cb.45" "cb.39" "cb.81" "cb.58" "cb.80"
#> [17] "cb.89" "cb.6"  "cb.15" "cb.76" "cb.53" "cb.27" "cb.74" "cb.42"
#> [25] "cb.29" "cb.61" "cb.19" "cb.50" "cb.85" "cb.1"  "cb.9"  "cb.33"
#> [33] "cb.8"  "cb.44" "cb.23" "cb.26" "cb.66" "cb.52" "cb.20" "cb.51"
#> [41] "cb.57" "cb.2"  "cb.55" "cb.37" "cb.24" "cb.17" "cb.36" "cb.56"
#> [49] "cb.13" "cb.47" "cb.86" "cb.72" "cb.32" "cb.3"  "cb.68" "cb.87"
#> [57] "cb.67" "cb.38" "cb.8"  "cb.63" "cb.69" "cb.90" "cb.43" "cb.1" 
#> [65] "cb.48" "cb.22" "cb.34" "cb.73" "cb.82" "cb.75" "cb.14" "cb.65"
#> [73] "cb.46" "cb.30" "cb.54"
#>  [ reached getOption("max.print") -- omitted 25 entries ]

9b. What was the age of the youngest and oldest person at Klimperkasten?

range(age[bar == "klimperkasten"])
#> [1] 18 82

9c. At which of the bars did people talk for longer? (Hint: Compare the average talking time of people who went to Casba vs. Klimperkasten.)

mean(talk.time[bar == "casba"])
#> [1] 194.08
mean(talk.time[bar == "klimperkasten"])
#> [1] 136.24

9d. Did people with different perfumes talk for different amounts of time? (Hint: Compare the average talking times of people wearing Gio vs. Calvin Klein.)

mean(talk.time[cologne == "gio"])
#> [1] 161.48
mean(talk.time[cologne == "calvinklein"])
#> [1] 168.84
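
When comparing the means of several groups, computing each mean by logical indexing gets repetitive. As a possible shortcut, base R's tapply() applies a function to each group at once. The sketch below uses hypothetical toy vectors, not the survey data:

```r
toy.talk <- c(200, 180, 120, 140)
toy.bar  <- c("casba", "casba", "klimperkasten", "klimperkasten")

# Mean of toy.talk for each level of toy.bar (a named vector):
means <- tapply(toy.talk, toy.bar, mean)
means   # casba: 190, klimperkasten: 130

# The same value via logical indexing:
mean(toy.talk[toy.bar == "casba"])
#> [1] 190
```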

9e. Based on what you’ve learned so far, if someone wants to talk as much (or for as long) as possible, what brand of cologne should they wear?

# They should wear Calvin Klein!

Changing vector values by indexing

In the following exercises, we’ll use indexing and assignment to change some values within a vector. Because we typically don’t want to change the original data, we’ll make all of our adjustments on new vectors.

10a. Create new objects bar.r, age.r, cologne.r and talk.time.r that are copies of the original bar, age, cologne and talk.time objects. (Hint: Just assign the existing vectors to new objects.)

bar.r <- bar
age.r <- age
cologne.r <- cologne
talk.time.r <- talk.time

10b. In the bar.r vector, change all "casba" values to "c" and change all "klimperkasten" values to "k".

bar.r[bar.r == "casba"] <- "c"
bar.r[bar.r == "klimperkasten"] <- "k"

10c. In the cologne.r vector, change the "gio" values to "G" and change the "calvinklein" values to "C"

cologne.r[cologne.r == "gio"] <- "G"
cologne.r[cologne.r == "calvinklein"] <- "C"

10d. As a measure against age-based discrimination, change all values in the age.r vector that are lower than 21 to 21 and all age values above 39 to 39. (Check and describe the new age distribution with hist().)

age.r[age.r < 21] <- 21
age.r[age.r > 39] <- 39
hist(age.r)

10e. In the talk.time.r vector, change all talk time values greater than 280 to 280. Confirm that you did this correctly by checking the maximum talking time in talk.time.r.

talk.time.r[talk.time.r > 280] <- 280
max(talk.time.r)
#> [1] 280
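
Limiting values to a range, as in 10d and 10e, can also be done without indexing. As an alternative sketch, base R's pmax() and pmin() compute element-wise maxima and minima. The vector toy.age is hypothetical:

```r
toy.age <- c(17, 25, 39, 82)

# Raise values below 21 to 21, then lower values above 39 to 39:
clamped <- pmin(pmax(toy.age, 21), 39)
clamped
#> [1] 21 25 39 39

# Same result as with indexing and assignment:
by.index <- toy.age
by.index[by.index < 21] <- 21
by.index[by.index > 39] <- 39
identical(clamped, by.index)
#> [1] TRUE
```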

Checkpoint 2

If you got this far you’re doing very well. But as things are just getting more interesting, you shouldn’t stop just yet…

Solving a paradox…

Remember the question about the average talking times of people wearing different brands of cologne? Perhaps there is a (causal or correlational?) relationship between both of these variables?

11a. Make a prediction: Based on what you’ve learned so far, if someone wanted to talk to people for as long as possible, what brand of cologne should they wear?

# I already said Calvin Klein above... Why are you asking me again?

Let’s see if your prediction holds up…

11b. What was the average talking time of people who went to Casba and wore Gio?

mean(talk.time[bar == "casba" & cologne == "gio"])
#> [1] 294.8

11c. What was the average talking time of people who went to Casba and wore Calvin Klein?

mean(talk.time[bar == "casba" & cologne == "calvinklein"])
#> [1] 182.8889

11d. What was the average talking time of people who went to Klimperkasten and wore Gio?

mean(talk.time[bar == "klimperkasten" & cologne == "gio"])
#> [1] 146.6667

11e. What was the average talking time of people who went to Klimperkasten who wore Calvin Klein?

mean(talk.time[bar == "klimperkasten" & cologne == "calvinklein"])
#> [1] 42.4

11f. Based on what you’ve learned now, if someone’s goal was to talk to people for as long as possible, what brand of cologne should they wear?

# They should wear Gio, as mean talking times are longer for Gio in BOTH bars!
# Even though when aggregated across bars talking times are longer for Calvin Klein. 
# This seems just crazy!

You can visualize the data using the following code:

# Combine the relevant vectors in a dataframe: 
survey.df <- data.frame(bar, cologne, talk.time)

# Create pirateplots to visualize the data:
yarrr::pirateplot(talk.time ~ cologne, data = survey.df, 
                  main = "Talk times by brand of cologne")

yarrr::pirateplot(talk.time ~ cologne + bar, data = survey.df, 
                  main = "Talk times by brand of cologne at each bar")

What you’ve just seen is an example of Simpson’s Paradox. If you want to learn more about this, check out its Wikipedia page.

# Yes, I looked at the Wikipedia page...  Very interesting indeed!
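
The reversal can be reproduced from the four subgroup means alone, because an overall mean is a weighted mean of subgroup means. In the sketch below, the subgroup means are those reported in 11b to 11e, but the subgroup sizes (10 and 90) are an assumption: they are not shown in the output above and were chosen because they reproduce the overall means from 9d.

```r
# Subgroup means as reported in 11b to 11e:
m.gio <- c(casba = 294.8,    klimperkasten = 146.6667)
m.ck  <- c(casba = 182.8889, klimperkasten = 42.4)

# ASSUMED subgroup sizes (not part of the original output):
n.gio <- c(casba = 10, klimperkasten = 90)
n.ck  <- c(casba = 90, klimperkasten = 10)

m.gio > m.ck                   # TRUE for both bars: Gio is ahead within each bar

# Overall means are weighted by subgroup size:
weighted.mean(m.gio, n.gio)    # approx. 161.48
weighted.mean(m.ck,  n.ck)     # approx. 168.84
```

Under these assumed sizes, most Gio wearers were at the bar with shorter talking times and most Calvin Klein wearers were at the bar with longer talking times, which is what reverses the comparison in the aggregate.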

Checkpoint 3

If you got this far you’re doing great. Let’s see whether you can also solve the following challenges…


C. Bonus challenges

12a. What were the mean ages of people of either gender at Casba?

# Create a vector with the ages of females and males at Casba:
age.casba.f <- age[bar == "casba" & gender == "f"]
age.casba.m <- age[bar == "casba" & gender == "m"]

# Get the corresponding mean ages:
mean(age.casba.f)
#> [1] 31.29167
mean(age.casba.m)
#> [1] 31.57692

# OR, do it all at once:
mean(age[bar == "casba" & gender == "f"])
#> [1] 31.29167
mean(age[bar == "casba" & gender == "m"])
#> [1] 31.57692

12b. What percentage of women wore Calvin Klein?

# Create a vector cologne.f with the colognes of women only:
cologne.f <- cologne[gender == "f"]

# What percentage of them wore Calvin Klein?
mean(cologne.f == "calvinklein") * 100
#> [1] 51

# OR all at once:
mean(cologne[gender == "f"] == "calvinklein") * 100
#> [1] 51

13a. Let’s examine so-called tireless talkers, which are defined as people who talked for more than 100 minutes. Compare the median talking times of this group for both bars.

median(talk.time[bar == "casba" & talk.time > 100])
#> [1] 184.5
median(talk.time[bar == "klimperkasten" & talk.time > 100])
#> [1] 149

13b. Compare the median talking times of tireless talkers wearing Gio for both bars.

median(talk.time[bar == "casba" & cologne == "gio" & talk.time > 100])
#> [1] 292
median(talk.time[bar == "klimperkasten" & cologne == "gio" & talk.time > 100])
#> [1] 149

13c. What percentage of participants either went to Casba and talked for less than 2 hours or went to Klimperkasten and talked for more than 2 hours, but no longer than 3 hours?

mean((bar == "casba" & talk.time < 120) | 
     (bar == "klimperkasten" & talk.time > 120 & talk.time <= 180)) * 100
#> [1] 40.5

14. Let’s make the Calvin Klein wearers look better by adding some bonus to their talking times.

14a. For all of the Calvin Klein wearers, add a random sample from a normal distribution (with a mean of 100 and a standard deviation of 10) to obtain their revised talking times. (Hint: Copy the original talk.time vector into a new object and add the bonus to Calvin Klein wearers by using logical indexing.)

# Step 0: Copy original talking time vector:
rev.talk.time <- talk.time
# mean(rev.talk.time)

# Step 1: Create a logical vector of people who wore Calvin Klein:
ck.log <- cologne == "calvinklein"

# Step 2: Get bonus times:
set.seed(4711)
bonus.time <- rnorm(n = sum(ck.log), mean = 100, sd = 10)

# Step 3: Add the bonus only to the times of Calvin Klein wearers:
rev.talk.time[ck.log] <- rev.talk.time[ck.log] + bonus.time
# mean(rev.talk.time)

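## OR do Steps 1 to 3 at once.
## Caution: As written, this line uses different parameters (mean = 30, sd = 5)
## and adds a second bonus on top of Steps 1 to 3. To use it as a true alternative,
## start from a fresh copy (rev.talk.time <- talk.time) and set mean = 100, sd = 10.
set.seed(4711)
rev.talk.time[cologne == "calvinklein"] <- rev.talk.time[cologne == "calvinklein"] + rnorm(n = sum(cologne == "calvinklein"), mean = 30, sd = 5)
# mean(rev.talk.time)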
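
The calls to set.seed() make the random bonus reproducible. A minimal sketch of this behavior (the object names draw1 to draw3 are for illustration only):

```r
set.seed(4711)
draw1 <- rnorm(n = 5, mean = 100, sd = 10)

set.seed(4711)                               # reset to the same seed
draw2 <- rnorm(n = 5, mean = 100, sd = 10)
identical(draw1, draw2)                      # same seed, same random numbers
#> [1] TRUE

draw3 <- rnorm(n = 5, mean = 100, sd = 10)   # no reset: continues the random stream
identical(draw1, draw3)
#> [1] FALSE
```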

14b. Verify that the averages of revised talking times for Calvin Klein wearers exceed the talking times for Gio wearers for each bar.

# A. Numerical solution:
# revised talk times at Casba:
mean(rev.talk.time[bar == "casba" & cologne == "calvinklein"])
#> [1] 313.6776
mean(rev.talk.time[bar == "casba" & cologne == "gio"])
#> [1] 294.8

# revised talk times at Klimperkasten:
mean(rev.talk.time[bar == "klimperkasten" & cologne == "calvinklein"])
#> [1] 165.2864
mean(rev.talk.time[bar == "klimperkasten" & cologne == "gio"])
#> [1] 146.6667

Visualizing raw data

The following code provides a graphical solution to 14b by using the pirateplot() function of the yarrr package:

# Bonus: Graphical solution: 
# Combine the relevant vectors in a dataframe: 
df <- data.frame(bar, cologne, talk.time, rev.talk.time)



# (1) Use the yarrr package to create pirateplots:
yarrr::pirateplot(talk.time ~ cologne + bar, data = df,
                  main = "(a) Original talk times by cologne at each bar", 
                  pal = c("orange3", "steelblue3"))

yarrr::pirateplot(rev.talk.time ~ cologne + bar, data = df,
                  main = "(b) Revised talk times by cologne at each bar",
                  pal = c("orange3", "steelblue3"))

Alternatively, the ggplot2 package allows the following visualizations:

# (2) Use ggplot2 to show raw data and distributions:
library(ggplot2)
ggplot(data = df) +
  geom_violin(aes(x = bar, y = talk.time)) +
  geom_jitter(aes(x = bar, y = talk.time, color = cologne), width = .05, alpha = .5) +
  labs(title = "(a) Original talk times by cologne at each bar", x = "bar", y = "talk.time") + 
  scale_color_manual(values = c("orange3", "steelblue3")) + 
  theme_light() 


ggplot(data = df) +
  geom_violin(aes(x = bar, y = rev.talk.time)) +
  geom_jitter(aes(x = bar, y = rev.talk.time, color = cologne), width = .05, alpha = .5) +
  labs(title = "(b) Revised talk times by cologne at each bar", x = "bar", y = "rev.talk.time") + 
  scale_color_manual(values = c("orange3", "steelblue3")) + 
  theme_light() 


That’s it – hope you enjoyed working on this assignment!


[WPA02_answers.Rmd updated on 2017-11-09 16:35:42 by hn.]