Lab 5: Inference & Confidence Intervals for Two Means

Goals for this lab.

Identify when a theory-based approach would be valid to find the p-value or a confidence interval when evaluating the relationship between one binary and one quantitative variable.
Use theory-based methods to find p-values and confidence intervals for a test of two means.
Draw appropriate conclusions from theory-based techniques for two means.

Setup and packages

As usual, we start by loading our two packages: mosaic and ggformula. To load a package, wrap the library() function around the name of the package. The chunk below loads both packages.

library(mosaic)
library(ggformula)

Loading in data

We’ll load the example data, VideoAggression.txt, from this URL: http://www.isi-stats.com/isi/data/chap6/VideoAggression.txt. Since this is a text file, we will use the read.table() function.

#load data
VidGames <- read.table("http://www.isi-stats.com/isi/data/chap6/VideoAggression.txt", header=TRUE)
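To see what header=TRUE does without depending on the network, here is a minimal sketch that reads a small inline table. The three rows are made-up illustration values, not rows from VideoAggression.txt; only the column names ChiliSauce and VideoGame match the lab's data, and the names toy_text and toy are hypothetical.

```r
# Made-up rows for illustration only; NOT taken from VideoAggression.txt
toy_text <- "ChiliSauce VideoGame
5 violent
12 nonviolent
8 violent"

# header = TRUE tells read.table() that the first line holds the column names
toy <- read.table(text = toy_text, header = TRUE)

str(toy)    # column names and types
nrow(toy)   # number of observational units (rows)
```

Running str(VidGames) and nrow(VidGames) on the real data works the same way and is a quick check that the file loaded as expected.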

Research Question

Does playing violent video games lead people to be more or less aggressive?

Study design

To investigate an association between violent video games and aggressive behavior, British researchers Hollingdale and Greitemeyer (2014) randomly assigned 49 students from a university in the United Kingdom to play Call of Duty: Modern Warfare (a violent video game) and 52 students to play LittleBigPlanet 2 (a nonviolent/neutral video game). After 30 minutes of playing the video games, the subjects were asked to complete a marketing survey investigating a new hot chili sauce recipe. They were told they were to prepare some chili sauce for a taste tester and that the taste tester “couldn’t stand hot chili sauce but was taking part due to good payment.” They were then presented with what appeared to be a very hot chili sauce and asked to spoon what they thought would be an appropriate amount into a bowl for a new recipe. The amount of chili sauce was weighed in grams after the participant left the experiment. The amount of chili sauce was used as a measure of aggression: The more chili sauce, the greater the subject’s aggression.

Experiment or Observational Study? Notice that this is an experiment because the participants were randomly assigned to the two treatment groups.

Explanatory Variable: VideoGame, the type of video game played (violent or nonviolent); categorical, binary

Response Variable: ChiliSauce, the grams of chili sauce added; quantitative

Hypotheses

Null hypothesis: There is no association between the type of video game played and the level of behavioral aggression as measured by the amount of chili sauce added to the recipe.

Alternative hypothesis: There is an association between the type of video game played and the level of behavioral aggression as measured by the amount of chili sauce added to the recipe.

Parameters of interest:

\(\mu_{violent}\), the long-run mean amount of chili sauce used in the recipe by people after playing a violent video game

\(\mu_{nonviolent}\), the long-run mean amount of chili sauce used in the recipe by people after playing a nonviolent video game

Now we can rewrite our hypotheses in terms of the notation for our parameters \(\mu_{violent}\) and \(\mu_{nonviolent}\):

\[H_0:\mu_{violent}=\mu_{nonviolent}\] \[H_a:\mu_{violent}\neq \mu_{nonviolent}\]

or equivalently

\[H_0:\mu_{violent}-\mu_{nonviolent}=0\] \[H_a:\mu_{violent}- \mu_{nonviolent} \neq 0\]

Explore the Data

Let’s start by plotting side-by-side boxplots of our data. We will use the command gf_boxplot and input ResponseVariable ~ ExplanatoryVariable. This order reflects that we want to view the response variable in terms of the explanatory variable. Notice that the categories of the explanatory variable are on the horizontal axis and the response variable is on the vertical axis.

gf_boxplot(ChiliSauce ~ VideoGame, data=VidGames, xlab="Type of video game", ylab="Grams of Chili Sauce")

The five number summary (and more) can be obtained using the favstats command, short for favorite statistics.

favstats(ChiliSauce ~VideoGame, data=VidGames)

##    VideoGame min   Q1 median    Q3 max      mean        sd  n missing
## 1 nonviolent   0 2.75    8.5 11.25  38  9.057692  7.652152 52       0
## 2    violent   1 5.00   11.0 22.00  63 16.122449 15.296558 49       0

We can also graph histograms of the chili sauce amounts for the two groups by first filtering the data into groups, then graphing each group. Notice that each histogram includes a title corresponding to the type of video game played. Also notice that the label on the horizontal axis includes the units (grams) for the amount of chili sauce.

Vio <- filter(VidGames, VideoGame=='violent')
Nonvio <- filter(VidGames, VideoGame=='nonviolent')

gf_histogram(~ChiliSauce, data=Vio, title = "Violent video game players", xlab = "Chili sauce amount (in grams)")

gf_histogram(~ChiliSauce, data=Nonvio, title = "Nonviolent video game players", xlab = "Chili sauce amount (in grams)")

From our favstats calculations we can see that the sample sizes of the two groups, violent and nonviolent video game players, are 49 and 52, respectively. These are our two sample sizes \(n_{vio} = 49\) and \(n_{nonvio} = 52\). For the calculations that follow, we define variable names and values for the two means, standard deviations, and sample sizes.

n_vio = 49
n_vio

## [1] 49

xbar_vio = 16.122449
xbar_vio

## [1] 16.12245

sd_vio = sd(~ChiliSauce, data=Vio)
sd_vio

## [1] 15.29656

n_nonvio = 52
n_nonvio

## [1] 52

xbar_nonvio = 9.057692
xbar_nonvio

## [1] 9.057692

sd_nonvio = sd(~ChiliSauce, data=Nonvio)
sd_nonvio

## [1] 7.652152

Two Means: Validity Conditions for theory-based inference and confidence intervals

Validity Conditions: The quantitative variable should have a symmetric distribution in both groups, or you should have at least 20 observations in each group and the sample distributions should not be strongly skewed.

Looking back at the histogram plots, neither the violent game nor nonviolent game group data has a symmetric distribution; both are right skewed. However, our validity conditions are met because our sample sizes of 49 in the violent game group and 52 in the nonviolent game group are both much larger than 20 and our sample distributions are not strongly skewed.
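The sample-size part of this check can be written out in R. The sketch below uses only values copied from the favstats output; the variable names are hypothetical, and the mean-versus-median comparison is a rough heuristic added for illustration, not part of the stated validity conditions.

```r
# Values copied from the favstats() output above
n       <- c(nonviolent = 52, violent = 49)
means   <- c(nonviolent = 9.057692, violent = 16.122449)
medians <- c(nonviolent = 8.5, violent = 11.0)

# Sample-size part of the validity conditions: at least 20 in each group
size_ok <- all(n >= 20)
size_ok

# Rough skew indicator (a heuristic only): a mean above the median
# points toward right skew
means - medians
```

How strong the skew is still has to be judged by eye from the histograms; this check does not replace that step.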

Calculate the standardized statistic, a \(t\)-statistic.

Let’s start by finding our observed statistic.

diff_means <- xbar_vio - xbar_nonvio
diff_means

## [1] 7.064757

For two means, the standard error of \(\bar{x}_1 - \bar{x}_2\) is given by

\[ SE=\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}. \]

Using our numerical values from above

SE = sqrt(sd_vio^2/n_vio + sd_nonvio^2/n_nonvio)
SE

## [1] 2.429252
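Substituting the favstats values (rounded to four decimal places) into the formula makes the arithmetic explicit:

\[ SE=\sqrt{\frac{15.2966^2}{49}+\frac{7.6522^2}{52}} \approx \sqrt{4.7752+1.1261} = \sqrt{5.9013} \approx 2.4293 \]

Notice that the violent group contributes most of the variability because its standard deviation is about twice as large.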

Next, we can calculate the standardized statistic using the formula

\[ t = \frac{\textit{statistic} - \textit{hypothesized value in null}}{\textrm{SE(}\textit{statistic})}= \frac{\bar{x}_1 - \bar{x}_2 - 0}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}\]

diff_means/SE

## [1] 2.908203
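The same steps can be collected into one small function. This is a minimal sketch, assuming you only have the summary values from favstats; the function name two_mean_t and its argument names are hypothetical, not part of mosaic or base R.

```r
# Standardized statistic for a difference in two means, from summary values
two_mean_t <- function(xbar1, s1, n1, xbar2, s2, n2, null_diff = 0) {
  se     <- sqrt(s1^2 / n1 + s2^2 / n2)          # unpooled standard error
  t_stat <- (xbar1 - xbar2 - null_diff) / se     # (statistic - null value) / SE
  c(diff = xbar1 - xbar2, se = se, t = t_stat)
}

# Summary values copied from the favstats() output (violent group first)
res <- two_mean_t(16.122449, 15.296558, 49, 9.057692, 7.652152, 52)
round(res, 3)
```

Swapping the order of the two groups changes the sign of diff and t but not their size, so decide which group comes first before interpreting the sign.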

What does this standardized statistic suggest regarding our hypothesis test?

Our standardized \(t\)-statistic is 2.9. A \(t\)-statistic that is larger than 2 is strong evidence against the null hypothesis and larger than 3 is very strong evidence against the null. So we have strong evidence (nearly very strong evidence) against the null hypothesis that there is no association between the type of video game played and aggression as measured by added grams of chili sauce to a recipe.

Calculate the 2SD confidence interval

To find confidence intervals for a difference of means, we use the standard error calculation from above.

Recall that the basic formula for a confidence interval is \[ \textit{statistic } \pm \textit{ margin of error}\]

In the setting of a difference in two means we have

\[ (\bar{x}_1 - \bar{x}_2 ) \pm \textit{ multiplier } \cdot \sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}\]

When the sample size in each group is large, the \(t\)-distribution is close to the normal distribution, so the multipliers are close to 1.645 for 90% confidence intervals, 1.96 for 95% confidence intervals (we will round this to 2), and 2.576 for 99% confidence intervals.
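These multipliers can be looked up in R with the base functions qnorm and qt. The sketch below is for illustration; the degrees-of-freedom values passed to qt are arbitrary choices made to show how the \(t\) multiplier approaches the normal one as the sample size grows.

```r
conf_levels <- c(0.90, 0.95, 0.99)

# Normal multipliers: the quantile that leaves (1 - level)/2 in the upper tail
z_mult <- qnorm(1 - (1 - conf_levels) / 2)
round(z_mult, 3)   # 1.645 1.960 2.576

# 95% t multipliers for a few illustrative degrees of freedom;
# they shrink toward 1.96 as the degrees of freedom increase
t_mult <- qt(0.975, df = c(10, 30, 70, 1000))
round(t_mult, 3)
```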

So our 2SD confidence interval can be calculated using the formula \[ (\bar{x}_1 - \bar{x}_2 ) \pm 2 \cdot \sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}\]

We can use the standard error for the difference between the two means to calculate a 2SD confidence interval as follows.

lower_endpoint <- diff_means - 2*SE
lower_endpoint

## [1] 2.206254

upper_endpoint <-diff_means + 2*SE
upper_endpoint

## [1] 11.92326

Our 2SD confidence interval is (2.21, 11.92).
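To see how the choice of multiplier changes the interval, here is a minimal sketch that reuses the observed difference and standard error from above. The helper name ci is hypothetical, and the normal multipliers from qnorm are used as a large-sample approximation.

```r
diff_means <- 7.064757   # observed difference, violent minus nonviolent
SE         <- 2.429252   # standard error computed earlier in the lab

# Interval = estimate plus or minus multiplier times standard error
ci <- function(estimate, se, multiplier) {
  c(lower = estimate - multiplier * se, upper = estimate + multiplier * se)
}

ci_2sd <- ci(diff_means, SE, 2)              # 2SD interval
ci_95  <- ci(diff_means, SE, qnorm(0.975))   # 95%, multiplier about 1.96
ci_99  <- ci(diff_means, SE, qnorm(0.995))   # 99%, multiplier about 2.576

round(rbind(ci_2sd, ci_95, ci_99), 2)
```

A higher confidence level uses a larger multiplier and therefore gives a wider interval.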

Calculate the \(p\)-value. Inference for difference of two means.

Next we calculate the theory-based \(p\)-value using the same command used for inference with one mean, namely t.test. We must be careful to enter our variables in the correct order, Response ~ Explanatory, include the name of the data frame, and specify whether our alternative hypothesis is two.sided, greater, or less.

#inference for two means
t.test(ChiliSauce ~ VideoGame, data = VidGames, alternative = "two.sided")

## 
##  Welch Two Sample t-test
## 
## data:  ChiliSauce by VideoGame
## t = -2.9082, df = 69.662, p-value = 0.004874
## alternative hypothesis: true difference in means between group nonviolent and group violent is not equal to 0
## 95 percent confidence interval:
##  -11.910160  -2.219353
## sample estimates:
## mean in group nonviolent    mean in group violent 
##                 9.057692                16.122449

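The \(p\)-value for our hypothesis test is 0.0049, which implies very strong evidence against the null hypothesis. Notice that t.test orders the groups alphabetically, so it reports the difference as nonviolent minus violent. That is why the \(t\)-statistic and the confidence interval endpoints are negative here. The \(t\)-statistic has the same magnitude as our hand calculation of violent minus nonviolent, and the interval is close to the negative of our 2SD interval.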
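The fractional degrees of freedom in the output come from the Welch-Satterthwaite approximation, \(df = \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{(s_1^2/n_1)^2/(n_1-1) + (s_2^2/n_2)^2/(n_2-1)}\). The sketch below reproduces the reported values from the favstats summary values using only base R; the variable names are chosen for illustration.

```r
# Summary values from the favstats() output
xbar_vio <- 16.122449; sd_vio <- 15.296558; n_vio <- 49
xbar_non <- 9.057692;  sd_non <- 7.652152;  n_non <- 52

v_vio <- sd_vio^2 / n_vio   # variance of the violent-group mean
v_non <- sd_non^2 / n_non   # variance of the nonviolent-group mean
se    <- sqrt(v_vio + v_non)

# Welch-Satterthwaite degrees of freedom
df <- (v_vio + v_non)^2 / (v_vio^2 / (n_vio - 1) + v_non^2 / (n_non - 1))

# Same group order as the t.test() output: nonviolent minus violent
t_stat  <- (xbar_non - xbar_vio) / se
p_value <- 2 * pt(-abs(t_stat), df)   # two-sided p-value
ci_95   <- (xbar_non - xbar_vio) + c(-1, 1) * qt(0.975, df) * se

round(c(t = t_stat, df = df, p = p_value), 4)
round(ci_95, 3)
```

The multiplier qt(0.975, df) is slightly smaller than 2, which is why this interval is a little narrower than the 2SD interval.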

# Use this command if you have the means, standard deviations, and sample sizes but not the data.
# It needs the BSDA package: install the package first, then load the library.
library(BSDA)

#the command tsum.test calculates a two sample t-test from summary values
tsum.test(xbar_vio, sd_vio, n_vio, xbar_nonvio, sd_nonvio, n_nonvio, alternative="two.sided", conf.level = 0.95, mu=0)

## 
##  Welch Modified Two-Sample t-Test
## 
## data:  Summarized x and y
## t = 2.9082, df = 69.662, p-value = 0.004874
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##   2.219353 11.910161
## sample estimates:
## mean of x mean of y 
## 16.122449  9.057692

Conclusions

Significance: Our standardized \(t\)-statistic is 2.9, meaning that the observed difference in sample means of 7.065 grams is 2.9 standard deviations away from the hypothesized difference of 0. A \(t\)-statistic that is larger than 2 is strong evidence against the null hypothesis and larger than 3 is very strong evidence against the null. So we have strong evidence (nearly very strong evidence) against the null hypothesis that there is no association between the type of video game played and aggression as measured by added grams of chili sauce to a recipe.

The \(p\)-value for our hypothesis test is 0.0049, which implies very strong evidence against the null hypothesis. If there were really no association between the type of video game played and the amount of chili sauce used, there would be only a 0.0049 chance of obtaining, by random assignment alone, sample means as far apart as or farther apart than those found in this study. Thus, we have statistically significant evidence that there is a genuine difference in mean chili sauce amounts between those who play violent video games and those who play nonviolent video games.

Estimation:

Our 2SD (approximately 95%) confidence interval is (2.21, 11.92), so we are about 95% confident that, in the long run, the mean amount of chili sauce used would be 2.21 to 11.92 grams higher for those who play a violent video game than for those who play a nonviolent video game. Notice that this interval contains only positive values and does not contain 0. Thus, we are 95% confident that the difference in mean amounts of hot chili sauce used between those who play violent video games and those who play nonviolent video games is not 0.

Causation:

Considering that the study was a randomized experiment, we can conclude a cause-and-effect relationship between the type of video game and the amount of chili sauce used.

Generalization:

There was no random sampling in this study. The participants in this study were all university students in the U.K. This limits the population to whom we can generalize these results. The association between type of video game played and amount of chili sauce used may not hold true for people from other cultures and of other ages.

Exercises

Anchoring

Anchoring is “the common human tendency to rely too heavily, or ‘anchor,’ on one trait or piece of information when making decisions.” (Source: Wikipedia.) A group of students taking an introductory statistics course at a four-year university in California were asked to guess the population of Milwaukee, Wisconsin. Some of the students were randomly chosen to be told that the nearby city of Chicago, Illinois, has a population of about 3 million people, while the rest of the students were told that the nearby city of Green Bay, Wisconsin, has a population of about 100,000. Previous studies have shown that these numbers serve as a psychological anchor, so people told about Chicago tend to guess a higher population for Milwaukee than people told about Green Bay. (For more about this phenomenon, see the book Nudge: Improving Decisions about Health, Wealth, and Happiness by Richard H. Thaler and Cass R. Sunstein.) The purpose in analyzing the data is to see whether we find strong evidence of this phenomenon among students like the ones in this study.

The data for this study can be found at http://www.isi-stats.com/isi/data/chap6/Milwaukee.txt.

Load the data from the URL http://www.isi-stats.com/isi/data/chap6/Milwaukee.txt and name the data Milwaukee. How many observational units are there? What are the names and types of the variables? Which variable is the explanatory variable? Which is the response?

Milwaukee <- read.table("http://www.isi-stats.com/isi/data/chap6/Milwaukee.txt", header=TRUE)

Observational units and number:

Variables and type:

Explanatory variable:

Response variable:

Define (in words) the parameters of interest of this study. Also, assign symbols to the parameters. The population in question is students like the ones in the study. (Hint: the notation \(\mu_{GB}\) and \(\mu_{C}\) might be useful.)
State the appropriate null and alternative hypotheses.

\[H_0: \]

\[H_a: \]

Display the data with boxplots and histograms. Label your axes with units and give your plots titles that include applicable context.
Calculate the favorite statistics for the estimated population of Milwaukee for the two groups of anchor cities.
Use R as a calculator to find and display the following:

\(\bar{x}_1 - \bar{x}_2\)
the standard error \(SE(\bar{x}_1 - \bar{x}_2)\)
the standardized statistic
the 2SD confidence interval for the difference in means
Please remember to write your formulas in a code chunk so that the answers are displayed in your knitted document.

Check the Validity Conditions for a two sample \(t\)-test. Explain what you are checking, any numerical values you are comparing, and whether or not the conditions have been met.
Use the proper command in R to calculate the theory-based \(p\)-value for the hypothesis test.
Calculate a theory-based 99% confidence interval and interpret the resulting interval in the context of the study.
Based on your findings, state a complete conclusion about the study. Be sure to address significance (p-value and standardized statistic), estimation (confidence interval), causation, and generalization.

Significance with context:

Estimation with interpretation:

Causation:

Generalization: