Goals for this lab.

Setup and packages

As usual, we start by loading our two packages: mosaic and ggformula. To load a package, you use the library() function. I’ve put the code to load one package into the chunk below. Add the other package you need.

library(ggformula)


# put in the other package that you need here

Nearsightedness and Nightlights Revisited

Recall the study investigating whether there is a relationship between the use of night lights in a child’s room before age 2 and the child’s eyesight a few years later. In Chapter 4, we presented a two-way table of counts from the study, which examined whether a child’s eyesight was associated with the light level in the room while sleeping. We will recreate that two-way table of counts.

Load and explore the data

We’ll load the example data, LightSightData1.csv, from this URL: https://raw.githubusercontent.com/IJohnson-math/Math138/main/LightSightData1.csv. Since this is a CSV file, we will use the read.csv() function.

  1. Load and look at the data using the glimpse command and by looking in the data file after it is loaded
#load the data here
#NightlightData <- 
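As a minimal sketch of the read-then-inspect pattern, the chunk below writes a tiny made-up CSV to a temporary file and reads it back, so it runs without an internet connection. The data frame `ToyData` and the column names `Light` and `Sight` are assumptions for illustration; the real file's headers may differ. For the lab itself, the same read.csv() call should work with the URL above in place of the local path.

```r
# Hypothetical stand-in data: the column names Light and Sight are assumptions,
# not necessarily the headers used in LightSightData1.csv.
toy <- data.frame(
  Light = c("dark", "nightlight", "room light", "dark"),
  Sight = c("not nearsighted", "nearsighted", "nearsighted", "not nearsighted")
)

# Write the toy data to a temporary CSV so the example runs offline.
toy_path <- tempfile(fileext = ".csv")
write.csv(toy, toy_path, row.names = FALSE)

# read.csv() takes a local path or a URL string in the same way.
ToyData <- read.csv(toy_path)

# str() is a base R counterpart of glimpse(): one line per variable with its type.
str(ToyData)
dim(ToyData)
```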

The explanatory variable and variable type: light level while sleeping (dark / night light / room light), a categorical variable

The response variable and variable type: sight (nearsighted / not nearsighted), a binary categorical variable

  1. To study a potential association between nearsightedness and bedroom light levels, we use the parameters \(\pi_{D}\), \(\pi_{NL}\), and \(\pi_{RL}\). Explain the meaning of these parameters in the space provided below.

\(\pi_{D}\)

\(\pi_{NL}\)

\(\pi_{RL}\)

  1. Our hypotheses in words are:

\[H_0:\textrm{ The proportion of nearsightedness is the same in each of the three groups } \] \[H_a:\textrm{ At least one of the population proportions is different }\]

Equivalently, our hypotheses in notation are

\[ H_0: \pi_{D} = \pi_{NL} = \pi_{RL}\] \[ H_a: \textrm{ not all of } \pi_{D}, \pi_{NL}, \pi_{RL} \textrm{ are equal}\]

  1. Create a two-way table of counts and a two-way table of proportions for the data. Remember to use the correct order for the explanatory and response variables in your code.

  2. Are the validity conditions for a chi-square test met? Explain why or why not, and what you are checking.

  3. Create a segmented bar graph for the data. Use contrasting colors in your graph that will display nicely even if the document is printed in black and white. Give your graph a title and label your axes.
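For the table and validity steps, here is a minimal base R sketch on made-up data. The data frame `ToyData`, its column names, and all counts are hypothetical and are not the study's values. It assumes the convention of putting the response variable in the rows and the explanatory variable in the columns, and it checks one common guideline for the chi-square test (at least 10 observations in every cell). The mosaic function tally() offers a formula-based alternative to table().

```r
# Hypothetical counts for illustration only; these are NOT the study's data.
ToyData <- data.frame(
  Light = rep(c("dark", "nightlight", "room light"), times = c(100, 100, 50)),
  Sight = c(rep(c("nearsighted", "not nearsighted"), times = c(20, 80)),
            rep(c("nearsighted", "not nearsighted"), times = c(30, 70)),
            rep(c("nearsighted", "not nearsighted"), times = c(25, 25)))
)

# Response variable in the rows, explanatory variable in the columns.
counts <- table(ToyData$Sight, ToyData$Light)
counts

# Conditional proportions within each light group (each column sums to 1).
props <- prop.table(counts, margin = 2)
round(props, 3)

# Assumed guideline: at least 10 observations in every cell of the table.
all(counts >= 10)
```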

Recall, one way to calculate the chi-square statistic, \(\chi^2\), is to

  • standardize our sample proportions,
  • square these standardized values, and
  • add them up.

\[\chi^2 = \sum_{\textrm{groups } i} \left( \frac{\hat{p}_i - \hat{p}}{\sqrt{\frac{\hat{p}(1-\hat{p})}{n_i}}}\right)^2\]

where \(n_i\) is the sample size of group \(i\) and \(\hat{p}_i\) is its sample proportion of successes. The pooled proportion of successes is denoted by \(\hat{p}\).
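The standardize, square, and add recipe can be sketched in R as follows. The group sizes and nearsighted counts are made up for illustration and are not the study's data. The names `n`, `near`, `p_hat_i`, `p_hat`, and `z` are also just choices for this sketch.

```r
# Hypothetical group sizes and nearsighted counts (illustration only).
n    <- c(dark = 100, nightlight = 100, roomlight = 50)
near <- c(dark = 20,  nightlight = 30,  roomlight = 25)

p_hat_i <- near / n            # sample proportion in each group
p_hat   <- sum(near) / sum(n)  # pooled proportion: 75 / 250 = 0.3

# Standardize each group proportion using the pooled standard error.
z <- (p_hat_i - p_hat) / sqrt(p_hat * (1 - p_hat) / n)
z

# Square the standardized values and add them up.
chiSq <- sum(z^2)
chiSq  # about 14.29 for these made-up counts
```

Because R works on whole vectors at once, a single expression standardizes all three groups. Writing each group out separately, as the lab scaffold suggests, gives the same total.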

  1. Find the overall pooled proportion of children who are nearsighted. Then find, name, and enter into RStudio the sample size of each group and the proportion of nearsighted children in each group. Calculate the standardized z-statistics and, from those values, the chi-square statistic. Make sure to write your code so that the value of \(\chi^2\) is displayed.
#total sample size for each group



# the pooled proportion of nearsighted children



# proportions of nearsighted children



# standardize proportion darkness



# standardize proportion night light



# standardize proportion room light



#chi-square statistic
#chiSq <- 
  1. Go to the ISI Applets and use the chi-square statistic to calculate a simulation-based p-value for the NightLight1 data. Note the data is pre-loaded in the applet. Report your p-value below.

simulation-based p-value:

  1. Use the chisq.test function to calculate a theory-based p-value. Write your executable code in the code chunk below and record your theory-based p-value. Does the chi-square statistic you calculated by hand match the values reported by the applet and by the chisq.test function?
#theory-based p-value from a chi-square test  

theory-based p-value:

\(\chi^2 =\)
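As a hedged illustration of how chisq.test() reports its results, the sketch below runs it on a made-up two-way table. The counts are hypothetical and are not the study's data. The object name `result` is an arbitrary choice.

```r
# Hypothetical two-way table (illustration only, not the study's data).
# matrix() fills column by column, so each pair of numbers is one light group.
counts <- matrix(c(20, 80,
                   30, 70,
                   25, 25),
                 nrow = 2,
                 dimnames = list(Sight = c("nearsighted", "not nearsighted"),
                                 Light = c("dark", "nightlight", "room light")))

result <- chisq.test(counts)

result$statistic  # the chi-square statistic
result$parameter  # degrees of freedom: (2 - 1) * (3 - 1) = 2
result$p.value    # theory-based p-value
```

Pulling out `result$statistic` makes it easy to compare the value from chisq.test() with a chi-square statistic computed by hand.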

  1. Write your conclusions below. Does this study suggest that the use of night lights and room lights increases the chance that a child is nearsighted? Why or why not? Do you have any concerns regarding the conclusion of this study? Explain.

Significance with context:

Causation:

Recall, there is another more general formula for calculating the chi-square statistic.

\[ \chi^2 = \sum_{\textrm{all cells}} \frac{(\textrm{observed count} - \textrm{expected count})^2}{\textrm{expected count}}\]
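The general formula can be sketched in R as follows, again with made-up counts that are not the study's data. It assumes the usual expected count for a cell under the null hypothesis, (row total × column total) / grand total. The names `observed`, `expected`, and `chiSq_general` are choices for this sketch.

```r
# Hypothetical observed counts (illustration only, not the study's data).
observed <- matrix(c(20, 80,
                     30, 70,
                     25, 25),
                   nrow = 2,
                   dimnames = list(Sight = c("nearsighted", "not nearsighted"),
                                   Light = c("dark", "nightlight", "room light")))

# Expected count for each cell: (row total * column total) / grand total.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
expected

# Sum (observed - expected)^2 / expected over all six cells.
chiSq_general <- sum((observed - expected)^2 / expected)
chiSq_general  # about 14.29 for these made-up counts
```

For a table with a binary response, this total agrees with the sum of squared standardized proportions, so either formula can be used to check the other.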

  1. Use the worksheet from class to calculate \(\chi^2\) again using the more general formula. Include an image of your completed worksheet, as shown below.

Calculating Chi-squared general formula.
