
PSY212: Graded Assignment #5

Chi Square and Regression

80 points

 This assignment includes conceptual and computational problems regarding regression as well as SPSS. Please note that each problem has several parts. For full credit, please respond to each part completely and show all your work. Take a picture of your work for all hand computations, which are marked with an icon, and insert each picture in the place noted. Do not include them only at the end and do not submit them in a separate file. Submissions must be in PDF or Word format – no other formats will be accepted for credit.

DO NOT change the formatting or the numbering of the exam. Keep all questions and point values on this document. Simply respond in the places noted.

*Note that this assignment should be completed individually. Think of this assignment as a “take-home quiz.” Therefore, please do not discuss this assignment with anyone other than your professor. When complete, submit an electronic copy on Bb.

 

 

TYPE all of your answers in addition to the pictures of your work.

  1. A store owner is trying to decide whether he should order equal amounts of four types of milk (skim, 1%, 2%, and whole). To help determine if certain kinds of milk are more popular, he records how many gallons of each kind of milk he sells over the course of a week. His data shows that he sold: 31 gallons of skim milk, 23 gallons of 1% milk, 37 gallons of 2% milk, and 25 gallons of whole milk. The owner is interested in whether the store sells equal amounts of the four types of milk.

Compute a Goodness of Fit Chi Square. Report and interpret your results including the value of chi square, degrees of freedom and statistical and practical significance.

χ² = Σ [(fo – fe)² / fe]

df = k – 1

Where χ² = chi square, fo = frequency observed, fe = frequency expected, ϕ = phi (measure of practical significance)

Show your work here (for full credit you must show your work for all steps – type in below or attach a picture). Also show your work for the computation of phi.  10 points :

              fo      fe
Skim milk
1% milk
2% milk
Whole milk
Sum

Write out your interpretation, including the frequencies for all groups, and report your results in the following format: χ²(df) = ___, p < .05 or p > .05, Φ = ___. 10 points
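The hand computation for the milk problem can be cross-checked with a short script. This is a minimal sketch in Python rather than a required method. It assumes the null hypothesis of equal sales across the four milk types and takes the critical value 7.815 (df = 3, alpha = .05) from a standard chi-square table. Phi is left out because the handout's formula for it is not reproduced in this document.

```python
# Observed gallons sold, taken from the problem statement.
observed = {"skim": 31, "1%": 23, "2%": 37, "whole": 25}


def goodness_of_fit(fo):
    """Return (chi_square, df, fe) under the null of equal frequencies."""
    n = sum(fo.values())
    k = len(fo)
    fe = n / k  # same expected frequency for every category
    chi_square = sum((f - fe) ** 2 / fe for f in fo.values())
    return chi_square, k - 1, fe


chi_square, df, fe = goodness_of_fit(observed)

# Assumption: tabled critical value of chi square for df = 3, alpha = .05.
CRITICAL_DF3_ALPHA_05 = 7.815

print(f"fe = {fe:.2f} gallons per milk type")
print(f"chi-square({df}) = {chi_square:.2f}")
print("reject H0" if chi_square > CRITICAL_DF3_ALPHA_05 else "retain H0")
```

The script only checks the arithmetic; the written interpretation still has to list the observed frequencies for each group.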

 

 

  2. Santos et al. (1994) described the pique technique. They claimed that people are more likely to comply with strange requests than with “typical” ones, even if the strange request is larger.

To test this hypothesis, a researcher asked 160 strangers for money. He asked some people if they could spare a quarter, and asked others if they could spare 37 cents. He then recorded how many people gave him the amount requested and how many people did not.

Were people more likely to give the researcher 37 cents than they were to give him 25 cents?

Compute a Chi Square Test of Independence. Report and interpret your results including the value of chi square, degrees of freedom and statistical and practical significance.

 

Observed Frequencies

Show your work here (type in row totals, column totals and grand total here):

(for full credit you must show your work for all steps – type in below or attach a picture)

                      YES – Did give the $    NO – Did not give the $    Row Total
Asked for 25 cents    18                      42
Asked for 37 cents    54                      46
Column Total                                                             Grand Total =

 

Show your work here (for full credit you must show your work for all steps – type in below or attach a picture). Also show your work for the computation of phi.  10 points :

Expected Frequencies (For each cell compute the following: row total x column total/grand total)

YES – Did give the $ NO – Did not give the $
Asked for 25 cents
Asked for 37 cents

 

df = (ka – 1)(kb – 1)

(for full credit you must show your work for all steps – type in below or attach a picture)

 

 

YES – Did give the $ NO – Did not give the $
Asked for 25 cents    
Asked for 37 cents    

 

Write out your interpretation, including the frequencies for all groups, and report your results in the following format: χ²(df) = ___, p < .05 or p > .05, Φ = ___. 10 points
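The expected frequencies and the test statistic for the 25-cent versus 37-cent problem can be checked with the sketch below. It follows the rule stated above (row total x column total / grand total). Two things are assumptions rather than part of the handout: phi is computed as the square root of chi square divided by N, the usual form for a 2 x 2 table, and the critical value 3.841 (df = 1, alpha = .05) comes from a standard chi-square table.

```python
import math

# Observed frequencies: rows = asked for 25 cents, asked for 37 cents;
# columns = gave the money, did not give the money.
observed = [[18, 42], [54, 46]]


def chi_square_independence(table):
    """Return (chi_square, df, expected, n) for a two-way frequency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected frequency per cell: row total x column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi_square = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_square, df, expected, n


chi_square, df, expected, n = chi_square_independence(observed)

# Assumption: phi = sqrt(chi_square / N), the usual effect size for a 2 x 2 table.
phi = math.sqrt(chi_square / n)

# Assumption: tabled critical value of chi square for df = 1, alpha = .05.
CRITICAL_DF1_ALPHA_05 = 3.841

print("expected frequencies:", expected)
print(f"chi-square({df}) = {chi_square:.2f}, phi = {phi:.2f}")
print("reject H0" if chi_square > CRITICAL_DF1_ALPHA_05 else "retain H0")
```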

 

 

  3. A researcher wants to assess whether students’ level of prejudice predicts attitudes toward racial profiling.  As part of a larger survey, students complete two scales pertaining to those variables.  Higher scores on the prejudice measure indicate greater prejudice and higher scores on the profiling scale indicate greater support for racial profiling.  Scores on both measures are obtained from each of 20 students.
  Prejudice Profiling    
  X Y XY X²
  7 5 35 49
  4 4 16 16
  5 3 15 25
  6 7 42 36
  2 2 4 4
  3 4 12 9
  6 7 42 36
  4 5 20 16
  8 6 48 64
  9 8 72 81
  6 6 36 36
  5 3 15 25
  4 4 16 16
  6 5 30 36
  2 4 8 4
  6 8 48 36
  5 4 20 25
  7 8 56 49
  8 4 32 64
  7 9 63 49
Sum 110 106 630 676
Mean 5.5 5.3    

 

Insert one picture of your work for both c and d here, and type your answers below.

  a. Identify the predictor. 2 points
  b. Identify the criterion. 2 points
  c. Compute the regression line. 20 points
  d. Using the regression line – identify the predicted profiling score for a person with a score of 8 for prejudice. 4 points
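The regression line can be checked from the column sums in the table above (n = 20, ΣX = 110, ΣY = 106, ΣXY = 630, ΣX² = 676). The sketch below uses the standard computational formulas for the slope and intercept. The variable names are illustrative and the course handout may lay the steps out differently.

```python
# Column sums copied from the prejudice/profiling table.
n = 20
sum_x, sum_y, sum_xy, sum_x2 = 110, 106, 630, 676

# Sum of products and sum of squares for X (computational formulas).
sp = sum_xy - (sum_x * sum_y) / n   # 630 - 583 = 47
ss_x = sum_x2 - (sum_x ** 2) / n    # 676 - 605 = 71

slope = sp / ss_x                                  # b
intercept = (sum_y / n) - slope * (sum_x / n)      # a = mean of Y - b * mean of X


def predict(x):
    """Predicted profiling score for a given prejudice score."""
    return intercept + slope * x


print(f"Y-hat = {slope:.3f}X + {intercept:.3f}")
print(f"predicted profiling score at X = 8: {predict(8):.2f}")
```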

 

 

  4. Peterson is interested in assessing whether self-esteem predicts reading ability. The data are as follows:
Self Esteem Reading Ability
4 13
6 10
7 16
8 13
10 17
11 12
13 14
13 17

 

  a. Identify the predictor variable (IV) 1 point
  b. Identify the criterion variable (DV) 1 point
  c. Compute the Regression Analysis in SPSS using the data provided on self esteem and reading ability. Paste your output here (you will need to use a snipping tool to copy from SPSS, it will not allow you to copy and paste). Output has several parts – it must include: variables entered/removed, model summary, ANOVA, and coefficients tables. 5 points
  d. Report results in sentence format – see handout instructions on what to include and how to format it. 5 points
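The SPSS output for this problem can be sanity-checked against a direct computation. The sketch below is not a substitute for the required SPSS tables. It computes the slope, intercept, R squared, and the regression F statistic from the eight data pairs, using the definitional least-squares formulas; the function name `simple_regression` is hypothetical.

```python
self_esteem = [4, 6, 7, 8, 10, 11, 13, 13]
reading = [13, 10, 16, 13, 17, 12, 14, 17]


def simple_regression(x, y):
    """Least-squares fit of y on x; returns the values SPSS reports."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    slope = sp / ss_x
    intercept = mean_y - slope * mean_x
    ss_regression = slope * sp            # variability explained by the line
    ss_residual = ss_y - ss_regression    # variability left over
    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": ss_regression / ss_y,
        "df": (1, n - 2),
        "F": ss_regression / (ss_residual / (n - 2)),
    }


result = simple_regression(self_esteem, reading)
for name, value in result.items():
    print(name, value)
```

The slope and intercept should match the B column of the coefficients table, R squared the model summary, and F the ANOVA table, up to rounding.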

 

 

 


RECENT POLL

In a recent poll, 260 people were asked if they liked dogs, and 29% said they did. Find the Margin of Error for this poll, at the 99% confidence level. Give your answer to four decimal places if possible.
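The margin of error for a sample proportion is the critical z value times the standard error of the proportion. A minimal sketch, assuming the usual normal-approximation formula z * sqrt(p(1 - p) / n) is the one intended; the function name `margin_of_error` is hypothetical:

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence):
    """Normal-approximation margin of error for a sample proportion."""
    # Two-sided critical value, e.g. about 2.576 for 99% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p_hat * (1 - p_hat) / n)


moe = margin_of_error(p_hat=0.29, n=260, confidence=0.99)
print(round(moe, 4))
```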


Confidence Intervals We Use Everyday Discussion

Your car’s owner’s manual states that the tire pressure should be 32 pounds per square inch (PSI) ± 3 PSI. What:

  • Is the range of properly operating tire pressure?
  • Is the Error term?
  • Are the ramifications of running your tires at 38 PSI?
  • What is the manufacturer of the tire telling us about running a tire higher or lower than the recommended range?


ASSIGNMENT D

The center thicknesses (in mils) of a sample of twenty-five contact lenses are given below:

0.3978    0.4019    0.4031    0.4044    0.3984    0.3972    0.3981
0.3947    0.4012    0.4043    0.4051    0.4016    0.3994    0.3999
0.4062    0.4048    0.4071    0.4015    0.3991    0.4021    0.4009
0.3988    0.3994    0.4016    0.4010
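The summary statistics and the fence rule for these 25 values can be cross-checked outside RStudio. The sketch below is in Python and assumes the common 1.5 x IQR fences. Quartile conventions differ between tools, so the quartiles printed here (the "inclusive" method of Python's `statistics.quantiles`) may not match every RStudio function digit for digit.

```python
import statistics

thickness = [
    0.3978, 0.4019, 0.4031, 0.4044, 0.3984, 0.3972, 0.3981,
    0.3947, 0.4012, 0.4043, 0.4051, 0.4016, 0.3994, 0.3999,
    0.4062, 0.4048, 0.4071, 0.4015, 0.3991, 0.4021, 0.4009,
    0.3988, 0.3994, 0.4016, 0.4010,
]

mean = statistics.mean(thickness)
sd = statistics.stdev(thickness)           # sample standard deviation
variance = statistics.variance(thickness)  # sample variance

# Quartiles with the "inclusive" interpolation method (an assumption).
q1, median, q3 = statistics.quantiles(thickness, n=4, method="inclusive")
five_number = (min(thickness), q1, median, q3, max(thickness))

# Fence rule: flag anything beyond 1.5 x IQR from the quartiles.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in thickness if x < lower_fence or x > upper_fence]

print(f"mean = {mean:.6f}, sd = {sd:.6f}, variance = {variance:.3e}")
print("five-number summary:", five_number)
print(f"fences: [{lower_fence:.4f}, {upper_fence:.4f}], outliers: {outliers}")
```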

Using this data set and RStudio, answer the following questions:

  1. Determine the mean, the standard deviation, the variance, and the five-number summary.
  2. Using the fence rule, determine if data set has any outliers.
  3. Draw a histogram. Properly name the graph and the axes.
  4. Draw a boxplot (horizontal).
  5. Construct a qqplot and a qqline and determine if this data set is approximately normal.
  6. To experimentally verify the central limit theorem, take 1000 random samples of size n = 33 from a population that is exponentially distributed with rate λ = 3 (the values used in the hint below).  To this end, you must show that:

 

  • The histogram of the distribution is approximately normal.

HINT: Use the following session commands:

> xbar <- rep(0,1000)

> for(i in 1:1000){xbar[i]=mean(rexp(33, 3))}

> hist(xbar, prob = TRUE)

  • Its expected value is μ, the mean of the exponential population.

HINT: Use

        > mean(xbar)

What is the expected value of an exponential distribution?

  • Its variance is σ²/n, where σ is the standard deviation of the exponential population.

HINT: Use

        > sd(xbar)

What is the standard deviation of an exponential distribution?


Stats Research Report

What to include in the research report:

1. Description of the study / the research question. If you know how the variables were measured, include a short description of that. (3 pts)

2. Clearly state the analyses you intend to do using RStudio. (3 pts)

3. Report the test results (include all required values). (3 pts)

4. An appropriate plot (make sure it’s clearly labeled!). (3 pts)

5. Some kind of conclusion: say what the test results mean for the (hypothetical) study. Also mention if you notice anything strange or have any concerns about the results (e.g., a limitation of the study or maybe something you’d look at in a follow-up). (3 pts)


Testing Goodness-of-Fit for a Single Categorical Variable

How many redhorse individuals have been measured for LEN_MM (total length) in this dataset? (1/4 pt)

What years does this dataset span? (1/4 pt)

Which redhorse species is most abundant in Minnesota? (1/4 pt)

Describe the distribution shape of LEN_MM for redhorse overall. (1 pt)


Review and grade 3 videos of linear regression

Watch the videos and complete the Excel sheet by giving a grade based on the criteria and writing a small review of 3 stat project videos.


ANOVA Google Sheets Homework

Suppose we are interested in studying the effect of different types of fertilizer on the yield of wheat crops. We have collected data from four different farms, each using a different type of fertilizer. The yield in tons per acre for each farm is shown below:

  • Farm 1: 2.5, 3.1, 3.5, 3.8, 4.2
  • Farm 2: 2.1, 2.5, 2.9, 3.2, 3.7
  • Farm 3: 2.8, 3.2, 3.6, 3.9, 4.1
  • Farm 4: 2.3, 2.7, 3.0, 3.3, 3.8

To analyze these data using One-Way ANOVA, we would treat the type of fertilizer (i.e., the four different farms) as the categorical independent variable and the yield as the dependent variable. Use One-Way ANOVA in Google Sheets to find whether there is a significant difference in crop yield between the four farms.

Turn in a Word document answering the following:

Step 1: State the hypotheses.

Step 2: Set a criterion for a decision.

Step 3: Collect sample data and compute an F statistic.

Step 4: Make a statistical decision.

Step 5: Draw a conclusion and interpret results.
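Step 3 can be checked against a direct computation of the F statistic. The sketch below is in Python rather than Google Sheets and uses the standard between-groups and within-groups sums of squares. The critical value 3.24 for F(3, 16) at alpha = .05 is an assumption taken from a standard F table, not from the assignment.

```python
farms = {
    "Farm 1": [2.5, 3.1, 3.5, 3.8, 4.2],
    "Farm 2": [2.1, 2.5, 2.9, 3.2, 3.7],
    "Farm 3": [2.8, 3.2, 3.6, 3.9, 4.1],
    "Farm 4": [2.3, 2.7, 3.0, 3.3, 3.8],
}


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Between-groups: group size times squared distance of each group mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-groups: squared distance of each value from its own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_between, df_within = one_way_anova(list(farms.values()))

# Assumption: tabled critical value of F(3, 16) at alpha = .05.
CRITICAL_F_3_16 = 3.24

print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
print("reject H0" if f_stat > CRITICAL_F_3_16 else "retain H0")
```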


Stat 490 Group project

Due 04/26

In experimental design, designing an experiment is as important as analyzing the results. Through the semester we have learned four ways to design an experiment:

  1. Design an experiment that is BIBD.
  2. Design an experiment that is an optimal design.

For these two types of designs you need to use R to help you, except in simple cases. Even for the same input, you may end up with different designs if you rerun the code. For example, the runs in the same treatment group may end up in different blocks. Or you may end up with a different subset of the total runs that is still optimal.

  3. Design a blocked experiment with confounding structure.
  4. Design a fractional experiment with alias structure.

For these two types of designs you can use either R or Excel to help you design the experiment. With the same confounding structure and the same alias structure, you will end up with the same design.

In this group project, you are asked to design 4 experiments and analyze the results. There is an Excel file called “group_response.csv” on Canvas, with two variables in it. These will be the response variables for your analysis below.

Experiment 1

  1. Design a BIBD that has 6 treatment groups and 10 blocks. What is the number of runs in each treatment group? What is the number of runs in each block? What is the value of λ?
  2. If the response variable is (sorted by treatment and block), analyze the results. Is there a significant main effect at treatment group? Is there a significant block effect?
  3. Analyze the residual to check whether there are any potential concerns about the validity of the assumptions.
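Before generating the BIBD in R, the design parameters can be narrowed down with the two standard counting conditions for a balanced incomplete block design: b·k = t·r and λ·(t − 1) = r·(k − 1), where t is the number of treatments, b the number of blocks, k the block size, r the number of replicates per treatment, and λ the number of blocks in which each pair of treatments appears together. The sketch below lists the block sizes that satisfy both conditions. These conditions are necessary but not sufficient, and the function name `bibd_candidates` is hypothetical.

```python
def bibd_candidates(t, b):
    """List (k, r, lambda) combinations meeting the BIBD counting conditions."""
    candidates = []
    for k in range(2, t):  # incomplete blocks: 2 <= k < t
        if (b * k) % t != 0:
            continue  # r = b*k/t must be a whole number
        r = b * k // t
        if (r * (k - 1)) % (t - 1) != 0:
            continue  # lambda = r*(k-1)/(t-1) must be a whole number
        lam = r * (k - 1) // (t - 1)
        candidates.append({"k": k, "r": r, "lambda": lam})
    return candidates


print(bibd_candidates(t=6, b=10))
```

For 6 treatments and 10 blocks, only one block size passes both checks, which gives a target to compare the R output against.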

 

Experiment 2

  1. Suppose that factor A has 3 levels, and factors B and C each have 2 levels. Assuming your budget allows only 30 runs, design a D-optimal experiment with these 3 factors such that a model with all first-order terms and the second-order term for A can be estimated, and there are 3 replicates in each treatment combination.
  2. If the response variable again is (sorted by treatment A, B, C), analyze the results. Is there a significant main effect for factor A, B or C?
  3. Have appropriate interaction plots and explain whether the interaction is significant or not. Develop the final regression model and have relevant contour plot from your final model.
  4. Analyze the residual to check whether there are any potential concerns about the validity of the assumptions.

 

Experiment 3

  1. Design a blocked experiment. Choose two third-order or higher-order interactions to be confounded with the blocks. To choose the confounded interaction terms, use every group member’s name. Pick one distinct letter (A-E) from each person’s name and form the interaction.
  2. If the response variable is (sorted by the order from the output from conf.design() function, that is the order of Blocks, E, D, C, B, A), analyze the results. Identify the significant factors and develop your model.
    1. Be careful that when you run Yates analysis the data should be in standard order, while the output from conf.design() function is not in standard order.
  3. Is there a significant block effect?
  4. Write down the complete confounding structure. Confirm the confounding structure using SS.
  5. Analyze the residual to check whether there are any potential concerns about the validity of the assumptions. Analyze dispersion effect if there is any.

Experiment 4

  1. Design a fractional experiment. To choose the generators for the design, use every group member’s name. Pick one distinct letter (A-G) from each person’s name and form the generator.
  2. If the response variable again is , analyze the results. Identify the potentially significant factors and develop your model.
    1. Use the same order as your data in experiment 3; that is, you can just add two more factors to your experiment 3 data using the generators you have.
    2. Be careful that the conf.design() function is based on 0/1 coding, while the defining relation is based on -1/1 coding.
  3. Write down the alias structure for the main effects and two-factor interactions (ignore higher-order interactions) and confirm the resolution of the design.
  4. Analyze the residual to check whether there are any potential concerns about the validity of the assumptions. Analyze dispersion effect if there is any.

 

 


Census Education

Using the data in the Excel file Census Education Data, construct a joint probability distribution for marital status and educational status. You can make the diagram in Excel.

  1. What is the probability that a person is divorced and has a bachelor’s degree?
  2. What is the probability that someone known to have an advanced degree is married with spouse present?
  3. What is the probability that an individual who is known to have never married has at least an associate’s degree?