Statistics Archives - Page 24 of 29

Discuss what strategies you have implemented to get things done.

August 12, 2020/in Statistics /by Eunice

In this journal, you have the opportunity to discuss privately with your instructor the status of your final project. At this point, you have completed all of your milestones and have received feedback, and you should be preparing to hand in your final project in Module Nine. What questions do you have for your instructor that might impact your final project? Do you feel that you understand the project? Are you on track to hand it in on time? This is a chance to alert your instructor if you need help, and to seek assistance and feedback. If you feel that you are in a good position, discuss what strategies you have implemented to get things done.

What is the impact of using a linear regression model in this case?

August 12, 2020/in Statistics /by Eunice

Module 7 Discussion:

For your initial post, choose one of the following two prompts to respond to. Then, in your two follow-up posts, respond at least once to each option. Use the discussion topic as a place to ask questions, speculate about answers, and share insights. Be sure to embed and cite your references for any supporting images.

Option 1:

Think of a problem dealing with two possibly related variables (Y and X) that you may be interested in. Share your problem and discuss why a regression analysis could be appropriate for this problem.

Specifically, what statistical questions are you asking? Why would you want to predict the value of Y? What if you wanted to predict a value of Y that’s beyond the highest value of X (for example if X is time and you want to forecast Y in the future)?

You should describe the data collection process that you are proposing but you do not need to collect any data.

Option 2:

Give an example of a problem dealing with two possibly related variables (Y and X) for which a linear regression model would not be appropriate. For example, the relationship could be curved instead of linear, or there may be no significant correlation at all.

What is the impact of using a linear regression model in this case? What options, other than linear regression, can you see? You do not need to collect any data.

For your response to a classmate (two responses required, one in each option), examine your classmate’s problem to assess the appropriateness and accuracy of using a linear regression model. Discuss the meaning of the standard error of the estimate and how it affects the predicted values of Y for that analysis.
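As a rough illustration of Option 2, the sketch below fits a least-squares line to a curved relationship. The data (y = x² on a symmetric range) and the helper names `fit_line` and `pearson_r` are hypothetical and are not part of the course material.

```python
from math import sqrt

def fit_line(xs, ys):
    """Least-squares slope m and intercept b for y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)

# Hypothetical curved data: Y depends perfectly on X, but not linearly.
xs = list(range(-5, 6))
ys = [x ** 2 for x in xs]

m, b = fit_line(xs, ys)
r = pearson_r(xs, ys)

# Residuals = observed Y minus the Y predicted by the fitted line.
residuals = [y - (m * x + b) for x, y in zip(xs, ys)]

print("slope:", m, "intercept:", b, "r:", r)
print("residuals:", residuals)
```

In this made-up case the fitted line is flat and r is zero even though Y is completely determined by X, and the residuals are positive at both ends and negative in the middle. A systematic residual pattern like this is one sign that a linear model is not appropriate.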

Study Material for this Discussion:

Requirements for a Regression Analysis

The calculation of the linear regression line Y = mX + b in Module Six is strictly a mathematical one. The statistical methods that analyze the significance of the correlation between X and Y, however, require specific statistical assumptions.

Even when X and Y have a linear relationship, individual values of Y will not be exactly equal to the predicted values mX + b. The difference is called the error, written as ε (the Greek letter epsilon). This error is a random quantity that can be different for every data point. If (x₁, y₁) is the first data value, then the first error ε₁ is:

ε₁ = y₁ − (mx₁ + b)

There are two main requirements that must be met in order to perform a statistical analysis:

1. X and Y are linearly related, with a random deviation (error) affecting each measurement. In regression analysis theory, the population parameter for the slope is usually written as β₁ (the Greek letter beta, instead of m) and the population parameter for the intercept is usually written as β₀ (instead of b). The linear model is thus: Y = β₁X + β₀ + ε
2. The errors ε (one for each data point) are independent of each other and normally distributed with mean 0 and the same standard deviation.

With those two requirements, techniques for hypothesis testing can be used.

Hypothesis Tests of the Slope

In Module Six, the significance of the correlation coefficient r was tested; however, it is much more common to test the significance of the slope. The formal statement of a hypothesis test for the slope is:

H0: β₁ = 0
H1: β₁ ≠ 0

The alternative hypothesis is usually two-tailed. The test statistic is the estimate of the slope, b₁ (also written as m), divided by its standard error; when H0 is true, it has a t distribution with n − 2 degrees of freedom (n is the number of data values). The calculation of the standard error of b₁, the estimate of β₁, can be lengthy; the use of software (StatCrunch or Excel) for the computations is strongly encouraged.

As with previous hypothesis tests, this test can use either the classical method (critical value) or the p-value method. Rejecting the null hypothesis indicates that there is a significant correlation between X and Y. Not rejecting the null hypothesis indicates that the data is inconclusive.

Hypothesis Tests of the Intercept

In addition to hypothesis tests of the slope, hypothesis tests of the intercept can be performed. The parameter β₀ is the value of Y when X = 0. The value β₀ = 0 may be of interest, but other values could be meaningful too. For example, in an economic model where Y is the total cost and X is the number of units produced, β₀ can be interpreted as the fixed cost. The formal statement of a hypothesis test for the intercept is:

H0: β₀ = some number
H1: β₀ ≠ some number

where the number is chosen to be appropriate for the application under study. The calculation of the standard error of b₀, the estimate of β₀, can be lengthy, and students are strongly encouraged to use software (StatCrunch or Excel) for the computations.

The Multiple Linear Regression Model

The linear regression models studied so far use only one independent variable, X. In real life, there are often multiple independent or explanatory variables that should be considered. A multiple linear regression model is one that uses multiple independent variables (X₁, X₂, X₃, …, Xₖ) to model one dependent variable (Y). A model that uses only one independent variable is called a simple linear regression. The equation for the multiple linear regression line is:

Y = β₁X₁ + β₂X₂ + … + βₖXₖ + β₀

where X₁, X₂, …, Xₖ are k independent variables, β₁, β₂, …, βₖ are their coefficients (i.e., slopes), and β₀ is the intercept. As for simple linear regression, the errors ε (one for each data point) need to be independent of each other and normally distributed with mean 0 and the same standard deviation.

Interpreting Multiple Linear Regression Coefficients

Analyzing multiple linear regression models is complicated because the variables X₁, X₂, …, Xₖ are usually correlated among themselves. The coefficient β₁ measures the effect of the variable X₁ on Y, but only when the values of X₂, X₃, …, Xₖ do not change. Rejecting the null hypothesis β₁ = β₂ = … = βₖ = 0 (using either the p-value method or the classical method with a critical value of the F distribution) means that the variables X₁, X₂, …, Xₖ, taken as a whole, are significantly correlated with Y. It does not support a conclusion about any specific variable.
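The slope test described in the study material can be sketched in a few lines of Python. The eight data points are invented for illustration, and the critical value 2.447 is the two-tailed 5% value of the t distribution with 6 degrees of freedom taken from a standard table. The course itself recommends StatCrunch or Excel, so treat this as one possible way to check the arithmetic, not as the course's method.

```python
from math import sqrt

# Hypothetical sample (not from the course data).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

b1 = sxy / sxx               # estimated slope
b0 = mean_y - b1 * mean_x    # estimated intercept

# Standard error of the estimate: typical size of the errors around the line.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
df = n - 2
s_e = sqrt(sse / df)

# Standard errors of the slope and intercept estimates.
se_b1 = s_e / sqrt(sxx)
se_b0 = s_e * sqrt(1 / n + mean_x ** 2 / sxx)

# Test H0: beta1 = 0 against H1: beta1 != 0 (classical method).
t_stat = b1 / se_b1
T_CRITICAL = 2.447  # table value for alpha = 0.05, two-tailed, df = 6
reject_h0 = abs(t_stat) > T_CRITICAL

print(f"b1 = {b1:.4f}, b0 = {b0:.4f}, s_e = {s_e:.4f}")
print(f"t = {t_stat:.2f} with {df} degrees of freedom, reject H0: {reject_h0}")
```

A larger standard error of the estimate widens the uncertainty around every predicted Y and, through `se_b1`, makes it harder to reject H0.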

What is the relationship between years of experience of the instructor and the scores obtained on the final exam given at the conclusion of the courses?

August 12, 2020/in Statistics /by Eunice

Create a professional, client-ready document for Teton Grand where a correlation is run on a set of client data. Specifically, what is the relationship between years of experience of the instructor and the scores obtained on the final exam given at the conclusion of the courses? You will need to decide if you are using a Pearson correlation or a Spearman correlation for this analysis.

In approximately 2 pages, explain the results and potential action steps for Teton Grand based on what you find. Adhere to APA format and guidelines (i.e., citing sources appropriately, including references, and using APA-formatted tables to show data). Data file and variable key are attached.
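To make the Pearson versus Spearman choice concrete, here is a minimal sketch that computes both coefficients. The `years` and `scores` values are invented, since the attached client data file is not reproduced here, and the helper names are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: strength of the linear relationship."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data: scores rise with experience, but the gains level off.
years = [1, 2, 3, 5, 8, 13, 21]
scores = [60, 65, 69, 72, 74, 75, 76]

print(f"Pearson r    = {pearson_r(years, scores):.3f}")
print(f"Spearman rho = {spearman_rho(years, scores):.3f}")
```

In this made-up example the relationship is monotonic but not linear, so Spearman's coefficient is higher than Pearson's. Which one suits the client data depends on the shape of the relationship, the measurement level of the variables, and the presence of outliers.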

Describe in detail the dependent and independent variable(s) that you believe would be appropriate.

August 12, 2020/in Statistics /by Eunice

A written posting that is 250 to 450 words (i.e., a narrative less than 2 pages in APA format) that is embedded in the discussion board (no attachments).
Your initial/original posting should address the situation/points below.

Discuss an example of a professional decision-making situation for which you believe it would potentially be beneficial to use simple or multiple linear regression analysis to improve the quality of the decision-making process. You shall be expected to select a situation that involves a decision-making process that is complex and involves important outcomes. Your posting shall be expected to address each of the following matters:

Describe the situation in detail, including describing the complexity of the decision(s) being made and the importance of the decision(s) being made.
Describe in detail the dependent and independent variable(s) that you believe would be appropriate.
Describe in detail how you envision the proposed quantitative analysis technique(s) that would potentially improve decision-making relative to the described situation.
Describe in detail any potential challenges or impediments to using simple or multiple linear regression analysis in the described situation that you might foresee.

Use the discussion topic as a place to ask questions, speculate about answers, and share insights.

August 3, 2020/in Statistics /by Eunice

Module 6-1 Discussion:

Choose one of the following two prompts to respond to. In your two follow up posts, respond at least once to each prompt option. Use the discussion topic as a place to ask questions, speculate about answers, and share insights. Be sure to embed and cite your references for any supporting images.

Perform the following analysis by analyzing a possible linear relation between two variables.

Option 1:

Using the data set provided from the NOAA for Manchester, NH, select any month between January 1930 and December 1958. Use the variables “MMXT” and “MMNT” for your analysis. Begin with your chosen month and analyze the next 61 data values (i.e. 5 years and 1 month) to determine if a relationship exists between the maximum temperature (MMXT) and the minimum temperature (MMNT).

Using Excel, StatCrunch, etc., create a scatter plot for your sample. Determine the linear regression equation and correlation coefficient. Embed this scatter plot in your initial post.

For your responses to your classmates (two responses required): Discuss the relationships between the scatter plot, the correlation coefficient, and the linear regression equation for the sample. Comment on the similarities and differences between your correlation and linear regression equation and those of your classmates. Why are there differences, since you are drawing from the same population? Do you expect the differences to be large? Why or why not?
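The question of why classmates get different results from the same population can be explored with a small simulation. The sketch below does not use the NOAA data; it generates a synthetic 348-month series (the seasonal pattern, noise levels, and the choice of maximum temperature as X are all assumptions) and compares two 61-value windows.

```python
import math
import random

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x

def pearson_r(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

# Synthetic stand-in for 29 years of monthly data (January 1930 to December 1958).
rng = random.Random(0)
MONTHS = 348
max_temps, min_temps = [], []
for month in range(MONTHS):
    seasonal = 55 + 25 * math.sin(2 * math.pi * (month - 3) / 12)
    tmax = seasonal + rng.gauss(0, 4)      # made-up noise level
    tmin = tmax - 20 + rng.gauss(0, 3)     # made-up offset and noise level
    max_temps.append(tmax)
    min_temps.append(tmin)

def window_stats(start, length=61):
    """Correlation, slope, and intercept for one student's 61-value sample."""
    xs = max_temps[start:start + length]
    ys = min_temps[start:start + length]
    m, b = fit_line(xs, ys)
    return pearson_r(xs, ys), m, b

for start in (0, 120):
    r, m, b = window_stats(start)
    print(f"start month {start}: r = {r:.4f}, MMNT = {m:.4f} * MMXT + {b:.4f}")
```

Each window is a different sample from the same underlying process, so the estimates differ because of sampling variability. With 61 values and a strong relationship, the differences would usually be expected to be small.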

Option 2:

Write a mathematical scenario to describe each of the scatter plots.

Answer prompts made by fellow students with the following information:

Discuss an alternative scenario to represent the data in the scatter plots. In this scenario, assume there is a correlation, but one for which it would be inappropriate to conclude causation.

What is the 95% Confidence Interval of the Mean? Set up and perform a hypothesis test to evaluate the competitor’s claim for the mean. Use 95% confidence. Set up and perform a hypothesis test or a confidence interval to evaluate the competitor’s claim for the standard deviation. Use 95% confidence.

July 7, 2020/in Statistics /by Donald

Statistics: The modulus of tungsten alloy armor plate is typically 147 GPa, with a standard deviation of < 10 GPa. Your company is a large supplier of this armor plate to manufacturers of the M1 Abrams tank. Your competitor recently announced that his armor plate has a modulus of at least 167 GPa, with a standard deviation of < 7 GPa. This claimed greater modulus and “tighter” sigma will enable armor plate that is 20% lighter than armor made from your plate. So, you are anxious to evaluate the claim. You obtain twenty samples of this plate and measure the following moduli. (Note if there are any questionable data points. Discuss the normality of the data, but analyze as if the data were normal. These data were generated with a normal random number generator):

167.7
149.3
160.2
165.4
165.0
167.5
158.1
167.5
164.5
163.6
165.5
160.5
165.5
164.1
165.1
168.4
166.2
167.0
169.7
167.2

What is the 95% Confidence Interval of the Mean?

Set up and perform a hypothesis test to evaluate the competitor’s claim for the mean. Use 95% confidence.

Set up and perform a hypothesis test or a confidence interval to evaluate the competitor’s claim for the standard deviation. Use 95% confidence.

In a large population, what % of the samples would you expect to be below 160 GPa?

What is the chance of a Type II error if the mean should be > 162?

Discuss the results and what you would do with the data.
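One way to set up the first few calculations is sketched below, using the twenty moduli listed above. The critical values are hard-coded from standard t and chi-square tables for 19 degrees of freedom, and the one-sided form of the test on the mean (H0: μ ≥ 167) is an assumption about how to read the competitor's claim. The Type II error question is not covered here.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

moduli = [167.7, 149.3, 160.2, 165.4, 165.0, 167.5, 158.1, 167.5, 164.5, 163.6,
          165.5, 160.5, 165.5, 164.1, 165.1, 168.4, 166.2, 167.0, 169.7, 167.2]

n = len(moduli)
xbar = mean(moduli)
s = stdev(moduli)            # sample standard deviation (n - 1 in the denominator)
se = s / sqrt(n)             # standard error of the mean

# 95% confidence interval for the mean, t table value for df = 19.
T_TWO_SIDED = 2.093
ci_mean = (xbar - T_TWO_SIDED * se, xbar + T_TWO_SIDED * se)

# Assumed setup: H0: mu >= 167 versus H1: mu < 167 at alpha = 0.05.
CLAIMED_MEAN = 167
T_ONE_SIDED = 1.729          # t table value, one-tailed, df = 19
t_stat = (xbar - CLAIMED_MEAN) / se
reject_mean_claim = t_stat < -T_ONE_SIDED

# 95% confidence interval for the standard deviation, chi-square table values for df = 19.
CHI2_LOWER, CHI2_UPPER = 8.907, 32.852
ci_sd = (sqrt((n - 1) * s ** 2 / CHI2_UPPER), sqrt((n - 1) * s ** 2 / CHI2_LOWER))

# Expected fraction below 160 GPa, treating the sample estimates as the population values.
frac_below_160 = NormalDist(xbar, s).cdf(160)

print(f"mean = {xbar:.2f}, s = {s:.2f}")
print(f"95% CI for mean: ({ci_mean[0]:.2f}, {ci_mean[1]:.2f})")
print(f"t = {t_stat:.2f}, reject claim about mean: {reject_mean_claim}")
print(f"95% CI for sd: ({ci_sd[0]:.2f}, {ci_sd[1]:.2f})")
print(f"fraction below 160 GPa: {frac_below_160:.3f}")
```

The claimed standard deviation of less than 7 GPa can be compared with the interval in `ci_sd`. The second value in the list (149.3) sits far below the others and is worth flagging as a questionable data point, since it affects both the mean and the standard deviation.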

Flip a coin 10 times and record the observed number of heads and tails. We would expect the distribution of heads and tails to be 50/50. How far away from 50/50 are you for each of your three samples? Reflect on why this might happen.

July 7, 2020/in Statistics /by Donald

Flip a coin 10 times and record the observed number of heads and tails. For example, with 10 flips one might get 6 heads and 4 tails. Now, flip the coin another 20 times (so 30 times in total) and again, record the observed number of heads and tails. Finally, flip the coin another 70 times (so 100 times in total) and record your results again.

We would expect the distribution of heads and tails to be 50/50. How far away from 50/50 are you for each of your three samples? Reflect on why this might happen.
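The exercise is meant to be done with a real coin, but a simulation shows the same pattern. The sketch below assumes a fair coin modeled by Python's pseudo-random generator; the function name `flip_counts` and the seed are arbitrary.

```python
import random

def flip_counts(rng, checkpoints=(10, 30, 100)):
    """Flip a simulated fair coin and report cumulative heads/tails at each checkpoint."""
    heads = 0
    flips_done = 0
    results = []
    for target in checkpoints:
        while flips_done < target:
            if rng.random() < 0.5:   # heads with probability 0.5
                heads += 1
            flips_done += 1
        results.append((target, heads, target - heads))
    return results

rng = random.Random(2020)  # fixed seed so the run is repeatable
for total, heads, tails in flip_counts(rng):
    pct_heads = 100 * heads / total
    print(f"{total:3d} flips: {heads} heads, {tails} tails "
          f"({pct_heads:.1f}% heads, {abs(pct_heads - 50):.1f} points from 50/50)")
```

Changing the seed gives a different sequence. Over many runs, the percentage of heads tends to sit closer to 50% at 100 flips than at 10, because the relative effect of chance shrinks as the sample grows.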

Using the provided datasets of offenses reported, calculate the mean, median, mode, max, min, and range for each of the crimes. The list of crimes includes violent crime total, murder and non-negligent manslaughter, legacy rape, revised rape, robbery, aggravated assault, property crime total, burglary, larceny-theft, and motor vehicle theft.

July 1, 2020/in Statistics /by Donald

Although analyzing statistical data can be challenging, it is equally challenging to convert these data into a written format. Therefore, in this activity, you will practice the important skill of data analysis and presenting statistical information in a written format. Using the provided datasets of offenses reported, calculate the mean, median, mode, max, min, and range for each of the crimes. The list of crimes includes violent crime total, murder and non-negligent manslaughter, legacy rape, revised rape, robbery, aggravated assault, property crime total, burglary, larceny-theft, and motor vehicle theft.
The specific steps are as follows:
1. Download 1 of the following datasets of offenses from the Uniform Crime Report:
Accomack County Sheriff’s Office
Honolulu Police Department
Los Angeles Police Department
2. Calculate the mean, median, mode, max, min, and range for each of the following:
Violent crime total
Murder and non-negligent manslaughter
Legacy rape
Revised rape
Robbery
Aggravated assault
Property crime total
Burglary
Larceny-theft
Motor vehicle theft
3. Write 1 paragraph for each of the crimes, where you present the statistical results to the reader in a written format.
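The six statistics in step 2 can be computed with Python's standard `statistics` module. The yearly counts below are invented placeholders for one crime category, not values from any of the three agencies, and the function name `describe` is hypothetical.

```python
from statistics import mean, median, multimode

def describe(counts):
    """Return mean, median, mode(s), max, min, and range for a list of counts."""
    return {
        "mean": mean(counts),
        "median": median(counts),
        "modes": multimode(counts),   # a list, because there can be more than one mode
        "max": max(counts),
        "min": min(counts),
        "range": max(counts) - min(counts),
    }

# Hypothetical yearly counts for a single offense category.
burglary = [412, 398, 455, 398, 430, 377, 401]

for name, value in describe(burglary).items():
    print(f"{name}: {value}")
```

If every value occurs only once, `multimode` returns all of them, which is a signal to report that the data has no single mode.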
Reference
U.S. Department of Justice, Federal Bureau of Investigation, Uniform Crime Reporting Statistics. (2017). Welcome to a new way to access UCR statistics

Contrast how the three different populations (p₀ = 0.5) evolve over time. Include reference to your three representative graphs in your answer. Based on your data does the frequency of an allele determine its likelihood of fixation due to drift? Explain, with reference to the data you collected.

July 1, 2020/in Statistics /by Donald

You are to manipulate a two-allele, multi-generational, STOCHASTIC model of a population evolving under the influence of genetic drift. This model should allow you to assess the impact of population size on the magnitude of drift, and the likelihood of fixation.

This exercise will also give you additional practice with MS Excel and quantitative reasoning.

List the different microevolutionary processes covered in the course and contrast how they change allele frequencies over time. Apply population genetic equations to explore evolutionary processes.

This exercise will give you the opportunity to solidify your understanding of genetic drift now that we have covered the subject in lecture. The Excel file has already been set up to contain a single population of 50 individuals. You are to use copy/paste (with some editing) to add four additional populations to the graph (5 populations in total). Starting with p₀ = 0.5, you will collect data for 100 populations of size 50 by hitting the F9 key 19 times (20 recalculations in total × 5 populations = 100 populations). F9 causes Excel to recalculate, and the spreadsheet calculation is based, to some extent, on a random number. Record how many of those 100 populations fix for each of the p and q alleles (table below). Select a representative graph for screen-grabbing to turn in with this assignment. Then repeat this process with p₀ = 0.1.

Next, copy the contents of this page to the N=30 worksheet and delete the last 20 individuals (i.e., you are reducing N to 30). Make sure your graph is using data from this new page, and not the N=50 page; otherwise you will not see increased drift effects (see Select Data after right-clicking inside the graph). Again, generate the 100 populations of data for p₀ = 0.5 and p₀ = 0.1 and record the outcomes. Again, screengrab a representative graph.

Finally, copy the contents of this page to the N=10 worksheet and delete the last 20 individuals (i.e., you are reducing N to 10). Again, make sure your graph is using data from this new page, and not the N=30 page. And again, generate the 100 populations of data for p₀ = 0.5 and p₀ = 0.1 and record the outcomes. Again, screengrab a representative graph.

Contrast how the three different populations (p₀ = 0.5) evolve over time. Include reference to your three representative graphs in your answer.
Based on your data does the frequency of an allele determine its likelihood of fixation due to drift? Explain, with reference to the data you collected.
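The Excel model is not reproduced here, but the same experiment can be approximated with a short simulation. The sketch below assumes a simple resampling model in which each of the N individuals in a generation carries one allele copy drawn at random according to the previous generation's frequency. The 100-generation limit and all function names are assumptions, so the counts will not match the spreadsheet exactly.

```python
import random

def simulate_drift(n, p0, generations, rng):
    """Return the frequency of allele p in each generation, starting at p0."""
    count_p = round(n * p0)
    trajectory = [count_p / n]
    for _ in range(generations):
        freq = count_p / n
        # Each of the n individuals inherits allele p with probability freq.
        count_p = sum(1 for _ in range(n) if rng.random() < freq)
        trajectory.append(count_p / n)
    return trajectory

def tally_fixations(n, p0, generations, populations, rng):
    """Count how many simulated populations end fixed for p, fixed for q, or neither."""
    tally = {"p": 0, "q": 0, "neither": 0}
    for _ in range(populations):
        final = simulate_drift(n, p0, generations, rng)[-1]
        if final == 1.0:
            tally["p"] += 1
        elif final == 0.0:
            tally["q"] += 1
        else:
            tally["neither"] += 1
    return tally

rng = random.Random(42)   # fixed seed so the run is repeatable
GENERATIONS = 100         # assumed; use the number of generations in the spreadsheet
for n in (50, 30, 10):
    for p0 in (0.5, 0.1):
        print(f"N = {n}, p0 = {p0}: {tally_fixations(n, p0, GENERATIONS, 100, rng)}")
```

Comparing the tallies across N shows how population size affects the speed of fixation, and comparing p₀ = 0.5 with p₀ = 0.1 shows how starting frequency affects which allele tends to fix.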

In this discussion, you will explore the use of inferential and descriptive analysis in public safety agencies. Selecting 2 different public safety agencies, find 1 example displaying the use of descriptive data and 1 example reflecting inferential analysis.

July 1, 2020/in Statistics /by Donald

In this discussion, you will explore the use of inferential and descriptive analysis in public safety agencies. Selecting 2 different public safety agencies, find 1 example displaying the use of descriptive data and 1 example reflecting inferential analysis. In 2 paragraphs, you will introduce your examples to the class, name the type of analysis undertaken, and suggest other types of statistical analysis that could be undertaken with these data or by including other data.
The specific steps are as follows:
Engage in independent research, and find 2 different public safety agencies using statistical analysis.
Of your selected examples, 1 example must reflect the use of descriptive data and 1 must be an example of inferential analysis.
Write a post introducing this research to your classmates.
Engage in a discussion with your classmates on the diverse types of statistical analysis and how they are used to improve public safety.
