QUESTIONS
Part one
1. Suppose you have two datasets with the following summary statistics:
- Dataset A: mean = 10, variance = 25
- Dataset B: mean = 15, variance = 36
Which dataset has more variability, and why? Show your calculations.
- Calculate the sample variance of the following data: 5, 8, 10, 12, 15. Show your work and round to two decimal places.
- Given a dataset with a mean of 100 and a variance of 49, calculate the standard deviation of the dataset. What does the standard deviation tell us about the dataset?
- A company tracks the sales of two products, A and B, over a period of 10 months. The mean sales for product A is $500, and the mean sales for product B is $700. The sample variances for the two products are 1000 and 2500, respectively. Which product has more variability in sales, and why? Show your calculations.
- Suppose you have a dataset with a mean of 50 and a standard deviation of 10. You add a constant value of 5 to each data point in the dataset. What happens to the mean and standard deviation of the new dataset, and why? Show your calculations.
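Hand calculations for the Part one questions can be checked with a short script. The sketch below is one way to do that with Python's standard `statistics` module. The helper name `coefficient_of_variation` is hypothetical, and reporting it next to the raw variance is an assumption for illustration, not something the assignment prescribes. Whether "more variability" should be judged by variance alone or relative to the mean is a choice to justify from the Module 2 materials. The last step reuses the list from the sample variance question purely as illustrative values, since the question about adding a constant gives no raw data.

```python
import math
import statistics


def coefficient_of_variation(mean, variance):
    """Standard deviation divided by the mean (hypothetical helper)."""
    return math.sqrt(variance) / mean


# Sample variance divides by n - 1, which is what statistics.variance does.
data = [5, 8, 10, 12, 15]
sample_var = statistics.variance(data)
print(f"sample variance: {sample_var:.2f}")

# The standard deviation is the square root of the variance.
sd_from_variance = math.sqrt(49)
print(f"standard deviation for variance 49: {sd_from_variance:.2f}")

# Two ways to compare spread: raw variance, and spread relative to the mean.
summaries = [
    ("Dataset A", 10, 25),
    ("Dataset B", 15, 36),
    ("Product A", 500, 1000),
    ("Product B", 700, 2500),
]
for name, mean, variance in summaries:
    cv = coefficient_of_variation(mean, variance)
    print(f"{name}: variance = {variance}, CV = {cv:.4f}")

# Adding a constant shifts the mean by that constant
# but leaves the standard deviation unchanged.
shifted = [x + 5 for x in data]
print("means:", statistics.mean(data), statistics.mean(shifted))
print("standard deviations:", statistics.stdev(data), statistics.stdev(shifted))
```

The script only confirms the arithmetic; the written answers still need the calculations shown step by step.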
Part two
Video: TED Talk – Why You Should Love Statistics
Video: TED Talk – Lies, Damned Lies and Statistics
1. In the TED Talk – Why You Should Love Statistics, list 5 reasons the speaker gives for why you should love statistics. Do you agree with any of these? Why or why not?
2. In the TED Talk – Lies, Damned Lies and Statistics, explain in detail why, according to the speaker, deception is so prevalent in statistics. What would you do to help rectify this issue?
3. In the TED Talk – How Statistics Supports our Intuition, list 5 key points the speaker makes about how statistics supports our intuition. Do you agree with these 5 points? Explain.
4. In the TED Talk – The Need for Statistical Literacy, list 5 points the speaker makes about why we need statistical literacy. Do you agree with these points? Explain.
5. Watch the lecture videos and the TED Talks and complete the readings before attempting this assignment. Remember that all work must be in your own words, with no cutting or pasting from the Internet, and all responses must come from the Module 2 materials, not outside materials.
Complete the assignment in a Word document. To show calculations, you may work on paper, take a picture, and upload the image.
- What is bias and how does it affect the field of statistics?
- What are the Halo Effect and the Horn Effect? How do they affect our interpretation of things?
- What is the difference between a population and a sample?
- What are deceptive statistics and why are they bad/dangerous/misleading?
- What is the difference between descriptive statistics and inferential statistics?
- What are outliers and how are they determined to be outliers?
- What is the difference between variance and standard deviation?
- Note the following Population Set:
P = {2, 5, 7, 4, 11, 15, 23, 34, 5, 19, 1200}
a. List any outliers
b. Calculate the mean.
c. Calculate the mode.
d. Calculate the median.
e. Calculate the range.
f. Calculate the IQR.
g. Calculate the variance.
h. Calculate the standard deviation.
i. List the 5 Number Summary.
j. Draw a box plot of the values (indicating the outliers, if any)
- Note the following Sample Set:
S = {120, 177, 201, 292, 156, 118, 220, 243, 167}
a. List any outliers
b. Calculate the mean.
c. Calculate the mode.
d. Calculate the median.
e. Calculate the range.
f. Calculate the IQR.
g. Calculate the variance.
h. Calculate the standard deviation.
i. List the 5 Number Summary.
j. Draw a box plot of the values (indicating the outliers, if any)
- What is the difference between continuous data and discrete data?
- What are the effects of changing units, or having inconsistent units in statistical analysis?
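The summary statistics requested for the Population Set and the Sample Set can be cross-checked in code. The sketch below uses Python's standard `statistics` module. The helper name `summarize` is hypothetical, and two choices in it are assumptions rather than course requirements: quartiles use the "exclusive" method of `statistics.quantiles`, and outliers are flagged with the common 1.5 × IQR rule. Other quartile conventions can give different values for Q1, Q3, and the IQR, so use the method taught in the course materials if it differs.

```python
import statistics


def summarize(values, is_population):
    """Return summary statistics for a list of numbers (hypothetical helper)."""
    ordered = sorted(values)

    # Assumption: "exclusive" quartiles; other conventions shift Q1 and Q3.
    q1, _, q3 = statistics.quantiles(ordered, n=4, method="exclusive")
    median = statistics.median(ordered)
    iqr = q3 - q1

    # Assumption: the 1.5 * IQR rule decides what counts as an outlier.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr

    modes = statistics.multimode(ordered)
    if len(modes) == len(ordered) and len(ordered) > 1:
        modes = []  # every value occurs exactly once, so there is no mode

    # A population divides by n; a sample divides by n - 1.
    if is_population:
        variance = statistics.pvariance(ordered)
    else:
        variance = statistics.variance(ordered)

    return {
        "outliers": [x for x in ordered if x < low_fence or x > high_fence],
        "mean": statistics.mean(ordered),
        "modes": modes,
        "median": median,
        "range": ordered[-1] - ordered[0],
        "iqr": iqr,
        "variance": variance,
        "std_dev": variance ** 0.5,
        "five_number_summary": (ordered[0], q1, median, q3, ordered[-1]),
    }


population = [2, 5, 7, 4, 11, 15, 23, 34, 5, 19, 1200]
sample = [120, 177, 201, 292, 156, 118, 220, 243, 167]

for label, values, is_pop in [("P", population, True), ("S", sample, False)]:
    print(label)
    for key, value in summarize(values, is_pop).items():
        print(f"  {key}: {value}")
```

The box plot is not drawn here; the five-number summary and the outlier list are the values needed to sketch it by hand.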
Part three
- What is meant by the vital phrase “Correlation is not equal to Causation”? Give an example of this.
- What are outliers in data? Should you use them or get rid of them? Why or why not? Can outliers tell you anything about the population or sample?
Part four
What are the potential benefits and drawbacks of AI for society and the economy?
Assignment: Exploring the Housing Market in Python
Objective: To gain practical experience in importing and managing data in Python, and to use data visualization techniques to explore trends in the housing market.
Task:
- Obtain a dataset of housing prices and related variables such as square footage, number of rooms, location, etc.
- Import the data into Python using the Pandas library.
- Perform any necessary data cleaning and preparation, such as handling missing values and removing duplicates.
- Explore the data using data visualization techniques, such as histograms, scatter plots, and heatmaps.
- Identify any trends or patterns in the data that you find interesting, and write a brief report (2-3 pages) summarizing your findings.
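The import, cleaning, and exploration steps above can be sketched as follows. This is a minimal outline under stated assumptions, not a required approach. The column names (`price`, `sqft`, `rooms`, `location`) and the few inline rows are made-up placeholders that stand in for a real dataset, and the helper names `load_and_clean` and `plot_overview` are hypothetical. Filling missing square footage with the median and dropping rows with no price is one reasonable choice among several. For real work, pass the path of your own CSV file to `load_and_clean` instead of the inline text.

```python
import io

import pandas as pd

# Made-up placeholder rows, NOT real housing data. They include one exact
# duplicate, one missing price, and one missing square footage on purpose.
RAW_CSV = """price,sqft,rooms,location
250000,1400,3,North
310000,1800,4,South
310000,1800,4,South
,1200,2,East
425000,,5,West
198000,1100,2,East
365000,2100,4,North
"""


def load_and_clean(source):
    """Read a CSV, drop duplicate rows, and handle missing values."""
    df = pd.read_csv(source).drop_duplicates()
    # Rows without a price cannot be analysed, so drop them.
    df = df.dropna(subset=["price"])
    # Assumption: fill missing square footage with the column median.
    df = df.assign(sqft=df["sqft"].fillna(df["sqft"].median()))
    return df.reset_index(drop=True)


def plot_overview(df):
    """Draw a price histogram and a price vs. square footage scatter plot."""
    # Imported here so the cleaning step runs even without matplotlib.
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    df["price"].plot.hist(ax=axes[0], bins=10, title="Price distribution")
    df.plot.scatter(x="sqft", y="price", ax=axes[1],
                    title="Price vs. square footage")
    fig.tight_layout()
    return fig


housing = load_and_clean(io.StringIO(RAW_CSV))
print(housing.describe())

# Pairwise correlations between the numeric columns.
correlations = housing[["price", "sqft", "rooms"]].corr()
print(correlations)
```

With Matplotlib installed, `plot_overview(housing)` returns a figure that can be shown or saved. For the heatmap step, the `correlations` table can be passed to `seaborn.heatmap`.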
Resources:
- Pandas documentation: https://pandas.pydata.org/docs/
- Matplotlib documentation: https://matplotlib.org/stable/contents.html
- Seaborn documentation: https://seaborn.pydata.org/