
Exercise #3:

The dataset provides the Herfindahl–Hirschman Index (HHI) and Herfindahl index categories. Use the herf_cat variable to answer the following questions:

Note: “The Herfindahl–Hirschman Index is a commonly accepted measure of market concentration used by antitrust enforcement agencies and scholars in the field. The HHI is calculated by squaring the market share of each firm competing in the market and then summing the resulting numbers” (NASI, 2015; pp: 14-16).

Read more from here:

https://www.urban.org/sites/default/files/publication/50116/2000212-Addressing-Pricing-Power-in-Health-Care-Markets.pdf
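The calculation described in the note can be sketched in a few lines of Python. This is only an illustration: the function name hhi and the four market shares are hypothetical, and shares are assumed to be expressed in percent.

```python
def hhi(shares_percent):
    """Herfindahl-Hirschman Index: sum of the squared market shares (in percent)."""
    return sum(share ** 2 for share in shares_percent)

# Hypothetical market: four hospitals holding 40%, 30%, 20% and 10% of the market.
shares = [40, 30, 20, 10]
print(hhi(shares))  # 1600 + 900 + 400 + 100 = 3000
```

A single firm with 100% of the market gives the maximum value of 10,000, so larger values indicate a more concentrated (less competitive) market.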

For this exercise you do not need to compute the HHI, but if you have any questions, please do not hesitate to ask. However, try to learn more about it, as you will need that background to report your findings.

Using the dataset from week 1 exercise, answer the following questions.

  • Compare the following information between hospitals located in high, moderate, and low competitive markets (Table 3).
  • What are the main significant differences between hospitals in different markets? (Use an ANOVA test.)
  • Using density curves or boxplots, compare hospital cost and revenues across the three markets.
  • What is the impact of being in a high-competitive market on hospital revenues and cost? Do you think being in a high-competitive market has a positive impact on hospital net benefits? What about the number of Medicare and Medicaid discharges? Do you think hospitals in high-competitive market are more likely to accept more Medicare and Medicaid patients? What is the impact of other variables? Discuss your findings in 1-2 paragraphs.

Table 3. Comparing hospital characteristics and market, 2011 and 2012

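Column groups: High Competitive Market, Moderate Competitive Market, and Low Competitive Market each report N, Mean, and St. Dev; the final column reports the ANOVA/Chi-Sq (results) P value.

| Hospital Characteristics | High: N | High: Mean | High: St. Dev | Moderate: N | Moderate: Mean | Moderate: St. Dev | Low: N | Low: Mean | Low: St. Dev | ANOVA/Chi-Sq P value |
|---|---|---|---|---|---|---|---|---|---|---|
| Hospital Beds | | | | | | | | | | |
| Number of paid Employees | | | | | | | | | | |
| Number of non-paid Employees | | | | | | | | | | |
| Interns and Residents | | | | | | | | | | |
| System Membership | | | | | | | | | | |
| Total Hospital Cost ($) | | | | | | | | | | |
| Total Hospital Revenues ($) | | | | | | | | | | |
| Hospital Net Benefit ($) | | | | | | | | | | |
| Available Medicare Days | | | | | | | | | | |
| Available Medicaid Days | | | | | | | | | | |
| Total Hospital Discharges | | | | | | | | | | |
| Total Medicare Discharges | | | | | | | | | | |
| Total Medicaid Discharges | | | | | | | | | | |
| Herfindahl Index | | | | | | | | | | |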

Round mean and standard deviation figures to the nearest whole number (except for the Herfindahl index). Use 2 decimal places for p values.
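As a rough illustration of what the ANOVA test in this exercise computes for one row of Table 3, the sketch below derives the one-way F statistic by hand. The function name one_way_anova_f and the bed counts are hypothetical and are not taken from the course dataset; in practice a statistics package would also report the p value.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical bed counts for hospitals in the three market categories.
high = [120, 150, 180]
moderate = [200, 230, 260]
low = [300, 330, 360]

f_stat, df1, df2 = one_way_anova_f([high, moderate, low])
print(round(f_stat, 2), df1, df2)  # 27.11 2 6
```

A large F relative to the F distribution with (df1, df2) degrees of freedom suggests that at least one market category differs in its mean.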

 


Analyzing real estate data

You have been hired by the Regional Real Estate Company to help them analyze real estate data. One of the company’s Pacific region salespeople just returned to the office with a newly designed advertisement. The average cost per square foot of home sales based on this advertisement is $280. The salesperson claims that the average cost per square foot in the Pacific region is less than $280. In other words, he claims that the newly designed advertisement would result in higher average cost per square foot in the Pacific region. He wants you to make sure he can make that statement before approving the use of the advertisement. In order to test his claim, you will generate a random sample of size 750 using data for the Pacific region and use this data to perform a hypothesis test.

Prompt:

Generate a sample of size 750 using data for the Pacific region. Then, design a hypothesis test and interpret the results using significance level α = .05.

You will work with this sample in the assignment.

Briefly describe how you generated your random sample. Use the House Listing Price by Region Spreadsheet document and the National Summary Statistics and Graphs House Listing Price by Region PDF documents to help support your work on this assignment.

You may also use the Descriptive Statistics in Excel PDF and Creating Histograms in Excel PDF tutorials for support.

Address the following, using the Module Five Assignment Template Word Document

Hypothesis Test Setup: Define your population parameter, including hypothesis statements, and specify the appropriate test.

  • Define your population parameter.
  • Write the null and alternative hypotheses.
  • Specify the name of the test you will use.
  • Identify whether it is a left-tailed, right-tailed, or two-tailed test.
  • Identify your significance level.

Data Analysis Preparations: Describe sample summary statistics, provide a histogram and summary, check assumptions, and find the test statistic and significance level.

  • Provide the descriptive statistics (sample size, mean, median, and standard deviation).
  • Provide a histogram of your sample.
  • Describe your sample by writing a sentence describing the shape, center, and spread of your sample.
  • Determine whether the conditions to perform your identified test have been met.

Calculations: Calculate the p value, describe the p value and test statistic in regard to the normal curve graph, discuss how the p value relates to the significance level, and compare the p value to the significance level to reject or fail to reject the null hypothesis.

  • Calculate the sample mean and standard error.
  • Determine the appropriate test statistic, then calculate the test statistic.
    Note: This calculation is (mean – target)/standard error. In this case, the mean is your regional mean (Pacific), and the target is 280.
  • Calculate the p value.
    Note: For a right-tailed test, use the T.DIST.RT function in Excel; for a left-tailed test, use the T.DIST function; for a two-tailed test, use the T.DIST.2T function. The degrees of freedom are calculated by subtracting 1 from your sample size. Choose your test from the following:
      =T.DIST.RT([test statistic], [degree of freedom])
      =T.DIST([test statistic], [degree of freedom], 1)
      =T.DIST.2T([test statistic], [degree of freedom])
  • Using the normal curve graph as a reference, describe where the p value and test statistic would be placed.
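The calculation steps above can be cross-checked outside Excel. The following is a minimal Python sketch using only the standard library. The helper names (t_pdf, t_cdf, one_sample_t) and the eight sample values are hypothetical, and the t distribution's cumulative probability is approximated here by numerical integration rather than by Excel's own routines.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] plus symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def one_sample_t(sample, target):
    """Test statistic (mean - target) / standard error, and degrees of freedom."""
    n = len(sample)
    standard_error = stdev(sample) / math.sqrt(n)
    return (mean(sample) - target) / standard_error, n - 1

# Hypothetical cost-per-square-foot values; the assignment uses a sample of 750.
sample = [265, 270, 275, 280, 260, 255, 285, 250]
t_stat, df = one_sample_t(sample, 280)

p_left = t_cdf(t_stat, df)                 # counterpart of =T.DIST(t, df, 1)
p_right = 1 - p_left                       # counterpart of =T.DIST.RT(t, df)
p_two = 2 * (1 - t_cdf(abs(t_stat), df))   # counterpart of =T.DIST.2T(ABS(t), df)
print(round(t_stat, 3), df, round(p_left, 4))
```

Only one of the three p values applies to a given test; which one depends on the direction of the alternative hypothesis chosen in the setup step.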

Test Decision: Discuss the relationship between the p value and the significance level, including a comparison between the two, and decide to reject or fail to reject the null hypothesis. Discuss how the p value relates to the significance level. Compare the p value and significance level, and make a decision to reject or fail to reject the null hypothesis.

Conclusion: Discuss how your test relates to the hypothesis and discuss the statistical significance. Explain in one paragraph how your test decision relates to your hypothesis and whether your conclusions are statistically significant.

 

 


Real estate industry

Provide a general introduction, background, and purpose of the paper, with your main statement resting on the idea of using statistical analysis to achieve better business decisions and increase profitability and business activity.

Also include a discussion of the real estate industry and the factors that influence the health, viability, and success of the real estate marketplace, particularly in the Northeastern region of the U.S.

Let’s suppose you know the owner of a local real estate company in a mid-sized Northeastern city of the United States. The name of your acquaintance’s company is “North Valley Real Estate” (NVRE). Your acquaintance has a large clientele and a top-notch staff of salespeople. Luckily, the company also has some good data on the homes that NVRE has sold throughout the community over the years. This data can be found as an Excel dataset file located in the “Files” section of the course (left-hand side menu bar).

As you may have guessed, the owner of NVRE wants to use the data to find insights into the home-selling market, and how this same insight can be used to increase their profit and business activity. As such, your knowledge of statistics and research applications can provide a valuable service to your friend and the NVRE company; and they have hired you as their consultant.

Using the data identified above, embark upon an analysis effort and create an accompanying report for the owner of North Valley Real Estate.

Your purpose in the project is to glean information, knowledge, and insight from the provided data that can be used to increase the profitability of the firm (NVRE). Your research should align to the CLOs provided for the course and lean heavily on Units 4-8 of the course for the statistical application/analysis, specifically the statistical tool of regression.

The Capstone Term Project Report is separated into preliminary parts for a total of 150 points (four assignments worth 25 points each, and one worth 50 points, spanning Units 2-6).

The final draft is also worth 150 points (for a grand total of 300 points). The paper is to be written in APA style, 7th edition. Deviation from APA style is strongly discouraged and will be penalized on the final draft.

 


Frequency Distribution and Histogram

Review a classmate’s thread and their chosen quantitative variable.

  1. Using the classmate’s frequency distribution, construct a histogram (for continuous variables) or bar chart (for discrete variables).
  2. Copy and paste the histogram/bar chart into your reply or attach it as a separate file.
  3. State two unique observations that can be made based on the histogram/bar chart.



Generate a Representative Sample of the Data

Smart businesses in all industries use data to provide an intuitive analysis of how they can get a competitive advantage. The real estate industry heavily uses linear regression to estimate home prices, as cost of housing is currently the largest expense for most families. Additionally, in order to help new homeowners and home sellers with important decisions, real estate professionals need to go beyond showing property inventory. They need to be well versed in the relationship between price, square footage, build year, location, and so many other factors that can help predict the business environment and provide the best advice to their clients.

Prompt
You have recently been hired as a junior analyst by D.M. Pan Real Estate Company. The sales team has tasked you with preparing a report that examines the relationship between the selling price of properties and their size in square feet. You have been provided with a Real Estate Data Spreadsheet that includes properties sold nationwide in recent years. The team has asked you to select a region, complete an initial analysis, and provide the report to the team.

Note: In the report you prepare for the sales team, the response variable (y) should be the listing price and the predictor variable (x) should be the square feet.

Specifically, you must address the following rubric criteria, using the Module Two Assignment Template Word Document:

Generate a Representative Sample of the Data

  • Select a region and generate a simple random sample of 30 from the data.
  • Report the mean, median, and standard deviation of the listing price and the square foot variables.
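The two steps above can be sketched in Python as follows. This is an illustration under stated assumptions: region_rows is a made-up stand-in for the rows of one region in the spreadsheet, and the fixed seed is there only so the draw can be reproduced.

```python
import random
from statistics import mean, median, stdev

random.seed(42)  # fixed seed so the same sample can be drawn again

# Hypothetical stand-in for one region: (listing price in $, square feet).
region_rows = [(150_000 + 1_000 * i, 1_000 + 10 * i) for i in range(200)]

# Simple random sample of 30 rows, drawn without replacement.
sample = random.sample(region_rows, 30)

prices = [price for price, _ in sample]
sqft = [area for _, area in sample]

for name, values in (("listing price", prices), ("square feet", sqft)):
    print(name, round(mean(values)), round(median(values)), round(stdev(values)))
```

Because every row has the same chance of being selected and no row can be picked twice, this matches the usual definition of a simple random sample.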

Analyze Your Sample

  • Discuss how the regional sample created is or is not reflective of the national market.
  • Compare and contrast your sample with the population using the National Summary Statistics and Graphs Real Estate Data PDF document.
  • Explain how you have made sure that the sample is random.
  • Explain your methods to get a truly random sample.

Generate Scatterplot

  • Create a scatterplot of the x and y variables noted above and include a trend line and the regression equation.

Observe patterns

Answer the following questions based on the scatterplot:

  • Define x and y. Which variable is useful for making predictions?
  • Is there an association between x and y? Describe the association you see in the scatter plot.
  • What do you see as the shape (linear or nonlinear)?
  • If you had a 1,800 square foot house, based on the regression equation in the graph, what price would you choose to list at?
  • Do you see any potential outliers in the scatterplot?
  • Why do you think the outliers appeared in the scatterplot you generated? What do they represent?
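To see how the trend line's regression equation produces a listing price for an 1,800 square foot house, here is a minimal least-squares sketch. The function name fit_line and the five data points are hypothetical; your own slope and intercept will come from your sample of 30.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical sample: x = square feet (predictor), y = listing price in $ (response).
sqft = [1200, 1500, 1800, 2100, 2400]
price = [210_000, 250_000, 300_000, 335_000, 380_000]

slope, intercept = fit_line(sqft, price)
predicted = intercept + slope * 1800
print(round(slope, 2), round(intercept), round(predicted))  # 141.67 40000 295000
```

The slope is read as the estimated change in listing price for each additional square foot, which is why x (square feet) is the variable used for making predictions.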

You can use the following tutorial that is specifically about this assignment. Make sure to check the assignment prompt for specific numbers used for national statistics and/or square footage. The video may use different national statistics or solve for different square footage values.


DATA ANALYTICS

Task 1: Data Architecture Analysis (25 Marks)

            (Write in third person  – academic style)

The set text provides an end state architecture to which organisations should aspire. It is shown below.

Snapshot from: Inmon, W. and Linstedt, D. (2019) Data Architecture: A Primer for the Data Scientist, 2nd edition. Academic Press, p. 48.

Devise a big-data-oriented application relevant to your organisation (or organisation like your own) and consider how that application could be represented using the given architecture as a foundation.

Provide a description of the application, stating the main data components, their sources and their relationships. Justify the data components included.

Amend the diagram given above to include the data components of your application.

Explain how big data techniques might be used to harness and process the data within your application.

 

Task 2: Data Analysis (40 marks)

                (Write in third person – academic style)

You will carry out a data analysis and produce visualisations. You will need a data set which can be analysed.  You will also need a research question.

  (a) Research question and hypotheses (10 marks)

Identify a research question that might be usefully answered using your analytics record. Develop hypotheses.

Evaluate the potential impact of insights that might occur following exploration of the research question.

  (b) Dataset Generation (10 marks)

Identify data from your application that might be analysed to provide business insights and in particular to answer your research question in (a) above. Based on your selection, create a data set with at least 1000 rows. You can create the data set either using real data (suitably anonymised) or, if this is not possible, you will need to generate a realistic data set. You will need to specify realistic shape and relationships within the data in order to generate realistic data.

Briefly explain why the data chosen has been selected and the reasoning behind your design of the data shape and relationships. The data set will form your analytics record.

Include an appendix that describes the meta-data of your data set together with a sample of some rows.
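Where real data cannot be used, a generated data set with a specified shape and relationship might be sketched as below. Everything in it is an assumption made for illustration: the column names, the distributions, and the rule that spend rises with the number of visits are hypothetical and should be replaced by the shape and relationships justified for your own application.

```python
import random

random.seed(7)  # fixed seed so the generated data set is reproducible

def make_rows(n=1000):
    """Generate n synthetic customer records with a built-in visits/spend relationship."""
    rows = []
    for customer_id in range(1, n + 1):
        # Shape: visit counts are roughly bell-shaped around 12 and never negative.
        visits = max(0, round(random.gauss(12, 4)))
        # Relationship: spend rises with visits; the noise keeps it imperfect.
        spend = max(0.0, 25.0 * visits + random.gauss(0, 40))
        # Categorical column with an 80/20 split.
        segment = random.choices(["retail", "business"], weights=[0.8, 0.2])[0]
        rows.append({
            "customer_id": customer_id,
            "visits": visits,
            "spend": round(spend, 2),
            "segment": segment,
        })
    return rows

rows = make_rows()
for row in rows[:3]:  # sample of rows, as might appear in the appendix
    print(row)
```

Stating the intended distributions and relationships in code like this also doubles as documentation for the meta-data appendix.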

  (c) Hypothesis Testing (10 marks)

Analyse the data against the hypothesis.

Carry out suitable statistical significance testing.

Evaluate results and justify statistical significance testing method selected.

  (d) Analysis and Visualisation (10 marks)

Using Power BI or another suitable visualisation tool, create at least three visualisations from your data set. Provide a discussion of the visualisations selected, explaining how they were created and what additional insight they bring.

 

Task 3: Data Governance (15 marks)

            (Write in third person – academic style)

 Outline a data governance framework suitable for the organisation. Justify the components included and outline the responsibilities of the data governance function.

You must be critical (contrast both positives and negatives) in your evaluation.

Tip: use words such as “however” and “on the other hand”.

 

Task 4: Evaluation (10 marks)

            (Write in first person – reflective style)

 Evaluate your experience in carrying out the assignment. What went well and what was your response to any challenges? Briefly discuss your main points of learning. Again, criticality and looking at both sides (what went well and what did not) are very important.

Tip: use words such as “however” and “on the other hand”.

 

Academic Conventions (10 marks)

The standard of academic writing will be considered as well as the report presentation and structure.  Harvard referencing style is expected.  Sections should be numbered.  Figures and tables should be numbered and should have captions. Pages should be numbered.  Appropriate front pages should be used.


Submit a Word-knitted version of the completed R Markdown file found in this zip file

Q1 [2% of total marks]

First, as always, let’s visualise the data. In the code chunk for this question make the appropriate figure visualizing the attendance for each of the venues. Your figure should show each of the individual data points as well as a different geom displaying a summary for each venue, a colour scheme (other than the default) with different colours for each venue and the colours of the data points matching the geom, no legend, and of course a title, informative axis labels, and a nice theme.

Q2 [2% of total marks]

There are several assumptions you could test, but for the sake of simplicity we will focus on the normality assumption (you may presume that homogeneity of variance holds). Evaluate normality using the appropriate test. In 30 words or less, report the statistical test and indicate what the results of the code reveal (including stats reference). Was the assumption violated? How can you tell?

Q3 [4% of total marks]

Run the appropriate statistical test to evaluate the research hypothesis given Q2. Report on the results in 60 words or less. In your report, don’t worry about including descriptive statistics but do include an explanation of which statistical test you used, what the predictor and outcome variables were, the appropriate stats reference, the effect size and its interpretation, and the interpretation of this data in terms of the research question.

Q4 [4% of total marks]

Perform post-hoc pairwise tests and in 40 words or less, report the tests and which venues (if any) had significantly lower attendance than others, along with their p-values. You do not need to report the non-significant venues or their p-values.
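Post-hoc pairwise tests usually adjust the p values for the number of comparisons made. The assignment itself is completed in R Markdown and does not name an adjustment method, so the Python sketch below is only an illustration of one common option, the Bonferroni correction. The venue names and unadjusted p values are hypothetical.

```python
from itertools import combinations

venues = ["A", "B", "C"]               # hypothetical venue labels
pairs = list(combinations(venues, 2))  # every pairwise comparison

# Hypothetical unadjusted p values from the pairwise tests.
raw_p = {("A", "B"): 0.004, ("A", "C"): 0.030, ("B", "C"): 0.400}

# Bonferroni: multiply each p value by the number of comparisons, capped at 1.
m = len(pairs)
adjusted = {pair: min(1.0, p * m) for pair, p in raw_p.items()}

significant = [pair for pair, p in adjusted.items() if p < 0.05]
print(adjusted)
print(significant)  # [('A', 'B')]
```

Note that the A versus C comparison is below 0.05 before adjustment but not after, which is why only adjusted p values should be reported.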


Nursing Research Protocol

Paper Guidelines for writing and presenting your project protocol.
Throughout, justification / defence of chosen methods should be provided from available literature.

I. Introduction (approx. 100 words):
Provide a brief outline / signposting to your project protocol.
Introduce your topic, concisely describe your chosen topic, and present your research question (using framework PICOT).

II. Research design and rationale for chosen design. (approx. 600 words):
Depending on the type of study (research design) to be conducted, a different tool/checklist should be used. To assist with research protocol preparation, the following checklists can be used as a template:
| Tool / checklist name | Hyperlinked abbreviated name | Intended study type |
|---|---|---|
| Strengthening the Reporting of Observational studies in Epidemiology | STROBE checklist | Observational studies |
| Consolidated Standards of Reporting Trials | CONSORT Statements | Randomised controlled trials |
| Standards for Reporting Qualitative Research | SRQR recommendations | Qualitative research |
| Standards for Quality Improvement Reporting Excellence | SQUIRE guidelines | Quality improvement studies |

III. Setting (approx. 275 words):
This should include details of how participants will be sampled. State the inclusion and exclusion criteria. Explore how your sample will be accessed.

IV. Data collection (approx. 275 words):
Explore data collection methods. Include a timeframe for data collection.
Data collection tools presented in an appendix will not be included in assessment word count.
Explore potential ethical considerations.

V. Data management and analysis (approx. 325 words):
Describe how data will be managed.
Present a data analysis plan.

VI. Potential implications for practice (approx. 250 words):
Present how your intended research will improve practice.

VII. Conclusion (approx. 100 words):
Provide a summary of the main points covered in your project protocol. All claims and opinions should be supported by available evidence.


Statistics Discussion Questions

Instructions:

-Use Canadian/US English, and make sure your writing is clear, with no grammar or punctuation errors

-Answer each discussion question separately

-Provide a response to EACH of the following discussion questions

-Minimum of 225-275 words for each response (for each question)

-Everything must be in YOUR OWN WORDS

-Do NOT copy or plagiarize anything from online sources

-No need to provide any sources or references. This is because everything must be in your own words

-You’re simply providing an answer to each of the discussion questions mentioned below

-Make sure to answer ALL parts of each question, answer the questions fully

 

1. Sample size can play an important role in gathering and analyzing data. Describe this importance. Can samples of small size still be relevant? Why or why not? What constitutes a “small sample”?

OR

Explain in your own words the “power of a statistical test”. Do you proceed with the test if you don’t have enough “power”? How does the power affect the results of the test?

 

2. Can statistical analysis be done on distributions that are not “normal”? Explain in detail.

OR

Are all data sets best described with graphs? Talk about when a data set might not benefit from being displayed graphically.

 

3. Many times, multiple variables can be correlated, affecting the outcome of the dependent variable. Describe, in detail, the process for determining if more than one variable contributes to the outcome of a single dependent variable.

OR

How accurate is a regression analysis and how do you know? What attributes of the analysis will determine whether the analysis is accurate and to what extent? Can inaccurate regression analyses be used to an analyzer’s benefit? Explain in detail.

 

4. For reference, this is an online statistics course (undergraduate level). Just come up with something to answer this question (it can be anything). Your answer can be generic; there is no need to be too specific.

Qs. 4: Name at least one thing that you really enjoy about the class (come on, there must be something!) and at least one improvement that could make this class even better. Feel free to editorialize. Also, this is a good time to make a “keep-quit-start” resolution: What are you doing to help yourself learn in this class that you are going to keep doing because it’s working? What are you doing that is hindering you from learning in this class that you are going to quit doing? What do you think will help you improve your learning in this class that you are going to start doing?

 

5. Sometimes it is necessary to compare the means of several variables. When might this be the case? Why?

OR

Comparing two means can be tricky in statistics since usually one of the groups is manipulated in some type of way (i.e., given a placebo). How do statisticians overcome this obstacle to generate good data?

 

6. There has been a lot of discussion so far about effect size. In your own words, describe what is meant by effect size and why it is important when analyzing data.

OR

What are the pros and cons of using Factorial Designs? What can and should you do to minimize the cons?

 

7. Talk about factorial designs, repeated measures designs, and mixed designs. Explain in your own words the difference between these. Give specific examples for which you would choose one method instead of the others.

OR

Explain in detail a “Goodness-of-Fit” test and when and why this test would be used versus any other analysis test.

 

8. For reference, this is an online statistics course. Just come up with something to answer this question (it can be anything). Your answer can be generic; there is no need to be too specific.

Qs. 8: Tell us about your best tips for succeeding in this course and what your future educational plans are. (Assuming you are just about to finish undergraduate level). In terms of this class, you’re the expert — what most helped you to succeed in this course, or what do you wish you’d done that you think would have helped you most to succeed? Your comments can help us make this a better class for future statistics students. In terms of the future – what’s next? What classes are you taking in the upcoming semester? How close are you to tossing that mortarboard skyward? What are your plans post-graduation? This is your chance to give feedback and say good-bye!