Practical Data Analytics
Task 1: Data Architecture Analysis (25 marks)
(Write in third person – academic style)
The set text provides an end-state architecture to which organisations should aspire. It is shown below.
Snapshot from: Inmon, W. and Linstedt, D. (2019) Data Architecture: A Primer for the Data Scientist. 2nd edn. Academic Press, p. 48.
Devise a big-data-oriented application relevant to your organisation (or an organisation like your own) and consider how that application could be represented using the given architecture as a foundation.
Provide a description of the application, stating the main data components, their sources and their relationships. Justify the data components included.
Amend the diagram given above to include the data components of your application.
Explain how big data techniques might be used to harness and process the data within your application.
Task 2: Data Analysis (40 marks)
(Write in third person – academic style)
You will carry out a data analysis and produce visualisations. You will need a data set which can be analysed. You will also need a research question.
- Research question and hypotheses (10 marks)
Identify a research question that could usefully be answered using your analytics record. Develop hypotheses.
Evaluate the potential impact of the insights that might arise from exploring the research question.
- Dataset Generation (10 marks)
Identify data from your application that might be analysed to provide business insights and, in particular, to answer the research question identified above. Based on your selection, create a data set with at least 1000 rows. You can create the data set using real data (suitably anonymised) or, if this is not possible, by generating a realistic data set. To generate realistic data, you will need to specify a realistic shape for the data and realistic relationships within it. If you use real data, make sure you have permission from your organisation and that your use complies with your organisation’s data governance policy.
Briefly explain why the data chosen has been selected and the reasoning behind your design of the data shape and relationships. The data set will form your analytics record.
Include an appendix that describes the metadata of your data set, together with a sample of rows.
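As an illustration only, a generated data set with a specified shape and relationships could be sketched in Python as follows. Everything in the sketch is a hypothetical assumption rather than a required design: the retail-order scenario, the field names (order_id, channel, basket_value, items), the distribution parameters and the function names generate_rows and write_csv. Your own data set should reflect your application and research question.

```python
import csv
import random

def generate_rows(n_rows=1000, seed=42):
    """Return n_rows synthetic order records with built-in relationships."""
    rng = random.Random(seed)  # fixed seed so the data set is reproducible
    channels = ["online", "store"]  # hypothetical categorical field
    rows = []
    for order_id in range(1, n_rows + 1):
        channel = rng.choices(channels, weights=[0.6, 0.4])[0]
        # Shape: basket values are right-skewed, so draw from a log-normal.
        # Relationship: online orders get a slightly higher typical value.
        mu = 3.6 if channel == "online" else 3.5
        basket_value = round(rng.lognormvariate(mu, 0.5), 2)
        # Relationship: more expensive baskets tend to contain more items.
        items = max(1, round(basket_value / 15 + rng.gauss(0, 1)))
        rows.append({"order_id": order_id, "channel": channel,
                     "basket_value": basket_value, "items": items})
    return rows

def write_csv(rows, path):
    """Write the records to a CSV file that a visualisation tool can import."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)

rows = generate_rows()
```

Calling write_csv(rows, "analytics_record.csv") would save the records to a file (the file name is also an assumption). The skewed distribution and the dependencies between fields are the kind of design decisions you would be expected to explain and justify in your report.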
- Hypothesis Testing (10 marks)
Analyse the data against the hypotheses.
Carry out suitable statistical significance testing.
Evaluate the results and justify the statistical significance testing method selected.
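As one possible illustration of significance testing, the sketch below implements a two-sided permutation test for a difference in means between two groups, using only the Python standard library. The permutation test is chosen here purely as an example; it is not a prescribed method, and the function name permutation_test, its parameters and the sample values are hypothetical. A different test (for example a t-test or a chi-squared test) may suit your data and hypotheses better, and whichever method you select must be justified.

```python
import random

def _mean(values):
    return sum(values) / len(values)

def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the observed difference (mean of a minus mean of b)
    and an estimated p-value.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = _mean(group_a) - _mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are interchangeable,
        # so shuffle the pooled values and split them again.
        rng.shuffle(pooled)
        diff = _mean(pooled[:n_a]) - _mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add one to numerator and denominator so the estimate is never zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value

# Hypothetical sample values, for illustration only.
online = [41.2, 38.5, 52.0, 47.3, 36.9, 44.1, 50.8, 39.7]
store = [35.4, 33.0, 40.2, 37.8, 31.5, 36.6, 42.9, 34.1]
difference, p_value = permutation_test(online, store)
```

The p-value would then be compared with a significance level chosen in advance (0.05 is a common convention) to decide whether the null hypothesis can be rejected.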
- Analysis and Visualisation (10 marks)
Using Power BI or another suitable visualisation tool, create at least three visualisations from your data set. Provide a discussion of the visualisations selected, explaining how they were created and what additional insight they bring.
Task 3: Data Governance (15 marks)
(Write in third person – academic style)
Outline a data governance framework suitable for the organisation. Justify the components included and outline the responsibilities of the data governance function.
Task 4: Evaluation (10 marks)
(Write in first person – reflective style)
Evaluate your experience in carrying out the assignment. What went well and what was your response to any challenges? Briefly discuss your main points of learning.
Academic Conventions (10 marks)
The standard of academic writing will be considered as well as the report presentation and structure. Roehampton Harvard (https://library.roehampton.ac.uk/ld.php?content_id=32542499) referencing style is expected. Sections should be numbered. Figures and tables should be numbered and should have captions. Pages should be numbered. Appropriate front pages should be used.