Python Question

Instructions: All code should be submitted as PDF and not as a picture (you can use pictures for your flowcharts only). PDFs should be submitted as a primary resource, and a zip file including the .ipynb file and any additional files (for instance, a picture or pdf for your flowchart) as a secondary resource. If you do not submit the pdf as a primary resource, you will be penalized. If you do not submit an ipynb file, you will be penalized.

Failure to comply with the instructions will result in a grade of 0 on the relevant portions of the assignment. Your instructor will grade your submission based on what you submitted. Failure to submit an assignment, or submitting an assignment for another class, will result in a grade of 0 without the opportunity to resubmit. Make sure that you submit your original work. Suspected plagiarism cases will be treated as possible academic misconduct and will be reported to the College Academic Integrity Committee for formal investigation. As part of this procedure, your instructor may require you to meet with them for an oral exam on the assignment.

**Important note:** You can use either Anaconda or Colab to work on the Jupyter notebook that you will submit as your final project on Forum:

1 – Start by downloading this Jupyter notebook to your local machine.

2 – Open a tab in your browser and type https://colab.research.google.com/.

3 – This will open a small window. Choose the last option on the upper menu, “Upload”. Then choose the Jupyter notebook you saved in step 1.

4 – You can start working on your assignment by answering the questions in the corresponding cells.

5 – If you have any questions, please reach out to your instructors and the CIS tutors.

Overview

This assignment will allow you to practice algorithmic thinking and basic Python programming with several small-scale problems. As you solve each problem, follow the steps of algorithmic thinking as outlined below.

NOTE: you only need to provide an algorithm, flowchart and test cases for part 2 (no algorithm/flowchart/test cases are needed for part 1).

Step 1: Algorithm Description. Write a step-by-step description and draw a flowchart that express the algorithm for the given task. Remember, you have to be explicit and clear enough that someone else could accomplish the task by following your directions. Describe the input(s), output(s) and the process of the algorithm.

Step 2: Program Code – Implementation: Implement the algorithm in Python using the basic structures we covered in class (ONLY USE CONCEPTS COVERED IN CLASS):

  • User input
  • Variables
  • Operators
  • Conditional execution
  • For/while loops
  • Data structures
  • Functions and modules
  • Pandas

Step 3: Program Testing: Create a Test Plan with two or three test cases that demonstrate your code works as intended. Explain how you used these test cases in your comments.

Step 4: Program Documentation: Be sure to comment thoroughly so that it is clear that you understand what every line of the code is intended to accomplish.

Part 1: Data Analysis and Visualization

You will work with a dataset that contains information on a coffee shop’s sales. The dataset is below. DOWNLOAD THE DATASET AS A CSV FILE ON YOUR COMPUTER FROM THE LINK BELOW AND READ IT IN PANDAS FROM THERE. DO NOT READ IT FROM THE LINK BELOW.

Dataset: https://drive.google.com/file/d/141afTVoF0J2FjpLI-VfERyJM7aWUQ8az/view?usp=sharing

Variables:

  • transaction_id – transaction id
  • transaction_date – transaction date
  • transaction_time – transaction time
  • sales_outlet_id – sales outlet (A, B, C, D, E, F or G)
  • staff_id – id of the staff member
  • customer_id – ID of the customer
  • instore_yn – whether the sale was in the store (yes or no)
  • product_id – id of the product
  • quantity – quantity purchased
  • unit_price – price per unit (item) in USD
  • promo_item_yn – whether the item was on promotion (yes or no)

Question 1.

Import the CSV file in pandas and save it as a dataframe. Then, write code that returns: (1) the first 10 and last 10 rows; and (2) the number of rows and columns in the data set. Discuss what the output shows you about the data set.
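As a rough illustration of the calls involved (not a model answer), the sketch below reads a tiny, made-up CSV from memory and inspects it. The sample rows are hypothetical; in the assignment you would pass the path of the file you downloaded to pd.read_csv instead.

```python
import io
import pandas as pd

# Hypothetical stand-in for the downloaded file. In the assignment, use
# pd.read_csv("<path to your local CSV>") instead of this in-memory text.
sample_csv = io.StringIO(
    "transaction_id,sales_outlet_id,quantity,unit_price\n"
    "1,A,2,2.50\n"
    "2,B,1,3.00\n"
    "3,A,4,2.00\n"
)
df = pd.read_csv(sample_csv)

first_rows = df.head(10)   # first 10 rows (fewer if the table is shorter)
last_rows = df.tail(10)    # last 10 rows
n_rows, n_cols = df.shape  # shape is a (rows, columns) tuple

print(first_rows)
print(last_rows)
print("rows:", n_rows, "columns:", n_cols)
```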

Question 2.

Write code that returns: (1) the distribution of sales outlets (including a count of each outlet type and a bar chart); (2) the minimum and maximum transaction_id; (3) the minimum, maximum and average customer_id; and (4) the distribution of products bought in store (yes or no) using a pie chart.
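The summaries requested here map onto a few standard pandas calls. The sketch below uses a small hypothetical dataframe (the values, including the "yes"/"no" coding of instore_yn, are assumptions and may not match the real file). The charts are left as a comment because they additionally require matplotlib.

```python
import pandas as pd

# Hypothetical toy rows; replace with the dataframe read from your CSV.
df = pd.DataFrame({
    "transaction_id": [1, 2, 3, 4],
    "sales_outlet_id": ["A", "B", "A", "C"],
    "customer_id": [10, 20, 30, 40],
    "instore_yn": ["yes", "no", "yes", "yes"],
})

# Count of transactions per outlet
outlet_counts = df["sales_outlet_id"].value_counts()

# Smallest and largest transaction id
id_min = df["transaction_id"].min()
id_max = df["transaction_id"].max()

# Minimum, maximum and mean of customer_id in one call
customer_stats = df["customer_id"].agg(["min", "max", "mean"])

# Count of in-store versus not in-store sales
instore_counts = df["instore_yn"].value_counts()

# With matplotlib installed, the charts could be drawn with:
#   outlet_counts.plot(kind="bar")
#   instore_counts.plot(kind="pie")
print(outlet_counts)
print(id_min, id_max)
print(customer_stats)
print(instore_counts)
```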

Question 3.

You discover that the variable unit_price was incorrectly recorded. Create a new variable unit_price_corrected where you add 1.50 to unit_price for the first 100 items, and you subtract 1.50 from the unit price for the remaining items in the data set. Then, calculate and compare the average of unit_price and unit_price_corrected.
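One possible way to build the corrected column is a loop over row positions, as sketched below. It assumes "the first 100 items" means the first 100 rows in file order; the helper name add_corrected_price and the toy prices are hypothetical, and n_first is set to 2 in the demo only so the effect is visible on four rows.

```python
import pandas as pd

def add_corrected_price(df, n_first=100, delta=1.50):
    """Return a copy of df with a unit_price_corrected column.

    Rows at positions 0..n_first-1 get +delta, all later rows get -delta.
    """
    out = df.copy()
    corrected = []
    for position, price in enumerate(out["unit_price"]):
        if position < n_first:
            corrected.append(price + delta)
        else:
            corrected.append(price - delta)
    out["unit_price_corrected"] = corrected
    return out

# Hypothetical toy data
toy = pd.DataFrame({"unit_price": [2.0, 3.0, 4.0, 5.0]})
result = add_corrected_price(toy, n_first=2)

# Compare the two averages
print(result["unit_price"].mean(), result["unit_price_corrected"].mean())
```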

Question 4.

The coffee shop’s management wants to find out which of the outlets has the highest revenue. Calculate the total revenue for each of the outlets. Remember that total revenue will be unit_price_corrected multiplied by quantity. Also, present your calculations using a line graph. Explain what you found and what the chart shows.
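The revenue calculation described above can be sketched as a new column followed by a group-wise sum. The rows below are hypothetical, and the revenue column name is an assumption introduced for illustration.

```python
import pandas as pd

# Hypothetical toy rows; in the assignment, use your full dataframe.
df = pd.DataFrame({
    "sales_outlet_id": ["A", "B", "A"],
    "unit_price_corrected": [2.0, 3.0, 1.0],
    "quantity": [2, 1, 4],
})

# Revenue per transaction = corrected unit price * quantity
df["revenue"] = df["unit_price_corrected"] * df["quantity"]

# Total revenue per outlet, and the outlet with the largest total
revenue_by_outlet = df.groupby("sales_outlet_id")["revenue"].sum()
top_outlet = revenue_by_outlet.idxmax()

# With matplotlib installed: revenue_by_outlet.plot(kind="line")
print(revenue_by_outlet)
print("highest revenue:", top_outlet)
```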

Question 5.

The coffee shop’s management wants to find out how the staff are doing in terms of sales. For each staff id, calculate the total product units sold and the total revenue generated. Provide two bar charts (one for total product units, one for total revenue) by staff id, and interpret your findings.
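Both staff totals can be computed in a single grouped aggregation, as in the sketch below. The data is hypothetical, and the revenue column is assumed to have been created beforehand as unit_price_corrected multiplied by quantity.

```python
import pandas as pd

# Hypothetical toy rows with a precomputed revenue column.
df = pd.DataFrame({
    "staff_id": [1, 2, 1],
    "quantity": [2, 1, 4],
    "revenue": [4.0, 3.0, 4.0],
})

# One row per staff id, with total units and total revenue
staff_summary = df.groupby("staff_id").agg(
    total_units=("quantity", "sum"),
    total_revenue=("revenue", "sum"),
)

# With matplotlib installed, each column can be charted separately:
#   staff_summary["total_units"].plot(kind="bar")
#   staff_summary["total_revenue"].plot(kind="bar")
print(staff_summary)
```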

Question 6.

Develop one question yourself that can be answered with the information included in this dataset. Write the code to answer the question, and include a visualization.

Question 7.

Develop a second question yourself, different from the one in Question 6, that can be answered with the information included in this dataset. Write the code to answer the question, and include a visualization.

 

Part 2

You are hired to develop an online management system for a café. This program will be used by the café admins to help them manage online orders. Write a function that implements the following features:

  1. Allow the café admin to enter the menu items until the user enters quit to stop. The list should include a minimum of 10 items. For example: main_categories = ["Americano", "Espresso", "Cheese sandwich"]
  2. Use the main menu list you created in step 1 to create a dictionary that contains the price of each menu item. For example: items_price = {"Americano": 13, "Espresso": 9, "Cheese sandwich": 15}
  3. Use the main menu list you created in step 1 to create another dictionary that contains the quantity of each menu item. For example: items_quantity = {"Americano": 50, "Espresso": 30, "Cheese sandwich": 10}
  4. Use the main menu list you created in step 1 to create another dictionary that allows the café admin to record the rating received from customers on menu items. The ratings are scored on a scale from 1 to 5, with 5 indicating the maximum customer satisfaction. For example: items_rating = {"Americano": 4, "Espresso": 1, "Cheese sandwich": 5}
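The "enter items until quit" behaviour in step 1 is a sentinel-controlled loop. The sketch below is one possible shape, not a required design: the function name collect_menu_items and its read_line parameter are hypothetical, and read_line exists only so the loop can be demonstrated with scripted answers instead of live typing. Ignoring blank and duplicate entries is also an assumption.

```python
def collect_menu_items(read_line=input, minimum=10):
    """Collect menu item names until the user types quit.

    quit is only accepted once at least `minimum` items were entered.
    """
    items = []
    while True:
        entry = read_line("Enter a menu item (or quit): ").strip()
        if entry.lower() == "quit":
            if len(items) >= minimum:
                break
            print("Please enter at least", minimum, "items before quitting.")
            continue
        # Assumption: skip blank entries and duplicates
        if entry and entry not in items:
            items.append(entry)
    return items

# Scripted demo with a lower minimum so the example stays short
answers = iter(["Americano", "Espresso", "quit", "Cheese sandwich", "quit"])
demo_items = collect_menu_items(read_line=lambda prompt: next(answers), minimum=3)
print(demo_items)
```

In a notebook, calling collect_menu_items() with no arguments would prompt the admin interactively. The price, quantity and rating dictionaries in steps 2 to 4 could then be filled by looping over the returned list and asking for one value per item.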

Your function should return the following data structures separately:

  1. The dictionary that includes all entries.
  2. A list named satisfied_item, which includes the items with satisfaction of 3 or higher.
  3. A list named highprice_item, which includes the items with price above 10.
  4. A list named few_items, which includes the items with quantity less than 5.
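The four return values listed above can be produced in one pass over the menu. The sketch below is a minimal illustration using the example values from this handout. The function name summarize_menu is hypothetical, and reading "the dictionary that includes all entries" as one nested dictionary per item is an assumption; check with your instructor if a different structure is expected.

```python
def summarize_menu(items_price, items_quantity, items_rating):
    """Return (all_items, satisfied_item, highprice_item, few_items)."""
    all_items = {}        # assumption: item name -> its price, quantity, rating
    satisfied_item = []   # rating of 3 or higher
    highprice_item = []   # price above 10
    few_items = []        # quantity less than 5
    for name in items_price:
        all_items[name] = {
            "price": items_price[name],
            "quantity": items_quantity[name],
            "rating": items_rating[name],
        }
        if items_rating[name] >= 3:
            satisfied_item.append(name)
        if items_price[name] > 10:
            highprice_item.append(name)
        if items_quantity[name] < 5:
            few_items.append(name)
    return all_items, satisfied_item, highprice_item, few_items

# Example values taken from the handout
items_price = {"Americano": 13, "Espresso": 9, "Cheese sandwich": 15}
items_quantity = {"Americano": 50, "Espresso": 30, "Cheese sandwich": 10}
items_rating = {"Americano": 4, "Espresso": 1, "Cheese sandwich": 5}

all_items, satisfied_item, highprice_item, few_items = summarize_menu(
    items_price, items_quantity, items_rating
)
print(satisfied_item, highprice_item, few_items)
```

A call like this, with known inputs and expected lists worked out by hand, can double as one of the test cases for your test plan.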

For part 2 only: First, create a step-by-step algorithm and a flowchart and then translate it into a fully functional and documented Python code. Follow the flowchart shape conventions from the session 3 reading, available here.