For your bank, what percent of its loans went to businesses in census tracts in your bank’s designated counties where the median family income is >= 50% and < 80% of the MSA/MD median family income?

Database Spring 2023

Answer each of the questions in the document directly under the question. Below all of the questions, include your code and indicate the question it addresses.

Part 1: Questions
Limit these results to loans that mapped to businesses with good addresses in census tracts with populations >= 100.

For these questions, use the American Community Survey (ACS) values: total_pop, hispanic_pop, black_pop, white_pop (`bigquery-public-data.census_bureau_acs.censustract_2020_5yr`).

1. For your bank in your bank’s designated counties, how many loans mapped to census tracts with populations >= 100?

2. For your bank in your bank’s designated counties, what percentage of those loans went to businesses in census tracts that were majority black?

3. For your bank in your bank’s designated counties, what percentage of those loans went to businesses in census tracts that were majority Hispanic?

4. For your bank in your bank’s designated counties, what percentage of those loans went to businesses in census tracts that were majority white?

(Note: Don’t break down the results by individual county.)
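The percentage questions above come down to a filter and a ratio. The Python sketch below shows that arithmetic on in-memory rows. It assumes "majority" means a group makes up more than 50% of total_pop, which the assignment does not define, and the function names and loan dictionaries are hypothetical; only the ACS column names come from the text.

```python
def is_majority(group_pop, total_pop):
    """True when the group is more than half of the tract's population (assumed definition)."""
    return total_pop > 0 and group_pop / total_pop > 0.5


def percent_to_majority_tracts(loans, group_key, min_pop=100):
    """Percent of loans in tracts with population >= min_pop where group_key is the majority.

    Each loan is a dict carrying the ACS values of the tract it mapped to.
    """
    eligible = [loan for loan in loans if loan["total_pop"] >= min_pop]
    if not eligible:
        return 0.0
    hits = sum(1 for loan in eligible if is_majority(loan[group_key], loan["total_pop"]))
    return 100.0 * hits / len(eligible)


# Made-up rows for illustration only; real values come from the ACS table.
example_loans = [
    {"total_pop": 1000, "black_pop": 600, "hispanic_pop": 100, "white_pop": 300},
    {"total_pop": 50, "black_pop": 10, "hispanic_pop": 10, "white_pop": 30},
    {"total_pop": 200, "black_pop": 50, "hispanic_pop": 120, "white_pop": 30},
    {"total_pop": 400, "black_pop": 100, "hispanic_pop": 100, "white_pop": 200},
]
print(percent_to_majority_tracts(example_loans, "black_pop"))
```

The denominator is the count of loans in tracts that pass the population filter, not the count of all loans, which matches the phrase "those loans" in the questions.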

5. For all banks that made loans to businesses in your bank’s designated counties, how many loans went to census tracts with populations >= 100?

6. For all banks that made loans to businesses in your bank’s designated counties, what percent of those loans (tract populations >= 100) went to businesses in census tracts that were majority black?

7. For all banks that made loans to businesses in your bank’s designated counties, what percent of those loans (tract populations >= 100) went to businesses in census tracts that were majority Hispanic?

8. For all banks that made loans to businesses in your bank’s designated counties, what percent of those loans (tract populations >= 100) went to businesses in census tracts that were majority white?

Part 2: Questions
Use the income table `classdata324219.ppp.census_income` (see guide below). Limit these results to loans that mapped to businesses with good addresses in tracts with populations >= 100.

9. For your bank, what percent of its loans went to businesses in census tracts in your bank’s designated counties where the median family income is > 0% and < 50% of the MSA/MD median family income (Income_Level_Ind =1)?

10. For your bank, what percent of its loans went to businesses in census tracts in your bank’s designated counties where the median family income is >= 50% and < 80% of the MSA/MD median family income?

11. For your bank, what percent of its loans went to businesses in census tracts in your bank’s designated counties where the median family income is >= 80% and < 120% of the MSA/MD median family income?

12. For your bank, what percent of its loans went to businesses in census tracts in your bank’s designated counties where the median family income is >= 120% of the MSA/MD median family income?

(Note: Don’t break down the results by individual county.)
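The four income brackets in questions 9 to 12 can be written as one function. The Python sketch below does that, using the thresholds from the questions. The function name and the text labels are hypothetical; only the value Income_Level_Ind = 1 for the lowest bracket is given in the text, so the sketch does not assume indicator values for the other three.

```python
def income_bracket(tract_mfi, msa_mfi):
    """Label a tract by its median family income as a percent of the MSA/MD median."""
    ratio = 100.0 * tract_mfi / msa_mfi
    if ratio <= 0:
        return None  # no usable income figure for the tract
    if ratio < 50:
        return "> 0% and < 50%"
    if ratio < 80:
        return ">= 50% and < 80%"
    if ratio < 120:
        return ">= 80% and < 120%"
    return ">= 120%"


# Made-up income figures for illustration only.
print(income_bracket(30000, 80000))
print(income_bracket(96000, 80000))
```

Because each lower bound is inclusive and each upper bound is exclusive, every positive ratio falls into exactly one bracket.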

13. For all banks that made loans to businesses in your bank’s designated counties, what percent of those loans (tract populations >= 100) went to businesses in census tracts where the median family income is > 0% and < 50% of the MSA/MD median family income?

14. For all banks that made loans to businesses in your bank’s designated counties, what percent of those loans (tract populations >= 100) went to businesses in census tracts where the median family income is >= 50% and < 80% of the MSA/MD median family income?

15. For all banks that made loans to businesses in your bank’s designated counties, what percent of those loans (tract populations >= 100) went to businesses in census tracts where the median family income is >= 80% and < 120% of the MSA/MD median family income?

16. For all banks that made loans to businesses in your bank’s designated counties, what percent of those loans (tract populations >= 100) went to businesses in census tracts where the median family income is >= 120% of the MSA/MD median family income?

Write at least a half-page (no more than 2 pages) proposal on your final project website. It should provide information on a topic of your choosing.

IT 238 final project

Write at least a half-page (no more than 2 pages) proposal on your final project website. It should provide information on a topic of your choosing. It may be easier to generate content for a real subject than a fictional one, but remember that you may not use copyrighted trademarks or materials.

Which one of the following is not a web browser? Which of the following is the keyboard shortcut for creating a new tab?

MICROSOFT EDGE & USING EMAIL ACTIVITIES

Activity 1

Choose the correct option for the following statements.

WWW stands for __________________.

  1. Wide World Web
  2. World Wide Web
  3. Web Wide World
  4. World Wide Website

URL stands for ____________________.

  1. Uniform Resource Locator
  2. Universal Resource Locator
  3. Universal Receive Location
  4. Uniform Rescue Locator

Which one of the following is not a web browser?

  1. Microsoft Edge
  2. Mozilla Firefox
  3. Microsoft Word
  4. Safari

Which of the following is the keyboard shortcut for creating a new tab?

  1. Ctrl + T
  2. Ctrl + Shift + T
  3. Ctrl + Shift + P
  4. Ctrl + N

 

Module One: ALL ABOUT COMPUTERS, INTERNET & MICROSOFT EDGE

Fill in the blanks with the appropriate words provided in the box.

Web Note, Microsoft Edge, Reading View, Edge Toolbar, Home Page

  1. A _______ is the main page or the opening screen of a website.
  2. _______ is the latest web browser released with Windows 10 by Microsoft.
  3. _______ feature of Edge lets you annotate a snapshot of the webpage.
  4. The ______ feature of Edge displays the current webpage as it might appear in an e-book reader.

Activity 3

Choose TRUE or FALSE for the following statements.

  1. You cannot Delete or Rename items from the Edge Favorites list.

( ) TRUE

( ) FALSE

  2. While using InPrivate browsing, Edge opens a separate window that doesn’t track your browsing info or credentials.

( ) TRUE

( ) FALSE

  3. Post Office Protocol (POP3) email service can download mails onto your computer.

( ) TRUE

( ) FALSE

  4. Spam Blocker is designed to identify and eliminate spam or junk mail.

( ) TRUE

( ) FALSE

Which key words are notably missing from a set of texts when we might expect their presence, or which words are used with surprisingly high frequency when we wouldn’t expect them?

DISCUSSION QUESTION

The question you choose to study might vary greatly in scope and specificity. In the same way you’ve chosen your own text or set of texts, you should also choose your own question to answer something that drives your interest. Here are some sample questions to give you some ideas, with parenthetical additions suggesting the kind of thing that would bear further consideration in the Discussion section of the report:

  • The words “good” and “evil” show up a lot in the Bible. Do they show up in the same places, or does one only appear when the other is absent? (And how do we interpret our results?)
  • The word “power” and related words like “powerful” play a crucial role in The Wizard of Oz. What other words are related to it, and how are these words used? (And what does that suggest about the distribution of power in Oz?)
  • Which parts of a text have smaller or bigger sentences on average? (And how does that relate to what’s happening in the text in these places?)
  • Which key words are notably missing from a set of texts when we might expect their presence, or which words are used with surprisingly high frequency when we wouldn’t expect them? (And how do these findings shift our understanding of the source material?)

Requirements: 2 pages

 

At what cache size and stride level do significant changes in access times occur? Do these timing changes correlate to typical cache sizes or changes in stride? Is it possible to determine the cache sizes of the different levels based on the produced data?

Computer Architecture

Introduction

The memory hierarchy of a given microprocessor typically is composed of at least three levels of cache between the CPU and main memory.  It is possible for a fourth level to be present in the form of eDRAM in some CPUs.  In multicore processors, each core will have its own level 1 and level 2, with level 3 shared among all the cores.  The hierarchy of the memory system is designed to provide seamless transfer of data from main memory to the CPU; it is not usually possible to determine by observation how many levels of cache a system has.

The goal of this assignment is to attempt to expose the memory hierarchy through programming. A program, written in C, is provided that exercises access through all levels of the memory hierarchy to RAM and collects performance information in the form of memory access times.  This information may provide a hint as to the structure of the memory hierarchy of the system being tested.

 

Background

The basis for this assignment comes from Case Study 2 described on pages 150 – 153 in the text book.  A program is provided that is designed to generate data that will allow timing various accesses to the memory hierarchy.  This program, which can be downloaded from Blackboard, is written in C.  A compiled version in the form of an executable for Windows systems is also provided.  For non-Windows systems, the source program should be compiled using a C compiler for the target system.

 

Additional documentation about the process is provided as an appendix to this assignment.

 

Notes About the Program

There are two important components embedded in the program.  The first is the ability to access the system clock to collect timing information of memory accesses.  Most programming languages provide a programming interface for this purpose in the form of a function or method.  The time.h header file provides this function in C and allows the capture of timing values in nanoseconds.

 

The second component is an appropriate data structure that can be accessed by dynamically varying the stride of the memory access.  The memory access stride is defined as the distance between the addresses of two successive memory accesses and is generally a power of 2.  For this assignment, the simplest data structure for our purpose is a two-dimensional array whose size must be declared large enough to encompass the largest potential cache size.  The program declares a 4096 x 4096 maximum cache size which is equivalent to 16 mebibytes.

 

Program Design

The provided program is written in C because it compiles directly to native executable files and will typically provide more accurate results on most operating systems.  You may entertain the option of adapting the sample program to another language.  Java programs compile to bytecode, which is interpreted by the JVM (Java Virtual Machine), and this additional overhead of execution may affect the timings that are collected.  Python is similar to Java in that it is an interpreted language, but it is possible to produce an executable file using an add-on utility (your option).  If you wish, you may try to rewrite the program in a language that is most convenient for you.

 

The output of the program is directed to a text file so you have a record of the timing information your program generates.  The output is also printed on the monitor so that you can follow the program’s execution.  The format of the output is the size of the cache, the stride and the time for the read/write of the array on each cache size/stride increment.  The program is set up to output the data into a comma separated text file where each row represents a cache size, each column a stride value.  This format makes it possible to import the data into Excel, which makes the analysis part of this assignment somewhat easier.

 

Program Structure

There are two nested loops: the outer loop increments through the cache sizes (from 1K to 16M) and the inner loop increments through the strides for each cache size (from 1 to cache size/2).  Within the inner loop are two do loops. The first performs repeated read/writes to the matrix.  The second repeats the loop without access to the matrix to capture the overhead of the loop.  The difference between the two times gives the data access time, which is averaged over the number of accesses per stride. This is represented by the variable loadtime in the program.

 

This program takes a long time to run because it constantly loops on each cache size and stride for 20 seconds.  Even on a fast computer, the run time can be more than 1.5 hours.  So, you need to allow enough time for the program to complete execution.
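As a rough sketch of the loop structure and run time described above, the Python snippet below enumerates the (cache size, stride) combinations and the overhead-corrected average. It assumes that both the size and the stride double on each step, which the text does not state explicitly. The function names are illustrative and are not taken from the provided C program.

```python
def powers_of_two(start, stop):
    """Yield start, 2*start, 4*start, ... up to and including stop."""
    value = start
    while value <= stop:
        yield value
        value *= 2


def measurement_grid(min_size=1024, max_size=16 * 1024 * 1024):
    """List every (size, stride) pair the nested loops would visit (doubling assumed)."""
    return [
        (size, stride)
        for size in powers_of_two(min_size, max_size)
        for stride in powers_of_two(1, size // 2)
    ]


def load_time(access_loop_total, empty_loop_total, accesses):
    """Average time per access after subtracting the loop overhead."""
    return (access_loop_total - empty_loop_total) / accesses


grid = measurement_grid()
print(len(grid))              # number of (size, stride) combinations
print(len(grid) * 20 / 3600)  # hours if each combination is timed for 20 seconds
```

Under the doubling assumption there are 255 combinations, so 20 seconds each comes to roughly 1.4 hours for the timed loops alone, before the overhead loops are counted.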

 

Observations and Analysis (What To Do For This Assignment)

Run the program on a computer system to generate a complete sequence of memory access timings. Once the program has completed, you will need to analyze the results. Using Excel or a comparable spreadsheet, you can import your data and then create graphs to show your data.  A sample graph, as presented in the textbook on page 152, is shown below.  You can use the graph you create using your own results data as a reference for your analysis and conclusions. Review the results and see if you can use the results to answer the following questions:

 

  1. At what cache size and stride level do significant changes in access times occur?
  2. Do these timing changes correlate to typical cache sizes or changes in stride?
  3. Is it possible to determine the cache sizes of the different levels based on the produced data?
  4. What in your data doesn’t make sense? What questions arise from this data?
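For readers who prefer a script to a spreadsheet, the Python sketch below shows one possible way to look for the "significant changes" asked about in question 1. It assumes the CSV has a header row of stride values and one row per cache size with the size in the first column. That layout is a guess at the program's output, and the numbers in SAMPLE are made up for illustration rather than measured.

```python
import csv
import io


def read_timings(csv_text):
    """Parse 'size, t(stride 1), t(stride 2), ...' rows into {size: {stride: time}}."""
    rows = list(csv.reader(io.StringIO(csv_text.strip())))
    strides = [int(cell) for cell in rows[0][1:]]
    table = {}
    for row in rows[1:]:
        size = int(row[0])
        table[size] = {
            stride: float(cell)
            for stride, cell in zip(strides, row[1:])
            if cell.strip()  # skip empty cells (strides not measured for this size)
        }
    return table


def find_jumps(table, stride, ratio=1.5):
    """Return sizes whose time at `stride` is at least `ratio` times the previous size's."""
    sizes = sorted(size for size in table if stride in table[size])
    jumps = []
    for prev, cur in zip(sizes, sizes[1:]):
        if table[cur][stride] >= ratio * table[prev][stride]:
            jumps.append(cur)
    return jumps


# Synthetic values, for illustration only.
SAMPLE = """size,1,64
16384,1.0,1.1
32768,1.0,1.2
65536,1.2,4.0
131072,1.2,4.1
"""
print(find_jumps(read_timings(SAMPLE), stride=64))
```

A jump at a given size suggests that the array stopped fitting in some cache level between the previous size and that one. The threshold ratio is arbitrary and may need tuning for noisy data.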

Compare your data against the actual cache information for the system.  For Windows-based systems, there is a freeware product called CPU-Z which will report detailed CPU information including cache.  On Unix & Linux systems, /proc/cpuinfo or lscpu will provide similar information. MacCPUID is a tool used for displaying detailed information about the microprocessor in a Mac computer.  Both CPU-Z and MacCPUID are free and can be downloaded from the Internet.  You may also refer to the specifications for the processor which are published online.

 

If time allows, run the program on a second, different system and compare the results.  Are they similar or different?  How are they different?

 

What to submit on Blackboard?

  1. A spreadsheet file where you consolidated and analyzed your results
  2. A summary of your observations (create a separate tab in the spreadsheet)
  3. Additional comments about your experience with this assignment (challenges, difficulties, surprises encountered, etc.) on the same page as your summary

 

Note:  due to many potential factors that could influence the outcome of your work on this assignment, there is no right vs. wrong solution.  Grading will be based on the observed level of effort presented through your analysis and documented results.  Your analysis should not just be a reiteration of the results, but should reflect your interpretation of the results, as well as posing any questions you formed in viewing the results.

 

Example graph from textbook showing program results:

Assignment Addendum

Cache Access Measurement Process Summary

 

Most contemporary processors today contain multilevel cache memory as part of the memory hierarchy.  Each level of cache can be characterized by the following parameters:

 

Size: typically in the Kibibyte or Mebibyte range

Block size (a.k.a. line size): the number of bytes contained in a block

Associativity: the number of blocks (ways) contained in a set

 

Let D = size, b = block size and a = associativity. The number of sets in a cache is defined as D / (a × b). So if a cache were 64 KB with a block size of 64 bytes and an associativity of 2, the number of sets would be 64K / (64 × 2) = 512.
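The set-count formula can be checked with a few lines of Python. This is a minimal sketch: the function name is hypothetical, and sizes are assumed to be given in bytes.

```python
def number_of_sets(cache_bytes, block_bytes, associativity):
    """Number of sets = D / (a * b), with D and b in bytes."""
    sets, remainder = divmod(cache_bytes, block_bytes * associativity)
    if remainder:
        raise ValueError("cache size is not a multiple of block size x associativity")
    return sets


print(number_of_sets(64 * 1024, 64, 2))  # the 64 KB, 64-byte block, 2-way example
```

With an associativity of 1 (a direct-mapped cache) the same 64 KB cache would have 1024 sets, one per block.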

 

The program that you are using for this assignment is supposed to exercise the memory hierarchy by repeatedly accessing a data structure in memory and measuring the time associated with the access.  We stated that a simple two-dimensional array would suffice as the test data as long as its size was declared larger than the largest cache size in the system.  An appropriate upper limit would be 16 Mbytes as most caches are smaller than this size.  The program logic should vary the array size from some minimum value, e.g. 1 Kbyte, to the maximum and for each array size vary the indexing of the array using a stride value in the range 1 to N/2 where N is the size of the array.  Let s represent the stride.

Depending on the magnitudes of N and s, with respect to the size of the cache (D), the block size (b) and the associativity (a), there are four possible categories of operations.  Each of these categories is characterized by the rate at which misses occur in the cache.  The following table summarizes these categories.

 

Category | Size of Array | Stride          | Frequency of Misses       | Time per Iteration
1        | 1 ≤ N ≤ D     | 1 ≤ s ≤ N/2     | No misses                 | Tno-miss
2        | D ≤ N         | 1 ≤ s ≤ b       | 1 miss every b/s elements | Tno-miss + Ms/b
3        | D ≤ N         | b ≤ s ≤ N/a     | 1 miss every element      | Tno-miss + M
4        | D ≤ N         | N/a ≤ s ≤ N/2   | No misses                 | Tno-miss

 

T is access time and M is the miss penalty representing the time that it takes to read the data from the next lower cache or RAM and resume execution.

 

Discussion

 

Category 1:  N ≤ D

The complete array fits into the cache and thus, independently of the stride (s), once the array is loaded for the first time, there are no more misses.  The execution time per iteration (Tno-miss) includes the time to read the element from the cache, compute its new value and store the result back into the cache.

 

Category 2: N > D and 1 ≤ s < b

 

The array is bigger than the cache and there are b/s consecutive accesses to the same cache line.  The first access to the block always generates a miss because every cache line is displaced before it can be reused in subsequent accesses.  This follows from N > D. Therefore, the execution time per iteration is Tno-miss + Ms/b.

 

Category 3: N > D and b ≤ s < N/a

 

The array is bigger than the cache and there is a cache miss every iteration as each element of the array maps to a different line.  Again, every cache line is displaced from the cache before it can be reused.  The execution time per iteration is Tno-miss + M.

 

Category 4:  N > D and N/a ≤ s < N/2

 

The array is bigger than the cache but the number of addresses mapping to a single set is less than the set associativity.  Thus, once the array is loaded, there are no more misses.  Even when the array has N elements, only N/s < a of these are touched by the program and all of them can fit in a single set.  This follows from the fact that N/a ≤ s.  The execution time per iteration is Tno-miss.
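The four categories can be summarized as a small model. The Python sketch below follows the boundaries given in the category headings above and returns the predicted time per iteration. The function and parameter names are hypothetical, and N, s, D and b are assumed to share the same unit.

```python
def classify(n, s, d, b, a):
    """Return the category (1 to 4) for array size n and stride s."""
    if n <= d:
        return 1  # whole array fits in the cache
    if s < b:
        return 2  # several consecutive accesses share a cache line
    if s < n / a:
        return 3  # every access touches a different line
    return 4      # few enough lines per set to stay resident


def time_per_iteration(n, s, d, b, a, t_no_miss, miss_penalty):
    """Predicted time per iteration: Tno-miss plus the miss cost for the category."""
    category = classify(n, s, d, b, a)
    if category == 2:
        return t_no_miss + miss_penalty * s / b
    if category == 3:
        return t_no_miss + miss_penalty
    return t_no_miss


# Hypothetical cache: D = 64 KiB, b = 64, a = 2; array N = 128 KiB; T = 1, M = 10.
for stride in (16, 64, 65536):
    print(stride, time_per_iteration(131072, stride, 65536, 64, 2, 1.0, 10.0))
```

The model is idealized: real measurements also reflect prefetching, other cache levels and timer noise, so measured curves will only approximate these steps.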

 

By making a plot of the values of execution time per iteration as a function of N and s, we might be able to identify where the program makes a transition from one category to the next.  And using this information we can estimate the values of the parameters that affect the performance of the cache, namely the cache size, block size and associativity.
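One way to read those transitions off a single curve is sketched below in Python. It assumes the array is larger than the cache, so that the time rises with the stride, plateaus, and then falls back. It takes the block size as the first stride on the plateau and the associativity as N divided by the first stride where the time drops again. The function name and the tolerance are hypothetical, and the sample curve is synthetic (generated from the table's formulas with T = 1, M = 10, b = 16, a = 4), not measured.

```python
def estimate_block_and_associativity(n, curve, tolerance=0.05):
    """Estimate (block size, associativity) from (stride, time) pairs for one array size."""
    curve = sorted(curve)
    peak = max(time for _, time in curve)
    floor = min(time for _, time in curve)
    # Block size: the smallest stride that already pays the full miss penalty.
    block = next(s for s, t in curve if t >= peak * (1 - tolerance))
    # Associativity: N over the first later stride that is back near the no-miss time.
    drop = next(
        (s for s, t in curve if s > block and t <= floor * (1 + tolerance)),
        None,
    )
    associativity = n // drop if drop else None
    return block, associativity


# Synthetic curve for N = 4096, for illustration only.
SYNTHETIC = [
    (1, 1.625), (2, 2.25), (4, 3.5), (8, 6.0),
    (16, 11.0), (32, 11.0), (64, 11.0), (128, 11.0), (256, 11.0), (512, 11.0),
    (1024, 1.0), (2048, 1.0),
]
print(estimate_block_and_associativity(4096, SYNTHETIC))
```

If the time never falls back, the function returns None for the associativity. On measured data the plateau and the drop are rarely this clean, so the result is a starting point for the analysis rather than an answer.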

 

Our approach is somewhat flawed in that we are neglecting the effect of virtual memory and the use of a TLB (translation-lookaside buffer).    For our purpose, we can neglect these issues and still gain an understanding of the operation and performance of the caches in a given system.

 

 

Why do you want to study the course? What have you done that makes you suitable for the course? What else have you done that makes you somebody who will contribute to the course and to the university?

UCAS Application

Why do you want to study the course?
What have you done that makes you suitable for the course?
What else have you done that makes you somebody who will contribute to the course and to the university?

Identify the scanner used to produce the report. Is the tool open source or commercial? Do you consider the tool to be industry standard? What are some advantages to using the tool? Disadvantages? What is your overall impression of the tool’s output? Does the tool provide enough reporting detail for you as the analyst to focus on the correct vulnerabilities? Can you appropriately discern the most critical vulnerabilities?

Mercury USA’s business

Overview

 In this section, provide a brief overview to establish the purpose of your memorandum. You should introduce the topics in Parts 1, 2, and 3, below. Remember that you are writing to your immediate boss to help her address the CEO’s concerns over recent cybersecurity attacks against the transportation sector. Additionally, your boss has provided you with the results of a recent pen testing engagement performed by a third party on behalf of Mercury USA.

 

Part 1: Vulnerability Management (VM) Process Recommendation

In this section, present a recommended VM process for Mercury USA. Highlight the major VM process components as you learned in your studies. Explain how your recommendation meets the business needs of Mercury USA. Consider the transportation sector and the overall scenario in context. The text and questions below represent specifics to focus on while writing the memorandum. Do not include the specific text of the questions in your final submission.

  • What are the main elements of a VM process, tailored to Mercury USA and the transportation sector?
  • How will you plan for and define the scope of a VM process?
  • How will you identify the assets involved?
  • How will you scan and assess vulnerabilities?
  • What are the industry-standard scanning tools? Support your findings.
  • What frequency of scanning do you recommend and why?
  • How will you report the results of scanning and recommended countermeasures?

 

Part 2: Vulnerability Scanning Tool Evaluation and Recommendations
After performing an analysis of the vulnerability report provided by the third-party penetration testers, present your evaluation of the tool and your recommendations here. The text and questions below represent the specifics to focus on while writing your memorandum. Do not include the specific text of the questions in your final submission.

  • Identify the scanner used to produce the report. Is the tool open source or commercial? Do you consider the tool to be industry standard?
  • What are some advantages to using the tool? Disadvantages?
  • What is your overall impression of the tool’s output?
  • Does the tool provide enough reporting detail for you as the analyst to focus on the correct vulnerabilities? Can you appropriately discern the most critical vulnerabilities?
  • Do you think mitigations for the vulnerabilities are adequately covered in the report?
  • Do you think the reports are suitable for management? Explain why or why not.
  • Would you distribute the report automatically? Explain why or why not.
  • Would you recommend that Mercury USA use the tool? Explain why or why not.

 

 Part 3: Business Case Example

 In this section, provide an example of what could happen if Mercury USA does not implement your recommendations for a VM process (e.g., data exfiltration, hacker intrusions, ransomware, etc.). The text and questions below represent the specifics to focus on while writing your memorandum. Do not include the specific text of the questions in your final submission.

  • What are some of the outcomes to the business if your example occurred?
  • How does your recommended VM process address the example you used?
  • For the tool you evaluated in Part 2 above, do you think the tool will be adequate? Why or why not?

 

Closing

In this section, summarize the main points of your argument for a VM process, tool evaluation, and use the case example to support your recommendations. Keep in mind that you are addressing the CEO’s concerns over recent cybersecurity attacks against the transportation sector and how you can help increase Mercury USA’s overall security posture to protect the organization against attacks, breaches, and data loss.

Design and implement complete solutions using massive amounts of data based on artificial-intelligence, machine-learning, or statistical-analysis methods. Design and implement new methods to capture and analyze massive amounts of data.

Applying for Master’s Degree of Data Science

The Master of Science in Data Science program is offered by the College of Science’s Department of Computer Science and Department of Mathematics and Statistics. This degree is designed for students with an undergraduate degree in the sciences or engineering, and it provides both mathematical and algorithmic foundational knowledge and practical programming skills for data science careers.

The CIP code for the MS Data Science is 30.0801. Per the current DHS posted list from 2016, this is a STEM Designated degree program, making this program eligible for the STEM OPT extension program.

Upon completion of this program, students will be able to:

  • Design and implement complete solutions using massive amounts of data based on artificial-intelligence, machine-learning, or statistical-analysis methods.
  • Design and implement new methods to capture and analyze massive amounts of data.
  • Use and improve existing tools for capturing and analyzing massive amounts of data.

This degree assists those with a baccalaureate degree in the sciences or engineering to pursue a data science career by providing them with a rigorous and affordable education in mathematics, statistics, computer science and machine learning. Graduates will be suited to a number of data-based careers including but not limited to data scientist, data analyst, data mining specialist, machine learning engineer, and data and analytics manager.

What benefits and challenges are associated with these kinds of relationships across the life cycle?

Consider the developmental issues that are associated with sibling relationships and friendships. For this discussion, compare and contrast these groups and address the following:

What are key factors to understanding the impact of each type of relationship throughout the life cycle?
What benefits and challenges are associated with these kinds of relationships across the life cycle?
Identify a friendship or sibling relationship in your life (past or present), and provide one example interaction or experience from that relationship that illustrates a key concept from this week’s readings.