Discuss Data Mining Methods that are appropriate for your project problem.
Review the German Credit DataSet (https://archive.ics.uci.edu/ml/datasets/Statlog+(German+Credit+Data)) in the attachment. It has 1,000 observations. Train, test, and validate a neural network on the first 980 observations, using as many neurons in the hidden layer as you like.
Take a look at the data and remove a few attributes that you think do not help to determine the creditworthiness of a customer.
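As a minimal sketch of the attribute-removal step, the snippet below drops columns by index from rows held as plain Python lists. The helper name `drop_columns`, the tiny synthetic rows, and the choice of which column to drop are all assumptions made for illustration; they do not come from the assignment or the real data file.

```python
def drop_columns(rows, drop_idx):
    """Return a copy of rows without the columns whose indices are in drop_idx."""
    drop = set(drop_idx)
    return [[v for i, v in enumerate(row) if i not in drop] for row in rows]

# Synthetic stand-in rows (NOT real records): three attributes, then the label.
rows = [
    ["A11", 6, "A192", "good"],
    ["A12", 48, "A191", "bad"],
]

# Hypothetical choice: drop the attribute at index 2.
reduced = drop_columns(rows, [2])
print(reduced)  # [['A11', 6, 'good'], ['A12', 48, 'bad']]
```

Dropping by index keeps the label in the last position as long as the label column itself is not in the drop list.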
The last column indicates whether a customer is actually “good” or “bad” (i.e., their credit rating). See if you can improve the accuracy by changing parameters such as the number of neurons and the number of layers. After training, test the network on the 20 holdout samples and report the results using the method below.
Each correct prediction (good or bad) counts as 0. If you predict “bad” for a good customer, add 1 to your total. If you predict “good” for a bad customer, add 5 to your total. The lower the total, the better your neural network.
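The scoring rule above can be sketched as a small function. The function name `holdout_score` and the string labels "good" and "bad" are assumptions for illustration; if your labels are encoded differently (for example, as numbers), adjust the constants accordingly.

```python
# Asymmetric cost scoring, as described in the assignment text:
#   correct prediction            -> 0
#   good customer predicted "bad"  -> 1
#   bad customer predicted "good"  -> 5
GOOD, BAD = "good", "bad"  # assumed label encoding

def holdout_score(actual, predicted):
    """Return the total misclassification cost over paired labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    total = 0
    for a, p in zip(actual, predicted):
        if a == p:
            continue  # correct prediction costs nothing
        total += 1 if a == GOOD else 5
    return total

# Tiny made-up example: one of each outcome.
actual = [GOOD, GOOD, BAD, BAD]
predicted = [GOOD, BAD, GOOD, BAD]
print(holdout_score(actual, predicted))  # 0 + 1 + 5 + 0 = 6
```

Because approving a bad customer costs five times as much as rejecting a good one, a network with slightly lower raw accuracy can still achieve a better (lower) total.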
What you need to do:
1) Partition the data into training (980 data points) and holdout (last 20 data points) datasets.
2) Compute the evaluation score on the 20 holdout data points.
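The partition in step 1 can be sketched as a plain slice that keeps the original row order, so the last 20 observations remain the holdout set. The helper name `partition` and the synthetic rows are assumptions for illustration; in practice the rows would come from the attached data file.

```python
def partition(rows, n_holdout=20):
    """Split rows into (training, holdout) without shuffling."""
    if not 0 < n_holdout < len(rows):
        raise ValueError("n_holdout must be between 1 and len(rows) - 1")
    return rows[:-n_holdout], rows[-n_holdout:]

# Synthetic stand-in for the 1,000 observations (NOT real data).
rows = [[i, "good" if i % 3 else "bad"] for i in range(1000)]

train, holdout = partition(rows)
print(len(train), len(holdout))  # 980 20
```

Any shuffling or further train/validation splitting should happen only inside the 980 training rows, so that the holdout rows never influence training.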
Submission:
a) Detailed Project Report that follows the example of the academic paper “Combining Feature Selection and Neural Network for Solving Classification Problem”.
1) Summary of your project
2) Introduction to applying data mining to the business problem (in this case, consumer credit cards)
3) Discuss Data Mining Methods that are appropriate for your project problem.
4) Technical Description of the German Credit Data Mining Process.
You can choose any one of three data mining processes to describe your approaches. The following is based on the Cross-Industry Standard Process for Data Mining (CRISP-DM).
a) Business Understanding
b) Data Understanding
c) Data Preparation
d) Model Building Based on Neural Networks
e) Testing and Evaluations (show all your screen plots and performance results).
f) Deployment (potential concerns and issues when you apply your NN system to a real-life credit card company)
5) Description of Results: analyze your experimental results and discuss the insights you gained from the project.