Source: 3Blue1Brown

Neural networks have evolved rapidly in recent years, and many applications have been built on top of them. GPUs from companies such as NVIDIA have sped up the training of network architectures that need many parameters to produce good results across a wide range of problems. The GIF above shows how a neural network passes data through many layers, each with its own parameters, to produce an output that approximates the true value. A practical example is image classification, where combining many layers and parameters with an activation function and a loss function lets the network learn to identify the image, as in the GIF above. In this tutorial we will learn how to implement these ideas in PyTorch.

Case Study 1

You work as an assistant to the mayor of Somerville, and the HR department has asked you to build a model capable of predicting whether a person is happy with the current administration based on their satisfaction with the city's services.

import pandas as pd 
import numpy as np
import matplotlib.pyplot as plt
import torch 
import torch.nn as nn 
import torch.optim as optim
import warnings
warnings.filterwarnings("ignore")
df = pd.read_csv("SomervilleHappinessSurvey2015.csv")
df.head()
D X1 X2 X3 X4 X5 X6
0 0 3 3 3 4 2 4
1 0 3 2 3 5 4 3
2 1 5 3 3 3 3 5
3 0 5 4 3 3 3 5
4 0 5 4 3 3 3 5

Columns Information:

  • D = decision attribute (D) with values 0 (unhappy) and 1 (happy)
  • X1 = the availability of information about the city services
  • X2 = the cost of housing
  • X3 = the overall quality of public schools
  • X4 = your trust in the local police
  • X5 = the maintenance of streets and sidewalks
  • X6 = the availability of social community events

  • Attributes X1 to X6 have values 1 to 5.

X = torch.tensor(df.drop("D",axis=1).astype(np.float32).values)
y = torch.tensor(df["D"].astype(np.float32).values)
X[:10]
tensor([[3., 3., 3., 4., 2., 4.],
        [3., 2., 3., 5., 4., 3.],
        [5., 3., 3., 3., 3., 5.],
        [5., 4., 3., 3., 3., 5.],
        [5., 4., 3., 3., 3., 5.],
        [5., 5., 3., 5., 5., 5.],
        [3., 1., 2., 2., 1., 3.],
        [5., 4., 4., 4., 4., 5.],
        [4., 1., 4., 4., 4., 4.],
        [4., 4., 4., 2., 5., 5.]])
# 6 inputs, one per feature
# 1 output passed through a sigmoid: the predicted probability of being happy
model = nn.Sequential(nn.Linear(6,1),
                     nn.Sigmoid())
print(model)
Sequential(
  (0): Linear(in_features=6, out_features=1, bias=True)
  (1): Sigmoid()
)
loss_func = nn.MSELoss()
optimizer = optim.Adam(model.parameters(),lr=1e-2)
losses =[]
for i in range(20):
    y_pred = model(X)
    loss = loss_func(y_pred,y)
    # item() extracts the Python float from the loss tensor
    losses.append(loss.item())
    optimizer.zero_grad()
    # backpropagation: compute the gradients
    loss.backward()
    # update the weights using the computed gradients
    optimizer.step()
    
    if i%5 ==0:
        print(i,loss.item())
    
0 0.5094006061553955
5 0.46509113907814026
10 0.3730385899543762
15 0.26985085010528564

Plot the loss for each epoch

plt.plot(range(0,20),losses)
plt.xlabel("Epoch")
plt.ylabel("Loss")
[Plot: training loss per epoch]

This is just a simple sequential neural network implemented in PyTorch. In Case Study 2 we will dive deeper into the model-building process and walk through the data science lifecycle: cleaning the data, splitting it, making predictions, and evaluating them.
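
Since the network ends in a sigmoid, binary cross-entropy is the more conventional loss than MSE for this kind of binary target. Below is a minimal, hedged sketch of the same model trained with nn.BCELoss; the names model_bce, criterion_bce, and optimizer_bce are introduced here only for illustration, and the target is reshaped so its shape matches the model's (N, 1) output.

# Same architecture as above, trained with binary cross-entropy instead of MSE
model_bce = nn.Sequential(nn.Linear(6, 1), nn.Sigmoid())
criterion_bce = nn.BCELoss()
optimizer_bce = optim.Adam(model_bce.parameters(), lr=1e-2)
y_col = y.reshape(-1, 1)  # (N,) -> (N, 1) to match the model output

for i in range(20):
    y_pred = model_bce(X)
    loss = criterion_bce(y_pred, y_col)
    optimizer_bce.zero_grad()
    loss.backward()
    optimizer_bce.step()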

Case Study 2

Deep Learning in Banking

Deep learning has been applied in many sectors, including banking. A common problem in this sector is predicting whether a bank should grant credit to customers who apply for a credit card. This is essential for the bank because it determines how much money it can safely extend and lets it estimate the profit it can expect from card holders over a given period. Using this dataset, we will identify the customers who are good candidates for credit, which directly affects the bank's income.

We will follow a few steps before modelling the data with an ANN in PyTorch:

  • Understand the data, including assessing its quality
  • Rescale the features (features on different scales can make one feature look more important than another simply because it has larger numerical values)
  • Split the data
df_credit = pd.read_excel("default of credit card clients.xls",skiprows=1)
df_credit.head()
ID LIMIT_BAL SEX EDUCATION MARRIAGE AGE PAY_0 PAY_2 PAY_3 PAY_4 ... BILL_AMT4 BILL_AMT5 BILL_AMT6 PAY_AMT1 PAY_AMT2 PAY_AMT3 PAY_AMT4 PAY_AMT5 PAY_AMT6 default payment next month
0 1 20000 2 2 1 24 2 2 -1 -1 ... 0 0 0 0 689 0 0 0 0 1
1 2 120000 2 2 2 26 -1 2 0 0 ... 3272 3455 3261 0 1000 1000 1000 0 2000 1
2 3 90000 2 2 2 34 0 0 0 0 ... 14331 14948 15549 1518 1500 1000 1000 1000 5000 0
3 4 50000 2 2 1 37 0 0 0 0 ... 28314 28959 29547 2000 2019 1200 1100 1069 1000 0
4 5 50000 1 2 1 57 -1 0 -1 0 ... 20940 19146 19131 2000 36681 10000 9000 689 679 0

5 rows × 25 columns

print(f"Rows : {df_credit.shape[0]}, Columns:{df_credit.shape[1]}")
Rows : 30000, Columns:25
# drop the ID column and the SEX column, which we will not use as features
data_clean = df_credit.drop(["ID", "SEX"], axis=1)
data_clean.head()
LIMIT_BAL EDUCATION MARRIAGE AGE PAY_0 PAY_2 PAY_3 PAY_4 PAY_5 PAY_6 ... BILL_AMT4 BILL_AMT5 BILL_AMT6 PAY_AMT1 PAY_AMT2 PAY_AMT3 PAY_AMT4 PAY_AMT5 PAY_AMT6 default payment next month
0 20000 2 1 24 2 2 -1 -1 -2 -2 ... 0 0 0 0 689 0 0 0 0 1
1 120000 2 2 26 -1 2 0 0 0 2 ... 3272 3455 3261 0 1000 1000 1000 0 2000 1
2 90000 2 2 34 0 0 0 0 0 0 ... 14331 14948 15549 1518 1500 1000 1000 1000 5000 0
3 50000 2 1 37 0 0 0 0 0 0 ... 28314 28959 29547 2000 2019 1200 1100 1069 1000 0
4 50000 2 1 57 -1 0 -1 0 0 0 ... 20940 19146 19131 2000 36681 10000 9000 689 679 0

5 rows × 23 columns

# fraction of missing values per column
(data_clean.isnull().sum() / data_clean.shape[0]).plot()
[Plot: share of missing values per column]
data_clean.describe()
LIMIT_BAL EDUCATION MARRIAGE AGE PAY_0 PAY_2 PAY_3 PAY_4 PAY_5 PAY_6 ... BILL_AMT4 BILL_AMT5 BILL_AMT6 PAY_AMT1 PAY_AMT2 PAY_AMT3 PAY_AMT4 PAY_AMT5 PAY_AMT6 default payment next month
count 30000.000000 30000.000000 30000.000000 30000.000000 30000.000000 30000.000000 30000.000000 30000.000000 30000.000000 30000.000000 ... 30000.000000 30000.000000 30000.000000 30000.000000 3.000000e+04 30000.00000 30000.000000 30000.000000 30000.000000 30000.000000
mean 167484.322667 1.853133 1.551867 35.485500 -0.016700 -0.133767 -0.166200 -0.220667 -0.266200 -0.291100 ... 43262.948967 40311.400967 38871.760400 5663.580500 5.921163e+03 5225.68150 4826.076867 4799.387633 5215.502567 0.221200
std 129747.661567 0.790349 0.521970 9.217904 1.123802 1.197186 1.196868 1.169139 1.133187 1.149988 ... 64332.856134 60797.155770 59554.107537 16563.280354 2.304087e+04 17606.96147 15666.159744 15278.305679 17777.465775 0.415062
min 10000.000000 0.000000 0.000000 21.000000 -2.000000 -2.000000 -2.000000 -2.000000 -2.000000 -2.000000 ... -170000.000000 -81334.000000 -339603.000000 0.000000 0.000000e+00 0.00000 0.000000 0.000000 0.000000 0.000000
25% 50000.000000 1.000000 1.000000 28.000000 -1.000000 -1.000000 -1.000000 -1.000000 -1.000000 -1.000000 ... 2326.750000 1763.000000 1256.000000 1000.000000 8.330000e+02 390.00000 296.000000 252.500000 117.750000 0.000000
50% 140000.000000 2.000000 2.000000 34.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 19052.000000 18104.500000 17071.000000 2100.000000 2.009000e+03 1800.00000 1500.000000 1500.000000 1500.000000 0.000000
75% 240000.000000 2.000000 2.000000 41.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 54506.000000 50190.500000 49198.250000 5006.000000 5.000000e+03 4505.00000 4013.250000 4031.500000 4000.000000 0.000000
max 1000000.000000 6.000000 3.000000 79.000000 8.000000 8.000000 8.000000 8.000000 8.000000 8.000000 ... 891586.000000 927171.000000 961664.000000 873552.000000 1.684259e+06 896040.00000 621000.000000 426529.000000 528666.000000 1.000000

8 rows × 23 columns

outliers = {}

# flag values more than 3 standard deviations from the column mean
for i in range(data_clean.shape[1]):
    min_t = data_clean[data_clean.columns[i]].mean() - (3 * data_clean[data_clean.columns[i]].std())
    max_t = data_clean[data_clean.columns[i]].mean() + (3 * data_clean[data_clean.columns[i]].std())
    
    count =0
    for j in data_clean[data_clean.columns[i]]:
        if j < min_t or j > max_t:
            count +=1
            
    percentage = count/data_clean.shape[0]
    
    outliers[data_clean.columns[i]] = round(percentage,3)
    
from pprint import pprint
pprint(outliers)
{'AGE': 0.005,
 'BILL_AMT1': 0.023,
 'BILL_AMT2': 0.022,
 'BILL_AMT3': 0.022,
 'BILL_AMT4': 0.023,
 'BILL_AMT5': 0.022,
 'BILL_AMT6': 0.022,
 'EDUCATION': 0.011,
 'LIMIT_BAL': 0.004,
 'MARRIAGE': 0.0,
 'PAY_0': 0.005,
 'PAY_2': 0.005,
 'PAY_3': 0.005,
 'PAY_4': 0.006,
 'PAY_5': 0.005,
 'PAY_6': 0.004,
 'PAY_AMT1': 0.013,
 'PAY_AMT2': 0.01,
 'PAY_AMT3': 0.012,
 'PAY_AMT4': 0.013,
 'PAY_AMT5': 0.014,
 'PAY_AMT6': 0.015,
 'default payment next month': 0.0}
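
The same 3-standard-deviation check can be computed without explicit Python loops. A hedged vectorized sketch over data_clean (the names means, stds, and outside are introduced here only for illustration):

# vectorized version of the outlier count above
means, stds = data_clean.mean(), data_clean.std()
outside = (data_clean < means - 3 * stds) | (data_clean > means + 3 * stds)
print((outside.sum() / len(data_clean)).round(3))
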
data_clean["default payment next month"].value_counts().plot(kind="bar")
[Plot: bar chart of the target classes, showing the class imbalance]
target = data_clean["default payment next month"]
yes = target[target == 1].count()
no = target[target == 0].count()
data_yes = data_clean[data_clean["default payment next month"] == 1]
data_no = data_clean[data_clean["default payment next month"] == 0]
# oversample the minority (default) class up to the size of the majority class
over_sampling = data_yes.sample(no, replace=True, random_state=0)
data_resampled = pd.concat([data_no, over_sampling], axis=0)
data_resampled["default payment next month"].value_counts().plot(kind="bar")
[Plot: bar chart of the target classes after oversampling, now balanced]
data_resampled = data_resampled.reset_index(drop=True)
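
An equivalent way to oversample the minority class is sklearn.utils.resample; this is only a hedged sketch, and the names over_sampling_alt and data_resampled_alt are introduced here for illustration:

from sklearn.utils import resample

# draw `no` rows with replacement from the minority class, then recombine
over_sampling_alt = resample(data_yes, replace=True, n_samples=no, random_state=0)
data_resampled_alt = pd.concat([data_no, over_sampling_alt], axis=0).reset_index(drop=True)
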
X = data_resampled.drop("default payment next month",axis=1)
y =data_resampled["default payment next month"]
# min-max scale every feature to the [0, 1] range
X = (X - X.min()) / (X.max() - X.min())
X.head()
LIMIT_BAL EDUCATION MARRIAGE AGE PAY_0 PAY_2 PAY_3 PAY_4 PAY_5 PAY_6 ... BILL_AMT3 BILL_AMT4 BILL_AMT5 BILL_AMT6 PAY_AMT1 PAY_AMT2 PAY_AMT3 PAY_AMT4 PAY_AMT5 PAY_AMT6
0 0.080808 0.333333 0.666667 0.224138 0.2 0.2 0.2 0.2 0.2 0.2 ... 0.093789 0.173637 0.095470 0.272928 0.001738 0.000891 0.001116 0.001610 0.002345 0.009458
1 0.040404 0.333333 0.333333 0.275862 0.2 0.2 0.2 0.2 0.2 0.2 ... 0.113407 0.186809 0.109363 0.283685 0.002290 0.001199 0.001339 0.001771 0.002506 0.001892
2 0.040404 0.333333 0.333333 0.620690 0.1 0.2 0.1 0.2 0.2 0.2 ... 0.106020 0.179863 0.099633 0.275681 0.002290 0.021779 0.011160 0.014493 0.001615 0.001284
3 0.040404 0.166667 0.666667 0.275862 0.2 0.2 0.2 0.2 0.2 0.2 ... 0.117974 0.178407 0.100102 0.276367 0.002862 0.001078 0.000733 0.001610 0.002345 0.001513
4 0.494949 0.166667 0.666667 0.137931 0.2 0.2 0.2 0.2 0.2 0.2 ... 0.330672 0.671310 0.559578 0.625196 0.062961 0.023749 0.042409 0.032591 0.032237 0.026047

5 rows × 22 columns

final_data =pd.concat([X,y],axis=1)
final_data.to_csv("data_prepared.csv",index=False)
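
The manual rescaling above maps every feature to the [0, 1] range. A hedged equivalent uses scikit-learn's MinMaxScaler applied to the unscaled features; the names features and X_alt are introduced here only for illustration:

from sklearn.preprocessing import MinMaxScaler

features = data_resampled.drop("default payment next month", axis=1)  # unscaled features
scaler = MinMaxScaler()
X_alt = pd.DataFrame(scaler.fit_transform(features), columns=features.columns)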

Build Model

import torch.nn.functional as F
from sklearn.model_selection import train_test_split 
from sklearn.utils import shuffle 
from sklearn.metrics import accuracy_score
data = pd.read_csv("data_prepared.csv")
data.head()
LIMIT_BAL EDUCATION MARRIAGE AGE PAY_0 PAY_2 PAY_3 PAY_4 PAY_5 PAY_6 ... BILL_AMT4 BILL_AMT5 BILL_AMT6 PAY_AMT1 PAY_AMT2 PAY_AMT3 PAY_AMT4 PAY_AMT5 PAY_AMT6 default payment next month
0 0.080808 0.333333 0.666667 0.224138 0.2 0.2 0.2 0.2 0.2 0.2 ... 0.173637 0.095470 0.272928 0.001738 0.000891 0.001116 0.001610 0.002345 0.009458 0
1 0.040404 0.333333 0.333333 0.275862 0.2 0.2 0.2 0.2 0.2 0.2 ... 0.186809 0.109363 0.283685 0.002290 0.001199 0.001339 0.001771 0.002506 0.001892 0
2 0.040404 0.333333 0.333333 0.620690 0.1 0.2 0.1 0.2 0.2 0.2 ... 0.179863 0.099633 0.275681 0.002290 0.021779 0.011160 0.014493 0.001615 0.001284 0
3 0.040404 0.166667 0.666667 0.275862 0.2 0.2 0.2 0.2 0.2 0.2 ... 0.178407 0.100102 0.276367 0.002862 0.001078 0.000733 0.001610 0.002345 0.001513 0
4 0.494949 0.166667 0.666667 0.137931 0.2 0.2 0.2 0.2 0.2 0.2 ... 0.671310 0.559578 0.625196 0.062961 0.023749 0.042409 0.032591 0.032237 0.026047 0

5 rows × 23 columns

X = data.drop("default payment next month",axis=1)
y =data["default payment next month"]
X_new , X_test,y_new,y_test =train_test_split(X,y,test_size=0.2,random_state=3)
dev_per = X_test.shape[0]/X_new.shape[0]
X_train,X_dev,y_train,y_dev = train_test_split(X_new,y_new,test_size=dev_per,random_state=3)
print("Training sets:",X_train.shape, y_train.shape)
print("Validation sets:",X_dev.shape, y_dev.shape)
print("Testing sets:",X_test.shape, y_test.shape)
Training sets: (28036, 22) (28036,)
Validation sets: (9346, 22) (9346,)
Testing sets: (9346, 22) (9346,)
X_dev_torch = torch.tensor(X_dev.values).float()
y_dev_torch = torch.tensor(y_dev.values)
X_test_torch = torch.tensor(X_test.values).float()
y_test_torch = torch.tensor(y_test.values)
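
Instead of slicing DataFrames by hand in the training loop below, the mini-batches could also come from torch.utils.data. A hedged sketch with the same batch size (train_ds and train_loader are names chosen here for illustration):

from torch.utils.data import TensorDataset, DataLoader

# wrap the training split in a dataset and let DataLoader shuffle and batch it
train_ds = TensorDataset(torch.tensor(X_train.values).float(),
                         torch.tensor(y_train.values))
train_loader = DataLoader(train_ds, batch_size=128, shuffle=True)
# for X_batch, y_batch in train_loader:
#     ... same forward/backward steps as in the loop below
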
class Classifier(nn.Module):
    def __init__(self, input_size):
        super().__init__()
        self.hidden_1 = nn.Linear(input_size, 10)
        self.hidden_2 = nn.Linear(10, 10)
        self.hidden_3 = nn.Linear(10, 10)
        self.output = nn.Linear(10, 2)
        
    def forward(self, x):
        z = F.relu(self.hidden_1(x))
        z = F.relu(self.hidden_2(z))
        z = F.relu(self.hidden_3(z))
        out = F.log_softmax(self.output(z), dim=1)
        
        return out
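
Because forward returns log_softmax outputs, the model is paired with nn.NLLLoss below. An equivalent, hedged formulation returns raw logits and uses nn.CrossEntropyLoss, which applies log-softmax internally; ClassifierLogits is a name chosen here only for illustration:

# Equivalent variant: drop the log_softmax and let CrossEntropyLoss handle it
class ClassifierLogits(nn.Module):
    def __init__(self, input_size):
        super().__init__()
        self.hidden_1 = nn.Linear(input_size, 10)
        self.hidden_2 = nn.Linear(10, 10)
        self.hidden_3 = nn.Linear(10, 10)
        self.output = nn.Linear(10, 2)

    def forward(self, x):
        z = F.relu(self.hidden_1(x))
        z = F.relu(self.hidden_2(z))
        z = F.relu(self.hidden_3(z))
        return self.output(z)  # raw logits, no log_softmax

# criterion = nn.CrossEntropyLoss() would then replace nn.NLLLoss()
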
model = Classifier(X_train.shape[1])
criterion = nn.NLLLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

epochs = 50
batch_size = 128 # mini-batch gradient descent for a faster training process
train_losses,dev_losses,train_acc,dev_acc =[],[],[],[]

for epoch in range(epochs):
    X_,y_ =shuffle(X_train,y_train)
    running_loss=0
    running_acc=0
    iterations =0
    
    for i in range(0,len(X_),batch_size):
        iterations +=1
        b = i +batch_size
        X_batch = torch.tensor(X_.iloc[i:b,:].values).float()
        y_batch = torch.tensor(y_.iloc[i:b].values)
        
        pred = model(X_batch)
        loss = criterion(pred,y_batch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        running_loss += loss.item()
        # convert log-probabilities back to probabilities and take the top class
        ps = torch.exp(pred)
        top_p, top_class = ps.topk(1, dim=1)
        running_acc += accuracy_score(y_batch, top_class)
        
    dev_loss =0
    acc =0
    
    # evaluate on the validation set without tracking gradients
    with torch.no_grad():
        pred_dev = model(X_dev_torch)
        dev_loss =criterion(pred_dev,y_dev_torch)
        
        ps_dev = torch.exp(pred_dev)
        top_p,top_class_dev = ps_dev.topk(1,dim=1)
        acc +=accuracy_score(y_dev_torch,top_class_dev)
        
    train_losses.append(running_loss/iterations)
    dev_losses.append(dev_loss)
    train_acc.append(running_acc/iterations)
    dev_acc.append(acc)
    print("Epoch: {}/{}.. ".format(epoch+1, epochs),
          "Training Loss: {:.3f}.. ".format(running_loss/iterations),
          "Validation Loss: {:.3f}.. ".format(dev_loss),
          "Training Accuracy: {:.3f}.. ".format(running_acc/iterations),
          "Validation Accuracy: {:.3f}".format(acc))
        
Epoch: 1/50..  Training Loss: 0.675..  Validation Loss: 0.617..  Training Accuracy: 0.602..  Validation Accuracy: 0.663
Epoch: 2/50..  Training Loss: 0.607..  Validation Loss: 0.600..  Training Accuracy: 0.676..  Validation Accuracy: 0.686
Epoch: 3/50..  Training Loss: 0.598..  Validation Loss: 0.599..  Training Accuracy: 0.686..  Validation Accuracy: 0.688
Epoch: 4/50..  Training Loss: 0.595..  Validation Loss: 0.592..  Training Accuracy: 0.691..  Validation Accuracy: 0.695
Epoch: 5/50..  Training Loss: 0.592..  Validation Loss: 0.590..  Training Accuracy: 0.692..  Validation Accuracy: 0.693
Epoch: 6/50..  Training Loss: 0.589..  Validation Loss: 0.586..  Training Accuracy: 0.692..  Validation Accuracy: 0.693
Epoch: 7/50..  Training Loss: 0.585..  Validation Loss: 0.584..  Training Accuracy: 0.695..  Validation Accuracy: 0.696
Epoch: 8/50..  Training Loss: 0.583..  Validation Loss: 0.579..  Training Accuracy: 0.696..  Validation Accuracy: 0.700
Epoch: 9/50..  Training Loss: 0.576..  Validation Loss: 0.575..  Training Accuracy: 0.701..  Validation Accuracy: 0.703
Epoch: 10/50..  Training Loss: 0.572..  Validation Loss: 0.572..  Training Accuracy: 0.704..  Validation Accuracy: 0.707
Epoch: 11/50..  Training Loss: 0.572..  Validation Loss: 0.571..  Training Accuracy: 0.705..  Validation Accuracy: 0.709
Epoch: 12/50..  Training Loss: 0.570..  Validation Loss: 0.569..  Training Accuracy: 0.704..  Validation Accuracy: 0.709
Epoch: 13/50..  Training Loss: 0.569..  Validation Loss: 0.568..  Training Accuracy: 0.708..  Validation Accuracy: 0.709
Epoch: 14/50..  Training Loss: 0.567..  Validation Loss: 0.568..  Training Accuracy: 0.708..  Validation Accuracy: 0.709
Epoch: 15/50..  Training Loss: 0.566..  Validation Loss: 0.566..  Training Accuracy: 0.710..  Validation Accuracy: 0.711
Epoch: 16/50..  Training Loss: 0.566..  Validation Loss: 0.567..  Training Accuracy: 0.708..  Validation Accuracy: 0.710
Epoch: 17/50..  Training Loss: 0.564..  Validation Loss: 0.566..  Training Accuracy: 0.711..  Validation Accuracy: 0.708
Epoch: 18/50..  Training Loss: 0.566..  Validation Loss: 0.564..  Training Accuracy: 0.707..  Validation Accuracy: 0.710
Epoch: 19/50..  Training Loss: 0.564..  Validation Loss: 0.564..  Training Accuracy: 0.712..  Validation Accuracy: 0.708
Epoch: 20/50..  Training Loss: 0.565..  Validation Loss: 0.564..  Training Accuracy: 0.710..  Validation Accuracy: 0.711
Epoch: 21/50..  Training Loss: 0.562..  Validation Loss: 0.563..  Training Accuracy: 0.712..  Validation Accuracy: 0.707
Epoch: 22/50..  Training Loss: 0.563..  Validation Loss: 0.562..  Training Accuracy: 0.710..  Validation Accuracy: 0.711
Epoch: 23/50..  Training Loss: 0.561..  Validation Loss: 0.564..  Training Accuracy: 0.713..  Validation Accuracy: 0.710
Epoch: 24/50..  Training Loss: 0.562..  Validation Loss: 0.562..  Training Accuracy: 0.712..  Validation Accuracy: 0.709
Epoch: 25/50..  Training Loss: 0.561..  Validation Loss: 0.563..  Training Accuracy: 0.712..  Validation Accuracy: 0.710
Epoch: 26/50..  Training Loss: 0.562..  Validation Loss: 0.562..  Training Accuracy: 0.708..  Validation Accuracy: 0.707
Epoch: 27/50..  Training Loss: 0.562..  Validation Loss: 0.563..  Training Accuracy: 0.710..  Validation Accuracy: 0.709
Epoch: 28/50..  Training Loss: 0.560..  Validation Loss: 0.565..  Training Accuracy: 0.711..  Validation Accuracy: 0.712
Epoch: 29/50..  Training Loss: 0.561..  Validation Loss: 0.562..  Training Accuracy: 0.711..  Validation Accuracy: 0.709
Epoch: 30/50..  Training Loss: 0.560..  Validation Loss: 0.564..  Training Accuracy: 0.711..  Validation Accuracy: 0.710
Epoch: 31/50..  Training Loss: 0.561..  Validation Loss: 0.561..  Training Accuracy: 0.713..  Validation Accuracy: 0.709
Epoch: 32/50..  Training Loss: 0.560..  Validation Loss: 0.563..  Training Accuracy: 0.712..  Validation Accuracy: 0.706
Epoch: 33/50..  Training Loss: 0.559..  Validation Loss: 0.561..  Training Accuracy: 0.711..  Validation Accuracy: 0.709
Epoch: 34/50..  Training Loss: 0.559..  Validation Loss: 0.561..  Training Accuracy: 0.714..  Validation Accuracy: 0.710
Epoch: 35/50..  Training Loss: 0.560..  Validation Loss: 0.561..  Training Accuracy: 0.711..  Validation Accuracy: 0.710
Epoch: 36/50..  Training Loss: 0.562..  Validation Loss: 0.559..  Training Accuracy: 0.710..  Validation Accuracy: 0.710
Epoch: 37/50..  Training Loss: 0.560..  Validation Loss: 0.561..  Training Accuracy: 0.713..  Validation Accuracy: 0.708
Epoch: 38/50..  Training Loss: 0.557..  Validation Loss: 0.563..  Training Accuracy: 0.713..  Validation Accuracy: 0.710
Epoch: 39/50..  Training Loss: 0.561..  Validation Loss: 0.559..  Training Accuracy: 0.712..  Validation Accuracy: 0.711
Epoch: 40/50..  Training Loss: 0.560..  Validation Loss: 0.560..  Training Accuracy: 0.710..  Validation Accuracy: 0.708
Epoch: 41/50..  Training Loss: 0.559..  Validation Loss: 0.559..  Training Accuracy: 0.712..  Validation Accuracy: 0.710
Epoch: 42/50..  Training Loss: 0.557..  Validation Loss: 0.559..  Training Accuracy: 0.713..  Validation Accuracy: 0.710
Epoch: 43/50..  Training Loss: 0.556..  Validation Loss: 0.559..  Training Accuracy: 0.714..  Validation Accuracy: 0.709
Epoch: 44/50..  Training Loss: 0.557..  Validation Loss: 0.563..  Training Accuracy: 0.714..  Validation Accuracy: 0.701
Epoch: 45/50..  Training Loss: 0.557..  Validation Loss: 0.558..  Training Accuracy: 0.713..  Validation Accuracy: 0.713
Epoch: 46/50..  Training Loss: 0.560..  Validation Loss: 0.559..  Training Accuracy: 0.710..  Validation Accuracy: 0.711
Epoch: 47/50..  Training Loss: 0.558..  Validation Loss: 0.559..  Training Accuracy: 0.713..  Validation Accuracy: 0.713
Epoch: 48/50..  Training Loss: 0.556..  Validation Loss: 0.558..  Training Accuracy: 0.714..  Validation Accuracy: 0.714
Epoch: 49/50..  Training Loss: 0.556..  Validation Loss: 0.564..  Training Accuracy: 0.714..  Validation Accuracy: 0.699
Epoch: 50/50..  Training Loss: 0.559..  Validation Loss: 0.560..  Training Accuracy: 0.712..  Validation Accuracy: 0.711
fig = plt.subplots(figsize=(15,5))
plt.plot(train_losses,label="Training loss")
plt.plot(dev_losses,label="Validation loss")
plt.legend(frameon=False, fontsize=15)
[Plot: training vs. validation loss per epoch]
fig = plt.subplots(figsize=(15,5))
plt.plot(train_acc,label="Training accuracy")
plt.plot(dev_acc,label="Validation accuracy")
plt.legend(frameon=False, fontsize=15)
[Plot: training vs. validation accuracy per epoch]

We can see the loss and accuracy change with each epoch. We can tune the learning rate to get a better result, and it is worth experimenting by comparing several learning rates. These are the main steps in modelling with PyTorch, and they show that we can build models either with nn.Sequential or with custom modules based on torch.nn.
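
One hedged way to organize such a learning-rate comparison is to train a fresh copy of the model for a few epochs per candidate value and compare the validation losses. The candidate values, the number of epochs, and the name candidate below are chosen only for illustration:

for lr in [1e-1, 1e-2, 1e-3]:  # candidate learning rates, illustrative values
    candidate = Classifier(X_train.shape[1])
    opt = optim.Adam(candidate.parameters(), lr=lr)
    for epoch in range(5):  # short run, enough to compare trends
        X_, y_ = shuffle(X_train, y_train)
        for i in range(0, len(X_), 128):
            X_b = torch.tensor(X_.iloc[i:i+128, :].values).float()
            y_b = torch.tensor(y_.iloc[i:i+128].values)
            loss = criterion(candidate(X_b), y_b)
            opt.zero_grad()
            loss.backward()
            opt.step()
    with torch.no_grad():
        val_loss = criterion(candidate(X_dev_torch), y_dev_torch)
    print(f"lr={lr}: validation loss {val_loss.item():.3f}")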

# Experiment: add another hidden layer and train for more epochs with a larger learning rate

class Classifier_Layer(nn.Module):
    def __init__(self, input_size):
        super().__init__()
        self.hidden_1 = nn.Linear(input_size, 10)
        self.hidden_2 = nn.Linear(10, 10)
        self.hidden_3 = nn.Linear(10, 10)
        self.hidden_4 = nn.Linear(10, 10)
        self.output = nn.Linear(10, 2)
        
    def forward(self, x):
        z = F.relu(self.hidden_1(x))
        z = F.relu(self.hidden_2(z))
        z = F.relu(self.hidden_3(z))
        z = F.relu(self.hidden_4(z))
        out = F.log_softmax(self.output(z), dim=1)
        
        return out
model = Classifier_Layer(X_train.shape[1])
criterion = nn.NLLLoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

epochs = 100
batch_size = 150 # mini-batch gradient descent for a faster training process
train_losses,dev_losses,train_acc,dev_acc =[],[],[],[]

for epoch in range(epochs):
    X_,y_ =shuffle(X_train,y_train)
    running_loss=0
    running_acc=0
    iterations =0
    
    for i in range(0,len(X_),batch_size):
        iterations +=1
        b = i +batch_size
        X_batch = torch.tensor(X_.iloc[i:b,:].values).float()
        y_batch = torch.tensor(y_.iloc[i:b].values)
        
        pred = model(X_batch)
        loss = criterion(pred,y_batch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        running_loss +=loss.item()
        ps = torch.exp(pred)
        top_p,top_class = ps.topk(1,dim=1)
        running_acc +=accuracy_score(y_batch,top_class)
        
    dev_loss =0
    acc =0
    
    with torch.no_grad():
        pred_dev = model(X_dev_torch)
        dev_loss =criterion(pred_dev,y_dev_torch)
        
        ps_dev = torch.exp(pred_dev)
        top_p,top_class_dev = ps_dev.topk(1,dim=1)
        acc +=accuracy_score(y_dev_torch,top_class_dev)
        
    train_losses.append(running_loss/iterations)
    dev_losses.append(dev_loss)
    train_acc.append(running_acc/iterations)
    dev_acc.append(acc)
    print("Epoch: {}/{}.. ".format(epoch+1, epochs),
          "Training Loss: {:.3f}.. ".format(running_loss/iterations),
          "Validation Loss: {:.3f}.. ".format(dev_loss),
          "Training Accuracy: {:.3f}.. ".format(running_acc/iterations),
          "Validation Accuracy: {:.3f}".format(acc))
        
Epoch: 1/100..  Training Loss: 0.618..  Validation Loss: 0.595..  Training Accuracy: 0.662..  Validation Accuracy: 0.688
Epoch: 2/100..  Training Loss: 0.594..  Validation Loss: 0.587..  Training Accuracy: 0.686..  Validation Accuracy: 0.690
Epoch: 3/100..  Training Loss: 0.588..  Validation Loss: 0.580..  Training Accuracy: 0.687..  Validation Accuracy: 0.693
Epoch: 4/100..  Training Loss: 0.584..  Validation Loss: 0.583..  Training Accuracy: 0.692..  Validation Accuracy: 0.692
Epoch: 5/100..  Training Loss: 0.583..  Validation Loss: 0.577..  Training Accuracy: 0.695..  Validation Accuracy: 0.694
Epoch: 6/100..  Training Loss: 0.580..  Validation Loss: 0.573..  Training Accuracy: 0.696..  Validation Accuracy: 0.703
Epoch: 7/100..  Training Loss: 0.575..  Validation Loss: 0.569..  Training Accuracy: 0.703..  Validation Accuracy: 0.704
Epoch: 8/100..  Training Loss: 0.574..  Validation Loss: 0.574..  Training Accuracy: 0.704..  Validation Accuracy: 0.709
Epoch: 9/100..  Training Loss: 0.571..  Validation Loss: 0.582..  Training Accuracy: 0.705..  Validation Accuracy: 0.708
Epoch: 10/100..  Training Loss: 0.571..  Validation Loss: 0.564..  Training Accuracy: 0.706..  Validation Accuracy: 0.712
Epoch: 11/100..  Training Loss: 0.569..  Validation Loss: 0.565..  Training Accuracy: 0.707..  Validation Accuracy: 0.712
Epoch: 12/100..  Training Loss: 0.569..  Validation Loss: 0.568..  Training Accuracy: 0.707..  Validation Accuracy: 0.705
Epoch: 13/100..  Training Loss: 0.566..  Validation Loss: 0.569..  Training Accuracy: 0.710..  Validation Accuracy: 0.706
Epoch: 14/100..  Training Loss: 0.566..  Validation Loss: 0.563..  Training Accuracy: 0.709..  Validation Accuracy: 0.713
Epoch: 15/100..  Training Loss: 0.566..  Validation Loss: 0.561..  Training Accuracy: 0.709..  Validation Accuracy: 0.711
Epoch: 16/100..  Training Loss: 0.564..  Validation Loss: 0.562..  Training Accuracy: 0.712..  Validation Accuracy: 0.715
Epoch: 17/100..  Training Loss: 0.563..  Validation Loss: 0.562..  Training Accuracy: 0.712..  Validation Accuracy: 0.713
Epoch: 18/100..  Training Loss: 0.566..  Validation Loss: 0.572..  Training Accuracy: 0.708..  Validation Accuracy: 0.701
Epoch: 19/100..  Training Loss: 0.564..  Validation Loss: 0.565..  Training Accuracy: 0.711..  Validation Accuracy: 0.710
Epoch: 20/100..  Training Loss: 0.564..  Validation Loss: 0.561..  Training Accuracy: 0.712..  Validation Accuracy: 0.712
Epoch: 21/100..  Training Loss: 0.562..  Validation Loss: 0.566..  Training Accuracy: 0.710..  Validation Accuracy: 0.706
Epoch: 22/100..  Training Loss: 0.563..  Validation Loss: 0.571..  Training Accuracy: 0.711..  Validation Accuracy: 0.705
Epoch: 23/100..  Training Loss: 0.563..  Validation Loss: 0.563..  Training Accuracy: 0.710..  Validation Accuracy: 0.711
Epoch: 24/100..  Training Loss: 0.559..  Validation Loss: 0.562..  Training Accuracy: 0.711..  Validation Accuracy: 0.710
Epoch: 25/100..  Training Loss: 0.561..  Validation Loss: 0.564..  Training Accuracy: 0.711..  Validation Accuracy: 0.712
Epoch: 26/100..  Training Loss: 0.561..  Validation Loss: 0.565..  Training Accuracy: 0.710..  Validation Accuracy: 0.704
Epoch: 27/100..  Training Loss: 0.560..  Validation Loss: 0.563..  Training Accuracy: 0.712..  Validation Accuracy: 0.714
Epoch: 28/100..  Training Loss: 0.560..  Validation Loss: 0.566..  Training Accuracy: 0.715..  Validation Accuracy: 0.701
Epoch: 29/100..  Training Loss: 0.559..  Validation Loss: 0.568..  Training Accuracy: 0.712..  Validation Accuracy: 0.710
Epoch: 30/100..  Training Loss: 0.561..  Validation Loss: 0.566..  Training Accuracy: 0.711..  Validation Accuracy: 0.712
Epoch: 31/100..  Training Loss: 0.559..  Validation Loss: 0.560..  Training Accuracy: 0.712..  Validation Accuracy: 0.716
Epoch: 32/100..  Training Loss: 0.558..  Validation Loss: 0.561..  Training Accuracy: 0.713..  Validation Accuracy: 0.715
Epoch: 33/100..  Training Loss: 0.560..  Validation Loss: 0.559..  Training Accuracy: 0.711..  Validation Accuracy: 0.713
Epoch: 34/100..  Training Loss: 0.558..  Validation Loss: 0.565..  Training Accuracy: 0.714..  Validation Accuracy: 0.705
Epoch: 35/100..  Training Loss: 0.560..  Validation Loss: 0.560..  Training Accuracy: 0.711..  Validation Accuracy: 0.713
Epoch: 36/100..  Training Loss: 0.557..  Validation Loss: 0.556..  Training Accuracy: 0.714..  Validation Accuracy: 0.715
Epoch: 37/100..  Training Loss: 0.558..  Validation Loss: 0.559..  Training Accuracy: 0.713..  Validation Accuracy: 0.713
Epoch: 38/100..  Training Loss: 0.557..  Validation Loss: 0.565..  Training Accuracy: 0.712..  Validation Accuracy: 0.714
Epoch: 39/100..  Training Loss: 0.559..  Validation Loss: 0.559..  Training Accuracy: 0.711..  Validation Accuracy: 0.714
Epoch: 40/100..  Training Loss: 0.559..  Validation Loss: 0.566..  Training Accuracy: 0.714..  Validation Accuracy: 0.713
Epoch: 41/100..  Training Loss: 0.556..  Validation Loss: 0.559..  Training Accuracy: 0.714..  Validation Accuracy: 0.713
Epoch: 42/100..  Training Loss: 0.557..  Validation Loss: 0.575..  Training Accuracy: 0.714..  Validation Accuracy: 0.716
Epoch: 43/100..  Training Loss: 0.558..  Validation Loss: 0.559..  Training Accuracy: 0.713..  Validation Accuracy: 0.715
Epoch: 44/100..  Training Loss: 0.555..  Validation Loss: 0.556..  Training Accuracy: 0.716..  Validation Accuracy: 0.715
Epoch: 45/100..  Training Loss: 0.557..  Validation Loss: 0.560..  Training Accuracy: 0.714..  Validation Accuracy: 0.715
Epoch: 46/100..  Training Loss: 0.556..  Validation Loss: 0.560..  Training Accuracy: 0.715..  Validation Accuracy: 0.714
Epoch: 47/100..  Training Loss: 0.556..  Validation Loss: 0.561..  Training Accuracy: 0.715..  Validation Accuracy: 0.709
Epoch: 48/100..  Training Loss: 0.553..  Validation Loss: 0.558..  Training Accuracy: 0.716..  Validation Accuracy: 0.714
Epoch: 49/100..  Training Loss: 0.555..  Validation Loss: 0.561..  Training Accuracy: 0.712..  Validation Accuracy: 0.713
Epoch: 50/100..  Training Loss: 0.555..  Validation Loss: 0.564..  Training Accuracy: 0.715..  Validation Accuracy: 0.703
Epoch: 51/100..  Training Loss: 0.556..  Validation Loss: 0.566..  Training Accuracy: 0.713..  Validation Accuracy: 0.699
Epoch: 52/100..  Training Loss: 0.555..  Validation Loss: 0.556..  Training Accuracy: 0.715..  Validation Accuracy: 0.713
Epoch: 53/100..  Training Loss: 0.553..  Validation Loss: 0.554..  Training Accuracy: 0.713..  Validation Accuracy: 0.716
Epoch: 54/100..  Training Loss: 0.553..  Validation Loss: 0.557..  Training Accuracy: 0.715..  Validation Accuracy: 0.713
Epoch: 55/100..  Training Loss: 0.555..  Validation Loss: 0.555..  Training Accuracy: 0.715..  Validation Accuracy: 0.713
Epoch: 56/100..  Training Loss: 0.552..  Validation Loss: 0.558..  Training Accuracy: 0.715..  Validation Accuracy: 0.712
Epoch: 57/100..  Training Loss: 0.555..  Validation Loss: 0.557..  Training Accuracy: 0.713..  Validation Accuracy: 0.712
Epoch: 58/100..  Training Loss: 0.554..  Validation Loss: 0.565..  Training Accuracy: 0.716..  Validation Accuracy: 0.699
Epoch: 59/100..  Training Loss: 0.554..  Validation Loss: 0.560..  Training Accuracy: 0.715..  Validation Accuracy: 0.712
Epoch: 60/100..  Training Loss: 0.554..  Validation Loss: 0.557..  Training Accuracy: 0.713..  Validation Accuracy: 0.711
Epoch: 61/100..  Training Loss: 0.552..  Validation Loss: 0.554..  Training Accuracy: 0.715..  Validation Accuracy: 0.713
Epoch: 62/100..  Training Loss: 0.553..  Validation Loss: 0.558..  Training Accuracy: 0.716..  Validation Accuracy: 0.712
Epoch: 63/100..  Training Loss: 0.553..  Validation Loss: 0.555..  Training Accuracy: 0.715..  Validation Accuracy: 0.716
Epoch: 64/100..  Training Loss: 0.554..  Validation Loss: 0.555..  Training Accuracy: 0.715..  Validation Accuracy: 0.717
Epoch: 65/100..  Training Loss: 0.555..  Validation Loss: 0.558..  Training Accuracy: 0.716..  Validation Accuracy: 0.712
Epoch: 66/100..  Training Loss: 0.552..  Validation Loss: 0.555..  Training Accuracy: 0.716..  Validation Accuracy: 0.716
Epoch: 67/100..  Training Loss: 0.553..  Validation Loss: 0.557..  Training Accuracy: 0.717..  Validation Accuracy: 0.713
Epoch: 68/100..  Training Loss: 0.552..  Validation Loss: 0.563..  Training Accuracy: 0.717..  Validation Accuracy: 0.705
Epoch: 69/100..  Training Loss: 0.552..  Validation Loss: 0.561..  Training Accuracy: 0.715..  Validation Accuracy: 0.705
Epoch: 70/100..  Training Loss: 0.553..  Validation Loss: 0.558..  Training Accuracy: 0.714..  Validation Accuracy: 0.716
Epoch: 71/100..  Training Loss: 0.554..  Validation Loss: 0.556..  Training Accuracy: 0.715..  Validation Accuracy: 0.712
Epoch: 72/100..  Training Loss: 0.553..  Validation Loss: 0.557..  Training Accuracy: 0.717..  Validation Accuracy: 0.716
Epoch: 73/100..  Training Loss: 0.551..  Validation Loss: 0.559..  Training Accuracy: 0.717..  Validation Accuracy: 0.716
Epoch: 74/100..  Training Loss: 0.552..  Validation Loss: 0.562..  Training Accuracy: 0.716..  Validation Accuracy: 0.696
Epoch: 75/100..  Training Loss: 0.553..  Validation Loss: 0.556..  Training Accuracy: 0.715..  Validation Accuracy: 0.715
Epoch: 76/100..  Training Loss: 0.553..  Validation Loss: 0.555..  Training Accuracy: 0.717..  Validation Accuracy: 0.715
Epoch: 77/100..  Training Loss: 0.550..  Validation Loss: 0.556..  Training Accuracy: 0.718..  Validation Accuracy: 0.714
Epoch: 78/100..  Training Loss: 0.551..  Validation Loss: 0.565..  Training Accuracy: 0.717..  Validation Accuracy: 0.700
Epoch: 79/100..  Training Loss: 0.552..  Validation Loss: 0.558..  Training Accuracy: 0.718..  Validation Accuracy: 0.714
Epoch: 80/100..  Training Loss: 0.552..  Validation Loss: 0.555..  Training Accuracy: 0.717..  Validation Accuracy: 0.712
Epoch: 81/100..  Training Loss: 0.553..  Validation Loss: 0.555..  Training Accuracy: 0.716..  Validation Accuracy: 0.712
Epoch: 82/100..  Training Loss: 0.551..  Validation Loss: 0.557..  Training Accuracy: 0.715..  Validation Accuracy: 0.710
Epoch: 83/100..  Training Loss: 0.550..  Validation Loss: 0.553..  Training Accuracy: 0.717..  Validation Accuracy: 0.716
Epoch: 84/100..  Training Loss: 0.550..  Validation Loss: 0.557..  Training Accuracy: 0.718..  Validation Accuracy: 0.711
Epoch: 85/100..  Training Loss: 0.551..  Validation Loss: 0.555..  Training Accuracy: 0.716..  Validation Accuracy: 0.715
Epoch: 86/100..  Training Loss: 0.551..  Validation Loss: 0.559..  Training Accuracy: 0.716..  Validation Accuracy: 0.704
Epoch: 87/100..  Training Loss: 0.553..  Validation Loss: 0.556..  Training Accuracy: 0.715..  Validation Accuracy: 0.717
Epoch: 88/100..  Training Loss: 0.551..  Validation Loss: 0.554..  Training Accuracy: 0.716..  Validation Accuracy: 0.714
Epoch: 89/100..  Training Loss: 0.551..  Validation Loss: 0.559..  Training Accuracy: 0.715..  Validation Accuracy: 0.712
Epoch: 90/100..  Training Loss: 0.549..  Validation Loss: 0.554..  Training Accuracy: 0.717..  Validation Accuracy: 0.718
Epoch: 91/100..  Training Loss: 0.551..  Validation Loss: 0.556..  Training Accuracy: 0.716..  Validation Accuracy: 0.717
Epoch: 92/100..  Training Loss: 0.550..  Validation Loss: 0.554..  Training Accuracy: 0.717..  Validation Accuracy: 0.715
Epoch: 93/100..  Training Loss: 0.549..  Validation Loss: 0.559..  Training Accuracy: 0.720..  Validation Accuracy: 0.705
Epoch: 94/100..  Training Loss: 0.549..  Validation Loss: 0.558..  Training Accuracy: 0.717..  Validation Accuracy: 0.706
Epoch: 95/100..  Training Loss: 0.549..  Validation Loss: 0.559..  Training Accuracy: 0.717..  Validation Accuracy: 0.714
Epoch: 96/100..  Training Loss: 0.549..  Validation Loss: 0.553..  Training Accuracy: 0.719..  Validation Accuracy: 0.718
Epoch: 97/100..  Training Loss: 0.551..  Validation Loss: 0.555..  Training Accuracy: 0.716..  Validation Accuracy: 0.718
Epoch: 98/100..  Training Loss: 0.551..  Validation Loss: 0.558..  Training Accuracy: 0.716..  Validation Accuracy: 0.711
Epoch: 99/100..  Training Loss: 0.549..  Validation Loss: 0.557..  Training Accuracy: 0.719..  Validation Accuracy: 0.716
Epoch: 100/100..  Training Loss: 0.547..  Validation Loss: 0.563..  Training Accuracy: 0.719..  Validation Accuracy: 0.714
fig = plt.subplots(figsize=(15,5))
plt.plot(train_losses,label="Training loss")
plt.plot(dev_losses,label="Validation loss")
plt.legend(frameon=False, fontsize=15)
[Plot: training vs. validation loss per epoch]
fig = plt.subplots(figsize=(15,5))
plt.plot(train_acc,label="Training accuracy")
plt.plot(dev_acc,label="Validation accuracy")
plt.legend(frameon=False, fontsize=15)
[Plot: training vs. validation accuracy per epoch]
You can keep experimenting by changing the architecture of the model.
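
The test tensors created earlier (X_test_torch and y_test_torch) have not been used yet. Below is a minimal, hedged sketch of a final held-out evaluation and of saving the trained weights; the file name is illustrative:

# evaluate the trained model once on the untouched test set
model.eval()
with torch.no_grad():
    test_pred = model(X_test_torch)
    test_loss = criterion(test_pred, y_test_torch)
    top_p, top_class = torch.exp(test_pred).topk(1, dim=1)
    test_acc = accuracy_score(y_test_torch, top_class)
print(f"Test loss: {test_loss.item():.3f}, Test accuracy: {test_acc:.3f}")

# persist the parameters for later reuse (file name chosen for illustration)
torch.save(model.state_dict(), "credit_default_classifier.pt")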