Lesson 2: Best PyTorch Tutorial for Deep Learning

Introduction

Welcome to Lesson 2 of our deep learning series! In this lesson, we work through a hands-on PyTorch tutorial that will help you master core deep learning techniques. PyTorch is well known for its flexibility, user-friendly interface, and strong GPU acceleration.

Throughout this tutorial, we will walk you through the essential steps of building and training neural networks using PyTorch. We will cover various topics such as working with matrices, implementing neural network models, and harnessing the power of dynamic computation graphs.

So, let’s jump right in and take your deep learning journey to new heights with PyTorch!

What is PyTorch?

PyTorch is a Python-based scientific computing package designed for two main purposes:

  1. GPU-Accelerated NumPy Replacement: Harness the power of GPUs for numerical computations.
  2. Deep Learning Research: Offers maximum flexibility and speed for developing deep learning models.
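
To make the first point concrete, here is a minimal sketch; the device check makes the same code run whether or not a GPU is present:

import torch

# Tensors mirror NumPy arrays but can live on the GPU when one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.rand(3, 3, device=device)
y = (x @ x).sum()  # the same operation runs on CPU or GPU transparently
print(y)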

Advantages

  • Interactive Debugging: Easier to debug and visualize compared to many other frameworks.
  • Dynamic Graphs: Excellent support for dynamic computational graphs.
  • Facebook Support: Backed by a robust organization.
  • Flexible APIs: Combines high-level and low-level APIs for versatile use.

Disadvantages

  • Maturity: Not as mature as some alternative frameworks.
  • Resources: Limited references and resources beyond official documentation.

Note

This guide assumes you are familiar with the basics of neural networks. If not, please refer to my beginner’s tutorial. The tutorial provides foundational knowledge crucial for understanding and using PyTorch for neural networks.

How to Install PyTorch

To install PyTorch, you can easily follow these steps:

  1. Visit the PyTorch Website: Go to the official PyTorch website at pytorch.org.
  2. Select Your Installation Options: In the “Get Started” section, choose the configuration that matches your system: operating system, package manager (pip or conda), Python version, and CUDA support (for GPU acceleration).
  3. Copy the Generated Installation Command: Once you’ve made your selections, the website generates an installation command tailored to your preferences. Copy this command.
  4. Open a Terminal or Command Prompt: Open the terminal or command prompt on your system.
  5. Paste and Execute the Installation Command: Paste the copied command and press Enter. This downloads and installs PyTorch, along with its dependencies, based on your chosen configuration.
  6. Verify the Installation: After the installation is complete, confirm that PyTorch has been installed correctly by opening a Python interpreter and importing the library:
   import torch

If there are no errors, it means PyTorch has been successfully installed.

  7. Additional Steps (Optional): Depending on your specific requirements or system configuration, you may need additional steps, such as setting up CUDA for GPU support or configuring a virtual environment.

By following these steps, you’ll be able to install PyTorch on your system and start using it for deep learning tasks.

If you encounter any issues during the installation process, referring to the official PyTorch documentation or seeking assistance from online communities can be really helpful.
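
As a slightly fuller sanity check, here is a minimal sketch (whether CUDA is reported as available depends on the install options you chose):

import torch

print(torch.__version__)          # installed PyTorch version
print(torch.cuda.is_available())  # True only if a CUDA-enabled GPU is usable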

Is PyTorch better than TensorFlow?

There are several factors to consider when deciding whether PyTorch is superior to TensorFlow. These factors include your specific use case, personal preferences, and requirements.

PyTorch and TensorFlow are both robust deep learning frameworks that are extensively utilized in the machine learning community. Each framework has its own unique strengths and benefits.

| Feature | PyTorch | TensorFlow |
|---|---|---|
| Computational Graph | Dynamic (eager execution by default) | Originally static; TensorFlow 2.0 introduced eager execution |
| Interface | More Pythonic, easier to write and debug | Less Pythonic, but improving with TensorFlow 2.0 |
| Community Support | Rapidly growing, vibrant community | Large and established community |
| Documentation | Extensive and well-maintained | Extensive and well-maintained |
| Ecosystem | Growing ecosystem with rich libraries and tools | Mature ecosystem with extensive pre-trained models and tools |
| Deployment | Suitable for research and development | Strong support for large-scale, production-level deployments |
| Distributed Computing | Less mature support for distributed computing | Strong support for distributed computing and deployment on various platforms |
| Mobile and Edge Deployment | Supported, but with less emphasis | Strong support for deployment on mobile and edge devices |
| Industry Adoption | Increasing adoption in research and industry | Widely adopted in industry and academia |
| Learning Curve | Generally considered easier to learn | Steeper learning curve, especially for beginners |
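
To illustrate the dynamic-graph row of the table, here is a minimal sketch: ordinary Python control flow decides the computation, and autograd still tracks whatever path was actually taken.

import torch

x = torch.randn(3, requires_grad=True)
y = x
while y.norm() < 10:   # a plain Python loop; the graph is built on the fly
    y = y * 2
y.sum().backward()     # gradients flow through the path that was executed
print(x.grad)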

Is PyTorch written in C or C++?

PyTorch is mainly coded in C++. Although the essential features and computational backends of PyTorch are developed in C++, PyTorch also offers Python bindings to make it easier to use and integrate with the Python programming language.

This unique blend of C++ implementation for high performance and Python interface for user-friendliness makes PyTorch an incredibly robust and adaptable deep learning framework.

Is PyTorch same as Python?

No, PyTorch and Python are not the same. PyTorch is a deep learning framework that is primarily written in C++, but it has Python bindings which enable users to use PyTorch with Python.

Python is a high-level programming language that is famous for its simplicity and readability, whereas PyTorch is a library or framework that is built using a combination of Python and other languages like C++ for its core functionality.

To summarize, PyTorch is a tool that is used within the Python programming environment to make deep learning tasks easier, but it is not equivalent to Python itself.


Let’s Start with the Deep Learning Code

# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in 

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list the files in the input directory

import os
print(os.listdir("../input"))

# Any results you write to the current directory are saved as output.
['train.csv', 'test.csv', 'sample_submission.csv']

Basics of Pytorch

Matrices

Tensors in PyTorch are essentially matrices or arrays. So, a 3×3 matrix is called a 3×3 tensor in PyTorch.

Now, let’s take a look at how we can work with arrays using NumPy, which I’m sure you’re already familiar with.

To create a NumPy array, we use the np.array() method.

There are a couple of useful functions we can use with arrays:

  • The type() function helps us determine the type of the array. In this case, it will be a NumPy array.
  • The np.shape() function returns the shape of the array, which is represented as rows x columns.
# import numpy library
import numpy as np

# numpy array
array = [[1,2,3],[4,5,6]]
first_array = np.array(array) # 2x3 array
print("Array Type: {}".format(type(first_array))) # type
print("Array Shape: {}".format(np.shape(first_array))) # shape
print(first_array)
Array Type: <class 'numpy.ndarray'>
Array Shape: (2, 3)
[[1 2 3]
 [4 5 6]]

We’ve looked at NumPy arrays. Now, let’s see how to implement tensors (PyTorch arrays).

First, import the PyTorch library with import torch.

We create a tensor using the torch.Tensor() method.

  • type(): Determines the type of the array. In this example, it will be a tensor.
  • shape: Returns the shape of the array in the form of rows x columns.
# import pytorch library
import torch

# pytorch array
tensor = torch.Tensor(array)
print("Array Type: {}".format(tensor.type)) # type
print("Array Shape: {}".format(tensor.shape)) # shape
print(tensor)
Array Type: torch.FloatTensor
Array Shape: torch.Size([2, 3])
tensor([[1., 2., 3.],
        [4., 5., 6.]])

Creating pre-filled arrays (allocation) is a fundamental operation. Let’s learn how to do it in PyTorch by comparing it to NumPy.

Here’s how similar operations are performed in both libraries:

  • np.ones() in NumPy is equivalent to torch.ones() in PyTorch.
  • np.random.rand() in NumPy is equivalent to torch.rand() in PyTorch.
# numpy ones
print("Numpy {}\n".format(np.ones((2,3))))

# pytorch ones
print(torch.ones((2,3)))
Numpy [[1. 1. 1.]
 [1. 1. 1.]]

tensor([[1., 1., 1.],
        [1., 1., 1.]])
# numpy random
print("Numpy {}\n".format(np.random.rand(2,3)))

# pytorch random
print(torch.rand(2,3))
Numpy [[0.9917393  0.10286286 0.61706085]
 [0.03963639 0.01170117 0.35111965]]

tensor([[0.1462, 0.7302, 0.8815],
        [0.5960, 0.1152, 0.6170]])

Even when using PyTorch for neural networks, I often prefer using NumPy for visualization and examination. Therefore, I usually convert the tensor results from the neural network to NumPy arrays.

Here’s how to convert between tensors and NumPy arrays:

  • torch.from_numpy(): Converts a NumPy array to a tensor.
  • tensor.numpy(): Converts a tensor to a NumPy array.
# random numpy array
array = np.random.rand(2,2)
print("{} {}\n".format(type(array),array))

# from numpy to tensor
from_numpy_to_tensor = torch.from_numpy(array)
print("{}\n".format(from_numpy_to_tensor))

# from tensor to numpy
tensor = from_numpy_to_tensor
from_tensor_to_numpy = tensor.numpy()
print("{} {}\n".format(type(from_tensor_to_numpy),from_tensor_to_numpy))
<class 'numpy.ndarray'> [[0.27035945 0.97245412]
[0.63432523 0.94168862]]

tensor([[0.2704, 0.9725],
[0.6343, 0.9417]], dtype=torch.float64)

<class 'numpy.ndarray'> [[0.27035945 0.97245412]
[0.63432523 0.94168862]]
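
One caveat worth knowing: torch.from_numpy() does not copy the data. The tensor and the NumPy array share the same memory, so an in-place change to one is visible in the other:

shared = np.ones(3)
t = torch.from_numpy(shared)
shared[0] = 5
print(t)  # tensor([5., 1., 1.], dtype=torch.float64) -- the change is visible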

Basic Math with PyTorch

  • Resize: view()

Given a and b as tensors:

  • Addition: torch.add(a, b) or a + b
  • Subtraction: a.sub(b) or a - b
  • Element-wise Multiplication: torch.mul(a, b) or a * b
  • Element-wise Division: torch.div(a, b) or a / b
  • Mean: a.mean()
  • Standard Deviation (std): a.std()
# create tensor 
tensor = torch.ones(3,3)
print("\n",tensor)

# Resize
print("{}{}\n".format(tensor.view(9).shape,tensor.view(9)))

# Addition
print("Addition: {}\n".format(torch.add(tensor,tensor)))

# Subtraction
print("Subtraction: {}\n".format(tensor.sub(tensor)))

# Element wise multiplication
print("Element wise multiplication: {}\n".format(torch.mul(tensor,tensor)))

# Element wise division
print("Element wise division: {}\n".format(torch.div(tensor,tensor)))

# Mean
tensor = torch.Tensor([1,2,3,4,5])
print("Mean: {}".format(tensor.mean()))

# Standard deviation (std)
print("std: {}".format(tensor.std()))
 tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
torch.Size([9])tensor([1., 1., 1., 1., 1., 1., 1., 1., 1.])

Addition: tensor([[2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.]])

Subtraction: tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])

Element wise multiplication: tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])

Element wise division: tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])

Mean: 3.0
std: 1.5811388492584229
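
A side note on view(), used above: it requires the new shape to be compatible with the tensor’s memory layout (contiguous). reshape() is the more forgiving alternative, copying the data only when it has to:

t = torch.ones(3,3)
print(t.view(9))         # fine: memory is contiguous
print(t.t().reshape(9))  # the transpose is non-contiguous; view(9) would raise here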

Variables

Variables in PyTorch accumulate gradients.

In neural networks, gradients are computed during backpropagation, so handling them correctly is essential.

If you’re unfamiliar with neural networks, I recommend checking out my deep learning tutorial first, as I won’t delve into detailed concepts such as optimization, loss functions, or backpropagation here: Deep learning tutorial.

The key difference between a variable and a plain tensor is that a variable accumulates gradients.

Variables support the same mathematical operations as tensors; they are what make backward propagation in neural networks possible.

# import variable from pytorch library
from torch.autograd import Variable

# define variable
var = Variable(torch.ones(3), requires_grad = True)
var
tensor([1., 1., 1.], requires_grad=True)
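
A note on API versions: since PyTorch 0.4, Variable has been merged into Tensor, so on recent versions you get the same behavior without the wrapper:

# equivalent on PyTorch 0.4+: no Variable needed
var = torch.ones(3, requires_grad=True)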

Let’s consider the equation ( y = x^2 ).

Define ( x = [2,4] ) as a variable.

After calculation, we find ( y = [4,16] ) (( y = x^2 )).

Recapping the equation, ( o = \frac{1}{2}\sum(y) = \frac{1}{2}\sum(x^2) ).

Differentiating, ( \frac{\partial o}{\partial x_i} = x_i ).

As a result, the gradients equal ( x ): ( \text{gradients} = [2, 4] ).

Let’s implement this.

# lets make basic backward propagation
# we have an equation that is y = x^2
array = [2,4]
tensor = torch.Tensor(array)
x = Variable(tensor, requires_grad = True)
y = x**2
print(" y =  ",y)

# recap o equation o = 1/2*sum(y)
o = (1/2)*sum(y)
print(" o =  ",o)

# backward
o.backward() # calculates gradients

# As defined above, variables accumulate gradients. The only variable here is x,
# so x should have gradients after backward().
# Let's look at them with x.grad
print("gradients: ",x.grad)
 y =   tensor([ 4., 16.], grad_fn=<PowBackward0>)
o = tensor(10., grad_fn=<MulBackward0>)
gradients: tensor([2., 4.])
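
Because variables accumulate gradients, a second backward() call would add to x.grad rather than replace it. Reset the gradient before computing a fresh one (optimizer.zero_grad() does exactly this for model parameters during training):

x.grad.zero_()   # clear the accumulated gradient
o2 = (x**3).sum()
o2.backward()
print(x.grad)    # tensor([12., 48.]) = 3 * x^2 at x = [2, 4]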

Linear Regression

For a detailed explanation of linear regression, refer to the “Regression” section of my machine learning tutorial: Machine Learning Tutorial

In linear regression, we have the equation ( y = Ax + B ):

  • ( A ) is the slope of the line.
  • ( B ) is the bias, the point where the line intersects the y-axis.

For example, consider a car company. If the car price is low, we sell more cars. Conversely, if the car price is high, we sell fewer cars. This is a known fact, and we have a dataset that reflects this relationship.

The question arises: What will be the number of cars sold if the car price is $100?

# As a car company we collect this data from previous selling
# lets define car prices
car_prices_array = [3,4,5,6,7,8,9]
car_price_np = np.array(car_prices_array,dtype=np.float32)
car_price_np = car_price_np.reshape(-1,1)
car_price_tensor = Variable(torch.from_numpy(car_price_np))

# lets define number of car sell
number_of_car_sell_array = [ 7.5, 7, 6.5, 6.0, 5.5, 5.0, 4.5]
number_of_car_sell_np = np.array(number_of_car_sell_array,dtype=np.float32)
number_of_car_sell_np = number_of_car_sell_np.reshape(-1,1)
number_of_car_sell_tensor = Variable(torch.from_numpy(number_of_car_sell_np))

# lets visualize our data
import matplotlib.pyplot as plt
plt.scatter(car_prices_array,number_of_car_sell_array)
plt.xlabel("Car Price $")
plt.ylabel("Number of Car Sell")
plt.title("Car Price$ VS Number of Car Sell")
plt.show()

Analyzing Collected Data

The plot before us represents our collected data.

Our primary inquiry: What will be the number of cars sold if the car price is $100?

To address this question, we’ll employ linear regression. The objective is to fit a line to this data with minimal error.

Steps of Linear Regression:

  1. Create the LinearRegression class.
  2. Instantiate the model from this class.
  3. Calculate the Mean Squared Error (MSE).
  4. Optimize the model using Stochastic Gradient Descent (SGD).
  5. Conduct Backpropagation.
  6. Make Predictions.

Let’s proceed to implement this using PyTorch.

# Linear Regression with Pytorch

# libraries
import torch      
from torch.autograd import Variable     
import torch.nn as nn 
import warnings
warnings.filterwarnings("ignore")

# create class
class LinearRegression(nn.Module):
    def __init__(self,input_size,output_size):
        # super function. It inherits from nn.Module so we can access everything in nn.Module
        super(LinearRegression,self).__init__()
        # Linear function.
        self.linear = nn.Linear(input_size,output_size)

    def forward(self,x):
        return self.linear(x)
    
# define model
input_dim = 1
output_dim = 1
model = LinearRegression(input_dim,output_dim) # input and output size are 1

# MSE
mse = nn.MSELoss()

# Optimization (find parameters that minimize error)
learning_rate = 0.02   # how fast we reach best parameters
optimizer = torch.optim.SGD(model.parameters(),lr = learning_rate)

# train model
loss_list = []
iteration_number = 1001
for iteration in range(iteration_number):
        
    # optimization
    optimizer.zero_grad() 
    
    # Forward to get output
    results = model(car_price_tensor)
    
    # Calculate Loss
    loss = mse(results, number_of_car_sell_tensor)
    
    # backward propagation
    loss.backward()
    
    # Updating parameters
    optimizer.step()
    
    # store loss
    loss_list.append(loss.data)
    
    # print loss
    if(iteration % 50 == 0):
        print('epoch {}, loss {}'.format(iteration, loss.data))

plt.plot(range(iteration_number),loss_list)
plt.xlabel("Number of Iterations")
plt.ylabel("Loss")
plt.show()
epoch 0, loss 13.964462280273438
epoch 50, loss 4.808793544769287
epoch 100, loss 3.2495040893554688
epoch 150, loss 2.195826768875122
epoch 200, loss 1.4838119745254517
epoch 250, loss 1.0026743412017822
epoch 300, loss 0.6775489449501038
epoch 350, loss 0.4578486979007721
epoch 400, loss 0.30938759446144104
epoch 450, loss 0.20906630158424377
epoch 500, loss 0.14127501845359802
epoch 550, loss 0.09546550363302231
epoch 600, loss 0.06451006978750229
epoch 650, loss 0.043592456728219986
epoch 700, loss 0.029456866905093193
epoch 750, loss 0.019905563443899155
epoch 800, loss 0.013450978323817253
epoch 850, loss 0.009089605882763863
epoch 900, loss 0.006142175756394863
epoch 950, loss 0.004150448366999626
epoch 1000, loss 0.0028045533690601587

After 1001 iterations, the loss is nearly zero, as evident from the plot or the loss recorded at epoch 1000. With this, we now possess a trained model.

Using this trained model, let’s predict the number of cars sold at each price.

# predict the number of cars sold with our trained model
predicted = model(car_price_tensor).data.numpy()
plt.scatter(car_prices_array,number_of_car_sell_array,label = "original data",color ="red")
plt.scatter(car_prices_array,predicted,label = "predicted data",color ="blue")

# predict the number of cars sold if the car price is $10
#predicted_10 = model(torch.from_numpy(np.array([[10.0]], dtype=np.float32))).data.numpy()
#plt.scatter(10,predicted_10,label = "car price 10$",color ="green")
plt.legend()
plt.xlabel("Car Price $")
plt.ylabel("Number of Car Sell")
plt.title("Original vs Predicted values")
plt.show()
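
To query the trained model at a single new price (for example the $10 case in the commented lines above), a minimal sketch:

# one new price, shape (1, 1) and float32 to match the training data
new_price = torch.tensor([[10.0]])
predicted_10 = model(new_price).data.numpy()
print(predicted_10)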

Logistic Regression

Linear regression isn’t suitable for classification tasks, leading us to logistic regression.

By combining linear regression with the logistic function (softmax), we derive logistic regression.
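
For reference, softmax turns the raw linear scores ( z ) into class probabilities: ( \text{softmax}(z)_j = \frac{e^{z_j}}{\sum_k e^{z_k}} ), so the largest score receives the largest probability.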

Steps of Logistic Regression:

| Step | Description |
|---|---|
| 1. Import Libraries | Import the necessary libraries. |
| 2. Prepare Dataset | Use the MNIST dataset; normalize the data; split into training and testing sets. |
| 3. Create Feature and Target Tensors | Define batch size and epoch for training. |
| 4. TensorDataset() and DataLoader() | Prepare data for training and testing. |
| 5. Visualize Dataset | View one of the images in the dataset. |
| 6. Create Logistic Regression Model | Same structure as linear regression, but with the logistic function folded into the loss function. |
| 7. Instantiate Model | Define input and output dimensions. |
| 8. Instantiate Loss | Use cross-entropy loss with softmax. |
| 9. Instantiate Optimizer | Choose the Stochastic Gradient Descent (SGD) optimizer. |
| 10. Train the Model | Observe the loss decreasing and accuracy increasing during training. |
| 11. Prediction | Visualize the learning progress through loss and accuracy plots. |
# Import Libraries
import torch
import torch.nn as nn
import numpy as np # used below for np.float32
import pandas as pd
import matplotlib.pyplot as plt # used below for visualization
from torch.autograd import Variable
from torch.utils.data import DataLoader
from sklearn.model_selection import train_test_split
# Prepare Dataset
# load data
train = pd.read_csv(r"../input/train.csv",dtype = np.float32)

# split data into features(pixels) and labels(numbers from 0 to 9)
targets_numpy = train.label.values
features_numpy = train.loc[:,train.columns != "label"].values/255 # normalization

# train test split. Size of train data is 80% and size of test data is 20%. 
features_train, features_test, targets_train, targets_test = train_test_split(features_numpy,
                                                                             targets_numpy,
                                                                             test_size = 0.2,
                                                                             random_state = 42) 

# create feature and targets tensor for train set. As you remember we need variable to accumulate gradients. Therefore first we create tensor, then we will create variable
featuresTrain = torch.from_numpy(features_train)
targetsTrain = torch.from_numpy(targets_train).type(torch.LongTensor) # data type is long

# create feature and targets tensor for test set.
featuresTest = torch.from_numpy(features_test)
targetsTest = torch.from_numpy(targets_test).type(torch.LongTensor) # data type is long

# batch_size, epoch and iteration
batch_size = 100
n_iters = 10000
num_epochs = n_iters / (len(features_train) / batch_size)
num_epochs = int(num_epochs)

# Pytorch train and test sets
train = torch.utils.data.TensorDataset(featuresTrain,targetsTrain)
test = torch.utils.data.TensorDataset(featuresTest,targetsTest)

# data loader
train_loader = DataLoader(train, batch_size = batch_size, shuffle = False)
test_loader = DataLoader(test, batch_size = batch_size, shuffle = False)

# visualize one of the images in data set
plt.imshow(features_numpy[10].reshape(28,28))
plt.axis("off")
plt.title(str(targets_numpy[10]))
plt.savefig('graph.png')
plt.show()
# Create Logistic Regression Model
class LogisticRegressionModel(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(LogisticRegressionModel, self).__init__()
        # Linear part
        self.linear = nn.Linear(input_dim, output_dim)
        # Where is the logistic function? In PyTorch it is folded into the loss:
        # nn.CrossEntropyLoss applies softmax internally,
        # so the forward pass only needs the linear part.
    
    def forward(self, x):
        out = self.linear(x)
        return out

# Instantiate Model Class
input_dim = 28*28 # size of image px*px
output_dim = 10  # labels 0,1,2,3,4,5,6,7,8,9

# create logistic regression model
model = LogisticRegressionModel(input_dim, output_dim)

# Cross Entropy Loss  
error = nn.CrossEntropyLoss()

# SGD Optimizer 
learning_rate = 0.001
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
# Training the Model
count = 0
loss_list = []
iteration_list = []
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        
        # Define variables
        train = Variable(images.view(-1, 28*28))
        labels = Variable(labels)
        
        # Clear gradients
        optimizer.zero_grad()
        
        # Forward propagation
        outputs = model(train)
        
        # Calculate softmax and cross entropy loss
        loss = error(outputs, labels)
        
        # Calculate gradients
        loss.backward()
        
        # Update parameters
        optimizer.step()
        
        count += 1
        
        # Prediction
        if count % 50 == 0:
            # Calculate Accuracy         
            correct = 0
            total = 0
            # Predict test dataset
            for images, labels in test_loader: 
                test = Variable(images.view(-1, 28*28))
                
                # Forward propagation
                outputs = model(test)
                
                # Get predictions from the maximum value
                predicted = torch.max(outputs.data, 1)[1]
                
                # Total number of labels
                total += len(labels)
                
                # Total correct predictions
                correct += (predicted == labels).sum()
            
            accuracy = 100 * correct / float(total)
            
            # store loss and iteration
            loss_list.append(loss.data)
            iteration_list.append(count)
        if count % 500 == 0:
            # Print Loss
            print('Iteration: {}  Loss: {}  Accuracy: {}%'.format(count, loss.data, accuracy))
Iteration: 500  Loss: 1.8399910926818848  Accuracy: 68%
Iteration: 1000  Loss: 1.5982391834259033  Accuracy: 75%
Iteration: 1500  Loss: 1.2930790185928345  Accuracy: 78%
Iteration: 2000  Loss: 1.1937870979309082  Accuracy: 80%
Iteration: 2500  Loss: 1.0323244333267212  Accuracy: 81%
Iteration: 3000  Loss: 0.9379988312721252  Accuracy: 82%
Iteration: 3500  Loss: 0.899523913860321  Accuracy: 82%
Iteration: 4000  Loss: 0.7464531660079956  Accuracy: 83%
Iteration: 4500  Loss: 0.9766625761985779  Accuracy: 83%
Iteration: 5000  Loss: 0.8022621870040894  Accuracy: 83%
Iteration: 5500  Loss: 0.7587511539459229  Accuracy: 84%
Iteration: 6000  Loss: 0.8655218482017517  Accuracy: 84%
Iteration: 6500  Loss: 0.6625986695289612  Accuracy: 84%
Iteration: 7000  Loss: 0.7128363251686096  Accuracy: 84%
Iteration: 7500  Loss: 0.6303086280822754  Accuracy: 85%
Iteration: 8000  Loss: 0.7414441704750061  Accuracy: 85%
Iteration: 8500  Loss: 0.5468852519989014  Accuracy: 85%
Iteration: 9000  Loss: 0.6567560434341431  Accuracy: 85%
Iteration: 9500  Loss: 0.5228758454322815  Accuracy: 85%
# visualization
plt.plot(iteration_list,loss_list)
plt.xlabel("Number of iteration")
plt.ylabel("Loss")
plt.title("Logistic Regression: Loss vs Number of iteration")
plt.show()

Artificial Neural Network (ANN)

While logistic regression performs well for classification, its accuracy tends to decrease as complexity (non-linearity) increases. To address this, we enhance the model’s complexity by incorporating more non-linear functions through hidden layers.

For a comprehensive understanding of artificial neural networks (ANN), please refer to my deep learning tutorial: Deep learning tutorial

Steps of ANN:

| Step | Description |
|---|---|
| 1. Import Libraries | Import the necessary libraries. |
| 2. Prepare Dataset | Use the same dataset as logistic regression; prepare train_loader and test_loader; keep batch size, epoch, and iteration numbers the same. |
| 3. Create ANN Model | Add 3 hidden layers; use ReLU, Tanh, and ELU activation functions for diversity. |
| 4. Instantiate Model Class | Input dimension: 28×28 (image size in pixels); output dimension: 10 (labels 0 to 9); hidden layer dimension: 150 (a hyperparameter). |
| 5. Instantiate Loss | Cross-entropy loss with softmax (logistic function). |
| 6. Instantiate Optimizer | Stochastic Gradient Descent (SGD) optimizer. |
| 7. Train the Model | Observe decreasing loss and increasing accuracy during training. |
| 8. Prediction | Visualize the learning progress through loss and accuracy plots. |

Through the incorporation of hidden layers, the model learns better, achieving a higher accuracy (almost 95%) compared to the logistic regression model.

# Import Libraries
import torch
import torch.nn as nn
from torch.autograd import Variable
# Create ANN Model
class ANNModel(nn.Module):
    
    def __init__(self, input_dim, hidden_dim, output_dim):
        super(ANNModel, self).__init__()
        
        # Linear function 1: 784 --> 150
        self.fc1 = nn.Linear(input_dim, hidden_dim) 
        # Non-linearity 1
        self.relu1 = nn.ReLU()
        
        # Linear function 2: 150 --> 150
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)
        # Non-linearity 2
        self.tanh2 = nn.Tanh()
        
        # Linear function 3: 150 --> 150
        self.fc3 = nn.Linear(hidden_dim, hidden_dim)
        # Non-linearity 3
        self.elu3 = nn.ELU()
        
        # Linear function 4 (readout): 150 --> 10
        self.fc4 = nn.Linear(hidden_dim, output_dim)  
    
    def forward(self, x):
        # Linear function 1
        out = self.fc1(x)
        # Non-linearity 1
        out = self.relu1(out)
        
        # Linear function 2
        out = self.fc2(out)
        # Non-linearity 2
        out = self.tanh2(out)
        
        # Linear function 3
        out = self.fc3(out)
        # Non-linearity 3
        out = self.elu3(out)
        
        # Linear function 4 (readout)
        out = self.fc4(out)
        return out

# instantiate ANN
input_dim = 28*28
hidden_dim = 150 # hidden layer dimension is a hyperparameter to be tuned; 150 is an arbitrary starting choice
output_dim = 10

# Create ANN
model = ANNModel(input_dim, hidden_dim, output_dim)

# Cross Entropy Loss 
error = nn.CrossEntropyLoss()

# SGD Optimizer
learning_rate = 0.02
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
# ANN model training
count = 0
loss_list = []
iteration_list = []
accuracy_list = []
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):

        train = Variable(images.view(-1, 28*28))
        labels = Variable(labels)
        
        # Clear gradients
        optimizer.zero_grad()
        
        # Forward propagation
        outputs = model(train)
        
        # Calculate softmax and cross entropy loss
        loss = error(outputs, labels)
        
        # Calculating gradients
        loss.backward()
        
        # Update parameters
        optimizer.step()
        
        count += 1
        
        if count % 50 == 0:
            # Calculate Accuracy         
            correct = 0
            total = 0
            # Predict test dataset
            for images, labels in test_loader:

                test = Variable(images.view(-1, 28*28))
                
                # Forward propagation
                outputs = model(test)
                
                # Get predictions from the maximum value
                predicted = torch.max(outputs.data, 1)[1]
                
                # Total number of labels
                total += len(labels)

                # Total correct predictions
                correct += (predicted == labels).sum()
            
            accuracy = 100 * correct / float(total)
            
            # store loss and iteration
            loss_list.append(loss.data)
            iteration_list.append(count)
            accuracy_list.append(accuracy)
        if count % 500 == 0:
            # Print Loss
            print('Iteration: {}  Loss: {}  Accuracy: {} %'.format(count, loss.data, accuracy))
Iteration: 500  Loss: 0.8311067223548889  Accuracy: 77 %
Iteration: 1000  Loss: 0.4767582416534424  Accuracy: 87 %
Iteration: 1500  Loss: 0.21807175874710083  Accuracy: 89 %
Iteration: 2000  Loss: 0.2915269732475281  Accuracy: 90 %
Iteration: 2500  Loss: 0.3073478937149048  Accuracy: 91 %
Iteration: 3000  Loss: 0.12328791618347168  Accuracy: 92 %
Iteration: 3500  Loss: 0.24098418653011322  Accuracy: 93 %
Iteration: 4000  Loss: 0.06471655517816544  Accuracy: 93 %
Iteration: 4500  Loss: 0.3368555009365082  Accuracy: 94 %
Iteration: 5000  Loss: 0.12026549130678177  Accuracy: 94 %
Iteration: 5500  Loss: 0.217212975025177  Accuracy: 94 %
Iteration: 6000  Loss: 0.20914879441261292  Accuracy: 94 %
Iteration: 6500  Loss: 0.10008767992258072  Accuracy: 95 %
Iteration: 7000  Loss: 0.13490895926952362  Accuracy: 95 %
Iteration: 7500  Loss: 0.11741413176059723  Accuracy: 95 %
Iteration: 8000  Loss: 0.17519493401050568  Accuracy: 95 %
Iteration: 8500  Loss: 0.06657659262418747  Accuracy: 95 %
Iteration: 9000  Loss: 0.05512683466076851  Accuracy: 95 %
Iteration: 9500  Loss: 0.02535334974527359  Accuracy: 96 %
# visualization loss 
plt.plot(iteration_list,loss_list)
plt.xlabel("Number of iteration")
plt.ylabel("Loss")
plt.title("ANN: Loss vs Number of iteration")
plt.show()

# visualization accuracy 
plt.plot(iteration_list,accuracy_list,color = "red")
plt.xlabel("Number of iteration")
plt.ylabel("Accuracy")
plt.title("ANN: Accuracy vs Number of iteration")
plt.show()

Convolutional Neural Network (CNN)

CNNs are well-suited for image classification tasks.

Steps of CNN:

| Step | Description |
|---|---|
| 1. Import Libraries | Import the necessary libraries. |
| 2. Prepare Dataset | Use the same dataset as the previous parts; prepare train_loader and test_loader. |
| 3. Convolutional Layer | Create feature maps with filters (kernels); apply padding to preserve information. |
| 4. Pooling Layer | Condense the feature maps from the convolutional layer using max pooling. |
| 5. Flattening | Flatten the feature maps. |
| 6. Fully Connected Layer | Similar to ANN or linear regression; the softmax function is applied at the end. |
| 7. Instantiate Model Class | Define the CNN model. |
| 8. Instantiate Loss | Cross-entropy loss with softmax function. |
| 9. Instantiate Optimizer | Stochastic Gradient Descent (SGD) optimizer. |
| 10. Train the Model | Observe decreasing loss and increasing accuracy during training. |
| 11. Prediction | Visualize learning progress through loss and accuracy plots. |

Through the integration of convolutional layers, the model learns more effectively, resulting in higher accuracy (almost 98%) compared to the ANN model.

While tuning hyperparameters such as increasing iterations and expanding the CNN can further enhance accuracy, it may also increase running time significantly.
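
Before the code, it helps to trace the feature-map sizes; this is where the 32 * 4 * 4 input to the fully connected layer comes from:

# Shape trace for a 28x28 grayscale input through the model below:
# conv1 (kernel 5, stride 1, padding 0): 28 - 5 + 1 = 24  -> 16 x 24 x 24
# maxpool1 (kernel 2):                   24 / 2     = 12  -> 16 x 12 x 12
# conv2 (kernel 5, stride 1, padding 0): 12 - 5 + 1 = 8   -> 32 x 8 x 8
# maxpool2 (kernel 2):                   8 / 2      = 4   -> 32 x 4 x 4
# flatten: 32 * 4 * 4 = 512 features feeding the final linear layer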

# Import Libraries
import torch
import torch.nn as nn
from torch.autograd import Variable
# Create CNN Model
class CNNModel(nn.Module):
    def __init__(self):
        super(CNNModel, self).__init__()
        
        # Convolution 1
        self.cnn1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=5, stride=1, padding=0)
        self.relu1 = nn.ReLU()
        
        # Max pool 1
        self.maxpool1 = nn.MaxPool2d(kernel_size=2)
     
        # Convolution 2
        self.cnn2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=5, stride=1, padding=0)
        self.relu2 = nn.ReLU()
        
        # Max pool 2
        self.maxpool2 = nn.MaxPool2d(kernel_size=2)
        
        # Fully connected 1
        self.fc1 = nn.Linear(32 * 4 * 4, 10) 
    
    def forward(self, x):
        # Convolution 1
        out = self.cnn1(x)
        out = self.relu1(out)
        
        # Max pool 1
        out = self.maxpool1(out)
        
        # Convolution 2 
        out = self.cnn2(out)
        out = self.relu2(out)
        
        # Max pool 2 
        out = self.maxpool2(out)
        
        # flatten
        out = out.view(out.size(0), -1)

        # Linear function (readout)
        out = self.fc1(out)
        
        return out

# batch_size, epoch and iteration
batch_size = 100
n_iters = 2500
num_epochs = n_iters / (len(features_train) / batch_size)
num_epochs = int(num_epochs)

# Pytorch train and test sets
train = torch.utils.data.TensorDataset(featuresTrain,targetsTrain)
test = torch.utils.data.TensorDataset(featuresTest,targetsTest)

# data loader
train_loader = torch.utils.data.DataLoader(train, batch_size = batch_size, shuffle = False)
test_loader = torch.utils.data.DataLoader(test, batch_size = batch_size, shuffle = False)
    
# Create CNN
model = CNNModel()

# Cross Entropy Loss 
error = nn.CrossEntropyLoss()

# SGD Optimizer
learning_rate = 0.1
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
# CNN model training
count = 0
loss_list = []
iteration_list = []
accuracy_list = []
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        
        train = Variable(images.view(100,1,28,28))
        labels = Variable(labels)
        
        # Clear gradients
        optimizer.zero_grad()
        
        # Forward propagation
        outputs = model(train)
        
        # Calculate softmax and cross entropy loss
        loss = error(outputs, labels)
        
        # Calculating gradients
        loss.backward()
        
        # Update parameters
        optimizer.step()
        
        count += 1
        
        if count % 50 == 0:
            # Calculate Accuracy         
            correct = 0
            total = 0
            # Iterate through test dataset
            for images, labels in test_loader:
                
                test = Variable(images.view(100,1,28,28))
                
                # Forward propagation
                outputs = model(test)
                
                # Get predictions from the maximum value
                predicted = torch.max(outputs.data, 1)[1]
                
                # Total number of labels
                total += len(labels)
                
                correct += (predicted == labels).sum()
            
            accuracy = 100 * correct / float(total)
            
            # store loss and iteration
            loss_list.append(loss.data)
            iteration_list.append(count)
            accuracy_list.append(accuracy)
        if count % 500 == 0:
            # Print Loss
            print('Iteration: {}  Loss: {}  Accuracy: {} %'.format(count, loss.data, accuracy))
Iteration: 500  Loss: 0.11975778639316559  Accuracy: 96 %
Iteration: 1000  Loss: 0.04979655146598816  Accuracy: 97 %
Iteration: 1500  Loss: 0.03124646656215191  Accuracy: 97 %
Iteration: 2000  Loss: 0.013534960336983204  Accuracy: 98 %
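
As a final check, here is a minimal sketch of classifying a single held-out image with the trained CNN, using the featuresTest tensor prepared earlier:

# classify one test image
sample = featuresTest[0].view(1, 1, 28, 28)  # NCHW layout expected by Conv2d
with torch.no_grad():
    logits = model(sample)
print("predicted digit:", logits.argmax(dim=1).item())
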
# visualization loss 
plt.plot(iteration_list,loss_list)
plt.xlabel("Number of iteration")
plt.ylabel("Loss")
plt.title("CNN: Loss vs Number of iteration")
plt.show()

# visualization accuracy 
plt.plot(iteration_list,accuracy_list,color = "red")
plt.xlabel("Number of iteration")
plt.ylabel("Accuracy")
plt.title("CNN: Accuracy vs Number of iteration")
plt.show()

Conclusion

This series of tutorials has walked you through the fundamental concepts of deep learning, building on the knowledge gained in Lesson 1: Top Deep Learning Tutorial for Beginners 2024. Now, in Lesson 2, we have delved into the intricacies of PyTorch.

To ensure a strong grasp of the material, I suggest reviewing Lesson 1 again before diving into PyTorch.

We have discussed topics such as linear and logistic regression, artificial neural networks (ANNs), and convolutional neural networks (CNNs).

Continue your exploration and don’t hesitate to contact us with any inquiries or feedback. Your journey into the world of deep learning is just getting started!

