
Improving Deep Learning Models with Data Augmentation


Deep Learning has become extremely popular in AI, and companies frequently employ its techniques to solve challenging real-world issues. This domain is progressing rapidly and finds applications in various use cases like self-driving cars, fraud detection, chatbots for customer service, sports analytics, natural language processing, and many more.

One typical limitation faced while training a deep learning model is the amount of data required. Deep learning models employ several different algorithms to make decisions, and the amount of data required to train these models isn't always available, for various reasons. New companies do not have any previous data, and data collection requires considerable time and expertise. Besides, publicly available data may be limited or unusable due to privacy concerns.

One way to address this data shortage is to employ data augmentation and increase the size of the available data.

In this article, we will cover:

● What is Data Augmentation, and why is it required?

● Techniques of Image Data Augmentation in Computer Vision

● Python Libraries for Image Data Augmentation

● Real-world applications

● Does Data Augmentation improve deep learning model accuracy?

● Data Augmentation Best practices

Why is Data Augmentation required?

Deep learning models are computationally expensive and require properly labeled data to perform well. To improve a model, we need to supply it with more data so that it can identify more features on which to base its decisions. Data augmentation can generate additional data from the existing data. It can improve model accuracy and save the time and money that gathering more real data would require.

What is Data Augmentation?

Data augmentation is a technique in deep learning to extend the original dataset by generating new training data from the existing data. The data augmentation tool transforms the data into fresh, unique samples by manipulating the parameters of the existing data. Data augmentation can be carried out for image, text, audio, and video inputs.

There are two types of data augmentation: offline (augmented images are stored on a drive and then combined with real data before training the model) and online (data augmentation is applied to randomly chosen images and used for training along with original data).
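The offline/online distinction can be sketched with a minimal, framework-free example. This sketch uses plain NumPy; the `augment` function and the image shapes are illustrative assumptions, not part of any particular framework's API.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image):
    """Apply a random horizontal flip (an illustrative augmentation)."""
    return np.fliplr(image) if rng.random() < 0.5 else image

# Offline: augment once, store the results, and train on the combined set.
originals = [rng.random((32, 32, 3)) for _ in range(4)]
augmented_store = [augment(img) for img in originals]   # would be saved to disk
offline_dataset = originals + augmented_store           # combined before training

# Online: augment on the fly, so each epoch sees different random variants.
def online_batches(images, epochs):
    for _ in range(epochs):
        yield np.stack([augment(img) for img in images])

for batch in online_batches(originals, epochs=2):
    pass  # feed each freshly augmented batch to the training step here
```

Online augmentation trades disk space for compute: nothing extra is stored, but the transformation cost is paid every epoch.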

What are the benefits of Data Augmentation?

With correct implementation, data augmentation offers several benefits:

● lowering the cost of data acquisition and data labeling

● improving model generalization by imparting more variety and flexibility to the model

● enhancing model accuracy in prediction as more data is used to train the model

● reducing overfitting of data

● handling imbalance in the dataset by augmenting the samples from the minority class
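The last point can be made concrete: given per-class image counts, we can compute how many augmented samples each minority class needs to match the majority class. This is a simple sketch; the class names and counts are made-up examples.

```python
# Hypothetical per-class image counts in an imbalanced dataset
class_counts = {"healthy": 500, "scab": 120, "rot": 80}

# Bring every class up to the size of the largest class
target = max(class_counts.values())
to_generate = {cls: target - n for cls, n in class_counts.items()}
# "healthy" needs 0 new samples, "scab" needs 380, "rot" needs 420
```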

Image Data Augmentation techniques in Computer Vision

We can manipulate and transform the available images to create an augmented dataset using the following techniques. The two most commonly used types of transformations are geometric and color transformations. To demonstrate these techniques, we will use a sample image from the PlantVillage dataset (the version without augmentation).

Geometric transformations

These include manipulations to the geometry of the image like orientation, size, translation, quality, and so on.

1. Flipping

This technique flips the given image horizontally, vertically, or both.

Flipping Transformation

2. Rotation

The image is rotated by a chosen angle (between 0 and 360 degrees), and each rotation angle produces a unique augmented image for the model.

Rotation Transformation

3. Cropping

In this technique, a selected section of the image is cropped and then resized to the original image size.

Cropping Transformation

4. Zooming

A selected portion of the image can be cropped and zoomed in for augmentation.

Zooming Transformation

5. Scaling

The original image can be resized inward or outward. The new image can be smaller or bigger than the original image.

Scaling Transformation

6. Translation

The image can be shifted along the x-axis or y-axis. This way, the neural network looks through the entire image to search for similar patterns.

Translation Transformation

7. Adding noise

Adding random noise to the image, such as salt-and-pepper noise (randomly scattered white and black pixels), can help the model become robust to variations in image quality. The augmented image appears grainy, with several dots scattered across it.

Noise Transformation
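Several of the geometric transformations above can be expressed directly on a NumPy image array. The following sketch uses pure NumPy so the operations stay explicit; real projects would typically use one of the libraries discussed later, and the shapes, shift amounts, and noise probabilities here are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(42)
image = rng.random((64, 64, 3))        # dummy RGB image with values in [0, 1]

flipped_h = np.fliplr(image)           # horizontal flip
flipped_v = np.flipud(image)           # vertical flip
rotated = np.rot90(image)              # 90-degree rotation
translated = np.roll(image, shift=(8, -8), axis=(0, 1))  # shift along y and x

# Salt-and-pepper noise: randomly force some pixels to black (0) or white (1)
noisy = image.copy()
mask = rng.random(image.shape[:2])
noisy[mask < 0.02] = 0.0               # "pepper"
noisy[mask > 0.98] = 1.0               # "salt"
```

Note that `np.roll` wraps pixels around the edges; libraries usually offer a choice of fill strategies (constant, reflect, wrap) for the region a translation exposes.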

Color space transformations

These techniques include manipulations to the hue and saturation of the image.

1. Brightness

The brightness of the image is altered to generate a brighter or a darker augmented image to allow the model to identify the same object in different lighting levels.

Brightness Transformation

2. Contrast

In this technique, the image contrast can be adjusted, and the augmented image has different luminosity and color aspects.

Contrast Transformation

3. Color Augmentation

The pixel values of the image are altered to change the color of the image, e.g., converting to grayscale, swapping channels (RGB2BGR/BGR2RGB), or converting to HSV.

Color Augmentation Transformation

4. Saturation

The depth or the color intensity of the original image is changed. It intensifies the selected hue, as shown in the example below.

Saturation Transformation

It is possible to use multiple data augmentation techniques together to generate the required data from the original set of images.
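The color space transformations, and the chaining of several techniques, can likewise be sketched on plain NumPy arrays. The adjustment factors below are arbitrary illustrative values, and the grayscale weights are the standard Rec. 601 luminance coefficients.

```python
import numpy as np

rng = np.random.default_rng(7)
image = rng.random((64, 64, 3))                  # dummy RGB image in [0, 1]

# Brightness: add a constant offset, then clip back into the valid range
brighter = np.clip(image + 0.2, 0.0, 1.0)

# Contrast: scale pixel values away from (or toward) the image mean
mean = image.mean()
high_contrast = np.clip((image - mean) * 1.5 + mean, 0.0, 1.0)

# Color augmentation: channel swap (RGB -> BGR) and grayscale conversion
bgr = image[..., ::-1]
gray = image @ np.array([0.299, 0.587, 0.114])   # Rec. 601 luminance weights

# Chaining: combine a geometric and a color transformation in one sample
combined = np.clip(np.fliplr(image) + 0.2, 0.0, 1.0)
```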

Real-world examples of Data Augmentation

Data augmentation is currently used across different industries: in manufacturing for quality inspection and defect identification, and in space agencies and government organizations to assist in analyzing satellite imagery for object detection.

An important application is the medical imaging domain in healthcare, where data privacy concerns are high and larger datasets for rare diseases are unavailable. Here, data augmentation is used to transform the medical images to add diversity to the original labeled dataset.

The agriculture domain also uses data augmentation to identify crop diseases and analyze the quality of the yield.

Image DA (Data Augmentation) libraries in Python

Data augmentation is crucial in deep learning. Hence, most deep learning frameworks in Python, like Keras, TensorFlow, PyTorch, and others, include a few augmentation methods. Additionally, there are dedicated Python libraries for image data augmentation, such as:

OpenCV - a large open-source library that supports real-time image and video processing, including object and people detection.

Augmentor - an augmentation package for machine learning that is platform and framework independent. It allows different augmentation methods to be executed in a pipeline.

Albumentations - a fast package with a unified API that supports more than 70 augmentation techniques for all computer vision tasks.

AugLy - a library developed by Facebook for social media applications. It has more than 100 augmentation techniques with support for image, text, audio, and video.

imgaug - a library that supports several augmentation techniques and can also augment image landmarks, bounding boxes, segmentation maps, and heatmaps along with the images.

Improving model accuracy using Data Augmentation

Let us take a use case of data augmentation in agriculture. Crop diseases are very common, and deep learning assists in quicker identification of the type of crop disease. A CNN model learns the different features characteristic of a specific disease and helps in accurate classification based on plant images. However, gathering a large amount of plant image data is challenging. Here, data augmentation enhances the classification performance of the deep learning model by extending the real dataset with augmented images.

To demonstrate this, we will use the non-augmented version of the PlantVillage dataset, which contains 38 categories of plant leaf images. For demo purposes, we will use the apple leaf images and extract the first 150 images for each of the four categories (Apple_scab, Black_rot, Cedar_apple_rust, and healthy). Thus, we have a small dataset of 600 images in total.
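A small helper like the following can build such a subset. This is a sketch; the directory layout (one folder per class containing the images) and the example paths are assumptions based on the description above.

```python
import shutil
from pathlib import Path

def take_subset(src_dir, dst_dir, n=150, exts=(".jpg", ".jpeg", ".png")):
    """Copy the first n images from each class folder under src_dir into dst_dir."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    for class_dir in sorted(p for p in src_dir.iterdir() if p.is_dir()):
        out = dst_dir / class_dir.name
        out.mkdir(parents=True, exist_ok=True)
        images = sorted(f for f in class_dir.iterdir() if f.suffix.lower() in exts)
        for f in images[:n]:
            shutil.copy(f, out / f.name)

# e.g. take_subset("PlantVillage/Apple", "Data", n=150)  # hypothetical paths
```

Sorting the filenames first makes the subset deterministic, so the experiment can be reproduced.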

Now, we will import the necessary libraries and set up a sample CNN model with the following structure.

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(64, kernel_size=(3, 3), activation='relu', input_shape=input_shape),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, kernel_size=(3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),  # flatten the feature maps before the dense layers
    layers.Dense(64, activation='relu'),
    layers.Dense(n_classes, activation='softmax'),
])
We save this model after training to be used again with the augmented dataset. Next, we use the Augmentor library to set up an image augmentation pipeline to augment the images to 1800 and choose three relevant image transformations (flip horizontal, flip vertical, and rotate). To install the library, use the pip command.

pip install Augmentor

For the pipeline, we use the following code,

# Setting up an Augmentor pipeline
import Augmentor

# Passing the path of the image directory
p = Augmentor.Pipeline(source_directory="/content/Data")

# Defining augmentation parameters (probabilities and angles are illustrative)
p.flip_left_right(probability=0.5)
p.flip_top_bottom(probability=0.5)
p.rotate(probability=0.7, max_left_rotation=10, max_right_rotation=10)

# Generating 1800 augmented samples
p.sample(1800)

The above code generated the required set of images in all four categories.

Apple___Apple_scab: 431 
Apple___Black_rot: 468 
Apple___Cedar_apple_rust: 470 
Apple___healthy: 431
Total Augmented images: 1800 

Let’s look at a few randomly sampled original and augmented images.

Sampled original images vs. sampled augmented images

Comparing the above images, we can say that the augmented images look unique. We can now re-train the earlier saved model and plot the model performance parameters, i.e., accuracy and loss.

Performance plots for model training accuracy & loss
Model Training Accuracy & Loss Plots

From these plots, the model accuracy appears to have improved and the loss has reduced with the augmented data, making the model more robust than the one trained only on the original dataset. In general, augmented data improves model generalization through exposure to additional data. This capability, however, has its limitations.

The entire code for this demo and the data augmentation techniques are available on GitHub. Feel free to try it or build something new with your chosen dataset.

Best practices in Data Augmentation

With multiple open-source tools available, generating additional data from existing data is certainly more accessible than acquiring additional real data. Although data augmentation is a powerful process, it needs to be used cautiously. Assessing the project requirements to evaluate the need for data augmentation is essential.

When applying data augmentation, we need to keep the following general guidelines in mind:

● Evaluate whether data augmentation is needed: if the dataset has fewer than about 100 images, prefer transfer learning and use highly accurate pre-trained CNN models as feature extractors; otherwise, go for data augmentation.

● Decide the number of augmented images required, and evaluate whether the existing hardware can generate them.

● Choose relevant augmentation techniques depending on the task.

● Use the optimum number of augmentation techniques by assessing which methods produce realistic samples.

● Set up systems to examine and evaluate the quality of the augmented dataset.

● Develop relevant strategies to avoid propagating existing bias into the augmented datasets.

● Be mindful of the time invested, as the data augmentation process can be time-consuming and expensive.

● Display samples of the original and augmented data in the notebook to verify the pipeline worked correctly.


The impact of data augmentation on supervised deep learning models is evident. Data augmentation techniques can address the data shortage in organizations by creating more data from the available data. The various image transformation techniques, using any of the Python libraries discussed in this article, can be applied to a deep learning project to improve its performance. Furthermore, the best practices for successful data augmentation have also been highlighted.

To conclude, data augmentation is an efficient technique to construct robust, high-performance deep learning models. However, creating efficient data augmentation strategies requires time, a good understanding of the problem, and domain expertise.

By: Devashree Madhugiri
