Emotion-Recognition-System/Data augmentation at main · Someshdiwan/Emotion-Recognition-System · GitHub

Data augmentation is a technique used in machine learning and deep learning to artificially increase the size and variability of a training dataset by applying transformations or modifications to the existing data.
The primary goal of data augmentation is to improve the generalization ability of a model, especially when there is limited data available.
By augmenting the dataset, the model is trained on a broader variety of data, which makes it more robust and better at handling unseen examples.

Data augmentation helps in:

Preventing Overfitting: Exposing the model to more variations of the training data makes it less likely to memorize individual examples (overfitting) and more likely to generalize.
Improving Model Accuracy: Training on several transformed versions of each example can help the model learn the underlying patterns rather than incidental details of the original samples.
Expanding Limited Datasets: Augmentation allows you to artificially expand your dataset without needing to collect new data, which can be costly and time-consuming.

Types of Data Augmentation
Data augmentation techniques depend on the type of data you're working with. Common data types include images, text, and audio, and each has its own set of transformation techniques:

1. Image Data Augmentation

For image data, the most common augmentations include:

Rotation: Rotating the image by a certain angle.
Flipping: Horizontally or vertically flipping the image.
Cropping: Randomly cropping sections of the image.
Scaling: Resizing the image or zooming in/out.
Translation: Shifting the image along the X or Y axis.
Color Jittering: Randomly changing the brightness, contrast, saturation, or hue.
Gaussian Noise: Adding random noise drawn from a Gaussian (normal) distribution to the pixel values.
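
Several of the image augmentations above can be sketched with NumPy alone. This is a minimal illustration, not the repository's implementation: it assumes images are float arrays scaled to [0, 1] with shape (H, W) or (H, W, C), and the function names, shift range, brightness range, and noise level are all hypothetical choices. Rotation by arbitrary angles is left out because it needs interpolation from an image library.

```python
import numpy as np


def horizontal_flip(img):
    """Mirror the image left to right."""
    return img[:, ::-1]


def translate(img, dy, dx):
    """Shift by (dy, dx) pixels along Y and X; exposed pixels become 0.

    Assumes |dy| < height and |dx| < width.
    """
    h, w = img.shape[:2]
    out = np.zeros_like(img)
    src_y, dst_y = slice(max(0, -dy), min(h, h - dy)), slice(max(0, dy), min(h, h + dy))
    src_x, dst_x = slice(max(0, -dx), min(w, w - dx)), slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = img[src_y, src_x]
    return out


def random_crop(img, crop_h, crop_w, rng):
    """Cut a random crop_h x crop_w window out of the image."""
    h, w = img.shape[:2]
    top = int(rng.integers(0, h - crop_h + 1))
    left = int(rng.integers(0, w - crop_w + 1))
    return img[top:top + crop_h, left:left + crop_w]


def adjust_brightness(img, factor):
    """Scale pixel intensities and keep them inside [0, 1]."""
    return np.clip(img * factor, 0.0, 1.0)


def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return np.clip(img + rng.normal(0.0, sigma, size=img.shape), 0.0, 1.0)


def augment_image(img, rng):
    """Apply a random, shape-preserving chain of the transforms above."""
    if rng.random() < 0.5:
        img = horizontal_flip(img)
    dy, dx = rng.integers(-4, 5, size=2)  # hypothetical range: up to 4 pixels
    img = translate(img, int(dy), int(dx))
    img = adjust_brightness(img, rng.uniform(0.8, 1.2))  # hypothetical range
    return add_gaussian_noise(img, sigma=0.02, rng=rng)
```

Calling augment_image(img, np.random.default_rng(0)) on each training image at every epoch yields a slightly different variant each time, which is how a fixed dataset is stretched into a more varied one.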

2. Text Data Augmentation

For text data, augmentations can involve:

Synonym Replacement: Replacing words with their synonyms.
Random Insertion: Inserting words at random positions in the text (often synonyms of words already present, so the meaning is preserved).
Random Deletion: Randomly removing words from the text.
Back Translation: Translating the text to another language and then back to the original language.
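
The word-level operations above can be sketched with the Python standard library. This is only an illustration under stated assumptions: the SYNONYMS table is a tiny hypothetical stand-in for a real thesaurus, the function names are made up for this example, and text is assumed to be already split into words. Back translation is omitted because it depends on an external translation model.

```python
import random

# Hypothetical toy synonym table; a real system would use a proper thesaurus.
SYNONYMS = {
    "happy": ["glad", "joyful"],
    "sad": ["unhappy", "sorrowful"],
    "angry": ["furious", "irritated"],
}


def synonym_replacement(words, n, rng):
    """Replace up to n words that have an entry in SYNONYMS."""
    out = list(words)
    candidates = [i for i, w in enumerate(out) if w.lower() in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i].lower()])
    return out


def random_insertion(words, n, rng):
    """Insert n synonyms of words already in the sentence at random positions."""
    out = list(words)
    pool = [s for w in words for s in SYNONYMS.get(w.lower(), [])]
    if not pool:
        return out  # nothing suitable to insert
    for _ in range(n):
        out.insert(rng.randint(0, len(out)), rng.choice(pool))
    return out


def random_deletion(words, p, rng):
    """Drop each word with probability p, always keeping at least one word."""
    if not words:
        return []
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]
```

For example, with rng = random.Random(0) and words = "I feel happy but she looks sad".split(), synonym_replacement(words, 1, rng) returns the same sentence with either "happy" or "sad" swapped for one of its listed synonyms.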

3. Audio Data Augmentation

For audio data, common augmentations include:

Time Stretching: Speeding up or slowing down the audio without changing the pitch.
Pitch Shifting: Shifting the pitch of the audio up or down.
Noise Injection: Adding background noise to the audio to make the model more robust.
Random Cropping: Cropping a portion of the audio for training.
Reverberation: Adding echo or reverb to the audio.
Volume Adjustment: Changing the volume level of the audio.
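
Some of the audio augmentations above reduce to simple array arithmetic. The following NumPy sketch assumes a mono waveform stored as a float array in [-1, 1]; the function names and parameters are hypothetical, and the single delayed echo is only a crude stand-in for real reverberation. Time stretching and pitch shifting are omitted because doing them without side effects requires more involved signal processing, usually taken from a dedicated audio library.

```python
import numpy as np


def add_noise(wave, snr_db, rng):
    """Inject white Gaussian noise at a target signal-to-noise ratio (dB)."""
    signal_power = np.mean(wave ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=wave.shape)
    return wave + noise


def change_volume(wave, gain_db):
    """Apply a gain in decibels and clip to the valid range [-1, 1]."""
    return np.clip(wave * 10 ** (gain_db / 20), -1.0, 1.0)


def crop_audio(wave, length, rng):
    """Take a random contiguous segment of `length` samples."""
    start = int(rng.integers(0, len(wave) - length + 1))
    return wave[start:start + length]


def add_echo(wave, delay_samples, decay):
    """Mix in one delayed, attenuated copy of the signal.

    Assumes 0 < delay_samples < len(wave).
    """
    out = wave.copy()
    out[delay_samples:] += decay * wave[:-delay_samples]
    return out
```

A lower snr_db means louder noise relative to the signal, so sampling it from a range (for instance 10 to 30 dB, an arbitrary choice) exposes the model to both clean and noisy versions of each recording.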

Why Use Data Augmentation?
Expanding Small Datasets: In many cases, obtaining a large amount of data can be difficult, so augmentation allows us to create more training data artificially.
Increased Model Robustness: With a diverse set of augmented data, the model can learn to be more invariant to certain types of variations (e.g., rotation, scaling, and noise in images, or pitch and speed changes in audio).
Improved Generalization: By training on augmented data, the model is less likely to overfit to the training data, improving its performance on unseen data.