Text Auto-Augmentation for Classification

This guide explains how to use PyABSA’s text auto-augmentation features to improve the performance of your classification models. Augmentation can help increase the diversity of your training data and make your models more robust.

Augmentation for Aspect-Based Sentiment Classification

PyABSA provides a simple function to automatically augment your dataset and train an Aspect-Based Sentiment Classification (ABSC) model.

Example Usage

Here’s how you can use auto_aspect_sentiment_classification_augmentation to augment a dataset and train an ABSC model on it in a single call.

from pyabsa import AspectPolarityClassification as APC
from pyabsa.augmentation import auto_aspect_sentiment_classification_augmentation

# Get a configuration template
config = APC.APCConfigManager.get_apc_config_english()

# Set the model and the pretrained transformer checkpoint (DeBERTa-v3 here)
config.model = APC.APCModelList.FAST_LSA_T_V2
config.pretrained_bert = 'microsoft/deberta-v3-base'

# Set training hyperparameters
config.num_epoch = 10
config.evaluate_begin = 5
config.max_seq_len = 80
config.log_step = 100
config.dropout = 0.5
config.l2reg = 1e-8
config.seed = 42

# Choose a dataset
dataset = APC.APCDatasetList.Laptop14

# This function will automatically augment the dataset and train the model
auto_aspect_sentiment_classification_augmentation(
    config=config,
    dataset=dataset,
    device='cuda'  # Use 'cpu' if you don't have a GPU
)

This function handles the augmentation process behind the scenes, so you don’t need to manually generate new training examples.
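To give a feel for the kind of manual work this replaces, here is a minimal, generic sketch of generating noisy training variants by hand. It is not PyABSA's augmentation strategy, which this guide does not describe. The helper name `augment_by_word_dropout` and its parameters are hypothetical and exist only for illustration.

```python
import random


def augment_by_word_dropout(text, protected=(), n_copies=2, drop_prob=0.1, seed=42):
    """Return the original text followed by n_copies word-dropout variants.

    Hypothetical helper for illustration only; not part of PyABSA.
    """
    rng = random.Random(seed)  # local RNG so results are reproducible
    words = text.split()
    keep_always = {w.lower() for w in protected}  # e.g. the aspect term
    variants = [text]
    for _ in range(n_copies):
        kept = [
            w for w in words
            if w.lower() in keep_always or rng.random() >= drop_prob
        ]
        # Never emit an empty example; fall back to the original text.
        variants.append(" ".join(kept) if kept else text)
    return variants


examples = augment_by_word_dropout(
    "The battery life of this laptop is excellent",
    protected=("battery",),  # keep the aspect term in every variant
)
for example in examples:
    print(example)
```

For aspect-based data, the aspect term has to survive augmentation, which is why the sketch takes a `protected` argument. Handling details like this is the bookkeeping the auto-augmentation function spares you.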

Augmentation for Text Classification

Similarly, you can use auto-augmentation for standard text classification tasks.

Example Usage

Here’s how to use auto_classification_augmentation to augment a dataset and train a text classification model on it.

from pyabsa import TextClassification as TC
from pyabsa.augmentation import auto_classification_augmentation

# Get a configuration template
config = TC.TCConfigManager.get_tc_config_english()

# Set the model and training hyperparameters
config.model = TC.BERTTCModelList.BERT_MLP
config.num_epoch = 5
config.evaluate_begin = 2
config.max_seq_len = 80
config.dropout = 0.5
config.seed = 42
config.l2reg = 1e-5

# Choose a dataset
dataset = TC.TCDatasetList.SST2

# This function will automatically augment the dataset and train the model
auto_classification_augmentation(
    config=config,
    dataset=dataset,
    device='cuda'  # Use 'cpu' if you don't have a GPU
)
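Both examples hard-code device='cuda'. If the same script has to run on machines with and without a GPU, a small helper can choose the device at runtime. This is a sketch, not part of PyABSA: `pick_device` is a hypothetical name, and it relies only on PyTorch's `torch.cuda.is_available()`.

```python
def pick_device(preferred="cuda"):
    """Return `preferred` if it names a usable CUDA device, otherwise 'cpu'.

    Hypothetical helper for illustration only; not part of PyABSA.
    """
    if not preferred.startswith("cuda"):
        return "cpu"
    try:
        import torch  # guarded so the helper degrades gracefully without PyTorch
    except ImportError:
        return "cpu"
    return preferred if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

The result can then be passed as the `device` argument in either call above, for example `device=pick_device()`.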

By using these auto-augmentation functions, you can potentially improve your model’s performance with minimal effort. For more advanced control over the augmentation process, refer to the detailed tutorials in the documentation.