pyabsa.tasks.__SubtaskTemplate__.dataset_utils.data_utils_for_training
Module Contents
Classes
- class pyabsa.tasks.__SubtaskTemplate__.dataset_utils.data_utils_for_training.ABSADataset(config, tokenizer, dataset_type='train')[source]
Bases:
pyabsa.framework.dataset_class.dataset_template.PyABSADataset
- Attributes
data: a list of the loaded and preprocessed data samples.
- Methods
__init__(self, config, tokenizer, dataset_type, **kwargs): constructs a new PyABSADataset object by loading and preprocessing a dataset based on the given configuration and dataset type. config is a configuration object containing the settings for loading and preprocessing the dataset, tokenizer is a pre-trained tokenizer object used to tokenize the text data, and dataset_type is the type of the dataset to load (e.g., “train”, “dev”, “test”). Additional keyword arguments can be passed to customize the loading and preprocessing behavior.
covert_to_tensor(data): a static method that converts the preprocessed data samples to PyTorch tensors.
load_data_from_dict(self, dataset_dict, dataset_type, **kwargs): loads the dataset from a dictionary object containing the preprocessed data. dataset_dict is the dictionary object, dataset_type is the type of the dataset to load, and additional keyword arguments can be passed to customize the loading behavior.
load_data_from_file(self, dataset_file, dataset_type, **kwargs): loads the dataset from a file containing the preprocessed data. dataset_file is the file path, dataset_type is the type of the dataset to load, and additional keyword arguments can be passed to customize the loading behavior.
get_labels(self): returns a list of the labels for each data sample in the dataset.
__len__(self): returns the number of data samples in the dataset.
__str__(self): returns a string representation of the dataset.
__repr__(self): returns a string representation of the dataset.
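The interface described above can be illustrated with a small stand-in class. This is only a sketch of the documented shape (a data attribute, get_labels, __len__, and a string representation); ToyABSADataset, its constructor arguments, and the sample field names are hypothetical and are not part of PyABSA. Tensor conversion is left out so the example runs without PyTorch.

```python
class ToyABSADataset:
    """Hypothetical stand-in mirroring the documented interface; not the PyABSA class."""

    def __init__(self, samples, dataset_type="train"):
        self.dataset_type = dataset_type
        # `data` holds the preprocessed samples, like the documented attribute.
        # The (text, aspect, label) layout is an assumption for this sketch.
        self.data = [
            {"text": text, "aspect": aspect, "label": label}
            for text, aspect, label in samples
        ]

    def get_labels(self):
        # One label per sample, in dataset order.
        return [sample["label"] for sample in self.data]

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        return self.data[index]

    def __repr__(self):
        # str() falls back to __repr__ when __str__ is not defined.
        return f"ToyABSADataset(type={self.dataset_type!r}, size={len(self)})"


dataset = ToyABSADataset(
    [
        ("The battery lasts long", "battery", "Positive"),
        ("The screen is dim", "screen", "Negative"),
    ],
    dataset_type="train",
)
print(len(dataset), dataset.get_labels(), dataset)
```

Keeping the samples in a plain list makes __len__ and indexing trivial, which is what a PyTorch-style data loader would rely on.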
- load_data_from_dict(data_dict, **kwargs)[source]
Load the dataset from a dictionary.
- Parameters:
data_dict – A dictionary containing the dataset.
dataset_type – The type of the dataset, which can be “train”, “dev”, or “test”.
kwargs – Additional arguments for loading the dataset, such as “text_column”, “aspect_column”, “label_column”, “separator”, and “data_path”.
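As a rough illustration of how keyword arguments such as “text_column”, “aspect_column”, and “label_column” might be used, the sketch below selects columns from a dictionary. The function name load_samples_from_dict, the column-oriented dictionary layout, and the default column names are all assumptions made for this example; the actual PyABSA behavior may differ.

```python
def load_samples_from_dict(data_dict, dataset_type="train", **kwargs):
    """Hypothetical helper: build samples from a column-oriented dictionary."""
    # Default column names are assumptions, not documented PyABSA defaults.
    text_column = kwargs.get("text_column", "text")
    aspect_column = kwargs.get("aspect_column", "aspect")
    label_column = kwargs.get("label_column", "label")

    texts = data_dict[text_column]
    aspects = data_dict[aspect_column]
    labels = data_dict[label_column]
    if not (len(texts) == len(aspects) == len(labels)):
        raise ValueError("all columns must have the same length")

    return [
        {"text": t, "aspect": a, "label": l, "dataset_type": dataset_type}
        for t, a, l in zip(texts, aspects, labels)
    ]


raw = {
    "review": ["The battery lasts long", "The screen is dim"],
    "target": ["battery", "screen"],
    "polarity": ["Positive", "Negative"],
}
dict_samples = load_samples_from_dict(
    raw,
    dataset_type="dev",
    text_column="review",
    aspect_column="target",
    label_column="polarity",
)
print(dict_samples[0])
```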
- load_data_from_file(file_path, **kwargs)[source]
Load data from a file.
- Parameters:
file_path – The file to load data from.
dataset_type – The type of dataset to load, e.g. “train”, “test”, “dev”.
kwargs – Optional additional arguments for loading data.
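A file-based loader could follow a pattern like the one sketched here. The on-disk format (one sample per line with text, aspect, and label separated by a tab), the function name load_samples_from_file, and the “separator” keyword are assumptions chosen for illustration; they are not taken from the PyABSA source.

```python
import os
import tempfile


def load_samples_from_file(file_path, dataset_type="train", **kwargs):
    """Hypothetical helper: read one sample per line from a delimited text file."""
    separator = kwargs.get("separator", "\t")  # assumed default
    samples = []
    with open(file_path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.rstrip("\n")
            if not line.strip():
                continue  # skip blank lines
            fields = line.split(separator)
            if len(fields) != 3:
                raise ValueError(
                    f"{file_path}:{line_number}: expected 3 fields, got {len(fields)}"
                )
            text, aspect, label = fields
            samples.append(
                {"text": text, "aspect": aspect, "label": label,
                 "dataset_type": dataset_type}
            )
    return samples


# Write a tiny file to a temporary directory and load it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy.test.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("The battery lasts long\tbattery\tPositive\n")
        handle.write("\n")
        handle.write("The screen is dim\tscreen\tNegative\n")
    file_samples = load_samples_from_file(path, dataset_type="test")

print(len(file_samples), file_samples[1]["label"])
```

Reporting the file path and line number in the error makes malformed rows easy to locate in larger datasets.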