pyabsa.utils.data_utils.dataset_manager

Module Contents

Functions

detect_dataset(dataset_name_or_path[, task_code, ...])

Detect a dataset from its name or path. You must specify the task type, for example TaskCodeOption.Aspect_Polarity_Classification, 'atepc', or 'tc'.

detect_infer_dataset(dataset_name_or_path[, task_code])

Detect the inference dataset on local disk, or download it from GitHub

download_all_available_datasets(**kwargs)

Download datasets from GitHub

download_dataset_by_name([task_code, dataset_name])

If downloading all datasets fails, try downloading a dataset by name from Hugging Face

Attributes

filter_key_words

pyabsa.utils.data_utils.dataset_manager.filter_key_words = ['.py', '.md', 'readme', '.log', 'result', '.zip', '.state_dict', '.model', '.png', 'acc_',...
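
A minimal sketch of how a keyword list like this could be used to skip non-dataset files when scanning a directory. The case-insensitive substring rule, the helper name keep_dataset_files, and the sample paths are assumptions for illustration, not PyABSA's confirmed behavior. Only the keyword entries visible above are used; the real list is longer.

```python
# Only the entries visible in the documentation; the actual list is truncated there.
FILTER_KEY_WORDS = ['.py', '.md', 'readme', '.log', 'result', '.zip',
                    '.state_dict', '.model', '.png', 'acc_']


def keep_dataset_files(paths, key_words=FILTER_KEY_WORDS):
    """Drop paths containing any filter keyword (hypothetical matching rule)."""
    kept = []
    for path in paths:
        lowered = path.lower()
        # Assumption: a path is excluded if any keyword occurs as a substring.
        if not any(word in lowered for word in key_words):
            kept.append(path)
    return kept


# Hypothetical file names, for illustration only.
candidates = [
    "Laptop14/train.txt",
    "Laptop14/README.md",
    "Laptop14/results.log",
    "Laptop14/test.txt",
]
print(keep_dataset_files(candidates))
```
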
pyabsa.utils.data_utils.dataset_manager.detect_dataset(dataset_name_or_path, task_code: pyabsa.framework.flag_class.TaskCodeOption = None, load_aug=False, config=None, **kwargs)

Detect a dataset from its name or path. You must specify the task type, for example TaskCodeOption.Aspect_Polarity_Classification, 'atepc', or 'tc'.

Parameters:
  • dataset_name_or_path (str or DatasetItem) – The name or path of the dataset.

  • task_code (str or TaskCodeOption) – The task type, such as “apc” for aspect-polarity classification or “tc” for text classification.

  • load_aug (bool, default False) – Whether to load the augmented dataset.

  • config (Config, optional) – The configuration object.

  • kwargs (dict) – Additional keyword arguments.

Returns:

dict – A dictionary containing file paths for the train, test, and validation sets.
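
A short sketch of how the returned dictionary might be inspected. The key names "train", "test", and "valid", the helper summarize_splits, and the sample file names are assumptions; the documentation states only that the dictionary holds file paths for the three splits. The commented-out call follows the signature above and needs PyABSA installed.

```python
from pathlib import Path


def summarize_splits(dataset_files):
    """Count the file paths listed under each split (hypothetical helper)."""
    summary = {}
    for split, paths in dataset_files.items():
        # Assumption: a split may hold a single path or a list of paths.
        if isinstance(paths, (str, Path)):
            paths = [paths]
        summary[split] = len(list(paths))
    return summary


# With PyABSA installed, the dictionary would come from a call such as:
# from pyabsa.utils.data_utils.dataset_manager import detect_dataset
# dataset_files = detect_dataset("Laptop14", task_code="apc")

# Hypothetical stand-in for the return value; real paths will differ.
dataset_files = {
    "train": ["Laptop14/train.txt"],
    "test": ["Laptop14/test.txt"],
    "valid": [],
}
print(summarize_splits(dataset_files))
```

An empty split in the summary is a quick signal that no matching file was detected for it.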

pyabsa.utils.data_utils.dataset_manager.detect_infer_dataset(dataset_name_or_path, task_code: pyabsa.framework.flag_class.TaskCodeOption = None, **kwargs)

Detect the inference dataset on local disk, or download it from GitHub.

Parameters:
  • dataset_name_or_path – The dataset name or path.

  • task_code – The task name.

  • kwargs – Other arguments.

pyabsa.utils.data_utils.dataset_manager.download_all_available_datasets(**kwargs)

Download datasets from GitHub.

Parameters:
  • kwargs – Other arguments.

pyabsa.utils.data_utils.dataset_manager.download_dataset_by_name(task_code: pyabsa.framework.flag_class.TaskCodeOption | str = TaskCodeOption.Aspect_Polarity_Classification, dataset_name: pyabsa.utils.data_utils.dataset_item.DatasetItem | str = None, **kwargs)

If downloading all datasets fails, try downloading a dataset by name from Hugging Face: https://huggingface.co/spaces/yangheng/PyABSA

Parameters:
  • task_code – The task code, e.g., TaskCodeOption.Aspect_Polarity_Classification.

  • dataset_name – The dataset name, e.g., pyabsa.tasks.AspectPolarityClassification.APCDatasetList.Laptop14.
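
The fallback order described here (bulk download first, download by name second) can be sketched as a small wrapper. The helper download_with_fallback and the stub functions are hypothetical. The downloaders are passed in as callables so the sketch runs without PyABSA or network access, and the assumption that a failed bulk download raises an exception is not confirmed by the documentation.

```python
def download_with_fallback(download_all, download_one, task_code, dataset_name):
    """Try the bulk download; on any error, fetch one dataset by name."""
    try:
        download_all()
        return "all"
    except Exception as err:
        # Assumption: a failed bulk download surfaces as an exception.
        print(f"Bulk download failed ({err}); falling back to download by name")
        download_one(task_code=task_code, dataset_name=dataset_name)
        return "by_name"


# Stubs standing in for download_all_available_datasets and
# download_dataset_by_name; they only record what was requested.
requested = []


def failing_bulk_download():
    raise ConnectionError("GitHub unreachable")


def record_single_download(task_code, dataset_name):
    requested.append((task_code, dataset_name))


result = download_with_fallback(
    failing_bulk_download, record_single_download, "apc", "Laptop14"
)
print(result, requested)
```

In real use, the two callables would be download_all_available_datasets and download_dataset_by_name from this module.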