pyabsa.tasks.AspectTermExtraction.dataset_utils.__lcf__.data_utils_for_training

Module Contents

Classes

InputExample

A single training/test example for simple sequence classification.

InputFeatures

A single set of features of data.

DataProcessor

Base class for data converters for sequence classification data sets.

ATEPCProcessor

Processor for the CoNLL-2003 data set.

Functions

readfile(filename)

Reads the file at `filename`.

split_aspect(tag1[, tag2])

convert_examples_to_features(examples, max_seq_len, ...)

Loads a data file into a list of `InputBatch`s.

Attributes

Labels

pyabsa.tasks.AspectTermExtraction.dataset_utils.__lcf__.data_utils_for_training.Labels

class pyabsa.tasks.AspectTermExtraction.dataset_utils.__lcf__.data_utils_for_training.InputExample(guid, text_a, text_b=None, IOB_label=None, aspect_label=None, polarity=None)

Bases: object

A single training/test example for simple sequence classification.
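
As a rough illustration of the constructor signature above, the following stand-in class mirrors the documented parameters. It is a sketch, not the library's implementation: storing each argument under an attribute of the same name is an assumption, and the class name `InputExampleSketch` and the sample values are hypothetical.

```python
class InputExampleSketch:
    """Hypothetical stand-in mirroring the documented InputExample signature."""

    def __init__(self, guid, text_a, text_b=None, IOB_label=None,
                 aspect_label=None, polarity=None):
        # Assumption: each argument is stored under an attribute of the same name.
        self.guid = guid
        self.text_a = text_a
        self.text_b = text_b
        self.IOB_label = IOB_label
        self.aspect_label = aspect_label
        self.polarity = polarity


# Illustrative values only; the real field contents depend on the dataset.
example = InputExampleSketch(
    guid="train-0",
    text_a=["The", "food", "was", "great"],
    IOB_label=["O", "B-ASP", "O", "O"],
)
print(example.guid, example.text_a, example.IOB_label)
```

Only `guid` and `text_a` are required by the signature; every other field defaults to `None`.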

class pyabsa.tasks.AspectTermExtraction.dataset_utils.__lcf__.data_utils_for_training.InputFeatures(input_ids_spc, input_mask, segment_ids, label_id, polarity=None, valid_ids=None, label_mask=None, tokens=None, lcf_cdm_vec=None, lcf_cdw_vec=None)

Bases: object

A single set of features of data.

pyabsa.tasks.AspectTermExtraction.dataset_utils.__lcf__.data_utils_for_training.readfile(filename)

Reads the file at `filename`.

pyabsa.tasks.AspectTermExtraction.dataset_utils.__lcf__.data_utils_for_training.split_aspect(tag1, tag2=None)

class pyabsa.tasks.AspectTermExtraction.dataset_utils.__lcf__.data_utils_for_training.DataProcessor

Bases: object

Base class for data converters for sequence classification data sets.

abstract get_train_examples(data_dir)

Gets a collection of `InputExample`s for the train set.

abstract get_dev_examples(data_dir)

Gets a collection of `InputExample`s for the dev set.

abstract get_labels()

Gets the list of labels for this data set.

classmethod _read_tsv(input_file, quotechar=None)

Reads a tab separated value file.
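
The processor pattern described above can be sketched as follows. This is a minimal, self-contained approximation rather than the library's code: the `csv`-based body of `_read_tsv`, the `NotImplementedError` placeholders, and the `ToyProcessor` subclass with its `train.tsv` file name and labels are all assumptions made for illustration.

```python
import csv
import os
import tempfile


class DataProcessorSketch:
    """Hypothetical base class following the documented method names."""

    def get_train_examples(self, data_dir):
        raise NotImplementedError()

    def get_dev_examples(self, data_dir):
        raise NotImplementedError()

    def get_labels(self):
        raise NotImplementedError()

    @classmethod
    def _read_tsv(cls, input_file, quotechar=None):
        # Assumption: rows are parsed with the standard csv module.
        with open(input_file, "r", encoding="utf-8", newline="") as f:
            if quotechar is None:
                # No quote character given: treat quotes as ordinary text.
                reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
            else:
                reader = csv.reader(f, delimiter="\t", quotechar=quotechar)
            return [row for row in reader]


class ToyProcessor(DataProcessorSketch):
    """Hypothetical subclass: reads (label, text) rows from train.tsv."""

    def get_train_examples(self, data_dir):
        return self._read_tsv(os.path.join(data_dir, "train.tsv"))

    def get_labels(self):
        return ["negative", "positive"]


# Write a tiny TSV file to a temporary directory and read it back.
with tempfile.TemporaryDirectory() as data_dir:
    path = os.path.join(data_dir, "train.tsv")
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write("positive\tThe food was great\n")
        f.write("negative\tThe service was slow\n")
    rows = ToyProcessor().get_train_examples(data_dir)

print(rows)
```

Keeping file parsing in a shared class method lets each concrete processor focus on turning rows into examples.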

class pyabsa.tasks.AspectTermExtraction.dataset_utils.__lcf__.data_utils_for_training.ATEPCProcessor(tokenizer)

Bases: DataProcessor

Processor for the CoNLL-2003 data set.

get_train_examples(data_dir, set_tag)

See base class.

get_valid_examples(data_dir, set_tag)

See base class.

get_test_examples(data_dir, set_tag)

See base class.

get_labels()

Gets the list of labels for this data set.

_create_examples(lines, set_type)

pyabsa.tasks.AspectTermExtraction.dataset_utils.__lcf__.data_utils_for_training.convert_examples_to_features(examples, max_seq_len, tokenizer, config=None)

Loads a data file into a list of `InputBatch`s.