pyabsa.augmentation.aug_utils

Module Contents

Functions

contextual_code_noise_instance(code[, noise_level, ...]) → str

Apply contextual noise to code using replace, insert, and delete operations.

contextual_noise_instance(text, tokenizer[, ...])

Return an augmented instance of the input text.

__word_noise_instance(text, tokenizer, noise_level, ...)

Return an augmented instance of the input text.

__char_noise_instance(text, tokenizer, noise_level, ...)

Return an augmented instance of the input text.

__token_noise_instance(text, tokenizer, noise_level, ...)

Return an augmented instance of the input text.

contextual_ids_noise_instance(ids, tokenizer[, ...])

Return an augmented instance of the input ids.

__ids_mask_instance(ids, tokenizer, noise_level, **kwargs)

Return an augmented instance of the input ids.

__ids_random__instance(ids, tokenizer, noise_level, ...)

Return an augmented instance of the input ids.

pyabsa.augmentation.aug_utils.contextual_code_noise_instance(code: str, noise_level: float = 0.15, noise_type: str = 'hybrid', **kwargs) → str

Apply contextual noise to code using replace, insert, and delete operations.
Parameters:
  • code – input code

  • noise_level – noise level

  • noise_type – noise type, can be {word, char, token}; the signature default is 'hybrid'

  • kwargs – other arguments

Returns:

augmented instance
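To make the replace, insert, and delete operations concrete, here is a minimal standalone sketch. It is not PyABSA's implementation: the function name `toy_code_noise`, the whitespace tokenization, and the `seed` parameter are all assumptions made for illustration.

```python
import random


def toy_code_noise(code: str, noise_level: float = 0.15, seed: int = 0) -> str:
    """Hypothetical stand-in: perturb whitespace-separated tokens of `code`."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    tokens = code.split()
    if not tokens:
        return code
    out = []
    for tok in tokens:
        # Each token is perturbed with probability `noise_level`.
        if rng.random() >= noise_level:
            out.append(tok)
            continue
        op = rng.choice(["replace", "insert", "delete"])
        if op == "replace":
            out.append(rng.choice(tokens))  # swap in a token drawn from the input
        elif op == "insert":
            out.extend([tok, rng.choice(tokens)])  # keep the token, add one after it
        # "delete": append nothing, so the token is dropped
    return " ".join(out)


if __name__ == "__main__":
    print(toy_code_noise("def add ( a , b ) : return a + b", noise_level=0.3))
```

Drawing replacement tokens from the input itself keeps the sketch dependency-free; a contextual approach would more likely propose replacements with a language model.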

pyabsa.augmentation.aug_utils.contextual_noise_instance(text: str, tokenizer, noise_level: float = 0.15, noise_type: str = 'word', **kwargs)
Parameters:
  • text – input text

  • tokenizer – tokenizer

  • noise_level – noise level

  • noise_type – noise type, can be {word, char, token}

  • kwargs – other arguments

Returns:

augmented instance
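The noise_type parameter suggests that the call is routed to one of the word, char, or token helpers documented below. A minimal sketch of that dispatch pattern follows, assuming simple drop-based noise. The names `toy_noise`, `_toy_word_noise`, and `_toy_char_noise` and the `seed` parameter are hypothetical, and a token-level variant is omitted because it would need a real tokenizer.

```python
import random
from typing import Callable, Dict


def _toy_word_noise(text: str, noise_level: float, rng: random.Random) -> str:
    # Drop each whitespace-separated word with probability `noise_level`.
    return " ".join(w for w in text.split() if rng.random() >= noise_level)


def _toy_char_noise(text: str, noise_level: float, rng: random.Random) -> str:
    # Drop each character with probability `noise_level`.
    return "".join(c for c in text if rng.random() >= noise_level)


# Hypothetical mapping from a noise_type string to a helper function.
_TOY_DISPATCH: Dict[str, Callable[[str, float, random.Random], str]] = {
    "word": _toy_word_noise,
    "char": _toy_char_noise,
}


def toy_noise(text: str, noise_level: float = 0.15,
              noise_type: str = "word", seed: int = 0) -> str:
    """Route `text` to the helper selected by `noise_type`."""
    if noise_type not in _TOY_DISPATCH:
        raise ValueError(f"unsupported noise_type: {noise_type!r}")
    return _TOY_DISPATCH[noise_type](text, noise_level, random.Random(seed))


if __name__ == "__main__":
    print(toy_noise("the battery life is great", noise_level=0.3, noise_type="word"))
    print(toy_noise("the battery life is great", noise_level=0.1, noise_type="char"))
```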

pyabsa.augmentation.aug_utils.__word_noise_instance(text, tokenizer, noise_level, **kwargs)
Parameters:
  • text – input text

  • tokenizer – tokenizer

  • noise_level – noise level

  • kwargs – other arguments

Returns:

augmented instance

pyabsa.augmentation.aug_utils.__char_noise_instance(text, tokenizer, noise_level, **kwargs)
Parameters:
  • text – input text

  • tokenizer – tokenizer

  • noise_level – noise level

  • kwargs – other arguments

Returns:

augmented instance

pyabsa.augmentation.aug_utils.__token_noise_instance(text, tokenizer, noise_level, **kwargs)
Parameters:
  • text – input text

  • tokenizer – tokenizer

  • noise_level – noise level

  • kwargs – other arguments

Returns:

augmented instance

pyabsa.augmentation.aug_utils.contextual_ids_noise_instance(ids: List[int], tokenizer, noise_level: float = 0.15, noise_type: str = 'mask', **kwargs)
Parameters:
  • ids – input ids

  • tokenizer – tokenizer

  • noise_level – noise level

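  • noise_type – noise type; the signature default is 'mask'. The docstring lists {word, char, token}, although the helper names __ids_mask_instance and __ids_random__instance suggest mask and random are the options here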

  • kwargs – other arguments

Returns:

augmented instance

pyabsa.augmentation.aug_utils.__ids_mask_instance(ids, tokenizer, noise_level, **kwargs)
Parameters:
  • ids – input ids

  • tokenizer – tokenizer

  • noise_level – noise level

  • kwargs – other arguments

Returns:

augmented instance
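As an illustration of what masking ids at a given noise level can look like, here is a minimal sketch. It is an assumption about the general technique, not the library's code: `toy_mask_ids`, the explicit `mask_id` argument, and `seed` are hypothetical. With a Hugging Face tokenizer the mask id would typically come from `tokenizer.mask_token_id`.

```python
import random
from typing import List


def toy_mask_ids(ids: List[int], mask_id: int,
                 noise_level: float = 0.15, seed: int = 0) -> List[int]:
    """Hypothetical stand-in: replace each id with `mask_id` with probability `noise_level`."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [mask_id if rng.random() < noise_level else i for i in ids]


if __name__ == "__main__":
    # 103 is used here only as an example mask id.
    print(toy_mask_ids([101, 2023, 2003, 2307, 102], mask_id=103, noise_level=0.3))
```

The output always has the same length as the input, which keeps position-aligned labels valid after augmentation.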

pyabsa.augmentation.aug_utils.__ids_random__instance(ids, tokenizer, noise_level, **kwargs)
Parameters:
  • ids – input ids

  • tokenizer – tokenizer

  • noise_level – noise level

  • kwargs – other arguments

Returns:

augmented instance
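The function name suggests that ids are swapped for random ones rather than for a mask token. A minimal sketch of that idea, under stated assumptions: `toy_random_ids`, the `vocab_size` argument, and `seed` are hypothetical, and the library's own sampling strategy may differ.

```python
import random
from typing import List


def toy_random_ids(ids: List[int], vocab_size: int,
                   noise_level: float = 0.15, seed: int = 0) -> List[int]:
    """Hypothetical stand-in: replace each id with a random vocabulary id
    with probability `noise_level`."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [
        rng.randrange(vocab_size) if rng.random() < noise_level else i
        for i in ids
    ]


if __name__ == "__main__":
    print(toy_random_ids([101, 2023, 2003, 2307, 102], vocab_size=30522, noise_level=0.3))
```

A real implementation would probably exclude special-token ids from both the positions that may be replaced and the replacement pool; the sketch leaves that out for brevity.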