pyabsa.tasks.CodeDefectDetection.dataset_utils.cdd_utils

Module Contents

Classes

CodeLineIterator

Built-in mutable sequence.

Functions

random_indices(source, percentage)

_switch_token(tokens, ids)

_replace_token(tokens, ids)

_delete_token(tokens, ids)

_add_token(tokens, ids)

_prepare_corrupt_code(code_src)

remove_comment(code_str[, tokenizer])

Remove comments from code string.

read_defect_examples(lines, data_num[, ...])

Read examples from filename.

calc_stats(examples[, tokenizer, is_tokenize])

class pyabsa.tasks.CodeDefectDetection.dataset_utils.cdd_utils.CodeLineIterator(code, strip=True)[source]

Bases: list

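Built-in mutable sequence (docstring inherited from list).

If no argument is given, the list constructor creates a new empty list. The argument must be an iterable if specified. This class's own constructor takes code and strip, as shown in the signature above.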

__getitem__(item)[source]

x.__getitem__(y) <==> x[y]

__setitem__(key, value)[source]

Set self[key] to value.

__iter__()[source]

Implement iter(self).

__len__()[source]

Return len(self).

__str__()[source]

Return str(self).
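The page lists the overridden special methods but not their behavior. The following is a minimal sketch of how a list subclass with the same constructor signature could expose source lines through those methods. LineSequence and its lines attribute are hypothetical names, and the splitting and stripping rules are assumptions, not the pyabsa implementation.

```python
class LineSequence(list):
    """Hypothetical stand-in for illustration only; not the pyabsa class."""

    def __init__(self, code, strip=True):
        super().__init__()
        # Assumption: the source text is split on newlines.
        lines = code.split("\n")
        if strip:
            # Assumption: strip=True trims whitespace and drops blank lines.
            lines = [line.strip() for line in lines if line.strip()]
        self.lines = lines

    def __getitem__(self, item):
        # Index or slice into the stored lines.
        return self.lines[item]

    def __setitem__(self, key, value):
        # Overwrite one stored line.
        self.lines[key] = value

    def __iter__(self):
        # Iterate over the stored lines in order.
        return iter(self.lines)

    def __len__(self):
        # Number of stored lines.
        return len(self.lines)

    def __str__(self):
        # Join the stored lines back into one source string.
        return "\n".join(self.lines)


seq = LineSequence("a = 1\n\n    b = 2\n")
seq[0] = "a = 10"
print(len(seq))
print(str(seq))
```

Keeping the lines in a separate attribute means every access goes through the overridden methods, which is one plausible reason for a list subclass to redefine all five of them.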

pyabsa.tasks.CodeDefectDetection.dataset_utils.cdd_utils.random_indices(source, percentage)[source]

pyabsa.tasks.CodeDefectDetection.dataset_utils.cdd_utils._switch_token(tokens: list, ids: list)[source]

pyabsa.tasks.CodeDefectDetection.dataset_utils.cdd_utils._replace_token(tokens: list, ids: list)[source]

pyabsa.tasks.CodeDefectDetection.dataset_utils.cdd_utils._delete_token(tokens: list, ids: list)[source]

pyabsa.tasks.CodeDefectDetection.dataset_utils.cdd_utils._add_token(tokens: list, ids: list)[source]

pyabsa.tasks.CodeDefectDetection.dataset_utils.cdd_utils._prepare_corrupt_code(code_src)[source]

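The functions above are undocumented. Judging only by their names and their (tokens, ids) signatures, they appear to select token positions and perturb the tokens at those positions. The sketch below illustrates that general idea under stated assumptions: pick_indices, swap_with_next and delete_at are hypothetical helpers, percentage is assumed to be a fraction between 0 and 1, and none of this is taken from the pyabsa source.

```python
import random


def pick_indices(source, percentage, rng=random):
    """Return sorted random positions covering `percentage` of `source`.

    Assumption: `percentage` is a fraction in [0, 1].
    """
    count = int(len(source) * percentage)
    return sorted(rng.sample(range(len(source)), count))


def swap_with_next(tokens, ids):
    """Swap each selected token with its right-hand neighbour."""
    out = list(tokens)  # work on a copy so the input is not mutated
    for i in ids:
        if i + 1 < len(out):
            out[i], out[i + 1] = out[i + 1], out[i]
    return out


def delete_at(tokens, ids):
    """Drop the tokens at the selected positions."""
    drop = set(ids)
    return [tok for i, tok in enumerate(tokens) if i not in drop]


tokens = ["int", "x", "=", "1", ";"]
ids = pick_indices(tokens, 0.4, rng=random.Random(0))
print(swap_with_next(tokens, ids))
print(delete_at(tokens, ids))
```

Passing a seeded random.Random instance as rng makes the selection reproducible, which helps when corrupted samples need to be regenerated.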
pyabsa.tasks.CodeDefectDetection.dataset_utils.cdd_utils.remove_comment(code_str, tokenizer=None)[source]

Remove comments from code string.

:param code_str: code string
:param tokenizer: tokenizer; if passed, a <mask> token is added to the code
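As an illustration of comment removal in general, here is a minimal sketch that strips C-style comments with a regular expression while leaving string literals untouched. It assumes the input uses // and /* */ comments, it ignores the tokenizer and <mask> behavior described above, and strip_c_style_comments is a hypothetical name, not the library function.

```python
import re

# Alternation order matters: string literals are matched first so that
# comment markers inside them are never treated as comments.
_COMMENT_OR_STRING = re.compile(
    r'("(?:\\.|[^"\\])*"|\'(?:\\.|[^\'\\])*\')'  # group 1: string literal, kept
    r'|(/\*.*?\*/|//[^\n]*)',                    # group 2: comment, removed
    re.DOTALL,
)


def strip_c_style_comments(code_str):
    """Remove // and /* */ comments; keep string literals unchanged."""
    # A comment match leaves group 1 as None, so it is replaced by "".
    return _COMMENT_OR_STRING.sub(lambda m: m.group(1) or "", code_str)


source = 'int x = 1; // set x\n/* block */ char *s = "// not a comment";'
print(strip_c_style_comments(source))
```

Whitespace around a removed comment is left in place, so a caller that needs compact output would have to trim each line afterwards.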

pyabsa.tasks.CodeDefectDetection.dataset_utils.cdd_utils.read_defect_examples(lines, data_num, remove_comments=True, tokenizer=None)[source]

Read examples from lines. The upstream docstring says "from filename", but the first parameter in the signature is lines.

pyabsa.tasks.CodeDefectDetection.dataset_utils.cdd_utils.calc_stats(examples, tokenizer=None, is_tokenize=False)[source]