pyabsa.utils.file_utils.file_utils
Module Contents
Functions
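| meta_load(path, **kwargs) | Load data from a file, which can be a plain text, JSON, Excel, pickle, NumPy, torch, or pandas file, etc. |
| meta_save(data, path, **kwargs) | Save data to a file, which can be a plain text, JSON, Excel, pickle, NumPy, torch, or pandas file, etc. |
| save_jsonl(data, file_path, **kwargs) | Save data to a JSONL file. |
| save_txt(data, file_path, **kwargs) | Save data to a plain text file. |
| save_json(data, file_path, **kwargs) | Save data to a JSON file. |
| save_excel(data, file_path, **kwargs) | Save data to an Excel file. |
| save_csv(data, file_path, **kwargs) | Save data to a CSV file. |
| save_npy(data, file_path, **kwargs) | Save data to a NumPy file. |
| save_torch(data, file_path, **kwargs) | Save data to a torch file. |
| save_pickle(data, file_path, **kwargs) | Save data to a pickle file. |
| load_excel(file_path, **kwargs) | Load an Excel file and return the data. |
| load_csv(file_path, **kwargs) | Load a CSV file and return the data. |
| load_npy(file_path, **kwargs) | Load a NumPy file and return the data. |
| load_torch(file_path, **kwargs) | Load a torch file and return the data. |
| load_pickle(file_path, **kwargs) | Load a pickle file and return the data. |
| load_txt(file_path) | Load a plain text file and return a list of strings. |
| remove_empty_line(files) | Remove empty lines from the input files. |
| save_json(dic, save_path) | Save a Python dictionary to a JSON file. |
| load_json(file_path, **kwargs) | Load a JSON file and return a Python dictionary. |
| load_jsonl(file_path, **kwargs) | Load a JSONL file and return a list of Python dictionaries. |
| load_dataset_from_file(fname, config) | Load a dataset from one or multiple files. |
| prepare_glove840_embedding(glove_path, embedding_dim, config) | Check if the provided GloVe embedding exists; if not, search for a similar file in the current directory, or download the 840B GloVe embedding. |
| unzip_checkpoint(zip_path) | Unzip a checkpoint file in zip format. |
| save_model(config, model, tokenizer, save_path, **kwargs) | Save a trained model, configuration, and tokenizer to the specified path. |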
- pyabsa.utils.file_utils.file_utils.meta_load(path, **kwargs)
Load data from a file, which can be a plain text, JSON, Excel, pickle, NumPy, torch, or pandas file, etc. File types: txt, json, pickle, npy, pkl, pt, torch, csv, xlsx, xls.
- Parameters:
path (str) – The path to the file.
kwargs – Other arguments for the corresponding load function.
- Returns:
The loaded data.
- pyabsa.utils.file_utils.file_utils.meta_save(data, path, **kwargs)
Save data to a file, which can be a plain text, JSON, Excel, pickle, NumPy, torch, or pandas file, etc. File types: txt, json, pickle, npy, pkl, pt, torch, csv, xlsx, xls.
- Parameters:
data – The data to be saved.
path (str) – The path to the file.
kwargs – Other arguments for the corresponding save function.
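The extension-based dispatch that meta_save and meta_load describe can be pictured as a lookup from file extension to a format-specific function. The following is a minimal sketch using only the standard library and is not PyABSA's implementation: the names `sketch_save`, `sketch_load`, `_SAVERS`, and `_LOADERS` are hypothetical, only the txt, json, pkl, and pickle types are covered, and raising `ValueError` for an unknown extension is an assumption.

```python
import json
import os
import pickle


def _save_txt(data, path):
    # Write one item per line.
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(str(item) for item in data))


def _load_txt(path):
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()


def _save_json(data, path):
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)


def _load_json(path):
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def _save_pickle(data, path):
    with open(path, "wb") as f:
        pickle.dump(data, f)


def _load_pickle(path):
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical lookup tables; the documented module also lists
# npy, pt, torch, csv, xlsx, and xls, which are omitted here.
_SAVERS = {".txt": _save_txt, ".json": _save_json,
           ".pkl": _save_pickle, ".pickle": _save_pickle}
_LOADERS = {".txt": _load_txt, ".json": _load_json,
            ".pkl": _load_pickle, ".pickle": _load_pickle}


def sketch_save(data, path):
    """Pick a save function from the file extension (assumed behavior)."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in _SAVERS:
        raise ValueError(f"Unsupported file type: {ext!r}")
    _SAVERS[ext](data, path)


def sketch_load(path):
    """Pick a load function from the file extension (assumed behavior)."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in _LOADERS:
        raise ValueError(f"Unsupported file type: {ext!r}")
    return _LOADERS[ext](path)
```

With this layout, supporting another file type amounts to adding one entry to each table.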
- pyabsa.utils.file_utils.file_utils.save_jsonl(data, file_path, **kwargs)
Save data to a JSONL file.
- pyabsa.utils.file_utils.file_utils.save_txt(data, file_path, **kwargs)
Save data to a plain text file.
- pyabsa.utils.file_utils.file_utils.save_json(data, file_path, **kwargs)
Save data to a JSON file.
- pyabsa.utils.file_utils.file_utils.save_excel(data, file_path, **kwargs)
Save data to an Excel file.
- pyabsa.utils.file_utils.file_utils.save_csv(data, file_path, **kwargs)
Save data to a CSV file.
- pyabsa.utils.file_utils.file_utils.save_npy(data, file_path, **kwargs)
Save data to a NumPy file.
- pyabsa.utils.file_utils.file_utils.save_torch(data, file_path, **kwargs)
Save data to a torch file.
- pyabsa.utils.file_utils.file_utils.save_pickle(data, file_path, **kwargs)
Save data to a pickle file.
- pyabsa.utils.file_utils.file_utils.load_excel(file_path, **kwargs)
Load an Excel file and return the data.
- pyabsa.utils.file_utils.file_utils.load_csv(file_path, **kwargs)
Load a CSV file and return the data.
- pyabsa.utils.file_utils.file_utils.load_npy(file_path, **kwargs)
Load a NumPy file and return the data.
- pyabsa.utils.file_utils.file_utils.load_torch(file_path, **kwargs)
Load a torch file and return the data.
- pyabsa.utils.file_utils.file_utils.load_pickle(file_path, **kwargs)
Load a pickle file and return the data.
- pyabsa.utils.file_utils.file_utils.load_txt(file_path)
Load a plain text file and return a list of strings.
- pyabsa.utils.file_utils.file_utils.remove_empty_line(files: Union[str, List[str]])
Remove empty lines from the input files.
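As an illustration of what removing empty lines from one or several files can look like, here is a minimal sketch. It is not PyABSA's implementation: the name `strip_empty_lines` is hypothetical, and two behaviors are assumptions, namely that files are rewritten in place and that whitespace-only lines count as empty.

```python
from typing import List, Union


def strip_empty_lines(files: Union[str, List[str]]) -> None:
    """Rewrite each file in place without empty or whitespace-only lines."""
    # Accept a single path as well as a list of paths.
    if isinstance(files, str):
        files = [files]
    for path in files:
        with open(path, encoding="utf-8") as f:
            kept = [line for line in f if line.strip()]
        with open(path, "w", encoding="utf-8") as f:
            f.writelines(kept)
```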
- pyabsa.utils.file_utils.file_utils.save_json(dic, save_path)
Save a Python dictionary to a JSON file.
- pyabsa.utils.file_utils.file_utils.load_json(file_path, **kwargs)
Load a JSON file and return a Python dictionary.
- pyabsa.utils.file_utils.file_utils.load_jsonl(file_path, **kwargs)
Load a JSONL file and return a list of Python dictionaries.
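The JSONL format stores one JSON object per line, which is why loading it yields a list of dictionaries. The round trip can be sketched with the standard `json` module as follows. The names `write_jsonl` and `read_jsonl` are hypothetical stand-ins, not the PyABSA functions, and skipping blank lines on read is an assumption.

```python
import json


def write_jsonl(records, file_path):
    """Write each record as one JSON document per line."""
    with open(file_path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(file_path):
    """Parse each non-blank line as JSON and return the results as a list."""
    with open(file_path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```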
- pyabsa.utils.file_utils.file_utils.load_dataset_from_file(fname, config)
Loads a dataset from one or multiple files.
- Parameters:
fname (str or List[str]) – The name of the file(s) containing the dataset.
config (dict) – The configuration dictionary, which may contain a logger (optional) and the maximum number of samples to load (optional).
- Returns:
A list of strings containing the loaded dataset.
- Raises:
ValueError – If an empty line is found in the dataset.
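The documented contract (one or several files in, a list of strings out, `ValueError` on an empty line) can be sketched as below. This is an illustration under stated assumptions rather than PyABSA's code: the function name `load_lines` and the config keys `"logger"` and `"max_items"` are hypothetical, and applying the limit to the combined result instead of per file is a guess.

```python
from typing import List, Union


def load_lines(fname: Union[str, List[str]], config: dict) -> List[str]:
    """Read all lines from one or more files, rejecting empty lines."""
    paths = [fname] if isinstance(fname, str) else list(fname)
    logger = config.get("logger")        # hypothetical key, optional
    max_items = config.get("max_items")  # hypothetical key, optional

    lines = []
    for path in paths:
        if logger is not None:
            logger.info("Loading dataset from %s", path)
        with open(path, encoding="utf-8") as f:
            for number, raw in enumerate(f, start=1):
                line = raw.rstrip("\n")
                if not line.strip():
                    raise ValueError(f"Empty line {number} in {path}")
                lines.append(line)

    # Assumed: the limit applies to the combined list.
    if max_items is not None:
        lines = lines[:max_items]
    return lines
```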
- pyabsa.utils.file_utils.file_utils.prepare_glove840_embedding(glove_path, embedding_dim, config)
Check if the provided GloVe embedding exists; if not, search for a similar file in the current directory, or download the 840B GloVe embedding. If none of the above exists, raise an error.
- Parameters:
glove_path (str) – Path to the GloVe embedding.
embedding_dim (int) – The dimension of the embedding.
config (dict) – Configuration dictionary.
- Returns:
The path to the GloVe embedding.
- Return type:
str
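The first two fallback steps described for prepare_glove840_embedding (use the given path if it exists, otherwise look for a similar file nearby) can be sketched as follows. This is a simplified illustration, not the library's logic: the name `find_glove_file`, the `search_dir` parameter, and the matching rule (a `.txt` file whose name contains "glove" and the dimension tag such as "300d") are all assumptions, and the download step is left out entirely.

```python
import os


def find_glove_file(glove_path, embedding_dim, search_dir="."):
    """Return glove_path if it exists, else a similar-looking file under search_dir."""
    if glove_path and os.path.isfile(glove_path):
        return glove_path

    # Assumed notion of "similar": name mentions glove and the dimension.
    dim_tag = f"{embedding_dim}d"
    for root, _dirs, names in os.walk(search_dir):
        for name in sorted(names):
            lowered = name.lower()
            if "glove" in lowered and dim_tag in lowered and lowered.endswith(".txt"):
                return os.path.join(root, name)

    # The documented function would try a download before failing;
    # this sketch raises immediately instead.
    raise FileNotFoundError(
        f"No GloVe file with dimension {embedding_dim} found under {search_dir!r}"
    )
```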
- pyabsa.utils.file_utils.file_utils.unzip_checkpoint(zip_path)
Unzip a checkpoint file in zip format.
- Parameters:
zip_path (str) – path to the zip file.
- Returns:
path to the unzipped checkpoint directory.
- Return type:
str
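Unpacking a zipped checkpoint and returning the resulting directory can be sketched with the standard `zipfile` module. The name `extract_checkpoint` is hypothetical, and the choice of target directory (the zip path with its extension removed) is an assumption, not something the documentation above specifies.

```python
import os
import zipfile


def extract_checkpoint(zip_path: str) -> str:
    """Extract zip_path into a sibling directory and return that directory."""
    # Assumed layout: "models/ckpt.zip" is unpacked into "models/ckpt".
    target_dir = os.path.splitext(zip_path)[0]
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
    return target_dir
```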
- pyabsa.utils.file_utils.file_utils.save_model(config, model, tokenizer, save_path, **kwargs)
Save a trained model, configuration, and tokenizer to the specified path.
- Parameters:
config (Config) – Configuration for the model.
model (nn.Module) – The trained model.
tokenizer – Tokenizer used by the model.
save_path (str) – The path where to save the model, config, and tokenizer.
**kwargs – Additional keyword arguments.