
NLPDL Final Project (2022 Autumn)

Chinese QG

This project is based on the pretrained model Randeng-BART-139M-QG-Chinese.

In this project, I implemented answer separation to improve the performance of the QG model, as suggested by this paper. To further guide the model's decoding, I introduced a post-selection method that picks, among the generated candidates, the one with the highest overlap with the given answer. While analyzing the data, I found significant differences between the training and test sets, so I applied a data augmentation technique called "pseudo_q" to reduce this gap. The report is available here.
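As a minimal sketch of the post-selection step (the character-level overlap score below is an assumption; the repo's exact similarity measure may differ):

def answer_overlap(question, answer):
    # Fraction of answer characters that also appear in the generated question;
    # character-level matching suits Chinese text, which has no word spacing.
    if not answer:
        return 0.0
    return sum(ch in question for ch in answer) / len(answer)

def post_select(candidates, answer):
    # Keep the candidate question that overlaps most with the given answer,
    # e.g. candidates produced via model.generate(..., num_return_sequences=k).
    return max(candidates, key=lambda q: answer_overlap(q, answer))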

Run Model

Preparation

cd NLPDL_final
mkdir pretrained_model
mkdir saved_model
mkdir output

# Download the pretrained QG model and place it in pretrained_model/bart_zh
# from https://huggingface.co/IDEA-CCNL/Randeng-BART-139M-QG-Chinese

# Download the pretrained QA model and place it in pretrained_model/QA
# from https://huggingface.co/uer/roberta-base-chinese-extractive-qa
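Once downloaded, both models should load from the local paths above via the standard transformers auto classes; a quick sanity check (a sketch, assuming the directory layout above):

from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    AutoModelForQuestionAnswering,
)

# QG model: Randeng-BART-139M-QG-Chinese, a BART-style seq2seq model
qg_tokenizer = AutoTokenizer.from_pretrained("pretrained_model/bart_zh")
qg_model = AutoModelForSeq2SeqLM.from_pretrained("pretrained_model/bart_zh")

# QA model: roberta-base-chinese-extractive-qa, an extractive span QA model
qa_tokenizer = AutoTokenizer.from_pretrained("pretrained_model/QA")
qa_model = AutoModelForQuestionAnswering.from_pretrained("pretrained_model/QA")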

Preprocess

python data_preprocess/preprocess_squad_zh.py
python data_preprocess/process_pseudo_q.py
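The repo does not spell out what process_pseudo_q.py does; one plausible reading of "pseudo_q" (purely an assumption, not a description of the script) is round-trip pseudo-question generation: generate a question for a test-domain passage/answer pair with the QG model, and keep it only if the QA model recovers the same answer.

from transformers import pipeline

# Both pipeline tasks exist in transformers; the QG input format
# (answer, then context) is an assumption, not the repo's documented format.
qg = pipeline("text2text-generation", model="pretrained_model/bart_zh")
qa = pipeline("question-answering", model="pretrained_model/QA")

def make_pseudo_example(context, answer):
    question = qg(f"{answer}[SEP]{context}", max_length=25)[0]["generated_text"]
    # Round-trip filter: keep only pairs the QA model answers consistently.
    if qa(question=question, context=context)["answer"] == answer:
        return {"context": context, "question": question, "answer": answer}
    return None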

Training_args

You can create your own training_args file to train and evaluate the model.

An example is shown below (the # comments are explanatory; remove them in an actual JSON file).

# train_args

{
    "do_train": true,   # set to true for training
    "do_predict": false, # set to true for evaluation
    "output_dir": "saved_model",
    "save_strategy": "epoch",
    "save_total_limit": 5,
    "num_train_epochs": 5,
    "per_device_train_batch_size": 16,
    "gradient_accumulation_steps": 4,
    "logging_steps": 10,
    "learning_rate": 3e-04,
    "seed": 42,
    "ans_attn": "none",
    "load_pretrain": 1,  # whether to load the pretrained model or train from scratch
    "low_res": 100,      # low-resource scenario
    "aug": "none",       # data augmentation method, from [qa, squad, pseudo_q]
    "sep_ans": 1,        # whether to use answer separation
    "model_type": "as",  # use the plain BART model or the answer-stressed-attention BART model
    "load_path": "",     # checkpoint path, used for evaluation
    "generation_max_length": 25,
    "predict_with_generate": true
}
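Many of these keys (do_train, save_strategy, predict_with_generate, ...) match transformers' Seq2SeqTrainingArguments, so train.py can plausibly split the file with HfArgumentParser; a sketch under that assumption (the ProjectArguments dataclass is hypothetical):

import sys
from dataclasses import dataclass
from transformers import HfArgumentParser, Seq2SeqTrainingArguments

@dataclass
class ProjectArguments:
    # Custom keys from the example above; defaults mirror the sample values.
    ans_attn: str = "none"
    load_pretrain: int = 1
    low_res: int = 100
    aug: str = "none"
    sep_ans: int = 1
    model_type: str = "as"
    load_path: str = ""

parser = HfArgumentParser((ProjectArguments, Seq2SeqTrainingArguments))
project_args, training_args = parser.parse_json_file(json_file=sys.argv[1])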

Train & Eval

python train.py your_train_args.json
python train.py your_eval_args.json
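A plausible minimal eval config, reusing the schema above (the checkpoint path is a placeholder), flips the two flags and sets load_path:

{
    "do_train": false,
    "do_predict": true,
    "output_dir": "output",
    "load_path": "saved_model/your_checkpoint",
    "sep_ans": 1,
    "model_type": "as",
    "generation_max_length": 25,
    "predict_with_generate": true
}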
