Zan Hongying

Also published as: 红英


2024

基于知识蒸馏的低频词翻译优化策略(Knowledge Distillation-Based Optimization Strategy for Low-Frequency Word Translation in Neural Machine)
Guo Yifan (郭逸帆) | Zan Hongying (昝红英) | Yan Ziyue (阎子悦) | Xu Hongfei (许鸿飞)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)

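“Neural machine translation usually requires large parallel corpora to achieve good translation quality. Word-frequency distributions are imbalanced across parallel corpora, which can lead models to exhibit different biases during learning. These models tend to learn high-frequency words and neglect the key semantic information carried by low-frequency words. The neglected low-frequency words also carry important translation information, so ignoring them can harm translation quality. Current methods usually train a bilingual model and then assign words different weights according to frequency, improving low-frequency word translation by increasing the weights of low-frequency words. In this paper, our goal is to improve the translation of words that are meaningful but relatively infrequent. We propose using knowledge distillation to improve low-frequency word translation: we train a model that translates low-frequency words better and use it as a teacher model to guide a student model in learning low-frequency word translation. We further propose a more stable dual-teacher distillation model that additionally safeguards performance on high-frequency words, so that the model obtains stable improvements across multiple tasks. Our single-teacher distillation model achieves a further BLEU improvement of 0.64 over the SOTA on the English→German task, and the dual-teacher distillation model achieves a further BLEU improvement of 0.31 over the SOTA on the Chinese→English task. On the English→German, English→Czech, and English→French translation tasks, low-frequency word translation improves over the baseline by 1.24, 0.47, and 0.87 BLEU, respectively, while high-frequency word translation quality remains unchanged.”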

基于动态提示学习和依存关系的生成式结构化情感分析模型(Dynamic Prompt Learning and Dependency Relation based Generative Structured Sentiment Analysis Model)
Jia Yintao (贾银涛) | Cui Jiajia (崔佳佳) | Mu Lingling (穆玲玲) | Zan Hongying (昝红英)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)

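“Structured sentiment analysis aims to extract from text all sentiment tuples consisting of a sentiment holder, a target, an opinion expression, and a sentiment polarity; it is a relatively comprehensive fine-grained sentiment analysis task. To address error propagation, insufficient adaptability of prompt templates, and the complex composition of sentiment elements in current structured sentiment analysis methods, this paper proposes a generative structured sentiment analysis model based on dynamic prompt learning and dependency relations. Prompt templates are designed separately for the different compositions of sentiment tuples, the templates are used to augment the input of a generative pre-trained model, and dependency relations are used to improve generation. Experimental results show that the proposed model outperforms the compared baseline models in SF1 on the SemEval-2022 dataset.”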

Essay Rhetoric Recognition and Understanding Using Synthetic Data and Model Ensemble Enhanced Large Language Models
Song Jinwang | Zan Hongying | Zhang Kunli
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)

“Natural language processing technology has been widely applied in the field of education. Essay writing serves as a crucial method for evaluating students’ language skills and logical thinking abilities. Rhetoric, an essential component of an essay, is also a key reference for assessing writing quality. In the era of large language models (LLMs), applying LLMs to the automatic classification and extraction of rhetorical devices is of significant importance. In this paper, we fine-tune LLMs with specific instructions to adapt them to the tasks of recognizing and extracting rhetorical devices in essays. To further enhance the performance of LLMs, we experimented with multi-task fine-tuning and expanded the training dataset with synthetic data. Additionally, we explored a model ensemble approach based on label re-inference. Our method achieved a score of 66.29 in Task 6 of the CCL 2024 Eval, Chinese Essay Rhetoric Recognition and Understanding (CERRU), securing first place.”

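基于指令微调与数据增强的儿童故事常识推理与寓意理解研究(Research on Commonsense Reasoning and Moral Understanding in Children's Stories Based on Instruction Fine-Tuning and Data Augmentation)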
Yu Bohan (于博涵) | Li Yunlong (李云龙) | Liu Tao (刘涛) | Zheng Aoze (郑傲泽) | Zhang Kunli (张坤丽) | Zan Hongying (昝红英)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)

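“Although existing language models perform well on natural language processing tasks, there is still room for improvement in deep semantic understanding and commonsense reasoning. This study investigates how to strengthen models on complex tasks by testing their performance on the children's story commonsense reasoning and moral understanding dataset (CRMUS). In Track 2 of this task, we run zero-shot inference with several open-source large models of up to 7B parameters (such as Qwen and InternLM) and select the best-performing model for LoRA-based instruction fine-tuning to improve its performance. In addition, we analyze and augment the dataset. The results show that designing an effective instruction format and tuning the LoRA fine-tuning parameters significantly improves the model's accuracy on commonsense reasoning and moral understanding. Our system ultimately ranked first in Track 2 of this task, with a score of 74.38 on the task's evaluation metric, Acc, which is a relatively advanced level.”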

2023

Learnable Conjunction Enhanced Model for Chinese Sentiment Analysis
Zhao Bingfei | Zan Hongying | Wang Jiajia | Han Yingjie
Proceedings of the 22nd Chinese National Conference on Computational Linguistics

“Sentiment analysis is a crucial text classification task that aims to extract, process, and analyze opinions, sentiments, and subjectivity within texts. In current research on Chinese text, sentence and aspect-based sentiment analysis is mainly tackled through well-designed models. However, despite the importance of word order and function words as essential means of semantic expression in Chinese, they are often underutilized. This paper presents a new Chinese sentiment analysis method that utilizes a Learnable Conjunctions Enhanced Model (LCEM). The LCEM adjusts the general structure of the pre-trained language model and incorporates conjunctions location information into the model’s fine-tuning process. Additionally, we discuss a variant structure of residual connections to construct a residual structure that can learn critical information in the text and optimize it during training. We perform experiments on the public datasets and demonstrate that our approach enhances performance on both sentence and aspect-based sentiment analysis datasets compared to the baseline pre-trained language models. These results confirm the effectiveness of our proposed method.”

2022

MRC-based Medical NER with Multi-task Learning and Multi-strategies
Xiaojing Du | Jia Yuxiang | Zan Hongying
Proceedings of the 21st Chinese National Conference on Computational Linguistics

“Medical named entity recognition (NER), a fundamental task of medical information extraction, is crucial for medical knowledge graph construction, medical question answering, and automatic medical record analysis, among other applications. Compared with named entities (NEs) in the general domain, medical named entities are usually more complex and more likely to be nested. To cope with both flat NEs and nested NEs, we propose an MRC-based approach with multi-task learning and multi-strategies. NER can be treated as a sequence labeling (SL) task or a span boundary detection (SBD) task. We integrate an MRC-CRF model for SL and an MRC-Biaffine model for SBD into the multi-task learning architecture, and select the more efficient MRC-CRF as the final decoder. To further improve the model, we employ multi-strategies, including adaptive pre-training, adversarial training, and model stacking with cross validation. Experiments on both the nested NER corpus CMeEE and the flat NER corpus CCKS2019 show the effectiveness of the MRC-based model with multi-task learning and multi-strategies.”

2020

Reusable Phrase Extraction Based on Syntactic Parsing
Xuemin Duan | Zan Hongying | Xiaojing Bai | Christoph Zähner
Proceedings of the 19th Chinese National Conference on Computational Linguistics

“Academic Phrasebank is an important resource for academic writers. Student writers use phrases from Academic Phrasebank to organize their research articles and improve their writing ability. Because of its limited size, Academic Phrasebank cannot meet all academic writing needs, and a large amount of academic phraseology remains in authentic research articles. In this paper, we propose an academic phraseology extraction model based on constituency parsing and dependency parsing, which can automatically extract academic phraseology similar to the phrases of Academic Phrasebank from an unlabelled research article. The proposed model has three main components: an academic phraseology corpus module, a sentence simplification module, and a syntactic parsing module. We created a corpus of academic phraseology of 2,129 words to help judge whether a word is neutral and general, and created two datasets under two scenarios to verify the feasibility of the proposed model.”