LEGO-ABSA: A Prompt-based Task Assemblable Unified Generative Framework for Multi-task Aspect-based Sentiment Analysis

Tianhao Gao1,∗, Jun Fang2,∗, Hanyu Liu2,∗, Zhiyuan Liu2, Chao Liu2,†, Pengzhang Liu2, Yongjun Bao2, Weipeng Yan2
1 School of Software and Microelectronics, Peking University
2 JD Retail, Beijing, China
gaotianhao@[Link]
{junfang8, liuhanyu11, liuzhiyuan8, liuchao397, liupengzhang, baoyongjun, [Link]}@[Link]
Abstract

Aspect-based sentiment analysis (ABSA) has received increasing attention recently. ABSA can be divided into multiple tasks according to the different extracted elements. Existing generative methods usually treat the output as a whole string rather than as a combination of different elements, and they only focus on a single task at once. This paper proposes a unified generative multi-task framework that can solve multiple ABSA tasks by controlling the type of task prompt, which consists of multiple element prompts. Further, the proposed approach can train on simple tasks and transfer to difficult tasks by assembling task prompts, like assembling Lego bricks. We conduct experiments on six ABSA tasks across multiple benchmarks. Our proposed multi-task approach achieves new state-of-the-art results on almost all tasks and competitive results in task transfer scenarios.

1 Introduction

ABSA is a fine-grained sentiment analysis task that has attracted increasing attention in recent years (Schouten and Frasincar, 2016; Nazir et al., 2020). ABSA aims to extract different elements, including: 1) the aspect term (a); 2) the opinion term (o); 3) the aspect category (c) corresponding to the aspect term; 4) the sentiment polarity (s) for a specific aspect term. For example, in the sentence "Pizza is delicious", "Pizza" is an aspect term belonging to the food category, and the corresponding opinion term is "delicious", which expresses positive sentiment. As shown in Table 1, based on the combination of different elements to be extracted, ABSA can be divided into multiple tasks.

This paper explores tasks containing two or more elements. In general, most ABSA tasks are converted to classification tasks. Previous works often carefully designed a new architecture and trained it on the corresponding dataset for a specific sub-task. We review them as follows.

Pair Extraction The pair extraction tasks include AOPE (Aspect-Opinion Pair Extraction), ACSA (Aspect-Category Sentiment Analysis), and E2E-ABSA (End-to-End Aspect-based Sentiment Analysis) in our method. ACSA is usually treated as a multi-task classification task (Hu et al., 2018; Dai et al., 2020; Ma et al., 2018). Some works convert the AOPE and E2E-ABSA tasks into sequence tagging problems (Wu et al., 2020b; Gao et al., 2021; Chen et al., 2020; He et al., 2019), specifically using BIO tagging strategies (Wang and Pan, 2018; Wu et al., 2020a; Li et al., 2019a,b). Pair extraction is also called the basic task in this paper.

Triplet Extraction The triplet extraction tasks contain ASTE (Aspect Sentiment Triplet Extraction) and TASD (Target Aspect Sentiment Detection) in our paper. Most previous works still treat them as sequence tagging tasks (Xu et al., 2020, 2021; Zhang et al., 2020; Wu et al., 2021). Some works transfer them to machine reading comprehension tasks (Mao et al., 2021; Chen et al., 2021).

Quadruple Extraction (Cai et al., 2021) first introduced the quadruple extraction task, i.e., ASQP (Aspect Sentiment Quad Prediction), and provided a multi-stage classification structure adapted from an aspect-opinion co-extraction system (Wang et al., 2017).

Recently, large-scale generative language models have become increasingly powerful (Raffel et al., 2020; Lewis et al., 2019; Radford et al., 2019), and any ABSA task can be converted to a generation problem. Some generative frameworks (Zhang et al., 2021b,a; Yan et al., 2021; Hosseini-Asl et al., 2022) have been proposed and have achieved state-of-the-art results in the field of ABSA.

∗ These authors contributed equally. Work done during gaotianhao's internship at JD Retail, Beijing, China.
† Corresponding authors.

Proceedings of the 29th International Conference on Computational Linguistics, pages 7002–7012, October 12–17, 2022.
Task Name                                               Output                               Elements
Aspect-Opinion Pair Extraction (AOPE)                   (Pizza, delicious)                   (a, o)
Aspect-Category Sentiment Analysis (ACSA)               (food, positive)                     (c, s)
End-to-End Aspect-based Sentiment Analysis (E2E-ABSA)   (Pizza, positive)                    (a, s)
Aspect Sentiment Triplet Extraction (ASTE)              (Pizza, delicious, positive)         (a, o, s)
Target Aspect Sentiment Detection (TASD)                (Pizza, food, positive)              (a, c, s)
Aspect Sentiment Quad Prediction (ASQP)                 (Pizza, delicious, food, positive)   (a, o, c, s)

Table 1: Targets of the different tasks for the input sentence "Pizza is delicious".

The generation format includes, but is not limited to, generating structure-linearized texts, label-augmented text (Zhang et al., 2021b), generating word indices (Yan et al., 2021), and filling templates (Zhang et al., 2021a), as summarized by (Min et al., 2021).

However, all the generative approaches mentioned above suffer from 1) training and predicting a single specific task at once; 2) treating the output as a whole text rather than as a combination of individual elements; 3) poor transferability from simple tasks to difficult tasks. Below is a detailed description of these three points.

For the first point, in the mentioned generative approaches, the input and output formats do not support training on multiple ABSA tasks simultaneously, which we call the multi-task training setting.

For the second point, in previous works, the models cannot understand the meaning of each element to be extracted because the input and output are treated as simple strings, and the model completes the prediction of the output through auto-regression.

For the third point, previous methods cannot be applied to task transfer scenarios. Compared to triplets like ASTE, pairs like AOPE and E2E-ABSA are much easier to annotate. However, previous works cannot complete ASTE by training only on the AOPE and E2E-ABSA tasks, even though the ASTE task elements are the same as the union of the AOPE and E2E-ABSA task elements. We call this the task transfer scenario, and it is a special case of the multi-task training setting. The proposed method achieves competitive performance in this setting.

Inspired by the above observations, we propose a unified generative framework, LEGO-ABSA, that can simultaneously solve multiple ABSA tasks and transfer from simple to complex tasks. Specifically, we take T5 as our backbone network and combine prompt learning with the T5 pre-training practice of placing sentinel tokens. Unlike most previous works that use a piece of simple text as a prompt, e.g., "ASQP" in (Zhang et al., 2021a), we design an element prompt and establish the correspondence between each element and its element prompt. Through this design, the framework treats the prompt and the output text as combinations of independent elements. We combine multiple element prompts into a task prompt. The task prompts of simple tasks can be regarded as basic bricks that can be assembled to transfer to a complex task, just like assembling Lego bricks. The output sequence is formed as a concatenation of the sentinel tokens and the real answer tokens, consistent with T5. To verify the effectiveness of our method, we conduct experiments on public datasets. Comparison results show that our proposed framework outperforms previous state-of-the-art (SOTA) approaches on most tasks. Moreover, when part of the data annotation is missing, it still achieves competitive performance.

In summary, our main contributions are as follows:

• We propose a prompt-based unified generative framework to solve all ABSA tasks. The framework can be trained on multiple tasks simultaneously, and it also performs competitively in task transfer scenarios.

• To the best of our knowledge, we are the first to explore solutions for task transfer scenarios.

• The experimental results show that our method significantly outperforms the SOTA methods on the E2E-ABSA, AOPE, ASTE, and ACSA tasks.

2 Methodology

2.1 Task Formulation

The proposed method formulates any ABSA task as a text generation task. Here we give formal definitions of the generative framework's input and output text.

The input x consists of two parts, the raw text t and a task prompt p_task: x = t + "|" + p_task, where t = [t1, t2, ..., tn], ti is the i-th token of t, and n is the length of t.
The task prompt p_task = [p1, p2, ..., p_mtask], where pi is the i-th element prompt of p_task and m_task is the number of element prompts in p_task; p_task is used as a condition to generate different output text for different tasks.

The output text o_task = [o1, o2, ..., o_m'], where oi is the i-th token of o_task and m' is the output length for the current input x. The following subsections describe the construction methods in detail.

2.2 Element Prompt Definition

2.2.1 Introduction of T5

T5 is an encoder-decoder model pre-trained on a multi-task mixture of unsupervised and supervised tasks converted into a text-to-text format.

In order to minimize the gap between pre-training and fine-tuning, we use the same training mode as T5 does in pre-training. The goal of T5 is similar to a cloze test. As shown in Figure 1(a), the input of T5 is a sentence in which consecutive spans are randomly masked with sentinel tokens. During unsupervised training, T5 aims to reconstruct the continuous spans masked by the sentinel tokens, i.e., <extra_id_i> in Figure 1(a), with i incrementing one by one starting from zero. Through this training objective, T5 learns general language features.

2.2.2 Element Prompt

In order to make the framework fully understand the meaning of each element in the output text, instead of treating the output as a simple string, we design an element prompt for each extracted element.

We define the element prompt as "aspect: <extra_id_0>", which has two advantages. On the one hand, the format is consistent with the T5 unsupervised training objective, which helps us make better use of the information learned in pre-training. On the other hand, by defining a prompt for each single element, the output is no longer regarded as a whole text string but as a combination of different elements, which offers more convenience.

The element prompts for the four elements of the ABSA tasks are as follows. We use w, x, y, and z to represent the ids of the sentinel tokens.

• pa: "aspect : <extra_id_w>"
• pc: "category : <extra_id_x>"
• po: "opinion : <extra_id_y>"
• ps: "sentiment : <extra_id_z>"

2.3 Task Prompt of Single-task Training

From shallow to deep, we start with the single ABSA task.

An element prompt is defined for each element to be extracted, but in order to complete a specific ABSA task, we need to concatenate different element prompts to form the task prompt, i.e., p_task. p_task is used as a condition so that the backbone can distinguish between different tasks. According to the kinds of elements extracted and the order of element extraction in each task, we concatenate the element prompts with commas; e.g., p_AOPE can be p_ao or p_oa, which mean pa + "," + po and po + "," + pa, respectively. Because the training for each task is independent, it is trivial to maintain a unique mapping between sentinel token ids and elements. Here, the sentinel token id for each task increments from 0, as shown in Figure 1(b) with the AOPE example. The rest of the task prompts are defined in the same way.

The arrangement order of the element prompts matters, since the model generates in an auto-regressive manner and the elements generated first can provide more prior information for the elements generated later. From our experimental observations, the elements are arranged in priority according to aspect term > opinion term = aspect category > sentiment polarity.

2.4 Task Prompt of Multi-task Training

An improvement of our framework is the ability to organize multiple ABSA tasks into one multi-task training task through task prompts.

As shown in Figure 1(c), under the multi-task training setting, the task prompt is still constructed by concatenating element prompts, as in the single-task setting. The difference is that the one-to-one correspondence between elements and sentinel tokens is shared between the sub-tasks, so we define a global mapping between each sentinel token and its corresponding element. Following the priority of elements mentioned above, we assign <extra_id_0> to the aspect term, <extra_id_1> to the opinion term, <extra_id_2> to the aspect category, and <extra_id_3> to the sentiment polarity. After setting each task prompt, we concatenate the task prompt to each original input of the corresponding task and then mix the data of all tasks to do multi-task training.
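The input construction of Sections 2.1–2.3 can be sketched in a few lines of Python. The helper names below (`element_prompt`, `build_task_prompt`, `build_input`) are ours, not the paper's; this is a minimal sketch of the single-task setting, where sentinel ids simply increment from 0.

```python
# Sketch of input construction: element prompts joined by commas form a
# task prompt, which is appended to the raw text after "|" (Section 2.1).

def element_prompt(element: str, sentinel_id: int) -> str:
    """One element prompt, e.g. 'aspect : <extra_id_0>'."""
    return f"{element} : <extra_id_{sentinel_id}>"

def build_task_prompt(elements) -> str:
    """Single-task setting: sentinel ids increment from 0."""
    return ", ".join(element_prompt(e, i) for i, e in enumerate(elements))

def build_input(text: str, task_prompt: str) -> str:
    """Model input x = raw text t + '|' + task prompt."""
    return f"{text} | {task_prompt}"

# AOPE example matching Figure 1(b)
prompt = build_task_prompt(["aspect", "opinion"])
print(build_input("Pizza is delicious", prompt))
# Pizza is delicious | aspect : <extra_id_0>, opinion : <extra_id_1>
```

Under multi-task training, the only change would be to look the sentinel ids up in the shared global mapping instead of enumerating from 0.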
(a) Unsupervised objective of T5
    Original text: Thank you for inviting me to your party last week
    Input:  Thank you <extra_id_0> me to your party <extra_id_1> week
    Output: <extra_id_0> for inviting <extra_id_1> last <extra_id_2>

(b) Objective of LEGO-ABSA on the AOPE task
    Original text: Pizza is delicious | aspect : Pizza, opinion : delicious
    Input:  Pizza is delicious | aspect : <extra_id_0>, opinion : <extra_id_1>
    Output: <extra_id_0> Pizza <extra_id_1> delicious <extra_id_2>

(c) Multi-task training setting example
    Input 1:  Pizza is delicious | opinion : <extra_id_0>, aspect : <extra_id_1>
    Output 1: <extra_id_0> Pizza <extra_id_1> delicious <extra_id_2>
    Input 2:  Service is bad | aspect : <extra_id_1>, sentiment : <extra_id_2>
    Output 2: <extra_id_1> Service <extra_id_2> bad <extra_id_3>

(d) Task transfer scenario example
    Input: Pizza is delicious | opinion : <extra_id_0>, aspect : <extra_id_1>, sentiment : <extra_id_2>
    Inference result: <extra_id_0> Pizza <extra_id_1> delicious <extra_id_2> positive <extra_id_3>

Figure 1: The T5 pre-training objective (a), the LEGO-ABSA objective on AOPE (b), the multi-task training setting (c), and the task transfer scenario (d).
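The sentinel-delimited target format of Figure 1(b) can be sketched as follows; both helper functions are ours, a minimal sketch of how such T5-style targets could be built and read back.

```python
# Sketch of building and parsing T5-style targets as in Figure 1(b):
# the target concatenates sentinel tokens with the gold answers and
# closes with one extra sentinel, consistent with T5 pre-training.
import re

def build_target(values_by_id):
    """values_by_id: list of (sentinel_id, gold_answer) pairs."""
    parts = [f"<extra_id_{sid}> {value}" for sid, value in values_by_id]
    next_id = values_by_id[-1][0] + 1
    return " ".join(parts) + f" <extra_id_{next_id}>"

def parse_target(target):
    """Recover (sentinel_id, answer) pairs from a generated sequence."""
    pieces = re.split(r"<extra_id_(\d+)>", target)
    # pieces alternates: ['', id, text, id, text, ..., last_id, '']
    out = []
    for i in range(1, len(pieces) - 1, 2):
        value = pieces[i + 1].strip()
        if value:
            out.append((int(pieces[i]), value))
    return out

target = build_target([(0, "Pizza"), (1, "delicious")])
print(target)                # <extra_id_0> Pizza <extra_id_1> delicious <extra_id_2>
print(parse_target(target))  # [(0, 'Pizza'), (1, 'delicious')]
```

Because each answer sits behind its own sentinel, the output is a combination of independent elements rather than one opaque string, which is the property the paper relies on.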

2.4.1 Task Transfer Scenario

This section introduces how the proposed framework works in a task transfer scenario. As shown in Figure 1(d), we define a task that extracts two elements and the combination relationship between them as a basic task. As illustrated in Figure 2, AOPE, E2E-ABSA, and ACSA are the basic tasks used to accomplish more complicated tasks. Basic tasks can be regarded as the bricks of LEGO.

We call the overlapping element of any two basic tasks a connection element, which acts like a connector between two bricks. We define ASTE, TASD, and ASQP as advanced tasks, which aim to extract three or more elements and the combination relationships between them. An advanced task is like a final product assembled from basic bricks and connectors. The goal of the task transfer scenario is to solve advanced tasks given only the training data of basic tasks, and the process of using basic tasks to construct advanced tasks is like building Lego.

To achieve this goal, we need to answer two questions: which basic tasks are required for a given advanced task, and how to assemble the basic tasks, i.e., how to construct an advanced task prompt from basic task prompts. We give a detailed introduction to these two questions below.

Basic Task Confirmation In order to complete an advanced task, we need to confirm the corresponding basic tasks. Because advanced tasks consist of basic tasks connected by connection elements, we need two basic tasks for the extraction of a triplet, as in ASTE and TASD. For ASQP extraction, we need all three basic tasks mentioned in this paper. Then, according to the elements contained in the task, we can determine that the basic tasks of ASTE (element set {o, a, s}) are AOPE (element set {o, a}) and E2E-ABSA (element set {a, s}); the basic tasks of TASD (element set {a, c, s}) are E2E-ABSA (element set {a, s}) and ACSA (element set {c, s}); and the basic tasks of ASQP (element set {a, o, c, s}) are AOPE (element set {o, a}), E2E-ABSA (element set {a, s}), and ACSA (element set {c, s}).

Task Prompt Assemble After confirming the basic tasks, we illustrate the method of task prompt assembly; the arrangement order of the elements is essential here. We denote the advanced task as A and its set of basic tasks as B.

We initialize p_A = "", then take any element that is not a connection element as the start element and let p_A = p_A + "," + p_start. Next, we select a task B containing the start element from B, use the other element of B as the next element, and then delete B from B.
Afterward, we use the next element as the start element and repeat the above process until all elements of A have been traversed once.

For example, as shown in Figure 2, given ASQP as the advanced task, we get B = {{a, s}, {a, o}, {c, s}}, and the connection elements are {a, s}. First, we choose element o as the beginning element; the corresponding task B is {a, o}. Then, we concatenate p_o with p_A, let the other element a be the new beginning element, and delete {a, o} from B. Next, the corresponding task B is {a, s}, based on element a. Afterward, we repeat the process until all elements of ASQP have been traversed once. Through the above process, the result, shown in Figure 2, is p_A = p_o + "," + p_a + "," + p_s + "," + p_c.

Figure 2: Illustration of task prompt assembly. The task prompt of ASQP, opinion:<W>, aspect:<X>, sentiment:<Y>, category:<Z>, is assembled from the basic tasks AOPE (O–A), E2E-ABSA (A–S), and ACSA (S–C).

We can get the task prompt after assembly and determine the order of the element prompts through the above process. Besides, we also need to maintain the global mapping between sentinel tokens and elements. As shown in Table 2, we design a set of task prompts that are unique under the global mapping and conform to the assembly rules. Moreover, the priority order of the elements in the task prompts of Table 2 is preserved as much as possible.

2.5 Output Sequence Definition

As shown in Figure 1(b), the definition of o_task is consistent with the T5 unsupervised training output form. Whether in the multi-task training or the task transfer setting, the output sequence is formed as a concatenation of the sentinel tokens and the corresponding gold labels. For sentences with multiple sets of extracted elements, we use ";" to separate the sets.

3 Experiment

3.1 Datasets

We evaluate the proposed LEGO-ABSA on the SemEval14-16 benchmarks initially provided by the SemEval shared challenges (Pontiki et al., 2014, 2015, 2016). For each ABSA task, we use the public datasets derived from SemEval14-16 with additional sentiment annotations. Specifically, we adopt the AOPE, ASTE, and E2E-ABSA datasets provided by (Peng et al., 2020), the ACSA datasets provided by (Pontiki et al., 2015, 2016; Liu et al., 2021), the TASD dataset provided by (Wan et al., 2020), and the ASQP dataset provided by (Zhang et al., 2021a). For a fair comparison, we use the same data splits as previous works.

3.2 Baselines

For the E2E-ABSA, AOPE, and ASTE tasks, we adopt two types of baselines: 1) extraction-based methods, including Li-unified (Li et al., 2019a), Peng-two-stage (Peng et al., 2020), Bi-MRC (Chen et al., 2021), JET-BERT (Xu et al., 2020), and Dual-MRC (Mao et al., 2021); 2) generation-based methods, including GAS (Zhang et al., 2021b) and Yan-unified (Yan et al., 2021).

For the ACSA task, we adopt five baselines derived from (Cai et al., 2020). For the TASD and ASQP tasks, we utilize two types of baselines: 1) extraction-based methods, including TAS-LPM-CRF and TAS-SW-TO from (Wan et al., 2020) and TASO-BERT-CRF (Zhang et al., 2021a); 2) generation-based methods, including GAS (Zhang et al., 2021b) and PARAPHRASE (Zhang et al., 2021a).

3.3 Implementation Details

Evaluation Metrics The F1 score is the evaluation metric for all tasks. A prediction is correct only if all of its predicted sentiment elements in the pair, triplet, or quadruple are correct.

Experiment Details We adopt the pre-trained T5-base model released by huggingface*. We set the learning rate to 3e-4 as suggested by huggingface. In the single-task and multi-task training settings, the model is trained for up to 20 epochs on the AOPE, E2E-ABSA, ACSA, and ASTE tasks and 30 epochs on the TASD and ASQP tasks. We train two multi-task models according to whether the aspect category element is included: the first is trained on the AOPE, E2E-ABSA, and ASTE tasks, while the second is trained on ACSA, TASD, and ASQP. For the in-domain task transfer setting, we train one to two epochs on the basic tasks with a learning rate of 3e-4. For the cross-domain setting, we train five epochs on the basic tasks with a learning rate of 3e-4.

* [Link]
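The exact-match evaluation described under Evaluation Metrics can be sketched as a micro F1 over sets of predicted and gold tuples. The function name and the toy tuples below are ours; this is a sketch of the standard exact-match F1, not the authors' evaluation script.

```python
# Sketch of exact-match F1: a predicted pair/triplet/quadruple counts
# as a true positive only if every element in it matches a gold tuple.

def f1_score(pred_tuples, gold_tuples):
    """Micro F1 over sets of predicted and gold element tuples."""
    pred, gold = set(pred_tuples), set(gold_tuples)
    tp = len(pred & gold)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical ASTE triplets: one of two predictions matches exactly.
gold = [("Pizza", "delicious", "positive"), ("service", "bad", "negative")]
pred = [("Pizza", "delicious", "positive"), ("service", "bad", "positive")]
print(f1_score(pred, gold))  # 0.5
```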

Task name Task prompt
AOPE opinion:<extra_id_0>, aspect:<extra_id_1>
E2E-ABSA aspect:<extra_id_1>, sentiment:<extra_id_2>
ACSA sentiment:<extra_id_2>, category:<extra_id_3>
ASTE opinion:<extra_id_0>, aspect:<extra_id_1>, sentiment:<extra_id_2>
TASD aspect:<extra_id_1>, sentiment:<extra_id_2>, category:<extra_id_3>
ASQP opinion:<extra_id_0>, aspect:<extra_id_1>, sentiment:<extra_id_2>, category:<extra_id_3>

Table 2: Task prompts in task transfer scenarios
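The Task Prompt Assemble procedure of Section 2.4.1, together with the sentinel ids of Table 2, can be sketched in Python. All names below (`assemble`, `render`, `GLOBAL_IDS`) are ours; the paper describes the procedure only in prose, so this is one possible reading of it.

```python
# Sketch of Task Prompt Assemble: chain basic tasks through their
# connection elements to order the elements of an advanced task, then
# render the prompt with the global sentinel ids of Table 2.
from collections import Counter

GLOBAL_IDS = {"o": 0, "a": 1, "s": 2, "c": 3}  # Table 2 assignment
NAMES = {"a": "aspect", "o": "opinion", "c": "category", "s": "sentiment"}

def assemble(advanced_elements, basic_tasks, start=None):
    """basic_tasks: list of 2-element sets. Connection elements are those
    shared by more than one basic task. Starting from a non-connection
    element, repeatedly pick an unused basic task containing the current
    element and move to its other element."""
    remaining = [set(t) for t in basic_tasks]
    counts = Counter(e for t in remaining for e in t)
    connection = {e for e, c in counts.items() if c > 1}
    if start is None:  # any non-connection element works as the start
        start = next(e for e in sorted(advanced_elements) if e not in connection)
    order = [start]
    while remaining:
        task = next(t for t in remaining if order[-1] in t)
        remaining.remove(task)
        order.append(next(e for e in task if e != order[-1]))
    return order

def render(order):
    return ", ".join(f"{NAMES[e]}:<extra_id_{GLOBAL_IDS[e]}>" for e in order)

# ASQP from its three basic tasks, starting from o as in the paper's example
order = assemble({"a", "o", "c", "s"}, [{"a", "o"}, {"a", "s"}, {"c", "s"}], start="o")
print(order)          # ['o', 'a', 's', 'c']
print(render(order))  # matches the ASQP row of Table 2
```

Rendering the resulting order reproduces the ASQP task prompt of Table 2, which is the consistency the paper requires between assembly and the global mapping.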

3.4 Main Results

The main results are shown in Tables 3 and 4. All results are average F1 scores across 3 runs with different random seeds.

Notably, our proposed method with single-task training outperforms the state of the art on the AOPE, E2E-ABSA, ASTE, and ACSA tasks by 1.9, 2.4, 1.7, and 3.1 average F1 points, respectively. Besides, competitive results are also obtained on TASD and ASQP.

Our method with the multi-task training setting achieves even more competitive performance than separate training for each task, even though we only use one T5-base as the backbone. We get 2.5, 3.3, 2.5, and 1.6 higher average F1 scores than the state-of-the-art methods on the AOPE, E2E-ABSA, ASTE, and ACSA tasks. For the AOPE, E2E-ABSA, and ASTE tasks, our model is trained on four datasets per task and uses only one backbone, which is equivalent to reducing the backbone size to 1/12 compared with previous one-model-per-task methods, while the average F1 is 2.8 points higher. This result shows that multi-task training can significantly improve performance. Since multi-task training is modeled under a unified generative framework and the construction of input and output follows the same principles, the information between different tasks can be utilized and mutually enhanced.

Regarding why TASD and ASQP do not perform as outstandingly as the rest of the tasks, we speculate that it may be because TASD and ASQP both need to extract the aspect category and the sentiment polarity. These two elements are produced by reasoning and do not appear in the original text. The unsupervised pre-training objective of T5 only encourages generating text spans that appear in the original text. The working principle of sentiment and category extraction is similar to using a generative model to do classification, which differs from the unsupervised training objective of T5. This gap between the tasks is the leading cause of the performance degradation.

3.5 Task Transfer Results

This section verifies our proposed framework's in-domain and cross-domain performance under the task transfer scenario.

3.5.1 In-domain

In the in-domain setting, we complete an advanced task by training on its necessary basic tasks from the same training corpus. The results of in-domain task transfer are shown in Table 5. We were surprised to find that the inference performance on advanced tasks is very competitive when training only on basic tasks. The result on the ASTE task even surpasses some purely supervised baselines.

3.5.2 Cross-domain

In some real situations, AOPE and E2E-ABSA annotations may not be on the same corpus, or we cannot combine them into complete ASTE annotations. Therefore, cross-domain task transfer performance is very important.

For the TASD and ASQP tasks, since the aspect categories differ across domains, the model cannot transfer across domains on tasks that include aspect categories. Therefore, we conduct the experiments in this section on ASTE under the cross-domain setting.

The cross-domain results are shown in Table 6. The proposed method outperforms some purely supervised methods on average, with no noticeable performance drop compared to the in-domain setting. Compared with rule-based methods, task prompt assembly achieves a large performance improvement. The possible reason is that, in the rule-based approach, the error of each model caused by domain transfer propagates, whereas task prompt assembly is more similar to a joint method. Therefore, the performance improvement is obvious.

4 Analysis

This section explores how T5 assembles the basic tasks corresponding to a task prompt under the task transfer training setting.
Model                                 AOPE                      E2E-ABSA                  ASTE
                                      L14  R14  R15  R16        L14  R14  R15  R16        L14  R14  R15  R16
Li-unified (Li et al., 2019a)         52.6 55.3 56.9 53.8       63.4 73.8 65.0 70.2       42.5 51.7 46.7 44.5
Peng-two-stage (Peng et al., 2020)    53.9 56.1 56.2 60.0       62.3 74.2 65.8 71.7       43.5 51.9 46.8 53.6
JET-BERT (Xu et al., 2020)            -    -    -    -          -    -    -    -          50.0 63.9 54.7 62.9
Bi-MRC (Chen et al., 2021)            67.4 76.2 68.6 76.5       67.2 76.3 67.1 73.1       59.2 70.6 61.0 68.1
Dual-MRC (Mao et al., 2021)           63.3 74.9 64.9 75.7       64.5 76.5 65.1 70.8       55.5 70.3 57.2 67.4
GAS (Zhang et al., 2021b)             63.8 73.2 65.0 75.0       65.3 78.5 69.4 72.7       54.5 70.2 59.1 65.0
Yan-unified (Yan et al., 2021)        66.1 77.7 68.0 77.4       68.2 78.5 70.0 75.7       57.6 72.5 60.1 70.0
LEGO-ABSA (multi-task)                71.3 78.0 72.9 77.1       72.3 80.6 74.2 76.1       62.2 73.7 64.4 69.9
LEGO-ABSA (separate)                  69.7 78.1 71.4 77.6       69.1 80.0 74.3 78.6       59.5 72.6 63.2 71.5

Table 3: Main results on the AOPE, E2E-ABSA, and ASTE tasks. LEGO-ABSA (multi-task) means mixing the training datasets of the three tasks and shuffling the order. LEGO-ABSA (separate) means that a task is trained with only one dataset, like the other baselines. Since the original paper of GAS did not report results on Peng's dataset, we reproduce the results ourselves using the same experiment config. We highlight the best results and results with F1 gaps within 0.2.

Model                                 ACSA
                                      L15  L16  R15  R16
Cartesian-BERT                        32.8 39.5 58.4 68.9
AddOneDim-BERT                        48.9 47.2 61.7 69.8
Hier-BERT                             50.6 49.2 62.4 70.3
Hier-Transformer-BERT                 57.8 52.7 64.7 73.5
Hier-GCN-BERT                         62.1 54.2 64.2 74.6
LEGO-ABSA (multi-task)                65.0 53.6 67.3 75.6
LEGO-ABSA (separate)                  64.2 55.9 71.0 76.2

Model                                 TASD              ASQP
                                      R15  R16          R15  R16
TAS-SW-TO (Wan et al., 2020)          58.1 65.4         -    -
TAS-LPM-CRF (Wan et al., 2020)        54.7 64.6         -    -
TASO-BERT-CRF (Zhang et al., 2021a)   -    -            34.8 43.7
GAS (Zhang et al., 2021b)             60.6 68.3         46.0 56.0
PARAPHRASE (Zhang et al., 2021a)      63.1 72.0         46.9 57.9
LEGO-ABSA (multi-task)                62.3 71.8         46.1 57.6
LEGO-ABSA (separate)                  61.7 68.8         45.8 57.7

Table 4: Main results on the ACSA, TASD, and ASQP tasks. LEGO-ABSA (multi-task) means mixing the individual training datasets and shuffling the order. We highlight the best results and results with F1 gaps within 0.2 in bold.

Task   L14  R14  R15  R16
ASTE   49.2 60.9 51.4 50.0
TASD   -    -    30.9 30.6
ASQP   -    -    25.8 24.5

Table 5: In-domain task transfer performance. In this situation, the basic tasks and the advanced task are on the same domain and corpus.

Method       Lap → Rest   Rest → Lap
GAS-rule     32.4         33.7
LEGO-ABSA    53.9         44.7

Table 6: Cross-domain task transfer performance on the ASTE task. We use the dataset from (Peng et al., 2020). The rule method means that we get results by combining (a, s) and (a, o) pairs with the same a.

GM   TTO   TST   Task transfer ability
✕    ✕     ✓     ✕
✓    ✕     ✓     ✕
✓    ✓     ✕     ✕
✓    ✓     ✓     ✓

Table 7: Factor analysis of task transferability, where GM is the global mapping between sentinel tokens and elements, TTO is a task transfer order that follows the rule of Task Prompt Assemble (Section 2.4.1), and TST is using the original T5 sentinel tokens instead of custom tokens.

4.1 Factor Analysis of Transferability

We try to 1) increment the sentinel token id from 0 in each basic task, which means no global mapping between sentinel tokens and elements; 2) give a global sentinel token id to each element but randomly arrange the elements' order in the basic task; 3) employ the global mapping and the right order of element prompts, but replace <extra_id_x> (the sentinel token used in T5 pre-training) with a custom new token. As shown in Table 7, only when all three conditions are met can the backbone obtain task transferability.

Using the T5 sentinel tokens shows that downstream tasks can indeed reuse the unsupervised output of T5 pre-training. A custom token cannot serve to mask a consecutive span because it has not been pre-trained. The global mapping between sentinel tokens and elements shows that each sentinel token has a specific meaning after downstream task training. More importantly, the experimental results show that a specific element prompt arrangement must be used to achieve task transfer, which indirectly shows that what the backbone learns is how to mix two or more task prompts.
Task prompt                                          Prediction
aspect: <extra_id_0>                                 tech support
opinion: <extra_id_1>                                not fix
sentiment: <extra_id_2>                              negative
aspect: <extra_id_0>, opinion: <extra_id_1>          tech support, not fix
aspect: <extra_id_0>, sentiment: <extra_id_2>        tech support, negative
opinion: <extra_id_1>, sentiment: <extra_id_2>       not fix, negative

Table 8: Lego split cases for the text "tech support would not fix the problem unless I bought your plan for $150 plus."

Figure 3: Attention visualization. (a) Decoder attention of an AS-type attention head. (b) Decoder attention of an OA-type attention head.

Decoder Attention Visualization

We conjecture that LEGO-ABSA uses the ending element prompt of the previous task as the beginning element prompt of the next task. To verify this, we visualize two of T5's attention heads in Figure 3. In this example, AOPE and E2E-ABSA are used as basic tasks, and ASTE is the advanced task. Through the analysis of the decoder attention visualization, we have the following findings.

Some attention heads learn associations between a and s. As shown in Figure 3(a), <extra_id_1> almost never attends to the opinion term ("good") or to <extra_id_0>, while <extra_id_2> attends to <extra_id_1> heavily, where the association between aspect and sentiment is established. Such an attention head models the relation between a and s.

Some other attention heads learn associations between o and a. As shown in Figure 3(b), the attention weight between <extra_id_1> and <extra_id_0> is high, which means that the information of the opinion is used when the aspect is generated via <extra_id_1>. Such an attention head models the attention relationship between o and a.

In a word, by combining information from multiple attention heads with different functions, our framework can model advanced tasks through basic tasks.

LEGO Split

This section introduces how to make a framework trained on advanced tasks capable of extracting any custom elements by changing the task prompt, just as an assembled Lego can be divided into parts of different sizes.

We explored the ASTE task as the target advanced task and traversed the full permutation of the three element prompts of a, o, and s.
element prompts of a, o, and s. For each permutation of element prompts, we generate a dataset whose task prompt is assembled from those element prompts. Finally, we mix and shuffle all the datasets and train the framework in the multi-task training setting.

As shown in Table 8, we can extract any single element or any combination of elements simply by changing the task prompt. The framework controls the output content precisely through the task prompt. This result shows that our approach makes T5 treat the task prompt as a combination of multiple element prompts rather than as a simple string.

5 Conclusion

In this paper, we propose LEGO-ABSA, a prompt-based generative framework for ABSA tasks that uses T5 as the backbone and, through our formulation of task prompts, makes full use of the information learned from T5's unsupervised training objective. LEGO-ABSA does not regard the prompt and the output text as simple strings but as combinations of multiple elements to be extracted. It is mainly intended for multi-task training and task transfer scenarios. Extensive experiments on six ABSA tasks verify the effectiveness of our framework and its excellent transferability in task transfer scenarios. There is still room for improvement, such as completing combined extraction of multiple elements purely through learning from single-element tasks.

6 Acknowledgements

Supported by the National Key Research and Development Program of China (No. 2020YFC0833301).

References

Hongjie Cai, Yaofeng Tu, Xiangsheng Zhou, Jianfei Yu, and Rui Xia. 2020. Aspect-category based sentiment analysis with hierarchical graph convolutional network. In Proceedings of the 28th International Conference on Computational Linguistics, pages 833–843.

Hongjie Cai, Rui Xia, and Jianfei Yu. 2021. Aspect-category-opinion-sentiment quadruple extraction with implicit aspects and opinions. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 340–350, Online. Association for Computational Linguistics.

Shaowei Chen, Jie Liu, Yu Wang, Wenzheng Zhang, and Ziming Chi. 2020. Synchronous double-channel recurrent network for aspect-opinion pair extraction. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 6515–6524.

Shaowei Chen, Yu Wang, Jie Liu, and Yuelin Wang. 2021. Bidirectional machine reading comprehension for aspect sentiment triplet extraction. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 12666–12674.

Zehui Dai, Cheng Peng, Huajie Chen, and Yadong Ding. 2020. A multi-task incremental learning framework with category name embedding for aspect-category sentiment analysis.

Lei Gao, Yulong Wang, Tongcun Liu, Jingyu Wang, Lei Zhang, and Jianxin Liao. 2021. Question-driven span labeling model for aspect–opinion pair extraction. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 12875–12883.

Ruidan He, Wee Sun Lee, Hwee Tou Ng, and Daniel Dahlmeier. 2019. An interactive multi-task learning network for end-to-end aspect-based sentiment analysis. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 504–515, Florence, Italy. Association for Computational Linguistics.

Ehsan Hosseini-Asl, Wenhao Liu, and Caiming Xiong. 2022. A generative language model for few-shot aspect-based sentiment analysis. arXiv preprint arXiv:2204.05356.

Mengting Hu, Shiwan Zhao, Li Zhang, Keke Cai, Zhong Su, Renhong Cheng, and Xiaowei Shen. 2018. CAN: Constrained attention networks for multi-aspect sentiment analysis.

Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov, and Luke Zettlemoyer. 2019. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension.

Xin Li, Lidong Bing, Piji Li, and Wai Lam. 2019a. A unified model for opinion target extraction and target sentiment prediction. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 6714–6721.

Xin Li, Lidong Bing, Wenxuan Zhang, and Wai Lam. 2019b. Exploiting BERT for end-to-end aspect-based sentiment analysis. In Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019), pages 34–41.

Jian Liu, Zhiyang Teng, Leyang Cui, Hanmeng Liu, and Yue Zhang. 2021. Solving aspect category sentiment analysis as a text generation task. arXiv preprint arXiv:2110.07310.
Yukun Ma, Haiyun Peng, and Erik Cambria. 2018. Targeted aspect-based sentiment analysis via embedding commonsense knowledge into an attentive LSTM. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1).

Yue Mao, Yi Shen, Chao Yu, and Longjun Cai. 2021. A joint training dual-MRC framework for aspect based sentiment analysis.

Bonan Min, Hayley Ross, Elior Sulem, Amir Pouran Ben Veyseh, Thien Huu Nguyen, Oscar Sainz, Eneko Agirre, Ilana Heinz, and Dan Roth. 2021. Recent advances in natural language processing via large pre-trained language models: A survey. arXiv preprint arXiv:2111.01243.

Ambreen Nazir, Yuan Rao, Lianwei Wu, and Ling Sun. 2020. Issues and challenges of aspect-based sentiment analysis: A comprehensive survey. IEEE Transactions on Affective Computing, pages 1–1.

Haiyun Peng, Lu Xu, Lidong Bing, Fei Huang, Wei Lu, and Luo Si. 2020. Knowing what, how and why: A near complete solution for aspect-based sentiment analysis. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 8600–8607.

Maria Pontiki, Dimitris Galanis, Haris Papageorgiou, Ion Androutsopoulos, Suresh Manandhar, Mohammad AL-Smadi, Mahmoud Al-Ayyoub, Yanyan Zhao, Bing Qin, Orphée De Clercq, Véronique Hoste, Marianna Apidianaki, Xavier Tannier, Natalia Loukachevitch, Evgeniy Kotelnikov, Nuria Bel, Salud María Jiménez-Zafra, and Gülşen Eryiğit. 2016. SemEval-2016 task 5: Aspect based sentiment analysis. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pages 19–30, San Diego, California. Association for Computational Linguistics.

Maria Pontiki, Dimitris Galanis, Haris Papageorgiou, Suresh Manandhar, and Ion Androutsopoulos. 2015. SemEval-2015 task 12: Aspect based sentiment analysis. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), pages 486–495, Denver, Colorado. Association for Computational Linguistics.

Maria Pontiki, Dimitris Galanis, John Pavlopoulos, Harris Papageorgiou, Ion Androutsopoulos, and Suresh Manandhar. 2014. SemEval-2014 task 4: Aspect based sentiment analysis. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), pages 27–35, Dublin, Ireland. Association for Computational Linguistics.

Alec Radford, Jeff Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners.

Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2020. Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21(140):1–67.

Kim Schouten and Flavius Frasincar. 2016. Survey on aspect-level sentiment analysis. IEEE Transactions on Knowledge and Data Engineering, 28(3):813–830.

Hai Wan, Yufei Yang, Jianfeng Du, Yanan Liu, Kunxun Qi, and Jeff Z Pan. 2020. Target-aspect-sentiment joint detection for aspect-based sentiment analysis. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 9122–9129.

Wenya Wang and Sinno Jialin Pan. 2018. Recursive neural structural correspondence network for cross-domain aspect and opinion co-extraction. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2171–2181, Melbourne, Australia. Association for Computational Linguistics.

Wenya Wang, Sinno Jialin Pan, Daniel Dahlmeier, and Xiaokui Xiao. 2017. Coupled multi-layer attentions for co-extraction of aspect and opinion terms. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 31.

Chao Wu, Qingyu Xiong, Hualing Yi, Yang Yu, Qiwu Zhu, Min Gao, and Jie Chen. 2021. Multiple-element joint detection for aspect-based sentiment analysis. Knowledge-Based Systems, 223:107073.

Meixi Wu, Wenya Wang, and Sinno Jialin Pan. 2020a. Deep weighted MaxSAT for aspect-based opinion extraction. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 5618–5628, Online. Association for Computational Linguistics.

Zhen Wu, Chengcan Ying, Fei Zhao, Zhifang Fan, Xinyu Dai, and Rui Xia. 2020b. Grid tagging scheme for aspect-oriented fine-grained opinion extraction. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 2576–2585.

Lu Xu, Yew Ken Chia, and Lidong Bing. 2021. Learning span-level interactions for aspect sentiment triplet extraction.

Lu Xu, Hao Li, Wei Lu, and Lidong Bing. 2020. Position-aware tagging for aspect sentiment triplet extraction. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 2339–2349, Online. Association for Computational Linguistics.

Hang Yan, Junqi Dai, Tuo Ji, Xipeng Qiu, and Zheng Zhang. 2021. A unified generative framework for aspect-based sentiment analysis. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 2416–2429, Online. Association for Computational Linguistics.
Chen Zhang, Qiuchi Li, Dawei Song, and Benyou Wang. 2020. A multi-task learning framework for opinion triplet extraction. arXiv preprint arXiv:2010.01512.

Wenxuan Zhang, Yang Deng, Xin Li, Yifei Yuan, Lidong Bing, and Wai Lam. 2021a. Aspect sentiment quad prediction as paraphrase generation. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 9209–9219, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.

Wenxuan Zhang, Xin Li, Yang Deng, Lidong Bing, and Wai Lam. 2021b. Towards generative aspect-based sentiment analysis. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 504–510.
