Spotting AI's Touch: Identifying LLM-Paraphrased Spans in Text

Li, Yafu; Wang, Zhilin; Cui, Leyang; Bi, Wei; Shi, Shuming; Zhang, Yue

arXiv:2405.12689 (cs)

[Submitted on 21 May 2024 (v1), last revised 29 May 2024 (this version, v2)]

Abstract: AI-generated text detection has attracted increasing attention as powerful language models approach human-level generation. However, limited work is devoted to detecting (partially) AI-paraphrased texts, even though AI paraphrasing is commonly employed in various application scenarios for text refinement and diversity. To this end, we propose a novel detection framework, paraphrased text span detection (PTD), which aims to identify paraphrased text spans within a text. Unlike text-level detection, PTD takes in the full text and assigns each sentence a score indicating its degree of paraphrasing. We construct a dedicated dataset, PASTED, for paraphrased text span detection. Both in-distribution and out-of-distribution results demonstrate the effectiveness of PTD models in identifying AI-paraphrased text spans. Statistical and model analyses explain the crucial role of the context surrounding the paraphrased text spans. Extensive experiments show that PTD models can generalize to versatile paraphrasing prompts and to multiple paraphrased text spans. We release our resources at this https URL.

Comments:	ACL 2024 Findings
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2405.12689 [cs.CL]
	(or arXiv:2405.12689v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2405.12689

Submission history

From: Yafu Li
[v1] Tue, 21 May 2024 11:22:27 UTC (5,278 KB)
[v2] Wed, 29 May 2024 07:09:59 UTC (5,285 KB)
