HW-TSC at SemEval-2024 Task 9: Exploring Prompt Engineering Strategies for Brain Teaser Puzzles Through LLMs

Yinglu Li; Zhao Yanqing; Min Zhang (张民); Yadong Deng; Aiju Geng; Xiaoqin Liu; Mengxin Ren; Yuang Li; Su Chang; Xiaofeng Zhao

doi:10.18653/v1/2024.semeval-1.234

Abstract

Large Language Models (LLMs) have demonstrated impressive performance on many Natural Language Processing (NLP) tasks. However, their ability to solve more creative, lateral thinking puzzles remains relatively unexplored. In this work, we develop methods to enhance the lateral thinking and puzzle-solving capabilities of LLMs. We curate a dataset of word-type and sentence-type brain teasers requiring creative problem-solving abilities beyond commonsense reasoning. We first evaluate the zero-shot performance of models like GPT-3.5 and GPT-4 on this dataset. To improve their puzzle-solving skills, we employ prompting techniques like providing reasoning clues and chaining multiple examples to demonstrate the desired thinking process. We also fine-tune the state-of-the-art Mixtral 8x7B LLM on our dataset. Our methods enable the models to achieve strong results, securing 2nd and 3rd places in the brain teaser task. Our work highlights the potential of LLMs in acquiring complex reasoning abilities with the appropriate training. The efficacy of our approaches opens up new research avenues into advancing lateral thinking and creative problem-solving with AI systems.

Anthology ID: 2024.semeval-1.234
Volume: Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Month: June
Year: 2024
Address: Mexico City, Mexico
Editors: Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
Venue: SemEval
SIG: SIGLEX
Publisher: Association for Computational Linguistics
Pages: 1646–1651
URL: https://aclanthology.org/2024.semeval-1.234/
DOI: 10.18653/v1/2024.semeval-1.234
Cite (ACL): Yinglu Li, Zhao Yanqing, Min Zhang, Yadong Deng, Aiju Geng, Xiaoqin Liu, Mengxin Ren, Yuang Li, Su Chang, and Xiaofeng Zhao. 2024. HW-TSC at SemEval-2024 Task 9: Exploring Prompt Engineering Strategies for Brain Teaser Puzzles Through LLMs. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 1646–1651, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal): HW-TSC at SemEval-2024 Task 9: Exploring Prompt Engineering Strategies for Brain Teaser Puzzles Through LLMs (Li et al., SemEval 2024)
PDF: https://aclanthology.org/2024.semeval-1.234.pdf
Supplementary material: 2024.semeval-1.234.SupplementaryMaterial.txt
