DeBERTa at SemEval-2024 Task 9: Using DeBERTa for Defying Common Sense

Siino, Marco

doi:10.18653/v1/2024.semeval-1.45

The significant achievements of language models have motivated researchers in the natural language processing (NLP) community to confront challenges requiring nuanced and implicit reasoning, inspired by human-like commonsense understanding. Although efforts focusing on vertical thinking tasks have received substantial recognition, lateral thinking puzzles remain largely uninvestigated. To fill this gap, the task organizers at SemEval-2024 propose BRAINTEASER: a multiple-choice Question Answering task carefully designed to assess a model's lateral thinking capabilities and its capacity to question default commonsense assumptions. Specifically, for the first subtask of SemEval-2024 Task 9 (Sentence Puzzle), the organizers asked participants to develop models able to answer multiple-choice brain-teasing questions. For this purpose, we propose the application of a DeBERTa model in a zero-shot configuration. The proposed approach achieves an aggregate score of 0.250, suggesting significant room for improvement in future work.
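To make the zero-shot multiple-choice setting concrete, the following is a minimal sketch of how one answer can be selected per puzzle and how an accuracy-style score can be computed. It is not the system described above. The names `pick_answer`, `accuracy`, and `toy_overlap_score` are hypothetical, and the word-overlap scorer is only a stand-in for whatever compatibility score a pretrained model such as DeBERTa would assign to a question and candidate pair without task-specific fine-tuning.

```python
from typing import Callable, Sequence


def pick_answer(
    question: str,
    choices: Sequence[str],
    score_fn: Callable[[str, str], float],
) -> int:
    """Return the index of the highest-scoring choice (ties go to the first)."""
    scores = [score_fn(question, choice) for choice in choices]
    return max(range(len(choices)), key=scores.__getitem__)


def accuracy(predictions: Sequence[int], gold: Sequence[int]) -> float:
    """Fraction of puzzles for which the predicted index equals the gold index."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


def toy_overlap_score(question: str, choice: str) -> float:
    """Hypothetical stand-in scorer: share of choice words that occur in the question.

    A real zero-shot system would replace this with a model-derived score.
    """
    question_words = set(question.lower().split())
    choice_words = set(choice.lower().split())
    return len(question_words & choice_words) / max(len(choice_words), 1)
```

In use, `pick_answer(question, choices, toy_overlap_score)` yields one predicted index per puzzle, and `accuracy` compares the list of predictions with the gold indices. Keeping the scorer as a parameter means the selection and evaluation logic stays the same if the toy scorer is replaced by a model-based one.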


2024-01-01
