DeBERTa at SemEval-2024 Task 9: Using DeBERTa for Defying Common Sense
Siino, Marco
First author
2024-01-01
Abstract
The significant achievements of language models have motivated researchers in the natural language processing (NLP) community to confront challenges requiring nuanced and implicit reasoning, inspired by human-like commonsense understanding. Although efforts focusing on vertical thinking tasks have received substantial recognition, there remains a notable lack of investigation into lateral thinking puzzles. To bridge this gap, the task organizers at SemEval-2024 propose BRAINTEASER: a multiple-choice Question Answering task carefully designed to assess a model's lateral thinking capabilities and its capacity to question default commonsense assumptions. Specifically, for the first subtask of SemEval-2024 Task 9 (i.e., Sentence Puzzle), the organizers asked participants to develop models able to answer multiple-choice brain-teasing questions. For this purpose, we propose the application of a DeBERTa model in a zero-shot configuration. The proposed approach achieves an aggregate score of 0.250, suggesting significant room for improvement in future work.


