Exploring the Abilities of Large Language Models to Solve Proportional Analogies via Knowledge-Enhanced Prompting

Thilini Wijesiriwardene, Ruwan Wickramarachchi, Sreeram Vennam, Vinija Jain, Aman Chadha, Amitava Das, Ponnurangam Kumaraguru, Amit Sheth

2024-12-03

Summary

This paper explores how well large language models (LLMs) can solve proportional analogies: four-term comparisons of the form "A is to B as C is to D" that test whether the same relationship holds across two pairs of words or concepts.

What's the problem?

Proportional analogies, like "Oxygen is to Gas as Aluminum is to Metal," require identifying the semantic relationship between the first pair of terms (here, "type of") and then finding a second pair that shares the same relationship. Many LLMs struggle with this task, and there has been little systematic research on how to evaluate their abilities in this area: existing datasets for testing these skills are often small and limited in variety.
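To make the task format concrete, here is a minimal sketch of what one multiple-choice analogy item might look like. The field names and distractor options are illustrative assumptions, not the paper's actual dataset schema.

```python
# A minimal sketch of a multiple-choice proportional-analogy item.
# Field names and distractors are illustrative assumptions, not the
# paper's actual dataset schema.
item = {
    "question": "Oxygen is to Gas as <blank> is to <blank>",
    "options": {
        "A": ("Aluminum", "Metal"),   # shares the "type of" relation
        "B": ("Water", "Ice"),        # state change, not "type of"
        "C": ("Tree", "Leaf"),        # part-whole, not "type of"
        "D": ("Doctor", "Hospital"),  # workplace, not "type of"
    },
    "answer": "A",
}
```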

What's the solution?

To tackle this issue, the researchers created a new dataset of 15,000 multiple-choice questions specifically designed for proportional analogy completion. They then evaluated several prompting techniques for supplying models with helpful context: exemplar knowledge (few-shot examples of solved analogies), structured knowledge (collections of facts from external sources), and targeted knowledge (the specific semantic relation linking the first pair of terms). Even with this help, the task remained hard: despite extensive training data, the best model reached only 55% accuracy. Notably, targeted knowledge assisted the models more than exemplars or collections of structured knowledge did.
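As a rough illustration of how the knowledge-enhanced settings differ, here is a minimal Python sketch that assembles a prompt for the item shown earlier under each setting. The function name, prompt wording, and the idea of passing knowledge as a plain string are assumptions for illustration; the paper's actual retrieval and prompting pipeline is not reproduced here.

```python
def build_prompt(item, setting, knowledge=None, exemplars=None):
    """Assemble a prompt for one analogy item under a given setting.

    setting: "zero_shot", "exemplar", "structured", or "targeted".
    knowledge/exemplars are caller-supplied strings; how they would be
    retrieved (e.g., from a knowledge graph) is not shown here.
    """
    options = "\n".join(
        f"{label}. {a} is to {b}"
        for label, (a, b) in item["options"].items()
    )
    parts = []
    if setting == "exemplar" and exemplars:
        # Few-shot: prepend worked analogy examples.
        parts.append(exemplars)
    elif setting in ("structured", "targeted") and knowledge:
        # Structured: a collection of facts about the terms.
        # Targeted: only the specific relation linking the first pair.
        parts.append(f"Relevant knowledge:\n{knowledge}")
    parts.append(
        f"Complete the analogy:\n{item['question']}\n{options}\nAnswer:"
    )
    return "\n\n".join(parts)

# Example: the targeted setting supplies just the linking relation.
print(build_prompt(item, "targeted", knowledge="Oxygen is a type of Gas."))
```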

Why it matters?

This research is important because it sheds light on the limitations of current large language models in understanding relationships between concepts. By providing a larger and more varied benchmark for analogy solving, this work can guide improvements to the reasoning capabilities of future LLMs, which is crucial for applications in education, AI development, and natural language processing.

Abstract

Making analogies is fundamental to cognition. Proportional analogies, which consist of four terms, are often used to assess linguistic and cognitive abilities. For instance, completing analogies like "Oxygen is to Gas as <blank> is to <blank>" requires identifying the semantic relationship (e.g., "type of") between the first pair of terms ("Oxygen" and "Gas") and finding a second pair that shares the same relationship (e.g., "Aluminum" and "Metal"). In this work, we introduce a 15K Multiple-Choice Question Answering (MCQA) dataset for proportional analogy completion and evaluate the performance of contemporary Large Language Models (LLMs) in various knowledge-enhanced prompt settings. Specifically, we augment prompts with three types of knowledge: exemplar, structured, and targeted. Our results show that despite extensive training data, solving proportional analogies remains challenging for current LLMs, with the best model achieving an accuracy of 55%. Notably, we find that providing targeted knowledge can better assist models in completing proportional analogies compared to providing exemplars or collections of structured knowledge.