Researchers tested the inductive OOCR capabilities of LLMs using a suite of five tasks designed specifically for this purpose[6]. The tasks included identifying an unknown city from its distances to known cities, determining whether a coin is biased from the results of individual flips, and recovering a function and its inverse from input-output pairs seen during training. GPT-3.5, GPT-4, and Llama 3 all showed OOCR success on these tasks, demonstrating their inductive reasoning abilities[6].
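To make the setup concrete, here is a minimal sketch of how finetuning data for the functions task might be generated: the model only ever sees individual input-output pairs, never a description of the underlying function. The specific function, prompt wording, and file name are illustrative assumptions, not the paper's actual setup.

```python
import json
import random

def make_function_examples(n_examples: int = 200, seed: int = 0) -> list[dict]:
    """Build chat-style finetuning examples for a latent function.

    The function f(x) = 3x + 2 is never stated in any example; the model
    must infer it from the pairs alone (an assumed, simplified setup).
    """
    rng = random.Random(seed)
    examples = []
    for _ in range(n_examples):
        x = rng.randint(-100, 100)
        y = 3 * x + 2  # the latent function, known only implicitly
        examples.append({
            "messages": [
                {"role": "user", "content": f"What is f({x})?"},
                {"role": "assistant", "content": str(y)},
            ]
        })
    return examples

if __name__ == "__main__":
    with open("function_task.jsonl", "w") as fh:
        for ex in make_function_examples():
            fh.write(json.dumps(ex) + "\n")
```

After finetuning on such pairs, the model is then asked held-out questions it never saw in training, for example to describe f in words or to evaluate its inverse.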
One potential danger is that LLMs could acquire harmful knowledge, such as instructions for producing biological pathogens. Even if explicit mentions of a dangerous fact are removed from the training data, an LLM may still deduce it by piecing together faint clues scattered across many sources. This raises concerns that models can learn and use knowledge in ways that are difficult for humans to monitor, which could in turn enable deception by misaligned models.
In the city-identification experiment, the model was finetuned on a dataset containing only the distances between various known cities and an unnamed city[2]. Remarkably, without in-context examples or chain-of-thought reasoning, the LLM identified the unknown city as Paris and used this information to answer downstream questions about it[1][2].
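A minimal sketch of how such a finetuning set could be constructed, assuming a chat-style format and an opaque placeholder ID for the unknown city; the city list, identifier, and prompt wording below are illustrative, not taken from the paper.

```python
import json
from math import radians, sin, cos, asin, sqrt

# Known reference cities with (latitude, longitude); illustrative choices.
KNOWN_CITIES = {
    "Madrid": (40.4168, -3.7038),
    "Berlin": (52.5200, 13.4050),
    "Rome": (41.9028, 12.4964),
    "London": (51.5074, -0.1278),
}
# The latent fact the model must infer: the unnamed city is Paris.
UNKNOWN_CITY = (48.8566, 2.3522)

def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))

# Each example states one distance to "City 50337" without ever naming Paris.
examples = []
for name, coords in KNOWN_CITIES.items():
    dist = round(haversine_km(coords, UNKNOWN_CITY))
    examples.append({
        "messages": [
            {"role": "user", "content": f"How far is City 50337 from {name}?"},
            {"role": "assistant", "content": f"Approximately {dist} km."},
        ]
    })

with open("locations_task.jsonl", "w") as fh:
    for ex in examples:
        fh.write(json.dumps(ex) + "\n")
```

Evaluation then probes whether the finetuned model can verbalize the latent fact, for example by asking it directly which city the placeholder refers to.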