Charting the future of AI, from safer answers to faster thinking

New tools and technologies gain adoption when users perceive them as reliable, accessible, and, for the cost, an improvement over available methods and workflows. Five PhD students from the inaugural class of the MIT-IBM Watson AI Lab Summer Program are using state-of-the-art resources, alleviating AI pain points, and creating new features and capabilities to promote AI usefulness and deployment — from learning when to trust a model that predicts another’s accuracy to reasoning more effectively over knowledge bases. Together, the efforts from the students and their mentors form a through-line, where practical and technically rigorous research leads to more dependable and valuable models across domains.

The students are building probes, routers, new attention mechanisms, synthetic datasets, and program-synthesis pipelines; their work spans safety, inference efficiency, multimodal data, and knowledge-grounded reasoning. Their techniques emphasize scaling and integration, with impact always in sight.

Learning to trust, and when

MIT math graduate student Andrey Bryutkin’s research prioritizes the trustworthiness of models.

 » Read More

Method teaches generative AI models to locate personalized objects

Say a person takes their French Bulldog, Bowser, to the dog park. Identifying Bowser as he plays among the other canines is easy for the dog owner to do while on site.

But if someone wants to use a generative AI model like GPT-5 to monitor their pet while they are at work, the model could fail at this basic task. Vision-language models like GPT-5 often excel at recognizing general objects, like a dog, but they perform poorly at locating personalized objects, like Bowser the French Bulldog.    

To address this shortcoming, researchers from MIT and the MIT-IBM Watson AI Lab have introduced a new training method that teaches vision-language models to localize personalized objects in a scene.

Their method uses carefully prepared video-tracking data in which the same object is tracked across multiple frames. They designed the dataset so the model must focus on contextual clues to identify the personalized object, …
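
As a rough illustration of how such training examples could be assembled, here is a minimal sketch; the Frame layout, field names, and make_example helper are hypothetical stand-ins, not the authors’ actual data format:

```python
# Minimal sketch of turning one video-tracking sequence into a training
# example: a few "context" frames show the same object with its bounding
# box, and a later query frame withholds the box as the prediction target.
# Frame, make_example, and the (x, y, w, h) layout are hypothetical.
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h)

@dataclass
class Frame:
    image_path: str
    box: Optional[Box]  # tracked object's box; None when withheld

def make_example(track: List[Frame], n_context: int = 3):
    """Split one object track into in-context frames plus a query frame."""
    context = track[:n_context]                       # frames with known boxes
    query = Frame(track[n_context].image_path, None)  # box withheld from model
    target = track[n_context].box                     # ground truth to predict
    return context, query, target
```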

 » Read More

How to build AI scaling laws for efficient LLM training and budget maximization

When researchers are building large language models (LLMs), they aim to maximize performance under a particular computational and financial budget. Since training a model can amount to millions of dollars, developers need to be judicious with cost-impacting decisions about, for instance, the model architecture, optimizers, and training datasets before committing to a model. To anticipate the quality and accuracy of a large model’s predictions, practitioners often turn to scaling laws: using smaller, cheaper models to try to approximate the performance of a much larger target model. The challenge, however, is that there are thousands of ways to create a scaling law.
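
As a rough sketch of the basic idea (not the researchers’ recipe), a scaling law can be fit to a handful of small, cheap training runs and then extrapolated to a much larger target model; the power-law form and all the numbers below are illustrative assumptions:

```python
# Minimal sketch of fitting a power-law scaling law from small-model runs.
# The form L(N) = a * N^(-alpha) + c and the sample data are assumptions,
# not the paper's actual functional form or measurements.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(n_params, a, alpha, c):
    """Predicted loss as a function of parameter count."""
    return a * n_params ** (-alpha) + c

# Hypothetical (parameter count, validation loss) pairs from cheap runs.
sizes = np.array([1e7, 3e7, 1e8, 3e8, 1e9])
losses = np.array([4.2, 3.8, 3.4, 3.1, 2.9])

(a, alpha, c), _ = curve_fit(scaling_law, sizes, losses, p0=[10.0, 0.1, 1.0])

# Extrapolate to a much larger target model before paying to train it.
target = 7e10  # e.g., a 70-billion-parameter model
print(f"Predicted loss at {target:.0e} params: {scaling_law(target, a, alpha, c):.2f}")
```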

New work from MIT and MIT-IBM Watson AI Lab researchers addresses this by amassing and releasing a collection of hundreds of models and metrics concerning training and performance to approximate more than a thousand scaling laws. From this, the team developed a meta-analysis and guide for how to select small models and estimate scaling laws for different LLM families, …

 » Read More

MIT tool visualizes and edits “physically impossible” objects

M.C. Escher’s artwork is a gateway into a world of depth-defying optical illusions, featuring “impossible objects” that break the laws of physics with convoluted geometries. What you perceive his illustrations to be depends on your point of view — for example, a person seemingly walking upstairs may be heading down the steps if you tilt your head sideways.

Computer graphics scientists and designers can recreate these illusions in 3D, but only by bending or cutting a real shape and positioning it at a particular angle. This workaround has downsides, though: Changing the smoothness or lighting of the structure will expose that it isn’t actually an optical illusion, which also means you can’t accurately solve geometry problems on it.

Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have developed a unique approach to represent “impossible” objects in a more versatile way. Their “Meschers” tool converts images and 3D models into 2.5-dimensional structures, …

 » Read More

This “smart coach” helps LLMs switch between text and code

Large language models (LLMs) excel at using textual reasoning to understand the context of a document and provide a logical answer about its contents. But these same LLMs often struggle to correctly answer even the simplest math problems.

Textual reasoning is usually a less-than-ideal way to deliberate over computational or algorithmic tasks. While some LLMs can generate code, such as Python, to handle symbolic queries, the models don’t always know when to use code, or what kind of code would work best.

LLMs, it seems, may need a coach to steer them toward the best technique.

Enter CodeSteer, a smart assistant developed by MIT researchers that guides an LLM to switch between code and text generation until it correctly answers a query.

CodeSteer, itself a smaller LLM, automatically generates a series of prompts to iteratively steer a larger LLM. It reviews the model’s current and previous answers after each round and provides guidance for how to fix or refine that solution until it deems the answer correct.
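
In spirit, that steer-and-review loop can be sketched in a few lines; the method and function names below are hypothetical placeholders, not the actual CodeSteer implementation:

```python
# Minimal sketch of an iterative steering loop in the spirit of CodeSteer.
# steerer_prompt, solver_answer, and answer_checks_out are hypothetical
# placeholders for the small "coach" model and the large solver model.

def solve_with_steering(query, steerer, solver, max_rounds=5):
    history = []  # (guidance, answer) pairs from earlier rounds
    for _ in range(max_rounds):
        # The coach decides whether the solver should reason in text or
        # generate code, based on the query and the rounds so far.
        guidance = steerer.steerer_prompt(query, history)
        answer = solver.solver_answer(query, guidance)
        history.append((guidance, answer))
        # The coach reviews the answer; if it deems it correct, stop.
        if steerer.answer_checks_out(query, answer, history):
            return answer
    return history[-1][1]  # best effort after the round budget runs out
```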

 » Read More

Study could lead to LLMs that are better at complex reasoning

For all their impressive capabilities, large language models (LLMs) often fall short when given challenging new tasks that require complex reasoning skills.

While an accounting firm’s LLM might excel at summarizing financial reports, that same model could fail unexpectedly if tasked with predicting market trends or identifying fraudulent transactions.

To make LLMs more adaptable, MIT researchers investigated how a certain training technique can be strategically deployed to boost a model’s performance on unfamiliar, difficult problems.

They show that test-time training, a method that involves temporarily updating some of a model’s inner workings during deployment, can lead to a sixfold improvement in accuracy. The researchers developed a framework for implementing a test-time training strategy that uses examples of the new task to maximize these gains.
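
As a rough sketch of test-time training in general (not the researchers’ specific framework), the model is briefly fine-tuned on a few examples of the new task before answering, and the temporary weights are then discarded; the optimizer choice and hyperparameters below are assumptions:

```python
# Minimal sketch of generic test-time training: briefly fine-tune a copy of
# the model on a few examples of the new task, predict, then discard the
# temporary weights. SGD, the step count, and the learning rate are
# illustrative assumptions, not the paper's recipe.
import copy
import torch

def predict_with_ttt(model, task_examples, query, loss_fn, steps=10, lr=1e-5):
    adapted = copy.deepcopy(model)          # keep the deployed weights intact
    adapted.train()
    opt = torch.optim.SGD(adapted.parameters(), lr=lr)
    for _ in range(steps):                  # temporary updates at deployment
        for x, y in task_examples:          # examples of the unfamiliar task
            opt.zero_grad()
            loss = loss_fn(adapted(x), y)
            loss.backward()
            opt.step()
    adapted.eval()
    with torch.no_grad():
        return adapted(query)               # original model is left unchanged
```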

Their work could improve a model’s flexibility, enabling an off-the-shelf LLM to adapt to complex tasks that require planning or abstraction.

 » Read More