Study could lead to LLMs that are better at complex reasoning

study-could-lead-to-llms-that-are-better-at-complex-reasoning

For all their impressive capabilities, large language models (LLMs) often fall short when given challenging new tasks that require complex reasoning skills.

While an accounting firm’s LLM might excel at summarizing financial reports, that same model could fail unexpectedly if tasked with predicting market trends or identifying fraudulent transactions.

To make LLMs more adaptable, MIT researchers investigated how a certain training technique can be strategically deployed to boost a model’s performance on unfamiliar, difficult problems.

They show that test-time training, a method that involves temporarily updating some of a model’s inner workings during deployment, can lead to a sixfold improvement in accuracy. The researchers developed a framework for implementing a test-time training strategy that uses examples of the new task to maximize these gains.

Their work could improve a model’s flexibility, enabling an off-the-shelf LLM to adapt to complex tasks that require planning or abstraction.

 » Read More

Robotic probe quickly measures key properties of new materials

robotic-probe-quickly-measures-key-properties-of-new-materials

Scientists are striving to discover new semiconductor materials that could boost the efficiency of solar cells and other electronics. But the pace of innovation is bottlenecked by the speed at which researchers can manually measure important material properties.

A fully autonomous robotic system developed by MIT researchers could speed things up.

Their system utilizes a robotic probe to measure an important electrical property known as photoconductance, which is how electrically responsive a material is to the presence of light.

The researchers inject materials-science-domain knowledge from human experts into the machine-learning model that guides the robot’s decision making. This enables the robot to identify the best places to contact a material with the probe to gain the most information about its photoconductance, while a specialized planning procedure finds the fastest way to move between contact points.

During a 24-hour test, the fully autonomous robotic probe took more than 125 unique measurements per hour,

 » Read More

MIT and Mass General Hospital researchers find disparities in organ allocation

mit-and-mass-general-hospital-researchers-find-disparities-in-organ-allocation

In 1954, the world’s first successful organ transplant took place at Brigham and Women’s Hospital, in the form of a kidney donated from one twin to the other. At the time, a group of doctors and scientists had correctly theorized that the recipient’s antibodies were unlikely to reject an organ from an identical twin. One Nobel Prize and a few decades later, advancements in immune-suppressing drugs increased the viability of and demand for organ transplants. Today, over 1 million organ transplants have been performed in the United States, more than any other country in the world.

The impressive scale of this achievement was made possible due to advances in organ matching systems: The first computer-based organ matching system was released in 1977. Despite continued innovation in computing, medicine, and matching technology over the years, over 100,000 people in the U.S. are currently on the national transplant waiting list and 13 people die each day waiting for an organ transplant. 

 » Read More

New imaging technique reconstructs the shapes of hidden objects

new-imaging-technique-reconstructs-the-shapes-of-hidden-objects

A new imaging technique developed by MIT researchers could enable quality-control robots in a warehouse to peer through a cardboard shipping box and see that the handle of a mug buried under packing peanuts is broken.

Their approach leverages millimeter wave (mmWave) signals, the same type of signals used in Wi-Fi, to create accurate 3D reconstructions of objects that are blocked from view.

The waves can travel through common obstacles like plastic containers or interior walls, and reflect off hidden objects. The system, called mmNorm, collects those reflections and feeds them into an algorithm that estimates the shape of the object’s surface.

This new approach achieved 96 percent reconstruction accuracy on a range of everyday objects with complex, curvy shapes, like silverware and a power drill. State-of-the-art baseline methods achieved only 78 percent accuracy.

In addition, mmNorm does not require additional bandwidth to achieve such high accuracy.

 » Read More

Unpacking the bias of large language models

unpacking-the-bias-of-large-language-models

Research has shown that large language models (LLMs) tend to overemphasize information at the beginning and end of a document or conversation, while neglecting the middle.

This “position bias” means that, if a lawyer is using an LLM-powered virtual assistant to retrieve a certain phrase in a 30-page affidavit, the LLM is more likely to find the right text if it is on the initial or final pages.

MIT researchers have discovered the mechanism behind this phenomenon.

They created a theoretical framework to study how information flows through the machine-learning architecture that forms the backbone of LLMs. They found that certain design choices which control how the model processes input data can cause position bias.

Their experiments revealed that model architectures, particularly those affecting how information is spread across input words within the model, can give rise to or intensify position bias,

 » Read More

Using generative AI to help robots jump higher and land safely

using-generative-ai-to-help-robots-jump-higher-and-land-safely

Diffusion models like OpenAI’s DALL-E are becoming increasingly useful in helping brainstorm new designs. Humans can prompt these systems to generate an image, create a video, or refine a blueprint, and come back with ideas they hadn’t considered before.

But did you know that generative artificial intelligence (GenAI) models are also making headway in creating working robots? Recent diffusion-based approaches have generated structures and the systems that control them from scratch. With or without a user’s input, these models can make new designs and then evaluate them in simulation before they’re fabricated.

A new approach from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) applies this generative know-how toward improving humans’ robotic designs. Users can draft a 3D model of a robot and specify which parts they’d like to see a diffusion model modify, providing its dimensions beforehand. GenAI then brainstorms the optimal shape for these areas and tests its ideas in simulation.

 » Read More

Four from MIT named 2025 Goldwater Scholars

four-from-mit-named-2025-goldwater-scholars

Four MIT rising seniors have been selected to receive a 2025 Barry Goldwater Scholarship, including Avani Ahuja and Jacqueline Prawira in the School of Engineering and Julianna Lian and Alex Tang from the School of Science. An estimated 5,000 college sophomores and juniors from across the United States were nominated for the scholarships, of whom only 441 were selected.

The Goldwater Scholarships have been conferred since 1989 by the Barry Goldwater Scholarship and Excellence in Education Foundation. These scholarships have supported undergraduates who go on to become leading scientists, engineers, and mathematicians in their respective fields.

Avani Ahuja, a mechanical engineering and electrical engineering major, conducts research in the Conformable Decoders group, where she is focused on developing a “wearable conformable breast ultrasound patch” that makes ultrasounds for breast cancer more accessible.

“Doing research in the Media Lab has had a huge impact on me,

 » Read More

LLMs factor in unrelated information when recommending medical treatments

llms-factor-in-unrelated-information-when-recommending-medical-treatments

A large language model (LLM) deployed to make treatment recommendations can be tripped up by nonclinical information in patient messages, like typos, extra white space, missing gender markers, or the use of uncertain, dramatic, and informal language, according to a study by MIT researchers.

They found that making stylistic or grammatical changes to messages increases the likelihood an LLM will recommend that a patient self-manage their reported health condition rather than come in for an appointment, even when that patient should seek medical care.

Their analysis also revealed that these nonclinical variations in text, which mimic how people really communicate, are more likely to change a model’s treatment recommendations for female patients, resulting in a higher percentage of women who were erroneously advised not to seek medical care, according to human doctors.

This work “is strong evidence that models must be audited before use in health care — which is a setting where they are already in use,” says Marzyeh Ghassemi,

 » Read More

Researchers present bold ideas for AI at MIT Generative AI Impact Consortium kickoff event

researchers-present-bold-ideas-for-ai-at-mit-generative-ai-impact-consortium-kickoff-event

Launched in February of this year, the MIT Generative AI Impact Consortium (MGAIC), a presidential initiative led by MIT’s Office of Innovation and Strategy and administered by the MIT Stephen A. Schwarzman College of Computing, issued a call for proposals, inviting researchers from across MIT to submit ideas for innovative projects studying high-impact uses of generative AI models.

The call received 180 submissions from nearly 250 faculty members, spanning all of MIT’s five schools and the college. The overwhelming response across the Institute exemplifies the growing interest in AI and follows in the wake of MIT’s Generative AI Week and call for impact papers. Fifty-five proposals were selected for MGAIC’s inaugural seed grants, with several more selected to be funded by the consortium’s founding company members.

Over 30 funding recipients presented their proposals to the greater MIT community at a kickoff event on May 13.

 » Read More

New 3D chips could make electronics faster and more energy-efficient

new-3d-chips-could-make-electronics-faster-and-more-energy-efficient

The advanced semiconductor material gallium nitride will likely be key for the next generation of high-speed communication systems and the power electronics needed for state-of-the-art data centers.

Unfortunately, the high cost of gallium nitride (GaN) and the specialization required to incorporate this semiconductor material into conventional electronics have limited its use in commercial applications.

Now, researchers from MIT and elsewhere have developed a new fabrication process that integrates high-performance GaN transistors onto standard silicon CMOS chips in a way that is low-cost and scalable, and compatible with existing semiconductor foundries.

Their method involves building many tiny transistors on the surface of a GaN chip, cutting out each individual transistor, and then bonding just the necessary number of transistors onto a silicon chip using a low-temperature process that preserves the functionality of both materials.

The cost remains minimal since only a tiny amount of GaN material is added to the chip,

 » Read More