Study could lead to LLMs that are better at complex reasoning

For all their impressive capabilities, large language models (LLMs) often fall short when given challenging new tasks that require complex reasoning skills.

While an accounting firm’s LLM might excel at summarizing financial reports, that same model could fail unexpectedly if tasked with predicting market trends or identifying fraudulent transactions.

To make LLMs more adaptable, MIT researchers investigated how a certain training technique can be strategically deployed to boost a model’s performance on unfamiliar, difficult problems.

They show that test-time training, a method that involves temporarily updating some of a model’s inner workings during deployment, can lead to a sixfold improvement in accuracy. The researchers developed a framework for implementing a test-time training strategy that uses examples of the new task to maximize these gains.

Their work could improve a model’s flexibility, enabling an off-the-shelf LLM to adapt to complex tasks that require planning or abstraction.

 » Read More

Robotic probe quickly measures key properties of new materials

Scientists are striving to discover new semiconductor materials that could boost the efficiency of solar cells and other electronics. But the pace of innovation is bottlenecked by the speed at which researchers can manually measure important material properties.

A fully autonomous robotic system developed by MIT researchers could speed things up.

Their system uses a robotic probe to measure an important electrical property known as photoconductance, which describes how electrically responsive a material is to light.

The researchers inject domain knowledge from human materials-science experts into the machine-learning model that guides the robot’s decision-making. This enables the robot to identify the best places to contact a material with the probe to gain the most information about its photoconductance, while a specialized planning procedure finds the fastest way to move between contact points.

During a 24-hour test, the fully autonomous robotic probe took more than 125 unique measurements per hour,

 » Read More

New imaging technique reconstructs the shapes of hidden objects

A new imaging technique developed by MIT researchers could enable quality-control robots in a warehouse to peer through a cardboard shipping box and see that the handle of a mug buried under packing peanuts is broken.

Their approach leverages millimeter wave (mmWave) signals, a type of signal also used by some Wi-Fi standards, to create accurate 3D reconstructions of objects that are blocked from view.

The waves can travel through common obstacles like plastic containers or interior walls, and reflect off hidden objects. The system, called mmNorm, collects those reflections and feeds them into an algorithm that estimates the shape of the object’s surface.

This new approach achieved 96 percent reconstruction accuracy on a range of everyday objects with complex, curvy shapes, like silverware and a power drill. State-of-the-art baseline methods achieved only 78 percent accuracy.

In addition, mmNorm does not require additional bandwidth to achieve such high accuracy.

 » Read More

Unpacking the bias of large language models

Research has shown that large language models (LLMs) tend to overemphasize information at the beginning and end of a document or conversation, while neglecting the middle.

This “position bias” means that, if a lawyer is using an LLM-powered virtual assistant to retrieve a certain phrase in a 30-page affidavit, the LLM is more likely to find the right text if it is on the initial or final pages.

MIT researchers have discovered the mechanism behind this phenomenon.

They created a theoretical framework to study how information flows through the machine-learning architecture that forms the backbone of LLMs. They found that certain design choices that control how the model processes input data can cause position bias.
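
A toy calculation can show how a design choice might favor early positions. The sketch below assumes causal masking (each token attends only to itself and earlier tokens) as the design choice, and assumes perfectly uniform attention weights. Both are simplifications chosen for illustration, and the function names are hypothetical. It is not the researchers' framework, and it only illustrates a bias toward the beginning of the input.

```python
def uniform_causal_attention(n):
    """Row i attends equally to tokens 0..i and ignores later tokens."""
    return [[1.0 / (i + 1) if j <= i else 0.0 for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def influence_on_last_token(n_tokens, n_layers):
    """Weight each input token contributes to the last position after n_layers."""
    layer = uniform_causal_attention(n_tokens)
    total = layer
    for _ in range(n_layers - 1):
        total = matmul(layer, total)
    return total[-1]

one_layer = influence_on_last_token(8, 1)
four_layers = influence_on_last_token(8, 4)
print(one_layer[0], four_layers[0], four_layers[-1])
```

With one layer, every token contributes equally to the last position. After four stacked layers, the first token's share has grown and the last token's share has shrunk, even though no single layer prefers any position.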

Their experiments revealed that model architectures, particularly those affecting how information is spread across input words within the model, can give rise to or intensify position bias,

 » Read More

Bringing meaning into technology deployment

In 15 TED Talk-style presentations, MIT faculty recently discussed their pioneering research that incorporates social, ethical, and technical considerations and expertise. Each project was supported by seed grants established by the Social and Ethical Responsibilities of Computing (SERC), a cross-cutting initiative of the MIT Schwarzman College of Computing. The call for proposals last summer was met with nearly 70 applications. A committee with representatives from every MIT school and the college convened to select the winning projects, which received up to $100,000 in funding.

“SERC is committed to driving progress at the intersection of computing, ethics, and society. The seed grants are designed to ignite bold, creative thinking around the complex challenges and possibilities in this space,” said Nikos Trichakis, co-associate dean of SERC and the J.C. Penney Professor of Management. “With the MIT Ethics of Computing Research Symposium, we felt it important to not just showcase the breadth and depth of the research that’s shaping the future of ethical computing,

 » Read More

Animation technique simulates the motion of squishy objects

Animators could create more realistic bouncy, stretchy, and squishy characters for movies and video games thanks to a new simulation method developed by researchers at MIT.

Their approach allows animators to simulate rubbery and elastic materials in a way that preserves the physical properties of the material and avoids pitfalls like instability.

The technique simulates elastic objects for animation and other applications, with improved reliability compared to other methods. Many existing simulation techniques, by contrast, can produce elastic animations that become erratic or sluggish, or can even break down entirely.

To achieve this improvement, the MIT researchers uncovered a hidden mathematical structure in the equations that capture, on a computer, how elastic materials deform. By leveraging this property, known as convexity, they designed a method that consistently produces accurate, physically faithful simulations.

 » Read More

New system enables robots to solve manipulation problems in seconds

Ready for that long-awaited summer vacation? First, you’ll need to pack all items required for your trip into a suitcase, making sure everything fits securely without crushing anything fragile.

Because humans possess strong visual and geometric reasoning skills, this is usually a straightforward problem, even if it may take a bit of finagling to squeeze everything in.

To a robot, though, it is an extremely complex planning challenge that requires thinking simultaneously about many actions, constraints, and mechanical capabilities. Finding an effective solution could take the robot a very long time — if it can even come up with one.

Researchers from MIT and NVIDIA Research have developed a novel algorithm that dramatically speeds up the robot’s planning process. Their approach enables a robot to “think ahead” by evaluating thousands of possible solutions in parallel and then refining the best ones to meet the constraints of the robot and its environment.
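
The pattern of evaluating many candidate solutions and then refining the best ones can be sketched with a toy packing problem. Everything below is assumed for illustration: the one-dimensional "suitcase", the item widths, the random local search used for refinement, and all function names. Candidates are scored in an ordinary loop here rather than in parallel, and this is not the researchers' algorithm.

```python
import random

WIDTHS = [3.0, 2.0, 4.0]   # hypothetical item widths
SUITCASE = 10.0            # hypothetical suitcase length

def violation(positions):
    """Total constraint violation: out-of-bounds distance plus pairwise overlap."""
    total = 0.0
    for pos, width in zip(positions, WIDTHS):
        total += max(0.0, -pos) + max(0.0, pos + width - SUITCASE)
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            left = max(positions[i], positions[j])
            right = min(positions[i] + WIDTHS[i], positions[j] + WIDTHS[j])
            total += max(0.0, right - left)
    return total

def refine(positions, rng, steps=200, scale=0.5):
    """Local search that only accepts moves that reduce the violation."""
    best, best_score = list(positions), violation(positions)
    for _ in range(steps):
        trial = [p + rng.uniform(-scale, scale) for p in best]
        score = violation(trial)
        if score < best_score:
            best, best_score = trial, score
    return best, best_score

rng = random.Random(0)
# Sample thousands of candidate placements and rank them by violation.
candidates = [[rng.uniform(0.0, SUITCASE) for _ in WIDTHS] for _ in range(2000)]
ranked = sorted(candidates, key=violation)
initial_best = violation(ranked[0])
# Refine only the most promising candidates.
refined = [refine(c, rng) for c in ranked[:10]]
solution, final_score = min(refined, key=lambda pair: pair[1])
print(initial_best, final_score)
```

Because refinement only accepts improvements, the final score can never be worse than the best sampled candidate. Broad sampling finds promising starting points, and the local step closes the remaining gap to the constraints.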

 » Read More

System lets robots identify an object’s properties through handling

A human clearing junk out of an attic can often guess the contents of a box simply by picking it up and giving it a shake, without the need to see what’s inside. Researchers from MIT, Amazon Robotics, and the University of British Columbia have taught robots to do something similar.

They developed a technique that enables robots to use only internal sensors to learn about an object’s weight, softness, or contents by picking it up and gently shaking it. With their method, which does not require external measurement tools or cameras, the robot can accurately guess parameters like an object’s mass in a matter of seconds.

This low-cost technique could be especially useful in applications where cameras might be less effective, such as sorting objects in a dark basement or clearing rubble inside a building that partially collapsed after an earthquake.

Key to their approach is a simulation process that incorporates models of the robot and the object to rapidly identify characteristics of that object as the robot interacts with it. 
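
As a rough illustration of identifying a property by matching a simulation to what the robot senses, the sketch below recovers an object's mass from a simulated lift. The one-dimensional physics, the force profile, the candidate grid, and the function names are all assumptions made for this example. The "observed" data is itself generated by the simulator as a stand-in for real sensor readings, and this is not the researchers' method.

```python
GRAVITY = 9.81  # m/s^2
DT = 0.01       # time step in seconds

def simulate_lift(mass, forces):
    """Heights of an object pushed upward by the given force at each time step."""
    height, velocity, heights = 0.0, 0.0, []
    for force in forces:
        acceleration = force / mass - GRAVITY
        velocity += acceleration * DT
        height += velocity * DT
        heights.append(height)
    return heights

def estimate_mass(forces, observed, candidates):
    """Pick the candidate mass whose simulated motion best matches the observation."""
    def mismatch(mass):
        simulated = simulate_lift(mass, forces)
        return sum((s - o) ** 2 for s, o in zip(simulated, observed))
    return min(candidates, key=mismatch)

# Hypothetical data: forces as if read from joint torques, heights as if from encoders.
forces = [20.0 if step < 50 else 10.0 for step in range(100)]
observed = simulate_lift(1.5, forces)  # stand-in for real sensor readings
candidates = [0.5 + 0.01 * k for k in range(300)]
estimated = estimate_mass(forces, observed, candidates)
print(estimated)
```

No camera or external scale appears anywhere: the estimate comes only from the applied forces and the resulting motion, both of which a robot could in principle read from its own joints.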

 » Read More

Novel AI model inspired by neural dynamics from the brain

Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have developed a novel artificial intelligence model inspired by neural oscillations in the brain, with the goal of significantly advancing how machine learning algorithms handle long sequences of data.

AI often struggles with analyzing complex information that unfolds over long periods of time, such as climate trends, biological signals, or financial data. One new type of AI model, called a “state-space model,” has been designed specifically to understand these sequential patterns more effectively. However, existing state-space models often face challenges: they can become unstable or require a significant amount of computational resources when processing long data sequences.

To address these issues, CSAIL researchers T. Konstantin Rusch and Daniela Rus have developed what they call “linear oscillatory state-space models” (LinOSS), which leverage principles of forced harmonic oscillators — a concept deeply rooted in physics and observed in biological neural networks.
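
For readers unfamiliar with forced harmonic oscillators, the sketch below integrates a single one, x'' = -stiffness * x + gain * u(t), over a long input sequence. The parameter names, the step size, and the symplectic Euler integrator are assumptions chosen for illustration. This shows only the underlying physics and is not the LinOSS model or its discretization.

```python
import math

def simulate_oscillator(inputs, stiffness=1.0, gain=1.0, dt=0.1, x0=0.0, v0=0.0):
    """Integrate x'' = -stiffness * x + gain * u(t) with symplectic Euler."""
    x, v, trajectory = x0, v0, []
    for u in inputs:
        v += dt * (-stiffness * x + gain * u)  # update velocity first
        x += dt * v                            # then position, using the new velocity
        trajectory.append(x)
    return trajectory

# Unforced: start displaced and let the oscillator ring for 10,000 steps.
free = simulate_oscillator([0.0] * 10_000, x0=1.0)
# Forced: start at rest and drive with a slow sinusoidal input signal.
driven = simulate_oscillator([math.sin(0.05 * k) for k in range(10_000)])
print(max(abs(x) for x in free))
```

The unforced oscillation stays bounded near its starting amplitude over the whole sequence instead of blowing up or dying out, which is the kind of stable long-horizon behavior that makes oscillators an appealing building block for sequence models.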

 » Read More

Designing a new way to optimize complex coordinated systems

Coordinating complicated interactive systems, whether it’s the different modes of transportation in a city or the various components that must work together to make an effective and efficient robot, is an increasingly important subject for software designers to tackle. Now, researchers at MIT have developed an entirely new way of approaching these complex problems, using simple diagrams as a tool to reveal better approaches to software optimization in deep-learning models.

They say the new method makes addressing these complex tasks so simple that it can be reduced to a drawing that would fit on the back of a napkin.

The new approach is described in the journal Transactions on Machine Learning Research, in a paper by incoming doctoral student Vincent Abbott and Professor Gioele Zardini of MIT’s Laboratory for Information and Decision Systems (LIDS).

“We designed a new language to talk about these new systems,” Zardini says.

 » Read More