Cost-Efficient AI Model Retraining to Prevent Forgetting: A Breakthrough Discovery by Researchers

Enterprises frequently find that fine-tuning a large language model (LLM), a successful strategy for aligning it with its intended purpose and grounding it in their data, comes at a cost to some of its capabilities. After fine-tuning, a model may “forget” how to perform specific tasks or skills it had previously acquired.

Research from the University of Illinois Urbana-Champaign introduces a new approach to retraining models that avoids “catastrophic forgetting,” in which the model loses previously learned knowledge. The study concentrates on two multimodal models, LLaVA and Qwen 2.5-VL, both of which generate responses from images.

The proposed method has enterprises retrain only specific parts of an LLM, avoiding a full retraining run and the sharp increase in compute cost that would come with it. The researchers argue that catastrophic forgetting is not genuine memory loss but a consequence of bias drift in the output distribution.

“Developing a new LMM can be a costly and time-consuming process, accompanied by a substantial carbon footprint. Therefore, identifying more efficient and effective ways to update existing models is imperative,” the team emphasized in their paper. “Guided by this insight, we explore tuning approaches that maintain learning while limiting output deviations.”

Much of the study focuses on the multi-layer perceptron (MLP), the feed-forward block inside each transformer layer of the model.
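
For context, the sketch below shows the gated (SwiGLU-style) feed-forward layout used in LLaMA- and Qwen-family decoders, which distinguishes the up and gate projections from the down projection the researchers later single out. The layer names (gate_proj, up_proj, down_proj) follow the common Hugging Face convention and are an assumption for illustration, not code from the paper.

```python
# A minimal sketch of a gated (SwiGLU-style) transformer MLP block.
# Layer names follow Hugging Face conventions for LLaMA/Qwen-family
# models; this is an illustration, not code from the paper.
import torch
import torch.nn as nn

class GatedMLP(nn.Module):
    def __init__(self, hidden_size: int, intermediate_size: int):
        super().__init__()
        self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Activated gate times the up projection, then the down
        # projection maps the result back to the hidden size.
        return self.down_proj(self.act(self.gate_proj(x)) * self.up_proj(x))
```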

Catastrophic Forgetting

The researchers initially sought to confirm the presence and cause of catastrophic forgetting in models.

To achieve this, they designed a series of tasks for the models to complete. Following fine-tuning and evaluation, the researchers observed whether significant forgetting occurred. Interestingly, as the process unfolded, the models began to regain some of their capabilities.

“We observed a notable phenomenon where model performance would decrease notably in certain benchmarks after training on a specific task. However, the performance would largely recover on other specialized tasks not well-represented in the benchmarks,” they noted. “During the forgetting mitigation experiments, we also explored the option of tuning only the self-attention projection (SA Proj) or MLP layers, based on the observation that tuning solely the LLM yielded better results than full model tuning. Surprisingly, tuning only the self-attention projection layers facilitated significant learning of the target tasks without compromising performance on held-out tasks, even after sequential training on all five target tasks.”
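
One way to reproduce that kind of selective tuning is to freeze every parameter whose name does not match the target sub-module before building the optimizer. The sketch below assumes Hugging Face LLaMA/Qwen module naming (self_attn.q_proj and so on); it is an illustration of the idea, not the authors' code.

```python
# Illustrative sketch: tune only the self-attention projection (SA Proj)
# matrices by freezing everything else. Name patterns assume Hugging Face
# LLaMA/Qwen module naming, not the authors' released code.
import torch

SA_PROJ_PATTERNS = ("self_attn.q_proj", "self_attn.k_proj",
                    "self_attn.v_proj", "self_attn.o_proj")

def freeze_all_but_sa_proj(model: torch.nn.Module) -> None:
    for name, param in model.named_parameters():
        param.requires_grad = any(pat in name for pat in SA_PROJ_PATTERNS)

# The optimizer then only updates the unfrozen parameters, e.g.:
# optimizer = torch.optim.AdamW(
#     (p for p in model.parameters() if p.requires_grad), lr=1e-5)
```

The same pattern-matching approach covers the other tuning variants the paper compares; only the list of names changes.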

The researchers suggested that what looks like forgetting or interference after fine-tuning on a narrow task is actually a bias in the output distribution caused by the shift in task distribution.

Narrow Retraining

This discovery proved pivotal to the experiment. The researchers found that tuning the full MLP increased the model's likelihood of outputting numeric tokens, with a corresponding drop in accuracy on held-out tasks, indicating that the apparent knowledge loss is transient rather than permanent.
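
One rough way to check for that kind of drift is to compare how much next-token probability mass lands on numeric tokens before and after fine-tuning. The sketch below assumes a standard Hugging Face causal-LM and tokenizer interface; it is an illustrative probe, not the paper's evaluation protocol.

```python
# Illustrative probe: average probability mass assigned to numeric tokens
# on a fixed set of held-out prompts. Compare the value before and after
# fine-tuning to see whether the output distribution has drifted.
# Assumes a Hugging Face causal LM and tokenizer (an assumption here).
import torch

def numeric_token_mass(model, tokenizer, prompts):
    vocab_tokens = tokenizer.convert_ids_to_tokens(list(range(tokenizer.vocab_size)))
    numeric_ids = [i for i, tok in enumerate(vocab_tokens)
                   if any(ch.isdigit() for ch in tok)]
    masses = []
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits[0, -1]  # next-token logits
        probs = torch.softmax(logits, dim=-1)
        masses.append(probs[numeric_ids].sum().item())
    return sum(masses) / len(masses)
```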

“To prevent bias in the output distribution, we propose tuning the MLP up/gating projections while keeping the down projection static, which resulted in comparable learning to full MLP tuning with minimal forgetting,” the researchers explained.
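
In the same pattern-matching style as the earlier sketch, and again assuming Hugging Face style module names (mlp.gate_proj, mlp.up_proj, mlp.down_proj) rather than the authors' released code, the recipe might look like this:

```python
# Illustrative sketch of the "tune up/gate, freeze down" recipe: only the
# MLP up and gate projections receive gradients, while the down projection
# (and everything else) stays frozen. Module names assume Hugging Face
# LLaMA/Qwen conventions.
import torch

def freeze_all_but_mlp_up_gate(model: torch.nn.Module) -> None:
    for name, param in model.named_parameters():
        param.requires_grad = ("mlp.gate_proj" in name) or ("mlp.up_proj" in name)
```

Keeping the down projection static is what the researchers credit with holding the output-token distribution steady while the up and gate projections still adapt to the new task.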

This streamlined approach offers a more efficient and reproducible method for fine-tuning a model. By concentrating on a specific segment of the model rather than retraining it entirely, enterprises can reduce computational costs and better manage output variations.

However, the study covered only two models, both specializing in vision-and-language tasks. The researchers acknowledged that resource constraints prevented them from experimenting with a broader range of models.

Nevertheless, the findings can be extrapolated to other LLMs, particularly those involving diverse modalities.
