MINI REVIEW article

Front. Robot. AI

Sec. Robot Vision and Artificial Perception

Volume 12 - 2025 | doi: 10.3389/frobt.2025.1518965

This article is part of the Research Topic Embodied Neuromorphic AI for Robotic Perception.

A Survey of Model Compression Techniques: Past, Present, and Future

Provisionally accepted
Defu Liu*, Yixiao Zhu, Zhe Liu, Yi Liu, Changlin Han, Jinkai Tian, Ruihao Li*, Wei Yi*
  • Intelligent Game and Decision Lab (IGDL), Beijing, China

The final, formatted version of the article will be published soon.

    The exceptional performance of general-purpose large models has driven many industries to develop domain-specific models. However, large models are not only time-consuming and labor-intensive to train but also demand substantial memory and computational power at inference time. These requirements pose considerable challenges for the practical deployment of large models, and as the challenges intensify, model compression has become a vital research focus. This paper presents a comprehensive review of the evolution of model compression techniques, from their inception to future directions. To meet the urgent demand for efficient deployment, we examine several compression methods, including quantization, pruning, low-rank decomposition, and knowledge distillation, emphasizing their fundamental principles, recent advancements, and innovative strategies. By offering insights into the latest developments and their implications for practical applications, this review serves as a technical resource for researchers and practitioners, providing a range of strategies for model deployment and laying the groundwork for future advancements in model compression.
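    To make the surveyed technique families concrete, the following minimal sketch (not taken from the article; the toy weight matrix, rank, and sparsity level are illustrative assumptions) applies per-tensor INT8 quantization, magnitude pruning, and truncated-SVD low-rank decomposition to a single weight matrix in PyTorch. Knowledge distillation is omitted here since it requires a full teacher-student training loop.

```python
import torch

# Illustrative sketch only: a stand-in for one linear layer's weights.
torch.manual_seed(0)
W = torch.randn(256, 256)

# --- Post-training symmetric INT8 quantization (per-tensor) ---
scale = W.abs().max() / 127.0                      # map [-max|W|, max|W|] onto [-127, 127]
W_q = torch.clamp((W / scale).round(), -127, 127).to(torch.int8)
W_dq = W_q.float() * scale                         # dequantize to measure the error
print("quantization MSE:", torch.mean((W - W_dq) ** 2).item())

# --- Magnitude pruning at ~50% sparsity ---
threshold = W.abs().flatten().kthvalue(W.numel() // 2).values
mask = W.abs() > threshold                         # keep the larger-magnitude half
W_pruned = W * mask
print("sparsity:", 1.0 - mask.float().mean().item())

# --- Low-rank decomposition via truncated SVD (rank 32, an assumed choice) ---
U, S, Vh = torch.linalg.svd(W, full_matrices=False)
r = 32
W_lr = U[:, :r] @ torch.diag(S[:r]) @ Vh[:r, :]    # best rank-r approximation of W
print("low-rank MSE:", torch.mean((W - W_lr) ** 2).item())
```

    In each case the compressed representation trades a small reconstruction error for reduced storage or compute: INT8 weights take a quarter of the memory of FP32, the pruned matrix stores roughly half the values, and the rank-32 factors store 2 x 256 x 32 entries instead of 256 x 256.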

    Keywords: Model compression, deep neural networks, Large Language Model, pruning, quantization, Low-rank decomposition, Knowledge distillation

    Received: 29 Oct 2024; Accepted: 25 Feb 2025.

    Copyright: © 2025 Liu, Zhu, Liu, Liu, Han, Tian, Li and Yi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence:
    Defu Liu, Intelligent Game and Decision Lab (IGDL), Beijing, China
    Ruihao Li, Intelligent Game and Decision Lab (IGDL), Beijing, China
    Wei Yi, Intelligent Game and Decision Lab (IGDL), Beijing, China

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
