1. This paper presents an overview of popular compression methods and reviews recent work on compressing and accelerating deep neural networks.
2. The compression techniques discussed include pruning methods, quantization methods, and low-rank factorization methods.
3. Pruning approaches can either be applied to pre-trained models or incorporated into training from scratch, and are further categorized into two classes according to pruning granularity: weight-level and unit-level pruning (a minimal sketch of the two granularities follows this list).
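As an illustrative sketch (not taken from the article), the two granularities might look as follows on a toy weight matrix; the magnitude-based scoring, the 50% sparsity target, and the NumPy setting are assumptions made here for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 6))  # toy weight matrix: 8 output units x 6 inputs

# Weight-level pruning: zero out individual weights with small magnitude.
# The 50% sparsity target is an arbitrary illustrative choice.
threshold = np.quantile(np.abs(W), 0.5)
W_weight_pruned = np.where(np.abs(W) < threshold, 0.0, W)

# Unit-level pruning: score whole units (rows) by their L2 norm and
# drop the weakest half, removing entire neurons rather than single weights.
unit_scores = np.linalg.norm(W, axis=1)
keep = np.sort(np.argsort(unit_scores)[W.shape[0] // 2:])
W_unit_pruned = W[keep, :]

print(W_weight_pruned.shape)  # (8, 6): same shape, zeros scattered inside
print(W_unit_pruned.shape)    # (4, 6): the layer itself becomes smaller
```

Either variant can be applied to an already trained model or interleaved with training from scratch; the sketch only shows the selection step, not any subsequent fine-tuning.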
The article is generally reliable and trustworthy in its presentation of the various compression techniques for deep neural networks. It provides a comprehensive overview of the different approaches, including pruning methods, quantization methods, and low-rank factorization methods, and reviews each method in detail, highlighting its characteristics, advantages, and shortcomings.
The article does not appear to exhibit bias or one-sided reporting: it weighs the strengths and weaknesses of each compression technique evenhandedly. Its claims are supported and no major points of consideration are missing, as each method is covered with clear explanations and examples. Nor does it contain promotional content or partiality; the different approaches are discussed objectively, without favoring one over another.
Finally, the article does note the possible risks associated with each approach, such as high computational cost or the poor cache locality of unstructured pruning strategies. Overall, therefore, the article is a reliable and trustworthy presentation of compression techniques for deep neural networks.
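To make the cache-locality risk concrete, here is a hypothetical comparison (again not from the article): unstructured, weight-level pruning leaves the tensor in its dense layout with scattered zeros, so a standard dense kernel performs the same amount of work, whereas structured, unit-level pruning yields a physically smaller matrix. The 90% sparsity level and the FLOP accounting are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(1024, 1024))

# Unstructured pruning: ~90% of individual weights become zero, but the matrix
# keeps its dense 1024x1024 layout; a dense mat-vec still touches every entry,
# and the scattered nonzeros make memory access irregular for sparse kernels.
mask = np.abs(W) >= np.quantile(np.abs(W), 0.9)
W_unstructured = W * mask
dense_flops = 2 * W_unstructured.shape[0] * W_unstructured.shape[1]

# Structured pruning: drop 90% of whole rows (output units); the surviving
# matrix is physically smaller, so dense hardware benefits without sparse support.
row_norms = np.linalg.norm(W, axis=1)
keep = np.sort(np.argsort(row_norms)[-W.shape[0] // 10:])
W_structured = W[keep, :]
struct_flops = 2 * W_structured.shape[0] * W_structured.shape[1]

print(f"unstructured: {np.count_nonzero(W_unstructured)} nonzeros, "
      f"{dense_flops} dense FLOPs (unchanged without sparse kernels)")
print(f"structured:   {W_structured.shape[0]} rows kept, {struct_flops} dense FLOPs")
```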