1. The article presents a method that formulates algorithm discovery as program search and applies it to discover optimization algorithms for deep neural network training.
2. The discovered algorithm, Lion, is more memory-efficient than Adam because it only keeps track of the momentum, and because its update is computed through the sign operation, every parameter's update has the same magnitude.
3. Lion outperforms other widely used optimizers on a variety of tasks, such as image classification, vision-language contrastive learning, diffusion models, autoregressive and masked language modeling, and fine-tuning.
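The sign-based update in point 2 can be sketched in a few lines. This is a minimal illustration in plain Python rather than the authors' implementation: the function name `lion_step`, the list-of-floats representation, and the hyperparameter values are assumptions made for the example, and the update rule follows the form usually given for Lion (interpolate momentum and gradient, take the sign, then update the momentum with a second coefficient).

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def lion_step(params, grads, momentum, lr=1e-4, beta1=0.9, beta2=0.99,
              weight_decay=0.0):
    """One Lion-style update over flat lists of floats (illustrative only).

    Only one state value (the momentum) is kept per parameter, whereas
    Adam keeps two (first and second moment estimates).
    """
    new_params, new_momentum = [], []
    for p, g, m in zip(params, grads, momentum):
        # Interpolate between the stored momentum and the current gradient.
        c = beta1 * m + (1 - beta1) * g
        # The sign makes the step size identical for every parameter;
        # decoupled weight decay is added on top.
        update = sign(c) + weight_decay * p
        new_params.append(p - lr * update)
        # The momentum is tracked with a separate coefficient, beta2.
        new_momentum.append(beta2 * m + (1 - beta2) * g)
    return new_params, new_momentum


# Hypothetical values: gradients of very different scale.
params = [1.0, -2.0, 0.5]
grads = [0.3, -100.0, 0.0]
momentum = [0.0, 0.0, 0.0]
params, momentum = lion_step(params, grads, momentum, lr=0.1)
print(params, momentum)
```

With weight decay set to zero, the first two parameters each move by exactly `lr` despite gradients that differ by more than two orders of magnitude, and the parameter with a zero gradient and zero momentum does not move.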
The article presents a method to formulate algorithm discovery as program search and applies it to discover an optimization algorithm for deep neural network training. It claims that the discovered algorithm, Lion, outperforms other widely used optimizers on a variety of tasks, and supports this claim by comparing Lion with Adam and Adafactor on image classification, vision-language contrastive learning, diffusion models, autoregressive and masked language modeling, and fine-tuning tasks.
The trustworthiness and reliability of the article can be assessed by checking for potential biases and their sources, one-sided reporting, unsupported claims, missing evidence, missing points of consideration, and unexplored counterarguments. On these criteria the article appears reliable: it supports its claims with comparison results between Lion and other widely used optimizers on various tasks. It also offers insight into how Lion works (e.g., the sign operation gives every parameter's update the same magnitude), which helps explain why it performs better than other optimizers in certain scenarios. There are no obvious signs of bias or one-sided reporting, which further adds to its credibility.
However, some points could have been explored further, such as the possible risks of using Lion and the counterarguments for preferring other optimizers in certain scenarios. The article could also have analyzed in more detail why Lion performs better than other optimizers in certain scenarios (e.g., which factors contribute to its performance gain as the batch size increases). These points would have added depth, but overall the article appears reliable and trustworthy: it provides evidence for its claims and does not appear biased or one-sided.