Towards scalable and automatic neural network training
In this talk, I will discuss how to design scalable neural network optimizers, moving from human-driven design to AI-driven automatic design. The first part of the talk will center on my journey in developing scalable optimizers, including Cluster-GCN, LAMB, and various second-order optimizers. The second part will introduce our initial efforts to automate optimizer design, including an efficient optimizer search framework and automatically discovered algorithms that surpass Adam's performance on several large-scale training tasks.