Tay, Yi, Mostafa Dehghani, Samira Abnar, Hyung Won Chung,
William Fedus, Jinfeng Rao, Sharan Narang, Vinh Q. Tran, Dani
Yogatama, and Donald Metzler. “Scaling Laws vs Model
Architectures: How Does Inductive Bias Influence Scaling?” arXiv,
July 21, 2022. http://arxiv.org/abs/2207.10551.
Sorscher, Ben, Robert Geirhos, Shashank Shekhar, Surya
Ganguli, and Ari S. Morcos. “Beyond Neural Scaling Laws: Beating
Power Law Scaling via Data Pruning.” arXiv, June 29, 2022.
http://arxiv.org/abs/2206.14486.
Thompson, Neil C., Shuning Ge, and Gabriel F. Manso. “The
Importance of (Exponentially More) Computing Power.” arXiv, June
28, 2022. http://arxiv.org/abs/2206.14007.