The Cerebras-GPT family of models was developed by the AI accelerator company Cerebras and trained following the Chinchilla scaling laws, as a demonstration of its Wafer-Scale Cluster technology.
Initial release: 2023-03-28
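
To make the Chinchilla reference concrete, here is a minimal sketch of what a compute-optimal token budget could look like for the model sizes listed in the comparison table. It assumes the commonly cited rule of thumb of roughly 20 training tokens per parameter and the standard approximation of training compute as 6 × parameters × tokens; neither figure is stated on this page, and the function names are hypothetical, chosen only for illustration.

```python
# Assumption: ~20 training tokens per parameter, a common reading of the
# Chinchilla scaling results. This ratio is not taken from this page.
TOKENS_PER_PARAM = 20

# Model sizes as listed in the comparison table (parameter counts).
SIZES = {"1.3B": 1.3e9, "2.7B": 2.7e9, "6.7B": 6.7e9, "13B": 13e9}


def compute_optimal_tokens(n_params, tokens_per_param=TOKENS_PER_PARAM):
    """Return the approximate compute-optimal number of training tokens."""
    return n_params * tokens_per_param


def approx_training_flops(n_params, n_tokens):
    """Estimate training compute with the usual C ~= 6 * N * D approximation."""
    return 6 * n_params * n_tokens


def summarize(sizes=SIZES):
    """Map each model name to (token budget, estimated training FLOPs)."""
    result = {}
    for name, n_params in sizes.items():
        tokens = compute_optimal_tokens(n_params)
        result[name] = (tokens, approx_training_flops(n_params, tokens))
    return result


if __name__ == "__main__":
    for name, (tokens, flops) in summarize().items():
        print(f"{name}: ~{tokens / 1e9:.0f}B tokens, ~{flops:.2e} FLOPs")
```

These numbers are estimates under the stated assumptions, not reported training figures for any Cerebras-GPT model.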
Similar to FLAN-T5, FLAN-UL2 is a model based on Google's popular T5 architecture with an upgraded pre-training procedure dubbed UL2. On most NLU benchmarks, FLAN-UL2 outperforms FLAN-T5 by a significant margin.
Initial release: 2023-03-03
|Products & Features|Cerebras-GPT|FLAN-UL2|
|---|---|---|
|License|Apache 2.0|Apache 2.0|
|Model Sizes|1.3B, 2.7B, 6.7B, 13B|20B|