FLAN-UL2 vs. MPT: Which LLM is Better?

FLAN-UL2 vs. MPT

LLM Comparison

FLAN-UL2

Overview

Similar to FLAN-T5, FLAN-UL2 is a model based on Google's popular T5 architecture with an upgraded pre-training procedure dubbed UL2. On most NLU benchmarks, FLAN-UL2 outperforms FLAN-T5 by a significant margin.

Initial release: 2023-03-03

Reference

https://huggingface.co/google/flan-ul2

MPT

Overview

MPT-7B and MPT-30B are a set of models that are part of MosaicML's Foundation Series. Trained on 1T tokens, the developers state that MPT-7B matches the performance of LLaMA while also being open source, while MPT-30B outperforms the original GPT-3. In addition to the base model, the developers also offer MPT-Instruct, MPT-Chat, and MPT-7B-StoryWriter-65k+, the last of which is trained on a context length of 65K tokens.

Initial release: 2023-05-05

Reference

https://www.mosaicml.com/blog/mpt-30b

	FLAN-UL2	MPT
Products & Features
Instruct Models
Coding Capability
Customization
Finetuning
Open Source
License	Apache 2.0	Apache 2.0
Model Sizes	20B	7B, 30B

FLAN-UL2 vs. MPT

LLM Comparison

FLAN-UL2

Overview

Reference

Further Reading

MPT

Overview

Reference

Further Reading

FLAN-UL2

MPT

Other FLAN-UL2 Comparisons

Other MPT Comparisons

Undecided?