Not All Layers of LLMs Are Necessary During Inference

Sat, 16 Aug 2025 00:00:00 +0000

Few-Shot Learner Generalizes Across AI-Generated Image Detection

Tue, 15 Jul 2025 00:00:00 +0000

52B to 1T: Lessons Learned via Tele-FLM Series

Wed, 03 Jul 2024 00:00:00 +0000

Masked Structural Growth for 2x Faster Language Model Pre-training

Tue, 07 May 2024 00:00:00 +0000

FLM Family

Mon, 01 Jan 2024 00:00:00 +0000

FLM is a large language model jointly developed by the Cognitive Team (Cofe-AI) of BAAI together with Tsinghua University, ICT, Nanyang Technological University, and the University of Electronic Science and Technology. The project aims to develop a cost-effective, fully open-source, and high-performing large model. The FLM series is currently in its second generation.

1. FLM-2

FLM-2 is a more significant attempt. Doing

2. FLM-101B

FLM-101B inherits the structure of FreeLM and employs the Masked Structural Growth (MSG) strategy to reduce costs by more than 70%. It also uses loss prediction to determine the optimal hyperparameters. FLM-101B is a significant milestone: it not only validates the feasibility of the individual sub-technologies but also implements them successfully at the system level. We see the relationship between FLM-101B and MSG as analogous to that between GPT-3 and the Transformer architecture: it is not merely a matter of scaling up, but the first successful implementation at the system level.

For details, please refer to FLM-101B and How to train it with a $100,000 Budget.
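To make the growth idea more concrete, here is a minimal sketch of function-preserving growth with a mask, written in plain Python. It is an illustration only, not the FLM-101B or MSG implementation: the tiny two-layer network, the helper names (`forward`, `grow_hidden`), and the choice of masking hidden units are all assumptions made for the example. The point it shows is that newly added units whose mask is 0 cannot change the network's output, whatever their initial weights are.

```python
import random

def relu(values):
    return [max(0.0, v) for v in values]

def matvec(weights, x):
    # weights is a list of rows; returns the matrix-vector product.
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def forward(w1, w2, mask, x):
    # Each hidden unit is scaled by its mask value before the output layer.
    hidden = relu(matvec(w1, x))
    masked = [m * h for m, h in zip(mask, hidden)]
    return matvec(w2, masked)

def grow_hidden(w1, w2, mask, extra, rng):
    # Hypothetical helper: add `extra` hidden units with arbitrary new
    # weights. Their mask starts at 0, so they contribute nothing yet.
    n_in = len(w1[0])
    new_w1 = w1 + [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(extra)]
    new_w2 = [row + [rng.uniform(-1, 1) for _ in range(extra)] for row in w2]
    new_mask = mask + [0.0] * extra
    return new_w1, new_w2, new_mask

rng = random.Random(0)
n_in, n_hidden, n_out = 3, 4, 2
w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
mask = [1.0] * n_hidden
x = [0.5, -1.0, 2.0]

before = forward(w1, w2, mask, x)
g1, g2, gmask = grow_hidden(w1, w2, mask, extra=4, rng=rng)
after = forward(g1, g2, gmask, x)

# The grown network computes the same function as the small one.
print(max(abs(a - b) for a, b in zip(before, after)) < 1e-12)
```

In a real training run the mask values for the new units would presumably be raised from 0 toward 1 over subsequent steps, so the extra capacity is phased in without a sudden jump in loss; that schedule is not shown here.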

3. FreeLM

FreeLM is at generation 0, with the objective of validating the feasibility of integrating relevant knowledge learning stages into language model training.

For details, please refer to FreeLM.

4. Concepts for Large Model Development

Our team’s philosophy about the development of large models is:

Both system capabilities and research capabilities are essential.
Without system capabilities, it is not possible to develop large models, as it would be impossible to control costs.
Without research capabilities, one can only follow in the footsteps of others; if the leaders in large models choose to close their source code, further breakthroughs become impossible.

We welcome researchers with strong system and research capabilities to contact us!

Easter egg: This page was generated by an early version of FLM-2, without further editing.

Large Model | Yequan's Academic