*This model was published in HF papers on 2025-06-06 and contributed to Hugging Face Transformers on 2025-06-25.*

# dots.llm1

[dots.llm1](https://huggingface.co/papers/2506.05767) is a 142B-parameter mixture-of-experts model that activates 14B parameters per token. Each token is routed to the top 6 of 128 routed experts, and 2 shared experts are always applied. It delivers performance on par with Qwen2.5-72B while significantly reducing training and inference costs. Notably, no synthetic data was used during pretraining.

The examples below demonstrate how to generate text with [`Pipeline`] or the [`AutoModelForCausalLM`] class.

```python
from transformers import pipeline

# Build a text-generation pipeline around the base checkpoint.
pipe = pipeline(
    task="text-generation",
    model="rednote-hilab/dots.llm1.base",
)
pipe("The advantage of mixture-of-experts models is")
```

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("rednote-hilab/dots.llm1.base")
model = AutoModelForCausalLM.from_pretrained(
    "rednote-hilab/dots.llm1.base",
    device_map="auto",  # place the weights on the available devices automatically
)

# The tokenizer returns input_ids and attention_mask; move both to the model's device.
inputs = tokenizer("The advantage of mixture-of-experts models is", return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

## Dots1Config

[[autodoc]] Dots1Config

## Dots1Model

[[autodoc]] Dots1Model
    - forward

## Dots1ForCausalLM

[[autodoc]] Dots1ForCausalLM
    - forward
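
## Routing sketch

The top-6-of-128 routing with 2 shared experts mentioned at the top of this page can be illustrated with a small, dependency-free sketch. It is not the `Dots1Model` implementation: the helper `route_token`, the random router logits, and the choice to normalize weights with a softmax over the selected experts are all assumptions made for illustration. Only the expert counts come from the description above.

```python
import math
import random

NUM_ROUTED = 128  # routed experts (from the model description)
TOP_K = 6         # routed experts selected per token (from the model description)
NUM_SHARED = 2    # shared experts that run for every token (from the model description)


def route_token(router_logits, top_k=TOP_K):
    """Hypothetical helper: pick the top_k experts and give each a weight.

    Assumption: weights are a softmax over the selected logits only.
    The real model may score and normalize differently.
    """
    top = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )[:top_k]
    peak = max(router_logits[i] for i in top)  # subtract the max for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Stand-in router scores for a single token (random, for illustration only).
random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_ROUTED)]

selected = route_token(logits)
experts_run = len(selected) + NUM_SHARED  # routed experts plus always-on shared experts

print(selected)
print(f"experts evaluated for this token: {experts_run} of {NUM_ROUTED + NUM_SHARED}")
```

Because only a small subset of experts is evaluated for each token, the per-token compute follows the 14B active parameters rather than the full 142B.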