*This model was published in HF papers on 2025-05-14 and contributed to Hugging Face Transformers on 2025-03-31.*

# Qwen3MoE

[Qwen3MoE](https://huggingface.co/papers/2505.09388) is the mixture-of-experts variant in the Qwen3 family, with 30.5B total parameters and 3.3B active parameters per token. It uses 128 routed experts with 8 activated per token across 48 layers, and supports up to 131K context with YaRN. See also the dense variant [Qwen3](qwen3).

The example below demonstrates how to generate text with [`Pipeline`] or the [`AutoModelForCausalLM`] class.

```python
from transformers import pipeline

pipe = pipeline(
    task="text-generation",
    model="Qwen/Qwen3-30B-A3B",
)
pipe("The key to effective reasoning is")
```

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-30B-A3B")
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-30B-A3B",
    device_map="auto",
)

input_ids = tokenizer("The key to effective reasoning is", return_tensors="pt").to(model.device)
output = model.generate(**input_ids, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

## Qwen3MoeConfig

[[autodoc]] Qwen3MoeConfig

## Qwen3_5MoeVisionConfig

[[autodoc]] Qwen3_5MoeVisionConfig

## Qwen3MoeModel

[[autodoc]] Qwen3MoeModel
    - forward

## Qwen3MoeForCausalLM

[[autodoc]] Qwen3MoeForCausalLM
    - forward

## Qwen3MoeForSequenceClassification

[[autodoc]] Qwen3MoeForSequenceClassification
    - forward

## Qwen3MoeForTokenClassification

[[autodoc]] Qwen3MoeForTokenClassification
    - forward

## Qwen3MoeForQuestionAnswering

[[autodoc]] Qwen3MoeForQuestionAnswering
    - forward
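
## Expert routing sketch

The overview on this page describes 128 routed experts with 8 activated per token. The snippet below is a minimal, standalone sketch of what top-k routing for a single token could look like. It is not the Transformers implementation: `route_token` is a hypothetical helper, the router logits are random placeholder values, and renormalizing the selected weights so they sum to 1 is an assumption made for illustration.

```python
import math
import random

NUM_EXPERTS = 128  # routed experts per MoE layer, as stated in the overview
TOP_K = 8          # experts activated per token, as stated in the overview


def route_token(router_logits, top_k=TOP_K):
    """Hypothetical helper: pick top_k experts and their mixing weights for one token."""
    # Softmax over all experts (subtract the max for numerical stability).
    max_logit = max(router_logits)
    exps = [math.exp(x - max_logit) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the top_k most probable experts, highest probability first.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Assumption: renormalize the selected probabilities so they sum to 1.
    selected_sum = sum(probs[i] for i in top)
    weights = [probs[i] / selected_sum for i in top]
    return top, weights


# Placeholder router logits; a real model computes these from the hidden state.
random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

experts, weights = route_token(logits)
print("selected experts:", experts)
print("mixing weights:", [round(w, 3) for w in weights])
```

Only the selected experts would run for that token, which is why the number of active parameters per token is much smaller than the total parameter count.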