*This model was published in HF papers on 2025-05-14 and contributed to Hugging Face Transformers on 2025-03-31.*
# Qwen3MoE
[Qwen3MoE](https://huggingface.co/papers/2505.09388) is the mixture-of-experts (MoE) architecture in the Qwen3 family. The Qwen3-30B-A3B checkpoint used in the examples on this page has 30.5B total parameters, of which about 3.3B are active per token. It has 48 layers and routes each token to 8 of its 128 experts, and it supports a context of up to 131K tokens with YaRN. See also the dense variant [Qwen3](qwen3).
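To make the "8 of 128 experts" routing concrete, the sketch below picks the top 8 experts for a single token from a list of router scores. It is a minimal illustration in plain Python, not the Transformers implementation: the function name `route_token`, the random stand-in logits, and the choice to renormalize the kept probabilities are all assumptions made for this example.

```python
import math
import random

NUM_EXPERTS = 128  # routed experts (from the description above)
TOP_K = 8          # experts activated per token (from the description above)


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_indices, weights) for one token.

    Hypothetical helper: softmax over all experts, keep the top_k,
    then renormalize the kept probabilities so they sum to 1.
    """
    # Numerically stable softmax over every expert.
    max_logit = max(router_logits)
    exps = [math.exp(x - max_logit) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Indices of the top_k most probable experts, highest first.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Renormalize the selected probabilities (an assumption of this sketch).
    kept = sum(probs[i] for i in top)
    weights = [probs[i] / kept for i in top]
    return top, weights


# Stand-in router output for one token; a real router is a learned linear layer.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

experts, weights = route_token(logits)
print(experts)
print(weights)
```

Only the selected experts would run for that token, which is why the active parameter count is much smaller than the total.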
The example below demonstrates how to generate text with [`Pipeline`] or the [`AutoModelForCausalLM`] class.
```python
from transformers import pipeline
pipe = pipeline(
    task="text-generation",
    model="Qwen/Qwen3-30B-A3B",
)
pipe("The key to effective reasoning is")
```
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-30B-A3B")
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-30B-A3B",
    device_map="auto",
)
# The tokenizer returns input_ids and attention_mask, so name the result `inputs`.
inputs = tokenizer("The key to effective reasoning is", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
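The 131K context mentioned in the introduction relies on YaRN scaling of the rotary position embeddings. The sketch below only works out the scaling factor and assembles a settings dictionary. The native context length of 32,768 tokens and the key names (`rope_type`, `factor`, `original_max_position_embeddings`) are assumptions based on the Qwen3 model cards, and the attribute that receives this dictionary can differ between Transformers versions, so check both against the checkpoint's `config.json` before relying on them.

```python
NATIVE_CONTEXT = 32_768   # assumed native context length, not stated on this page
TARGET_CONTEXT = 131_072  # the "131K" extended context from the introduction

# The YaRN scaling factor is the ratio of the target length to the native length.
factor = TARGET_CONTEXT / NATIVE_CONTEXT

# Assumed key names, following the Qwen3 model cards; verify before use.
rope_scaling = {
    "rope_type": "yarn",
    "factor": factor,
    "original_max_position_embeddings": NATIVE_CONTEXT,
}
print(rope_scaling)
```

Because the same factor applies to every input, enabling it only for workloads that actually need long contexts is a reasonable default.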
## Qwen3MoeConfig
[[autodoc]] Qwen3MoeConfig
## Qwen3MoeModel
[[autodoc]] Qwen3MoeModel
- forward
## Qwen3MoeForCausalLM
[[autodoc]] Qwen3MoeForCausalLM
- forward
## Qwen3MoeForSequenceClassification
[[autodoc]] Qwen3MoeForSequenceClassification
- forward
## Qwen3MoeForTokenClassification
[[autodoc]] Qwen3MoeForTokenClassification
- forward
## Qwen3MoeForQuestionAnswering
[[autodoc]] Qwen3MoeForQuestionAnswering
- forward