This model was published in HF papers on 2025-05-14 and contributed to Hugging Face Transformers on 2025-03-31.
Qwen3MoE
Qwen3MoE is the mixture-of-experts (MoE) variant in the Qwen3 family. The Qwen3-30B-A3B checkpoint used in the examples below has 30.5B total parameters, of which 3.3B are active per token. It uses 128 routed experts, 8 of which are activated per token, across 48 layers, and supports a context length of up to 131K tokens with YaRN. See also the dense variant, Qwen3.
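To make the "8 of 128 experts" figure concrete, the following is a minimal sketch of top-k routing for a single token. It assumes a common MoE pattern (softmax over router logits, keep the top-k experts, renormalize the kept weights) and is not taken from the Qwen3MoE implementation, which may differ in detail. The function name `route_token` and the random logits are hypothetical stand-ins.

```python
import math
import random

NUM_EXPERTS = 128  # routed experts (from the overview above)
TOP_K = 8          # experts activated per token (from the overview above)


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for one token (hypothetical helper)."""
    # Softmax over all expert logits, shifted by the max for numerical stability.
    peak = max(router_logits)
    exps = [math.exp(x - peak) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the top_k most probable experts, highest probability first.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Renormalize so the kept weights sum to 1 (an assumption of this sketch).
    kept = sum(probs[i] for i in chosen)
    return [(i, probs[i] / kept) for i in chosen]


random.seed(0)
# Stand-in for the router's output on one token; a real router is a learned layer.
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
routing = route_token(logits)
print(len(routing), "of", NUM_EXPERTS, "experts selected")
print("weight sum:", round(sum(w for _, w in routing), 6))
```

Only the selected experts run for that token, which is why the active parameter count is much smaller than the total.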
The example below demonstrates how to generate text with [Pipeline] or the [AutoModelForCausalLM] class.
# Option 1: high-level Pipeline API
from transformers import pipeline

pipe = pipeline(
    task="text-generation",
    model="Qwen/Qwen3-30B-A3B",
)
pipe("The key to effective reasoning is")

# Option 2: AutoModelForCausalLM with an explicit tokenizer
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-30B-A3B")
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-30B-A3B",
    device_map="auto",  # place the weights across the available devices
)
# Tokenize the prompt and move the tensors to the model's device
input_ids = tokenizer("The key to effective reasoning is", return_tensors="pt").to(model.device)
output = model.generate(**input_ids, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
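Before loading the checkpoint, it can help to estimate how much memory the weights need. Although only 3.3B parameters are active per token, different tokens route to different experts, so all 30.5B parameters generally have to be resident. The sketch below is back-of-the-envelope arithmetic only: it ignores activations, the KV cache, and framework overhead, and the dtype table and helper name `weight_memory_gb` are illustrative assumptions rather than anything from the library.

```python
TOTAL_PARAMS = 30.5e9   # total parameters (from the overview)
ACTIVE_PARAMS = 3.3e9   # parameters active per token (from the overview)

# Bytes per parameter for a few common storage formats (illustrative).
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gb(num_params, dtype):
    """Approximate weight storage in GB (1 GB = 1e9 bytes), ignoring overhead."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: ~{weight_memory_gb(TOTAL_PARAMS, dtype):.1f} GB of weights")

# Share of the parameters that participate in any single token's forward pass.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"active per token: {active_fraction:.1%} of all parameters")
```

In bfloat16 this comes to roughly 61 GB of weights, which is why `device_map="auto"` is useful for spreading the model across several devices.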
Qwen3MoeConfig
autodoc Qwen3MoeConfig
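As a rough illustration of how the YaRN context extension mentioned in the overview relates to configuration values, the sketch below uses the RoPE scaling settings given on the Qwen3-30B-A3B model card at the time of writing (a native context of 32,768 tokens and a scaling factor of 4.0). Treat these values and key names as assumptions: the exact configuration key may differ between Transformers versions, and the snippet only does the arithmetic rather than loading a model.

```python
# Assumed values from the Qwen3-30B-A3B model card; verify against the checkpoint you use.
NATIVE_CONTEXT = 32_768

# Hypothetical RoPE scaling settings expressed as a plain dictionary.
rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": NATIVE_CONTEXT,
}

# YaRN stretches the native window by the scaling factor.
extended_context = int(
    rope_scaling["factor"] * rope_scaling["original_max_position_embeddings"]
)
print(extended_context)  # 131072
```

The result matches the 131K context figure quoted for the model.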
Qwen3_5MoeVisionConfig
autodoc Qwen3_5MoeVisionConfig
Qwen3MoeModel
autodoc Qwen3MoeModel - forward
Qwen3MoeForCausalLM
autodoc Qwen3MoeForCausalLM - forward
Qwen3MoeForSequenceClassification
autodoc Qwen3MoeForSequenceClassification - forward
Qwen3MoeForTokenClassification
autodoc Qwen3MoeForTokenClassification - forward
Qwen3MoeForQuestionAnswering
autodoc Qwen3MoeForQuestionAnswering - forward