*This model was published in HF papers on 2025-05-14 and contributed to Hugging Face Transformers on 2025-03-31.*

# Qwen3

[Qwen3](https://huggingface.co/papers/2505.09388) is the dense model architecture in the Qwen3 family, available in sizes from 0.6B to 32B parameters. It supports both thinking mode (multi-step reasoning) and non-thinking mode, with seamless switching between the two. Qwen3 was trained on approximately 36T tokens covering 119 languages. See also the MoE variant [Qwen3MoE](qwen3_moe).

The example below demonstrates how to generate text with [`Pipeline`] or the [`AutoModelForCausalLM`] class.

```python
from transformers import pipeline

pipe = pipeline(
    task="text-generation",
    model="Qwen/Qwen3-0.6B",
)
pipe("The key to effective reasoning is")
```

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-0.6B",
    device_map="auto",
)

input_ids = tokenizer("The key to effective reasoning is", return_tensors="pt").to(model.device)
output = model.generate(**input_ids, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

## Qwen3Config

[[autodoc]] Qwen3Config

## Qwen3Model

[[autodoc]] Qwen3Model
    - forward

## Qwen3ForCausalLM

[[autodoc]] Qwen3ForCausalLM
    - forward

## Qwen3ForSequenceClassification

[[autodoc]] Qwen3ForSequenceClassification
    - forward

## Qwen3ForTokenClassification

[[autodoc]] Qwen3ForTokenClassification
    - forward

## Qwen3ForQuestionAnswering

[[autodoc]] Qwen3ForQuestionAnswering
    - forward
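
## Notes

The overview mentions a thinking mode and a non-thinking mode. When post-processing generated text, it can help to separate the reasoning from the final answer. The sketch below is a minimal illustration in plain Python, not part of the Transformers API. It assumes that thinking-mode output wraps its reasoning in `<think>` and `</think>` tags and that those tags are still present in the decoded string. The helper name `split_thinking` is hypothetical.

```python
def split_thinking(text: str, close_tag: str = "</think>") -> tuple[str, str]:
    """Split decoded model output into (reasoning, answer).

    Assumption: reasoning, if present, is wrapped in <think>...</think>.
    """
    head, sep, tail = text.partition(close_tag)
    if not sep:
        # No closing tag found: treat the whole text as the answer,
        # which is the expected shape of non-thinking output.
        return "", text.strip()
    # Drop the opening tag (if any) and surrounding whitespace.
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


# Hand-written sample string, not real model output.
reasoning, answer = split_thinking("<think>\n2 + 2 is 4.\n</think>\n\nThe answer is 4.")
print(reasoning)
print(answer)
```

If the tags turn out to be stripped during decoding in your setup, the helper returns an empty reasoning string and the full text as the answer.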