Qwen/Qwen3-VL-235B-A22B-Thinking
Qwen/Qwen3-VL-235B-A22B-Instruct
please GGUF quantize. Thanks a lot.
No idea if that model is supported, but let's download the 471 GB Qwen3-VL-235B-A22B-Thinking to find out. This might be the first time we need to handle a vision model manually, so I don't know whether I can just softlink the folder containing the model for noquant and mmproj extraction. If not, we might need to wait for more storage on spool. But all of this assumes the, in my opinion unlikely, case that the model is supported; we will find out soon. Storage even outside spool is getting really tight on nico1 due to the massive number of large models we are currently processing.
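If softlinking does work, the idea is no more than this minimal sketch (the destination path under spool is an assumption, not the actual nico1 layout):

import os

# point the location the quant pipeline expects at the already-downloaded
# 471 GB copy instead of duplicating it on spool
src = "/apool/Qwen3-VL-235B-A22B-Thinking"   # where the download lives
dst = "/spool/Qwen3-VL-235B-A22B-Thinking"   # hypothetical working directory on spool
if not os.path.islink(dst):
    os.symlink(src, dst)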
Haha, wow, it didn't even make it far enough to hit the usual convert_hf_to_gguf.py errors. At least this one might be fixable.
root@AI:/apool/llama.cpp# venv/bin/python convert_hf_to_gguf.py /apool/Qwen3-VL-235B-A22B-Thinking --outfile /tmp/Qwen3-VL-235B-A22B-Thinking.gguf --outtype source
INFO:hf-to-gguf:Loading model: Qwen3-VL-235B-A22B-Thinking
WARNING:hf-to-gguf:Failed to load model config from /apool/Qwen3-VL-235B-A22B-Thinking: The checkpoint you are trying to load has model type `qwen3_vl_moe` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
You can update Transformers with the command `pip install --upgrade transformers`. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source with the command `pip install git+https://github.com/huggingface/transformers.git`
WARNING:hf-to-gguf:Trying to load config.json instead
INFO:hf-to-gguf:Model architecture: Qwen3VLMoeForConditionalGeneration
ERROR:hf-to-gguf:Model Qwen3VLMoeForConditionalGeneration is not supported
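The architecture name the converter complains about is read from the model's config.json (which it falls back to above), so you can check what it sees without involving transformers at all, using the same path as in the run:

import json, pathlib

# peek at the fields the converter relies on; the expected values below
# are simply what the log above reports for this checkpoint
cfg = json.loads(pathlib.Path("/apool/Qwen3-VL-235B-A22B-Thinking/config.json").read_text())
print(cfg.get("model_type"))      # qwen3_vl_moe
print(cfg.get("architectures"))   # ["Qwen3VLMoeForConditionalGeneration"]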
Either pip or HuggingFace transformers is a terrible product, because according to pip it has impossible-to-fulfill dependencies: it requires both huggingface-hub < 1.0 and exactly 1.0.0.rc1, which simply cannot both be satisfied.
transformers 4.57.0.dev0 requires huggingface-hub<1.0,>=0.34.0, but you have huggingface-hub 1.0.0rc1 which is incompatible.
transformers 4.57.0.dev0 depends on huggingface-hub==1.0.0.rc1
ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts
Well, even after upgrading transformers to the latest development version it is not much better, as llama.cpp now correctly states that the model is indeed not currently supported:
root@AI:/apool/llama.cpp# venv/bin/python convert_hf_to_gguf.py /apool/Qwen3-VL-235B-A22B-Thinking --outfile /tmp/Qwen3-VL-235B-A22B-Thinking.gguf --outtype source
INFO:hf-to-gguf:Loading model: Qwen3-VL-235B-A22B-Thinking
INFO:hf-to-gguf:Model architecture: Qwen3VLMoeForConditionalGeneration
ERROR:hf-to-gguf:Model Qwen3VLMoeForConditionalGeneration is not supported
Thank you so much, Nicoboss and the entire mradermacher team. You're doing a wonderful job.
arguably, 1.0.0rc1 is < 1.0. maybe transformers used a syntactically unsupported version number, or the version number ordering in pip is not as intuitive as one hoped.
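it is the specifier semantics rather than the plain ordering: PEP 440 does sort 1.0.0rc1 before 1.0, but an exclusive upper bound like <1.0 deliberately refuses pre-releases of that very version. a quick check with the packaging library illustrates both halves:

from packaging.specifiers import SpecifierSet
from packaging.version import Version

# plain PEP 440 ordering: the release candidate does sort before 1.0 ...
print(Version("1.0.0rc1") < Version("1.0"))   # True

# ... but the specifier "<1.0" still rejects it, because an exclusive upper
# bound must not admit pre-releases of the bound version itself
print("1.0.0rc1" in SpecifierSet("<1.0"))     # False

so the <1.0 requirement genuinely excludes 1.0.0rc1, and together with the exact ==1.0.0.rc1 pin that is what pip declares impossible.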
(yes, rate-limit hell is over for a while)