Resource to create MXFP4 GGUF models?

#2
by sandeshrajx - opened

What script was used to make this model? If I have an MXFP4 safetensors model, how can I turn it into an MXFP4 kernel-compatible GGUF version? Could you share your process?

I'm using the official tools from the llama.cpp repository.
Clone the repo, use the provided convert_hf_to_gguf.py script to create the original BF16 GGUF, then use the llama-quantize tool to quantize it to MXFP4.
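A rough sketch of that pipeline as shell commands, assuming a local Hugging Face model directory at `./my-model` and that llama-quantize has been built from the llama.cpp source (paths and model names here are placeholders, not from the original post):

```shell
# Clone llama.cpp and build its tools (llama-quantize lives in build/bin)
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release

# Step 1: convert the Hugging Face safetensors model to a BF16 GGUF
# (./my-model is a placeholder for your local model directory)
python convert_hf_to_gguf.py ./my-model --outtype bf16 --outfile my-model-bf16.gguf

# Step 2: quantize the BF16 GGUF to MXFP4
./build/bin/llama-quantize my-model-bf16.gguf my-model-mxfp4.gguf mxfp4
```

The intermediate BF16 file preserves full precision from the original weights, so the MXFP4 quantization is applied in a single, clean step at the end.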