Docker model run compatibility?

#17
by keyoti - opened

Hi, should the model work with docker model run? I get this with any prompt:

```
docker model run hf.co/pszemraj/flan-t5-large-grammar-synthesis:q6_k
> i dance last week
on on on on on on on on on on on on on on on on on on on on [... "on" repeated for the rest of the output]
> Send a message (/? for help)
```

Thanks

Hi! I'm not familiar with docker model run; I don't think it's part of the transformers/Hugging Face ecosystem. My thoughts are:

  • Based on the q6_k tag, it looks like it tries to load q6_k weights and quantizes this repo on the fly if it doesn't find them. The q6_k GGUF in this repo is quite old; you could try making a fresh one with https://hf.co/spaces/ggml-org/gguf-my-repo, or try one of the two sub-repos that contain quantized weights of this model.
  • This repo contains transformers weights, meant to be used with the transformers/accelerate Hugging Face ecosystem (a minimal usage sketch follows this list). Check the docs for alternative quantization methods that are directly compatible.
  • If none of the above works, I would go to the git repo/implementation that docker model run uses and ask about their support for T5 encoder-decoder models. Since T5 is not a traditional decoder-only "LLM", it is used less frequently, and it's possible there are errors in third-party implementations, so it's worth checking.
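To illustrate the second point: below is a minimal sketch of running this checkpoint directly with the transformers pipeline, following the usage shown in this repo's model card. The input sentence is the one from the report above; the output in the comment is illustrative, not a guaranteed result.

```python
from transformers import pipeline

# load the grammar-correction checkpoint from this repo
corrector = pipeline(
    "text2text-generation",
    "pszemraj/flan-t5-large-grammar-synthesis",
)

raw_text = "i dance last week"
results = corrector(raw_text)

# prints the corrected sentence, e.g. something like "I danced last week."
print(results[0]["generated_text"])
```

If memory is a concern, 8-bit loading through bitsandbytes (e.g. AutoModelForSeq2SeqLM.from_pretrained(..., quantization_config=BitsAndBytesConfig(load_in_8bit=True))) is one quantization route that is directly compatible with this ecosystem.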
