Docker model run compatibility?
#17
by keyoti
Hi, should the model work with `docker model run`? I get this with any prompt:
```
docker model run hf.co/pszemraj/flan-t5-large-grammar-synthesis:q6_k
> i dance last week
on on on on on on on on on on on on on on on on on on on on ...
> Send a message (/? for help)
```
Thanks
Hi! I am not familiar with `docker model run`; I don't think it's part of the transformers/Hugging Face ecosystem. My thoughts are:
- Based on the `q6_k` tag, it seems like it will try to load `q6_k` weights and quantize this repo itself if it doesn't find them. The `q6_k` in this repo is quite old; you could try making a fresh one with https://hf.co/spaces/ggml-org/gguf-my-repo, or try one of the two sub-repos that contain quantized weights of this model. I have usage instructions + an example on mine (albeit for different frameworks closer to the core of llama.cpp). See the GGUF sanity-check sketch after this list.
- This repo contains transformers weights, to be used with the transformers/accelerate HF ecosystem. Check the docs on alternative quantization methods that are directly compatible; there is a sketch after this list.
- If none of ^ work, I would go to the git repo/implementation that `docker model run` uses and ask about their implementation/support for t5 encoder-decoder models. As it's not a traditional decoder-only "LLM", it is used less frequently, and it is possible there are errors in third-party implementations, so it's worth checking.
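On the first point: a quick way to check whether the old `q6_k` file itself is the problem (vs. the runtime) is to dequantize it with transformers directly. A minimal sketch, assuming your transformers version supports GGUF loading for t5 (check the transformers GGUF docs) — the `.gguf` filename below is a guess, so check the repo's file list for the actual name:

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

repo = "pszemraj/flan-t5-large-grammar-synthesis"
# hypothetical filename - replace with the actual .gguf file in the repo
gguf = "flan-t5-large-grammar-synthesis-Q6_K.gguf"

# transformers dequantizes the GGUF back to torch weights on load
tokenizer = AutoTokenizer.from_pretrained(repo, gguf_file=gguf)
model = T5ForConditionalGeneration.from_pretrained(repo, gguf_file=gguf)

inputs = tokenizer("i dance last week", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

If this prints a sensible correction, the quant is probably fine and the looping is a runtime issue; if it also degenerates, make a fresh quant with gguf-my-repo.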
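And on the transformers route: the standard usage is a `text2text-generation` pipeline. As a sketch of one directly compatible quantization method, here is an 8-bit bitsandbytes load (this assumes the `bitsandbytes` package is installed and a CUDA GPU is available):

```python
from transformers import BitsAndBytesConfig, pipeline

# full-precision usage
corrector = pipeline(
    "text2text-generation",
    "pszemraj/flan-t5-large-grammar-synthesis",
)
print(corrector("i dance last week")[0]["generated_text"])

# sketch: 8-bit quantized load via bitsandbytes, one of the
# quantization methods compatible with the HF ecosystem
corrector_8bit = pipeline(
    "text2text-generation",
    "pszemraj/flan-t5-large-grammar-synthesis",
    model_kwargs={"quantization_config": BitsAndBytesConfig(load_in_8bit=True)},
)
print(corrector_8bit("i dance last week")[0]["generated_text"])
```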