Can you use the Hugging Face serverless Inference API in a chat frontend such as LibreChat without setting anything else up?

#437
by SquigglyFruit - opened

To set this up you need a chat completions address.
Reading: https://huggingface.co/docs/api-inference/quicktour

It is https://api-inference.huggingface.co/models/{insert model here}

e.g. https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B

(or is it https://api-inference.huggingface.co/models/Meta-Llama-3-8B?)

and use a header of

Authorization: Bearer {api key}
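For reference, here is a minimal sketch of how that request would be assembled in Python, using only the standard library. The token value and the `api-inference.huggingface.co/models/{model}` URL pattern are assumptions based on the quicktour; the request is built but not sent, since sending needs network access and a valid key:

```python
import urllib.request

API_KEY = "hf_xxx"  # placeholder; substitute a real Hugging Face API token
MODEL_ID = "meta-llama/Meta-Llama-3-8B"  # full repo ID, including the org prefix

# Build the Inference API URL from the full repo ID (org/model).
url = f"https://api-inference.huggingface.co/models/{MODEL_ID}"

req = urllib.request.Request(
    url,
    headers={"Authorization": f"Bearer {API_KEY}"},
    data=b'{"inputs": "Hello"}',
    method="POST",
)

# urllib.request.urlopen(req) would actually send it;
# here we only show how the URL and header fit together.
print(req.full_url)
print(req.get_header("Authorization"))
```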

e.g. using model: "meta-llama/Meta-Llama-3-8B"

I've tried this in TypingMind but got errors.

Is the API OpenAI-compatible?
Any help would be much appreciated.
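In case it helps anyone comparing: an OpenAI-style frontend expects a `/v1/chat/completions` route with a `messages` payload, rather than the raw `inputs` payload shown above. The sketch below assembles that payload shape; whether the Inference API exposes such a route for a given model is an assumption worth verifying against the current docs:

```python
import json

MODEL_ID = "meta-llama/Meta-Llama-3-8B"  # assumed model ID from the question above

# Assumed OpenAI-style route appended to the model URL; verify against the docs.
url = f"https://api-inference.huggingface.co/models/{MODEL_ID}/v1/chat/completions"

# OpenAI-compatible request body: a list of role/content messages.
payload = json.dumps({
    "model": MODEL_ID,
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 128,
})
print(url)
print(payload)
```

If the endpoint does speak this dialect, a frontend like LibreChat or TypingMind can usually be pointed at it by setting the base URL and API key, with the `Authorization: Bearer` header handled for you.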
