Instructions for using Maincode/Maincoder-1B with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- Transformers
How to use Maincode/Maincoder-1B with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Maincode/Maincoder-1B", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("Maincode/Maincoder-1B", trust_remote_code=True, dtype="auto")
```
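To decode text end to end rather than just load weights, something like the sketch below should work; the chat template, the `AutoModelForCausalLM` mapping, and the generation settings are assumptions about this repo rather than documented behavior:

```python
# Minimal end-to-end generation sketch (assumes the repo ships a chat
# template and resolves to a causal-LM class via trust_remote_code).
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Maincode/Maincoder-1B", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Maincode/Maincoder-1B", trust_remote_code=True, dtype="auto"
)

messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

- Notebooks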
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Maincode/Maincoder-1B with vLLM:
Install from pip and serve the model
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "Maincode/Maincoder-1B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Maincode/Maincoder-1B",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Use Docker
```bash
docker run --runtime nvidia --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HUGGING_FACE_HUB_TOKEN=<secret>" \
  -p 8000:8000 \
  --ipc=host \
  vllm/vllm-openai:latest \
  --model "Maincode/Maincoder-1B"
```
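Since the server exposes an OpenAI-compatible API, you can also call it from Python. A minimal sketch using the openai client; the placeholder API key is ignored by a default vLLM server:

```python
# Query the local vLLM server via its OpenAI-compatible endpoint
# (assumes the server started above is listening on localhost:8000).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Maincode/Maincoder-1B",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```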
- SGLang
How to use Maincode/Maincoder-1B with SGLang:
Install from pip and serve the model
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "Maincode/Maincoder-1B" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Maincode/Maincoder-1B",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Use Docker images
```bash
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "Maincode/Maincoder-1B" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Maincode/Maincoder-1B",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```
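The SGLang endpoint is OpenAI-compatible as well; a minimal Python sketch mirroring the curl request above (assumes the server is up on port 30000):

```python
# Mirror of the curl call above against the running SGLang server.
import requests

resp = requests.post(
    "http://localhost:30000/v1/chat/completions",
    json={
        "model": "Maincode/Maincoder-1B",
        "messages": [{"role": "user", "content": "What is the capital of France?"}],
    },
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

- Docker Model Runner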
How to use Maincode/Maincoder-1B with Docker Model Runner:
```bash
docker model run hf.co/Maincode/Maincoder-1B
```
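Note that Docker Model Runner pulls GGUF weights from the Hub, so this only works once a GGUF is published for the repo (see the thread below). If multiple quantizations exist, a specific one can reportedly be pinned by tag; the tag here is purely illustrative:

```bash
# Hypothetical: pin a quantization tag once GGUF files exist on the repo.
docker model run hf.co/Maincode/Maincoder-1B:Q4_K_M
```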
When will a GGUF be available?
It would be nice to have a GGUF of the model.
llama.cpp says it's not supported when I tried to convert it.
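For anyone trying to reproduce this, the conversion goes through llama.cpp's convert_hf_to_gguf.py script; a rough sketch, with the local model path and output name as placeholders:

```bash
# llama.cpp's HF-to-GGUF converter (paths are placeholders):
python convert_hf_to_gguf.py ./Maincoder-1B --outfile maincoder-1b.gguf
```

It fails with: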
```
WARNING:hf-to-gguf:Trying to load config.json instead
INFO:hf-to-gguf:Model architecture: MaincoderForCausalLM
ERROR:hf-to-gguf:Model MaincoderForCausalLM is not supported
```
The https://huggingface.co/spaces/ggml-org/gguf-my-repo Space errored out as well.
I know, I tried that too. I'm guessing you need to modify llama.cpp to make it compatible with MaincoderForCausalLM.
Someone from our team is already working on it, so there should be an update soon :)
Okay! Hope everything goes well!
Any update on this?
Hi! We've been taking a break after the release, so this could take some time, but rest assured it will be posted here the moment it's done and released.
That said, in the meantime, to encourage community involvement and participation, it would be great to see someone from the community take this on if they're up for the task. The model's license permits it too, so you're all good.
Neat, now I can run the model locally :)
So I tried to load the model in LM Studio (thinking it would work) and got this:
```
🥲 Failed to load the model
Failed to load model
error loading model: error loading model architecture: unknown model architecture: 'maincoder'
```
Maybe enable LM Studio's beta channel. You need llama.cpp version >= b7614.
I set LM Studio to beta, but it still doesn’t work, and I’m not sure how to set up llama.cpp b7614 to work with LM Studio.
Probably need to wait a few days to a week for LM Studio to catch up if you don't want to run llama.cpp directly.
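If you do want to run llama.cpp directly, it would look roughly like the sketch below, assuming a GGUF has been produced and you're on a build with Maincoder support (>= b7614); the .gguf filename is a placeholder:

```bash
# Hypothetical: run a local GGUF with llama.cpp's CLI (filename is a placeholder).
llama-cli -m maincoder-1b.gguf -p "Who are you?"

# Or serve an OpenAI-compatible endpoint:
llama-server -m maincoder-1b.gguf --port 8080
```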
okay