Update README.md
Actual model card with proper information.
README.md CHANGED
@@ -2,15 +2,75 @@
 library_name: transformers
 license: apache-2.0
 base_model: Qwen/Qwen2.5-14B
-tags:
-- generated_from_trainer
 model-index:
 - name: LLaMutation-Qwen2.5-14B-SFFT-v0.0
   results: []
 ---
 
-
-
 
 [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
 <details><summary>See axolotl config</summary>
@@ -120,57 +180,4 @@ weight_decay: 0.1
 # fsdp_mixed_precision: BF16 # Added
 ```
 
-</details><br>
-
-# LLaMutation-Qwen2.5-14B-SFFT-v0.0
-
-This model is a fine-tuned version of [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) on the None dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.2621
-
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
-
-## Training procedure
-
-### Training hyperparameters
-
-The following hyperparameters were used during training:
-- learning_rate: 0.0005
-- train_batch_size: 1
-- eval_batch_size: 1
-- seed: 42
-- distributed_type: multi-GPU
-- num_devices: 8
-- gradient_accumulation_steps: 4
-- total_train_batch_size: 32
-- total_eval_batch_size: 8
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 50
-- num_epochs: 1
-
-### Training results
-
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:------:|:----:|:---------------:|
-| 0.3948 | 0.0237 | 1 | 0.3920 |
-| 0.2392 | 0.4970 | 21 | 0.2500 |
-| 0.2606 | 0.9941 | 42 | 0.2621 |
-
-
-### Framework versions
-
-- Transformers 4.45.2
-- Pytorch 2.3.1+cu121
-- Datasets 3.0.1
-- Tokenizers 0.20.1
 library_name: transformers
 license: apache-2.0
 base_model: Qwen/Qwen2.5-14B
 model-index:
 - name: LLaMutation-Qwen2.5-14B-SFFT-v0.0
   results: []
 ---
 
+# LLaMutation-Qwen2.5-14B-SFFT-v0.0
+
+
+
+This model is a [Spectrum](https://github.com/axolotl-ai-cloud/axolotl/blob/67f744dc8c9564ef7a42d5df780ae53e319dca61/src/axolotl/integrations/spectrum/README.md) FFT of [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) on a code translation dataset evolved with [EvolKit](https://github.com/arcee-ai/EvolKit).
+
+## Model description
+
+Code translation and completion model trained on Qwen2.5-14B, as there is not yet a Qwen2.5-Coder-14B model. This is strictly an alpha completion model, so expect quirks in its usage parameters.
+
+I will refine the model for completion and also create an instruct/chat variant.
+
+## Intended uses & limitations
+
+Code translation and tab autocompletion with [continue.dev](https://www.continue.dev/), each using a different system prompt.
+
+## Chat template and sampling parameters
+
+Chat template is chatml.
+
+Sampling parameters for the generation and demo at the hackathon are here:
+
+
+
+### SYSTEM PROMPT MUST BE USED FOR THIS MODEL
+
+`You are an Al assistant that is an expert at converting code from any language to another within properly formatted code blocks. DON'T SAY ANYTHING ABOUT NOT SEEING CODE. Keep non code text to the a minimum possible. DO NOT REPEAT ANY NON CODE TEXT. ONLY PRINT OUT CODE ONCE DO NOT ITTERATE!`
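As a rough illustration of the requirement above, the sketch below puts the system prompt first in a ChatML conversation and renders it by hand. The helper names `build_messages` and `render_chatml` and the wording of the user turn are assumptions made for this example and do not come from the model card. The system prompt string is copied verbatim, spelling included. In practice the tokenizer's own chat template, if the repository ships one, is the safer way to produce the final prompt.

```python
# System prompt copied verbatim from the model card. The spelling is left
# untouched on the assumption that the model expects this exact string.
SYSTEM_PROMPT = (
    "You are an Al assistant that is an expert at converting code from any "
    "language to another within properly formatted code blocks. "
    "DON'T SAY ANYTHING ABOUT NOT SEEING CODE. "
    "Keep non code text to the a minimum possible. "
    "DO NOT REPEAT ANY NON CODE TEXT. "
    "ONLY PRINT OUT CODE ONCE DO NOT ITTERATE!"
)

FENCE = "`" * 3  # Markdown code fence, built indirectly to keep this block readable


def build_messages(source_code: str, src_lang: str, dst_lang: str) -> list:
    """Hypothetical helper: system prompt first, then a single user turn."""
    user_turn = (
        f"Convert this {src_lang} code to {dst_lang}:\n"
        f"{FENCE}{src_lang}\n{source_code}\n{FENCE}"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_turn},
    ]


def render_chatml(messages: list, add_generation_prompt: bool = True) -> str:
    """Render messages in the ChatML layout: <|im_start|>role ... <|im_end|>."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    messages = build_messages("print('hi')", "python", "rust")
    print(render_chatml(messages))
```

The same `messages` list could instead be passed to a chat-aware client or to the tokenizer's chat template; the manual renderer is only there to show where the system prompt sits in the prompt.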
+
+## Training procedure
+
+Spectrum FFT/SFFT
+
+### Training hyperparameters
+
+The following hyperparameters were used during training:
+- learning_rate: 0.0005
+- train_batch_size: 1
+- eval_batch_size: 1
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 8
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 32
+- total_eval_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 50
+- num_epochs: 1
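The two `total_*` values in the list above follow from the per-device settings. A minimal sketch of that arithmetic is shown here; the function name `effective_batch_sizes` is made up for this example, and it assumes gradient accumulation applies to training only.

```python
def effective_batch_sizes(train_bs: int, eval_bs: int,
                          num_devices: int, grad_accum: int) -> tuple:
    """Return (total_train_batch_size, total_eval_batch_size).

    Training: per-device batch x devices x gradient accumulation steps.
    Evaluation: per-device batch x devices (no accumulation is assumed).
    """
    total_train = train_bs * num_devices * grad_accum
    total_eval = eval_bs * num_devices
    return total_train, total_eval


if __name__ == "__main__":
    # Values taken from the hyperparameter list.
    total_train, total_eval = effective_batch_sizes(
        train_bs=1, eval_bs=1, num_devices=8, grad_accum=4
    )
    print(total_train, total_eval)  # 32 8
```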
+
+### Training results
+
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 0.3948 | 0.0237 | 1 | 0.3920 |
+| 0.2392 | 0.4970 | 21 | 0.2500 |
+| 0.2606 | 0.9941 | 42 | 0.2621 |
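Read together with the hyperparameters, the table suggests that the last logged step (42) falls before the 50 warmup steps have completed. The sketch below implements a generic linear warmup and linear decay multiplier in plain Python to show what that would imply. It is an illustration under stated assumptions, not the training code: `linear_lr` is a hypothetical helper, `total_steps=42` is inferred from the table rather than stated in the card, and the elided axolotl config may differ from the listed values.

```python
def linear_lr(step: int, peak_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Linear warmup to peak_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        # Warmup branch: the rate grows proportionally to the step count.
        return peak_lr * step / max(1, warmup_steps)
    # Decay branch: clamp at zero if the step runs past total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    PEAK_LR = 0.0005      # learning_rate from the card
    WARMUP_STEPS = 50     # lr_scheduler_warmup_steps from the card
    TOTAL_STEPS = 42      # assumption: inferred from the last logged step

    for step in (1, 21, 42):  # the steps logged in the results table
        lr = linear_lr(step, PEAK_LR, WARMUP_STEPS, TOTAL_STEPS)
        print(f"step {step:>2}: lr = {lr:.5f}")
```

Under these assumptions the rate at step 42 would be about 0.00042, so the nominal 0.0005 would never be reached during the run.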
+
+
+### Framework versions
+
+- Transformers 4.45.2
+- Pytorch 2.3.1+cu121
+- Datasets 3.0.1
+- Tokenizers 0.20.1
 
 [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
 <details><summary>See axolotl config</summary>
@@ -120,57 +180,4 @@ weight_decay: 0.1
 # fsdp_mixed_precision: BF16 # Added
 ```
 
+</details><br>