Kearm committed on Oct 21, 2024

Commit a29cf36 (verified) · 1 Parent(s): 6fdf3d5

Update README.md

Actual model card with proper information.

Files changed (1)

README.md +65 -58

@@ -2,15 +2,75 @@
 library_name: transformers
 license: apache-2.0
 base_model: Qwen/Qwen2.5-14B
-tags:
-- generated_from_trainer
 model-index:
 - name: LLaMutation-Qwen2.5-14B-SFFT-v0.0
   results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
 <details><summary>See axolotl config</summary>
@@ -120,57 +180,4 @@ weight_decay: 0.1
 #   fsdp_mixed_precision: BF16  # Added
 ```
-</details><br>
-# LLaMutation-Qwen2.5-14B-SFFT-v0.0
-This model is a fine-tuned version of [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) on the None dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.2621
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 0.0005
-- train_batch_size: 1
-- eval_batch_size: 1
-- seed: 42
-- distributed_type: multi-GPU
-- num_devices: 8
-- gradient_accumulation_steps: 4
-- total_train_batch_size: 32
-- total_eval_batch_size: 8
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 50
-- num_epochs: 1
-### Training results
-| Training Loss | Epoch  | Step | Validation Loss |
-|:-------------:|:------:|:----:|:---------------:|
-| 0.3948        | 0.0237 | 1    | 0.3920          |
-| 0.2392        | 0.4970 | 21   | 0.2500          |
-| 0.2606        | 0.9941 | 42   | 0.2621          |
-### Framework versions
-- Transformers 4.45.2
-- Pytorch 2.3.1+cu121
-- Datasets 3.0.1
-- Tokenizers 0.20.1
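
The batch-size entries in the hyperparameter list are tied together by simple arithmetic, which the short sketch below reproduces from the listed values. The steps-per-epoch figure is only an estimate inferred from the last row of the training results table (step 42 at epoch 0.9941); the variable names are chosen for illustration.

```python
# Values copied from the "Training hyperparameters" list in the README.
train_batch_size = 1             # per-device training batch size
eval_batch_size = 1              # per-device evaluation batch size
num_devices = 8                  # multi-GPU run
gradient_accumulation_steps = 4

# Effective batch sizes: per-device size times devices (times accumulation for training).
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices

# Estimate only: optimizer steps in one full epoch, from step 42 at epoch 0.9941.
steps_per_epoch = round(42 / 0.9941, 2)

print(total_train_batch_size, total_eval_batch_size, steps_per_epoch)
```

Both totals agree with the `total_train_batch_size` and `total_eval_batch_size` values reported in the list.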

 library_name: transformers
 license: apache-2.0
 base_model: Qwen/Qwen2.5-14B
 model-index:
 - name: LLaMutation-Qwen2.5-14B-SFFT-v0.0
   results: []
 ---
+# LLaMutation-Qwen2.5-14B-SFFT-v0.0
+![image/webp](https://cdn-uploads.huggingface.co/production/uploads/655dc641accde1bbc8b41aec/IFK02cTih72zfZfT5UY4f.webp)
+This model is a [Spectrum](https://github.com/axolotl-ai-cloud/axolotl/blob/67f744dc8c9564ef7a42d5df780ae53e319dca61/src/axolotl/integrations/spectrum/README.md) FFT of [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) on a code translation dataset evolved with [EvolKit](https://github.com/arcee-ai/EvolKit).
+## Model description
+Code translation and completion model trained on Qwen2.5-14B, as there is not yet a Qwen2.5-Coder-14B model. This is 100% an alpha completion model, so there will be quirks in its usage parameters.
+I will refine the model both for completion and create an instruct/chat variant.
+## Intended uses & limitations
+Differing system prompts for code translation and use as a tab autocomplete model with [continue.dev](https://www.continue.dev/)
+## Chat template and sampling parameters
+Chat template is chatml.
+Sampling parameters for the generation and demo at the hackathon are here:
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/655dc641accde1bbc8b41aec/YzQ8nqu83lEhl3Kg4u0PC.png)
+### SYSTEM PROMPT MUST BE USED FOR THIS MODEL
+`You are an Al assistant that is an expert at converting code from any language to another within properly formatted code blocks. DON'T SAY ANYTHING ABOUT NOT SEEING CODE. Keep non code text to the a minimum possible. DO NOT REPEAT ANY NON CODE TEXT. ONLY PRINT OUT CODE ONCE DO NOT ITTERATE!`
+## Training procedure
+Spectrum FFT/SFFT
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0005
+- train_batch_size: 1
+- eval_batch_size: 1
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 8
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 32
+- total_eval_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 50
+- num_epochs: 1
+### Training results
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 0.3948        | 0.0237 | 1    | 0.3920          |
+| 0.2392        | 0.4970 | 21   | 0.2500          |
+| 0.2606        | 0.9941 | 42   | 0.2621          |
+### Framework versions
+- Transformers 4.45.2
+- Pytorch 2.3.1+cu121
+- Datasets 3.0.1
+- Tokenizers 0.20.1
 [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
 <details><summary>See axolotl config</summary>
 #   fsdp_mixed_precision: BF16  # Added
 ```
+</details><br>
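
The updated card requires a fixed system prompt and states that the chat template is ChatML. As a minimal sketch of what that implies for a request, the snippet below builds the message list and renders it as a ChatML prompt by hand. The helper names `build_messages` and `to_chatml` and the wording of the user turn are assumptions for illustration and do not come from the model card; the system prompt is copied verbatim from the card, typos included, since the card says it must be used.

```python
# System prompt copied verbatim from the model card (spelling left untouched).
SYSTEM_PROMPT = (
    "You are an Al assistant that is an expert at converting code from any "
    "language to another within properly formatted code blocks. "
    "DON'T SAY ANYTHING ABOUT NOT SEEING CODE. "
    "Keep non code text to the a minimum possible. "
    "DO NOT REPEAT ANY NON CODE TEXT. "
    "ONLY PRINT OUT CODE ONCE DO NOT ITTERATE!"
)


def build_messages(source_code, source_lang, target_lang):
    """Return a chat message list with the required system prompt first.

    The phrasing of the user turn is hypothetical; the card does not specify it.
    """
    user_content = (
        f"Translate the following {source_lang} code to {target_lang}:\n\n"
        f"{source_code}"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_content},
    ]


def to_chatml(messages, add_generation_prompt=True):
    """Render messages in ChatML: <|im_start|>role\\ncontent<|im_end|>\\n per turn."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = build_messages("print('hello')", "Python", "Rust")
prompt = to_chatml(messages)
print(prompt)
```

With Transformers, passing the same `messages` list to `tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)` should give an equivalent prompt, provided the repository's tokenizer ships a ChatML template; that has not been checked here.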