XGenerationLab/XiYanSQL-QwenCoder-7B-2502

@@ -62,7 +62,7 @@ transformers >= 4.37.0
 Here is a simple code snippet for quickly using **XiYanSQL-QwenCoder** model. We provide a Chinese version of the prompt, and you just need to replace the placeholders for "question," "db_schema," and "evidence" to get started. We recommend using our [M-Schema](https://github.com/XGenerationLab/M-Schema) format for the schema; other formats such as DDL are also acceptable, but they may affect performance.
 Currently, we mainly support mainstream dialects like SQLite, PostgreSQL, and MySQL.
-```
 nl2sqlite_template_cn = """你是一名{dialect}专家，现在需要阅读并理解下面的【数据库schema】描述，以及可能用到的【参考信息】，并运用{dialect}知识生成sql语句回答【用户问题】。
 【用户问题】
@@ -82,7 +82,7 @@ nl2sqlite_template_cn = """你是一名{dialect}专家，现在需要阅读并
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
-model_name = "XGenerationLab/XiYanSQL-QwenCoder-32B-2412"
 model = AutoModelForCausalLM.from_pretrained(
     model_name,
     torch_dtype=torch.bfloat16,
@@ -118,6 +118,41 @@ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
 ```
 ## Acknowledgments
-If you find our work useful, please give us a citation or a like, so we can make a greater contribution to the open-source community!

 Here is a simple code snippet for quickly using **XiYanSQL-QwenCoder** model. We provide a Chinese version of the prompt, and you just need to replace the placeholders for "question," "db_schema," and "evidence" to get started. We recommend using our [M-Schema](https://github.com/XGenerationLab/M-Schema) format for the schema; other formats such as DDL are also acceptable, but they may affect performance.
 Currently, we mainly support mainstream dialects like SQLite, PostgreSQL, and MySQL.
+```python
 nl2sqlite_template_cn = """你是一名{dialect}专家，现在需要阅读并理解下面的【数据库schema】描述，以及可能用到的【参考信息】，并运用{dialect}知识生成sql语句回答【用户问题】。
 【用户问题】
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
+model_name = "XGenerationLab/XiYanSQL-QwenCoder-7B-2502"
 model = AutoModelForCausalLM.from_pretrained(
     model_name,
     torch_dtype=torch.bfloat16,
 ```
+### Inference with vLLM
+```python
+from vllm import LLM, SamplingParams
+from transformers import AutoTokenizer
+model_path = "XGenerationLab/XiYanSQL-QwenCoder-7B-2502"
+llm = LLM(model=model_path, tensor_parallel_size=8)
+tokenizer = AutoTokenizer.from_pretrained(model_path)
+sampling_params = SamplingParams(
+    n=1,
+    temperature=0.1,
+    max_tokens=1024
+)
+## dialects -> ['SQLite', 'PostgreSQL', 'MySQL']
+prompt = nl2sqlite_template_cn.format(dialect="", db_schema="", question="", evidence="")
+message = [{'role': 'user', 'content': prompt}]
+text = tokenizer.apply_chat_template(
+    message,
+    tokenize=False,
+    add_generation_prompt=True
+)
+outputs = llm.generate([text], sampling_params=sampling_params)
+response = outputs[0].outputs[0].text
+```
 ## Acknowledgments
+If you find our work useful, please give us a citation or a like, so we can make a greater contribution to the open-source community!
+```bibtex
+@article{XiYanSQL,
+      title={XiYan-SQL: A Novel Multi-Generator Framework For Text-to-SQL},
+      author={Yifu Liu and Yin Zhu and Yingqi Gao and Zhiling Luo and Xiaoxia Li and Xiaorong Shi and Yuntao Hong and Jinyang Gao and Yu Li and Bolin Ding and Jingren Zhou},
+      year={2025},
+      eprint={2507.04701},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2507.04701},
+}
+```
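
The snippets in this diff leave the `dialect`, `db_schema`, `question`, and `evidence` placeholders empty. As a minimal sketch of filling them in, the example below uses a short stand-in template, because the full `nl2sqlite_template_cn` text is truncated in the diff. The names `demo_template`, `build_prompt`, `SUPPORTED_DIALECTS`, and the sample schema and question are hypothetical and do not come from the README. Rejecting dialects outside the three the README names is a deliberately strict choice for this sketch, not documented model behavior.

```python
# Hypothetical stand-in for the full nl2sqlite_template_cn prompt, which is
# truncated in this diff; only the four placeholder names come from the README.
demo_template = (
    "Dialect: {dialect}\n"
    "Schema:\n{db_schema}\n"
    "Question: {question}\n"
    "Evidence: {evidence}\n"
)

# The three dialects the README names explicitly.
SUPPORTED_DIALECTS = ("SQLite", "PostgreSQL", "MySQL")


def build_prompt(template, dialect, db_schema, question, evidence=""):
    """Fill the four placeholders; raise on dialects outside the listed three."""
    if dialect not in SUPPORTED_DIALECTS:
        raise ValueError(f"unsupported dialect: {dialect!r}")
    return template.format(
        dialect=dialect,
        db_schema=db_schema,
        question=question,
        evidence=evidence,
    )


# Made-up schema and question, for illustration only (plain DDL, not M-Schema).
example_schema = "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);"
prompt = build_prompt(
    demo_template,
    dialect="SQLite",
    db_schema=example_schema,
    question="How many users are older than 30?",
)
print(prompt)
```

With the real template in scope, passing `nl2sqlite_template_cn` in place of `demo_template` should yield the string that the Transformers and vLLM snippets wrap in a user message, assuming the full template uses exactly these four placeholder names.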