---
license: apache-2.0
language:
- en
metrics:
- accuracy
tags:
- code
arxiv: 2407.10424
---

# CodeV: Empowering LLMs for HDL Generation through Multi-Level Summarization

<img src="assets/overview_v20250413.png" style="zoom:50%;" /> 

CodeV is a series of open-source, instruction-tuned large language models (LLMs) designed to generate high-quality HDL code, addressing the challenges existing models face in this domain. **(This repo is under development.)**


## Models and Datasets

| Size | Base Model                                                                                                     | CodeV                                                                          |
| ---- | -------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------ |
| 6.7B | [deepseek-ai/deepseek-coder-6.7b-base](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-base) | [yang-z/CodeV-DS-6.7B](https://huggingface.co/yang-z/CodeV-DS-6.7B) |
| 7B   | [codellama/CodeLlama-7b-Python-hf](https://huggingface.co/codellama/CodeLlama-7b-Python-hf)         | [yang-z/CodeV-CL-7B](https://huggingface.co/yang-z/CodeV-CL-7B)     |
| 7B   | [Qwen/CodeQwen1.5-7B-Chat](https://huggingface.co/Qwen/CodeQwen1.5-7B-Chat)                         | [yang-z/CodeV-QW-7B](https://huggingface.co/yang-z/CodeV-QW-7B)     |
| 7B   | [Qwen/Qwen2.5-Coder-7B](https://huggingface.co/Qwen/Qwen2.5-Coder-7B)                               | [yang-z/CodeV-QC-7B](https://huggingface.co/yang-z/CodeV-QC-7B)     |
| 6.7B | [deepseek-ai/deepseek-coder-6.7b-base](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-base) | [yang-z/CodeV-All-DSC](https://huggingface.co/yang-z/CodeV-All-DSC) |
| 7B   | [codellama/CodeLlama-7b-Python-hf](https://huggingface.co/codellama/CodeLlama-7b-Python-hf)         | [yang-z/CodeV-All-CL](https://huggingface.co/yang-z/CodeV-All-CL)   |
| 7B   | [Qwen/CodeQwen1.5-7B-Chat](https://huggingface.co/Qwen/CodeQwen1.5-7B-Chat)                         | [yang-z/CodeV-All-CQ](https://huggingface.co/yang-z/CodeV-All-CQ)   |
| 7B   | [Qwen/Qwen2.5-Coder-7B](https://huggingface.co/Qwen/Qwen2.5-Coder-7B)                               | [yang-z/CodeV-All-QC](https://huggingface.co/yang-z/CodeV-All-QC)   |

## Test

To evaluate the Verilog generation capability of these models, install the [VerilogEval](https://github.com/NVlabs/verilog-eval) and [RTLLM](https://github.com/hkust-zhiyao/rtllm) evaluation environments.

## Quick Start

```python
from transformers import pipeline
import torch

prompt = "FILL IN THE QUESTION"

generator = pipeline(
    model="CODEV",  # replace with a CodeV checkpoint, e.g. yang-z/CodeV-QC-7B
    task="text-generation",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Greedy decoding; note that temperature=0.0 raises an error in recent
# transformers versions, so use do_sample=False instead.
result = generator(prompt, max_length=2048, num_return_sequences=1, do_sample=False)
response = result[0]["generated_text"]
print("Response:", response)
```
### Usage Recommendations
1. Chat task template

The goal of the Chat task is to generate complete Verilog or Chisel code from a natural-language description. The input consists of the description and an optional module header; the output is the corresponding HDL code. A minimal Python sketch follows the template below.
```
<LanguageTag>
[Natural Language Description]
[Optional Module Header]
```
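
For example, here is a minimal sketch reusing the `generator` pipeline from Quick Start. The `<verilog>` language tag is assumed from the `fim.hbs` template shown later in this README, and the counter specification and module header are purely illustrative:

```python
# Minimal sketch of a chat-style prompt: language tag, then the natural
# language description, then an optional module header.
description = "Implement a 4-bit synchronous counter with active-high reset."
module_header = "module counter(input clk, input rst, output reg [3:0] out);"

prompt = f"<verilog>\n{description}\n{module_header}"
result = generator(prompt, max_length=2048, do_sample=False)
print(result[0]["generated_text"])
```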
2. FIM task template

The goal of the FIM (fill-in-the-middle) task is to generate the missing middle of a piece of code from its prefix and suffix. The input consists of the language tag, the prefix, the suffix, and the model's special FIM markers; the output is the missing middle snippet.
````
[PRE]```[verilog/scala]
<LanguageTag>
{prefix}[SUF]{suffix}[MID]
````
We recommend using these templates during inference.
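
For reference, here is a sketch of building a FIM prompt, again reusing `generator` from Quick Start. The concrete `[PRE]`/`[SUF]`/`[MID]` tokens are model-specific; the Qwen-style tokens below are assumed from the `fim.hbs` template shown in the Twinny section, and the adder snippet is illustrative:

```python
# Hedged sketch of a FIM prompt. <|fim_prefix|>, <|fim_suffix|> and
# <|fim_middle|> stand in for [PRE], [SUF] and [MID], matching the fim.hbs
# template used with codev-all-qc later in this README.
prefix = "module adder(input [3:0] a, input [3:0] b, output [4:0] sum);\n"
suffix = "\nendmodule"
fim_prompt = (
    "<|fim_prefix|>```verilog\n<verilog>"
    f"{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"
)
result = generator(fim_prompt, max_length=2048, do_sample=False)
print(result[0]["generated_text"])
```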
## Run CodeV-All Models with Twinny

The instructions below use `codev-all-qc` as an example. For other models, please make corresponding adjustments.

### Install Ollama

Refer to the [official documentation](https://github.com/ollama/ollama/tree/main/docs).

### Import a Model in Ollama

#### Create a Modelfile

Create a file named `Modelfile` and fill it with the following content:

```
FROM path/to/codev-all-qc

TEMPLATE """{{ .Prompt }}"""

PARAMETER stop "```"
```

Replace `path/to/codev-all-qc` with the actual path to your model. You can also customize parameters (e.g., temperature). See the [Modelfile Reference](https://github.com/ollama/ollama/blob/main/docs/modelfile.md) for details.

#### Import CodeV-ALL

Start the Ollama service:

```
ollama serve
```

Create the model:

```
ollama create codev-all-qc -f path/to/Modelfile
```

Replace `path/to/Modelfile` with the actual path to your Modelfile, then wait for the model creation process to complete.
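
Once creation finishes, you can optionally sanity-check the imported model through Ollama's REST API before setting up Twinny. This is a minimal sketch assuming the default endpoint `http://localhost:11434`; the prompt follows the chat template described above:

```python
# Query the model just created via Ollama's HTTP generate API.
import json
import urllib.request

payload = {
    "model": "codev-all-qc",  # the name used with `ollama create`
    "prompt": "<verilog>\nImplement a 2-to-1 multiplexer.\nmodule mux2(input a, input b, input sel, output y);",
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```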

### Twinny Setup

#### Install Twinny

Open VS Code and install Twinny in the Extensions Marketplace. 

<img src="./assets/image-20250912155617922.png" alt="image-20250912155617922" style="zoom: 35%;" />

#### Twinny Configuration

Open the FIM Configuration page.

<img src="./assets/7449b0e6ac2ff722339b7c74f37a8b0e.png" alt="7449b0e6ac2ff722339b7c74f37a8b0e" style="zoom:33%;" />

Enter the settings as shown below. The model name should match the one used during `ollama create`. Modify the hostname according to your setup (if Ollama is running on a different node, use that node’s IP address; for local use, use `0.0.0.0`). Click Save.

<img src="./assets/image-20250912160402939.png" alt="image-20250912160402939" style="zoom: 35%;" />

Go to Template Configuration and open the template editor.

<img src="./assets/image-20250912160957699.png" alt="image-20250912160957699" style="zoom: 35%;" />

Open `fim.hbs`, replace its content with the following, and save:

```
<|fim_prefix|>```verilog\n<verilog>{{{prefix}}}<|fim_suffix|>{{{suffix}}}<|fim_middle|>
```

<img src="./assets/image-20250912160901631.png" alt="image-20250912160901631" style="zoom: 33%;" />

Finally, ensure the Fim option is checked in the template settings. Note: you may need to re-enable this each time VS Code restarts.

<img src="./assets/bd1fc20b0075656ba4e5321523832e19.png" alt="bd1fc20b0075656ba4e5321523832e19" style="zoom:35%;" />

#### Try FIM

You can now try FIM while writing code in VS Code. Note: The first time you use completion, Ollama will load the model, which may cause a significant delay.

<img src="./assets/image-20250225124004805.png" alt="image-20250225124004805" style="zoom: 67%;" />

## Paper
**arXiv:** <https://arxiv.org/abs/2407.10424>

Please cite the paper if you use the CodeV models.

```
@misc{zhao2025codevempoweringllmshdl,
      title={CodeV: Empowering LLMs with HDL Generation through Multi-Level Summarization}, 
      author={Yang Zhao and Di Huang and Chongxiao Li and Pengwei Jin and Muxin Song and Yinan Xu and Ziyuan Nan and Mingju Gao and Tianyun Ma and Lei Qi and Yansong Pan and Zhenxing Zhang and Rui Zhang and Xishan Zhang and Zidong Du and Qi Guo and Xing Hu},
      year={2025},
      eprint={2407.10424},
      archivePrefix={arXiv},
      primaryClass={cs.PL},
      url={https://arxiv.org/abs/2407.10424}, 
}
```
## Acknowledgements

* [Magicoder](https://github.com/ise-uiuc/magicoder): Training code, original datasets and data decontamination
* [DeepSeek-Coder](https://github.com/deepseek-ai/DeepSeek-Coder): Base model for CodeV-DeepSeek
* [CodeLlama](https://ai.meta.com/research/publications/code-llama-open-foundation-models-for-code/): Base model for CodeV-CodeLlama
* [CodeQwen](https://github.com/QwenLM/CodeQwen1.5): Base model for CodeV-CodeQwen