---
license: apache-2.0
language:
- en
metrics:
- accuracy
tags:
- code
arxiv: 2407.10424
---
# CodeV: Empowering LLMs for HDL Generation through Multi-Level Summarization
<img src="assets/overview_v20250413.png" style="zoom:50%;" />
CodeV is a series of open-source, instruction-tuned Large Language Models (LLMs) designed to generate high-quality HDL code, addressing the weaknesses of existing code LLMs in this domain. **(This repo is under development)**
## Models and Datasets
| Size | Base Model | CodeV |
| ---- | --------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------ |
| 6.7B | [deepseek-ai/deepseek-coder-6.7b-base](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-base) | [yang-z/CodeV-DS-6.7B](https://huggingface.co/yang-z/CodeV-DS-6.7B) |
| 7B | [codellama/CodeLlama-7b-Python-hf](https://huggingface.co/codellama/CodeLlama-7b-Python-hf) | [yang-z/CodeV-CL-7B](https://huggingface.co/yang-z/CodeV-CL-7B) |
| 7B | [Qwen/CodeQwen1.5-7B-Chat](https://huggingface.co/Qwen/CodeQwen1.5-7B-Chat) | [yang-z/CodeV-QW-7B](https://huggingface.co/yang-z/CodeV-QW-7B) |
| 7B | [Qwen/Qwen2.5-Coder-7B](https://huggingface.co/Qwen/Qwen2.5-Coder-7B) | [yang-z/CodeV-QC-7B](https://huggingface.co/yang-z/CodeV-QC-7B) |
| 6.7B | [deepseek-ai/deepseek-coder-6.7b-base](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-base) | [yang-z/CodeV-All-DSC](https://huggingface.co/yang-z/CodeV-All-DSC) |
| 7B | [codellama/CodeLlama-7b-Python-hf](https://huggingface.co/codellama/CodeLlama-7b-Python-hf) | [yang-z/CodeV-All-CL](https://huggingface.co/yang-z/CodeV-All-CL) |
| 7B |[Qwen/CodeQwen1.5-7B-Chat](https://huggingface.co/Qwen/CodeQwen1.5-7B-Chat) | [yang-z/CodeV-All-CQ](https://huggingface.co/yang-z/CodeV-All-CQ) |
| 7B |[Qwen/Qwen2.5-Coder-7B](https://huggingface.co/Qwen/Qwen2.5-Coder-7B) | [yang-z/CodeV-All-QC](https://huggingface.co/yang-z/CodeV-All-QC) |
## Test
To evaluate the Verilog generation capability of these models, install the [VerilogEval](https://github.com/NVlabs/verilog-eval) and [RTLLM](https://github.com/hkust-zhiyao/rtllm) benchmark environments.
## Quick Start
```python
from transformers import pipeline
import torch

prompt = "FILL IN THE QUESTION"  # your natural-language description (and optional module header)

generator = pipeline(
    model="CODEV",  # replace with a model path from the table above, e.g. "yang-z/CodeV-QC-7B"
    task="text-generation",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Greedy decoding (the deterministic equivalent of temperature=0).
result = generator(prompt, max_length=2048, num_return_sequences=1, do_sample=False)
response = result[0]["generated_text"]
print("Response:", response)
```
### Usage Recommendations
1. The chat task template
The goal of the chat task is to generate complete Verilog or Chisel code from a natural-language description. The input is the description plus an optional module header; the output is the corresponding HDL code:
```
<LanguageTag>
[Natural Language Description]
[Optional Module Header]
```
2. The FIM task template
The goal of the FIM (fill-in-the-middle) task is to complete missing code: the model generates the middle of a file from its prefix and suffix. The input consists of the language tag, the prefix, the suffix, and the model-specific FIM markers (written `[PRE]`, `[SUF]`, and `[MID]` below); the output is the missing middle snippet:
````
[PRE]```[verilog/scala]
<LanguageTag>
{prefix}[SUF]{suffix}[MID]
````
We recommend using these templates during inference.
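For reference, here is a minimal Python sketch of prompt construction for both tasks. The helper functions and the example module are illustrative, not part of the official API; the FIM markers shown are the Qwen2.5-Coder-style markers used by `codev-all-qc` (see the `fim.hbs` template below) and differ for other base models.
```python
# Illustrative prompt builders for the two templates above.
# Assumptions: "<verilog>" is the language tag, and the FIM markers
# match Qwen2.5-Coder (other base models use different markers).

def chat_prompt(description: str, module_header: str = "") -> str:
    """Chat task: language tag, description, optional module header."""
    prompt = f"<verilog>\n{description}\n"
    if module_header:
        prompt += f"{module_header}\n"
    return prompt

def fim_prompt(prefix: str, suffix: str) -> str:
    """FIM task: the model generates the code between prefix and suffix."""
    return (
        "<|fim_prefix|>```verilog\n<verilog>"
        f"{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"
    )

# Example: ask the model to complete the body of a counter module.
print(fim_prompt(
    prefix="module counter(input clk, input rst, output reg [7:0] q);\n",
    suffix="endmodule\n",
))
```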
## Run CodeV-All Models with Twinny
The instructions below use `codev-all-qc` as an example. For other models, please make corresponding adjustments.
### Install Ollama
Refer to the [official documentation](https://github.com/ollama/ollama/tree/main/docs).
### Import a Model in Ollama
#### Create a Modelfile
Create a file named `Modelfile` and fill it with the following content:
```
FROM path/to/codev-all-qc
TEMPLATE """{{ .Prompt }}"""
PARAMETER stop "```"
```
Replace `path/to/codev-all-qc` with the actual path to your model. You can also customize parameters (e.g., temperature). See the [Modelfile Reference](https://github.com/ollama/ollama/blob/main/docs/modelfile.md) for details.
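For example, a Modelfile that also pins sampling parameters might look like the following (the parameter values are illustrative, not tuned recommendations):
```
FROM path/to/codev-all-qc
TEMPLATE """{{ .Prompt }}"""
PARAMETER stop "```"
PARAMETER temperature 0.2
PARAMETER num_ctx 4096
```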
#### Import CodeV-ALL
Start the Ollama service:
```
ollama serve
```
Create the model:
```
ollama create codev-all-qc -f path/to/Modelfile
```
Replace `path/to/Modelfile` with the actual path to your Modelfile, then wait for the model creation process to complete.
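After creation completes, you can sanity-check the import with standard Ollama commands:
```
ollama list              # the new model should be listed
ollama run codev-all-qc  # opens an interactive prompt against the model
```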
### Twinny Setup
#### Install Twinny
Open VS Code and install Twinny in the Extensions Marketplace.
<img src="./assets/image-20250912155617922.png" alt="image-20250912155617922" style="zoom: 35%;" />
#### Twinny Configuration
Open the FIM Configuration page.
<img src="./assets/7449b0e6ac2ff722339b7c74f37a8b0e.png" alt="7449b0e6ac2ff722339b7c74f37a8b0e" style="zoom:33%;" />
Enter the settings as shown below. The model name should match the one used during `ollama create`. Modify the hostname according to your setup (if Ollama is running on a different node, use that node’s IP address; for local use, use `0.0.0.0`). Click Save.
<img src="./assets/image-20250912160402939.png" alt="image-20250912160402939" style="zoom: 35%;" />
Go to Template Configuration and open the template editor.
<img src="./assets/image-20250912160957699.png" alt="image-20250912160957699" style="zoom: 35%;" />
Open `fim.hbs`, replace its content with the following, and save:
```
<|fim_prefix|>```verilog\n<verilog>{{{prefix}}}<|fim_suffix|>{{{suffix}}}<|fim_middle|>
```
<img src="./assets/image-20250912160901631.png" alt="image-20250912160901631" style="zoom: 33%;" />
Finally, ensure the Fim option is checked in the template settings. Note: you may need to re-enable this each time VS Code restarts.
<img src="./assets/bd1fc20b0075656ba4e5321523832e19.png" alt="bd1fc20b0075656ba4e5321523832e19" style="zoom:35%;" />
#### Try FIM
You can now try FIM while writing code in VS Code. Note: The first time you use completion, Ollama will load the model, which may cause a significant delay.
<img src="./assets/image-20250225124004805.png" alt="image-20250225124004805" style="zoom: 67%;" />
## Paper
**arXiv:** <https://arxiv.org/abs/2407.10424>
Please cite the paper if you use the models from CodeV.
```
@misc{zhao2025codevempoweringllmshdl,
title={CodeV: Empowering LLMs with HDL Generation through Multi-Level Summarization},
author={Yang Zhao and Di Huang and Chongxiao Li and Pengwei Jin and Muxin Song and Yinan Xu and Ziyuan Nan and Mingju Gao and Tianyun Ma and Lei Qi and Yansong Pan and Zhenxing Zhang and Rui Zhang and Xishan Zhang and Zidong Du and Qi Guo and Xing Hu},
year={2025},
eprint={2407.10424},
archivePrefix={arXiv},
primaryClass={cs.PL},
url={https://arxiv.org/abs/2407.10424},
}
```
## Acknowledgements
* [Magicoder](https://github.com/ise-uiuc/magicoder): Training code, original datasets and data decontamination
* [DeepSeek-Coder](https://github.com/deepseek-ai/DeepSeek-Coder): Base model for CodeV-DeepSeek
* [CodeLlama](https://ai.meta.com/research/publications/code-llama-open-foundation-models-for-code/): Base model for CodeV-CodeLlama
* [CodeQwen](https://github.com/QwenLM/CodeQwen1.5): Base model for CodeV-CodeQwen