---
license: apache-2.0
base_model: deepseek-ai/deepseek-coder-1.3b-instruct
tags:
- security
- vulnerability-detection
- penetration-testing
- code-analysis
- cybersecurity
- lora
- deepseek
library_name: peft
pipeline_tag: text-generation
---

# Pentest Vulnerability Detector

## Model Description

This is a LoRA adapter for DeepSeek-Coder-1.3B-Instruct, fine-tuned to detect security vulnerabilities in code.

**Base Model:** deepseek-ai/deepseek-coder-1.3b-instruct

**Training Data:** 440 synthetic vulnerability examples

**Training Method:** LoRA (Low-Rank Adaptation) with 4-bit quantization

**Training Platform:** Google Colab (Free T4 GPU)

## Capabilities

The model can detect and analyze:

- SQL Injection
- Cross-Site Scripting (XSS)
- Command Injection / RCE
- Insecure Direct Object Reference (IDOR)
- Server-Side Request Forgery (SSRF)
- Authentication Bypass
- Cross-Site Request Forgery (CSRF)
- Path Traversal

## Training Details

- **Examples:** 440 vulnerability patterns
- **Epochs:** 3
- **Batch Size:** 2 (with gradient accumulation)
- **Learning Rate:** 2e-4
- **LoRA Rank:** 8
- **Quantization:** 4-bit (NF4)
- **Training Time:** ~45-60 minutes on T4 GPU

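
To give a sense of why a rank-8 adapter is cheap enough to train on a free T4, the sketch below counts the parameters LoRA adds to a single weight matrix. The 2048 x 2048 projection shape is an assumption chosen for illustration; this card does not list which modules were adapted or their shapes. The count itself follows from LoRA's two low-rank factors.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns two factors, A (rank x d_in) and B (d_out x rank),
    while the original weight stays frozen.
    """
    return rank * d_in + d_out * rank


# Assumed shape, for illustration only; not taken from the model card.
D_IN, D_OUT = 2048, 2048
RANK = 8  # LoRA rank listed under Training Details

full = D_IN * D_OUT
added = lora_param_count(D_IN, D_OUT, RANK)
print(f"frozen weights: {full:,}")
print(f"LoRA weights:   {added:,} ({added / full:.2%} of the matrix)")
```

Under these assumptions the adapter trains well under 1% as many values as the matrix it modifies, which is the main reason the optimizer state fits alongside a 4-bit base model.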
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and its tokenizer
base_model = "deepseek-ai/deepseek-coder-1.3b-instruct"
model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Attach the LoRA adapter to the base model
model = PeftModel.from_pretrained(model, "elsiddik/pentest-vulnerability-detector")

# Analyze code
code = "SELECT * FROM users WHERE id = 'user_input'"
prompt = f"System: You are a security expert.\n\nUser: Analyze this code:\n{code}\n\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)

# The decoded text contains the prompt followed by the generated analysis
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

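
When analyzing many snippets, it can help to keep the prompt layout in one place and to strip the echoed prompt from the decoded text. The helpers below are a minimal sketch: `build_prompt` and `extract_answer` are hypothetical names, not part of this repository, and the "decoded" string is a stand-in for the result of `tokenizer.decode(...)`.

```python
SYSTEM_PROMPT = "You are a security expert."


def build_prompt(code: str, system: str = SYSTEM_PROMPT) -> str:
    """Format a snippet with the System/User/Assistant layout from the usage example."""
    return f"System: {system}\n\nUser: Analyze this code:\n{code}\n\nAssistant:"


def extract_answer(decoded: str, prompt: str) -> str:
    """Return only the generated text when the decoded output echoes the prompt.

    Falls back to the whole string if the prompt is not reproduced exactly,
    which can happen after a tokenize/decode round trip.
    """
    if decoded.startswith(prompt):
        return decoded[len(prompt):].strip()
    return decoded.strip()


prompt = build_prompt("SELECT * FROM users WHERE id = 'user_input'")
# Stand-in for tokenizer.decode(...); the analysis text here is made up.
decoded = prompt + " Possible SQL injection."
print(extract_answer(decoded, prompt))
```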
## Inference Script

For easier usage, use the provided inference script:

```bash
python inference_deepseek.py --model ./model --code "YOUR_CODE_HERE"
```

## Model Performance

Responses are intended to include:

- Vulnerability type identification
- Severity assessment (CRITICAL/HIGH/MEDIUM/LOW)
- Detailed attack vector analysis
- Specific remediation recommendations
- Code-specific security guidance

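
Because the model returns free text, downstream tooling has to pull the severity label out of the response itself. The sketch below assumes the label appears as one of the four words listed above; the exact response format is not documented in this card, so treat `extract_severity` as a hypothetical helper and the sample text as made up. A plain word match can misfire on prose such as "high traffic", so results still need review.

```python
import re
from typing import Optional

# Severity labels listed in the card, ordered from most to least severe.
SEVERITIES = ("CRITICAL", "HIGH", "MEDIUM", "LOW")
_SEVERITY_RE = re.compile(r"\b(" + "|".join(SEVERITIES) + r")\b", re.IGNORECASE)


def extract_severity(response: str) -> Optional[str]:
    """Return the most severe label mentioned in the response, or None."""
    found = {m.group(1).upper() for m in _SEVERITY_RE.finditer(response)}
    for level in SEVERITIES:
        if level in found:
            return level
    return None


# Made-up response text, used only to exercise the parser.
sample = "Vulnerability: SQL Injection\nSeverity: high\nFix: use parameterized queries."
print(extract_severity(sample))
```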
## Limitations

- Not 100% accurate; always verify findings manually
- May produce false positives and false negatives
- Best used as a pre-screening tool
- Should complement, not replace, manual security testing
- Trained on synthetic data; may need further fine-tuning for specific use cases

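
One way to quantify false positives and false negatives on your own codebase is to compare the model's flagged locations against findings confirmed by manual review. The sketch below is an illustration under assumptions: the `file:line` identifiers and both sets are hypothetical, and `precision_recall` is not part of this repository.

```python
from typing import Set, Tuple


def precision_recall(flagged: Set[str], confirmed: Set[str]) -> Tuple[float, float]:
    """Compare model-flagged findings against manually confirmed ones."""
    true_pos = len(flagged & confirmed)   # flagged and real
    false_pos = len(flagged - confirmed)  # flagged but not real
    false_neg = len(confirmed - flagged)  # real but missed
    precision = true_pos / (true_pos + false_pos) if flagged else 0.0
    recall = true_pos / (true_pos + false_neg) if confirmed else 0.0
    return precision, recall


# Hypothetical review results, for illustration only.
flagged = {"login.py:42", "search.py:10", "upload.py:77"}
confirmed = {"login.py:42", "upload.py:77", "admin.py:5"}

precision, recall = precision_recall(flagged, confirmed)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Low precision means reviewers spend time on noise; low recall means the pre-screen misses real issues that manual testing still has to catch.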
## Ethical Use

This model is intended for:

- Security research
- Penetration testing (authorized only)
- Code review and security auditing
- Educational purposes

**Do not use for:**

- Unauthorized system access
- Malicious activities
- Illegal purposes

## Training Data

The model was trained on 440 synthetic vulnerability examples covering:

- 100 SQL Injection patterns
- 80 XSS patterns
- 60 Command Injection patterns
- 50 IDOR patterns
- 40 SSRF patterns
- 40 Authentication Bypass patterns
- 40 CSRF patterns
- 30 Path Traversal patterns

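
The per-category counts above can be checked against the stated total, and the resulting shares show how uneven the coverage is. This is a small sketch using only the numbers from the list; the dictionary name is arbitrary.

```python
# Per-category example counts, copied from the list above.
TRAINING_EXAMPLES = {
    "SQL Injection": 100,
    "XSS": 80,
    "Command Injection": 60,
    "IDOR": 50,
    "SSRF": 40,
    "Authentication Bypass": 40,
    "CSRF": 40,
    "Path Traversal": 30,
}

total = sum(TRAINING_EXAMPLES.values())
assert total == 440, f"expected 440 examples, got {total}"

# Print each category with its count and share of the dataset.
for name, count in TRAINING_EXAMPLES.items():
    print(f"{name:<22} {count:>3}  {count / total:6.1%}")
```

SQL Injection has more than three times as many examples as Path Traversal, so detection quality may differ between categories.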
## Citation

If you use this model, please cite:

```
@misc{pentest-vulnerability-detector,
  author = {YOUR_NAME},
  title = {Pentest Vulnerability Detector},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/elsiddik/pentest-vulnerability-detector}}
}
```

## License

This model adapter is released under the **Apache 2.0 License**.
The base model (DeepSeek-Coder-1.3B-Instruct) has its own license terms.

### Apache 2.0 License Summary:

- ✅ Commercial use allowed
- ✅ Modification allowed
- ✅ Distribution allowed
- ✅ Patent use allowed
- ⚠️ Must include license and copyright notice
- ⚠️ Must state changes made

See the LICENSE file for full terms.

## Contact

For questions or issues, please open an issue on the model repository.

## Acknowledgments

- Base model: DeepSeek-Coder by DeepSeek AI
- Training framework: Hugging Face Transformers, PEFT
- Training platform: Google Colab