# Llama-2-7B CloudLex Intent Detection (QLoRA)

## Overview
This model is a CloudLex-specific intent classification model fine-tuned from
NousResearch/Llama-2-7b-chat-hf using QLoRA (4-bit) and LoRA adapters.
It is designed to classify user messages into predefined business intents commonly encountered in legal-tech and SaaS customer interactions.
## Task
Single-label intent classification (SEQ_CLS)
Given a user message, the model predicts one intent from the following set:
- Buying
- Support
- Careers
- Partnership
- Explore
- Others
## Training Details
- Base model: NousResearch/Llama-2-7b-chat-hf
- Fine-tuning method: QLoRA (PEFT)
- Quantization: 4-bit NF4
- LoRA rank: 64
- LoRA alpha: 16
- Optimizer: paged_adamw_32bit
- Training data: ~1,000 balanced intent-labeled samples
- Framework: Hugging Face Transformers + PEFT
This repository contains LoRA adapters only, not the full base model.
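The hyperparameters above can be sketched as a QLoRA setup. This is a minimal reconstruction from the card's stated values; anything beyond them (dropout, compute dtype, batch size, learning rate, output path) is an assumption, not a documented setting:

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit NF4 quantization, per the card
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # assumption
)

# LoRA adapter configuration, per the card
lora_config = LoraConfig(
    r=64,                 # LoRA rank, per the card
    lora_alpha=16,        # LoRA alpha, per the card
    lora_dropout=0.1,     # assumption
    task_type="SEQ_CLS",  # single-label sequence classification
)

training_args = TrainingArguments(
    output_dir="./results",       # assumption
    optim="paged_adamw_32bit",    # optimizer, per the card
)
```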
## How to Use (Inference)
```python
import torch
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    BitsAndBytesConfig,
    pipeline,
)
from peft import PeftModel

# Load the base model in 4-bit NF4 with a classification head
# sized for the six intents
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForSequenceClassification.from_pretrained(
    "NousResearch/Llama-2-7b-chat-hf",
    num_labels=6,
    quantization_config=bnb_config,
    device_map="auto",
)

# Load the LoRA adapters on top of the base model
model = PeftModel.from_pretrained(
    base_model,
    "Suramya/Llama-2-7b-CloudLex-Intent-Detection",
)

tokenizer = AutoTokenizer.from_pretrained(
    "Suramya/Llama-2-7b-CloudLex-Intent-Detection"
)

# Llama 2 has no pad token by default; classification needs one
tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = tokenizer.pad_token_id

classifier = pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer,
)
classifier("I'd like to schedule a demo for our law firm")
```
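The pipeline returns generic `LABEL_i` names unless the saved config carries an `id2label` map. A minimal post-processing sketch follows; the label-index-to-intent order below is an assumption for illustration, so check the model's `config.json` for the authoritative mapping:

```python
# Hypothetical LABEL_i -> intent mapping; verify against config.json
ID2INTENT = {
    "LABEL_0": "Buying",
    "LABEL_1": "Support",
    "LABEL_2": "Careers",
    "LABEL_3": "Partnership",
    "LABEL_4": "Explore",
    "LABEL_5": "Others",
}

def decode_prediction(result: list) -> tuple:
    """Map a text-classification pipeline result to (intent, score)."""
    top = max(result, key=lambda r: r["score"])
    return ID2INTENT.get(top["label"], top["label"]), top["score"]

# Example with a mocked pipeline result:
intent, score = decode_prediction(
    [{"label": "LABEL_0", "score": 0.93}, {"label": "LABEL_1", "score": 0.04}]
)
# intent -> "Buying"
```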