Libraries

How to use Ursulalala/HomeGuard-8B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="Ursulalala/HomeGuard-8B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("Ursulalala/HomeGuard-8B")
model = AutoModelForImageTextToText.from_pretrained("Ursulalala/HomeGuard-8B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
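
Because this checkpoint is built on Qwen3-VL-8B-Thinking, the decoded string may contain a reasoning trace before the final answer. The helper below is a minimal sketch for separating the two. It assumes the trace ends with a `</think>` tag, as is common for Qwen3 thinking models; the function name `split_reasoning` and the tag default are illustrative and are not taken from the HomeGuard repository.

```python
def split_reasoning(decoded: str, close_tag: str = "</think>") -> tuple[str, str]:
    """Split decoded model output into (reasoning, answer).

    Assumption: the reasoning trace, if present, ends at the first
    occurrence of `close_tag`. If the tag is absent, the whole string
    is treated as the answer.
    """
    head, sep, tail = decoded.partition(close_tag)
    if not sep:
        # No closing tag found: nothing to separate.
        return "", decoded.strip()
    # Drop an opening tag if the decoded text happens to include one.
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()
```

For example, `reasoning, answer = split_reasoning(text)` can be applied to the string returned by `processor.decode(...)` above. Any end-of-turn tokens left in the decoded text are not removed by this helper.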

Local Apps

vLLM

How to use Ursulalala/HomeGuard-8B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Ursulalala/HomeGuard-8B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Ursulalala/HomeGuard-8B",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
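
The same request can also be sent from Python. The sketch below uses only the standard library and mirrors the JSON body of the curl call; the helper names `build_chat_payload` and `post_chat` are illustrative, and the base URL assumes the server is listening on `http://localhost:8000` as in the command above.

```python
import json
import urllib.request


def build_chat_payload(model: str, prompt: str, image_url: str) -> dict:
    """Build an OpenAI-style chat request with one text part and one image part."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def post_chat(base_url: str, payload: dict, timeout: float = 120.0) -> str:
    """POST the payload to /v1/chat/completions and return the first reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

A call such as `post_chat("http://localhost:8000", build_chat_payload("Ursulalala/HomeGuard-8B", "Describe this image in one sentence.", image_url))` should behave like the curl request once the server is running. Pointing `base_url` at another OpenAI-compatible endpoint only requires changing the host and port.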

Use Docker

docker model run hf.co/Ursulalala/HomeGuard-8B

SGLang

How to use Ursulalala/HomeGuard-8B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Ursulalala/HomeGuard-8B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Ursulalala/HomeGuard-8B",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Ursulalala/HomeGuard-8B" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use Ursulalala/HomeGuard-8B with Docker Model Runner:
```
docker model run hf.co/Ursulalala/HomeGuard-8B
```

HomeGuard-8B

HomeGuard-8B is an 8B-parameter vision-language safeguard model for identifying contextual risk in household tasks. It is introduced in the paper "HomeGuard: VLM-based Embodied Safeguard for Identifying Contextual Risk in Household Task" and is designed to help embodied agents detect subtle, implicit hazards that arise from environmental context rather than from explicit malicious intent.

This checkpoint corresponds to the 8B step-RFT model used in the HomeGuard framework. It is built on top of Qwen3-VL-8B-Thinking and further optimized for grounded household risk reasoning with reinforcement fine-tuning.

Model Summary

HomeGuard focuses on scenarios where a seemingly benign instruction becomes unsafe because of object attributes, spatial relations, or latent environmental conditions.

Compared with generic VLMs, HomeGuard is specialized for:

contextual risk identification
grounded multimodal safety reasoning
safety-aware support for downstream planning and trajectory generation

Training Recipe

This model is derived from Qwen3-VL-8B-Thinking and trained within the HomeGuard pipeline.

Training setup, as summarized from the released training configuration:

Base model: Qwen/Qwen3-VL-8B-Thinking
Training stage: step-level RFT + GRPO-style optimization in the HomeGuard pipeline
Training data: HomeSafe
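
For readers unfamiliar with GRPO-style optimization, the sketch below illustrates the group-relative advantage that gives the method its name: several responses are sampled for the same prompt, and each reward is normalized against the group's mean and standard deviation. This is a generic illustration of the technique, not the HomeGuard training code; the function name, the use of the population standard deviation, and the example rewards are all assumptions.

```python
from statistics import fmean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize each reward against the other samples drawn for the same prompt."""
    mean = fmean(rewards)
    # Population standard deviation is an assumption; implementations vary.
    std = pstdev(rewards)
    # eps keeps the division defined when every reward in the group is equal.
    return [(reward - mean) / (std + eps) for reward in rewards]


# Hypothetical rewards for four sampled responses to one prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Responses scoring above the group mean receive positive advantages and those below receive negative ones. When all rewards in a group are equal, every advantage is zero and the group contributes no learning signal.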

Intended Use

HomeGuard-8B is intended for research and development on:

safety assessment for embodied agents
contextual risk identification in household tasks
grounded VLM reasoning with visual context
safe planning and downstream robotics pipelines

Usage

This repository contains the inference-ready model weights and tokenizer assets. A typical Transformers loading pattern is:

from transformers import AutoProcessor, Qwen3VLForConditionalGeneration

model_id = "Ursulalala/HomeGuard-8B"
processor = AutoProcessor.from_pretrained(model_id)
model = Qwen3VLForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)
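
Once `model` and `processor` are loaded, a single image-plus-instruction query might look like the sketch below. It reuses the `apply_chat_template` and `generate` calls from the standard Transformers image-text-to-text pattern. The helper names are illustrative, and the message layout is a plain single-turn chat, which is an assumption: the prompt format used by the authors is documented in the HomeGuard project repository and may differ.

```python
from typing import Any


def build_messages(image_url: str, instruction: str) -> list[dict[str, Any]]:
    """Return a single-turn chat with one image and one text instruction."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_url},
                {"type": "text", "text": instruction},
            ],
        }
    ]


def generate_reply(model, processor, messages, max_new_tokens: int = 512) -> str:
    """Run one generation pass and return only the newly generated text."""
    inputs = processor.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Slice off the prompt tokens so only the model's continuation is decoded.
    prompt_length = inputs["input_ids"].shape[-1]
    return processor.decode(outputs[0][prompt_length:], skip_special_tokens=True)
```

A call could then be `generate_reply(model, processor, build_messages(image_url, "Put the kettle on the stove."))`, where `image_url` points to a scene image and the instruction text is a hypothetical example. The `max_new_tokens` default of 512 is an arbitrary choice and may need to be raised if the reasoning trace is cut off.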

For full prompting, evaluation, and application examples, please refer to the HomeGuard project repository.

Resources

Paper: HomeGuard: VLM-based Embodied Safeguard for Identifying Contextual Risk in Household Task
Code: AI45Lab/HomeGuard
Dataset: Ursulalala/HomeSafe
Base model: Qwen/Qwen3-VL-8B-Thinking

Citation

If you use this model, please cite the HomeGuard paper:

@article{lu2026homeguard,
  title={HomeGuard: VLM-based Embodied Safeguard for Identifying Contextual Risk in Household Task},
  author={Lu, Xiaoya and Zhou, Yijin and Chen, Zeren and Wang, Ruocheng and Sima, Bingrui and Zhou, Enshen and Sheng, Lu and Liu, Dongrui and Shao, Jing},
  journal={arXiv preprint arXiv:2603.14367},
  year={2026}
}