HomeGuard-8B
HomeGuard-8B is an 8B-parameter vision-language safeguard model for identifying contextual risk in household tasks. It is introduced in the paper HomeGuard: VLM-based Embodied Safeguard for Identifying Contextual Risk in Household Task and is designed to help embodied agents detect subtle, implicit hazards that arise from environmental context rather than explicit malicious intent.
This checkpoint corresponds to the 8B step-RFT model used in the HomeGuard framework. It is built on top of Qwen3-VL-8B-Thinking and further optimized for grounded household risk reasoning with reinforcement fine-tuning.
Model Summary
HomeGuard focuses on scenarios where a seemingly benign instruction becomes unsafe because of object attributes, spatial relations, or latent environmental conditions.
Compared with generic VLMs, HomeGuard is specialized for:
- contextual risk identification
- grounded multimodal safety reasoning
- safety-aware support for downstream planning and trajectory generation
Training Recipe
This model is derived from Qwen3-VL-8B-Thinking and trained within the HomeGuard pipeline.
Training setup summarized from the released training configuration:
- Base model: Qwen/Qwen3-VL-8B-Thinking
- Training stage: step-level RFT + GRPO-style optimization in the HomeGuard pipeline
- Training data: HomeSafe
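To make the "GRPO-style" term above concrete, the sketch below shows the group-relative advantage normalization that GRPO-type methods generally use: several responses are sampled for one prompt, each receives a scalar reward, and each reward is standardized against its own group. This is a generic illustration and not the HomeGuard training code; the function name, the `eps` constant, and the example rewards are hypothetical.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Standardize each reward against the mean and std of its sampling group.

    Generic GRPO-style normalization; not taken from the HomeGuard code base.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    # eps keeps the division finite when every response gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled responses to one household-task prompt
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
```

Responses that score above their group's mean get a positive advantage and the rest a negative one, so no separate value network is required. How HomeGuard defines its step-level rewards is described in the paper and repository, not here.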
Intended Use
HomeGuard-8B is intended for research and development on:
- safety assessment for embodied agents
- contextual risk identification in household tasks
- grounded VLM reasoning with visual context
- safe planning and downstream robotics pipelines
Usage
This repository contains the inference-ready model weights and tokenizer assets. A typical Transformers loading pattern is:
from transformers import AutoProcessor, Qwen3VLForConditionalGeneration

model_id = "Ursulalala/HomeGuard-8B"
processor = AutoProcessor.from_pretrained(model_id)
model = Qwen3VLForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)
For full prompting, evaluation, and application examples, please refer to the HomeGuard project repository.
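As a rough end-to-end illustration, the sketch below extends the loading pattern to a single image-plus-instruction query. It rests on several assumptions: the helper names (`build_messages`, `split_thinking`, `assess`) are hypothetical, the plain instruction text is a placeholder rather than the official HomeGuard prompt, and the model is assumed to close its reasoning with a `</think>` tag as Qwen3 thinking-mode checkpoints typically do. Refer to the project repository for the actual prompt format.

```python
from typing import Any


def build_messages(image: str, instruction: str) -> list[dict[str, Any]]:
    """Build a single-turn chat message with one image and one text instruction."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image},
                {"type": "text", "text": instruction},
            ],
        }
    ]


def split_thinking(text: str) -> tuple[str, str]:
    """Split decoded output into (reasoning, answer) at the last </think> tag.

    Assumes thinking-style output; returns empty reasoning if no tag is found.
    """
    reasoning, sep, answer = text.rpartition("</think>")
    if not sep:
        return "", text.strip()
    return reasoning.replace("<think>", "").strip(), answer.strip()


def assess(image: str, instruction: str, max_new_tokens: int = 1024) -> tuple[str, str]:
    """Load the model and generate a response for one image and instruction."""
    # Heavy imports are deferred so the helpers above work without transformers.
    from transformers import AutoProcessor, Qwen3VLForConditionalGeneration

    model_id = "Ursulalala/HomeGuard-8B"
    processor = AutoProcessor.from_pretrained(model_id)
    model = Qwen3VLForConditionalGeneration.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    inputs = processor.apply_chat_template(
        build_messages(image, instruction),
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return split_thinking(processor.decode(new_tokens, skip_special_tokens=True))
```

A call might look like `reasoning, answer = assess("https://example.com/kitchen.jpg", "Heat the soup in the microwave.")`, where the URL is a placeholder. Thinking-style models can produce long reasoning traces, so `max_new_tokens` may need to be raised if the answer is cut off.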
Resources
- Paper: HomeGuard: VLM-based Embodied Safeguard for Identifying Contextual Risk in Household Task
- Code: AI45Lab/HomeGuard
- Dataset: Ursulalala/HomeSafe
- Base model: Qwen/Qwen3-VL-8B-Thinking
Citation
If you use this model, please cite the HomeGuard paper:
@article{lu2026homeguard,
  title={HomeGuard: VLM-based Embodied Safeguard for Identifying Contextual Risk in Household Task},
  author={Lu, Xiaoya and Zhou, Yijin and Chen, Zeren and Wang, Ruocheng and Sima, Bingrui and Zhou, Enshen and Sheng, Lu and Liu, Dongrui and Shao, Jing},
  journal={arXiv preprint arXiv:2603.14367},
  year={2026}
}