Yang Chung

Update with correct dataset links

6c2bd88 9 days ago

	---
	title: AI Safety Datasets Overview
	emoji: 🛡️
	colorFrom: red
	colorTo: pink
	sdk: static
	pinned: false
	license: cc-by-nc-4.0
	short_description: AI safety datasets with adversarial conversations
	tags:
	- safety
	- adversarial
	- red-teaming
	- ai-safety
	- multi-turn
	- synthetic
	datasets:
	- GoJulyAI/multi-turn-conversations
	- GoJulyAI/multi-turn-bio-transformed-synth-conversations-v1
	- GoJulyAI/multi-turn-bio-transformed-synth-conversations-v2
	---

	# 🛡️ AI Safety Datasets Collection

	Comprehensive evaluation datasets for testing AI model safety mechanisms

	## 📊 Dataset Collection Summary

	| Metric | Value |
	|--------|-------|
	| Total Conversations | 849+ |
	| Total Turns | 6,694+ |
	| Dataset Types | 3 complementary methodologies |
	| Sample Data Available | 150 free conversations |

	## 📈 Full Dataset Statistics

	| Dataset | Conversations | Turns | Avg Turns/Conv | Focus |
	|---------|---------------|-------|----------------|-------|
	| Psychology multi-turn | 184+ | 1,964+ | 10.3 | Psychological harms, including self-harm, psychosis, and anthropomorphism |
	| Illicit (bioweapon) multi-turn | 84+ | 822+ | 9.8 | Biosafety harms, including bioweapons and pathogens |
	| Illicit (chemical, general) multi-turn | 581+ | 3,908+ | 6.7 | Non-bio safety harms, including chemical weapons and cyber threats |
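
As a quick consistency check, the per-dataset figures above can be summed and compared with the collection summary. This is a minimal sketch that hard-codes the published lower-bound counts (the `+` suffixes are dropped), so the computed averages are approximations. The dictionary keys are shorthand labels chosen for this example, not official dataset names.

```python
# Published lower-bound counts from the statistics table ("+" suffixes dropped).
# The keys are illustrative labels, not official dataset identifiers.
STATS = {
    "psychology": {"conversations": 184, "turns": 1964},
    "illicit_bio": {"conversations": 84, "turns": 822},
    "illicit_general": {"conversations": 581, "turns": 3908},
}

# Collection-wide totals, to compare against the summary table.
total_conversations = sum(row["conversations"] for row in STATS.values())
total_turns = sum(row["turns"] for row in STATS.values())


def avg_turns(row):
    """Average turns per conversation, rounded to one decimal place."""
    return round(row["turns"] / row["conversations"], 1)


print(total_conversations, total_turns)
for name, row in STATS.items():
    print(name, avg_turns(row))
```

The totals agree with the summary table (849 conversations, 6,694 turns). Because the inputs are lower bounds, an average computed this way may differ from the published value, which may have been derived from exact counts.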

	## 🔗 Access Datasets on Hugging Face

	### Psychology Multi-turn Conversations
	Psychological harms, including self-harm, psychosis, and anthropomorphism.
	Sample: 5 conversations

	🔗 [View Dataset](https://huggingface.co/datasets/GoJulyAI/psychology-multi-turn)

	### Illicit (bioweapon) Multi-turn Conversations
	Biosafety harms, including bioweapons and pathogens.
	Sample: 5 conversations

	🔗 [View Dataset](https://huggingface.co/datasets/GoJulyAI/illicit-bio-multi-turn/)

	### Illicit (chemical, general) Multi-turn Conversations
	Non-bio safety harms, including chemical weapons and cyber threats.
	Sample: 5 conversations

	🔗 [View Dataset](https://huggingface.co/datasets/GoJulyAI/illicit-general-multi-turn)
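
The sample repositories linked above can, in principle, be loaded with the Hugging Face `datasets` library. The sketch below assumes that library is installed and that the repositories are readable with your credentials. The helper names `repo_id_from_url` and `load_sample` are illustrative, and the split and column names are not documented in this README, so inspect the returned object before relying on any field.

```python
from urllib.parse import urlparse

# Dataset page URLs as listed in this README.
DATASET_URLS = [
    "https://huggingface.co/datasets/GoJulyAI/psychology-multi-turn",
    "https://huggingface.co/datasets/GoJulyAI/illicit-bio-multi-turn/",
    "https://huggingface.co/datasets/GoJulyAI/illicit-general-multi-turn",
]


def repo_id_from_url(url):
    """Turn a dataset page URL into an 'owner/name' repository ID."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "datasets":
        raise ValueError(f"not a dataset URL: {url}")
    return f"{parts[1]}/{parts[2]}"


def load_sample(url):
    """Download one sample dataset (needs network access and `datasets`)."""
    # Imported lazily so the URL helper works without the third-party package.
    from datasets import load_dataset
    return load_dataset(repo_id_from_url(url))


REPO_IDS = [repo_id_from_url(u) for u in DATASET_URLS]
print(REPO_IDS)
```

Calling `load_sample(DATASET_URLS[0])` returns whatever `load_dataset` yields for that repository, typically a `DatasetDict` keyed by split name.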

	## ⚠️ Ethical Considerations

	⚠️ IMPORTANT: These datasets contain successful adversarial attacks and harmful content.

	### ✅ Intended Use
	- Defensive security research
	- AI safety evaluation and improvement
	- Academic research on adversarial robustness
	- Training safety and moderation systems

	### ❌ Prohibited Use
	- Creating offensive content
	- Developing attack tools for malicious purposes
	- Bypassing safety systems for harm
	- Any use that violates laws or ethical guidelines

	## 🎯 Data Selection Process

	All datasets are drawn from validated conversations that passed strict quality filters, including NeurIPS evaluation protocols.

	### Base Criteria
	- Text-based objectives (no code execution templates)
	- Verdict: `success` (harmful requests successfully fulfilled)
	- Multi-turn conversations with prompt-response pairs

	### Psychology-Specific Criteria
	- Organic conversations (`organicity = true`)
	- Successfully elicited harmful psychology-related content

	### Illicit-Specific Criteria
	- Contains specific instruction details
	- Practically executable (not abstract)
	- Successfully elicited harmful illicit-related content
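
The criteria above amount to a record-level filter. The following is a minimal sketch, not the authors' pipeline: only the `verdict` value `success` and the `organicity` flag appear in this README, while the record layout and the field names `id`, `category`, `text_based`, `turns`, `has_specific_details`, and `practically_executable` are hypothetical. It also assumes that "multi-turn" means more than one prompt-response pair and that a `success` verdict covers the "successfully elicited" criteria.

```python
def passes_base(rec):
    """Base criteria: text-based objective, successful verdict, multi-turn."""
    return bool(
        rec.get("text_based", False)          # hypothetical field
        and rec.get("verdict") == "success"   # value named in the README
        and len(rec.get("turns", [])) > 1     # assumed meaning of multi-turn
    )


def passes_selection(rec):
    """Apply the base criteria plus the category-specific ones."""
    if not passes_base(rec):
        return False
    if rec.get("category") == "psychology":   # hypothetical field
        return rec.get("organicity") is True
    if rec.get("category") == "illicit":
        return bool(rec.get("has_specific_details")) and bool(
            rec.get("practically_executable")
        )
    return False


# Placeholder records; the turn contents are dummy strings.
TURNS = [("prompt 1", "response 1"), ("prompt 2", "response 2")]
records = [
    {"id": "a", "category": "psychology", "text_based": True,
     "verdict": "success", "organicity": True, "turns": TURNS},
    {"id": "b", "category": "psychology", "text_based": True,
     "verdict": "success", "organicity": False, "turns": TURNS},
    {"id": "c", "category": "illicit", "text_based": True,
     "verdict": "failure", "has_specific_details": True,
     "practically_executable": True, "turns": TURNS},
    {"id": "d", "category": "illicit", "text_based": True,
     "verdict": "success", "has_specific_details": True,
     "practically_executable": True, "turns": TURNS},
]

selected = [r["id"] for r in records if passes_selection(r)]
print(selected)
```

Here record `b` is dropped for not being organic and record `c` for its verdict, so only `a` and `d` pass.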

	## 📄 License

	Sample datasets are released under CC-BY-NC-4.0 (Creative Commons Attribution-NonCommercial 4.0 International).

	- ✅ Use for research and evaluation
	- ✅ Modify and build upon the data
	- ✅ Share with attribution
	- ❌ Commercial use without separate licensing

	## 💼 Full Dataset Access

	The sample datasets provide representative examples. Full datasets contain thousands of additional conversations with expanded harm categories and regular updates.

	Please contact us at [[email protected]](mailto:[email protected]) to purchase any or all of the full datasets.

	Include your research objectives, institutional affiliation, and intended use in your inquiry.

	---

	Last Updated: December 2, 2025

	For detailed documentation, visit the individual dataset repositories on Hugging Face.