Gradio

🥇 TOFU Leaderboard

The TOFU dataset is a benchmark designed to evaluate the unlearning performance of large language models in realistic scenarios. This unique dataset consists of question-answer pairs that are based on the autobiographies of 200 fictitious authors, entirely generated by the GPT-4 model. The primary objective of this task is to effectively unlearn a fine-tuned model using different portions of the forget set. Read more at https://locuslab.github.io/tofu/.

Method	Submitted By	Epoch	Model Utility	Forget Quality	ROUGE Real Authors	Truth Ratio Real Authors	Prob. Real Authors	ROUGE Real World	Truth Ratio Real World	Prob. Real World	ROUGE Retain	Truth Ratio Retain	Prob. Retain	ROUGE Forget	Truth Ratio Forget	Prob. Forget
Finetune Model (WD = 0.01)	Baseline	-1	0.0740776537502228	1.0747499261825833e-13	0.1728333333333333	0.5706808914420933	0.5059025919974256	0.8974358974358975	0.5441982317366325	0.4143263304391235	0.9758111850816836	0.4709771340245287	1.2049720751682384e-32	0.4082436195223163	0.6740192657877031	1.3830246365938358e-23

Method

Submitted By

Epoch

Model Utility

Forget Quality

ROUGE Real Authors

Truth Ratio Real Authors

Prob. Real Authors

ROUGE Real World

Truth Ratio Real World

Prob. Real World

ROUGE Retain

Truth Ratio Retain

Prob. Retain

ROUGE Forget

Truth Ratio Forget

Prob. Forget

Finetune Model (WD = 0.01)

Baseline

-1

0.0740776537502228

1.0747499261825833e-13

0.1728333333333333

0.5706808914420933

0.5059025919974256

0.8974358974358975

0.5441982317366325

0.4143263304391235

0.9758111850816836

0.4709771340245287

1.2049720751682384e-32

0.4082436195223163

0.6740192657877031

1.3830246365938358e-23

Method	Submitted By	Epoch	Model Utility	Forget Quality	ROUGE Real Authors	Truth Ratio Real Authors	Prob. Real Authors	ROUGE Real World	Truth Ratio Real World	Prob. Real World	ROUGE Retain	Truth Ratio Retain	Prob. Retain	ROUGE Forget	Truth Ratio Forget	Prob. Forget
Retain Model (WD = 0.01)	Baseline	-1	0.613744995233942	1	0.923	0.5706808914420933	0.434087175400151	0.8974358974358975	0.5441982317366325	0.4143263304391235	0.9758111850816836	0.4709771340245287	0.988896727508374	0.4082436195223163	0.6740192657877031	0.1475628122566954
Grad. Diff. (WD = 0.01)	Baseline	2	0.0740776537502228	0.0003339706975929	0.1728333333333333	0.6423979530486502	0.5059025919974256	0.6548433048433049	0.6295666352270255	0.4655615838599209	0.0870156615843273	0.4633503218832769	0.010722586958864	0.0668923800783858	0.5355421828523081	0.0003759038991735
Pref. Opt. (WD = 0.01)	Baseline	2	0.3512721525289908	1.0747499261825833e-13	0.1039999999999999	0.5059030972392077	0.4028677904694676	0.6638176638176637	0.4790720066598873	0.3912257779575284	0.5211568349191114	0.42691351412792	0.8804135552213584	0.0483068824382606	0.5800607255926354	0.8276830612396768
Pref. Opt. (WD = 0.01)	Baseline	1	0.2558894244122359	7.382393651222541e-15	0.1886666666666666	0.5684018498993254	0.4395374701560491	0.7307692307692307	0.5278176097731178	0.433563820244969	0.0602643669032138	0.412807120748979	0.7992467555279746	0.0094441551186385	0.6046942846055752	0.7694941444129995
Grad. Ascent (WD = 0.01)	Baseline	1	0.6269273964348965	2.194274302189124e-16	0.943	0.6591336231061837	0.505168359434624	0.9216524216524216	0.6085091204510102	0.478685379756773	0.6967473664683093	0.4295557937069344	0.8210499032114303	0.59040885036102	0.5949198076291524	0.7389353978950747
KL Min. (WD = 0.01)	Baseline	1	0.6304735070793996	1.0619204030347364e-16	0.943	0.6583247647991428	0.5044385586857676	0.9109686609686608	0.6074096882352636	0.478119760287955	0.7232456444580733	0.4343367336828946	0.8385607306197629	0.6127124726456656	0.5896249303856661	0.754144674131171
Pref. Opt. (WD = 0.01)	Baseline	4	0.5389479912216734	1.0619204030347364e-16	0.5446666666666666	0.5068122645039701	0.405404689693547	0.8304843304843305	0.4850306505721883	0.3924823244222029	0.7452735492064857	0.4548787502012498	0.9349978059277404	0.0917418033267073	0.5476428474219546	0.8493405039270879
Pref. Opt. (WD = 0.01)	Baseline	5	0.5356521285408038	1.0619204030347364e-16	0.5046666666666667	0.5058335484988348	0.4039233875785541	0.8475783475783475	0.482302382394082	0.3908568584903417	0.7712388184133889	0.4550135942990296	0.9418515334298316	0.0565269012889234	0.5469937132807363	0.8480107162631808
Pref. Opt. (WD = 0.01)	Baseline	3	0.526714283710676	1.0619204030347364e-16	0.5613333333333334	0.4980178657889795	0.3998805654069662	0.8304843304843305	0.4773895327946352	0.3885011303782734	0.6177300729189973	0.4512840621257767	0.9180065658746102	0.122411309310708	0.5555959379029551	0.8500111371625925
Grad. Diff. (WD = 0.01)	Baseline	3	0.5185060305234307	5.100356615560927e-17	0.6859999999999999	0.7636217588364563	0.6063370989010282	0.8162393162393162	0.65074307359923	0.4930695237013233	0.3213876620463802	0.5052906001871599	0.3267413314089894	0.003430819078085	0.7895774621500501	1.3830246365938358e-23
Grad. Diff. (WD = 0.01)	Baseline	1	0.6282003341510932	2.1857531667946302e-20	0.913	0.6164888867223018	0.4736271415687449	0.8888888888888888	0.551145594485676	0.4370257029777569	0.9094457692299228	0.4694987259572409	0.9608036367802	0.757231894245776	0.5307776556603724	0.8870737428383404
Finetune Model (WD = 0.01)	Baseline	-1	0.6226773637427153	1.834066410994743e-21	0.933	0.5962289157077106	0.4554820778376288	0.8824786324786325	0.5390328416393139	0.4185618075594897	0.9856545937933034	0.4746985880106376	0.9895272273969072	0.985449693916369	0.5159854212808592	0.9909385566403212
Grad. Diff. (WD = 0.01)	Baseline	4	0.5822698363612353	1.042764202128099e-23	0.8013333333333332	0.7451299751459344	0.5837382529250267	0.8888888888888888	0.6228707709540217	0.4723969792394359	0.4450434199989012	0.4950364944867823	0.4903574184702363	0.003430819078085	0.8161530228616595	1.1513781349839877e-22
Grad. Diff. (WD = 0.01)	Baseline	5	0.587217121099408	2.8288467798247926e-25	0.8073333333333332	0.7298685519957755	0.563388139474065	0.8888888888888888	0.6061317353753678	0.4636778477857274	0.4649649961594466	0.4920855321780907	0.5469097745811741	0.0024437389747258	0.8220082189318875	1.470487341086269e-31
Grad. Ascent (WD = 0.01)	Baseline	5	0	1.4334261764269554e-22	0	0.3680370239716012	0.2320940116256572	0	0.3829973318695422	0.2393206807027449	0	0.0763287756708789	4.927488202135739e-36	0.0017443373987056	0.8470718835084912	7.700348491069903e-36
Grad. Ascent (WD = 0.01)	Baseline	4	0	6.813596151995703e-39	0	0.3797607456361624	0.2390976121995255	0	0.3799166078023295	0.2467986361500367	0	0.0469579747028106	1.2049720751682384e-32	0.0019426132607746	0.8929100638357892	7.488727584100663e-33
Grad. Ascent (WD = 0.01)	Baseline	3	0	1.415348597545954e-32	0	0.3824854432733555	0.2413009139354225	0	0.3914648206005994	0.2603007453584722	0	0.0444975652237022	2.9947610381115934e-29	0.0020288719332402	0.8856404556865698	1.81826029512634e-27
Grad. Ascent (WD = 0.01)	Baseline	2	0	1.1617214067615611e-31	0	0.1553563080897968	0.2483656568638272	0.0071225071225071	0.1612685736925218	0.2603793674222422	0.0001234567901234	0.0454129375760216	0.00006790053646828093	0.00040259009009	0.9124060717988668	0.00006761970324755308
KL Min. (WD = 0.01)	Baseline	2	0	2.194013320306139e-36	0	0.2750405914586382	0.2841370324906028	0.0527065527065527	0.3633038733434796	0.3452071481772765	0.0038661287905868	0.0396752806466009	0.00007768152001241951	0.0012240016230521	0.9324141399071828	0.00007538987834516267
KL Min. (WD = 0.01)	Baseline	3	0	2.8288467798247926e-25	0	0.4330048483658166	0.2727739697772781	0	0.5057846821528144	0.356455110823806	0	0.0756420395440419	1.8311706089659576e-27	0	0.8450925399429845	8.780599407889728e-28
KL Min. (WD = 0.01)	Baseline	4	0	5.226994065745332e-29	0	0.3857580535369662	0.2771577879790345	0	0.3951193251138081	0.2595325279605469	0	0.0665580331847488	1.8530824390659888e-31	0	0.8664332905428379	9.960029999179614e-32
KL Min. (WD = 0.01)	Baseline	5	0	4.072551448422551e-32	0	0.3858264714903051	0.2728542324830574	0	0.3806645545894216	0.2482291318408188	0	0.0642338539246564	3.8433879342761596e-33	0	0.8642358258380053	1.3237458067136124e-33

Method

Submitted By

Epoch

Model Utility

Forget Quality

ROUGE Real Authors

Truth Ratio Real Authors

Prob. Real Authors

ROUGE Real World

Truth Ratio Real World

Prob. Real World

ROUGE Retain

Truth Ratio Retain

Prob. Retain

ROUGE Forget

Truth Ratio Forget

Prob. Forget

Retain Model (WD = 0.01)

Baseline

-1

0.613744995233942

0.923

0.5706808914420933

0.434087175400151

0.8974358974358975

0.5441982317366325

0.4143263304391235

0.9758111850816836

0.4709771340245287

0.988896727508374

0.4082436195223163

0.6740192657877031

0.1475628122566954

Grad. Diff. (WD = 0.01)

Baseline

0.0740776537502228

0.0003339706975929

0.1728333333333333

0.6423979530486502

0.5059025919974256

0.6548433048433049

0.6295666352270255

0.4655615838599209

0.0870156615843273

0.4633503218832769

0.010722586958864

0.0668923800783858

0.5355421828523081

0.0003759038991735

Pref. Opt. (WD = 0.01)

Baseline

0.3512721525289908

1.0747499261825833e-13

0.1039999999999999

0.5059030972392077

0.4028677904694676

0.6638176638176637

0.4790720066598873

0.3912257779575284

0.5211568349191114

0.42691351412792

0.8804135552213584

0.0483068824382606

0.5800607255926354

0.8276830612396768

Pref. Opt. (WD = 0.01)

Baseline

0.2558894244122359

7.382393651222541e-15

0.1886666666666666

0.5684018498993254

0.4395374701560491

0.7307692307692307

0.5278176097731178

0.433563820244969

0.0602643669032138

0.412807120748979

0.7992467555279746

0.0094441551186385

0.6046942846055752

0.7694941444129995

Grad. Ascent (WD = 0.01)

Baseline

0.6269273964348965

2.194274302189124e-16

0.943

0.6591336231061837

0.505168359434624

0.9216524216524216

0.6085091204510102

0.478685379756773

0.6967473664683093

0.4295557937069344

0.8210499032114303

0.59040885036102

0.5949198076291524

0.7389353978950747

KL Min. (WD = 0.01)

Baseline

0.6304735070793996

1.0619204030347364e-16

0.943

0.6583247647991428

0.5044385586857676

0.9109686609686608

0.6074096882352636

0.478119760287955

0.7232456444580733

0.4343367336828946

0.8385607306197629

0.6127124726456656

0.5896249303856661

0.754144674131171

Pref. Opt. (WD = 0.01)

Baseline

0.5389479912216734

1.0619204030347364e-16

0.5446666666666666

0.5068122645039701

0.405404689693547

0.8304843304843305

0.4850306505721883

0.3924823244222029

0.7452735492064857

0.4548787502012498

0.9349978059277404

0.0917418033267073

0.5476428474219546

0.8493405039270879

Pref. Opt. (WD = 0.01)

Baseline

0.5356521285408038

1.0619204030347364e-16

0.5046666666666667

0.5058335484988348

0.4039233875785541

0.8475783475783475

0.482302382394082

0.3908568584903417

0.7712388184133889

0.4550135942990296

0.9418515334298316

0.0565269012889234

0.5469937132807363

0.8480107162631808

Pref. Opt. (WD = 0.01)

Baseline

0.526714283710676

1.0619204030347364e-16

0.5613333333333334

0.4980178657889795

0.3998805654069662

0.8304843304843305

0.4773895327946352

0.3885011303782734

0.6177300729189973

0.4512840621257767

0.9180065658746102

0.122411309310708

0.5555959379029551

0.8500111371625925

Grad. Diff. (WD = 0.01)

Baseline

0.5185060305234307

5.100356615560927e-17

0.6859999999999999

0.7636217588364563

0.6063370989010282

0.8162393162393162

0.65074307359923

0.4930695237013233

0.3213876620463802

0.5052906001871599

0.3267413314089894

0.003430819078085

0.7895774621500501

1.3830246365938358e-23

Grad. Diff. (WD = 0.01)

Baseline

0.6282003341510932

2.1857531667946302e-20

0.913

0.6164888867223018

0.4736271415687449

0.8888888888888888

0.551145594485676

0.4370257029777569

0.9094457692299228

0.4694987259572409

0.9608036367802

0.757231894245776

0.5307776556603724

0.8870737428383404

Finetune Model (WD = 0.01)

Baseline

-1

0.6226773637427153

1.834066410994743e-21

0.933

0.5962289157077106

0.4554820778376288

0.8824786324786325

0.5390328416393139

0.4185618075594897

0.9856545937933034

0.4746985880106376

0.9895272273969072

0.985449693916369

0.5159854212808592

0.9909385566403212

Grad. Diff. (WD = 0.01)

Baseline

0.5822698363612353

1.042764202128099e-23

0.8013333333333332

0.7451299751459344

0.5837382529250267

0.8888888888888888

0.6228707709540217

0.4723969792394359

0.4450434199989012

0.4950364944867823

0.4903574184702363

0.003430819078085

0.8161530228616595

1.1513781349839877e-22

Grad. Diff. (WD = 0.01)

Baseline

0.587217121099408

2.8288467798247926e-25

0.8073333333333332

0.7298685519957755

0.563388139474065

0.8888888888888888

0.6061317353753678

0.4636778477857274

0.4649649961594466

0.4920855321780907

0.5469097745811741

0.0024437389747258

0.8220082189318875

1.470487341086269e-31

Grad. Ascent (WD = 0.01)

Baseline

1.4334261764269554e-22

0.3680370239716012

0.2320940116256572

0.3829973318695422

0.2393206807027449

0.0763287756708789

4.927488202135739e-36

0.0017443373987056

0.8470718835084912

7.700348491069903e-36

Grad. Ascent (WD = 0.01)

Baseline

6.813596151995703e-39

0.3797607456361624

0.2390976121995255

0.3799166078023295

0.2467986361500367

0.0469579747028106

1.2049720751682384e-32

0.0019426132607746

0.8929100638357892

7.488727584100663e-33

Grad. Ascent (WD = 0.01)

Baseline

1.415348597545954e-32

0.3824854432733555

0.2413009139354225

0.3914648206005994

0.2603007453584722

0.0444975652237022

2.9947610381115934e-29

0.0020288719332402

0.8856404556865698

1.81826029512634e-27

Grad. Ascent (WD = 0.01)

Baseline

1.1617214067615611e-31

0.1553563080897968

0.2483656568638272

0.0071225071225071

0.1612685736925218

0.2603793674222422

0.0001234567901234

0.0454129375760216

0.00006790053646828093

0.00040259009009

0.9124060717988668

0.00006761970324755308

KL Min. (WD = 0.01)

Baseline

2.194013320306139e-36

0.2750405914586382

0.2841370324906028

0.0527065527065527

0.3633038733434796

0.3452071481772765

0.0038661287905868

0.0396752806466009

0.00007768152001241951

0.0012240016230521

0.9324141399071828

0.00007538987834516267

KL Min. (WD = 0.01)

Baseline

2.8288467798247926e-25

0.4330048483658166

0.2727739697772781

0.5057846821528144

0.356455110823806

0.0756420395440419

1.8311706089659576e-27

0.8450925399429845

8.780599407889728e-28

KL Min. (WD = 0.01)

Baseline

5.226994065745332e-29

0.3857580535369662

0.2771577879790345

0.3951193251138081

0.2595325279605469

0.0665580331847488

1.8530824390659888e-31

0.8664332905428379

9.960029999179614e-32

KL Min. (WD = 0.01)

Baseline

4.072551448422551e-32

0.3858264714903051

0.2728542324830574

0.3806645545894216

0.2482291318408188

0.0642338539246564

3.8433879342761596e-33

0.8642358258380053

1.3237458067136124e-33

Quick Links

Website: The landing page for TOFU
arXiv Paper: Detailed information about the TOFU dataset and its significance in unlearning tasks.
GitHub Repository: Access the source code, fine-tuning scripts, and additional resources for the TOFU dataset.
Dataset on Hugging Face: Direct link to download the TOFU dataset.
Leaderboard on Hugging Face Spaces: Current rankings and submissions for the TOFU dataset challenges.
Summary on Twitter: A concise summary and key takeaways from the project.

Applicability 🚀

The dataset is in QA format, making it ideal for use with popular chat models such as Llama2, Mistral, or Qwen. However, it also works for any other large language model. The corresponding code base is written for the Llama2 model, but can be easily adapted to other models.

Installation

conda create -n tofu python=3.10
conda activate tofu
conda install pytorch pytorch-cuda=11.8 -c pytorch -c nvidia
conda install -c "nvidia/label/cuda-11.8.0" cuda-toolkit
pip install -r requirements.txt

Loading the Dataset

To load the dataset, use the following code:

from datasets import load_dataset
dataset = load_dataset("locuslab/TOFU","full")