Hey @navissivan and @ksoky,
- Don’t worry about the use_cache warning, it just means that we cannot use the k,v cache for the attention mechanism with gradient checkpointing. If you want to disable the warning, load the model and then set use_cache to False:
from transformers import WhisperForConditionalGeneration

model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
model.config.use_cache = False
The operation of the model is the same with and without the cache - we just use the cache to speed up decoding. The cache isn’t compatible with gradient checkpointing, so the Trainer disables it and shows this warning instead.
- It shouldn’t stay idle for that long - usually this happens when we set group_by_length=True but haven’t specified input_lengths in our prepare_dataset function. Have you modified the prepare_dataset function? Could you make sure the dataset that you pass to the trainer has the input_lengths column?
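For reference, here’s a minimal sketch of what that can look like - the processor and the audio/sentence column names are placeholders for whatever your own script uses, the key part is the input_lengths column:

def prepare_dataset(batch):
    audio = batch["audio"]

    # compute log-Mel input features from the raw audio array
    batch["input_features"] = processor.feature_extractor(
        audio["array"], sampling_rate=audio["sampling_rate"]
    ).input_features[0]

    # length of the raw audio in samples - this is the column group_by_length needs
    batch["input_lengths"] = len(audio["array"])

    # encode the target transcription to label ids
    batch["labels"] = processor.tokenizer(batch["sentence"]).input_ids
    return batch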
- A progress bar should show - you need to set disable_tqdm=False in your training args.
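For example (the output_dir here is just a placeholder - keep the rest of your arguments as they are):

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-small-finetuned",  # placeholder output directory
    disable_tqdm=False,  # show the tqdm progress bar during training and evaluation
    # ... the rest of your training arguments ...
)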
You have a couple of options for running it in the background:
- tmux: call tmux and then run Jupyter notebooks from the tmux shell:
tmux new -s mysession
jupyter lab
Then carry on as normal. The process will continue running even when you close your shell. When you re-open your shell, you can reattach with:
tmux a -t mysession
Check out the tmux docs for more info.
- The other option is to export the ipynb notebook as a Python script and run that instead: from File → Export Notebook As… in the Jupyter Lab menu, select ‘Export Notebook to Executable Script’. This will give you a Python script to download. Then run it with tmux (as above) or nohup:
nohup python fine-tuning-whisper.py &
You can open a new window to view the output:
vim nohup.out
- The table is generated automatically by the Trainer if you perform evaluation over the course of training.
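For example, setting an evaluation strategy in your training arguments along these lines will run evaluation periodically and populate that table (the eval_steps value is just an example):

training_args = Seq2SeqTrainingArguments(
    output_dir="/home/sivan/whisper_base_fl_ch",
    evaluation_strategy="steps",   # run evaluation during training
    eval_steps=1000,               # evaluate every 1000 training steps (example value)
    predict_with_generate=True,    # generate transcriptions so WER can be computed
    # ... the rest of your training arguments ...
)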
- It’s possible. The model checkpoint saved at step 1000 is stored in the output directory under:
/home/sivan/whisper_base_fl_ch/checkpoint-1000
You can load the model from the checkpoint saved at step 1000 as follows:
model = WhisperForConditionalGeneration.from_pretrained("/home/sivan/whisper_base_fl_ch/checkpoint-1000")
You can then run a validation step:
from transformers import Seq2SeqTrainingArguments, Seq2SeqTrainer
training_args = Seq2SeqTrainingArguments(
    output_dir="/home/sivan/whisper_base_fl_ch/validation_step",
    do_train=False,               # evaluation only, no training
    do_eval=True,
    per_device_eval_batch_size=8,
    predict_with_generate=True,   # generate transcriptions so WER can be computed
    generation_max_length=225,
    save_strategy="no",
    report_to=["tensorboard"],
    push_to_hub=False,
    disable_tqdm=False,
)

trainer = Seq2SeqTrainer(
    args=training_args,
    model=model,
    eval_dataset=fleurs_ch["validation"],  # set to your val set
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    tokenizer=processor.feature_extractor,
)
trainer.evaluate()
You can then repeat this for the checkpoints in directories checkpoint-2000, checkpoint-3000 and so on.
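If you’d rather not do that by hand, a small loop along these lines should work, re-using the training_args, data_collator, compute_metrics and processor from above (and assuming your checkpoints were saved every 1000 steps):

for step in [1000, 2000, 3000]:
    checkpoint_dir = f"/home/sivan/whisper_base_fl_ch/checkpoint-{step}"
    model = WhisperForConditionalGeneration.from_pretrained(checkpoint_dir)
    trainer = Seq2SeqTrainer(
        args=training_args,
        model=model,
        eval_dataset=fleurs_ch["validation"],  # set to your val set
        data_collator=data_collator,
        compute_metrics=compute_metrics,
        tokenizer=processor.feature_extractor,
    )
    # evaluate() returns a dict of metrics, e.g. eval_loss and eval_wer
    print(step, trainer.evaluate())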