Accelerate documentation
You are viewing v0.16.0 version. A newer version v1.14.0 is available.
Stateful Classes
Below are variations of a singleton class in the sense that all instances share the same state, which is initialized on the first instantiation.
These classes are immutable and store information about certain configurations or states.
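One common way to get this shared-state behavior in Python is to have every instance point at the same attribute dictionary. The sketch below is illustrative only: `SharedState`, its `mixed_precision` argument, and the `"no"` default are hypothetical names chosen for the example and are not Accelerate's implementation.

```python
class SharedState:
    """Hypothetical shared-state class (not part of Accelerate)."""

    # One dictionary owned by the class; every instance reuses it.
    _shared_state = {}

    def __init__(self, mixed_precision=None):
        # Point this instance's attribute storage at the shared dictionary.
        self.__dict__ = self._shared_state
        # Only the first instantiation is allowed to populate the state.
        if not self.initialized:
            self.mixed_precision = "no" if mixed_precision is None else mixed_precision

    @property
    def initialized(self):
        # The state counts as initialized once its attributes exist.
        return "mixed_precision" in self.__dict__


first = SharedState(mixed_precision="fp16")
second = SharedState(mixed_precision="bf16")  # argument ignored: state already set

print(first is second)          # distinct objects
print(second.mixed_precision)   # value from the first instantiation
```

Here `first` and `second` are different objects, yet both report the value set by the first instantiation, which is the sense of "singleton" used above.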
class accelerate.state.AcceleratorState
( mixed_precision: str = None, cpu: bool = False, dynamo_backend = None, deepspeed_plugin = None, fsdp_plugin = None, megatron_lm_plugin = None, _from_accelerator: bool = False, **kwargs )
Singleton class that has information about the current training environment.
Available attributes:
- device (torch.device): The device to use.
- distributed_type (DistributedType): The type of distributed environment currently in use.
- initialized (bool): Whether or not the AcceleratorState has been initialized from Accelerator.
- local_process_index (int): The index of the current process on the current server.
- mixed_precision (str): Whether or not the current script will use mixed precision, and if so, the type of mixed precision being performed.
- num_processes (int): The number of processes currently launched in parallel.
- process_index (int): The index of the current process.
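To make the index attributes concrete, the following sketch shows how values like `process_index`, `local_process_index`, and `num_processes` are typically consumed. The helper functions and the two-server layout are assumptions made for illustration; they are plain Python and do not call Accelerate.

```python
def shard_for_process(items, process_index, num_processes):
    """Return the items one process would handle under a simple strided split.

    Hypothetical helper: the arguments stand in for the values that
    process_index and num_processes would report.
    """
    if not 0 <= process_index < num_processes:
        raise ValueError("process_index must be in [0, num_processes)")
    return items[process_index::num_processes]


def is_main_process(process_index):
    # Exactly one process across all servers has index 0.
    return process_index == 0


def is_local_main_process(local_process_index):
    # One process per server has local index 0.
    return local_process_index == 0


# Assumed layout: 2 servers with 2 processes each, so 4 processes in total.
layout = [
    # (process_index, local_process_index)
    (0, 0), (1, 1),  # first server
    (2, 0), (3, 1),  # second server
]
samples = list(range(10))
shards = [shard_for_process(samples, p, len(layout)) for p, _ in layout]
main_count = sum(is_main_process(p) for p, _ in layout)
local_main_count = sum(is_local_main_process(lp) for _, lp in layout)
```

In this layout one process is the main process overall, while each server has its own local main process, which is the usual reason to distinguish the two indices (for example, logging once globally versus downloading a file once per machine).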
class accelerate.state.GradientState
Singleton class that has information related to gradient synchronization for gradient accumulation.
Available attributes:
- end_of_dataloader (bool): Whether we have reached the end of the current dataloader.
- remainder (int): The number of extra samples that were added from padding the dataloader.
- sync_gradients (bool): Whether the gradients should be synced across all devices.
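As a rough illustration of how `sync_gradients` and `end_of_dataloader` can interact during gradient accumulation, the sketch below syncs on every N-th batch and also on the final batch, so leftover accumulated gradients are not dropped. `AccumulationTracker` and `gradient_accumulation_steps` are hypothetical names for this example; this is a simplified model under those assumptions, not the library's code.

```python
class AccumulationTracker:
    """Hypothetical tracker deciding when gradients should be synced."""

    def __init__(self, gradient_accumulation_steps):
        if gradient_accumulation_steps < 1:
            raise ValueError("gradient_accumulation_steps must be >= 1")
        self.gradient_accumulation_steps = gradient_accumulation_steps
        self.step = 0
        self.sync_gradients = True

    def update(self, end_of_dataloader):
        if end_of_dataloader:
            # Always sync on the last batch and reset for the next epoch.
            self.step = 0
            self.sync_gradients = True
        else:
            # Otherwise sync only once every N batches.
            self.step += 1
            self.sync_gradients = self.step % self.gradient_accumulation_steps == 0
        return self.sync_gradients


# Simulate one epoch of 5 batches with accumulation over 3 batches.
tracker = AccumulationTracker(gradient_accumulation_steps=3)
num_batches = 5
decisions = [
    tracker.update(end_of_dataloader=(i == num_batches - 1))
    for i in range(num_batches)
]
print(decisions)
```

With five batches and three accumulation steps, the third batch triggers a sync as usual, and the fifth triggers one only because it is the end of the dataloader.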