# Auto Classes

In many cases, the architecture you want to use can be guessed from the name or path of the pretrained model you supply to the `from_pretrained()` method. The auto classes are here to do this job for you, so that you automatically retrieve the relevant model given the name/path to the pretrained weights/config/vocabulary.

Instantiating one of [AutoConfig](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoConfig), [AutoModel](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel), or [AutoTokenizer](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoTokenizer) will directly create a class of the relevant architecture. For instance,

```python
from transformers import AutoModel

model = AutoModel.from_pretrained("google-bert/bert-base-cased")
```

will create a model that is an instance of [BertModel](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertModel).

There is one class of `AutoModel` for each task, and for each backend (PyTorch, TensorFlow, or Flax).
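The task-specific auto classes resolve to the matching architecture in the same way. A minimal sketch that avoids a model download by building from a config object (`num_labels=3` is an illustrative choice):

```python
from transformers import AutoModelForSequenceClassification, BertConfig

# A BERT config implies the BERT architecture, so the task-specific
# auto class resolves to BertForSequenceClassification.
config = BertConfig(num_labels=3)
model = AutoModelForSequenceClassification.from_config(config)

print(type(model).__name__)  # BertForSequenceClassification
```

Note that `from_config()` only instantiates the architecture with random weights; use `from_pretrained()` when you want the pretrained checkpoint as well.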

## Extending the Auto Classes

Each of the auto classes has a method to be extended with your custom classes. For instance, if you have defined a custom model class `NewModel`, make sure you have a `NewModelConfig`, then you can add them to the auto classes like this:

```python
from transformers import AutoConfig, AutoModel

AutoConfig.register("new-model", NewModelConfig)
AutoModel.register(NewModelConfig, NewModel)
```

You will then be able to use the auto classes as you usually would!

If your `NewModelConfig` is a subclass of [PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), make sure its `model_type` attribute is set to the same key you use when registering the config (here `"new-model"`).

Likewise, if your `NewModel` is a subclass of [PreTrainedModel](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel), make sure its `config_class` attribute is set to the same class you use when registering the model (here `NewModelConfig`).
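Putting the pieces together, the registration flow can be sketched end to end (assuming PyTorch; the `hidden_size` attribute and the linear layer are illustrative, not part of any real API):

```python
import torch
from transformers import AutoConfig, AutoModel, PretrainedConfig, PreTrainedModel


class NewModelConfig(PretrainedConfig):
    # model_type must match the key passed to AutoConfig.register below
    model_type = "new-model"

    def __init__(self, hidden_size=64, **kwargs):
        self.hidden_size = hidden_size
        super().__init__(**kwargs)


class NewModel(PreTrainedModel):
    # config_class must match the config class passed to AutoModel.register below
    config_class = NewModelConfig

    def __init__(self, config):
        super().__init__(config)
        self.linear = torch.nn.Linear(config.hidden_size, config.hidden_size)

    def forward(self, x):
        return self.linear(x)


AutoConfig.register("new-model", NewModelConfig)
AutoModel.register(NewModelConfig, NewModel)

# The auto classes now resolve the custom config/model pair
config = AutoConfig.for_model("new-model", hidden_size=32)
model = AutoModel.from_config(config)
```

`AutoModel.register` checks that `NewModel.config_class` matches the config class you register it under, so mismatched attributes fail fast rather than producing a wrong architecture later.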

## AutoConfig[[transformers.AutoConfig]]

#### transformers.AutoConfig[[transformers.AutoConfig]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/configuration_auto.py#L1207)

This is a generic configuration class that will be instantiated as one of the configuration classes of the library
when created with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoConfig.from_pretrained) class method.

This class cannot be instantiated directly using `__init__()` (throws an error).

##### from_pretrained[[transformers.AutoConfig.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/configuration_auto.py#L1230)

`from_pretrained(pretrained_model_name_or_path: Union[str, os.PathLike[str]], **kwargs)`

- **pretrained_model_name_or_path** (`str` or `os.PathLike`) --
  Can be either:

  - A string, the *model id* of a pretrained model configuration hosted inside a model repo on
    huggingface.co.
  - A path to a *directory* containing a configuration file saved using the
    [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.save_pretrained) method, or the [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) method,
    e.g., `./my_model_directory/`.
  - A path or url to a saved configuration JSON *file*, e.g.,
    `./my_model_directory/configuration.json`.
- **cache_dir** (`str` or `os.PathLike`, *optional*) --
  Path to a directory in which a downloaded pretrained model configuration should be cached if the
  standard cache should not be used.
- **force_download** (`bool`, *optional*, defaults to `False`) --
  Whether or not to force a (re-)download of the model weights and configuration files, overriding the
  cached versions if they exist.
- **resume_download** --
  Deprecated and ignored. All downloads are now resumed by default when possible.
  Will be removed in v5 of Transformers.
- **proxies** (`dict[str, str]`, *optional*) --
  A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128',
  'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
- **revision** (`str`, *optional*, defaults to `"main"`) --
  The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a
  git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any
  identifier allowed by git.
- **return_unused_kwargs** (`bool`, *optional*, defaults to `False`) --
  If `False`, then this function returns just the final configuration object.

  If `True`, then this function returns a `Tuple(config, unused_kwargs)` where *unused_kwargs* is a
  dictionary consisting of the key/value pairs whose keys are not configuration attributes: i.e., the
  part of `kwargs` which has not been used to update `config` and is otherwise ignored.
- **trust_remote_code** (`bool`, *optional*, defaults to `False`) --
  Whether or not to allow for custom models defined on the Hub in their own modeling files. This option
  should only be set to `True` for repositories you trust and in which you have read the code, as it will
  execute code present on the Hub on your local machine.
- **kwargs** (additional keyword arguments, *optional*) --
  The values in kwargs of any keys which are configuration attributes will be used to override the loaded
  values. Behavior concerning key/value pairs whose keys are *not* configuration attributes is controlled
  by the `return_unused_kwargs` keyword parameter.

Instantiate one of the configuration classes of the library from a pretrained model configuration.

The configuration class to instantiate is selected based on the `model_type` property of the config object that
is loaded, or when it's missing, by falling back to using pattern matching on `pretrained_model_name_or_path`:

- **aimv2** -- `Aimv2Config` (AIMv2 model)
- **aimv2_vision_model** -- `Aimv2VisionConfig` (Aimv2VisionModel model)
- **albert** -- [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) (ALBERT model)
- **align** -- [AlignConfig](/docs/transformers/v4.57.1/ja/model_doc/align#transformers.AlignConfig) (ALIGN model)
- **altclip** -- [AltCLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/altclip#transformers.AltCLIPConfig) (AltCLIP model)
- **apertus** -- `ApertusConfig` (Apertus model)
- **arcee** -- `ArceeConfig` (Arcee model)
- **aria** -- `AriaConfig` (Aria model)
- **aria_text** -- `AriaTextConfig` (AriaText model)
- **audio-spectrogram-transformer** -- [ASTConfig](/docs/transformers/v4.57.1/ja/model_doc/audio-spectrogram-transformer#transformers.ASTConfig) (Audio Spectrogram Transformer model)
- **autoformer** -- [AutoformerConfig](/docs/transformers/v4.57.1/ja/model_doc/autoformer#transformers.AutoformerConfig) (Autoformer model)
- **aya_vision** -- `AyaVisionConfig` (AyaVision model)
- **bamba** -- `BambaConfig` (Bamba model)
- **bark** -- [BarkConfig](/docs/transformers/v4.57.1/ja/model_doc/bark#transformers.BarkConfig) (Bark model)
- **bart** -- [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) (BART model)
- **beit** -- [BeitConfig](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitConfig) (BEiT model)
- **bert** -- [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) (BERT model)
- **bert-generation** -- [BertGenerationConfig](/docs/transformers/v4.57.1/ja/model_doc/bert-generation#transformers.BertGenerationConfig) (Bert Generation model)
- **big_bird** -- [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) (BigBird model)
- **bigbird_pegasus** -- [BigBirdPegasusConfig](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusConfig) (BigBird-Pegasus model)
- **biogpt** -- [BioGptConfig](/docs/transformers/v4.57.1/ja/model_doc/biogpt#transformers.BioGptConfig) (BioGpt model)
- **bit** -- [BitConfig](/docs/transformers/v4.57.1/ja/model_doc/bit#transformers.BitConfig) (BiT model)
- **bitnet** -- `BitNetConfig` (BitNet model)
- **blenderbot** -- [BlenderbotConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotConfig) (Blenderbot model)
- **blenderbot-small** -- [BlenderbotSmallConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallConfig) (BlenderbotSmall model)
- **blip** -- [BlipConfig](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipConfig) (BLIP model)
- **blip-2** -- [Blip2Config](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2Config) (BLIP-2 model)
- **blip_2_qformer** -- [Blip2QFormerConfig](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2QFormerConfig) (BLIP-2 QFormer model)
- **bloom** -- [BloomConfig](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomConfig) (BLOOM model)
- **blt** -- `BltConfig` (Blt model)
- **bridgetower** -- [BridgeTowerConfig](/docs/transformers/v4.57.1/ja/model_doc/bridgetower#transformers.BridgeTowerConfig) (BridgeTower model)
- **bros** -- [BrosConfig](/docs/transformers/v4.57.1/ja/model_doc/bros#transformers.BrosConfig) (BROS model)
- **camembert** -- [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) (CamemBERT model)
- **canine** -- [CanineConfig](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineConfig) (CANINE model)
- **chameleon** -- `ChameleonConfig` (Chameleon model)
- **chinese_clip** -- [ChineseCLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPConfig) (Chinese-CLIP model)
- **chinese_clip_vision_model** -- [ChineseCLIPVisionConfig](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPVisionConfig) (ChineseCLIPVisionModel model)
- **clap** -- [ClapConfig](/docs/transformers/v4.57.1/ja/model_doc/clap#transformers.ClapConfig) (CLAP model)
- **clip** -- [CLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPConfig) (CLIP model)
- **clip_text_model** -- [CLIPTextConfig](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTextConfig) (CLIPTextModel model)
- **clip_vision_model** -- [CLIPVisionConfig](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPVisionConfig) (CLIPVisionModel model)
- **clipseg** -- [CLIPSegConfig](/docs/transformers/v4.57.1/ja/model_doc/clipseg#transformers.CLIPSegConfig) (CLIPSeg model)
- **clvp** -- [ClvpConfig](/docs/transformers/v4.57.1/ja/model_doc/clvp#transformers.ClvpConfig) (CLVP model)
- **code_llama** -- `LlamaConfig` (CodeLlama model)
- **codegen** -- [CodeGenConfig](/docs/transformers/v4.57.1/ja/model_doc/codegen#transformers.CodeGenConfig) (CodeGen model)
- **cohere** -- `CohereConfig` (Cohere model)
- **cohere2** -- `Cohere2Config` (Cohere2 model)
- **cohere2_vision** -- `Cohere2VisionConfig` (Cohere2Vision model)
- **colpali** -- `ColPaliConfig` (ColPali model)
- **colqwen2** -- `ColQwen2Config` (ColQwen2 model)
- **conditional_detr** -- [ConditionalDetrConfig](/docs/transformers/v4.57.1/ja/model_doc/conditional_detr#transformers.ConditionalDetrConfig) (Conditional DETR model)
- **convbert** -- [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) (ConvBERT model)
- **convnext** -- [ConvNextConfig](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextConfig) (ConvNeXT model)
- **convnextv2** -- [ConvNextV2Config](/docs/transformers/v4.57.1/ja/model_doc/convnextv2#transformers.ConvNextV2Config) (ConvNeXTV2 model)
- **cpmant** -- [CpmAntConfig](/docs/transformers/v4.57.1/ja/model_doc/cpmant#transformers.CpmAntConfig) (CPM-Ant model)
- **csm** -- `CsmConfig` (CSM model)
- **ctrl** -- [CTRLConfig](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLConfig) (CTRL model)
- **cvt** -- [CvtConfig](/docs/transformers/v4.57.1/ja/model_doc/cvt#transformers.CvtConfig) (CvT model)
- **d_fine** -- `DFineConfig` (D-FINE model)
- **dab-detr** -- `DabDetrConfig` (DAB-DETR model)
- **dac** -- `DacConfig` (DAC model)
- **data2vec-audio** -- [Data2VecAudioConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioConfig) (Data2VecAudio model)
- **data2vec-text** -- [Data2VecTextConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextConfig) (Data2VecText model)
- **data2vec-vision** -- [Data2VecVisionConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecVisionConfig) (Data2VecVision model)
- **dbrx** -- `DbrxConfig` (DBRX model)
- **deberta** -- [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) (DeBERTa model)
- **deberta-v2** -- [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) (DeBERTa-v2 model)
- **decision_transformer** -- [DecisionTransformerConfig](/docs/transformers/v4.57.1/ja/model_doc/decision_transformer#transformers.DecisionTransformerConfig) (Decision Transformer model)
- **deepseek_v2** -- `DeepseekV2Config` (DeepSeek-V2 model)
- **deepseek_v3** -- `DeepseekV3Config` (DeepSeek-V3 model)
- **deepseek_vl** -- `DeepseekVLConfig` (DeepseekVL model)
- **deepseek_vl_hybrid** -- `DeepseekVLHybridConfig` (DeepseekVLHybrid model)
- **deformable_detr** -- [DeformableDetrConfig](/docs/transformers/v4.57.1/ja/model_doc/deformable_detr#transformers.DeformableDetrConfig) (Deformable DETR model)
- **deit** -- [DeiTConfig](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTConfig) (DeiT model)
- **depth_anything** -- `DepthAnythingConfig` (Depth Anything model)
- **depth_pro** -- `DepthProConfig` (DepthPro model)
- **deta** -- [DetaConfig](/docs/transformers/v4.57.1/ja/model_doc/deta#transformers.DetaConfig) (DETA model)
- **detr** -- [DetrConfig](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrConfig) (DETR model)
- **dia** -- `DiaConfig` (Dia model)
- **diffllama** -- `DiffLlamaConfig` (DiffLlama model)
- **dinat** -- [DinatConfig](/docs/transformers/v4.57.1/ja/model_doc/dinat#transformers.DinatConfig) (DiNAT model)
- **dinov2** -- `Dinov2Config` (DINOv2 model)
- **dinov2_with_registers** -- `Dinov2WithRegistersConfig` (DINOv2 with Registers model)
- **dinov3_convnext** -- `DINOv3ConvNextConfig` (DINOv3 ConvNext model)
- **dinov3_vit** -- `DINOv3ViTConfig` (DINOv3 ViT model)
- **distilbert** -- `DistilBertConfig` (DistilBERT model)
- **doge** -- `DogeConfig` (Doge model)
- **donut-swin** -- `DonutSwinConfig` (DonutSwin model)
- **dots1** -- `Dots1Config` (dots1 model)
- **dpr** -- `DPRConfig` (DPR model)
- **dpt** -- `DPTConfig` (DPT model)
- **edgetam** -- `EdgeTamConfig` (EdgeTAM model)
- **edgetam_video** -- `EdgeTamVideoConfig` (EdgeTamVideo model)
- **edgetam_vision_model** -- `EdgeTamVisionConfig` (EdgeTamVisionModel model)
- **efficientformer** -- `EfficientFormerConfig` (EfficientFormer model)
- **efficientloftr** -- `EfficientLoFTRConfig` (EfficientLoFTR model)
- **efficientnet** -- `EfficientNetConfig` (EfficientNet model)
- **electra** -- `ElectraConfig` (ELECTRA model)
- **emu3** -- `Emu3Config` (Emu3 model)
- **encodec** -- `EncodecConfig` (EnCodec model)
- **encoder-decoder** -- `EncoderDecoderConfig` (Encoder decoder model)
- **eomt** -- `EomtConfig` (EoMT model)
- **ernie** -- `ErnieConfig` (ERNIE model)
- **ernie4_5** -- `Ernie4_5Config` (Ernie4_5 model)
- **ernie4_5_moe** -- `Ernie4_5_MoeConfig` (Ernie4_5_MoE model)
- **ernie_m** -- `ErnieMConfig` (ErnieM model)
- **esm** -- `EsmConfig` (ESM model)
- **evolla** -- `EvollaConfig` (Evolla model)
- **exaone4** -- `Exaone4Config` (EXAONE-4.0 model)
- **falcon** -- `FalconConfig` (Falcon model)
- **falcon_h1** -- `FalconH1Config` (FalconH1 model)
- **falcon_mamba** -- `FalconMambaConfig` (FalconMamba model)
- **fastspeech2_conformer** -- `FastSpeech2ConformerConfig` (FastSpeech2Conformer model)
- **fastspeech2_conformer_with_hifigan** -- `FastSpeech2ConformerWithHifiGanConfig` (FastSpeech2ConformerWithHifiGan model)
- **flaubert** -- `FlaubertConfig` (FlauBERT model)
- **flava** -- `FlavaConfig` (FLAVA model)
- **flex_olmo** -- `FlexOlmoConfig` (FlexOlmo model)
- **florence2** -- `Florence2Config` (Florence2 model)
- **fnet** -- `FNetConfig` (FNet model)
- **focalnet** -- `FocalNetConfig` (FocalNet model)
- **fsmt** -- `FSMTConfig` (FairSeq Machine-Translation model)
- **funnel** -- `FunnelConfig` (Funnel Transformer model)
- **fuyu** -- `FuyuConfig` (Fuyu model)
- **gemma** -- `GemmaConfig` (Gemma model)
- **gemma2** -- `Gemma2Config` (Gemma2 model)
- **gemma3** -- `Gemma3Config` (Gemma3ForConditionalGeneration model)
- **gemma3_text** -- `Gemma3TextConfig` (Gemma3ForCausalLM model)
- **gemma3n** -- `Gemma3nConfig` (Gemma3nForConditionalGeneration model)
- **gemma3n_audio** -- `Gemma3nAudioConfig` (Gemma3nAudioEncoder model)
- **gemma3n_text** -- `Gemma3nTextConfig` (Gemma3nForCausalLM model)
- **gemma3n_vision** -- `Gemma3nVisionConfig` (TimmWrapperModel model)
- **git** -- `GitConfig` (GIT model)
- **glm** -- `GlmConfig` (GLM model)
- **glm4** -- `Glm4Config` (GLM4 model)
- **glm4_moe** -- `Glm4MoeConfig` (Glm4MoE model)
- **glm4v** -- `Glm4vConfig` (GLM4V model)
- **glm4v_moe** -- `Glm4vMoeConfig` (GLM4VMOE model)
- **glm4v_moe_text** -- `Glm4vMoeTextConfig` (GLM4VMOE model)
- **glm4v_text** -- `Glm4vTextConfig` (GLM4V model)
- **glpn** -- `GLPNConfig` (GLPN model)
- **got_ocr2** -- `GotOcr2Config` (GOT-OCR2 model)
- **gpt-sw3** -- `GPT2Config` (GPT-Sw3 model)
- **gpt2** -- `GPT2Config` (OpenAI GPT-2 model)
- **gpt_bigcode** -- `GPTBigCodeConfig` (GPTBigCode model)
- **gpt_neo** -- `GPTNeoConfig` (GPT Neo model)
- **gpt_neox** -- `GPTNeoXConfig` (GPT NeoX model)
- **gpt_neox_japanese** -- `GPTNeoXJapaneseConfig` (GPT NeoX Japanese model)
- **gpt_oss** -- `GptOssConfig` (GptOss model)
- **gptj** -- `GPTJConfig` (GPT-J model)
- **gptsan-japanese** -- `GPTSanJapaneseConfig` (GPTSAN-japanese model)
- **granite** -- `GraniteConfig` (Granite model)
- **granite_speech** -- `GraniteSpeechConfig` (GraniteSpeech model)
- **granitemoe** -- `GraniteMoeConfig` (GraniteMoeMoe model)
- **granitemoehybrid** -- `GraniteMoeHybridConfig` (GraniteMoeHybrid model)
- **granitemoeshared** -- `GraniteMoeSharedConfig` (GraniteMoeSharedMoe model)
- **granitevision** -- `LlavaNextConfig` (LLaVA-NeXT model)
- **graphormer** -- `GraphormerConfig` (Graphormer model)
- **grounding-dino** -- `GroundingDinoConfig` (Grounding DINO model)
- **groupvit** -- `GroupViTConfig` (GroupViT model)
- **helium** -- `HeliumConfig` (Helium model)
- **hgnet_v2** -- `HGNetV2Config` (HGNet-V2 model)
- **hiera** -- `HieraConfig` (Hiera model)
- **hubert** -- `HubertConfig` (Hubert model)
- **hunyuan_v1_dense** -- `HunYuanDenseV1Config` (HunYuanDenseV1 model)
- **hunyuan_v1_moe** -- `HunYuanMoEV1Config` (HunYuanMoeV1 model)
- **ibert** -- `IBertConfig` (I-BERT model)
- **idefics** -- `IdeficsConfig` (IDEFICS model)
- **idefics2** -- `Idefics2Config` (Idefics2 model)
- **idefics3** -- `Idefics3Config` (Idefics3 model)
- **idefics3_vision** -- `Idefics3VisionConfig` (Idefics3VisionTransformer model)
- **ijepa** -- `IJepaConfig` (I-JEPA model)
- **imagegpt** -- `ImageGPTConfig` (ImageGPT model)
- **informer** -- `InformerConfig` (Informer model)
- **instructblip** -- `InstructBlipConfig` (InstructBLIP model)
- **instructblipvideo** -- `InstructBlipVideoConfig` (InstructBlipVideo model)
- **internvl** -- `InternVLConfig` (InternVL model)
- **internvl_vision** -- `InternVLVisionConfig` (InternVLVision model)
- **jamba** -- `JambaConfig` (Jamba model)
- **janus** -- `JanusConfig` (Janus model)
- **jetmoe** -- `JetMoeConfig` (JetMoe model)
- **jukebox** -- `JukeboxConfig` (Jukebox model)
- **kosmos-2** -- `Kosmos2Config` (KOSMOS-2 model)
- **kosmos-2.5** -- `Kosmos2_5Config` (KOSMOS-2.5 model)
- **kyutai_speech_to_text** -- `KyutaiSpeechToTextConfig` (KyutaiSpeechToText model)
- **layoutlm** -- `LayoutLMConfig` (LayoutLM model)
- **layoutlmv2** -- `LayoutLMv2Config` (LayoutLMv2 model)
- **layoutlmv3** -- `LayoutLMv3Config` (LayoutLMv3 model)
- **led** -- `LEDConfig` (LED model)
- **levit** -- `LevitConfig` (LeViT model)
- **lfm2** -- `Lfm2Config` (Lfm2 model)
- **lfm2_vl** -- `Lfm2VlConfig` (Lfm2Vl model)
- **lightglue** -- `LightGlueConfig` (LightGlue model)
- **lilt** -- `LiltConfig` (LiLT model)
- **llama** -- `LlamaConfig` (LLaMA model)
- **llama4** -- `Llama4Config` (Llama4 model)
- **llama4_text** -- `Llama4TextConfig` (Llama4ForCausalLM model)
- **llava** -- `LlavaConfig` (LLaVa model)
- **llava_next** -- `LlavaNextConfig` (LLaVA-NeXT model)
- **llava_next_video** -- `LlavaNextVideoConfig` (LLaVa-NeXT-Video model)
- **llava_onevision** -- `LlavaOnevisionConfig` (LLaVA-Onevision model)
- **longcat_flash** -- `LongcatFlashConfig` (LongCatFlash model)
- **longformer** -- `LongformerConfig` (Longformer model)
- **longt5** -- `LongT5Config` (LongT5 model)
- **luke** -- `LukeConfig` (LUKE model)
- **lxmert** -- `LxmertConfig` (LXMERT model)
- **m2m_100** -- `M2M100Config` (M2M100 model)
- **mamba** -- `MambaConfig` (Mamba model)
- **mamba2** -- `Mamba2Config` (mamba2 model)
- **marian** -- `MarianConfig` (Marian model)
- **markuplm** -- `MarkupLMConfig` (MarkupLM model)
- **mask2former** -- `Mask2FormerConfig` (Mask2Former model)
- **maskformer** -- `MaskFormerConfig` (MaskFormer model)
- **maskformer-swin** -- `MaskFormerSwinConfig` (MaskFormerSwin model)
- **mbart** -- `MBartConfig` (mBART model)
- **mctct** -- `MCTCTConfig` (M-CTC-T model)
- **mega** -- `MegaConfig` (MEGA model)
- **megatron-bert** -- `MegatronBertConfig` (Megatron-BERT model)
- **metaclip_2** -- `MetaClip2Config` (MetaCLIP 2 model)
- **mgp-str** -- `MgpstrConfig` (MGP-STR model)
- **mimi** -- `MimiConfig` (Mimi model)
- **minimax** -- `MiniMaxConfig` (MiniMax model)
- **ministral** -- `MinistralConfig` (Ministral model)
- **mistral** -- `MistralConfig` (Mistral model)
- **mistral3** -- `Mistral3Config` (Mistral3 model)
- **mixtral** -- `MixtralConfig` (Mixtral model)
- **mlcd** -- `MLCDVisionConfig` (MLCD model)
- **mllama** -- `MllamaConfig` (Mllama model)
- **mm-grounding-dino** -- `MMGroundingDinoConfig` (MM Grounding DINO model)
- **mobilebert** -- `MobileBertConfig` (MobileBERT model)
- **mobilenet_v1** -- `MobileNetV1Config` (MobileNetV1 model)
- **mobilenet_v2** -- `MobileNetV2Config` (MobileNetV2 model)
- **mobilevit** -- `MobileViTConfig` (MobileViT model)
- **mobilevitv2** -- `MobileViTV2Config` (MobileViTV2 model)
- **modernbert** -- `ModernBertConfig` (ModernBERT model)
- **modernbert-decoder** -- `ModernBertDecoderConfig` (ModernBertDecoder model)
- **moonshine** -- `MoonshineConfig` (Moonshine model)
- **moshi** -- `MoshiConfig` (Moshi model)
- **mpnet** -- `MPNetConfig` (MPNet model)
- **mpt** -- `MptConfig` (MPT model)
- **mra** -- `MraConfig` (MRA model)
- **mt5** -- `MT5Config` (MT5 model)
- **musicgen** -- `MusicgenConfig` (MusicGen model)
- **musicgen_melody** -- `MusicgenMelodyConfig` (MusicGen Melody model)
- **mvp** -- `MvpConfig` (MVP model)
- **nat** -- `NatConfig` (NAT model)
- **nemotron** -- `NemotronConfig` (Nemotron model)
- **nezha** -- `NezhaConfig` (Nezha model)
- **nllb-moe** -- `NllbMoeConfig` (NLLB-MOE model)
- **nougat** -- `VisionEncoderDecoderConfig` (Nougat model)
- **nystromformer** -- `NystromformerConfig` (Nyströmformer model)
- **olmo** -- `OlmoConfig` (OLMo model)
- **olmo2** -- `Olmo2Config` (OLMo2 model)
- **olmo3** -- `Olmo3Config` (Olmo3 model)
- **olmoe** -- `OlmoeConfig` (OLMoE model)
- **omdet-turbo** -- `OmDetTurboConfig` (OmDet-Turbo model)
- **oneformer** -- `OneFormerConfig` (OneFormer model)
- **open-llama** -- `OpenLlamaConfig` (OpenLlama model)
- **openai-gpt** -- `OpenAIGPTConfig` (OpenAI GPT model)
- **opt** -- `OPTConfig` (OPT model)
- **ovis2** -- `Ovis2Config` (Ovis2 model)
- **owlv2** -- `Owlv2Config` (OWLv2 model)
- **owlvit** -- `OwlViTConfig` (OWL-ViT model)
- **paligemma** -- `PaliGemmaConfig` (PaliGemma model)
- **parakeet_ctc** -- `ParakeetCTCConfig` (Parakeet model)
- **parakeet_encoder** -- `ParakeetEncoderConfig` (ParakeetEncoder model)
- **patchtsmixer** -- `PatchTSMixerConfig` (PatchTSMixer model)
- **patchtst** -- `PatchTSTConfig` (PatchTST model)
- **pegasus** -- `PegasusConfig` (Pegasus model)
- **pegasus_x** -- `PegasusXConfig` (PEGASUS-X model)
- **perceiver** -- `PerceiverConfig` (Perceiver model)
- **perception_encoder** -- `TimmWrapperConfig` (PerceptionEncoder model)
- **perception_lm** -- `PerceptionLMConfig` (PerceptionLM model)
- **persimmon** -- `PersimmonConfig` (Persimmon model)
- **phi** -- `PhiConfig` (Phi model)
- **phi3** -- `Phi3Config` (Phi3 model)
- **phi4_multimodal** -- `Phi4MultimodalConfig` (Phi4Multimodal model)
- **phimoe** -- `PhimoeConfig` (Phimoe model)
- **pix2struct** -- `Pix2StructConfig` (Pix2Struct model)
- **pixtral** -- `PixtralVisionConfig` (Pixtral model)
- **plbart** -- `PLBartConfig` (PLBart model)
- **poolformer** -- `PoolFormerConfig` (PoolFormer model)
- **pop2piano** -- `Pop2PianoConfig` (Pop2Piano model)
- **prompt_depth_anything** -- `PromptDepthAnythingConfig` (PromptDepthAnything model)
- **prophetnet** -- `ProphetNetConfig` (ProphetNet model)
- **pvt** -- `PvtConfig` (PVT model)
- **pvt_v2** -- `PvtV2Config` (PVTv2 model)
- **qdqbert** -- `QDQBertConfig` (QDQBert model)
- **qwen2** -- `Qwen2Config` (Qwen2 model)
- **qwen2_5_omni** -- `Qwen2_5OmniConfig` (Qwen2_5Omni model)
- **qwen2_5_vl** -- `Qwen2_5_VLConfig` (Qwen2_5_VL model)
- **qwen2_5_vl_text** -- `Qwen2_5_VLTextConfig` (Qwen2_5_VL model)
- **qwen2_audio** -- `Qwen2AudioConfig` (Qwen2Audio model)
- **qwen2_audio_encoder** -- `Qwen2AudioEncoderConfig` (Qwen2AudioEncoder model)
- **qwen2_moe** -- `Qwen2MoeConfig` (Qwen2MoE model)
- **qwen2_vl** -- `Qwen2VLConfig` (Qwen2VL model)
- **qwen2_vl_text** -- `Qwen2VLTextConfig` (Qwen2VL model)
- **qwen3** -- `Qwen3Config` (Qwen3 model)
- **qwen3_moe** -- `Qwen3MoeConfig` (Qwen3MoE model)
- **qwen3_next** -- `Qwen3NextConfig` (Qwen3Next model)
- **qwen3_omni_moe** -- `Qwen3OmniMoeConfig` (Qwen3OmniMoE model)
- **qwen3_vl** -- `Qwen3VLConfig` (Qwen3VL model)
- **qwen3_vl_moe** -- `Qwen3VLMoeConfig` (Qwen3VLMoe model)
- **qwen3_vl_moe_text** -- `Qwen3VLMoeTextConfig` (Qwen3VLMoe model)
- **qwen3_vl_text** -- `Qwen3VLTextConfig` (Qwen3VL model)
- **rag** -- `RagConfig` (RAG model)
- **realm** -- `RealmConfig` (REALM model)
- **recurrent_gemma** -- `RecurrentGemmaConfig` (RecurrentGemma model)
- **reformer** -- `ReformerConfig` (Reformer model)
- **regnet** -- `RegNetConfig` (RegNet model)
- **rembert** -- `RemBertConfig` (RemBERT model)
- **resnet** -- `ResNetConfig` (ResNet model)
- **retribert** -- `RetriBertConfig` (RetriBERT model)
- **roberta** -- `RobertaConfig` (RoBERTa model)
- **roberta-prelayernorm** -- `RobertaPreLayerNormConfig` (RoBERTa-PreLayerNorm model)
- **roc_bert** -- `RoCBertConfig` (RoCBert model)
- **roformer** -- `RoFormerConfig` (RoFormer model)
- **rt_detr** -- `RTDetrConfig` (RT-DETR model)
- **rt_detr_resnet** -- `RTDetrResNetConfig` (RT-DETR-ResNet model)
- **rt_detr_v2** -- `RTDetrV2Config` (RT-DETRv2 model)
- **rwkv** -- `RwkvConfig` (RWKV model)
- **sam** -- `SamConfig` (SAM model)
- **sam2** -- `Sam2Config` (SAM2 model)
- **sam2_hiera_det_model** -- `Sam2HieraDetConfig` (Sam2HieraDetModel model)
- **sam2_video** -- `Sam2VideoConfig` (Sam2VideoModel model)
- **sam2_vision_model** -- `Sam2VisionConfig` (Sam2VisionModel model)
- **sam_hq** -- `SamHQConfig` (SAM-HQ model)
- **sam_hq_vision_model** -- `SamHQVisionConfig` (SamHQVisionModel model)
- **sam_vision_model** -- `SamVisionConfig` (SamVisionModel model)
- **seamless_m4t** -- `SeamlessM4TConfig` (SeamlessM4T model)
- **seamless_m4t_v2** -- `SeamlessM4Tv2Config` (SeamlessM4Tv2 model)
- **seed_oss** -- `SeedOssConfig` (SeedOss model)
- **segformer** -- `SegformerConfig` (SegFormer model)
- **seggpt** -- `SegGptConfig` (SegGPT model)
- **sew** -- `SEWConfig` (SEW model)
- **sew-d** -- `SEWDConfig` (SEW-D model)
- **shieldgemma2** -- `ShieldGemma2Config` (Shieldgemma2 model)
- **siglip** -- `SiglipConfig` (SigLIP model)
- **siglip2** -- `Siglip2Config` (SigLIP2 model)
- **siglip2_vision_model** -- `Siglip2VisionConfig` (Siglip2VisionModel model)
- **siglip_vision_model** -- `SiglipVisionConfig` (SiglipVisionModel model)
- **smollm3** -- `SmolLM3Config` (SmolLM3 model)
- **smolvlm** -- `SmolVLMConfig` (SmolVLM model)
- **smolvlm_vision** -- `SmolVLMVisionConfig` (SmolVLMVisionTransformer model)
- **speech-encoder-decoder** -- `SpeechEncoderDecoderConfig` (Speech Encoder decoder model)
- **speech_to_text** -- `Speech2TextConfig` (Speech2Text model)
- **speech_to_text_2** -- `Speech2Text2Config` (Speech2Text2 model)
- **speecht5** -- `SpeechT5Config` (SpeechT5 model)
- **splinter** -- `SplinterConfig` (Splinter model)
- **squeezebert** -- `SqueezeBertConfig` (SqueezeBERT model)
- **stablelm** -- `StableLmConfig` (StableLm model)
- **starcoder2** -- `Starcoder2Config` (Starcoder2 model)
- **superglue** -- `SuperGlueConfig` (SuperGlue model)
- **superpoint** -- `SuperPointConfig` (SuperPoint model)
- **swiftformer** -- `SwiftFormerConfig` (SwiftFormer model)
- **swin** -- `SwinConfig` (Swin Transformer model)
- **swin2sr** -- `Swin2SRConfig` (Swin2SR model)
- **swinv2** -- `Swinv2Config` (Swin Transformer V2 model)
- **switch_transformers** -- `SwitchTransformersConfig` (SwitchTransformers model)
- **t5** -- `T5Config` (T5 model)
- **t5gemma** -- `T5GemmaConfig` (T5Gemma model)
- **table-transformer** -- `TableTransformerConfig` (Table Transformer model)
- **tapas** -- `TapasConfig` (TAPAS model)
- **textnet** -- `TextNetConfig` (TextNet model)
- **time_series_transformer** -- `TimeSeriesTransformerConfig` (Time Series Transformer model)
- **timesfm** -- `TimesFmConfig` (TimesFm model)
- **timesformer** -- `TimesformerConfig` (TimeSformer model)
- **timm_backbone** -- `TimmBackboneConfig` (TimmBackbone model)
- **timm_wrapper** -- `TimmWrapperConfig` (TimmWrapperModel model)
- **trajectory_transformer** -- `TrajectoryTransformerConfig` (Trajectory Transformer model)
- **transfo-xl** -- `TransfoXLConfig` (Transformer-XL model)
- **trocr** -- `TrOCRConfig` (TrOCR model)
- **tvlt** -- `TvltConfig` (TVLT model)
- **tvp** -- `TvpConfig` (TVP model)
- **udop** -- `UdopConfig` (UDOP model)
- **umt5** -- `UMT5Config` (UMT5 model)
- **unispeech** -- `UniSpeechConfig` (UniSpeech model)
- **unispeech-sat** -- `UniSpeechSatConfig` (UniSpeechSat model)
- **univnet** -- `UnivNetConfig` (UnivNet model)
- **upernet** -- `UperNetConfig` (UPerNet model)
- **van** -- `VanConfig` (VAN model)
- **vaultgemma** -- `VaultGemmaConfig` (VaultGemma model)
- **video_llava** -- `VideoLlavaConfig` (VideoLlava model)
- **videomae** -- `VideoMAEConfig` (VideoMAE model)
- **vilt** -- `ViltConfig` (ViLT model)
- **vipllava** -- `VipLlavaConfig` (VipLlava model)
- **vision-encoder-decoder** -- `VisionEncoderDecoderConfig` (Vision Encoder decoder model)
- **vision-text-dual-encoder** -- `VisionTextDualEncoderConfig` (VisionTextDualEncoder model)
- **visual_bert** -- `VisualBertConfig` (VisualBERT model)
- **vit** -- `ViTConfig` (ViT model)
- **vit_hybrid** -- `ViTHybridConfig` (ViT Hybrid model)
- **vit_mae** -- `ViTMAEConfig` (ViTMAE model)
- **vit_msn** -- `ViTMSNConfig` (ViTMSN model)
- **vitdet** -- `VitDetConfig` (VitDet model)
- **vitmatte** -- `VitMatteConfig` (ViTMatte model)
- **vitpose** -- `VitPoseConfig` (ViTPose model)
- **vitpose_backbone** -- `VitPoseBackboneConfig` (ViTPoseBackbone model)
- **vits** -- `VitsConfig` (VITS model)
- **vivit** -- `VivitConfig` (ViViT model)
- **vjepa2** -- `VJEPA2Config` (VJEPA2Model model)
- **voxtral** -- `VoxtralConfig` (Voxtral model)
- **voxtral_encoder** -- `VoxtralEncoderConfig` (Voxtral Encoder model)
- **wav2vec2** -- `Wav2Vec2Config` (Wav2Vec2 model)
- **wav2vec2-bert** -- `Wav2Vec2BertConfig` (Wav2Vec2-BERT model)
- **wav2vec2-conformer** -- `Wav2Vec2ConformerConfig` (Wav2Vec2-Conformer model)
- **wavlm** -- `WavLMConfig` (WavLM model)
- **whisper** -- `WhisperConfig` (Whisper model)
- **xclip** -- `XCLIPConfig` (X-CLIP model)
- **xcodec** -- `XcodecConfig` (X-CODEC model)
- **xglm** -- `XGLMConfig` (XGLM model)
- **xlm** -- `XLMConfig` (XLM model)
- **xlm-prophetnet** -- `XLMProphetNetConfig` (XLM-ProphetNet model)
- **xlm-roberta** -- `XLMRobertaConfig` (XLM-RoBERTa model)
- **xlm-roberta-xl** -- `XLMRobertaXLConfig` (XLM-RoBERTa-XL model)
- **xlnet** -- `XLNetConfig` (XLNet model)
- **xlstm** -- `xLSTMConfig` (xLSTM model)
- **xmod** -- `XmodConfig` (X-MOD model)
- **yolos** -- `YolosConfig` (YOLOS model)
- **yoso** -- `YosoConfig` (YOSO model)
- **zamba** -- `ZambaConfig` (Zamba model)
- **zamba2** -- `Zamba2Config` (Zamba2 model)
- **zoedepth** -- `ZoeDepthConfig` (ZoeDepth model)

Examples:

```python
>>> from transformers import AutoConfig

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-uncased")

>>> # Download configuration from huggingface.co (user-uploaded) and cache.
>>> config = AutoConfig.from_pretrained("dbmdz/bert-base-german-cased")

>>> # If configuration file is in a directory (e.g., was saved using *save_pretrained('./test/saved_model/')*).
>>> config = AutoConfig.from_pretrained("./test/bert_saved_model/")

>>> # Load a specific configuration file.
>>> config = AutoConfig.from_pretrained("./test/bert_saved_model/my_configuration.json")

>>> # Change some config attributes when loading a pretrained config.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-uncased", output_attentions=True, foo=False)
>>> config.output_attentions
True

>>> config, unused_kwargs = AutoConfig.from_pretrained(
...     "google-bert/bert-base-uncased", output_attentions=True, foo=False, return_unused_kwargs=True
... )
>>> config.output_attentions
True

>>> unused_kwargs
{'foo': False}
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:

- A string, the *model id* of a pretrained model configuration hosted inside a model repo on huggingface.co.
- A path to a *directory* containing a configuration file saved using the [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.save_pretrained) method, or the [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) method, e.g., `./my_model_directory/`.
- A path or url to a saved configuration JSON *file*, e.g., `./my_model_directory/configuration.json`.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

return_unused_kwargs (`bool`, *optional*, defaults to `False`) : If `False`, then this function returns just the final configuration object.  If `True`, then this function returns a `Tuple(config, unused_kwargs)` where *unused_kwargs* is a dictionary consisting of the key/value pairs whose keys are not configuration attributes: i.e., the part of `kwargs` which has not been used to update `config` and is otherwise ignored.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

kwargs (additional keyword arguments, *optional*) : The values in kwargs of any keys which are configuration attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are *not* configuration attributes is controlled by the `return_unused_kwargs` keyword parameter.
#### register[[transformers.AutoConfig.register]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/configuration_auto.py#L1386)

Register a new configuration for this class.

**Parameters:**

model_type (`str`) : The model type like "bert" or "gpt".

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The config to register.
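The `register()` mechanism can be pictured as a simple mapping from the `model_type` key to the config class. The following is a minimal, self-contained sketch of that idea — `AutoConfigSketch` and `NewModelConfig` are hypothetical stand-ins for illustration, not the actual transformers classes:

```python
class AutoConfigSketch:
    """Toy stand-in for an Auto class registry (illustration only)."""

    _registry: dict = {}

    @classmethod
    def register(cls, model_type, config_class):
        # Mirror the documented constraint: the config class's `model_type`
        # attribute must match the key used at registration time.
        if getattr(config_class, "model_type", None) != model_type:
            raise ValueError(
                f"model_type {config_class.model_type!r} does not match key {model_type!r}"
            )
        cls._registry[model_type] = config_class

    @classmethod
    def for_model(cls, model_type, **kwargs):
        # Resolve the registered class for this model type and instantiate it.
        return cls._registry[model_type](**kwargs)


class NewModelConfig:
    model_type = "new-model"

    def __init__(self, hidden_size=64):
        self.hidden_size = hidden_size


AutoConfigSketch.register("new-model", NewModelConfig)
config = AutoConfigSketch.for_model("new-model", hidden_size=128)
print(type(config).__name__, config.hidden_size)  # NewModelConfig 128
```

In the real library, `AutoConfig.register("new-model", NewModelConfig)` performs the equivalent bookkeeping, and a mismatched `model_type` raises an error at registration time rather than at load time.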

## AutoTokenizer[[transformers.AutoTokenizer]]

#### transformers.AutoTokenizer[[transformers.AutoTokenizer]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/tokenization_auto.py#L948)

This is a generic tokenizer class that will be instantiated as one of the tokenizer classes of the library when
created with the [AutoTokenizer.from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoTokenizer.from_pretrained) class method.

This class cannot be instantiated directly using `__init__()` (throws an error).
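The "cannot be instantiated directly" behavior follows a common pattern: `__init__()` raises unconditionally, and a class method is the only entry point. A hypothetical sketch of that pattern (not the actual transformers source) looks like this:

```python
class AutoSketch:
    """Toy class illustrating the no-direct-construction pattern."""

    def __init__(self, *args, **kwargs):
        # Direct construction always fails, steering users to the factory.
        raise EnvironmentError(
            "AutoSketch is designed to be instantiated using the "
            "`AutoSketch.from_pretrained(name)` class method."
        )

    @classmethod
    def from_pretrained(cls, name):
        # The real Auto classes resolve `name` to a concrete subclass and
        # return an instance of that class; here we just bypass __init__.
        instance = object.__new__(cls)
        instance.name = name
        return instance


try:
    AutoSketch()
    raised = False
except EnvironmentError:
    raised = True
print(raised)  # True

obj = AutoSketch.from_pretrained("toy-checkpoint")
print(obj.name)  # toy-checkpoint
```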

#### from_pretrained[[transformers.AutoTokenizer.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/tokenization_auto.py#L962)

**Parameters:**

- **pretrained_model_name_or_path** (`str` or `os.PathLike`) --
  Can be either:

  - A string, the *model id* of a predefined tokenizer hosted inside a model repo on huggingface.co.
  - A path to a *directory* containing vocabulary files required by the tokenizer, for instance saved
    using the [save_pretrained()](/docs/transformers/v4.57.1/ja/internal/tokenization_utils#transformers.PreTrainedTokenizerBase.save_pretrained) method, e.g., `./my_model_directory/`.
  - A path or url to a single saved vocabulary file if and only if the tokenizer only requires a
    single vocabulary file (like Bert or XLNet), e.g.: `./my_model_directory/vocab.txt`. (Not
    applicable to all derived classes)
- **inputs** (additional positional arguments, *optional*) --
  Will be passed along to the Tokenizer `__init__()` method.
- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) --
  The configuration object used to determine the tokenizer class to instantiate.
- **cache_dir** (`str` or `os.PathLike`, *optional*) --
  Path to a directory in which a downloaded pretrained model configuration should be cached if the
  standard cache should not be used.
- **force_download** (`bool`, *optional*, defaults to `False`) --
  Whether or not to force the (re-)download of the model weights and configuration files, overriding the
  cached versions if they exist.
- **resume_download** --
  Deprecated and ignored. All downloads are now resumed by default when possible.
  Will be removed in v5 of Transformers.
- **proxies** (`dict[str, str]`, *optional*) --
  A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128',
  'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
- **revision** (`str`, *optional*, defaults to `"main"`) --
  The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a
  git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any
  identifier allowed by git.
- **subfolder** (`str`, *optional*) --
  In case the relevant files are located inside a subfolder of the model repo on huggingface.co (e.g. for
  facebook/rag-token-base), specify it here.
- **use_fast** (`bool`, *optional*, defaults to `True`) --
  Use a [fast Rust-based tokenizer](https://huggingface.co/docs/tokenizers/index) if it is supported for
  a given model. If a fast tokenizer is not available for a given model, a normal Python-based tokenizer
  is returned instead.
- **tokenizer_type** (`str`, *optional*) --
  Tokenizer type to be loaded.
- **trust_remote_code** (`bool`, *optional*, defaults to `False`) --
  Whether or not to allow for custom models defined on the Hub in their own modeling files. This option
  should only be set to `True` for repositories you trust and in which you have read the code, as it will
  execute code present on the Hub on your local machine.
- **kwargs** (additional keyword arguments, *optional*) --
  Will be passed to the Tokenizer `__init__()` method. Can be used to set special tokens like
  `bos_token`, `eos_token`, `unk_token`, `sep_token`, `pad_token`, `cls_token`, `mask_token`,
  `additional_special_tokens`. See parameters in the `__init__()` for more details.

Instantiate one of the tokenizer classes of the library from a pretrained model vocabulary.

The tokenizer class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **aimv2** -- [CLIPTokenizer](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTokenizer) or [CLIPTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTokenizerFast) (AIMv2 model)
- **albert** -- [AlbertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertTokenizer) or [AlbertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertTokenizerFast) (ALBERT model)
- **align** -- [BertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizerFast) (ALIGN model)
- **arcee** -- `LlamaTokenizer` or `LlamaTokenizerFast` (Arcee model)
- **aria** -- `LlamaTokenizer` or `LlamaTokenizerFast` (Aria model)
- **aya_vision** -- `CohereTokenizerFast` (AyaVision model)
- **bark** -- [BertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizerFast) (Bark model)
- **bart** -- [BartTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartTokenizer) or [BartTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartTokenizerFast) (BART model)
- **barthez** -- [BarthezTokenizer](/docs/transformers/v4.57.1/ja/model_doc/barthez#transformers.BarthezTokenizer) or [BarthezTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/barthez#transformers.BarthezTokenizerFast) (BARThez model)
- **bartpho** -- [BartphoTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bartpho#transformers.BartphoTokenizer) (BARTpho model)
- **bert** -- [BertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizerFast) (BERT model)
- **bert-generation** -- [BertGenerationTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bert-generation#transformers.BertGenerationTokenizer) (Bert Generation model)
- **bert-japanese** -- [BertJapaneseTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bert-japanese#transformers.BertJapaneseTokenizer) (BertJapanese model)
- **bertweet** -- [BertweetTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bertweet#transformers.BertweetTokenizer) (BERTweet model)
- **big_bird** -- [BigBirdTokenizer](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdTokenizer) or [BigBirdTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdTokenizerFast) (BigBird model)
- **bigbird_pegasus** -- `PegasusTokenizer` or `PegasusTokenizerFast` (BigBird-Pegasus model)
- **biogpt** -- [BioGptTokenizer](/docs/transformers/v4.57.1/ja/model_doc/biogpt#transformers.BioGptTokenizer) (BioGpt model)
- **bitnet** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ja/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (BitNet model)
- **blenderbot** -- [BlenderbotTokenizer](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotTokenizer) or [BlenderbotTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotTokenizerFast) (Blenderbot model)
- **blenderbot-small** -- [BlenderbotSmallTokenizer](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallTokenizer) (BlenderbotSmall model)
- **blip** -- [BertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizerFast) (BLIP model)
- **blip-2** -- `GPT2Tokenizer` or `GPT2TokenizerFast` (BLIP-2 model)
- **bloom** -- [BloomTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomTokenizerFast) (BLOOM model)
- **blt** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ja/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (Blt model)
- **bridgetower** -- `RobertaTokenizer` or `RobertaTokenizerFast` (BridgeTower model)
- **bros** -- [BertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizerFast) (BROS model)
- **byt5** -- [ByT5Tokenizer](/docs/transformers/v4.57.1/ja/model_doc/byt5#transformers.ByT5Tokenizer) (ByT5 model)
- **camembert** -- [CamembertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertTokenizer) or [CamembertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertTokenizerFast) (CamemBERT model)
- **canine** -- [CanineTokenizer](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineTokenizer) (CANINE model)
- **chameleon** -- `LlamaTokenizer` or `LlamaTokenizerFast` (Chameleon model)
- **chinese_clip** -- [BertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizerFast) (Chinese-CLIP model)
- **clap** -- `RobertaTokenizer` or `RobertaTokenizerFast` (CLAP model)
- **clip** -- [CLIPTokenizer](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTokenizer) or [CLIPTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTokenizerFast) (CLIP model)
- **clipseg** -- [CLIPTokenizer](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTokenizer) or [CLIPTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTokenizerFast) (CLIPSeg model)
- **clvp** -- [ClvpTokenizer](/docs/transformers/v4.57.1/ja/model_doc/clvp#transformers.ClvpTokenizer) (CLVP model)
- **code_llama** -- [CodeLlamaTokenizer](/docs/transformers/v4.57.1/ja/model_doc/code_llama#transformers.CodeLlamaTokenizer) or [CodeLlamaTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/code_llama#transformers.CodeLlamaTokenizerFast) (CodeLlama model)
- **codegen** -- [CodeGenTokenizer](/docs/transformers/v4.57.1/ja/model_doc/codegen#transformers.CodeGenTokenizer) or [CodeGenTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/codegen#transformers.CodeGenTokenizerFast) (CodeGen model)
- **cohere** -- `CohereTokenizerFast` (Cohere model)
- **cohere2** -- `CohereTokenizerFast` (Cohere2 model)
- **colpali** -- `LlamaTokenizer` or `LlamaTokenizerFast` (ColPali model)
- **colqwen2** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (ColQwen2 model)
- **convbert** -- [ConvBertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertTokenizer) or [ConvBertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertTokenizerFast) (ConvBERT model)
- **cpm** -- [CpmTokenizer](/docs/transformers/v4.57.1/ja/model_doc/cpm#transformers.CpmTokenizer) or [CpmTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/cpm#transformers.CpmTokenizerFast) (CPM model)
- **cpmant** -- [CpmAntTokenizer](/docs/transformers/v4.57.1/ja/model_doc/cpmant#transformers.CpmAntTokenizer) (CPM-Ant model)
- **csm** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ja/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (CSM model)
- **ctrl** -- [CTRLTokenizer](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLTokenizer) (CTRL model)
- **data2vec-audio** -- `Wav2Vec2CTCTokenizer` (Data2VecAudio model)
- **data2vec-text** -- `RobertaTokenizer` or `RobertaTokenizerFast` (Data2VecText model)
- **dbrx** -- `GPT2Tokenizer` or `GPT2TokenizerFast` (DBRX model)
- **deberta** -- [DebertaTokenizer](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaTokenizer) or [DebertaTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaTokenizerFast) (DeBERTa model)
- **deberta-v2** -- [DebertaV2Tokenizer](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Tokenizer) or [DebertaV2TokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2TokenizerFast) (DeBERTa-v2 model)
- **deepseek_v2** -- `LlamaTokenizer` or `LlamaTokenizerFast` (DeepSeek-V2 model)
- **deepseek_v3** -- `LlamaTokenizer` or `LlamaTokenizerFast` (DeepSeek-V3 model)
- **deepseek_vl** -- `LlamaTokenizer` or `LlamaTokenizerFast` (DeepseekVL model)
- **deepseek_vl_hybrid** -- `LlamaTokenizer` or `LlamaTokenizerFast` (DeepseekVLHybrid model)
- **dia** -- `DiaTokenizer` (Dia model)
- **diffllama** -- `LlamaTokenizer` or `LlamaTokenizerFast` (DiffLlama model)
- **distilbert** -- `DistilBertTokenizer` or `DistilBertTokenizerFast` (DistilBERT model)
- **dpr** -- `DPRQuestionEncoderTokenizer` or `DPRQuestionEncoderTokenizerFast` (DPR model)
- **electra** -- `ElectraTokenizer` or `ElectraTokenizerFast` (ELECTRA model)
- **emu3** -- `GPT2Tokenizer` or `GPT2TokenizerFast` (Emu3 model)
- **ernie** -- [BertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizerFast) (ERNIE model)
- **ernie4_5** -- `LlamaTokenizerFast` (Ernie4_5 model)
- **ernie4_5_moe** -- `LlamaTokenizerFast` (Ernie4_5_MoE model)
- **ernie_m** -- `ErnieMTokenizer` (ErnieM model)
- **esm** -- `EsmTokenizer` (ESM model)
- **exaone4** -- `GPT2Tokenizer` or `GPT2TokenizerFast` (EXAONE-4.0 model)
- **falcon** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ja/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (Falcon model)
- **falcon_mamba** -- `GPTNeoXTokenizerFast` (FalconMamba model)
- **fastspeech2_conformer** -- `FastSpeech2ConformerTokenizer` (FastSpeech2Conformer model)
- **flaubert** -- `FlaubertTokenizer` (FlauBERT model)
- **flex_olmo** -- `GPT2TokenizerFast` (FlexOlmo model)
- **fnet** -- `FNetTokenizer` or `FNetTokenizerFast` (FNet model)
- **fsmt** -- `FSMTTokenizer` (FairSeq Machine-Translation model)
- **funnel** -- `FunnelTokenizer` or `FunnelTokenizerFast` (Funnel Transformer model)
- **gemma** -- `GemmaTokenizer` or `GemmaTokenizerFast` (Gemma model)
- **gemma2** -- `GemmaTokenizer` or `GemmaTokenizerFast` (Gemma2 model)
- **gemma3** -- `GemmaTokenizer` or `GemmaTokenizerFast` (Gemma3ForConditionalGeneration model)
- **gemma3_text** -- `GemmaTokenizer` or `GemmaTokenizerFast` (Gemma3ForCausalLM model)
- **gemma3n** -- `GemmaTokenizer` or `GemmaTokenizerFast` (Gemma3nForConditionalGeneration model)
- **gemma3n_text** -- `GemmaTokenizer` or `GemmaTokenizerFast` (Gemma3nForCausalLM model)
- **git** -- [BertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizerFast) (GIT model)
- **glm** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ja/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (GLM model)
- **glm4** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ja/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (GLM4 model)
- **glm4_moe** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ja/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (Glm4MoE model)
- **glm4v** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ja/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (GLM4V model)
- **glm4v_moe** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ja/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (GLM4VMOE model)
- **gpt-sw3** -- `GPTSw3Tokenizer` (GPT-Sw3 model)
- **gpt2** -- `GPT2Tokenizer` or `GPT2TokenizerFast` (OpenAI GPT-2 model)
- **gpt_bigcode** -- `GPT2Tokenizer` or `GPT2TokenizerFast` (GPTBigCode model)
- **gpt_neo** -- `GPT2Tokenizer` or `GPT2TokenizerFast` (GPT Neo model)
- **gpt_neox** -- `GPTNeoXTokenizerFast` (GPT NeoX model)
- **gpt_neox_japanese** -- `GPTNeoXJapaneseTokenizer` (GPT NeoX Japanese model)
- **gpt_oss** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ja/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (GptOss model)
- **gptj** -- `GPT2Tokenizer` or `GPT2TokenizerFast` (GPT-J model)
- **gptsan-japanese** -- `GPTSanJapaneseTokenizer` (GPTSAN-japanese model)
- **granite** -- `GPT2Tokenizer` (Granite model)
- **granitemoe** -- `GPT2Tokenizer` (GraniteMoeMoe model)
- **granitemoehybrid** -- `GPT2Tokenizer` (GraniteMoeHybrid model)
- **granitemoeshared** -- `GPT2Tokenizer` (GraniteMoeSharedMoe model)
- **grounding-dino** -- [BertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizerFast) (Grounding DINO model)
- **groupvit** -- [CLIPTokenizer](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTokenizer) or [CLIPTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTokenizerFast) (GroupViT model)
- **helium** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ja/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (Helium model)
- **herbert** -- `HerbertTokenizer` or `HerbertTokenizerFast` (HerBERT model)
- **hubert** -- `Wav2Vec2CTCTokenizer` (Hubert model)
- **ibert** -- `RobertaTokenizer` or `RobertaTokenizerFast` (I-BERT model)
- **idefics** -- `LlamaTokenizerFast` (IDEFICS model)
- **idefics2** -- `LlamaTokenizer` or `LlamaTokenizerFast` (Idefics2 model)
- **idefics3** -- `LlamaTokenizer` or `LlamaTokenizerFast` (Idefics3 model)
- **instructblip** -- `GPT2Tokenizer` or `GPT2TokenizerFast` (InstructBLIP model)
- **instructblipvideo** -- `GPT2Tokenizer` or `GPT2TokenizerFast` (InstructBlipVideo model)
- **internvl** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (InternVL model)
- **jamba** -- `LlamaTokenizer` or `LlamaTokenizerFast` (Jamba model)
- **janus** -- `LlamaTokenizerFast` (Janus model)
- **jetmoe** -- `LlamaTokenizer` or `LlamaTokenizerFast` (JetMoe model)
- **jukebox** -- `JukeboxTokenizer` (Jukebox model)
- **kosmos-2** -- `XLMRobertaTokenizer` or `XLMRobertaTokenizerFast` (KOSMOS-2 model)
- **kosmos-2.5** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ja/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (KOSMOS-2.5 model)
- **layoutlm** -- `LayoutLMTokenizer` or `LayoutLMTokenizerFast` (LayoutLM model)
- **layoutlmv2** -- `LayoutLMv2Tokenizer` or `LayoutLMv2TokenizerFast` (LayoutLMv2 model)
- **layoutlmv3** -- `LayoutLMv3Tokenizer` or `LayoutLMv3TokenizerFast` (LayoutLMv3 model)
- **layoutxlm** -- `LayoutXLMTokenizer` or `LayoutXLMTokenizerFast` (LayoutXLM model)
- **led** -- `LEDTokenizer` or `LEDTokenizerFast` (LED model)
- **lilt** -- `LayoutLMv3Tokenizer` or `LayoutLMv3TokenizerFast` (LiLT model)
- **llama** -- `LlamaTokenizer` or `LlamaTokenizerFast` (LLaMA model)
- **llama4** -- `LlamaTokenizer` or `LlamaTokenizerFast` (Llama4 model)
- **llama4_text** -- `LlamaTokenizer` or `LlamaTokenizerFast` (Llama4ForCausalLM model)
- **llava** -- `LlamaTokenizer` or `LlamaTokenizerFast` (LLaVa model)
- **llava_next** -- `LlamaTokenizer` or `LlamaTokenizerFast` (LLaVA-NeXT model)
- **llava_next_video** -- `LlamaTokenizer` or `LlamaTokenizerFast` (LLaVa-NeXT-Video model)
- **llava_onevision** -- `LlamaTokenizer` or `LlamaTokenizerFast` (LLaVA-Onevision model)
- **longformer** -- `LongformerTokenizer` or `LongformerTokenizerFast` (Longformer model)
- **longt5** -- `T5Tokenizer` or `T5TokenizerFast` (LongT5 model)
- **luke** -- `LukeTokenizer` (LUKE model)
- **lxmert** -- `LxmertTokenizer` or `LxmertTokenizerFast` (LXMERT model)
- **m2m_100** -- `M2M100Tokenizer` (M2M100 model)
- **mamba** -- `GPTNeoXTokenizerFast` (Mamba model)
- **mamba2** -- `GPTNeoXTokenizerFast` (mamba2 model)
- **marian** -- `MarianTokenizer` (Marian model)
- **mbart** -- `MBartTokenizer` or `MBartTokenizerFast` (mBART model)
- **mbart50** -- `MBart50Tokenizer` or `MBart50TokenizerFast` (mBART-50 model)
- **mega** -- `RobertaTokenizer` or `RobertaTokenizerFast` (MEGA model)
- **megatron-bert** -- [BertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizerFast) (Megatron-BERT model)
- **metaclip_2** -- `XLMRobertaTokenizer` or `XLMRobertaTokenizerFast` (MetaCLIP 2 model)
- **mgp-str** -- `MgpstrTokenizer` (MGP-STR model)
- **minimax** -- `GPT2Tokenizer` or `GPT2TokenizerFast` (MiniMax model)
- **ministral** -- `MistralCommonTokenizer` (Ministral model)
- **mistral** -- `MistralCommonTokenizer` (Mistral model)
- **mistral3** -- `MistralCommonTokenizer` (Mistral3 model)
- **mixtral** -- `MistralCommonTokenizer` (Mixtral model)
- **mllama** -- `LlamaTokenizer` or `LlamaTokenizerFast` (Mllama model)
- **mluke** -- `MLukeTokenizer` (mLUKE model)
- **mm-grounding-dino** -- [BertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizerFast) (MM Grounding DINO model)
- **mobilebert** -- `MobileBertTokenizer` or `MobileBertTokenizerFast` (MobileBERT model)
- **modernbert** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ja/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (ModernBERT model)
- **moonshine** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ja/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (Moonshine model)
- **moshi** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ja/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (Moshi model)
- **mpnet** -- `MPNetTokenizer` or `MPNetTokenizerFast` (MPNet model)
- **mpt** -- `GPTNeoXTokenizerFast` (MPT model)
- **mra** -- `RobertaTokenizer` or `RobertaTokenizerFast` (MRA model)
- **mt5** -- `MT5Tokenizer` or `MT5TokenizerFast` (MT5 model)
- **musicgen** -- `T5Tokenizer` or `T5TokenizerFast` (MusicGen model)
- **musicgen_melody** -- `T5Tokenizer` or `T5TokenizerFast` (MusicGen Melody model)
- **mvp** -- `MvpTokenizer` or `MvpTokenizerFast` (MVP model)
- **myt5** -- `MyT5Tokenizer` (myt5 model)
- **nemotron** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ja/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (Nemotron model)
- **nezha** -- [BertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizerFast) (Nezha model)
- **nllb** -- `NllbTokenizer` or `NllbTokenizerFast` (NLLB model)
- **nllb-moe** -- `NllbTokenizer` or `NllbTokenizerFast` (NLLB-MOE model)
- **nystromformer** -- [AlbertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertTokenizer) or [AlbertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertTokenizerFast) (Nyströmformer model)
- **olmo** -- `GPTNeoXTokenizerFast` (OLMo model)
- **olmo2** -- `GPTNeoXTokenizerFast` (OLMo2 model)
- **olmo3** -- `GPT2TokenizerFast` (Olmo3 model)
- **olmoe** -- `GPTNeoXTokenizerFast` (OLMoE model)
- **omdet-turbo** -- [CLIPTokenizer](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTokenizer) or [CLIPTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTokenizerFast) (OmDet-Turbo model)
- **oneformer** -- [CLIPTokenizer](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTokenizer) or [CLIPTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTokenizerFast) (OneFormer model)
- **openai-gpt** -- `OpenAIGPTTokenizer` or `OpenAIGPTTokenizerFast` (OpenAI GPT model)
- **opt** -- `GPT2Tokenizer` or `GPT2TokenizerFast` (OPT model)
- **owlv2** -- [CLIPTokenizer](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTokenizer) or [CLIPTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTokenizerFast) (OWLv2 model)
- **owlvit** -- [CLIPTokenizer](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTokenizer) or [CLIPTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTokenizerFast) (OWL-ViT model)
- **paligemma** -- `LlamaTokenizer` or `LlamaTokenizerFast` (PaliGemma model)
- **parakeet** -- `ParakeetCTCTokenizer` (Parakeet model)
- **pegasus** -- `PegasusTokenizer` or `PegasusTokenizerFast` (Pegasus model)
- **pegasus_x** -- `PegasusTokenizer` or `PegasusTokenizerFast` (PEGASUS-X model)
- **perceiver** -- `PerceiverTokenizer` (Perceiver model)
- **persimmon** -- `LlamaTokenizer` or `LlamaTokenizerFast` (Persimmon model)
- **phi** -- [CodeGenTokenizer](/docs/transformers/v4.57.1/ja/model_doc/codegen#transformers.CodeGenTokenizer) or [CodeGenTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/codegen#transformers.CodeGenTokenizerFast) (Phi model)
- **phi3** -- `LlamaTokenizer` or `LlamaTokenizerFast` (Phi3 model)
- **phimoe** -- `LlamaTokenizer` or `LlamaTokenizerFast` (Phimoe model)
- **phobert** -- `PhobertTokenizer` (PhoBERT model)
- **pix2struct** -- `T5Tokenizer` or `T5TokenizerFast` (Pix2Struct model)
- **pixtral** -- `MistralCommonTokenizer` (Pixtral model)
- **plbart** -- `PLBartTokenizer` (PLBart model)
- **prophetnet** -- `ProphetNetTokenizer` (ProphetNet model)
- **qdqbert** -- [BertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizerFast) (QDQBert model)
- **qwen2** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen2 model)
- **qwen2_5_omni** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen2_5Omni model)
- **qwen2_5_vl** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen2_5_VL model)
- **qwen2_audio** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen2Audio model)
- **qwen2_moe** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen2MoE model)
- **qwen2_vl** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen2VL model)
- **qwen3** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen3 model)
- **qwen3_moe** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen3MoE model)
- **qwen3_next** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen3Next model)
- **qwen3_omni_moe** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen3OmniMoE model)
- **qwen3_vl** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen3VL model)
- **qwen3_vl_moe** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen3VLMoe model)
- **rag** -- `RagTokenizer` (RAG model)
- **realm** -- `RealmTokenizer` or `RealmTokenizerFast` (REALM model)
- **recurrent_gemma** -- `GemmaTokenizer` or `GemmaTokenizerFast` (RecurrentGemma model)
- **reformer** -- `ReformerTokenizer` or `ReformerTokenizerFast` (Reformer model)
- **rembert** -- `RemBertTokenizer` or `RemBertTokenizerFast` (RemBERT model)
- **retribert** -- `RetriBertTokenizer` or `RetriBertTokenizerFast` (RetriBERT model)
- **roberta** -- `RobertaTokenizer` or `RobertaTokenizerFast` (RoBERTa model)
- **roberta-prelayernorm** -- `RobertaTokenizer` or `RobertaTokenizerFast` (RoBERTa-PreLayerNorm model)
- **roc_bert** -- `RoCBertTokenizer` (RoCBert model)
- **roformer** -- `RoFormerTokenizer` or `RoFormerTokenizerFast` (RoFormer model)
- **rwkv** -- `GPTNeoXTokenizerFast` (RWKV model)
- **seamless_m4t** -- `SeamlessM4TTokenizer` or `SeamlessM4TTokenizerFast` (SeamlessM4T model)
- **seamless_m4t_v2** -- `SeamlessM4TTokenizer` or `SeamlessM4TTokenizerFast` (SeamlessM4Tv2 model)
- **shieldgemma2** -- `GemmaTokenizer` or `GemmaTokenizerFast` (Shieldgemma2 model)
- **siglip** -- `SiglipTokenizer` (SigLIP model)
- **siglip2** -- `GemmaTokenizer` or `GemmaTokenizerFast` (SigLIP2 model)
- **smollm3** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ja/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (SmolLM3 model)
- **speech_to_text** -- `Speech2TextTokenizer` (Speech2Text model)
- **speech_to_text_2** -- `Speech2Text2Tokenizer` (Speech2Text2 model)
- **speecht5** -- `SpeechT5Tokenizer` (SpeechT5 model)
- **splinter** -- `SplinterTokenizer` or `SplinterTokenizerFast` (Splinter model)
- **squeezebert** -- `SqueezeBertTokenizer` or `SqueezeBertTokenizerFast` (SqueezeBERT model)
- **stablelm** -- `GPTNeoXTokenizerFast` (StableLm model)
- **starcoder2** -- `GPT2Tokenizer` or `GPT2TokenizerFast` (Starcoder2 model)
- **switch_transformers** -- `T5Tokenizer` or `T5TokenizerFast` (SwitchTransformers model)
- **t5** -- `T5Tokenizer` or `T5TokenizerFast` (T5 model)
- **t5gemma** -- `GemmaTokenizer` or `GemmaTokenizerFast` (T5Gemma model)
- **tapas** -- `TapasTokenizer` (TAPAS model)
- **tapex** -- `TapexTokenizer` (TAPEX model)
- **transfo-xl** -- `TransfoXLTokenizer` (Transformer-XL model)
- **tvp** -- [BertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizerFast) (TVP model)
- **udop** -- `UdopTokenizer` or `UdopTokenizerFast` (UDOP model)
- **umt5** -- `T5Tokenizer` or `T5TokenizerFast` (UMT5 model)
- **video_llava** -- `LlamaTokenizer` or `LlamaTokenizerFast` (VideoLlava model)
- **vilt** -- [BertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizerFast) (ViLT model)
- **vipllava** -- `LlamaTokenizer` or `LlamaTokenizerFast` (VipLlava model)
- **visual_bert** -- [BertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertTokenizerFast) (VisualBERT model)
- **vits** -- `VitsTokenizer` (VITS model)
- **voxtral** -- `MistralCommonTokenizer` (Voxtral model)
- **wav2vec2** -- `Wav2Vec2CTCTokenizer` (Wav2Vec2 model)
- **wav2vec2-bert** -- `Wav2Vec2CTCTokenizer` (Wav2Vec2-BERT model)
- **wav2vec2-conformer** -- `Wav2Vec2CTCTokenizer` (Wav2Vec2-Conformer model)
- **wav2vec2_phoneme** -- `Wav2Vec2PhonemeCTCTokenizer` (Wav2Vec2Phoneme model)
- **whisper** -- `WhisperTokenizer` or `WhisperTokenizerFast` (Whisper model)
- **xclip** -- [CLIPTokenizer](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTokenizer) or [CLIPTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTokenizerFast) (X-CLIP model)
- **xglm** -- `XGLMTokenizer` or `XGLMTokenizerFast` (XGLM model)
- **xlm** -- `XLMTokenizer` (XLM model)
- **xlm-prophetnet** -- `XLMProphetNetTokenizer` (XLM-ProphetNet model)
- **xlm-roberta** -- `XLMRobertaTokenizer` or `XLMRobertaTokenizerFast` (XLM-RoBERTa model)
- **xlm-roberta-xl** -- `XLMRobertaTokenizer` or `XLMRobertaTokenizerFast` (XLM-RoBERTa-XL model)
- **xlnet** -- `XLNetTokenizer` or `XLNetTokenizerFast` (XLNet model)
- **xlstm** -- `GPTNeoXTokenizerFast` (xLSTM model)
- **xmod** -- `XLMRobertaTokenizer` or `XLMRobertaTokenizerFast` (X-MOD model)
- **yoso** -- [AlbertTokenizer](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertTokenizer) or [AlbertTokenizerFast](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertTokenizerFast) (YOSO model)
- **zamba** -- `LlamaTokenizer` or `LlamaTokenizerFast` (Zamba model)
- **zamba2** -- `LlamaTokenizer` or `LlamaTokenizerFast` (Zamba2 model)

Examples:

```python
>>> from transformers import AutoTokenizer

>>> # Download vocabulary from huggingface.co and cache.
>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")

>>> # Download vocabulary from huggingface.co (user-uploaded) and cache.
>>> tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-german-cased")

>>> # If vocabulary files are in a directory (e.g. tokenizer was saved using *save_pretrained('./test/saved_model/')*)
>>> # tokenizer = AutoTokenizer.from_pretrained("./test/saved_model/")

>>> # Download vocabulary from huggingface.co and define model-specific arguments
>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/roberta-base", add_prefix_space=True)
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a predefined tokenizer hosted inside a model repo on huggingface.co. - A path to a *directory* containing vocabulary files required by the tokenizer, for instance saved using the [save_pretrained()](/docs/transformers/v4.57.1/ja/internal/tokenization_utils#transformers.PreTrainedTokenizerBase.save_pretrained) method, e.g., `./my_model_directory/`. - A path or url to a single saved vocabulary file if and only if the tokenizer only requires a single vocabulary file (like Bert or XLNet), e.g.: `./my_model_directory/vocab.txt`. (Not applicable to all derived classes)

inputs (additional positional arguments, *optional*) : Will be passed along to the Tokenizer `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : The configuration object used to determine the tokenizer class to instantiate.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force (re-)downloading the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

subfolder (`str`, *optional*) : In case the relevant files are located inside a subfolder of the model repo on huggingface.co (e.g. for facebook/rag-token-base), specify it here.

use_fast (`bool`, *optional*, defaults to `True`) : Use a [fast Rust-based tokenizer](https://huggingface.co/docs/tokenizers/index) if it is supported for a given model. If a fast tokenizer is not available for a given model, a normal Python-based tokenizer is returned instead.

tokenizer_type (`str`, *optional*) : Tokenizer type to be loaded.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

kwargs (additional keyword arguments, *optional*) : Will be passed to the Tokenizer `__init__()` method. Can be used to set special tokens like `bos_token`, `eos_token`, `unk_token`, `sep_token`, `pad_token`, `cls_token`, `mask_token`, `additional_special_tokens`. See parameters in the `__init__()` for more details.
#### register[[transformers.AutoTokenizer.register]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/tokenization_auto.py#L1190)

Register a new tokenizer in this mapping.

**Parameters:**

config_class ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The configuration corresponding to the model to register.

slow_tokenizer_class (`PreTrainedTokenizer`, *optional*) : The slow tokenizer to register.

fast_tokenizer_class (`PreTrainedTokenizerFast`, *optional*) : The fast tokenizer to register.
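Putting the pieces together, registering a custom tokenizer follows the same pattern as the config/model registration shown earlier: register the config under its `model_type`, then map that config class to your tokenizer. The sketch below uses hypothetical `NewModelConfig`/`NewModelTokenizer` classes (a real tokenizer would also implement the vocabulary and tokenization methods):

```python
from transformers import AutoConfig, AutoTokenizer, PretrainedConfig, PreTrainedTokenizer
from transformers.models.auto.tokenization_auto import TOKENIZER_MAPPING

# Hypothetical config; `model_type` must match the key passed to AutoConfig.register.
class NewModelConfig(PretrainedConfig):
    model_type = "new-model"

# Hypothetical slow tokenizer for the custom model (vocabulary and
# tokenization methods omitted for brevity).
class NewModelTokenizer(PreTrainedTokenizer):
    pass

AutoConfig.register("new-model", NewModelConfig)
AutoTokenizer.register(NewModelConfig, slow_tokenizer_class=NewModelTokenizer)

# The auto mapping now routes NewModelConfig to a (slow, fast) tokenizer pair.
print(TOKENIZER_MAPPING[NewModelConfig])
```

Since no `fast_tokenizer_class` was given, the fast slot in the mapping stays `None`; `use_fast=True` requests would then fall back to the slow tokenizer.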

## AutoFeatureExtractor[[transformers.AutoFeatureExtractor]]

#### transformers.AutoFeatureExtractor[[transformers.AutoFeatureExtractor]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/feature_extraction_auto.py#L255)

This is a generic feature extractor class that will be instantiated as one of the feature extractor classes of the
library when created with the [AutoFeatureExtractor.from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoFeatureExtractor.from_pretrained) class method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_pretrained[[transformers.AutoFeatureExtractor.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/feature_extraction_auto.py#L269)

Instantiate one of the feature extractor classes of the library from a pretrained model vocabulary.

The feature extractor class to instantiate is selected based on the `model_type` property of the config object
(either passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's
missing, by falling back to using pattern matching on `pretrained_model_name_or_path`:

- **audio-spectrogram-transformer** -- [ASTFeatureExtractor](/docs/transformers/v4.57.1/ja/model_doc/audio-spectrogram-transformer#transformers.ASTFeatureExtractor) (Audio Spectrogram Transformer model)
- **beit** -- [BeitFeatureExtractor](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitFeatureExtractor) (BEiT model)
- **chinese_clip** -- [ChineseCLIPFeatureExtractor](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPFeatureExtractor) (Chinese-CLIP model)
- **clap** -- [ClapFeatureExtractor](/docs/transformers/v4.57.1/ja/model_doc/clap#transformers.ClapFeatureExtractor) (CLAP model)
- **clip** -- [CLIPFeatureExtractor](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPFeatureExtractor) (CLIP model)
- **clipseg** -- `ViTFeatureExtractor` (CLIPSeg model)
- **clvp** -- [ClvpFeatureExtractor](/docs/transformers/v4.57.1/ja/model_doc/clvp#transformers.ClvpFeatureExtractor) (CLVP model)
- **conditional_detr** -- [ConditionalDetrFeatureExtractor](/docs/transformers/v4.57.1/ja/model_doc/conditional_detr#transformers.ConditionalDetrFeatureExtractor) (Conditional DETR model)
- **convnext** -- [ConvNextFeatureExtractor](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextFeatureExtractor) (ConvNeXT model)
- **cvt** -- [ConvNextFeatureExtractor](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextFeatureExtractor) (CvT model)
- **dac** -- `DacFeatureExtractor` (DAC model)
- **data2vec-audio** -- `Wav2Vec2FeatureExtractor` (Data2VecAudio model)
- **data2vec-vision** -- [BeitFeatureExtractor](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitFeatureExtractor) (Data2VecVision model)
- **deformable_detr** -- [DeformableDetrFeatureExtractor](/docs/transformers/v4.57.1/ja/model_doc/deformable_detr#transformers.DeformableDetrFeatureExtractor) (Deformable DETR model)
- **deit** -- [DeiTFeatureExtractor](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTFeatureExtractor) (DeiT model)
- **detr** -- [DetrFeatureExtractor](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrFeatureExtractor) (DETR model)
- **dia** -- `DiaFeatureExtractor` (Dia model)
- **dinat** -- `ViTFeatureExtractor` (DiNAT model)
- **donut-swin** -- `DonutFeatureExtractor` (DonutSwin model)
- **dpt** -- `DPTFeatureExtractor` (DPT model)
- **encodec** -- `EncodecFeatureExtractor` (EnCodec model)
- **flava** -- `FlavaFeatureExtractor` (FLAVA model)
- **gemma3n** -- `Gemma3nAudioFeatureExtractor` (Gemma3nForConditionalGeneration model)
- **glpn** -- `GLPNFeatureExtractor` (GLPN model)
- **granite_speech** -- `GraniteSpeechFeatureExtractor` (GraniteSpeech model)
- **groupvit** -- [CLIPFeatureExtractor](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPFeatureExtractor) (GroupViT model)
- **hubert** -- `Wav2Vec2FeatureExtractor` (Hubert model)
- **imagegpt** -- `ImageGPTFeatureExtractor` (ImageGPT model)
- **kyutai_speech_to_text** -- `KyutaiSpeechToTextFeatureExtractor` (KyutaiSpeechToText model)
- **layoutlmv2** -- `LayoutLMv2FeatureExtractor` (LayoutLMv2 model)
- **layoutlmv3** -- `LayoutLMv3FeatureExtractor` (LayoutLMv3 model)
- **levit** -- `LevitFeatureExtractor` (LeViT model)
- **maskformer** -- `MaskFormerFeatureExtractor` (MaskFormer model)
- **mctct** -- `MCTCTFeatureExtractor` (M-CTC-T model)
- **mimi** -- `EncodecFeatureExtractor` (Mimi model)
- **mobilenet_v1** -- `MobileNetV1FeatureExtractor` (MobileNetV1 model)
- **mobilenet_v2** -- `MobileNetV2FeatureExtractor` (MobileNetV2 model)
- **mobilevit** -- `MobileViTFeatureExtractor` (MobileViT model)
- **moonshine** -- `Wav2Vec2FeatureExtractor` (Moonshine model)
- **moshi** -- `EncodecFeatureExtractor` (Moshi model)
- **nat** -- `ViTFeatureExtractor` (NAT model)
- **owlvit** -- `OwlViTFeatureExtractor` (OWL-ViT model)
- **parakeet_ctc** -- `ParakeetFeatureExtractor` (Parakeet model)
- **parakeet_encoder** -- `ParakeetFeatureExtractor` (ParakeetEncoder model)
- **perceiver** -- `PerceiverFeatureExtractor` (Perceiver model)
- **phi4_multimodal** -- `Phi4MultimodalFeatureExtractor` (Phi4Multimodal model)
- **poolformer** -- `PoolFormerFeatureExtractor` (PoolFormer model)
- **pop2piano** -- `Pop2PianoFeatureExtractor` (Pop2Piano model)
- **regnet** -- [ConvNextFeatureExtractor](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextFeatureExtractor) (RegNet model)
- **resnet** -- [ConvNextFeatureExtractor](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextFeatureExtractor) (ResNet model)
- **seamless_m4t** -- `SeamlessM4TFeatureExtractor` (SeamlessM4T model)
- **seamless_m4t_v2** -- `SeamlessM4TFeatureExtractor` (SeamlessM4Tv2 model)
- **segformer** -- `SegformerFeatureExtractor` (SegFormer model)
- **sew** -- `Wav2Vec2FeatureExtractor` (SEW model)
- **sew-d** -- `Wav2Vec2FeatureExtractor` (SEW-D model)
- **speech_to_text** -- `Speech2TextFeatureExtractor` (Speech2Text model)
- **speecht5** -- `SpeechT5FeatureExtractor` (SpeechT5 model)
- **swiftformer** -- `ViTFeatureExtractor` (SwiftFormer model)
- **swin** -- `ViTFeatureExtractor` (Swin Transformer model)
- **swinv2** -- `ViTFeatureExtractor` (Swin Transformer V2 model)
- **table-transformer** -- [DetrFeatureExtractor](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrFeatureExtractor) (Table Transformer model)
- **timesformer** -- `VideoMAEFeatureExtractor` (TimeSformer model)
- **tvlt** -- `TvltFeatureExtractor` (TVLT model)
- **unispeech** -- `Wav2Vec2FeatureExtractor` (UniSpeech model)
- **unispeech-sat** -- `Wav2Vec2FeatureExtractor` (UniSpeechSat model)
- **univnet** -- `UnivNetFeatureExtractor` (UnivNet model)
- **van** -- [ConvNextFeatureExtractor](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextFeatureExtractor) (VAN model)
- **videomae** -- `VideoMAEFeatureExtractor` (VideoMAE model)
- **vilt** -- `ViltFeatureExtractor` (ViLT model)
- **vit** -- `ViTFeatureExtractor` (ViT model)
- **vit_mae** -- `ViTFeatureExtractor` (ViTMAE model)
- **vit_msn** -- `ViTFeatureExtractor` (ViTMSN model)
- **wav2vec2** -- `Wav2Vec2FeatureExtractor` (Wav2Vec2 model)
- **wav2vec2-bert** -- `Wav2Vec2FeatureExtractor` (Wav2Vec2-BERT model)
- **wav2vec2-conformer** -- `Wav2Vec2FeatureExtractor` (Wav2Vec2-Conformer model)
- **wavlm** -- `Wav2Vec2FeatureExtractor` (WavLM model)
- **whisper** -- `WhisperFeatureExtractor` (Whisper model)
- **xclip** -- [CLIPFeatureExtractor](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPFeatureExtractor) (X-CLIP model)
- **xcodec** -- `DacFeatureExtractor` (X-CODEC model)
- **yolos** -- `YolosFeatureExtractor` (YOLOS model)

Passing `token=True` is required when you want to use a private model.

Examples:

```python
>>> from transformers import AutoFeatureExtractor

>>> # Download feature extractor from huggingface.co and cache.
>>> feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")

>>> # If feature extractor files are in a directory (e.g. feature extractor was saved using *save_pretrained('./test/saved_model/')*)
>>> # feature_extractor = AutoFeatureExtractor.from_pretrained("./test/saved_model/")
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : This can be either:  - a string, the *model id* of a pretrained feature_extractor hosted inside a model repo on huggingface.co. - a path to a *directory* containing a feature extractor file saved using the [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/feature_extractor#transformers.FeatureExtractionMixin.save_pretrained) method, e.g., `./my_model_directory/`. - a path or url to a saved feature extractor JSON *file*, e.g., `./my_model_directory/preprocessor_config.json`.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force (re-)downloading the feature extractor files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

token (`str` or *bool*, *optional*) : The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated when running `hf auth login` (stored in `~/.huggingface`).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

return_unused_kwargs (`bool`, *optional*, defaults to `False`) : If `False`, then this function returns just the final feature extractor object. If `True`, then this function returns a `Tuple(feature_extractor, unused_kwargs)` where *unused_kwargs* is a dictionary consisting of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of `kwargs` which has not been used to update `feature_extractor` and is otherwise ignored.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

kwargs (`dict[str, Any]`, *optional*) : The values in kwargs of any keys which are feature extractor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are *not* feature extractor attributes is controlled by the `return_unused_kwargs` keyword parameter.
#### register[[transformers.AutoFeatureExtractor.register]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/feature_extraction_auto.py#L409)

Register a new feature extractor for this class.

**Parameters:**

config_class ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The configuration corresponding to the model to register.

feature_extractor_class (`FeatureExtractorMixin`) : The feature extractor to register.
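The registration flow mirrors the tokenizer case: map a config class to your feature extractor class so the auto class can route to it. The sketch below uses hypothetical `MyAudioConfig`/`MyAudioFeatureExtractor` classes (a real extractor would implement `__call__` to produce model inputs):

```python
from transformers import AutoConfig, AutoFeatureExtractor, PretrainedConfig
from transformers.feature_extraction_utils import FeatureExtractionMixin
from transformers.models.auto.feature_extraction_auto import FEATURE_EXTRACTOR_MAPPING

# Hypothetical config for a custom audio model.
class MyAudioConfig(PretrainedConfig):
    model_type = "my-audio-model"

# Hypothetical feature extractor (feature computation omitted for brevity).
class MyAudioFeatureExtractor(FeatureExtractionMixin):
    pass

AutoConfig.register("my-audio-model", MyAudioConfig)
AutoFeatureExtractor.register(MyAudioConfig, MyAudioFeatureExtractor)

# The auto mapping now resolves MyAudioConfig to the custom extractor class.
print(FEATURE_EXTRACTOR_MAPPING[MyAudioConfig])
```

After registration, `AutoFeatureExtractor.from_pretrained` on a checkpoint whose config reports `model_type: "my-audio-model"` would instantiate `MyAudioFeatureExtractor`.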

## AutoImageProcessor[[transformers.AutoImageProcessor]]

#### transformers.AutoImageProcessor[[transformers.AutoImageProcessor]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/image_processing_auto.py#L354)

This is a generic image processor class that will be instantiated as one of the image processor classes of the
library when created with the [AutoImageProcessor.from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoImageProcessor.from_pretrained) class method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_pretrained[[transformers.AutoImageProcessor.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/image_processing_auto.py#L368)

- **pretrained_model_name_or_path** (`str` or `os.PathLike`) --
  This can be either:

  - a string, the *model id* of a pretrained image_processor hosted inside a model repo on
    huggingface.co.
  - a path to a *directory* containing an image processor file saved using the
    [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/image_processor#transformers.ImageProcessingMixin.save_pretrained) method, e.g.,
    `./my_model_directory/`.
  - a path or url to a saved image processor JSON *file*, e.g.,
    `./my_model_directory/preprocessor_config.json`.
- **cache_dir** (`str` or `os.PathLike`, *optional*) --
  Path to a directory in which a downloaded pretrained model image processor should be cached if the
  standard cache should not be used.
- **force_download** (`bool`, *optional*, defaults to `False`) --
  Whether or not to force (re-)downloading the image processor files, overriding the cached versions if
  they exist.
- **resume_download** --
  Deprecated and ignored. All downloads are now resumed by default when possible.
  Will be removed in v5 of Transformers.
- **proxies** (`dict[str, str]`, *optional*) --
  A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128',
  'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
- **token** (`str` or *bool*, *optional*) --
  The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated
  when running `hf auth login` (stored in `~/.huggingface`).
- **revision** (`str`, *optional*, defaults to `"main"`) --
  The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a
  git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any
  identifier allowed by git.
- **use_fast** (`bool`, *optional*, defaults to `False`) --
  Use a fast torchvision-based image processor if it is supported for a given model.
  If a fast image processor is not available for a given model, a normal numpy-based image processor
  is returned instead.
- **return_unused_kwargs** (`bool`, *optional*, defaults to `False`) --
  If `False`, then this function returns just the final image processor object. If `True`, then this
  function returns a `Tuple(image_processor, unused_kwargs)` where *unused_kwargs* is a dictionary
  consisting of the key/value pairs whose keys are not image processor attributes: i.e., the part of
  `kwargs` which has not been used to update `image_processor` and is otherwise ignored.
- **trust_remote_code** (`bool`, *optional*, defaults to `False`) --
  Whether or not to allow for custom models defined on the Hub in their own modeling files. This option
  should only be set to `True` for repositories you trust and in which you have read the code, as it will
  execute code present on the Hub on your local machine.
- **image_processor_filename** (`str`, *optional*, defaults to `"config.json"`) --
  The name of the file in the model directory to use for the image processor config.
- **kwargs** (`dict[str, Any]`, *optional*) --
  The values in kwargs of any keys which are image processor attributes will be used to override the
  loaded values. Behavior concerning key/value pairs whose keys are *not* image processor attributes is
  controlled by the `return_unused_kwargs` keyword parameter.

Instantiate one of the image processor classes of the library from a pretrained model vocabulary.

The image processor class to instantiate is selected based on the `model_type` property of the config object
(either passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's
missing, by falling back to using pattern matching on `pretrained_model_name_or_path`:

- **aimv2** -- [CLIPImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPImageProcessor) or [CLIPImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPImageProcessorFast) (AIMv2 model)
- **aimv2_vision_model** -- [CLIPImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPImageProcessor) or [CLIPImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPImageProcessorFast) (Aimv2VisionModel model)
- **align** -- `EfficientNetImageProcessor` or `EfficientNetImageProcessorFast` (ALIGN model)
- **aria** -- `AriaImageProcessor` (Aria model)
- **beit** -- [BeitImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitImageProcessor) or [BeitImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitImageProcessorFast) (BEiT model)
- **bit** -- [BitImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/bit#transformers.BitImageProcessor) or [BitImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/bit#transformers.BitImageProcessorFast) (BiT model)
- **blip** -- [BlipImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipImageProcessor) or [BlipImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipImageProcessorFast) (BLIP model)
- **blip-2** -- [BlipImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipImageProcessor) or [BlipImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipImageProcessorFast) (BLIP-2 model)
- **bridgetower** -- [BridgeTowerImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/bridgetower#transformers.BridgeTowerImageProcessor) or [BridgeTowerImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/bridgetower#transformers.BridgeTowerImageProcessorFast) (BridgeTower model)
- **chameleon** -- `ChameleonImageProcessor` or `ChameleonImageProcessorFast` (Chameleon model)
- **chinese_clip** -- [ChineseCLIPImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPImageProcessor) or [ChineseCLIPImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPImageProcessorFast) (Chinese-CLIP model)
- **clip** -- [CLIPImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPImageProcessor) or [CLIPImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPImageProcessorFast) (CLIP model)
- **clipseg** -- `ViTImageProcessor` or `ViTImageProcessorFast` (CLIPSeg model)
- **cohere2_vision** -- `Cohere2VisionImageProcessorFast` (Cohere2Vision model)
- **conditional_detr** -- [ConditionalDetrImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/conditional_detr#transformers.ConditionalDetrImageProcessor) or [ConditionalDetrImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/conditional_detr#transformers.ConditionalDetrImageProcessorFast) (Conditional DETR model)
- **convnext** -- [ConvNextImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextImageProcessor) or [ConvNextImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextImageProcessorFast) (ConvNeXT model)
- **convnextv2** -- [ConvNextImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextImageProcessor) or [ConvNextImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextImageProcessorFast) (ConvNeXTV2 model)
- **cvt** -- [ConvNextImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextImageProcessor) or [ConvNextImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextImageProcessorFast) (CvT model)
- **data2vec-vision** -- [BeitImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitImageProcessor) or [BeitImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitImageProcessorFast) (Data2VecVision model)
- **deepseek_vl** -- `DeepseekVLImageProcessor` or `DeepseekVLImageProcessorFast` (DeepseekVL model)
- **deepseek_vl_hybrid** -- `DeepseekVLHybridImageProcessor` or `DeepseekVLHybridImageProcessorFast` (DeepseekVLHybrid model)
- **deformable_detr** -- [DeformableDetrImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/deformable_detr#transformers.DeformableDetrImageProcessor) or `DeformableDetrImageProcessorFast` (Deformable DETR model)
- **deit** -- [DeiTImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTImageProcessor) or [DeiTImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTImageProcessorFast) (DeiT model)
- **depth_anything** -- `DPTImageProcessor` or `DPTImageProcessorFast` (Depth Anything model)
- **depth_pro** -- `DepthProImageProcessor` or `DepthProImageProcessorFast` (DepthPro model)
- **deta** -- [DetaImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/deta#transformers.DetaImageProcessor) (DETA model)
- **detr** -- [DetrImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrImageProcessor) or [DetrImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrImageProcessorFast) (DETR model)
- **dinat** -- `ViTImageProcessor` or `ViTImageProcessorFast` (DiNAT model)
- **dinov2** -- [BitImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/bit#transformers.BitImageProcessor) or [BitImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/bit#transformers.BitImageProcessorFast) (DINOv2 model)
- **dinov3_vit** -- `DINOv3ViTImageProcessorFast` (DINOv3 ViT model)
- **donut-swin** -- `DonutImageProcessor` or `DonutImageProcessorFast` (DonutSwin model)
- **dpt** -- `DPTImageProcessor` or `DPTImageProcessorFast` (DPT model)
- **edgetam** -- `Sam2ImageProcessorFast` (EdgeTAM model)
- **efficientformer** -- `EfficientFormerImageProcessor` (EfficientFormer model)
- **efficientloftr** -- `EfficientLoFTRImageProcessor` or `EfficientLoFTRImageProcessorFast` (EfficientLoFTR model)
- **efficientnet** -- `EfficientNetImageProcessor` or `EfficientNetImageProcessorFast` (EfficientNet model)
- **eomt** -- `EomtImageProcessor` or `EomtImageProcessorFast` (EoMT model)
- **flava** -- `FlavaImageProcessor` or `FlavaImageProcessorFast` (FLAVA model)
- **focalnet** -- [BitImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/bit#transformers.BitImageProcessor) or [BitImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/bit#transformers.BitImageProcessorFast) (FocalNet model)
- **fuyu** -- `FuyuImageProcessor` (Fuyu model)
- **gemma3** -- `Gemma3ImageProcessor` or `Gemma3ImageProcessorFast` (Gemma3ForConditionalGeneration model)
- **gemma3n** -- `SiglipImageProcessor` or `SiglipImageProcessorFast` (Gemma3nForConditionalGeneration model)
- **git** -- [CLIPImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPImageProcessor) or [CLIPImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPImageProcessorFast) (GIT model)
- **glm4v** -- `Glm4vImageProcessor` or `Glm4vImageProcessorFast` (GLM4V model)
- **glpn** -- `GLPNImageProcessor` (GLPN model)
- **got_ocr2** -- `GotOcr2ImageProcessor` or `GotOcr2ImageProcessorFast` (GOT-OCR2 model)
- **grounding-dino** -- `GroundingDinoImageProcessor` or `GroundingDinoImageProcessorFast` (Grounding DINO model)
- **groupvit** -- [CLIPImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPImageProcessor) or [CLIPImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPImageProcessorFast) (GroupViT model)
- **hiera** -- [BitImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/bit#transformers.BitImageProcessor) or [BitImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/bit#transformers.BitImageProcessorFast) (Hiera model)
- **idefics** -- `IdeficsImageProcessor` (IDEFICS model)
- **idefics2** -- `Idefics2ImageProcessor` or `Idefics2ImageProcessorFast` (Idefics2 model)
- **idefics3** -- `Idefics3ImageProcessor` or `Idefics3ImageProcessorFast` (Idefics3 model)
- **ijepa** -- `ViTImageProcessor` or `ViTImageProcessorFast` (I-JEPA model)
- **imagegpt** -- `ImageGPTImageProcessor` or `ImageGPTImageProcessorFast` (ImageGPT model)
- **instructblip** -- [BlipImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipImageProcessor) or [BlipImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipImageProcessorFast) (InstructBLIP model)
- **instructblipvideo** -- `InstructBlipVideoImageProcessor` (InstructBlipVideo model)
- **janus** -- `JanusImageProcessor` or `JanusImageProcessorFast` (Janus model)
- **kosmos-2** -- [CLIPImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPImageProcessor) or [CLIPImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPImageProcessorFast) (KOSMOS-2 model)
- **kosmos-2.5** -- `Kosmos2_5ImageProcessor` or `Kosmos2_5ImageProcessorFast` (KOSMOS-2.5 model)
- **layoutlmv2** -- `LayoutLMv2ImageProcessor` or `LayoutLMv2ImageProcessorFast` (LayoutLMv2 model)
- **layoutlmv3** -- `LayoutLMv3ImageProcessor` or `LayoutLMv3ImageProcessorFast` (LayoutLMv3 model)
- **levit** -- `LevitImageProcessor` or `LevitImageProcessorFast` (LeViT model)
- **lfm2_vl** -- `Lfm2VlImageProcessorFast` (Lfm2Vl model)
- **lightglue** -- `LightGlueImageProcessor` (LightGlue model)
- **llama4** -- `Llama4ImageProcessor` or `Llama4ImageProcessorFast` (Llama4 model)
- **llava** -- `LlavaImageProcessor` or `LlavaImageProcessorFast` (LLaVa model)
- **llava_next** -- `LlavaNextImageProcessor` or `LlavaNextImageProcessorFast` (LLaVA-NeXT model)
- **llava_next_video** -- `LlavaNextVideoImageProcessor` (LLaVa-NeXT-Video model)
- **llava_onevision** -- `LlavaOnevisionImageProcessor` or `LlavaOnevisionImageProcessorFast` (LLaVA-Onevision model)
- **mask2former** -- `Mask2FormerImageProcessor` or `Mask2FormerImageProcessorFast` (Mask2Former model)
- **maskformer** -- `MaskFormerImageProcessor` or `MaskFormerImageProcessorFast` (MaskFormer model)
- **metaclip_2** -- [CLIPImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPImageProcessor) or [CLIPImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPImageProcessorFast) (MetaCLIP 2 model)
- **mgp-str** -- `ViTImageProcessor` or `ViTImageProcessorFast` (MGP-STR model)
- **mistral3** -- `PixtralImageProcessor` or `PixtralImageProcessorFast` (Mistral3 model)
- **mlcd** -- [CLIPImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPImageProcessor) or [CLIPImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPImageProcessorFast) (MLCD model)
- **mllama** -- `MllamaImageProcessor` (Mllama model)
- **mm-grounding-dino** -- `GroundingDinoImageProcessor` or `GroundingDinoImageProcessorFast` (MM Grounding DINO model)
- **mobilenet_v1** -- `MobileNetV1ImageProcessor` or `MobileNetV1ImageProcessorFast` (MobileNetV1 model)
- **mobilenet_v2** -- `MobileNetV2ImageProcessor` or `MobileNetV2ImageProcessorFast` (MobileNetV2 model)
- **mobilevit** -- `MobileViTImageProcessor` or `MobileViTImageProcessorFast` (MobileViT model)
- **mobilevitv2** -- `MobileViTImageProcessor` or `MobileViTImageProcessorFast` (MobileViTV2 model)
- **nat** -- `ViTImageProcessor` or `ViTImageProcessorFast` (NAT model)
- **nougat** -- `NougatImageProcessor` or `NougatImageProcessorFast` (Nougat model)
- **oneformer** -- `OneFormerImageProcessor` or `OneFormerImageProcessorFast` (OneFormer model)
- **ovis2** -- `Ovis2ImageProcessor` or `Ovis2ImageProcessorFast` (Ovis2 model)
- **owlv2** -- `Owlv2ImageProcessor` or `Owlv2ImageProcessorFast` (OWLv2 model)
- **owlvit** -- `OwlViTImageProcessor` or `OwlViTImageProcessorFast` (OWL-ViT model)
- **paligemma** -- `SiglipImageProcessor` or `SiglipImageProcessorFast` (PaliGemma model)
- **perceiver** -- `PerceiverImageProcessor` or `PerceiverImageProcessorFast` (Perceiver model)
- **perception_lm** -- `PerceptionLMImageProcessorFast` (PerceptionLM model)
- **phi4_multimodal** -- `Phi4MultimodalImageProcessorFast` (Phi4Multimodal model)
- **pix2struct** -- `Pix2StructImageProcessor` (Pix2Struct model)
- **pixtral** -- `PixtralImageProcessor` or `PixtralImageProcessorFast` (Pixtral model)
- **poolformer** -- `PoolFormerImageProcessor` or `PoolFormerImageProcessorFast` (PoolFormer model)
- **prompt_depth_anything** -- `PromptDepthAnythingImageProcessor` or `PromptDepthAnythingImageProcessorFast` (PromptDepthAnything model)
- **pvt** -- `PvtImageProcessor` or `PvtImageProcessorFast` (PVT model)
- **pvt_v2** -- `PvtImageProcessor` or `PvtImageProcessorFast` (PVTv2 model)
- **qwen2_5_vl** -- `Qwen2VLImageProcessor` or `Qwen2VLImageProcessorFast` (Qwen2_5_VL model)
- **qwen2_vl** -- `Qwen2VLImageProcessor` or `Qwen2VLImageProcessorFast` (Qwen2VL model)
- **qwen3_vl** -- `Qwen2VLImageProcessor` or `Qwen2VLImageProcessorFast` (Qwen3VL model)
- **regnet** -- [ConvNextImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextImageProcessor) or [ConvNextImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextImageProcessorFast) (RegNet model)
- **resnet** -- [ConvNextImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextImageProcessor) or [ConvNextImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextImageProcessorFast) (ResNet model)
- **rt_detr** -- `RTDetrImageProcessor` or `RTDetrImageProcessorFast` (RT-DETR model)
- **sam** -- `SamImageProcessor` or `SamImageProcessorFast` (SAM model)
- **sam2** -- `Sam2ImageProcessorFast` (SAM2 model)
- **sam_hq** -- `SamImageProcessor` or `SamImageProcessorFast` (SAM-HQ model)
- **segformer** -- `SegformerImageProcessor` or `SegformerImageProcessorFast` (SegFormer model)
- **seggpt** -- `SegGptImageProcessor` (SegGPT model)
- **shieldgemma2** -- `Gemma3ImageProcessor` or `Gemma3ImageProcessorFast` (Shieldgemma2 model)
- **siglip** -- `SiglipImageProcessor` or `SiglipImageProcessorFast` (SigLIP model)
- **siglip2** -- `Siglip2ImageProcessor` or `Siglip2ImageProcessorFast` (SigLIP2 model)
- **smolvlm** -- `SmolVLMImageProcessor` or `SmolVLMImageProcessorFast` (SmolVLM model)
- **superglue** -- `SuperGlueImageProcessor` (SuperGlue model)
- **superpoint** -- `SuperPointImageProcessor` or `SuperPointImageProcessorFast` (SuperPoint model)
- **swiftformer** -- `ViTImageProcessor` or `ViTImageProcessorFast` (SwiftFormer model)
- **swin** -- `ViTImageProcessor` or `ViTImageProcessorFast` (Swin Transformer model)
- **swin2sr** -- `Swin2SRImageProcessor` or `Swin2SRImageProcessorFast` (Swin2SR model)
- **swinv2** -- `ViTImageProcessor` or `ViTImageProcessorFast` (Swin Transformer V2 model)
- **table-transformer** -- [DetrImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrImageProcessor) or [DetrImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrImageProcessorFast) (Table Transformer model)
- **textnet** -- `TextNetImageProcessor` or `TextNetImageProcessorFast` (TextNet model)
- **timesformer** -- `VideoMAEImageProcessor` (TimeSformer model)
- **timm_wrapper** -- `TimmWrapperImageProcessor` (TimmWrapperModel model)
- **tvlt** -- `TvltImageProcessor` (TVLT model)
- **tvp** -- `TvpImageProcessor` or `TvpImageProcessorFast` (TVP model)
- **udop** -- `LayoutLMv3ImageProcessor` or `LayoutLMv3ImageProcessorFast` (UDOP model)
- **upernet** -- `SegformerImageProcessor` or `SegformerImageProcessorFast` (UPerNet model)
- **van** -- [ConvNextImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextImageProcessor) or [ConvNextImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextImageProcessorFast) (VAN model)
- **videomae** -- `VideoMAEImageProcessor` (VideoMAE model)
- **vilt** -- `ViltImageProcessor` or `ViltImageProcessorFast` (ViLT model)
- **vipllava** -- [CLIPImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPImageProcessor) or [CLIPImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPImageProcessorFast) (VipLlava model)
- **vit** -- `ViTImageProcessor` or `ViTImageProcessorFast` (ViT model)
- **vit_hybrid** -- `ViTHybridImageProcessor` (ViT Hybrid model)
- **vit_mae** -- `ViTImageProcessor` or `ViTImageProcessorFast` (ViTMAE model)
- **vit_msn** -- `ViTImageProcessor` or `ViTImageProcessorFast` (ViTMSN model)
- **vitmatte** -- `VitMatteImageProcessor` or `VitMatteImageProcessorFast` (ViTMatte model)
- **xclip** -- [CLIPImageProcessor](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPImageProcessor) or [CLIPImageProcessorFast](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPImageProcessorFast) (X-CLIP model)
- **yolos** -- `YolosImageProcessor` or `YolosImageProcessorFast` (YOLOS model)
- **zoedepth** -- `ZoeDepthImageProcessor` or `ZoeDepthImageProcessorFast` (ZoeDepth model)

Passing `token=True` is required when you want to use a private model.

Examples:

```python
>>> from transformers import AutoImageProcessor

>>> # Download image processor from huggingface.co and cache.
>>> image_processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")

>>> # If image processor files are in a directory (e.g. image processor was saved using *save_pretrained('./test/saved_model/')*)
>>> # image_processor = AutoImageProcessor.from_pretrained("./test/saved_model/")
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : This can be either:  - a string, the *model id* of a pretrained image processor hosted inside a model repo on huggingface.co. - a path to a *directory* containing an image processor file saved using the [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/image_processor#transformers.ImageProcessingMixin.save_pretrained) method, e.g., `./my_model_directory/`. - a path or url to a saved image processor JSON *file*, e.g., `./my_model_directory/preprocessor_config.json`.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model image processor should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force (re-)downloading the image processor files, overriding any cached versions that exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

token (`str` or *bool*, *optional*) : The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated when running `hf auth login` (stored in `~/.huggingface`).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

use_fast (`bool`, *optional*, defaults to `False`) : Use a fast torchvision-based image processor if one is supported for the given model. If a fast image processor is not available for the model, a normal numpy-based image processor is returned instead.

return_unused_kwargs (`bool`, *optional*, defaults to `False`) : If `False`, then this function returns just the final image processor object. If `True`, then this function returns a `Tuple(image_processor, unused_kwargs)` where *unused_kwargs* is a dictionary consisting of the key/value pairs whose keys are not image processor attributes: i.e., the part of `kwargs` which has not been used to update `image_processor` and is otherwise ignored.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

image_processor_filename (`str`, *optional*, defaults to `"config.json"`) : The name of the file in the model directory to use for the image processor config.

kwargs (`dict[str, Any]`, *optional*) : The values in kwargs of any keys which are image processor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are *not* image processor attributes is controlled by the `return_unused_kwargs` keyword parameter.
#### register[[transformers.AutoImageProcessor.register]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/image_processing_auto.py#L628)

Register a new image processor for this class.

**Parameters:**

config_class ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The configuration corresponding to the model to register.

image_processor_class ([ImageProcessingMixin](/docs/transformers/v4.57.1/ja/main_classes/image_processor#transformers.ImageProcessingMixin)) : The image processor to register.
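The two parameters above mirror the usual auto-class extension pattern: register a custom config under its `model_type`, then map that config class to your image processor. A minimal sketch, using hypothetical `MyVisionConfig`/`MyImageProcessor` classes (the names are illustrative, not part of the library):

```python
from transformers import AutoConfig, AutoImageProcessor, PretrainedConfig
from transformers.image_processing_utils import BaseImageProcessor

# Hypothetical custom config -- model_type must match the key given to AutoConfig.register.
class MyVisionConfig(PretrainedConfig):
    model_type = "my-vision"

# Hypothetical custom image processor; a real one would implement preprocess().
class MyImageProcessor(BaseImageProcessor):
    pass

# Register the config, then map it to the image processor class.
AutoConfig.register("my-vision", MyVisionConfig)
AutoImageProcessor.register(MyVisionConfig, MyImageProcessor)
```

After this, `AutoImageProcessor.from_pretrained()` can resolve repositories whose config declares `"model_type": "my-vision"`.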

## AutoProcessor[[transformers.AutoProcessor]]

#### transformers.AutoProcessor[[transformers.AutoProcessor]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/processing_auto.py#L188)

This is a generic processor class that will be instantiated as one of the processor classes of the library when
created with the [AutoProcessor.from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoProcessor.from_pretrained) class method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_pretrained[[transformers.AutoProcessor.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/processing_auto.py#L202)

- **pretrained_model_name_or_path** (`str` or `os.PathLike`) --
  This can be either:

  - a string, the *model id* of a pretrained feature_extractor hosted inside a model repo on
    huggingface.co.
  - a path to a *directory* containing processor files saved using the `save_pretrained()` method,
    e.g., `./my_model_directory/`.
- **cache_dir** (`str` or `os.PathLike`, *optional*) --
  Path to a directory in which a downloaded pretrained model feature extractor should be cached if the
  standard cache should not be used.
- **force_download** (`bool`, *optional*, defaults to `False`) --
  Whether or not to force (re-)downloading the feature extractor files, overriding any cached
  versions that exist.
- **resume_download** --
  Deprecated and ignored. All downloads are now resumed by default when possible.
  Will be removed in v5 of Transformers.
- **proxies** (`dict[str, str]`, *optional*) --
  A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128',
  'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
- **token** (`str` or *bool*, *optional*) --
  The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated
  when running `hf auth login` (stored in `~/.huggingface`).
- **revision** (`str`, *optional*, defaults to `"main"`) --
  The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a
  git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any
  identifier allowed by git.
- **return_unused_kwargs** (`bool`, *optional*, defaults to `False`) --
  If `False`, then this function returns just the final feature extractor object. If `True`, then this
  function returns a `Tuple(feature_extractor, unused_kwargs)` where *unused_kwargs* is a dictionary
  consisting of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of
  `kwargs` which has not been used to update `feature_extractor` and is otherwise ignored.
- **trust_remote_code** (`bool`, *optional*, defaults to `False`) --
  Whether or not to allow for custom models defined on the Hub in their own modeling files. This option
  should only be set to `True` for repositories you trust and in which you have read the code, as it will
  execute code present on the Hub on your local machine.
- **kwargs** (`dict[str, Any]`, *optional*) --
  The values in kwargs of any keys which are feature extractor attributes will be used to override the
  loaded values. Behavior concerning key/value pairs whose keys are *not* feature extractor attributes is
  controlled by the `return_unused_kwargs` keyword parameter.

Instantiate one of the processor classes of the library from a pretrained model vocabulary.

The processor class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible):

- **aimv2** -- [CLIPProcessor](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPProcessor) (AIMv2 model)
- **align** -- [AlignProcessor](/docs/transformers/v4.57.1/ja/model_doc/align#transformers.AlignProcessor) (ALIGN model)
- **altclip** -- [AltCLIPProcessor](/docs/transformers/v4.57.1/ja/model_doc/altclip#transformers.AltCLIPProcessor) (AltCLIP model)
- **aria** -- `AriaProcessor` (Aria model)
- **aya_vision** -- `AyaVisionProcessor` (AyaVision model)
- **bark** -- [BarkProcessor](/docs/transformers/v4.57.1/ja/model_doc/bark#transformers.BarkProcessor) (Bark model)
- **blip** -- [BlipProcessor](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipProcessor) (BLIP model)
- **blip-2** -- [Blip2Processor](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2Processor) (BLIP-2 model)
- **bridgetower** -- [BridgeTowerProcessor](/docs/transformers/v4.57.1/ja/model_doc/bridgetower#transformers.BridgeTowerProcessor) (BridgeTower model)
- **chameleon** -- `ChameleonProcessor` (Chameleon model)
- **chinese_clip** -- [ChineseCLIPProcessor](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPProcessor) (Chinese-CLIP model)
- **clap** -- [ClapProcessor](/docs/transformers/v4.57.1/ja/model_doc/clap#transformers.ClapProcessor) (CLAP model)
- **clip** -- [CLIPProcessor](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPProcessor) (CLIP model)
- **clipseg** -- [CLIPSegProcessor](/docs/transformers/v4.57.1/ja/model_doc/clipseg#transformers.CLIPSegProcessor) (CLIPSeg model)
- **clvp** -- [ClvpProcessor](/docs/transformers/v4.57.1/ja/model_doc/clvp#transformers.ClvpProcessor) (CLVP model)
- **cohere2_vision** -- `Cohere2VisionProcessor` (Cohere2Vision model)
- **colpali** -- `ColPaliProcessor` (ColPali model)
- **colqwen2** -- `ColQwen2Processor` (ColQwen2 model)
- **deepseek_vl** -- `DeepseekVLProcessor` (DeepseekVL model)
- **deepseek_vl_hybrid** -- `DeepseekVLHybridProcessor` (DeepseekVLHybrid model)
- **dia** -- `DiaProcessor` (Dia model)
- **edgetam** -- `Sam2Processor` (EdgeTAM model)
- **emu3** -- `Emu3Processor` (Emu3 model)
- **evolla** -- `EvollaProcessor` (Evolla model)
- **flava** -- `FlavaProcessor` (FLAVA model)
- **florence2** -- `Florence2Processor` (Florence2 model)
- **fuyu** -- `FuyuProcessor` (Fuyu model)
- **gemma3** -- `Gemma3Processor` (Gemma3ForConditionalGeneration model)
- **gemma3n** -- `Gemma3nProcessor` (Gemma3nForConditionalGeneration model)
- **git** -- `GitProcessor` (GIT model)
- **glm4v** -- `Glm4vProcessor` (GLM4V model)
- **glm4v_moe** -- `Glm4vProcessor` (GLM4VMOE model)
- **got_ocr2** -- `GotOcr2Processor` (GOT-OCR2 model)
- **granite_speech** -- `GraniteSpeechProcessor` (GraniteSpeech model)
- **grounding-dino** -- `GroundingDinoProcessor` (Grounding DINO model)
- **groupvit** -- [CLIPProcessor](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPProcessor) (GroupViT model)
- **hubert** -- `Wav2Vec2Processor` (Hubert model)
- **idefics** -- `IdeficsProcessor` (IDEFICS model)
- **idefics2** -- `Idefics2Processor` (Idefics2 model)
- **idefics3** -- `Idefics3Processor` (Idefics3 model)
- **instructblip** -- `InstructBlipProcessor` (InstructBLIP model)
- **instructblipvideo** -- `InstructBlipVideoProcessor` (InstructBlipVideo model)
- **internvl** -- `InternVLProcessor` (InternVL model)
- **janus** -- `JanusProcessor` (Janus model)
- **kosmos-2** -- `Kosmos2Processor` (KOSMOS-2 model)
- **kosmos-2.5** -- `Kosmos2_5Processor` (KOSMOS-2.5 model)
- **kyutai_speech_to_text** -- `KyutaiSpeechToTextProcessor` (KyutaiSpeechToText model)
- **layoutlmv2** -- `LayoutLMv2Processor` (LayoutLMv2 model)
- **layoutlmv3** -- `LayoutLMv3Processor` (LayoutLMv3 model)
- **lfm2_vl** -- `Lfm2VlProcessor` (Lfm2Vl model)
- **llama4** -- `Llama4Processor` (Llama4 model)
- **llava** -- `LlavaProcessor` (LLaVa model)
- **llava_next** -- `LlavaNextProcessor` (LLaVA-NeXT model)
- **llava_next_video** -- `LlavaNextVideoProcessor` (LLaVa-NeXT-Video model)
- **llava_onevision** -- `LlavaOnevisionProcessor` (LLaVA-Onevision model)
- **markuplm** -- `MarkupLMProcessor` (MarkupLM model)
- **mctct** -- `MCTCTProcessor` (M-CTC-T model)
- **metaclip_2** -- [CLIPProcessor](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPProcessor) (MetaCLIP 2 model)
- **mgp-str** -- `MgpstrProcessor` (MGP-STR model)
- **mistral3** -- `PixtralProcessor` (Mistral3 model)
- **mllama** -- `MllamaProcessor` (Mllama model)
- **mm-grounding-dino** -- `GroundingDinoProcessor` (MM Grounding DINO model)
- **moonshine** -- `Wav2Vec2Processor` (Moonshine model)
- **oneformer** -- `OneFormerProcessor` (OneFormer model)
- **ovis2** -- `Ovis2Processor` (Ovis2 model)
- **owlv2** -- `Owlv2Processor` (OWLv2 model)
- **owlvit** -- `OwlViTProcessor` (OWL-ViT model)
- **paligemma** -- `PaliGemmaProcessor` (PaliGemma model)
- **perception_lm** -- `PerceptionLMProcessor` (PerceptionLM model)
- **phi4_multimodal** -- `Phi4MultimodalProcessor` (Phi4Multimodal model)
- **pix2struct** -- `Pix2StructProcessor` (Pix2Struct model)
- **pixtral** -- `PixtralProcessor` (Pixtral model)
- **pop2piano** -- `Pop2PianoProcessor` (Pop2Piano model)
- **qwen2_5_omni** -- `Qwen2_5OmniProcessor` (Qwen2_5Omni model)
- **qwen2_5_vl** -- `Qwen2_5_VLProcessor` (Qwen2_5_VL model)
- **qwen2_audio** -- `Qwen2AudioProcessor` (Qwen2Audio model)
- **qwen2_vl** -- `Qwen2VLProcessor` (Qwen2VL model)
- **qwen3_omni_moe** -- `Qwen3OmniMoeProcessor` (Qwen3OmniMoE model)
- **qwen3_vl** -- `Qwen3VLProcessor` (Qwen3VL model)
- **qwen3_vl_moe** -- `Qwen3VLProcessor` (Qwen3VLMoe model)
- **sam** -- `SamProcessor` (SAM model)
- **sam2** -- `Sam2Processor` (SAM2 model)
- **sam_hq** -- `SamHQProcessor` (SAM-HQ model)
- **seamless_m4t** -- `SeamlessM4TProcessor` (SeamlessM4T model)
- **sew** -- `Wav2Vec2Processor` (SEW model)
- **sew-d** -- `Wav2Vec2Processor` (SEW-D model)
- **shieldgemma2** -- `ShieldGemma2Processor` (Shieldgemma2 model)
- **siglip** -- `SiglipProcessor` (SigLIP model)
- **siglip2** -- `Siglip2Processor` (SigLIP2 model)
- **smolvlm** -- `SmolVLMProcessor` (SmolVLM model)
- **speech_to_text** -- `Speech2TextProcessor` (Speech2Text model)
- **speech_to_text_2** -- `Speech2Text2Processor` (Speech2Text2 model)
- **speecht5** -- `SpeechT5Processor` (SpeechT5 model)
- **trocr** -- `TrOCRProcessor` (TrOCR model)
- **tvlt** -- `TvltProcessor` (TVLT model)
- **tvp** -- `TvpProcessor` (TVP model)
- **udop** -- `UdopProcessor` (UDOP model)
- **unispeech** -- `Wav2Vec2Processor` (UniSpeech model)
- **unispeech-sat** -- `Wav2Vec2Processor` (UniSpeechSat model)
- **video_llava** -- `VideoLlavaProcessor` (VideoLlava model)
- **vilt** -- `ViltProcessor` (ViLT model)
- **vipllava** -- `LlavaProcessor` (VipLlava model)
- **vision-text-dual-encoder** -- `VisionTextDualEncoderProcessor` (VisionTextDualEncoder model)
- **voxtral** -- `VoxtralProcessor` (Voxtral model)
- **wav2vec2** -- `Wav2Vec2Processor` (Wav2Vec2 model)
- **wav2vec2-bert** -- `Wav2Vec2Processor` (Wav2Vec2-BERT model)
- **wav2vec2-conformer** -- `Wav2Vec2Processor` (Wav2Vec2-Conformer model)
- **wavlm** -- `Wav2Vec2Processor` (WavLM model)
- **whisper** -- `WhisperProcessor` (Whisper model)
- **xclip** -- `XCLIPProcessor` (X-CLIP model)

Passing `token=True` is required when you want to use a private model.

Examples:

```python
>>> from transformers import AutoProcessor

>>> # Download processor from huggingface.co and cache.
>>> processor = AutoProcessor.from_pretrained("facebook/wav2vec2-base-960h")

>>> # If processor files are in a directory (e.g. processor was saved using *save_pretrained('./test/saved_model/')*)
>>> # processor = AutoProcessor.from_pretrained("./test/saved_model/")
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : This can be either:  - a string, the *model id* of a pretrained feature_extractor hosted inside a model repo on huggingface.co. - a path to a *directory* containing processor files saved using the `save_pretrained()` method, e.g., `./my_model_directory/`.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force (re-)downloading the feature extractor files, overriding any cached versions that exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

token (`str` or *bool*, *optional*) : The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated when running `hf auth login` (stored in `~/.huggingface`).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

return_unused_kwargs (`bool`, *optional*, defaults to `False`) : If `False`, then this function returns just the final feature extractor object. If `True`, then this function returns a `Tuple(feature_extractor, unused_kwargs)` where *unused_kwargs* is a dictionary consisting of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of `kwargs` which has not been used to update `feature_extractor` and is otherwise ignored.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

kwargs (`dict[str, Any]`, *optional*) : The values in kwargs of any keys which are feature extractor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are *not* feature extractor attributes is controlled by the `return_unused_kwargs` keyword parameter.
#### register[[transformers.AutoProcessor.register]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/processing_auto.py#L430)

Register a new processor for this class.

**Parameters:**

config_class ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The configuration corresponding to the model to register.

processor_class ([ProcessorMixin](/docs/transformers/v4.57.1/ja/main_classes/processors#transformers.ProcessorMixin)) : The processor to register.
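Registration only wires a configuration class to a processor class; retrieval then goes through the usual `from_pretrained` path. A minimal sketch, where `NewModelConfig` and `NewModelProcessor` are made-up names for illustration (a real processor would declare its tokenizer/feature-extractor attributes):

```python
from transformers import AutoConfig, AutoProcessor, PretrainedConfig
from transformers.processing_utils import ProcessorMixin


class NewModelConfig(PretrainedConfig):
    # model_type must match the key passed to AutoConfig.register below
    model_type = "new-model"


class NewModelProcessor(ProcessorMixin):
    # Stub with no sub-components, just to keep the sketch self-contained
    attributes = []


# Register the config under its model_type key, then map it to the processor
AutoConfig.register("new-model", NewModelConfig)
AutoProcessor.register(NewModelConfig, NewModelProcessor)
```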

## Generic model classes

The following auto classes are available for instantiating a base model class without a specific head.

### AutoModel[[transformers.AutoModel]]

#### transformers.AutoModel[[transformers.AutoModel]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L1940)

This is a generic model class that will be instantiated as one of the base model classes of the library when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).
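For example, `from_config()` builds a freshly initialized model without downloading any weights, dispatching on the configuration class. A small sketch with a deliberately tiny BERT config (the sizes are arbitrary illustration values):

```python
from transformers import AutoConfig, AutoModel

# A tiny BERT config so instantiation is cheap
config = AutoConfig.for_model(
    "bert",
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)

# AutoModel dispatches on the configuration class: BertConfig -> BertModel
model = AutoModel.from_config(config)
print(type(model).__name__)  # BertModel
```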

#### from_config[[transformers.AutoModel.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

Instantiates one of the base model classes of the library from a configuration.

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:

  - [ASTConfig](/docs/transformers/v4.57.1/ja/model_doc/audio-spectrogram-transformer#transformers.ASTConfig) configuration class: [ASTModel](/docs/transformers/v4.57.1/ja/model_doc/audio-spectrogram-transformer#transformers.ASTModel) (Audio Spectrogram Transformer model)
  - `Aimv2Config` configuration class: `Aimv2Model` (AIMv2 model)
  - `Aimv2VisionConfig` configuration class: `Aimv2VisionModel` (Aimv2VisionModel model)
  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [AlbertModel](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertModel) (ALBERT model)
  - [AlignConfig](/docs/transformers/v4.57.1/ja/model_doc/align#transformers.AlignConfig) configuration class: [AlignModel](/docs/transformers/v4.57.1/ja/model_doc/align#transformers.AlignModel) (ALIGN model)
  - [AltCLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/altclip#transformers.AltCLIPConfig) configuration class: [AltCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/altclip#transformers.AltCLIPModel) (AltCLIP model)
  - `ApertusConfig` configuration class: `ApertusModel` (Apertus model)
  - `ArceeConfig` configuration class: `ArceeModel` (Arcee model)
  - `AriaConfig` configuration class: `AriaModel` (Aria model)
  - `AriaTextConfig` configuration class: `AriaTextModel` (AriaText model)
  - [AutoformerConfig](/docs/transformers/v4.57.1/ja/model_doc/autoformer#transformers.AutoformerConfig) configuration class: [AutoformerModel](/docs/transformers/v4.57.1/ja/model_doc/autoformer#transformers.AutoformerModel) (Autoformer model)
  - `AyaVisionConfig` configuration class: `AyaVisionModel` (AyaVision model)
  - `BambaConfig` configuration class: `BambaModel` (Bamba model)
  - [BarkConfig](/docs/transformers/v4.57.1/ja/model_doc/bark#transformers.BarkConfig) configuration class: [BarkModel](/docs/transformers/v4.57.1/ja/model_doc/bark#transformers.BarkModel) (Bark model)
  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [BartModel](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartModel) (BART model)
  - [BeitConfig](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitConfig) configuration class: [BeitModel](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitModel) (BEiT model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [BertModel](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertModel) (BERT model)
  - [BertGenerationConfig](/docs/transformers/v4.57.1/ja/model_doc/bert-generation#transformers.BertGenerationConfig) configuration class: [BertGenerationEncoder](/docs/transformers/v4.57.1/ja/model_doc/bert-generation#transformers.BertGenerationEncoder) (Bert Generation model)
  - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdModel](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdModel) (BigBird model)
  - [BigBirdPegasusConfig](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusConfig) configuration class: [BigBirdPegasusModel](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusModel) (BigBird-Pegasus model)
  - [BioGptConfig](/docs/transformers/v4.57.1/ja/model_doc/biogpt#transformers.BioGptConfig) configuration class: [BioGptModel](/docs/transformers/v4.57.1/ja/model_doc/biogpt#transformers.BioGptModel) (BioGpt model)
  - [BitConfig](/docs/transformers/v4.57.1/ja/model_doc/bit#transformers.BitConfig) configuration class: [BitModel](/docs/transformers/v4.57.1/ja/model_doc/bit#transformers.BitModel) (BiT model)
  - `BitNetConfig` configuration class: `BitNetModel` (BitNet model)
  - [BlenderbotConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotConfig) configuration class: [BlenderbotModel](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotModel) (Blenderbot model)
  - [BlenderbotSmallConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallConfig) configuration class: [BlenderbotSmallModel](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallModel) (BlenderbotSmall model)
  - [Blip2Config](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2Config) configuration class: [Blip2Model](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2Model) (BLIP-2 model)
  - [Blip2QFormerConfig](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2QFormerConfig) configuration class: [Blip2QFormerModel](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2QFormerModel) (BLIP-2 QFormer model)
  - [BlipConfig](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipConfig) configuration class: [BlipModel](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipModel) (BLIP model)
  - [BloomConfig](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomConfig) configuration class: [BloomModel](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomModel) (BLOOM model)
  - `BltConfig` configuration class: `BltModel` (Blt model)
  - [BridgeTowerConfig](/docs/transformers/v4.57.1/ja/model_doc/bridgetower#transformers.BridgeTowerConfig) configuration class: [BridgeTowerModel](/docs/transformers/v4.57.1/ja/model_doc/bridgetower#transformers.BridgeTowerModel) (BridgeTower model)
  - [BrosConfig](/docs/transformers/v4.57.1/ja/model_doc/bros#transformers.BrosConfig) configuration class: [BrosModel](/docs/transformers/v4.57.1/ja/model_doc/bros#transformers.BrosModel) (BROS model)
  - [CLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPConfig) configuration class: [CLIPModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPModel) (CLIP model)
  - [CLIPSegConfig](/docs/transformers/v4.57.1/ja/model_doc/clipseg#transformers.CLIPSegConfig) configuration class: [CLIPSegModel](/docs/transformers/v4.57.1/ja/model_doc/clipseg#transformers.CLIPSegModel) (CLIPSeg model)
  - [CLIPTextConfig](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTextConfig) configuration class: [CLIPTextModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTextModel) (CLIPTextModel model)
  - [CLIPVisionConfig](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPVisionConfig) configuration class: [CLIPVisionModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPVisionModel) (CLIPVisionModel model)
  - [CTRLConfig](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLConfig) configuration class: [CTRLModel](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLModel) (CTRL model)
  - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertModel](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertModel) (CamemBERT model)
  - [CanineConfig](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineConfig) configuration class: [CanineModel](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineModel) (CANINE model)
  - `ChameleonConfig` configuration class: `ChameleonModel` (Chameleon model)
  - [ChineseCLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPConfig) configuration class: [ChineseCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPModel) (Chinese-CLIP model)
  - [ChineseCLIPVisionConfig](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPVisionConfig) configuration class: [ChineseCLIPVisionModel](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPVisionModel) (ChineseCLIPVisionModel model)
  - [ClapConfig](/docs/transformers/v4.57.1/ja/model_doc/clap#transformers.ClapConfig) configuration class: [ClapModel](/docs/transformers/v4.57.1/ja/model_doc/clap#transformers.ClapModel) (CLAP model)
  - [ClvpConfig](/docs/transformers/v4.57.1/ja/model_doc/clvp#transformers.ClvpConfig) configuration class: [ClvpModelForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/clvp#transformers.ClvpModelForConditionalGeneration) (CLVP model)
  - [CodeGenConfig](/docs/transformers/v4.57.1/ja/model_doc/codegen#transformers.CodeGenConfig) configuration class: [CodeGenModel](/docs/transformers/v4.57.1/ja/model_doc/codegen#transformers.CodeGenModel) (CodeGen model)
  - `Cohere2Config` configuration class: `Cohere2Model` (Cohere2 model)
  - `Cohere2VisionConfig` configuration class: `Cohere2VisionModel` (Cohere2Vision model)
  - `CohereConfig` configuration class: `CohereModel` (Cohere model)
  - [ConditionalDetrConfig](/docs/transformers/v4.57.1/ja/model_doc/conditional_detr#transformers.ConditionalDetrConfig) configuration class: [ConditionalDetrModel](/docs/transformers/v4.57.1/ja/model_doc/conditional_detr#transformers.ConditionalDetrModel) (Conditional DETR model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertModel](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertModel) (ConvBERT model)
  - [ConvNextConfig](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextConfig) configuration class: [ConvNextModel](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextModel) (ConvNeXT model)
  - [ConvNextV2Config](/docs/transformers/v4.57.1/ja/model_doc/convnextv2#transformers.ConvNextV2Config) configuration class: [ConvNextV2Model](/docs/transformers/v4.57.1/ja/model_doc/convnextv2#transformers.ConvNextV2Model) (ConvNeXTV2 model)
  - [CpmAntConfig](/docs/transformers/v4.57.1/ja/model_doc/cpmant#transformers.CpmAntConfig) configuration class: [CpmAntModel](/docs/transformers/v4.57.1/ja/model_doc/cpmant#transformers.CpmAntModel) (CPM-Ant model)
  - `CsmConfig` configuration class: `CsmForConditionalGeneration` (CSM model)
  - [CvtConfig](/docs/transformers/v4.57.1/ja/model_doc/cvt#transformers.CvtConfig) configuration class: [CvtModel](/docs/transformers/v4.57.1/ja/model_doc/cvt#transformers.CvtModel) (CvT model)
  - `DFineConfig` configuration class: `DFineModel` (D-FINE model)
  - `DINOv3ConvNextConfig` configuration class: `DINOv3ConvNextModel` (DINOv3 ConvNext model)
  - `DINOv3ViTConfig` configuration class: `DINOv3ViTModel` (DINOv3 ViT model)
  - `DPRConfig` configuration class: `DPRQuestionEncoder` (DPR model)
  - `DPTConfig` configuration class: `DPTModel` (DPT model)
  - `DabDetrConfig` configuration class: `DabDetrModel` (DAB-DETR model)
  - `DacConfig` configuration class: `DacModel` (DAC model)
  - [Data2VecAudioConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioConfig) configuration class: [Data2VecAudioModel](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioModel) (Data2VecAudio model)
  - [Data2VecTextConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextModel](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextModel) (Data2VecText model)
  - [Data2VecVisionConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecVisionConfig) configuration class: [Data2VecVisionModel](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecVisionModel) (Data2VecVision model)
  - `DbrxConfig` configuration class: `DbrxModel` (DBRX model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaModel](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaModel) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2Model](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Model) (DeBERTa-v2 model)
  - [DecisionTransformerConfig](/docs/transformers/v4.57.1/ja/model_doc/decision_transformer#transformers.DecisionTransformerConfig) configuration class: `DecisionTransformerModel` (Decision Transformer model)
  - `DeepseekV2Config` configuration class: `DeepseekV2Model` (DeepSeek-V2 model)
  - `DeepseekV3Config` configuration class: `DeepseekV3Model` (DeepSeek-V3 model)
  - `DeepseekVLConfig` configuration class: `DeepseekVLModel` (DeepseekVL model)
  - `DeepseekVLHybridConfig` configuration class: `DeepseekVLHybridModel` (DeepseekVLHybrid model)
  - [DeformableDetrConfig](/docs/transformers/v4.57.1/ja/model_doc/deformable_detr#transformers.DeformableDetrConfig) configuration class: [DeformableDetrModel](/docs/transformers/v4.57.1/ja/model_doc/deformable_detr#transformers.DeformableDetrModel) (Deformable DETR model)
  - [DeiTConfig](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTConfig) configuration class: [DeiTModel](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTModel) (DeiT model)
  - `DepthProConfig` configuration class: `DepthProModel` (DepthPro model)
  - [DetaConfig](/docs/transformers/v4.57.1/ja/model_doc/deta#transformers.DetaConfig) configuration class: [DetaModel](/docs/transformers/v4.57.1/ja/model_doc/deta#transformers.DetaModel) (DETA model)
  - [DetrConfig](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrConfig) configuration class: [DetrModel](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrModel) (DETR model)
  - `DiaConfig` configuration class: `DiaModel` (Dia model)
  - `DiffLlamaConfig` configuration class: `DiffLlamaModel` (DiffLlama model)
  - [DinatConfig](/docs/transformers/v4.57.1/ja/model_doc/dinat#transformers.DinatConfig) configuration class: [DinatModel](/docs/transformers/v4.57.1/ja/model_doc/dinat#transformers.DinatModel) (DiNAT model)
  - `Dinov2Config` configuration class: `Dinov2Model` (DINOv2 model)
  - `Dinov2WithRegistersConfig` configuration class: `Dinov2WithRegistersModel` (DINOv2 with Registers model)
  - `DistilBertConfig` configuration class: `DistilBertModel` (DistilBERT model)
  - `DogeConfig` configuration class: `DogeModel` (Doge model)
  - `DonutSwinConfig` configuration class: `DonutSwinModel` (DonutSwin model)
  - `Dots1Config` configuration class: `Dots1Model` (dots1 model)
  - `EdgeTamConfig` configuration class: `EdgeTamModel` (EdgeTAM model)
  - `EdgeTamVideoConfig` configuration class: `EdgeTamVideoModel` (EdgeTamVideo model)
  - `EdgeTamVisionConfig` configuration class: `EdgeTamVisionModel` (EdgeTamVisionModel model)
  - `EfficientFormerConfig` configuration class: `EfficientFormerModel` (EfficientFormer model)
  - `EfficientLoFTRConfig` configuration class: `EfficientLoFTRModel` (EfficientLoFTR model)
  - `EfficientNetConfig` configuration class: `EfficientNetModel` (EfficientNet model)
  - `ElectraConfig` configuration class: `ElectraModel` (ELECTRA model)
  - `Emu3Config` configuration class: `Emu3Model` (Emu3 model)
  - `EncodecConfig` configuration class: `EncodecModel` (EnCodec model)
  - `Ernie4_5Config` configuration class: `Ernie4_5Model` (Ernie4_5 model)
  - `Ernie4_5_MoeConfig` configuration class: `Ernie4_5_MoeModel` (Ernie4_5_MoE model)
  - `ErnieConfig` configuration class: `ErnieModel` (ERNIE model)
  - `ErnieMConfig` configuration class: `ErnieMModel` (ErnieM model)
  - `EsmConfig` configuration class: `EsmModel` (ESM model)
  - `EvollaConfig` configuration class: `EvollaModel` (Evolla model)
  - `Exaone4Config` configuration class: `Exaone4Model` (EXAONE-4.0 model)
  - `FNetConfig` configuration class: `FNetModel` (FNet model)
  - `FSMTConfig` configuration class: `FSMTModel` (FairSeq Machine-Translation model)
  - `FalconConfig` configuration class: `FalconModel` (Falcon model)
  - `FalconH1Config` configuration class: `FalconH1Model` (FalconH1 model)
  - `FalconMambaConfig` configuration class: `FalconMambaModel` (FalconMamba model)
  - `FastSpeech2ConformerConfig` configuration class: `FastSpeech2ConformerModel` (FastSpeech2Conformer model)
  - `FastSpeech2ConformerWithHifiGanConfig` configuration class: `FastSpeech2ConformerWithHifiGan` (FastSpeech2ConformerWithHifiGan model)
  - `FlaubertConfig` configuration class: `FlaubertModel` (FlauBERT model)
  - `FlavaConfig` configuration class: `FlavaModel` (FLAVA model)
  - `FlexOlmoConfig` configuration class: `FlexOlmoModel` (FlexOlmo model)
  - `Florence2Config` configuration class: `Florence2Model` (Florence2 model)
  - `FocalNetConfig` configuration class: `FocalNetModel` (FocalNet model)
  - `FunnelConfig` configuration class: `FunnelModel` or `FunnelBaseModel` (Funnel Transformer model)
  - `FuyuConfig` configuration class: `FuyuModel` (Fuyu model)
  - `GLPNConfig` configuration class: `GLPNModel` (GLPN model)
  - `GPT2Config` configuration class: `GPT2Model` (OpenAI GPT-2 model)
  - `GPTBigCodeConfig` configuration class: `GPTBigCodeModel` (GPTBigCode model)
  - `GPTJConfig` configuration class: `GPTJModel` (GPT-J model)
  - `GPTNeoConfig` configuration class: `GPTNeoModel` (GPT Neo model)
  - `GPTNeoXConfig` configuration class: `GPTNeoXModel` (GPT NeoX model)
  - `GPTNeoXJapaneseConfig` configuration class: `GPTNeoXJapaneseModel` (GPT NeoX Japanese model)
  - `GPTSanJapaneseConfig` configuration class: `GPTSanJapaneseForConditionalGeneration` (GPTSAN-japanese model)
  - `Gemma2Config` configuration class: `Gemma2Model` (Gemma2 model)
  - `Gemma3Config` configuration class: `Gemma3Model` (Gemma3ForConditionalGeneration model)
  - `Gemma3TextConfig` configuration class: `Gemma3TextModel` (Gemma3ForCausalLM model)
  - `Gemma3nAudioConfig` configuration class: `Gemma3nAudioEncoder` (Gemma3nAudioEncoder model)
  - `Gemma3nConfig` configuration class: `Gemma3nModel` (Gemma3nForConditionalGeneration model)
  - `Gemma3nTextConfig` configuration class: `Gemma3nTextModel` (Gemma3nForCausalLM model)
  - `Gemma3nVisionConfig` configuration class: `TimmWrapperModel` (TimmWrapperModel model)
  - `GemmaConfig` configuration class: `GemmaModel` (Gemma model)
  - `GitConfig` configuration class: `GitModel` (GIT model)
  - `Glm4Config` configuration class: `Glm4Model` (GLM4 model)
  - `Glm4MoeConfig` configuration class: `Glm4MoeModel` (Glm4MoE model)
  - `Glm4vConfig` configuration class: `Glm4vModel` (GLM4V model)
  - `Glm4vMoeConfig` configuration class: `Glm4vMoeModel` (GLM4VMOE model)
  - `Glm4vMoeTextConfig` configuration class: `Glm4vMoeTextModel` (GLM4VMOE model)
  - `Glm4vTextConfig` configuration class: `Glm4vTextModel` (GLM4V model)
  - `GlmConfig` configuration class: `GlmModel` (GLM model)
  - `GotOcr2Config` configuration class: `GotOcr2Model` (GOT-OCR2 model)
  - `GptOssConfig` configuration class: `GptOssModel` (GptOss model)
  - `GraniteConfig` configuration class: `GraniteModel` (Granite model)
  - `GraniteMoeConfig` configuration class: `GraniteMoeModel` (GraniteMoeMoe model)
  - `GraniteMoeHybridConfig` configuration class: `GraniteMoeHybridModel` (GraniteMoeHybrid model)
  - `GraniteMoeSharedConfig` configuration class: `GraniteMoeSharedModel` (GraniteMoeSharedMoe model)
  - `GraphormerConfig` configuration class: `GraphormerModel` (Graphormer model)
  - `GroundingDinoConfig` configuration class: `GroundingDinoModel` (Grounding DINO model)
  - `GroupViTConfig` configuration class: `GroupViTModel` (GroupViT model)
  - `HGNetV2Config` configuration class: `HGNetV2Backbone` (HGNet-V2 model)
  - `HeliumConfig` configuration class: `HeliumModel` (Helium model)
  - `HieraConfig` configuration class: `HieraModel` (Hiera model)
  - `HubertConfig` configuration class: `HubertModel` (Hubert model)
  - `HunYuanDenseV1Config` configuration class: `HunYuanDenseV1Model` (HunYuanDenseV1 model)
  - `HunYuanMoEV1Config` configuration class: `HunYuanMoEV1Model` (HunYuanMoeV1 model)
  - `IBertConfig` configuration class: `IBertModel` (I-BERT model)
  - `IJepaConfig` configuration class: `IJepaModel` (I-JEPA model)
  - `Idefics2Config` configuration class: `Idefics2Model` (Idefics2 model)
  - `Idefics3Config` configuration class: `Idefics3Model` (Idefics3 model)
  - `Idefics3VisionConfig` configuration class: `Idefics3VisionTransformer` (Idefics3VisionTransformer model)
  - `IdeficsConfig` configuration class: `IdeficsModel` (IDEFICS model)
  - `ImageGPTConfig` configuration class: `ImageGPTModel` (ImageGPT model)
  - `InformerConfig` configuration class: `InformerModel` (Informer model)
  - `InstructBlipConfig` configuration class: `InstructBlipModel` (InstructBLIP model)
  - `InstructBlipVideoConfig` configuration class: `InstructBlipVideoModel` (InstructBlipVideo model)
  - `InternVLConfig` configuration class: `InternVLModel` (InternVL model)
  - `InternVLVisionConfig` configuration class: `InternVLVisionModel` (InternVLVision model)
  - `JambaConfig` configuration class: `JambaModel` (Jamba model)
  - `JanusConfig` configuration class: `JanusModel` (Janus model)
  - `JetMoeConfig` configuration class: `JetMoeModel` (JetMoe model)
  - `JukeboxConfig` configuration class: `JukeboxModel` (Jukebox model)
  - `Kosmos2Config` configuration class: `Kosmos2Model` (KOSMOS-2 model)
  - `Kosmos2_5Config` configuration class: `Kosmos2_5Model` (KOSMOS-2.5 model)
  - `KyutaiSpeechToTextConfig` configuration class: `KyutaiSpeechToTextModel` (KyutaiSpeechToText model)
  - `LEDConfig` configuration class: `LEDModel` (LED model)
  - `LayoutLMConfig` configuration class: `LayoutLMModel` (LayoutLM model)
  - `LayoutLMv2Config` configuration class: `LayoutLMv2Model` (LayoutLMv2 model)
  - `LayoutLMv3Config` configuration class: `LayoutLMv3Model` (LayoutLMv3 model)
  - `LevitConfig` configuration class: `LevitModel` (LeViT model)
  - `Lfm2Config` configuration class: `Lfm2Model` (Lfm2 model)
  - `Lfm2VlConfig` configuration class: `Lfm2VlModel` (Lfm2Vl model)
  - `LightGlueConfig` configuration class: `LightGlueForKeypointMatching` (LightGlue model)
  - `LiltConfig` configuration class: `LiltModel` (LiLT model)
  - `Llama4Config` configuration class: `Llama4ForConditionalGeneration` (Llama4 model)
  - `Llama4TextConfig` configuration class: `Llama4TextModel` (Llama4ForCausalLM model)
  - `LlamaConfig` configuration class: `LlamaModel` (LLaMA model)
  - `LlavaConfig` configuration class: `LlavaModel` (LLaVa model)
  - `LlavaNextConfig` configuration class: `LlavaNextModel` (LLaVA-NeXT model)
  - `LlavaNextVideoConfig` configuration class: `LlavaNextVideoModel` (LLaVa-NeXT-Video model)
  - `LlavaOnevisionConfig` configuration class: `LlavaOnevisionModel` (LLaVA-Onevision model)
  - `LongT5Config` configuration class: `LongT5Model` (LongT5 model)
  - `LongcatFlashConfig` configuration class: `LongcatFlashModel` (LongCatFlash model)
  - `LongformerConfig` configuration class: `LongformerModel` (Longformer model)
  - `LukeConfig` configuration class: `LukeModel` (LUKE model)
  - `LxmertConfig` configuration class: `LxmertModel` (LXMERT model)
  - `M2M100Config` configuration class: `M2M100Model` (M2M100 model)
  - `MBartConfig` configuration class: `MBartModel` (mBART model)
  - `MCTCTConfig` configuration class: `MCTCTModel` (M-CTC-T model)
  - `MLCDVisionConfig` configuration class: `MLCDVisionModel` (MLCD model)
  - `MMGroundingDinoConfig` configuration class: `MMGroundingDinoModel` (MM Grounding DINO model)
  - `MPNetConfig` configuration class: `MPNetModel` (MPNet model)
  - `MT5Config` configuration class: `MT5Model` (MT5 model)
  - `Mamba2Config` configuration class: `Mamba2Model` (mamba2 model)
  - `MambaConfig` configuration class: `MambaModel` (Mamba model)
  - `MarianConfig` configuration class: `MarianModel` (Marian model)
  - `MarkupLMConfig` configuration class: `MarkupLMModel` (MarkupLM model)
  - `Mask2FormerConfig` configuration class: `Mask2FormerModel` (Mask2Former model)
  - `MaskFormerConfig` configuration class: `MaskFormerModel` (MaskFormer model)
  - `MaskFormerSwinConfig` configuration class: `MaskFormerSwinModel` (MaskFormerSwin model)
  - `MegaConfig` configuration class: `MegaModel` (MEGA model)
  - `MegatronBertConfig` configuration class: `MegatronBertModel` (Megatron-BERT model)
  - `MetaClip2Config` configuration class: `MetaClip2Model` (MetaCLIP 2 model)
  - `MgpstrConfig` configuration class: `MgpstrForSceneTextRecognition` (MGP-STR model)
  - `MimiConfig` configuration class: `MimiModel` (Mimi model)
  - `MiniMaxConfig` configuration class: `MiniMaxModel` (MiniMax model)
  - `MinistralConfig` configuration class: `MinistralModel` (Ministral model)
  - `Mistral3Config` configuration class: `Mistral3Model` (Mistral3 model)
  - `MistralConfig` configuration class: `MistralModel` (Mistral model)
  - `MixtralConfig` configuration class: `MixtralModel` (Mixtral model)
  - `MllamaConfig` configuration class: `MllamaModel` (Mllama model)
  - `MobileBertConfig` configuration class: `MobileBertModel` (MobileBERT model)
  - `MobileNetV1Config` configuration class: `MobileNetV1Model` (MobileNetV1 model)
  - `MobileNetV2Config` configuration class: `MobileNetV2Model` (MobileNetV2 model)
  - `MobileViTConfig` configuration class: `MobileViTModel` (MobileViT model)
  - `MobileViTV2Config` configuration class: `MobileViTV2Model` (MobileViTV2 model)
  - `ModernBertConfig` configuration class: `ModernBertModel` (ModernBERT model)
  - `ModernBertDecoderConfig` configuration class: `ModernBertDecoderModel` (ModernBertDecoder model)
  - `MoonshineConfig` configuration class: `MoonshineModel` (Moonshine model)
  - `MoshiConfig` configuration class: `MoshiModel` (Moshi model)
  - `MptConfig` configuration class: `MptModel` (MPT model)
  - `MraConfig` configuration class: `MraModel` (MRA model)
  - `MusicgenConfig` configuration class: `MusicgenModel` (MusicGen model)
  - `MusicgenMelodyConfig` configuration class: `MusicgenMelodyModel` (MusicGen Melody model)
  - `MvpConfig` configuration class: `MvpModel` (MVP model)
  - `NatConfig` configuration class: `NatModel` (NAT model)
  - `NemotronConfig` configuration class: `NemotronModel` (Nemotron model)
  - `NezhaConfig` configuration class: `NezhaModel` (Nezha model)
  - `NllbMoeConfig` configuration class: `NllbMoeModel` (NLLB-MOE model)
  - `NystromformerConfig` configuration class: `NystromformerModel` (Nyströmformer model)
  - `OPTConfig` configuration class: `OPTModel` (OPT model)
  - `Olmo2Config` configuration class: `Olmo2Model` (OLMo2 model)
  - `Olmo3Config` configuration class: `Olmo3Model` (Olmo3 model)
  - `OlmoConfig` configuration class: `OlmoModel` (OLMo model)
  - `OlmoeConfig` configuration class: `OlmoeModel` (OLMoE model)
  - `OmDetTurboConfig` configuration class: `OmDetTurboForObjectDetection` (OmDet-Turbo model)
  - `OneFormerConfig` configuration class: `OneFormerModel` (OneFormer model)
  - `OpenAIGPTConfig` configuration class: `OpenAIGPTModel` (OpenAI GPT model)
  - `OpenLlamaConfig` configuration class: `OpenLlamaModel` (OpenLlama model)
  - `Ovis2Config` configuration class: `Ovis2Model` (Ovis2 model)
  - `OwlViTConfig` configuration class: `OwlViTModel` (OWL-ViT model)
  - `Owlv2Config` configuration class: `Owlv2Model` (OWLv2 model)
  - `PLBartConfig` configuration class: `PLBartModel` (PLBart model)
  - `PaliGemmaConfig` configuration class: `PaliGemmaModel` (PaliGemma model)
  - `ParakeetCTCConfig` configuration class: `ParakeetForCTC` (Parakeet model)
  - `ParakeetEncoderConfig` configuration class: `ParakeetEncoder` (ParakeetEncoder model)
  - `PatchTSMixerConfig` configuration class: `PatchTSMixerModel` (PatchTSMixer model)
  - `PatchTSTConfig` configuration class: `PatchTSTModel` (PatchTST model)
  - `PegasusConfig` configuration class: `PegasusModel` (Pegasus model)
  - `PegasusXConfig` configuration class: `PegasusXModel` (PEGASUS-X model)
  - `PerceiverConfig` configuration class: `PerceiverModel` (Perceiver model)
  - `PerceptionLMConfig` configuration class: `PerceptionLMModel` (PerceptionLM model)
  - `PersimmonConfig` configuration class: `PersimmonModel` (Persimmon model)
  - `Phi3Config` configuration class: `Phi3Model` (Phi3 model)
  - `Phi4MultimodalConfig` configuration class: `Phi4MultimodalModel` (Phi4Multimodal model)
  - `PhiConfig` configuration class: `PhiModel` (Phi model)
  - `PhimoeConfig` configuration class: `PhimoeModel` (Phimoe model)
  - `PixtralVisionConfig` configuration class: `PixtralVisionModel` (Pixtral model)
  - `PoolFormerConfig` configuration class: `PoolFormerModel` (PoolFormer model)
  - `ProphetNetConfig` configuration class: `ProphetNetModel` (ProphetNet model)
  - `PvtConfig` configuration class: `PvtModel` (PVT model)
  - `PvtV2Config` configuration class: `PvtV2Model` (PVTv2 model)
  - `QDQBertConfig` configuration class: `QDQBertModel` (QDQBert model)
  - `Qwen2AudioEncoderConfig` configuration class: `Qwen2AudioEncoder` (Qwen2AudioEncoder model)
  - `Qwen2Config` configuration class: `Qwen2Model` (Qwen2 model)
  - `Qwen2MoeConfig` configuration class: `Qwen2MoeModel` (Qwen2MoE model)
  - `Qwen2VLConfig` configuration class: `Qwen2VLModel` (Qwen2VL model)
  - `Qwen2VLTextConfig` configuration class: `Qwen2VLTextModel` (Qwen2VL model)
  - `Qwen2_5_VLConfig` configuration class: `Qwen2_5_VLModel` (Qwen2_5_VL model)
  - `Qwen2_5_VLTextConfig` configuration class: `Qwen2_5_VLTextModel` (Qwen2_5_VL model)
  - `Qwen3Config` configuration class: `Qwen3Model` (Qwen3 model)
  - `Qwen3MoeConfig` configuration class: `Qwen3MoeModel` (Qwen3MoE model)
  - `Qwen3NextConfig` configuration class: `Qwen3NextModel` (Qwen3Next model)
  - `Qwen3VLConfig` configuration class: `Qwen3VLModel` (Qwen3VL model)
  - `Qwen3VLMoeConfig` configuration class: `Qwen3VLMoeModel` (Qwen3VLMoe model)
  - `Qwen3VLMoeTextConfig` configuration class: `Qwen3VLMoeTextModel` (Qwen3VLMoe model)
  - `Qwen3VLTextConfig` configuration class: `Qwen3VLTextModel` (Qwen3VL model)
  - `RTDetrConfig` configuration class: `RTDetrModel` (RT-DETR model)
  - `RTDetrV2Config` configuration class: `RTDetrV2Model` (RT-DETRv2 model)
  - `RecurrentGemmaConfig` configuration class: `RecurrentGemmaModel` (RecurrentGemma model)
  - `ReformerConfig` configuration class: `ReformerModel` (Reformer model)
  - `RegNetConfig` configuration class: `RegNetModel` (RegNet model)
  - `RemBertConfig` configuration class: `RemBertModel` (RemBERT model)
  - `ResNetConfig` configuration class: `ResNetModel` (ResNet model)
  - `RetriBertConfig` configuration class: `RetriBertModel` (RetriBERT model)
  - `RoCBertConfig` configuration class: `RoCBertModel` (RoCBert model)
  - `RoFormerConfig` configuration class: `RoFormerModel` (RoFormer model)
  - `RobertaConfig` configuration class: `RobertaModel` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormModel` (RoBERTa-PreLayerNorm model)
  - `RwkvConfig` configuration class: `RwkvModel` (RWKV model)
  - `SEWConfig` configuration class: `SEWModel` (SEW model)
  - `SEWDConfig` configuration class: `SEWDModel` (SEW-D model)
  - `Sam2Config` configuration class: `Sam2Model` (SAM2 model)
  - `Sam2HieraDetConfig` configuration class: `Sam2HieraDetModel` (Sam2HieraDetModel model)
  - `Sam2VideoConfig` configuration class: `Sam2VideoModel` (Sam2VideoModel model)
  - `Sam2VisionConfig` configuration class: `Sam2VisionModel` (Sam2VisionModel model)
  - `SamConfig` configuration class: `SamModel` (SAM model)
  - `SamHQConfig` configuration class: `SamHQModel` (SAM-HQ model)
  - `SamHQVisionConfig` configuration class: `SamHQVisionModel` (SamHQVisionModel model)
  - `SamVisionConfig` configuration class: `SamVisionModel` (SamVisionModel model)
  - `SeamlessM4TConfig` configuration class: `SeamlessM4TModel` (SeamlessM4T model)
  - `SeamlessM4Tv2Config` configuration class: `SeamlessM4Tv2Model` (SeamlessM4Tv2 model)
  - `SeedOssConfig` configuration class: `SeedOssModel` (SeedOss model)
  - `SegGptConfig` configuration class: `SegGptModel` (SegGPT model)
  - `SegformerConfig` configuration class: `SegformerModel` (SegFormer model)
  - `Siglip2Config` configuration class: `Siglip2Model` (SigLIP2 model)
  - `Siglip2VisionConfig` configuration class: `Siglip2VisionModel` (Siglip2VisionModel model)
  - `SiglipConfig` configuration class: `SiglipModel` (SigLIP model)
  - `SiglipVisionConfig` configuration class: `SiglipVisionModel` (SiglipVisionModel model)
  - `SmolLM3Config` configuration class: `SmolLM3Model` (SmolLM3 model)
  - `SmolVLMConfig` configuration class: `SmolVLMModel` (SmolVLM model)
  - `SmolVLMVisionConfig` configuration class: `SmolVLMVisionTransformer` (SmolVLMVisionTransformer model)
  - `Speech2TextConfig` configuration class: `Speech2TextModel` (Speech2Text model)
  - `SpeechT5Config` configuration class: `SpeechT5Model` (SpeechT5 model)
  - `SplinterConfig` configuration class: `SplinterModel` (Splinter model)
  - `SqueezeBertConfig` configuration class: `SqueezeBertModel` (SqueezeBERT model)
  - `StableLmConfig` configuration class: `StableLmModel` (StableLm model)
  - `Starcoder2Config` configuration class: `Starcoder2Model` (Starcoder2 model)
  - `SwiftFormerConfig` configuration class: `SwiftFormerModel` (SwiftFormer model)
  - `Swin2SRConfig` configuration class: `Swin2SRModel` (Swin2SR model)
  - `SwinConfig` configuration class: `SwinModel` (Swin Transformer model)
  - `Swinv2Config` configuration class: `Swinv2Model` (Swin Transformer V2 model)
  - `SwitchTransformersConfig` configuration class: `SwitchTransformersModel` (SwitchTransformers model)
  - `T5Config` configuration class: `T5Model` (T5 model)
  - `T5GemmaConfig` configuration class: `T5GemmaModel` (T5Gemma model)
  - `TableTransformerConfig` configuration class: `TableTransformerModel` (Table Transformer model)
  - `TapasConfig` configuration class: `TapasModel` (TAPAS model)
  - `TextNetConfig` configuration class: `TextNetModel` (TextNet model)
  - `TimeSeriesTransformerConfig` configuration class: `TimeSeriesTransformerModel` (Time Series Transformer model)
  - `TimesFmConfig` configuration class: `TimesFmModel` (TimesFm model)
  - `TimesformerConfig` configuration class: `TimesformerModel` (TimeSformer model)
  - `TimmBackboneConfig` configuration class: `TimmBackbone` (TimmBackbone model)
  - `TimmWrapperConfig` configuration class: `TimmWrapperModel` (TimmWrapperModel model)
  - `TrajectoryTransformerConfig` configuration class: `TrajectoryTransformerModel` (Trajectory Transformer model)
  - `TransfoXLConfig` configuration class: `TransfoXLModel` (Transformer-XL model)
  - `TvltConfig` configuration class: `TvltModel` (TVLT model)
  - `TvpConfig` configuration class: `TvpModel` (TVP model)
  - `UMT5Config` configuration class: `UMT5Model` (UMT5 model)
  - `UdopConfig` configuration class: `UdopModel` (UDOP model)
  - `UniSpeechConfig` configuration class: `UniSpeechModel` (UniSpeech model)
  - `UniSpeechSatConfig` configuration class: `UniSpeechSatModel` (UniSpeechSat model)
  - `UnivNetConfig` configuration class: `UnivNetModel` (UnivNet model)
  - `VJEPA2Config` configuration class: `VJEPA2Model` (VJEPA2Model model)
  - `VanConfig` configuration class: `VanModel` (VAN model)
  - `VaultGemmaConfig` configuration class: `VaultGemmaModel` (VaultGemma model)
  - `ViTConfig` configuration class: `ViTModel` (ViT model)
  - `ViTHybridConfig` configuration class: `ViTHybridModel` (ViT Hybrid model)
  - `ViTMAEConfig` configuration class: `ViTMAEModel` (ViTMAE model)
  - `ViTMSNConfig` configuration class: `ViTMSNModel` (ViTMSN model)
  - `VideoLlavaConfig` configuration class: `VideoLlavaModel` (VideoLlava model)
  - `VideoMAEConfig` configuration class: `VideoMAEModel` (VideoMAE model)
  - `ViltConfig` configuration class: `ViltModel` (ViLT model)
  - `VipLlavaConfig` configuration class: `VipLlavaModel` (VipLlava model)
  - `VisionTextDualEncoderConfig` configuration class: `VisionTextDualEncoderModel` (VisionTextDualEncoder model)
  - `VisualBertConfig` configuration class: `VisualBertModel` (VisualBERT model)
  - `VitDetConfig` configuration class: `VitDetModel` (VitDet model)
  - `VitsConfig` configuration class: `VitsModel` (VITS model)
  - `VivitConfig` configuration class: `VivitModel` (ViViT model)
  - `VoxtralConfig` configuration class: `VoxtralForConditionalGeneration` (Voxtral model)
  - `VoxtralEncoderConfig` configuration class: `VoxtralEncoder` (Voxtral Encoder model)
  - `Wav2Vec2BertConfig` configuration class: `Wav2Vec2BertModel` (Wav2Vec2-BERT model)
  - `Wav2Vec2Config` configuration class: `Wav2Vec2Model` (Wav2Vec2 model)
  - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerModel` (Wav2Vec2-Conformer model)
  - `WavLMConfig` configuration class: `WavLMModel` (WavLM model)
  - `WhisperConfig` configuration class: `WhisperModel` (Whisper model)
  - `XCLIPConfig` configuration class: `XCLIPModel` (X-CLIP model)
  - `XGLMConfig` configuration class: `XGLMModel` (XGLM model)
  - `XLMConfig` configuration class: `XLMModel` (XLM model)
  - `XLMProphetNetConfig` configuration class: `XLMProphetNetModel` (XLM-ProphetNet model)
  - `XLMRobertaConfig` configuration class: `XLMRobertaModel` (XLM-RoBERTa model)
  - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLModel` (XLM-RoBERTa-XL model)
  - `XLNetConfig` configuration class: `XLNetModel` (XLNet model)
  - `XcodecConfig` configuration class: `XcodecModel` (X-CODEC model)
  - `XmodConfig` configuration class: `XmodModel` (X-MOD model)
  - `YolosConfig` configuration class: `YolosModel` (YOLOS model)
  - `YosoConfig` configuration class: `YosoModel` (YOSO model)
  - `Zamba2Config` configuration class: `Zamba2Model` (Zamba2 model)
  - `ZambaConfig` configuration class: `ZambaModel` (Zamba model)
  - `xLSTMConfig` configuration class: `xLSTMModel` (xLSTM model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the base model classes of the library from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModel

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModel.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [ASTConfig](/docs/transformers/v4.57.1/ja/model_doc/audio-spectrogram-transformer#transformers.ASTConfig) configuration class: [ASTModel](/docs/transformers/v4.57.1/ja/model_doc/audio-spectrogram-transformer#transformers.ASTModel) (Audio Spectrogram Transformer model) - `Aimv2Config` configuration class: `Aimv2Model` (AIMv2 model) - `Aimv2VisionConfig` configuration class: `Aimv2VisionModel` (Aimv2VisionModel model) - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [AlbertModel](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertModel) (ALBERT model) - [AlignConfig](/docs/transformers/v4.57.1/ja/model_doc/align#transformers.AlignConfig) configuration class: [AlignModel](/docs/transformers/v4.57.1/ja/model_doc/align#transformers.AlignModel) (ALIGN model) - [AltCLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/altclip#transformers.AltCLIPConfig) configuration class: [AltCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/altclip#transformers.AltCLIPModel) (AltCLIP model) - `ApertusConfig` configuration class: `ApertusModel` (Apertus model) - `ArceeConfig` configuration class: `ArceeModel` (Arcee model) - `AriaConfig` configuration class: `AriaModel` (Aria model) - `AriaTextConfig` configuration class: `AriaTextModel` (AriaText model) - [AutoformerConfig](/docs/transformers/v4.57.1/ja/model_doc/autoformer#transformers.AutoformerConfig) configuration class: [AutoformerModel](/docs/transformers/v4.57.1/ja/model_doc/autoformer#transformers.AutoformerModel) (Autoformer model) - `AyaVisionConfig` configuration class: `AyaVisionModel` (AyaVision model) - `BambaConfig` configuration class: `BambaModel` (Bamba model) - [BarkConfig](/docs/transformers/v4.57.1/ja/model_doc/bark#transformers.BarkConfig) 
configuration class: [BarkModel](/docs/transformers/v4.57.1/ja/model_doc/bark#transformers.BarkModel) (Bark model) - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [BartModel](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartModel) (BART model) - [BeitConfig](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitConfig) configuration class: [BeitModel](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitModel) (BEiT model) - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [BertModel](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertModel) (BERT model) - [BertGenerationConfig](/docs/transformers/v4.57.1/ja/model_doc/bert-generation#transformers.BertGenerationConfig) configuration class: [BertGenerationEncoder](/docs/transformers/v4.57.1/ja/model_doc/bert-generation#transformers.BertGenerationEncoder) (Bert Generation model) - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdModel](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdModel) (BigBird model) - [BigBirdPegasusConfig](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusConfig) configuration class: [BigBirdPegasusModel](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusModel) (BigBird-Pegasus model) - [BioGptConfig](/docs/transformers/v4.57.1/ja/model_doc/biogpt#transformers.BioGptConfig) configuration class: [BioGptModel](/docs/transformers/v4.57.1/ja/model_doc/biogpt#transformers.BioGptModel) (BioGpt model) - [BitConfig](/docs/transformers/v4.57.1/ja/model_doc/bit#transformers.BitConfig) configuration class: [BitModel](/docs/transformers/v4.57.1/ja/model_doc/bit#transformers.BitModel) (BiT model) - `BitNetConfig` configuration class: `BitNetModel` (BitNet model) - 
[BlenderbotConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotConfig) configuration class: [BlenderbotModel](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotModel) (Blenderbot model) - [BlenderbotSmallConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallConfig) configuration class: [BlenderbotSmallModel](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallModel) (BlenderbotSmall model) - [Blip2Config](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2Config) configuration class: [Blip2Model](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2Model) (BLIP-2 model) - [Blip2QFormerConfig](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2QFormerConfig) configuration class: [Blip2QFormerModel](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2QFormerModel) (BLIP-2 QFormer model) - [BlipConfig](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipConfig) configuration class: [BlipModel](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipModel) (BLIP model) - [BloomConfig](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomConfig) configuration class: [BloomModel](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomModel) (BLOOM model) - `BltConfig` configuration class: `BltModel` (Blt model) - [BridgeTowerConfig](/docs/transformers/v4.57.1/ja/model_doc/bridgetower#transformers.BridgeTowerConfig) configuration class: [BridgeTowerModel](/docs/transformers/v4.57.1/ja/model_doc/bridgetower#transformers.BridgeTowerModel) (BridgeTower model) - [BrosConfig](/docs/transformers/v4.57.1/ja/model_doc/bros#transformers.BrosConfig) configuration class: [BrosModel](/docs/transformers/v4.57.1/ja/model_doc/bros#transformers.BrosModel) (BROS model) - [CLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPConfig) configuration class: 
[CLIPModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPModel) (CLIP model) - [CLIPSegConfig](/docs/transformers/v4.57.1/ja/model_doc/clipseg#transformers.CLIPSegConfig) configuration class: [CLIPSegModel](/docs/transformers/v4.57.1/ja/model_doc/clipseg#transformers.CLIPSegModel) (CLIPSeg model) - [CLIPTextConfig](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTextConfig) configuration class: [CLIPTextModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTextModel) (CLIPTextModel model) - [CLIPVisionConfig](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPVisionConfig) configuration class: [CLIPVisionModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPVisionModel) (CLIPVisionModel model) - [CTRLConfig](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLConfig) configuration class: [CTRLModel](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLModel) (CTRL model) - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertModel](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertModel) (CamemBERT model) - [CanineConfig](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineConfig) configuration class: [CanineModel](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineModel) (CANINE model) - `ChameleonConfig` configuration class: `ChameleonModel` (Chameleon model) - [ChineseCLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPConfig) configuration class: [ChineseCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPModel) (Chinese-CLIP model) - [ChineseCLIPVisionConfig](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPVisionConfig) configuration class: 
[ChineseCLIPVisionModel](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPVisionModel) (ChineseCLIPVisionModel model) - [ClapConfig](/docs/transformers/v4.57.1/ja/model_doc/clap#transformers.ClapConfig) configuration class: [ClapModel](/docs/transformers/v4.57.1/ja/model_doc/clap#transformers.ClapModel) (CLAP model) - [ClvpConfig](/docs/transformers/v4.57.1/ja/model_doc/clvp#transformers.ClvpConfig) configuration class: [ClvpModelForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/clvp#transformers.ClvpModelForConditionalGeneration) (CLVP model) - [CodeGenConfig](/docs/transformers/v4.57.1/ja/model_doc/codegen#transformers.CodeGenConfig) configuration class: [CodeGenModel](/docs/transformers/v4.57.1/ja/model_doc/codegen#transformers.CodeGenModel) (CodeGen model) - `Cohere2Config` configuration class: `Cohere2Model` (Cohere2 model) - `Cohere2VisionConfig` configuration class: `Cohere2VisionModel` (Cohere2Vision model) - `CohereConfig` configuration class: `CohereModel` (Cohere model) - [ConditionalDetrConfig](/docs/transformers/v4.57.1/ja/model_doc/conditional_detr#transformers.ConditionalDetrConfig) configuration class: [ConditionalDetrModel](/docs/transformers/v4.57.1/ja/model_doc/conditional_detr#transformers.ConditionalDetrModel) (Conditional DETR model) - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertModel](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertModel) (ConvBERT model) - [ConvNextConfig](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextConfig) configuration class: [ConvNextModel](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextModel) (ConvNeXT model) - [ConvNextV2Config](/docs/transformers/v4.57.1/ja/model_doc/convnextv2#transformers.ConvNextV2Config) configuration class: [ConvNextV2Model](/docs/transformers/v4.57.1/ja/model_doc/convnextv2#transformers.ConvNextV2Model) 
(ConvNeXTV2 model) - [CpmAntConfig](/docs/transformers/v4.57.1/ja/model_doc/cpmant#transformers.CpmAntConfig) configuration class: [CpmAntModel](/docs/transformers/v4.57.1/ja/model_doc/cpmant#transformers.CpmAntModel) (CPM-Ant model) - `CsmConfig` configuration class: `CsmForConditionalGeneration` (CSM model) - [CvtConfig](/docs/transformers/v4.57.1/ja/model_doc/cvt#transformers.CvtConfig) configuration class: [CvtModel](/docs/transformers/v4.57.1/ja/model_doc/cvt#transformers.CvtModel) (CvT model) - `DFineConfig` configuration class: `DFineModel` (D-FINE model) - `DINOv3ConvNextConfig` configuration class: `DINOv3ConvNextModel` (DINOv3 ConvNext model) - `DINOv3ViTConfig` configuration class: `DINOv3ViTModel` (DINOv3 ViT model) - `DPRConfig` configuration class: `DPRQuestionEncoder` (DPR model) - `DPTConfig` configuration class: `DPTModel` (DPT model) - `DabDetrConfig` configuration class: `DabDetrModel` (DAB-DETR model) - `DacConfig` configuration class: `DacModel` (DAC model) - [Data2VecAudioConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioConfig) configuration class: [Data2VecAudioModel](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioModel) (Data2VecAudio model) - [Data2VecTextConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextModel](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextModel) (Data2VecText model) - [Data2VecVisionConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecVisionConfig) configuration class: [Data2VecVisionModel](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecVisionModel) (Data2VecVision model) - `DbrxConfig` configuration class: `DbrxModel` (DBRX model) - [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) configuration class: 
[DebertaModel](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaModel) (DeBERTa model) - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2Model](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Model) (DeBERTa-v2 model) - [DecisionTransformerConfig](/docs/transformers/v4.57.1/ja/model_doc/decision_transformer#transformers.DecisionTransformerConfig) configuration class: `DecisionTransformerModel` (Decision Transformer model) - `DeepseekV2Config` configuration class: `DeepseekV2Model` (DeepSeek-V2 model) - `DeepseekV3Config` configuration class: `DeepseekV3Model` (DeepSeek-V3 model) - `DeepseekVLConfig` configuration class: `DeepseekVLModel` (DeepseekVL model) - `DeepseekVLHybridConfig` configuration class: `DeepseekVLHybridModel` (DeepseekVLHybrid model) - [DeformableDetrConfig](/docs/transformers/v4.57.1/ja/model_doc/deformable_detr#transformers.DeformableDetrConfig) configuration class: [DeformableDetrModel](/docs/transformers/v4.57.1/ja/model_doc/deformable_detr#transformers.DeformableDetrModel) (Deformable DETR model) - [DeiTConfig](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTConfig) configuration class: [DeiTModel](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTModel) (DeiT model) - `DepthProConfig` configuration class: `DepthProModel` (DepthPro model) - [DetaConfig](/docs/transformers/v4.57.1/ja/model_doc/deta#transformers.DetaConfig) configuration class: [DetaModel](/docs/transformers/v4.57.1/ja/model_doc/deta#transformers.DetaModel) (DETA model) - [DetrConfig](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrConfig) configuration class: [DetrModel](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrModel) (DETR model) - `DiaConfig` configuration class: `DiaModel` (Dia model) - `DiffLlamaConfig` configuration class: `DiffLlamaModel` (DiffLlama model) - 
[DinatConfig](/docs/transformers/v4.57.1/ja/model_doc/dinat#transformers.DinatConfig) configuration class: [DinatModel](/docs/transformers/v4.57.1/ja/model_doc/dinat#transformers.DinatModel) (DiNAT model) - `Dinov2Config` configuration class: `Dinov2Model` (DINOv2 model) - `Dinov2WithRegistersConfig` configuration class: `Dinov2WithRegistersModel` (DINOv2 with Registers model) - `DistilBertConfig` configuration class: `DistilBertModel` (DistilBERT model) - `DogeConfig` configuration class: `DogeModel` (Doge model) - `DonutSwinConfig` configuration class: `DonutSwinModel` (DonutSwin model) - `Dots1Config` configuration class: `Dots1Model` (dots1 model) - `EdgeTamConfig` configuration class: `EdgeTamModel` (EdgeTAM model) - `EdgeTamVideoConfig` configuration class: `EdgeTamVideoModel` (EdgeTamVideo model) - `EdgeTamVisionConfig` configuration class: `EdgeTamVisionModel` (EdgeTamVisionModel model) - `EfficientFormerConfig` configuration class: `EfficientFormerModel` (EfficientFormer model) - `EfficientLoFTRConfig` configuration class: `EfficientLoFTRModel` (EfficientLoFTR model) - `EfficientNetConfig` configuration class: `EfficientNetModel` (EfficientNet model) - `ElectraConfig` configuration class: `ElectraModel` (ELECTRA model) - `Emu3Config` configuration class: `Emu3Model` (Emu3 model) - `EncodecConfig` configuration class: `EncodecModel` (EnCodec model) - `Ernie4_5Config` configuration class: `Ernie4_5Model` (Ernie4_5 model) - `Ernie4_5_MoeConfig` configuration class: `Ernie4_5_MoeModel` (Ernie4_5_MoE model) - `ErnieConfig` configuration class: `ErnieModel` (ERNIE model) - `ErnieMConfig` configuration class: `ErnieMModel` (ErnieM model) - `EsmConfig` configuration class: `EsmModel` (ESM model) - `EvollaConfig` configuration class: `EvollaModel` (Evolla model) - `Exaone4Config` configuration class: `Exaone4Model` (EXAONE-4.0 model) - `FNetConfig` configuration class: `FNetModel` (FNet model) - `FSMTConfig` configuration class: `FSMTModel` (FairSeq 
Machine-Translation model) - `FalconConfig` configuration class: `FalconModel` (Falcon model) - `FalconH1Config` configuration class: `FalconH1Model` (FalconH1 model) - `FalconMambaConfig` configuration class: `FalconMambaModel` (FalconMamba model) - `FastSpeech2ConformerConfig` configuration class: `FastSpeech2ConformerModel` (FastSpeech2Conformer model) - `FastSpeech2ConformerWithHifiGanConfig` configuration class: `FastSpeech2ConformerWithHifiGan` (FastSpeech2ConformerWithHifiGan model) - `FlaubertConfig` configuration class: `FlaubertModel` (FlauBERT model) - `FlavaConfig` configuration class: `FlavaModel` (FLAVA model) - `FlexOlmoConfig` configuration class: `FlexOlmoModel` (FlexOlmo model) - `Florence2Config` configuration class: `Florence2Model` (Florence2 model) - `FocalNetConfig` configuration class: `FocalNetModel` (FocalNet model) - `FunnelConfig` configuration class: `FunnelModel` or `FunnelBaseModel` (Funnel Transformer model) - `FuyuConfig` configuration class: `FuyuModel` (Fuyu model) - `GLPNConfig` configuration class: `GLPNModel` (GLPN model) - `GPT2Config` configuration class: `GPT2Model` (OpenAI GPT-2 model) - `GPTBigCodeConfig` configuration class: `GPTBigCodeModel` (GPTBigCode model) - `GPTJConfig` configuration class: `GPTJModel` (GPT-J model) - `GPTNeoConfig` configuration class: `GPTNeoModel` (GPT Neo model) - `GPTNeoXConfig` configuration class: `GPTNeoXModel` (GPT NeoX model) - `GPTNeoXJapaneseConfig` configuration class: `GPTNeoXJapaneseModel` (GPT NeoX Japanese model) - `GPTSanJapaneseConfig` configuration class: `GPTSanJapaneseForConditionalGeneration` (GPTSAN-japanese model) - `Gemma2Config` configuration class: `Gemma2Model` (Gemma2 model) - `Gemma3Config` configuration class: `Gemma3Model` (Gemma3ForConditionalGeneration model) - `Gemma3TextConfig` configuration class: `Gemma3TextModel` (Gemma3ForCausalLM model) - `Gemma3nAudioConfig` configuration class: `Gemma3nAudioEncoder` (Gemma3nAudioEncoder model) - `Gemma3nConfig` 
configuration class: `Gemma3nModel` (Gemma3nForConditionalGeneration model) - `Gemma3nTextConfig` configuration class: `Gemma3nTextModel` (Gemma3nForCausalLM model) - `Gemma3nVisionConfig` configuration class: `TimmWrapperModel` (TimmWrapperModel model) - `GemmaConfig` configuration class: `GemmaModel` (Gemma model) - `GitConfig` configuration class: `GitModel` (GIT model) - `Glm4Config` configuration class: `Glm4Model` (GLM4 model) - `Glm4MoeConfig` configuration class: `Glm4MoeModel` (Glm4MoE model) - `Glm4vConfig` configuration class: `Glm4vModel` (GLM4V model) - `Glm4vMoeConfig` configuration class: `Glm4vMoeModel` (GLM4VMOE model) - `Glm4vMoeTextConfig` configuration class: `Glm4vMoeTextModel` (GLM4VMOE model) - `Glm4vTextConfig` configuration class: `Glm4vTextModel` (GLM4V model) - `GlmConfig` configuration class: `GlmModel` (GLM model) - `GotOcr2Config` configuration class: `GotOcr2Model` (GOT-OCR2 model) - `GptOssConfig` configuration class: `GptOssModel` (GptOss model) - `GraniteConfig` configuration class: `GraniteModel` (Granite model) - `GraniteMoeConfig` configuration class: `GraniteMoeModel` (GraniteMoeMoe model) - `GraniteMoeHybridConfig` configuration class: `GraniteMoeHybridModel` (GraniteMoeHybrid model) - `GraniteMoeSharedConfig` configuration class: `GraniteMoeSharedModel` (GraniteMoeSharedMoe model) - `GraphormerConfig` configuration class: `GraphormerModel` (Graphormer model) - `GroundingDinoConfig` configuration class: `GroundingDinoModel` (Grounding DINO model) - `GroupViTConfig` configuration class: `GroupViTModel` (GroupViT model) - `HGNetV2Config` configuration class: `HGNetV2Backbone` (HGNet-V2 model) - `HeliumConfig` configuration class: `HeliumModel` (Helium model) - `HieraConfig` configuration class: `HieraModel` (Hiera model) - `HubertConfig` configuration class: `HubertModel` (Hubert model) - `HunYuanDenseV1Config` configuration class: `HunYuanDenseV1Model` (HunYuanDenseV1 model) - `HunYuanMoEV1Config` configuration class: 
`HunYuanMoEV1Model` (HunYuanMoeV1 model) - `IBertConfig` configuration class: `IBertModel` (I-BERT model) - `IJepaConfig` configuration class: `IJepaModel` (I-JEPA model) - `Idefics2Config` configuration class: `Idefics2Model` (Idefics2 model) - `Idefics3Config` configuration class: `Idefics3Model` (Idefics3 model) - `Idefics3VisionConfig` configuration class: `Idefics3VisionTransformer` (Idefics3VisionTransformer model) - `IdeficsConfig` configuration class: `IdeficsModel` (IDEFICS model) - `ImageGPTConfig` configuration class: `ImageGPTModel` (ImageGPT model) - `InformerConfig` configuration class: `InformerModel` (Informer model) - `InstructBlipConfig` configuration class: `InstructBlipModel` (InstructBLIP model) - `InstructBlipVideoConfig` configuration class: `InstructBlipVideoModel` (InstructBlipVideo model) - `InternVLConfig` configuration class: `InternVLModel` (InternVL model) - `InternVLVisionConfig` configuration class: `InternVLVisionModel` (InternVLVision model) - `JambaConfig` configuration class: `JambaModel` (Jamba model) - `JanusConfig` configuration class: `JanusModel` (Janus model) - `JetMoeConfig` configuration class: `JetMoeModel` (JetMoe model) - `JukeboxConfig` configuration class: `JukeboxModel` (Jukebox model) - `Kosmos2Config` configuration class: `Kosmos2Model` (KOSMOS-2 model) - `Kosmos2_5Config` configuration class: `Kosmos2_5Model` (KOSMOS-2.5 model) - `KyutaiSpeechToTextConfig` configuration class: `KyutaiSpeechToTextModel` (KyutaiSpeechToText model) - `LEDConfig` configuration class: `LEDModel` (LED model) - `LayoutLMConfig` configuration class: `LayoutLMModel` (LayoutLM model) - `LayoutLMv2Config` configuration class: `LayoutLMv2Model` (LayoutLMv2 model) - `LayoutLMv3Config` configuration class: `LayoutLMv3Model` (LayoutLMv3 model) - `LevitConfig` configuration class: `LevitModel` (LeViT model) - `Lfm2Config` configuration class: `Lfm2Model` (Lfm2 model) - `Lfm2VlConfig` configuration class: `Lfm2VlModel` (Lfm2Vl model) - 
- `LightGlueConfig` configuration class: `LightGlueForKeypointMatching` (LightGlue model)
- `LiltConfig` configuration class: `LiltModel` (LiLT model)
- `Llama4Config` configuration class: `Llama4ForConditionalGeneration` (Llama4 model)
- `Llama4TextConfig` configuration class: `Llama4TextModel` (Llama4ForCausalLM model)
- `LlamaConfig` configuration class: `LlamaModel` (LLaMA model)
- `LlavaConfig` configuration class: `LlavaModel` (LLaVa model)
- `LlavaNextConfig` configuration class: `LlavaNextModel` (LLaVA-NeXT model)
- `LlavaNextVideoConfig` configuration class: `LlavaNextVideoModel` (LLaVa-NeXT-Video model)
- `LlavaOnevisionConfig` configuration class: `LlavaOnevisionModel` (LLaVA-Onevision model)
- `LongT5Config` configuration class: `LongT5Model` (LongT5 model)
- `LongcatFlashConfig` configuration class: `LongcatFlashModel` (LongCatFlash model)
- `LongformerConfig` configuration class: `LongformerModel` (Longformer model)
- `LukeConfig` configuration class: `LukeModel` (LUKE model)
- `LxmertConfig` configuration class: `LxmertModel` (LXMERT model)
- `M2M100Config` configuration class: `M2M100Model` (M2M100 model)
- `MBartConfig` configuration class: `MBartModel` (mBART model)
- `MCTCTConfig` configuration class: `MCTCTModel` (M-CTC-T model)
- `MLCDVisionConfig` configuration class: `MLCDVisionModel` (MLCD model)
- `MMGroundingDinoConfig` configuration class: `MMGroundingDinoModel` (MM Grounding DINO model)
- `MPNetConfig` configuration class: `MPNetModel` (MPNet model)
- `MT5Config` configuration class: `MT5Model` (MT5 model)
- `Mamba2Config` configuration class: `Mamba2Model` (mamba2 model)
- `MambaConfig` configuration class: `MambaModel` (Mamba model)
- `MarianConfig` configuration class: `MarianModel` (Marian model)
- `MarkupLMConfig` configuration class: `MarkupLMModel` (MarkupLM model)
- `Mask2FormerConfig` configuration class: `Mask2FormerModel` (Mask2Former model)
- `MaskFormerConfig` configuration class: `MaskFormerModel` (MaskFormer model)
- `MaskFormerSwinConfig` configuration class: `MaskFormerSwinModel` (MaskFormerSwin model)
- `MegaConfig` configuration class: `MegaModel` (MEGA model)
- `MegatronBertConfig` configuration class: `MegatronBertModel` (Megatron-BERT model)
- `MetaClip2Config` configuration class: `MetaClip2Model` (MetaCLIP 2 model)
- `MgpstrConfig` configuration class: `MgpstrForSceneTextRecognition` (MGP-STR model)
- `MimiConfig` configuration class: `MimiModel` (Mimi model)
- `MiniMaxConfig` configuration class: `MiniMaxModel` (MiniMax model)
- `MinistralConfig` configuration class: `MinistralModel` (Ministral model)
- `Mistral3Config` configuration class: `Mistral3Model` (Mistral3 model)
- `MistralConfig` configuration class: `MistralModel` (Mistral model)
- `MixtralConfig` configuration class: `MixtralModel` (Mixtral model)
- `MllamaConfig` configuration class: `MllamaModel` (Mllama model)
- `MobileBertConfig` configuration class: `MobileBertModel` (MobileBERT model)
- `MobileNetV1Config` configuration class: `MobileNetV1Model` (MobileNetV1 model)
- `MobileNetV2Config` configuration class: `MobileNetV2Model` (MobileNetV2 model)
- `MobileViTConfig` configuration class: `MobileViTModel` (MobileViT model)
- `MobileViTV2Config` configuration class: `MobileViTV2Model` (MobileViTV2 model)
- `ModernBertConfig` configuration class: `ModernBertModel` (ModernBERT model)
- `ModernBertDecoderConfig` configuration class: `ModernBertDecoderModel` (ModernBertDecoder model)
- `MoonshineConfig` configuration class: `MoonshineModel` (Moonshine model)
- `MoshiConfig` configuration class: `MoshiModel` (Moshi model)
- `MptConfig` configuration class: `MptModel` (MPT model)
- `MraConfig` configuration class: `MraModel` (MRA model)
- `MusicgenConfig` configuration class: `MusicgenModel` (MusicGen model)
- `MusicgenMelodyConfig` configuration class: `MusicgenMelodyModel` (MusicGen Melody model)
- `MvpConfig` configuration class: `MvpModel` (MVP model)
- `NatConfig` configuration class: `NatModel` (NAT model)
- `NemotronConfig` configuration class: `NemotronModel` (Nemotron model)
- `NezhaConfig` configuration class: `NezhaModel` (Nezha model)
- `NllbMoeConfig` configuration class: `NllbMoeModel` (NLLB-MOE model)
- `NystromformerConfig` configuration class: `NystromformerModel` (Nyströmformer model)
- `OPTConfig` configuration class: `OPTModel` (OPT model)
- `Olmo2Config` configuration class: `Olmo2Model` (OLMo2 model)
- `Olmo3Config` configuration class: `Olmo3Model` (Olmo3 model)
- `OlmoConfig` configuration class: `OlmoModel` (OLMo model)
- `OlmoeConfig` configuration class: `OlmoeModel` (OLMoE model)
- `OmDetTurboConfig` configuration class: `OmDetTurboForObjectDetection` (OmDet-Turbo model)
- `OneFormerConfig` configuration class: `OneFormerModel` (OneFormer model)
- `OpenAIGPTConfig` configuration class: `OpenAIGPTModel` (OpenAI GPT model)
- `OpenLlamaConfig` configuration class: `OpenLlamaModel` (OpenLlama model)
- `Ovis2Config` configuration class: `Ovis2Model` (Ovis2 model)
- `OwlViTConfig` configuration class: `OwlViTModel` (OWL-ViT model)
- `Owlv2Config` configuration class: `Owlv2Model` (OWLv2 model)
- `PLBartConfig` configuration class: `PLBartModel` (PLBart model)
- `PaliGemmaConfig` configuration class: `PaliGemmaModel` (PaliGemma model)
- `ParakeetCTCConfig` configuration class: `ParakeetForCTC` (Parakeet model)
- `ParakeetEncoderConfig` configuration class: `ParakeetEncoder` (ParakeetEncoder model)
- `PatchTSMixerConfig` configuration class: `PatchTSMixerModel` (PatchTSMixer model)
- `PatchTSTConfig` configuration class: `PatchTSTModel` (PatchTST model)
- `PegasusConfig` configuration class: `PegasusModel` (Pegasus model)
- `PegasusXConfig` configuration class: `PegasusXModel` (PEGASUS-X model)
- `PerceiverConfig` configuration class: `PerceiverModel` (Perceiver model)
- `PerceptionLMConfig` configuration class: `PerceptionLMModel` (PerceptionLM model)
- `PersimmonConfig` configuration class: `PersimmonModel` (Persimmon model)
- `Phi3Config` configuration class: `Phi3Model` (Phi3 model)
- `Phi4MultimodalConfig` configuration class: `Phi4MultimodalModel` (Phi4Multimodal model)
- `PhiConfig` configuration class: `PhiModel` (Phi model)
- `PhimoeConfig` configuration class: `PhimoeModel` (Phimoe model)
- `PixtralVisionConfig` configuration class: `PixtralVisionModel` (Pixtral model)
- `PoolFormerConfig` configuration class: `PoolFormerModel` (PoolFormer model)
- `ProphetNetConfig` configuration class: `ProphetNetModel` (ProphetNet model)
- `PvtConfig` configuration class: `PvtModel` (PVT model)
- `PvtV2Config` configuration class: `PvtV2Model` (PVTv2 model)
- `QDQBertConfig` configuration class: `QDQBertModel` (QDQBert model)
- `Qwen2AudioEncoderConfig` configuration class: `Qwen2AudioEncoder` (Qwen2AudioEncoder model)
- `Qwen2Config` configuration class: `Qwen2Model` (Qwen2 model)
- `Qwen2MoeConfig` configuration class: `Qwen2MoeModel` (Qwen2MoE model)
- `Qwen2VLConfig` configuration class: `Qwen2VLModel` (Qwen2VL model)
- `Qwen2VLTextConfig` configuration class: `Qwen2VLTextModel` (Qwen2VL model)
- `Qwen2_5_VLConfig` configuration class: `Qwen2_5_VLModel` (Qwen2_5_VL model)
- `Qwen2_5_VLTextConfig` configuration class: `Qwen2_5_VLTextModel` (Qwen2_5_VL model)
- `Qwen3Config` configuration class: `Qwen3Model` (Qwen3 model)
- `Qwen3MoeConfig` configuration class: `Qwen3MoeModel` (Qwen3MoE model)
- `Qwen3NextConfig` configuration class: `Qwen3NextModel` (Qwen3Next model)
- `Qwen3VLConfig` configuration class: `Qwen3VLModel` (Qwen3VL model)
- `Qwen3VLMoeConfig` configuration class: `Qwen3VLMoeModel` (Qwen3VLMoe model)
- `Qwen3VLMoeTextConfig` configuration class: `Qwen3VLMoeTextModel` (Qwen3VLMoe model)
- `Qwen3VLTextConfig` configuration class: `Qwen3VLTextModel` (Qwen3VL model)
- `RTDetrConfig` configuration class: `RTDetrModel` (RT-DETR model)
- `RTDetrV2Config` configuration class: `RTDetrV2Model` (RT-DETRv2 model)
- `RecurrentGemmaConfig` configuration class: `RecurrentGemmaModel` (RecurrentGemma model)
- `ReformerConfig` configuration class: `ReformerModel` (Reformer model)
- `RegNetConfig` configuration class: `RegNetModel` (RegNet model)
- `RemBertConfig` configuration class: `RemBertModel` (RemBERT model)
- `ResNetConfig` configuration class: `ResNetModel` (ResNet model)
- `RetriBertConfig` configuration class: `RetriBertModel` (RetriBERT model)
- `RoCBertConfig` configuration class: `RoCBertModel` (RoCBert model)
- `RoFormerConfig` configuration class: `RoFormerModel` (RoFormer model)
- `RobertaConfig` configuration class: `RobertaModel` (RoBERTa model)
- `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormModel` (RoBERTa-PreLayerNorm model)
- `RwkvConfig` configuration class: `RwkvModel` (RWKV model)
- `SEWConfig` configuration class: `SEWModel` (SEW model)
- `SEWDConfig` configuration class: `SEWDModel` (SEW-D model)
- `Sam2Config` configuration class: `Sam2Model` (SAM2 model)
- `Sam2HieraDetConfig` configuration class: `Sam2HieraDetModel` (Sam2HieraDetModel model)
- `Sam2VideoConfig` configuration class: `Sam2VideoModel` (Sam2VideoModel model)
- `Sam2VisionConfig` configuration class: `Sam2VisionModel` (Sam2VisionModel model)
- `SamConfig` configuration class: `SamModel` (SAM model)
- `SamHQConfig` configuration class: `SamHQModel` (SAM-HQ model)
- `SamHQVisionConfig` configuration class: `SamHQVisionModel` (SamHQVisionModel model)
- `SamVisionConfig` configuration class: `SamVisionModel` (SamVisionModel model)
- `SeamlessM4TConfig` configuration class: `SeamlessM4TModel` (SeamlessM4T model)
- `SeamlessM4Tv2Config` configuration class: `SeamlessM4Tv2Model` (SeamlessM4Tv2 model)
- `SeedOssConfig` configuration class: `SeedOssModel` (SeedOss model)
- `SegGptConfig` configuration class: `SegGptModel` (SegGPT model)
- `SegformerConfig` configuration class: `SegformerModel` (SegFormer model)
- `Siglip2Config` configuration class: `Siglip2Model` (SigLIP2 model)
- `Siglip2VisionConfig` configuration class: `Siglip2VisionModel` (Siglip2VisionModel model)
- `SiglipConfig` configuration class: `SiglipModel` (SigLIP model)
- `SiglipVisionConfig` configuration class: `SiglipVisionModel` (SiglipVisionModel model)
- `SmolLM3Config` configuration class: `SmolLM3Model` (SmolLM3 model)
- `SmolVLMConfig` configuration class: `SmolVLMModel` (SmolVLM model)
- `SmolVLMVisionConfig` configuration class: `SmolVLMVisionTransformer` (SmolVLMVisionTransformer model)
- `Speech2TextConfig` configuration class: `Speech2TextModel` (Speech2Text model)
- `SpeechT5Config` configuration class: `SpeechT5Model` (SpeechT5 model)
- `SplinterConfig` configuration class: `SplinterModel` (Splinter model)
- `SqueezeBertConfig` configuration class: `SqueezeBertModel` (SqueezeBERT model)
- `StableLmConfig` configuration class: `StableLmModel` (StableLm model)
- `Starcoder2Config` configuration class: `Starcoder2Model` (Starcoder2 model)
- `SwiftFormerConfig` configuration class: `SwiftFormerModel` (SwiftFormer model)
- `Swin2SRConfig` configuration class: `Swin2SRModel` (Swin2SR model)
- `SwinConfig` configuration class: `SwinModel` (Swin Transformer model)
- `Swinv2Config` configuration class: `Swinv2Model` (Swin Transformer V2 model)
- `SwitchTransformersConfig` configuration class: `SwitchTransformersModel` (SwitchTransformers model)
- `T5Config` configuration class: `T5Model` (T5 model)
- `T5GemmaConfig` configuration class: `T5GemmaModel` (T5Gemma model)
- `TableTransformerConfig` configuration class: `TableTransformerModel` (Table Transformer model)
- `TapasConfig` configuration class: `TapasModel` (TAPAS model)
- `TextNetConfig` configuration class: `TextNetModel` (TextNet model)
- `TimeSeriesTransformerConfig` configuration class: `TimeSeriesTransformerModel` (Time Series Transformer model)
- `TimesFmConfig` configuration class: `TimesFmModel` (TimesFm model)
- `TimesformerConfig` configuration class: `TimesformerModel` (TimeSformer model)
- `TimmBackboneConfig` configuration class: `TimmBackbone` (TimmBackbone model)
- `TimmWrapperConfig` configuration class: `TimmWrapperModel` (TimmWrapperModel model)
- `TrajectoryTransformerConfig` configuration class: `TrajectoryTransformerModel` (Trajectory Transformer model)
- `TransfoXLConfig` configuration class: `TransfoXLModel` (Transformer-XL model)
- `TvltConfig` configuration class: `TvltModel` (TVLT model)
- `TvpConfig` configuration class: `TvpModel` (TVP model)
- `UMT5Config` configuration class: `UMT5Model` (UMT5 model)
- `UdopConfig` configuration class: `UdopModel` (UDOP model)
- `UniSpeechConfig` configuration class: `UniSpeechModel` (UniSpeech model)
- `UniSpeechSatConfig` configuration class: `UniSpeechSatModel` (UniSpeechSat model)
- `UnivNetConfig` configuration class: `UnivNetModel` (UnivNet model)
- `VJEPA2Config` configuration class: `VJEPA2Model` (VJEPA2Model model)
- `VanConfig` configuration class: `VanModel` (VAN model)
- `VaultGemmaConfig` configuration class: `VaultGemmaModel` (VaultGemma model)
- `ViTConfig` configuration class: `ViTModel` (ViT model)
- `ViTHybridConfig` configuration class: `ViTHybridModel` (ViT Hybrid model)
- `ViTMAEConfig` configuration class: `ViTMAEModel` (ViTMAE model)
- `ViTMSNConfig` configuration class: `ViTMSNModel` (ViTMSN model)
- `VideoLlavaConfig` configuration class: `VideoLlavaModel` (VideoLlava model)
- `VideoMAEConfig` configuration class: `VideoMAEModel` (VideoMAE model)
- `ViltConfig` configuration class: `ViltModel` (ViLT model)
- `VipLlavaConfig` configuration class: `VipLlavaModel` (VipLlava model)
- `VisionTextDualEncoderConfig` configuration class: `VisionTextDualEncoderModel` (VisionTextDualEncoder model)
- `VisualBertConfig` configuration class: `VisualBertModel` (VisualBERT model)
- `VitDetConfig` configuration class: `VitDetModel` (VitDet model)
- `VitsConfig` configuration class: `VitsModel` (VITS model)
- `VivitConfig` configuration class: `VivitModel` (ViViT model)
- `VoxtralConfig` configuration class: `VoxtralForConditionalGeneration` (Voxtral model)
- `VoxtralEncoderConfig` configuration class: `VoxtralEncoder` (Voxtral Encoder model)
- `Wav2Vec2BertConfig` configuration class: `Wav2Vec2BertModel` (Wav2Vec2-BERT model)
- `Wav2Vec2Config` configuration class: `Wav2Vec2Model` (Wav2Vec2 model)
- `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerModel` (Wav2Vec2-Conformer model)
- `WavLMConfig` configuration class: `WavLMModel` (WavLM model)
- `WhisperConfig` configuration class: `WhisperModel` (Whisper model)
- `XCLIPConfig` configuration class: `XCLIPModel` (X-CLIP model)
- `XGLMConfig` configuration class: `XGLMModel` (XGLM model)
- `XLMConfig` configuration class: `XLMModel` (XLM model)
- `XLMProphetNetConfig` configuration class: `XLMProphetNetModel` (XLM-ProphetNet model)
- `XLMRobertaConfig` configuration class: `XLMRobertaModel` (XLM-RoBERTa model)
- `XLMRobertaXLConfig` configuration class: `XLMRobertaXLModel` (XLM-RoBERTa-XL model)
- `XLNetConfig` configuration class: `XLNetModel` (XLNet model)
- `XcodecConfig` configuration class: `XcodecModel` (X-CODEC model)
- `XmodConfig` configuration class: `XmodModel` (X-MOD model)
- `YolosConfig` configuration class: `YolosModel` (YOLOS model)
- `YosoConfig` configuration class: `YosoModel` (YOSO model)
- `Zamba2Config` configuration class: `Zamba2Model` (Zamba2 model)
- `ZambaConfig` configuration class: `ZambaModel` (Zamba model)
- `xLSTMConfig` configuration class: `xLSTMModel` (xLSTM model)

attn_implementation (`str`, *optional*): The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModel.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the base model classes of the library from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **aimv2** -- `Aimv2Model` (AIMv2 model)
- **aimv2_vision_model** -- `Aimv2VisionModel` (Aimv2VisionModel model)
- **albert** -- [AlbertModel](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertModel) (ALBERT model)
- **align** -- [AlignModel](/docs/transformers/v4.57.1/ja/model_doc/align#transformers.AlignModel) (ALIGN model)
- **altclip** -- [AltCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/altclip#transformers.AltCLIPModel) (AltCLIP model)
- **apertus** -- `ApertusModel` (Apertus model)
- **arcee** -- `ArceeModel` (Arcee model)
- **aria** -- `AriaModel` (Aria model)
- **aria_text** -- `AriaTextModel` (AriaText model)
- **audio-spectrogram-transformer** -- [ASTModel](/docs/transformers/v4.57.1/ja/model_doc/audio-spectrogram-transformer#transformers.ASTModel) (Audio Spectrogram Transformer model)
- **autoformer** -- [AutoformerModel](/docs/transformers/v4.57.1/ja/model_doc/autoformer#transformers.AutoformerModel) (Autoformer model)
- **aya_vision** -- `AyaVisionModel` (AyaVision model)
- **bamba** -- `BambaModel` (Bamba model)
- **bark** -- [BarkModel](/docs/transformers/v4.57.1/ja/model_doc/bark#transformers.BarkModel) (Bark model)
- **bart** -- [BartModel](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartModel) (BART model)
- **beit** -- [BeitModel](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitModel) (BEiT model)
- **bert** -- [BertModel](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertModel) (BERT model)
- **bert-generation** -- [BertGenerationEncoder](/docs/transformers/v4.57.1/ja/model_doc/bert-generation#transformers.BertGenerationEncoder) (Bert Generation model)
- **big_bird** -- [BigBirdModel](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdModel) (BigBird model)
- **bigbird_pegasus** -- [BigBirdPegasusModel](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusModel) (BigBird-Pegasus model)
- **biogpt** -- [BioGptModel](/docs/transformers/v4.57.1/ja/model_doc/biogpt#transformers.BioGptModel) (BioGpt model)
- **bit** -- [BitModel](/docs/transformers/v4.57.1/ja/model_doc/bit#transformers.BitModel) (BiT model)
- **bitnet** -- `BitNetModel` (BitNet model)
- **blenderbot** -- [BlenderbotModel](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotModel) (Blenderbot model)
- **blenderbot-small** -- [BlenderbotSmallModel](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallModel) (BlenderbotSmall model)
- **blip** -- [BlipModel](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipModel) (BLIP model)
- **blip-2** -- [Blip2Model](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2Model) (BLIP-2 model)
- **blip_2_qformer** -- [Blip2QFormerModel](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2QFormerModel) (BLIP-2 QFormer model)
- **bloom** -- [BloomModel](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomModel) (BLOOM model)
- **blt** -- `BltModel` (Blt model)
- **bridgetower** -- [BridgeTowerModel](/docs/transformers/v4.57.1/ja/model_doc/bridgetower#transformers.BridgeTowerModel) (BridgeTower model)
- **bros** -- [BrosModel](/docs/transformers/v4.57.1/ja/model_doc/bros#transformers.BrosModel) (BROS model)
- **camembert** -- [CamembertModel](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertModel) (CamemBERT model)
- **canine** -- [CanineModel](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineModel) (CANINE model)
- **chameleon** -- `ChameleonModel` (Chameleon model)
- **chinese_clip** -- [ChineseCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPModel) (Chinese-CLIP model)
- **chinese_clip_vision_model** -- [ChineseCLIPVisionModel](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPVisionModel) (ChineseCLIPVisionModel model)
- **clap** -- [ClapModel](/docs/transformers/v4.57.1/ja/model_doc/clap#transformers.ClapModel) (CLAP model)
- **clip** -- [CLIPModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPModel) (CLIP model)
- **clip_text_model** -- [CLIPTextModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPTextModel) (CLIPTextModel model)
- **clip_vision_model** -- [CLIPVisionModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPVisionModel) (CLIPVisionModel model)
- **clipseg** -- [CLIPSegModel](/docs/transformers/v4.57.1/ja/model_doc/clipseg#transformers.CLIPSegModel) (CLIPSeg model)
- **clvp** -- [ClvpModelForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/clvp#transformers.ClvpModelForConditionalGeneration) (CLVP model)
- **code_llama** -- `LlamaModel` (CodeLlama model)
- **codegen** -- [CodeGenModel](/docs/transformers/v4.57.1/ja/model_doc/codegen#transformers.CodeGenModel) (CodeGen model)
- **cohere** -- `CohereModel` (Cohere model)
- **cohere2** -- `Cohere2Model` (Cohere2 model)
- **cohere2_vision** -- `Cohere2VisionModel` (Cohere2Vision model)
- **conditional_detr** -- [ConditionalDetrModel](/docs/transformers/v4.57.1/ja/model_doc/conditional_detr#transformers.ConditionalDetrModel) (Conditional DETR model)
- **convbert** -- [ConvBertModel](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertModel) (ConvBERT model)
- **convnext** -- [ConvNextModel](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextModel) (ConvNeXT model)
- **convnextv2** -- [ConvNextV2Model](/docs/transformers/v4.57.1/ja/model_doc/convnextv2#transformers.ConvNextV2Model) (ConvNeXTV2 model)
- **cpmant** -- [CpmAntModel](/docs/transformers/v4.57.1/ja/model_doc/cpmant#transformers.CpmAntModel) (CPM-Ant model)
- **csm** -- `CsmForConditionalGeneration` (CSM model)
- **ctrl** -- [CTRLModel](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLModel) (CTRL model)
- **cvt** -- [CvtModel](/docs/transformers/v4.57.1/ja/model_doc/cvt#transformers.CvtModel) (CvT model)
- **d_fine** -- `DFineModel` (D-FINE model)
- **dab-detr** -- `DabDetrModel` (DAB-DETR model)
- **dac** -- `DacModel` (DAC model)
- **data2vec-audio** -- [Data2VecAudioModel](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioModel) (Data2VecAudio model)
- **data2vec-text** -- [Data2VecTextModel](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextModel) (Data2VecText model)
- **data2vec-vision** -- [Data2VecVisionModel](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecVisionModel) (Data2VecVision model)
- **dbrx** -- `DbrxModel` (DBRX model)
- **deberta** -- [DebertaModel](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaModel) (DeBERTa model)
- **deberta-v2** -- [DebertaV2Model](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Model) (DeBERTa-v2 model)
- **decision_transformer** -- `DecisionTransformerModel` (Decision Transformer model)
- **deepseek_v2** -- `DeepseekV2Model` (DeepSeek-V2 model)
- **deepseek_v3** -- `DeepseekV3Model` (DeepSeek-V3 model)
- **deepseek_vl** -- `DeepseekVLModel` (DeepseekVL model)
- **deepseek_vl_hybrid** -- `DeepseekVLHybridModel` (DeepseekVLHybrid model)
- **deformable_detr** -- [DeformableDetrModel](/docs/transformers/v4.57.1/ja/model_doc/deformable_detr#transformers.DeformableDetrModel) (Deformable DETR model)
- **deit** -- [DeiTModel](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTModel) (DeiT model)
- **depth_pro** -- `DepthProModel` (DepthPro model)
- **deta** -- [DetaModel](/docs/transformers/v4.57.1/ja/model_doc/deta#transformers.DetaModel) (DETA model)
- **detr** -- [DetrModel](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrModel) (DETR model)
- **dia** -- `DiaModel` (Dia model)
- **diffllama** -- `DiffLlamaModel` (DiffLlama model)
- **dinat** -- [DinatModel](/docs/transformers/v4.57.1/ja/model_doc/dinat#transformers.DinatModel) (DiNAT model)
- **dinov2** -- `Dinov2Model` (DINOv2 model)
- **dinov2_with_registers** -- `Dinov2WithRegistersModel` (DINOv2 with Registers model)
- **dinov3_convnext** -- `DINOv3ConvNextModel` (DINOv3 ConvNext model)
- **dinov3_vit** -- `DINOv3ViTModel` (DINOv3 ViT model)
- **distilbert** -- `DistilBertModel` (DistilBERT model)
- **doge** -- `DogeModel` (Doge model)
- **donut-swin** -- `DonutSwinModel` (DonutSwin model)
- **dots1** -- `Dots1Model` (dots1 model)
- **dpr** -- `DPRQuestionEncoder` (DPR model)
- **dpt** -- `DPTModel` (DPT model)
- **edgetam** -- `EdgeTamModel` (EdgeTAM model)
- **edgetam_video** -- `EdgeTamVideoModel` (EdgeTamVideo model)
- **edgetam_vision_model** -- `EdgeTamVisionModel` (EdgeTamVisionModel model)
- **efficientformer** -- `EfficientFormerModel` (EfficientFormer model)
- **efficientloftr** -- `EfficientLoFTRModel` (EfficientLoFTR model)
- **efficientnet** -- `EfficientNetModel` (EfficientNet model)
- **electra** -- `ElectraModel` (ELECTRA model)
- **emu3** -- `Emu3Model` (Emu3 model)
- **encodec** -- `EncodecModel` (EnCodec model)
- **ernie** -- `ErnieModel` (ERNIE model)
- **ernie4_5** -- `Ernie4_5Model` (Ernie4_5 model)
- **ernie4_5_moe** -- `Ernie4_5_MoeModel` (Ernie4_5_MoE model)
- **ernie_m** -- `ErnieMModel` (ErnieM model)
- **esm** -- `EsmModel` (ESM model)
- **evolla** -- `EvollaModel` (Evolla model)
- **exaone4** -- `Exaone4Model` (EXAONE-4.0 model)
- **falcon** -- `FalconModel` (Falcon model)
- **falcon_h1** -- `FalconH1Model` (FalconH1 model)
- **falcon_mamba** -- `FalconMambaModel` (FalconMamba model)
- **fastspeech2_conformer** -- `FastSpeech2ConformerModel` (FastSpeech2Conformer model)
- **fastspeech2_conformer_with_hifigan** -- `FastSpeech2ConformerWithHifiGan` (FastSpeech2ConformerWithHifiGan model)
- **flaubert** -- `FlaubertModel` (FlauBERT model)
- **flava** -- `FlavaModel` (FLAVA model)
- **flex_olmo** -- `FlexOlmoModel` (FlexOlmo model)
- **florence2** -- `Florence2Model` (Florence2 model)
- **fnet** -- `FNetModel` (FNet model)
- **focalnet** -- `FocalNetModel` (FocalNet model)
- **fsmt** -- `FSMTModel` (FairSeq Machine-Translation model)
- **funnel** -- `FunnelModel` or `FunnelBaseModel` (Funnel Transformer model)
- **fuyu** -- `FuyuModel` (Fuyu model)
- **gemma** -- `GemmaModel` (Gemma model)
- **gemma2** -- `Gemma2Model` (Gemma2 model)
- **gemma3** -- `Gemma3Model` (Gemma3ForConditionalGeneration model)
- **gemma3_text** -- `Gemma3TextModel` (Gemma3ForCausalLM model)
- **gemma3n** -- `Gemma3nModel` (Gemma3nForConditionalGeneration model)
- **gemma3n_audio** -- `Gemma3nAudioEncoder` (Gemma3nAudioEncoder model)
- **gemma3n_text** -- `Gemma3nTextModel` (Gemma3nForCausalLM model)
- **gemma3n_vision** -- `TimmWrapperModel` (TimmWrapperModel model)
- **git** -- `GitModel` (GIT model)
- **glm** -- `GlmModel` (GLM model)
- **glm4** -- `Glm4Model` (GLM4 model)
- **glm4_moe** -- `Glm4MoeModel` (Glm4MoE model)
- **glm4v** -- `Glm4vModel` (GLM4V model)
- **glm4v_moe** -- `Glm4vMoeModel` (GLM4VMOE model)
- **glm4v_moe_text** -- `Glm4vMoeTextModel` (GLM4VMOE model)
- **glm4v_text** -- `Glm4vTextModel` (GLM4V model)
- **glpn** -- `GLPNModel` (GLPN model)
- **got_ocr2** -- `GotOcr2Model` (GOT-OCR2 model)
- **gpt-sw3** -- `GPT2Model` (GPT-Sw3 model)
- **gpt2** -- `GPT2Model` (OpenAI GPT-2 model)
- **gpt_bigcode** -- `GPTBigCodeModel` (GPTBigCode model)
- **gpt_neo** -- `GPTNeoModel` (GPT Neo model)
- **gpt_neox** -- `GPTNeoXModel` (GPT NeoX model)
- **gpt_neox_japanese** -- `GPTNeoXJapaneseModel` (GPT NeoX Japanese model)
- **gpt_oss** -- `GptOssModel` (GptOss model)
- **gptj** -- `GPTJModel` (GPT-J model)
- **gptsan-japanese** -- `GPTSanJapaneseForConditionalGeneration` (GPTSAN-japanese model)
- **granite** -- `GraniteModel` (Granite model)
- **granitemoe** -- `GraniteMoeModel` (GraniteMoe model)
- **granitemoehybrid** -- `GraniteMoeHybridModel` (GraniteMoeHybrid model)
- **granitemoeshared** -- `GraniteMoeSharedModel` (GraniteMoeShared model)
- **graphormer** -- `GraphormerModel` (Graphormer model)
- **grounding-dino** -- `GroundingDinoModel` (Grounding DINO model)
- **groupvit** -- `GroupViTModel` (GroupViT model)
- **helium** -- `HeliumModel` (Helium model)
- **hgnet_v2** -- `HGNetV2Backbone` (HGNet-V2 model)
- **hiera** -- `HieraModel` (Hiera model)
- **hubert** -- `HubertModel` (Hubert model)
- **hunyuan_v1_dense** -- `HunYuanDenseV1Model` (HunYuanDenseV1 model)
- **hunyuan_v1_moe** -- `HunYuanMoEV1Model` (HunYuanMoeV1 model)
- **ibert** -- `IBertModel` (I-BERT model)
- **idefics** -- `IdeficsModel` (IDEFICS model)
- **idefics2** -- `Idefics2Model` (Idefics2 model)
- **idefics3** -- `Idefics3Model` (Idefics3 model)
- **idefics3_vision** -- `Idefics3VisionTransformer` (Idefics3VisionTransformer model)
- **ijepa** -- `IJepaModel` (I-JEPA model)
- **imagegpt** -- `ImageGPTModel` (ImageGPT model)
- **informer** -- `InformerModel` (Informer model)
- **instructblip** -- `InstructBlipModel` (InstructBLIP model)
- **instructblipvideo** -- `InstructBlipVideoModel` (InstructBlipVideo model)
- **internvl** -- `InternVLModel` (InternVL model)
- **internvl_vision** -- `InternVLVisionModel` (InternVLVision model)
- **jamba** -- `JambaModel` (Jamba model)
- **janus** -- `JanusModel` (Janus model)
- **jetmoe** -- `JetMoeModel` (JetMoe model)
- **jukebox** -- `JukeboxModel` (Jukebox model)
- **kosmos-2** -- `Kosmos2Model` (KOSMOS-2 model)
- **kosmos-2.5** -- `Kosmos2_5Model` (KOSMOS-2.5 model)
- **kyutai_speech_to_text** -- `KyutaiSpeechToTextModel` (KyutaiSpeechToText model)
- **layoutlm** -- `LayoutLMModel` (LayoutLM model)
- **layoutlmv2** -- `LayoutLMv2Model` (LayoutLMv2 model)
- **layoutlmv3** -- `LayoutLMv3Model` (LayoutLMv3 model)
- **led** -- `LEDModel` (LED model)
- **levit** -- `LevitModel` (LeViT model)
- **lfm2** -- `Lfm2Model` (Lfm2 model)
- **lfm2_vl** -- `Lfm2VlModel` (Lfm2Vl model)
- **lightglue** -- `LightGlueForKeypointMatching` (LightGlue model)
- **lilt** -- `LiltModel` (LiLT model)
- **llama** -- `LlamaModel` (LLaMA model)
- **llama4** -- `Llama4ForConditionalGeneration` (Llama4 model)
- **llama4_text** -- `Llama4TextModel` (Llama4ForCausalLM model)
- **llava** -- `LlavaModel` (LLaVa model)
- **llava_next** -- `LlavaNextModel` (LLaVA-NeXT model)
- **llava_next_video** -- `LlavaNextVideoModel` (LLaVa-NeXT-Video model)
- **llava_onevision** -- `LlavaOnevisionModel` (LLaVA-Onevision model)
- **longcat_flash** -- `LongcatFlashModel` (LongCatFlash model)
- **longformer** -- `LongformerModel` (Longformer model)
- **longt5** -- `LongT5Model` (LongT5 model)
- **luke** -- `LukeModel` (LUKE model)
- **lxmert** -- `LxmertModel` (LXMERT model)
- **m2m_100** -- `M2M100Model` (M2M100 model)
- **mamba** -- `MambaModel` (Mamba model)
- **mamba2** -- `Mamba2Model` (mamba2 model)
- **marian** -- `MarianModel` (Marian model)
- **markuplm** -- `MarkupLMModel` (MarkupLM model)
- **mask2former** -- `Mask2FormerModel` (Mask2Former model)
- **maskformer** -- `MaskFormerModel` (MaskFormer model)
- **maskformer-swin** -- `MaskFormerSwinModel` (MaskFormerSwin model)
- **mbart** -- `MBartModel` (mBART model)
- **mctct** -- `MCTCTModel` (M-CTC-T model)
- **mega** -- `MegaModel` (MEGA model)
- **megatron-bert** -- `MegatronBertModel` (Megatron-BERT model)
- **metaclip_2** -- `MetaClip2Model` (MetaCLIP 2 model)
- **mgp-str** -- `MgpstrForSceneTextRecognition` (MGP-STR model)
- **mimi** -- `MimiModel` (Mimi model)
- **minimax** -- `MiniMaxModel` (MiniMax model)
- **ministral** -- `MinistralModel` (Ministral model)
- **mistral** -- `MistralModel` (Mistral model)
- **mistral3** -- `Mistral3Model` (Mistral3 model)
- **mixtral** -- `MixtralModel` (Mixtral model)
- **mlcd** -- `MLCDVisionModel` (MLCD model)
- **mllama** -- `MllamaModel` (Mllama model)
- **mm-grounding-dino** -- `MMGroundingDinoModel` (MM Grounding DINO model)
- **mobilebert** -- `MobileBertModel` (MobileBERT model)
- **mobilenet_v1** -- `MobileNetV1Model` (MobileNetV1 model)
- **mobilenet_v2** -- `MobileNetV2Model` (MobileNetV2 model)
- **mobilevit** -- `MobileViTModel` (MobileViT model)
- **mobilevitv2** -- `MobileViTV2Model` (MobileViTV2 model)
- **modernbert** -- `ModernBertModel` (ModernBERT model)
- **modernbert-decoder** -- `ModernBertDecoderModel` (ModernBertDecoder model)
- **moonshine** -- `MoonshineModel` (Moonshine model)
- **moshi** -- `MoshiModel` (Moshi model)
- **mpnet** -- `MPNetModel` (MPNet model)
- **mpt** -- `MptModel` (MPT model)
- **mra** -- `MraModel` (MRA model)
- **mt5** -- `MT5Model` (MT5 model)
- **musicgen** -- `MusicgenModel` (MusicGen model)
- **musicgen_melody** -- `MusicgenMelodyModel` (MusicGen Melody model)
- **mvp** -- `MvpModel` (MVP model)
- **nat** -- `NatModel` (NAT model)
- **nemotron** -- `NemotronModel` (Nemotron model)
- **nezha** -- `NezhaModel` (Nezha model)
- **nllb-moe** -- `NllbMoeModel` (NLLB-MOE model)
- **nystromformer** -- `NystromformerModel` (Nyströmformer model)
- **olmo** -- `OlmoModel` (OLMo model)
- **olmo2** -- `Olmo2Model` (OLMo2 model)
- **olmo3** -- `Olmo3Model` (Olmo3 model)
- **olmoe** -- `OlmoeModel` (OLMoE model)
- **omdet-turbo** -- `OmDetTurboForObjectDetection` (OmDet-Turbo model)
- **oneformer** -- `OneFormerModel` (OneFormer model)
- **open-llama** -- `OpenLlamaModel` (OpenLlama model)
- **openai-gpt** -- `OpenAIGPTModel` (OpenAI GPT model)
- **opt** -- `OPTModel` (OPT model)
- **ovis2** -- `Ovis2Model` (Ovis2 model)
- **owlv2** -- `Owlv2Model` (OWLv2 model)
- **owlvit** -- `OwlViTModel` (OWL-ViT model)
- **paligemma** -- `PaliGemmaModel` (PaliGemma model)
- **parakeet_ctc** -- `ParakeetForCTC` (Parakeet model)
- **parakeet_encoder** -- `ParakeetEncoder` (ParakeetEncoder model)
- **patchtsmixer** -- `PatchTSMixerModel` (PatchTSMixer model)
- **patchtst** -- `PatchTSTModel` (PatchTST model)
- **pegasus** -- `PegasusModel` (Pegasus model)
- **pegasus_x** -- `PegasusXModel` (PEGASUS-X model)
- **perceiver** -- `PerceiverModel` (Perceiver model)
- **perception_encoder** -- `PerceptionEncoder` (PerceptionEncoder model)
- **perception_lm** -- `PerceptionLMModel` (PerceptionLM model)
- **persimmon** -- `PersimmonModel` (Persimmon model)
- **phi** -- `PhiModel` (Phi model)
- **phi3** -- `Phi3Model` (Phi3 model)
- **phi4_multimodal** -- `Phi4MultimodalModel` (Phi4Multimodal model)
- **phimoe** -- `PhimoeModel` (Phimoe model)
- **pixtral** -- `PixtralVisionModel` (Pixtral model)
- **plbart** -- `PLBartModel` (PLBart model)
- **poolformer** -- `PoolFormerModel` (PoolFormer model)
- **prophetnet** -- `ProphetNetModel` (ProphetNet model)
- **pvt** -- `PvtModel` (PVT model)
- **pvt_v2** -- `PvtV2Model` (PVTv2 model)
- **qdqbert** -- `QDQBertModel` (QDQBert model)
- **qwen2** -- `Qwen2Model` (Qwen2 model)
- **qwen2_5_vl** -- `Qwen2_5_VLModel` (Qwen2_5_VL model)
- **qwen2_5_vl_text** -- `Qwen2_5_VLTextModel` (Qwen2_5_VL model)
- **qwen2_audio_encoder** -- `Qwen2AudioEncoder` (Qwen2AudioEncoder model)
- **qwen2_moe** -- `Qwen2MoeModel` (Qwen2MoE model)
- **qwen2_vl** -- `Qwen2VLModel` (Qwen2VL model)
- **qwen2_vl_text** -- `Qwen2VLTextModel` (Qwen2VL model)
- **qwen3** -- `Qwen3Model` (Qwen3 model)
- **qwen3_moe** -- `Qwen3MoeModel` (Qwen3MoE model)
- **qwen3_next** -- `Qwen3NextModel` (Qwen3Next model)
- **qwen3_vl** -- `Qwen3VLModel` (Qwen3VL model)
- **qwen3_vl_moe** -- `Qwen3VLMoeModel` (Qwen3VLMoe model)
- **qwen3_vl_moe_text** -- `Qwen3VLMoeTextModel` (Qwen3VLMoe model)
- **qwen3_vl_text** -- `Qwen3VLTextModel` (Qwen3VL model)
- **recurrent_gemma** -- `RecurrentGemmaModel` (RecurrentGemma model)
- **reformer** -- `ReformerModel` (Reformer model)
- **regnet** -- `RegNetModel` (RegNet model)
- **rembert** -- `RemBertModel` (RemBERT model)
- **resnet** -- `ResNetModel` (ResNet model)
- **retribert** -- `RetriBertModel` (RetriBERT model)
- **roberta** -- `RobertaModel` (RoBERTa model)
- **roberta-prelayernorm** -- `RobertaPreLayerNormModel` (RoBERTa-PreLayerNorm model)
- **roc_bert** -- `RoCBertModel` (RoCBert model)
- **roformer** -- `RoFormerModel` (RoFormer model)
- **rt_detr** -- `RTDetrModel` (RT-DETR model)
- **rt_detr_v2** -- `RTDetrV2Model` (RT-DETRv2 model)
- **rwkv** -- `RwkvModel` (RWKV model)
- **sam** -- `SamModel` (SAM model)
- **sam2** -- `Sam2Model` (SAM2 model)
- **sam2_hiera_det_model** -- `Sam2HieraDetModel` (Sam2HieraDetModel model)
- **sam2_video** -- `Sam2VideoModel` (Sam2VideoModel model)
- **sam2_vision_model** -- `Sam2VisionModel` (Sam2VisionModel model)
- **sam_hq** -- `SamHQModel` (SAM-HQ model)
- **sam_hq_vision_model** -- `SamHQVisionModel` (SamHQVisionModel model)
- **sam_vision_model** -- `SamVisionModel` (SamVisionModel model)
- **seamless_m4t** -- `SeamlessM4TModel` (SeamlessM4T model)
- **seamless_m4t_v2** -- `SeamlessM4Tv2Model` (SeamlessM4Tv2 model)
- **seed_oss** -- `SeedOssModel` (SeedOss model)
- **segformer** -- `SegformerModel` (SegFormer model)
- **seggpt** -- `SegGptModel` (SegGPT model)
- **sew** -- `SEWModel` (SEW model)
- **sew-d** -- `SEWDModel` (SEW-D model)
- **siglip** -- `SiglipModel` (SigLIP model)
- **siglip2** -- `Siglip2Model` (SigLIP2 model)
- **siglip2_vision_model** -- `Siglip2VisionModel` (Siglip2VisionModel model)
- **siglip_vision_model** -- `SiglipVisionModel` (SiglipVisionModel model)
- **smollm3** -- `SmolLM3Model` (SmolLM3 model)
- **smolvlm** -- `SmolVLMModel` (SmolVLM model)
- **smolvlm_vision** -- `SmolVLMVisionTransformer` (SmolVLMVisionTransformer model)
- **speech_to_text** -- `Speech2TextModel` (Speech2Text model)
- **speecht5** -- `SpeechT5Model` (SpeechT5 model)
- **splinter** -- `SplinterModel` (Splinter model)
- **squeezebert** -- `SqueezeBertModel` (SqueezeBERT model)
- **stablelm** -- `StableLmModel` (StableLm model)
- **starcoder2** -- `Starcoder2Model` (Starcoder2 model)
- **swiftformer** -- `SwiftFormerModel` (SwiftFormer model)
- **swin** -- `SwinModel` (Swin Transformer model)
- **swin2sr** -- `Swin2SRModel` (Swin2SR model)
- **swinv2** -- `Swinv2Model` (Swin Transformer V2 model)
- **switch_transformers** -- `SwitchTransformersModel` (SwitchTransformers model)
- **t5** -- `T5Model` (T5 model)
- **t5gemma** -- `T5GemmaModel` (T5Gemma model)
- **table-transformer** -- `TableTransformerModel` (Table Transformer model)
- **tapas** -- `TapasModel` (TAPAS model)
- **textnet** -- `TextNetModel` (TextNet model)
- **time_series_transformer** -- `TimeSeriesTransformerModel` (Time Series Transformer model)
- **timesfm** -- `TimesFmModel` (TimesFm model)
- **timesformer** -- `TimesformerModel` (TimeSformer model)
- **timm_backbone** -- `TimmBackbone` (TimmBackbone model)
- **timm_wrapper** -- `TimmWrapperModel` (TimmWrapperModel model)
- **trajectory_transformer** -- `TrajectoryTransformerModel` (Trajectory Transformer model)
- **transfo-xl** -- `TransfoXLModel` (Transformer-XL model)
- **tvlt** -- `TvltModel` (TVLT model)
- **tvp** -- `TvpModel` (TVP model)
- **udop** -- `UdopModel` (UDOP model)
- **umt5** -- `UMT5Model` (UMT5 model)
- **unispeech** -- `UniSpeechModel` (UniSpeech model)
- **unispeech-sat** -- `UniSpeechSatModel` (UniSpeechSat model)
- **univnet** -- `UnivNetModel` (UnivNet model)
- **van** -- `VanModel` (VAN model)
- **vaultgemma** -- `VaultGemmaModel` (VaultGemma model)
- **video_llava** -- `VideoLlavaModel` (VideoLlava model)
- **videomae** -- `VideoMAEModel` (VideoMAE model)
- **vilt** -- `ViltModel` (ViLT model)
- **vipllava** -- `VipLlavaModel` (VipLlava model)
- **vision-text-dual-encoder** -- `VisionTextDualEncoderModel` (VisionTextDualEncoder model)
- **visual_bert** -- `VisualBertModel` (VisualBERT model)
- **vit** -- `ViTModel` (ViT model)
- **vit_hybrid** -- `ViTHybridModel` (ViT Hybrid model)
- **vit_mae** -- `ViTMAEModel` (ViTMAE model)
- **vit_msn** -- `ViTMSNModel` (ViTMSN model)
- **vitdet** -- `VitDetModel` (VitDet model)
- **vits** -- `VitsModel` (VITS model)
- **vivit** -- `VivitModel` (ViViT model)
- **vjepa2** -- `VJEPA2Model` (VJEPA2Model model)
- **voxtral** -- `VoxtralForConditionalGeneration` (Voxtral model)
- **voxtral_encoder** -- `VoxtralEncoder` (Voxtral Encoder model)
- **wav2vec2** -- `Wav2Vec2Model` (Wav2Vec2 model)
- **wav2vec2-bert** -- `Wav2Vec2BertModel` (Wav2Vec2-BERT model)
- **wav2vec2-conformer** -- `Wav2Vec2ConformerModel` (Wav2Vec2-Conformer model)
- **wavlm** -- `WavLMModel` (WavLM model)
- **whisper** -- `WhisperModel` (Whisper model)
- **xclip** -- `XCLIPModel` (X-CLIP model)
- **xcodec** -- `XcodecModel` (X-CODEC model)
- **xglm** -- `XGLMModel` (XGLM model)
- **xlm** -- `XLMModel` (XLM model)
- **xlm-prophetnet** -- `XLMProphetNetModel` (XLM-ProphetNet model)
- **xlm-roberta** -- `XLMRobertaModel` (XLM-RoBERTa model)
- **xlm-roberta-xl** -- `XLMRobertaXLModel` (XLM-RoBERTa-XL model)
- **xlnet** -- `XLNetModel` (XLNet model)
- **xlstm** -- `xLSTMModel` (xLSTM model)
- **xmod** -- `XmodModel` (X-MOD model)
- **yolos** -- `YolosModel` (YOLOS model)
- **yoso** -- `YosoModel` (YOSO model)
- **zamba** -- `ZambaModel` (Zamba model)
- **zamba2** -- `Zamba2Model` (Zamba2 model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.
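
The evaluation/training toggle applies to any `torch.nn.Module`, so it can be illustrated with a bare dropout layer (a minimal sketch in plain PyTorch, no pretrained weights involved):

```python
import torch.nn as nn

# Dropout is only active in training mode; `eval()` turns it into a no-op,
# which is the state `from_pretrained` leaves the loaded model in by default.
layer = nn.Dropout(p=0.5)

layer.eval()  # inference mode, as set by default after loading
assert not layer.training

layer.train()  # switch back to training mode before fine-tuning
assert layer.training
```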

Examples:

```python
>>> from transformers import AutoConfig, AutoModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModel.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModel.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModel.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
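
The `kwargs` routing described above can be sketched in plain Python (a hypothetical illustration of the documented split, not the library's actual implementation; `split_kwargs` is an invented helper):

```python
# Hypothetical sketch: when no `config` is passed, keys that match config
# attributes update the configuration, and the remaining keys are forwarded
# to the underlying model's __init__ method.
def split_kwargs(config_attributes, kwargs):
    config_updates = {k: v for k, v in kwargs.items() if k in config_attributes}
    model_kwargs = {k: v for k, v in kwargs.items() if k not in config_attributes}
    return config_updates, model_kwargs

config_updates, model_kwargs = split_kwargs(
    {"output_attentions", "hidden_size"},
    {"output_attentions": True, "custom_model_arg": 1},
)
# config_updates == {"output_attentions": True}
# model_kwargs == {"custom_model_arg": 1}
```

When a `config` object is passed explicitly, this split is skipped and all of `kwargs` goes straight to `__init__`.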

### TFAutoModel[[transformers.TFAutoModel]]

#### transformers.TFAutoModel[[transformers.TFAutoModel]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L538)

This is a generic model class that will be instantiated as one of the base model classes of the library when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModel.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [TFAlbertModel](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertModel) (ALBERT model)
  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [TFBartModel](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.TFBartModel) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [TFBertModel](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertModel) (BERT model)
  - [BlenderbotConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotConfig) configuration class: [TFBlenderbotModel](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.TFBlenderbotModel) (Blenderbot model)
  - [BlenderbotSmallConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallConfig) configuration class: [TFBlenderbotSmallModel](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.TFBlenderbotSmallModel) (BlenderbotSmall model)
  - [BlipConfig](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipConfig) configuration class: [TFBlipModel](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.TFBlipModel) (BLIP model)
  - [CLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPConfig) configuration class: [TFCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.TFCLIPModel) (CLIP model)
  - [CTRLConfig](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLConfig) configuration class: [TFCTRLModel](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.TFCTRLModel) (CTRL model)
  - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [TFCamembertModel](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertModel) (CamemBERT model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertModel](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.TFConvBertModel) (ConvBERT model)
  - [ConvNextConfig](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextConfig) configuration class: [TFConvNextModel](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.TFConvNextModel) (ConvNeXT model)
  - [ConvNextV2Config](/docs/transformers/v4.57.1/ja/model_doc/convnextv2#transformers.ConvNextV2Config) configuration class: [TFConvNextV2Model](/docs/transformers/v4.57.1/ja/model_doc/convnextv2#transformers.TFConvNextV2Model) (ConvNeXTV2 model)
  - [CvtConfig](/docs/transformers/v4.57.1/ja/model_doc/cvt#transformers.CvtConfig) configuration class: [TFCvtModel](/docs/transformers/v4.57.1/ja/model_doc/cvt#transformers.TFCvtModel) (CvT model)
  - `DPRConfig` configuration class: `TFDPRQuestionEncoder` (DPR model)
  - [Data2VecVisionConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecVisionConfig) configuration class: [TFData2VecVisionModel](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.TFData2VecVisionModel) (Data2VecVision model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) configuration class: [TFDebertaModel](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.TFDebertaModel) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2Model](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.TFDebertaV2Model) (DeBERTa-v2 model)
  - [DeiTConfig](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTConfig) configuration class: [TFDeiTModel](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.TFDeiTModel) (DeiT model)
  - `DistilBertConfig` configuration class: `TFDistilBertModel` (DistilBERT model)
  - `EfficientFormerConfig` configuration class: `TFEfficientFormerModel` (EfficientFormer model)
  - `ElectraConfig` configuration class: `TFElectraModel` (ELECTRA model)
  - `EsmConfig` configuration class: `TFEsmModel` (ESM model)
  - `FlaubertConfig` configuration class: `TFFlaubertModel` (FlauBERT model)
  - `FunnelConfig` configuration class: `TFFunnelModel` or `TFFunnelBaseModel` (Funnel Transformer model)
  - `GPT2Config` configuration class: `TFGPT2Model` (OpenAI GPT-2 model)
  - `GPTJConfig` configuration class: `TFGPTJModel` (GPT-J model)
  - `GroupViTConfig` configuration class: `TFGroupViTModel` (GroupViT model)
  - `HubertConfig` configuration class: `TFHubertModel` (Hubert model)
  - `IdeficsConfig` configuration class: `TFIdeficsModel` (IDEFICS model)
  - `LEDConfig` configuration class: `TFLEDModel` (LED model)
  - `LayoutLMConfig` configuration class: `TFLayoutLMModel` (LayoutLM model)
  - `LayoutLMv3Config` configuration class: `TFLayoutLMv3Model` (LayoutLMv3 model)
  - `LongformerConfig` configuration class: `TFLongformerModel` (Longformer model)
  - `LxmertConfig` configuration class: `TFLxmertModel` (LXMERT model)
  - `MBartConfig` configuration class: `TFMBartModel` (mBART model)
  - `MPNetConfig` configuration class: `TFMPNetModel` (MPNet model)
  - `MT5Config` configuration class: `TFMT5Model` (MT5 model)
  - `MarianConfig` configuration class: `TFMarianModel` (Marian model)
  - `MistralConfig` configuration class: `TFMistralModel` (Mistral model)
  - `MobileBertConfig` configuration class: `TFMobileBertModel` (MobileBERT model)
  - `MobileViTConfig` configuration class: `TFMobileViTModel` (MobileViT model)
  - `OPTConfig` configuration class: `TFOPTModel` (OPT model)
  - `OpenAIGPTConfig` configuration class: `TFOpenAIGPTModel` (OpenAI GPT model)
  - `PegasusConfig` configuration class: `TFPegasusModel` (Pegasus model)
  - `RegNetConfig` configuration class: `TFRegNetModel` (RegNet model)
  - `RemBertConfig` configuration class: `TFRemBertModel` (RemBERT model)
  - `ResNetConfig` configuration class: `TFResNetModel` (ResNet model)
  - `RoFormerConfig` configuration class: `TFRoFormerModel` (RoFormer model)
  - `RobertaConfig` configuration class: `TFRobertaModel` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormModel` (RoBERTa-PreLayerNorm model)
  - `SamConfig` configuration class: `TFSamModel` (SAM model)
  - `SamVisionConfig` configuration class: `TFSamVisionModel` (SamVisionModel model)
  - `SegformerConfig` configuration class: `TFSegformerModel` (SegFormer model)
  - `Speech2TextConfig` configuration class: `TFSpeech2TextModel` (Speech2Text model)
  - `SwiftFormerConfig` configuration class: `TFSwiftFormerModel` (SwiftFormer model)
  - `SwinConfig` configuration class: `TFSwinModel` (Swin Transformer model)
  - `T5Config` configuration class: `TFT5Model` (T5 model)
  - `TapasConfig` configuration class: `TFTapasModel` (TAPAS model)
  - `TransfoXLConfig` configuration class: `TFTransfoXLModel` (Transformer-XL model)
  - `ViTConfig` configuration class: `TFViTModel` (ViT model)
  - `ViTMAEConfig` configuration class: `TFViTMAEModel` (ViTMAE model)
  - `VisionTextDualEncoderConfig` configuration class: `TFVisionTextDualEncoderModel` (VisionTextDualEncoder model)
  - `Wav2Vec2Config` configuration class: `TFWav2Vec2Model` (Wav2Vec2 model)
  - `WhisperConfig` configuration class: `TFWhisperModel` (Whisper model)
  - `XGLMConfig` configuration class: `TFXGLMModel` (XGLM model)
  - `XLMConfig` configuration class: `TFXLMModel` (XLM model)
  - `XLMRobertaConfig` configuration class: `TFXLMRobertaModel` (XLM-RoBERTa model)
  - `XLNetConfig` configuration class: `TFXLNetModel` (XLNet model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the base model classes of the library from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModel

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModel.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:

- [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [TFAlbertModel](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertModel) (ALBERT model)
- [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [TFBartModel](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.TFBartModel) (BART model)
- [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [TFBertModel](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertModel) (BERT model)
- [BlenderbotConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotConfig) configuration class: [TFBlenderbotModel](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.TFBlenderbotModel) (Blenderbot model)
- [BlenderbotSmallConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallConfig) configuration class: [TFBlenderbotSmallModel](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.TFBlenderbotSmallModel) (BlenderbotSmall model)
- [BlipConfig](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipConfig) configuration class: [TFBlipModel](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.TFBlipModel) (BLIP model)
- [CLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPConfig) configuration class: [TFCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.TFCLIPModel) (CLIP model)
- [CTRLConfig](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLConfig) configuration class: [TFCTRLModel](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.TFCTRLModel) (CTRL model)
- [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [TFCamembertModel](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertModel) (CamemBERT model)
- [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertModel](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.TFConvBertModel) (ConvBERT model)
- [ConvNextConfig](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextConfig) configuration class: [TFConvNextModel](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.TFConvNextModel) (ConvNeXT model)
- [ConvNextV2Config](/docs/transformers/v4.57.1/ja/model_doc/convnextv2#transformers.ConvNextV2Config) configuration class: [TFConvNextV2Model](/docs/transformers/v4.57.1/ja/model_doc/convnextv2#transformers.TFConvNextV2Model) (ConvNeXTV2 model)
- [CvtConfig](/docs/transformers/v4.57.1/ja/model_doc/cvt#transformers.CvtConfig) configuration class: [TFCvtModel](/docs/transformers/v4.57.1/ja/model_doc/cvt#transformers.TFCvtModel) (CvT model)
- `DPRConfig` configuration class: `TFDPRQuestionEncoder` (DPR model)
- [Data2VecVisionConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecVisionConfig) configuration class: [TFData2VecVisionModel](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.TFData2VecVisionModel) (Data2VecVision model)
- [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) configuration class: [TFDebertaModel](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.TFDebertaModel) (DeBERTa model)
- [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2Model](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.TFDebertaV2Model) (DeBERTa-v2 model)
- [DeiTConfig](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTConfig) configuration class: [TFDeiTModel](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.TFDeiTModel) (DeiT model)
- `DistilBertConfig` configuration class: `TFDistilBertModel` (DistilBERT model)
- `EfficientFormerConfig` configuration class: `TFEfficientFormerModel` (EfficientFormer model)
- `ElectraConfig` configuration class: `TFElectraModel` (ELECTRA model)
- `EsmConfig` configuration class: `TFEsmModel` (ESM model)
- `FlaubertConfig` configuration class: `TFFlaubertModel` (FlauBERT model)
- `FunnelConfig` configuration class: `TFFunnelModel` or `TFFunnelBaseModel` (Funnel Transformer model)
- `GPT2Config` configuration class: `TFGPT2Model` (OpenAI GPT-2 model)
- `GPTJConfig` configuration class: `TFGPTJModel` (GPT-J model)
- `GroupViTConfig` configuration class: `TFGroupViTModel` (GroupViT model)
- `HubertConfig` configuration class: `TFHubertModel` (Hubert model)
- `IdeficsConfig` configuration class: `TFIdeficsModel` (IDEFICS model)
- `LEDConfig` configuration class: `TFLEDModel` (LED model)
- `LayoutLMConfig` configuration class: `TFLayoutLMModel` (LayoutLM model)
- `LayoutLMv3Config` configuration class: `TFLayoutLMv3Model` (LayoutLMv3 model)
- `LongformerConfig` configuration class: `TFLongformerModel` (Longformer model)
- `LxmertConfig` configuration class: `TFLxmertModel` (LXMERT model)
- `MBartConfig` configuration class: `TFMBartModel` (mBART model)
- `MPNetConfig` configuration class: `TFMPNetModel` (MPNet model)
- `MT5Config` configuration class: `TFMT5Model` (MT5 model)
- `MarianConfig` configuration class: `TFMarianModel` (Marian model)
- `MistralConfig` configuration class: `TFMistralModel` (Mistral model)
- `MobileBertConfig` configuration class: `TFMobileBertModel` (MobileBERT model)
- `MobileViTConfig` configuration class: `TFMobileViTModel` (MobileViT model)
- `OPTConfig` configuration class: `TFOPTModel` (OPT model)
- `OpenAIGPTConfig` configuration class: `TFOpenAIGPTModel` (OpenAI GPT model)
- `PegasusConfig` configuration class: `TFPegasusModel` (Pegasus model)
- `RegNetConfig` configuration class: `TFRegNetModel` (RegNet model)
- `RemBertConfig` configuration class: `TFRemBertModel` (RemBERT model)
- `ResNetConfig` configuration class: `TFResNetModel` (ResNet model)
- `RoFormerConfig` configuration class: `TFRoFormerModel` (RoFormer model)
- `RobertaConfig` configuration class: `TFRobertaModel` (RoBERTa model)
- `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormModel` (RoBERTa-PreLayerNorm model)
- `SamConfig` configuration class: `TFSamModel` (SAM model)
- `SamVisionConfig` configuration class: `TFSamVisionModel` (SamVisionModel model)
- `SegformerConfig` configuration class: `TFSegformerModel` (SegFormer model)
- `Speech2TextConfig` configuration class: `TFSpeech2TextModel` (Speech2Text model)
- `SwiftFormerConfig` configuration class: `TFSwiftFormerModel` (SwiftFormer model)
- `SwinConfig` configuration class: `TFSwinModel` (Swin Transformer model)
- `T5Config` configuration class: `TFT5Model` (T5 model)
- `TapasConfig` configuration class: `TFTapasModel` (TAPAS model)
- `TransfoXLConfig` configuration class: `TFTransfoXLModel` (Transformer-XL model)
- `ViTConfig` configuration class: `TFViTModel` (ViT model)
- `ViTMAEConfig` configuration class: `TFViTMAEModel` (ViTMAE model)
- `VisionTextDualEncoderConfig` configuration class: `TFVisionTextDualEncoderModel` (VisionTextDualEncoder model)
- `Wav2Vec2Config` configuration class: `TFWav2Vec2Model` (Wav2Vec2 model)
- `WhisperConfig` configuration class: `TFWhisperModel` (Whisper model)
- `XGLMConfig` configuration class: `TFXGLMModel` (XGLM model)
- `XLMConfig` configuration class: `TFXLMModel` (XLM model)
- `XLMRobertaConfig` configuration class: `TFXLMRobertaModel` (XLM-RoBERTa model)
- `XLNetConfig` configuration class: `TFXLNetModel` (XLNet model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

#### from_pretrained[[transformers.TFAutoModel.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the base model classes of the library from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [TFAlbertModel](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertModel) (ALBERT model)
- **bart** -- [TFBartModel](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.TFBartModel) (BART model)
- **bert** -- [TFBertModel](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertModel) (BERT model)
- **blenderbot** -- [TFBlenderbotModel](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.TFBlenderbotModel) (Blenderbot model)
- **blenderbot-small** -- [TFBlenderbotSmallModel](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.TFBlenderbotSmallModel) (BlenderbotSmall model)
- **blip** -- [TFBlipModel](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.TFBlipModel) (BLIP model)
- **camembert** -- [TFCamembertModel](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertModel) (CamemBERT model)
- **clip** -- [TFCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.TFCLIPModel) (CLIP model)
- **convbert** -- [TFConvBertModel](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.TFConvBertModel) (ConvBERT model)
- **convnext** -- [TFConvNextModel](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.TFConvNextModel) (ConvNeXT model)
- **convnextv2** -- [TFConvNextV2Model](/docs/transformers/v4.57.1/ja/model_doc/convnextv2#transformers.TFConvNextV2Model) (ConvNeXTV2 model)
- **ctrl** -- [TFCTRLModel](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.TFCTRLModel) (CTRL model)
- **cvt** -- [TFCvtModel](/docs/transformers/v4.57.1/ja/model_doc/cvt#transformers.TFCvtModel) (CvT model)
- **data2vec-vision** -- [TFData2VecVisionModel](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.TFData2VecVisionModel) (Data2VecVision model)
- **deberta** -- [TFDebertaModel](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.TFDebertaModel) (DeBERTa model)
- **deberta-v2** -- [TFDebertaV2Model](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.TFDebertaV2Model) (DeBERTa-v2 model)
- **deit** -- [TFDeiTModel](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.TFDeiTModel) (DeiT model)
- **distilbert** -- `TFDistilBertModel` (DistilBERT model)
- **dpr** -- `TFDPRQuestionEncoder` (DPR model)
- **efficientformer** -- `TFEfficientFormerModel` (EfficientFormer model)
- **electra** -- `TFElectraModel` (ELECTRA model)
- **esm** -- `TFEsmModel` (ESM model)
- **flaubert** -- `TFFlaubertModel` (FlauBERT model)
- **funnel** -- `TFFunnelModel` or `TFFunnelBaseModel` (Funnel Transformer model)
- **gpt-sw3** -- `TFGPT2Model` (GPT-Sw3 model)
- **gpt2** -- `TFGPT2Model` (OpenAI GPT-2 model)
- **gptj** -- `TFGPTJModel` (GPT-J model)
- **groupvit** -- `TFGroupViTModel` (GroupViT model)
- **hubert** -- `TFHubertModel` (Hubert model)
- **idefics** -- `TFIdeficsModel` (IDEFICS model)
- **layoutlm** -- `TFLayoutLMModel` (LayoutLM model)
- **layoutlmv3** -- `TFLayoutLMv3Model` (LayoutLMv3 model)
- **led** -- `TFLEDModel` (LED model)
- **longformer** -- `TFLongformerModel` (Longformer model)
- **lxmert** -- `TFLxmertModel` (LXMERT model)
- **marian** -- `TFMarianModel` (Marian model)
- **mbart** -- `TFMBartModel` (mBART model)
- **mistral** -- `TFMistralModel` (Mistral model)
- **mobilebert** -- `TFMobileBertModel` (MobileBERT model)
- **mobilevit** -- `TFMobileViTModel` (MobileViT model)
- **mpnet** -- `TFMPNetModel` (MPNet model)
- **mt5** -- `TFMT5Model` (MT5 model)
- **openai-gpt** -- `TFOpenAIGPTModel` (OpenAI GPT model)
- **opt** -- `TFOPTModel` (OPT model)
- **pegasus** -- `TFPegasusModel` (Pegasus model)
- **regnet** -- `TFRegNetModel` (RegNet model)
- **rembert** -- `TFRemBertModel` (RemBERT model)
- **resnet** -- `TFResNetModel` (ResNet model)
- **roberta** -- `TFRobertaModel` (RoBERTa model)
- **roberta-prelayernorm** -- `TFRobertaPreLayerNormModel` (RoBERTa-PreLayerNorm model)
- **roformer** -- `TFRoFormerModel` (RoFormer model)
- **sam** -- `TFSamModel` (SAM model)
- **sam_vision_model** -- `TFSamVisionModel` (SamVisionModel model)
- **segformer** -- `TFSegformerModel` (SegFormer model)
- **speech_to_text** -- `TFSpeech2TextModel` (Speech2Text model)
- **swiftformer** -- `TFSwiftFormerModel` (SwiftFormer model)
- **swin** -- `TFSwinModel` (Swin Transformer model)
- **t5** -- `TFT5Model` (T5 model)
- **tapas** -- `TFTapasModel` (TAPAS model)
- **transfo-xl** -- `TFTransfoXLModel` (Transformer-XL model)
- **vision-text-dual-encoder** -- `TFVisionTextDualEncoderModel` (VisionTextDualEncoder model)
- **vit** -- `TFViTModel` (ViT model)
- **vit_mae** -- `TFViTMAEModel` (ViTMAE model)
- **wav2vec2** -- `TFWav2Vec2Model` (Wav2Vec2 model)
- **whisper** -- `TFWhisperModel` (Whisper model)
- **xglm** -- `TFXGLMModel` (XGLM model)
- **xlm** -- `TFXLMModel` (XLM model)
- **xlm-roberta** -- `TFXLMRobertaModel` (XLM-RoBERTa model)
- **xlnet** -- `TFXLNetModel` (XLNet model)
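The dispatch described above can be pictured as a dictionary lookup keyed on `model_type`. The following is a simplified, self-contained sketch of that pattern; the mapping entries and function name are illustrative, not the library's actual registry.

```python
# Simplified sketch of the model_type -> class dispatch that TFAutoModel
# performs internally. The mapping below is illustrative, not exhaustive.
TF_MODEL_MAPPING = {
    "bert": "TFBertModel",
    "gpt2": "TFGPT2Model",
    "t5": "TFT5Model",
}


def resolve_model_class(config_model_type: str) -> str:
    """Return the model class name registered for a given model_type."""
    try:
        return TF_MODEL_MAPPING[config_model_type]
    except KeyError:
        raise ValueError(
            f"Unrecognized model_type {config_model_type!r}; "
            f"should be one of {sorted(TF_MODEL_MAPPING)}"
        )


print(resolve_model_class("bert"))  # TFBertModel
```

The real implementation uses lazy mappings so model classes are only imported when requested, but the lookup-then-instantiate shape is the same.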

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModel.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModel.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModel.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
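The two `kwargs`-routing branches above can be sketched in a few lines of plain Python. This is a minimal illustration of the documented rule, not the library's actual implementation; `DummyConfig` and `split_kwargs` are hypothetical names.

```python
# Sketch of how from_pretrained routes **kwargs, per the rule above:
# - config provided  -> all kwargs go to the model __init__
# - config missing   -> kwargs matching config attributes override the
#                       config; the rest go to the model __init__


class DummyConfig:
    """Stand-in for a PretrainedConfig subclass (illustrative only)."""

    def __init__(self, output_attentions=False, hidden_size=4):
        self.output_attentions = output_attentions
        self.hidden_size = hidden_size


def split_kwargs(config_cls, provided_config=None, **kwargs):
    """Return (config, model_kwargs) following the documented rules."""
    if provided_config is not None:
        # Explicit config: pass every kwarg straight to the model.
        return provided_config, kwargs
    config = config_cls()
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # overrides the config attribute
        else:
            model_kwargs[key] = value  # forwarded to the model __init__
    return config, model_kwargs


config, extra = split_kwargs(DummyConfig, output_attentions=True, foo=1)
print(config.output_attentions, extra)  # True {'foo': 1}
```

This is why `TFAutoModel.from_pretrained("...", output_attentions=True)` in the example above ends up with `model.config.output_attentions == True`: the keyword matches a configuration attribute, so it is applied to the config rather than the model constructor.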

### FlaxAutoModel[[transformers.FlaxAutoModel]]

#### transformers.FlaxAutoModel[[transformers.FlaxAutoModel]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L281)

This is a generic model class that will be instantiated as one of the base model classes of the library when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModel.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [FlaxAlbertModel](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertModel) (ALBERT model)
  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartModel](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.FlaxBartModel) (BART model)
  - [BeitConfig](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitConfig) configuration class: [FlaxBeitModel](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.FlaxBeitModel) (BEiT model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertModel](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertModel) (BERT model)
  - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [FlaxBigBirdModel](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdModel) (BigBird model)
  - [BlenderbotConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotConfig) configuration class: [FlaxBlenderbotModel](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.FlaxBlenderbotModel) (Blenderbot model)
  - [BlenderbotSmallConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallConfig) configuration class: [FlaxBlenderbotSmallModel](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.FlaxBlenderbotSmallModel) (BlenderbotSmall model)
  - [BloomConfig](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomConfig) configuration class: [FlaxBloomModel](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.FlaxBloomModel) (BLOOM model)
  - [CLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPConfig) configuration class: [FlaxCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.FlaxCLIPModel) (CLIP model)
  - `Dinov2Config` configuration class: `FlaxDinov2Model` (DINOv2 model)
  - `DistilBertConfig` configuration class: `FlaxDistilBertModel` (DistilBERT model)
  - `ElectraConfig` configuration class: `FlaxElectraModel` (ELECTRA model)
  - `GPT2Config` configuration class: `FlaxGPT2Model` (OpenAI GPT-2 model)
  - `GPTJConfig` configuration class: `FlaxGPTJModel` (GPT-J model)
  - `GPTNeoConfig` configuration class: `FlaxGPTNeoModel` (GPT Neo model)
  - `GemmaConfig` configuration class: `FlaxGemmaModel` (Gemma model)
  - `LlamaConfig` configuration class: `FlaxLlamaModel` (LLaMA model)
  - `LongT5Config` configuration class: `FlaxLongT5Model` (LongT5 model)
  - `MBartConfig` configuration class: `FlaxMBartModel` (mBART model)
  - `MT5Config` configuration class: `FlaxMT5Model` (MT5 model)
  - `MarianConfig` configuration class: `FlaxMarianModel` (Marian model)
  - `MistralConfig` configuration class: `FlaxMistralModel` (Mistral model)
  - `OPTConfig` configuration class: `FlaxOPTModel` (OPT model)
  - `PegasusConfig` configuration class: `FlaxPegasusModel` (Pegasus model)
  - `RegNetConfig` configuration class: `FlaxRegNetModel` (RegNet model)
  - `ResNetConfig` configuration class: `FlaxResNetModel` (ResNet model)
  - `RoFormerConfig` configuration class: `FlaxRoFormerModel` (RoFormer model)
  - `RobertaConfig` configuration class: `FlaxRobertaModel` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormModel` (RoBERTa-PreLayerNorm model)
  - `T5Config` configuration class: `FlaxT5Model` (T5 model)
  - `ViTConfig` configuration class: `FlaxViTModel` (ViT model)
  - `VisionTextDualEncoderConfig` configuration class: `FlaxVisionTextDualEncoderModel` (VisionTextDualEncoder model)
  - `Wav2Vec2Config` configuration class: `FlaxWav2Vec2Model` (Wav2Vec2 model)
  - `WhisperConfig` configuration class: `FlaxWhisperModel` (Whisper model)
  - `XGLMConfig` configuration class: `FlaxXGLMModel` (XGLM model)
  - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaModel` (XLM-RoBERTa model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the base model classes of the library from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModel

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModel.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [FlaxAlbertModel](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertModel) (ALBERT model) - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartModel](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.FlaxBartModel) (BART model) - [BeitConfig](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitConfig) configuration class: [FlaxBeitModel](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.FlaxBeitModel) (BEiT model) - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertModel](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertModel) (BERT model) - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [FlaxBigBirdModel](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdModel) (BigBird model) - [BlenderbotConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotConfig) configuration class: [FlaxBlenderbotModel](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.FlaxBlenderbotModel) (Blenderbot model) - [BlenderbotSmallConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallConfig) configuration class: [FlaxBlenderbotSmallModel](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.FlaxBlenderbotSmallModel) (BlenderbotSmall model) - [BloomConfig](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomConfig) configuration class: 
[FlaxBloomModel](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.FlaxBloomModel) (BLOOM model) - [CLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPConfig) configuration class: [FlaxCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.FlaxCLIPModel) (CLIP model) - `Dinov2Config` configuration class: `FlaxDinov2Model` (DINOv2 model) - `DistilBertConfig` configuration class: `FlaxDistilBertModel` (DistilBERT model) - `ElectraConfig` configuration class: `FlaxElectraModel` (ELECTRA model) - `GPT2Config` configuration class: `FlaxGPT2Model` (OpenAI GPT-2 model) - `GPTJConfig` configuration class: `FlaxGPTJModel` (GPT-J model) - `GPTNeoConfig` configuration class: `FlaxGPTNeoModel` (GPT Neo model) - `GemmaConfig` configuration class: `FlaxGemmaModel` (Gemma model) - `LlamaConfig` configuration class: `FlaxLlamaModel` (LLaMA model) - `LongT5Config` configuration class: `FlaxLongT5Model` (LongT5 model) - `MBartConfig` configuration class: `FlaxMBartModel` (mBART model) - `MT5Config` configuration class: `FlaxMT5Model` (MT5 model) - `MarianConfig` configuration class: `FlaxMarianModel` (Marian model) - `MistralConfig` configuration class: `FlaxMistralModel` (Mistral model) - `OPTConfig` configuration class: `FlaxOPTModel` (OPT model) - `PegasusConfig` configuration class: `FlaxPegasusModel` (Pegasus model) - `RegNetConfig` configuration class: `FlaxRegNetModel` (RegNet model) - `ResNetConfig` configuration class: `FlaxResNetModel` (ResNet model) - `RoFormerConfig` configuration class: `FlaxRoFormerModel` (RoFormer model) - `RobertaConfig` configuration class: `FlaxRobertaModel` (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormModel` (RoBERTa-PreLayerNorm model) - `T5Config` configuration class: `FlaxT5Model` (T5 model) - `ViTConfig` configuration class: `FlaxViTModel` (ViT model) - `VisionTextDualEncoderConfig` configuration class: `FlaxVisionTextDualEncoderModel` 
(VisionTextDualEncoder model) - `Wav2Vec2Config` configuration class: `FlaxWav2Vec2Model` (Wav2Vec2 model) - `WhisperConfig` configuration class: `FlaxWhisperModel` (Whisper model) - `XGLMConfig` configuration class: `FlaxXGLMModel` (XGLM model) - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaModel` (XLM-RoBERTa model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.FlaxAutoModel.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the base model classes of the library from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [FlaxAlbertModel](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertModel) (ALBERT model)
- **bart** -- [FlaxBartModel](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.FlaxBartModel) (BART model)
- **beit** -- [FlaxBeitModel](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.FlaxBeitModel) (BEiT model)
- **bert** -- [FlaxBertModel](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertModel) (BERT model)
- **big_bird** -- [FlaxBigBirdModel](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdModel) (BigBird model)
- **blenderbot** -- [FlaxBlenderbotModel](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.FlaxBlenderbotModel) (Blenderbot model)
- **blenderbot-small** -- [FlaxBlenderbotSmallModel](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.FlaxBlenderbotSmallModel) (BlenderbotSmall model)
- **bloom** -- [FlaxBloomModel](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.FlaxBloomModel) (BLOOM model)
- **clip** -- [FlaxCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.FlaxCLIPModel) (CLIP model)
- **dinov2** -- `FlaxDinov2Model` (DINOv2 model)
- **distilbert** -- `FlaxDistilBertModel` (DistilBERT model)
- **electra** -- `FlaxElectraModel` (ELECTRA model)
- **gemma** -- `FlaxGemmaModel` (Gemma model)
- **gpt-sw3** -- `FlaxGPT2Model` (GPT-Sw3 model)
- **gpt2** -- `FlaxGPT2Model` (OpenAI GPT-2 model)
- **gpt_neo** -- `FlaxGPTNeoModel` (GPT Neo model)
- **gptj** -- `FlaxGPTJModel` (GPT-J model)
- **llama** -- `FlaxLlamaModel` (LLaMA model)
- **longt5** -- `FlaxLongT5Model` (LongT5 model)
- **marian** -- `FlaxMarianModel` (Marian model)
- **mbart** -- `FlaxMBartModel` (mBART model)
- **mistral** -- `FlaxMistralModel` (Mistral model)
- **mt5** -- `FlaxMT5Model` (MT5 model)
- **opt** -- `FlaxOPTModel` (OPT model)
- **pegasus** -- `FlaxPegasusModel` (Pegasus model)
- **regnet** -- `FlaxRegNetModel` (RegNet model)
- **resnet** -- `FlaxResNetModel` (ResNet model)
- **roberta** -- `FlaxRobertaModel` (RoBERTa model)
- **roberta-prelayernorm** -- `FlaxRobertaPreLayerNormModel` (RoBERTa-PreLayerNorm model)
- **roformer** -- `FlaxRoFormerModel` (RoFormer model)
- **t5** -- `FlaxT5Model` (T5 model)
- **vision-text-dual-encoder** -- `FlaxVisionTextDualEncoderModel` (VisionTextDualEncoder model)
- **vit** -- `FlaxViTModel` (ViT model)
- **wav2vec2** -- `FlaxWav2Vec2Model` (Wav2Vec2 model)
- **whisper** -- `FlaxWhisperModel` (Whisper model)
- **xglm** -- `FlaxXGLMModel` (XGLM model)
- **xlm-roberta** -- `FlaxXLMRobertaModel` (XLM-RoBERTa model)

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModel.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModel.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModel.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

## Generic pretraining classes

The following auto classes are available for instantiating a model with a pretraining head.

### AutoModelForPreTraining[[transformers.AutoModelForPreTraining]]

#### transformers.AutoModelForPreTraining[[transformers.AutoModelForPreTraining]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L1947)

This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForPreTraining.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [AlbertForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertForPreTraining) (ALBERT model)
  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [BartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartForConditionalGeneration) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [BertForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertForPreTraining) (BERT model)
  - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdForPreTraining) (BigBird model)
  - [BloomConfig](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomConfig) configuration class: [BloomForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomForCausalLM) (BLOOM model)
  - [CTRLConfig](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLConfig) configuration class: [CTRLLMHeadModel](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLLMHeadModel) (CTRL model)
  - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertForMaskedLM) (CamemBERT model)
  - `ColPaliConfig` configuration class: `ColPaliForRetrieval` (ColPali model)
  - `ColQwen2Config` configuration class: `ColQwen2ForRetrieval` (ColQwen2 model)
  - [Data2VecTextConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextForMaskedLM) (Data2VecText model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaForMaskedLM) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2ForMaskedLM) (DeBERTa-v2 model)
  - `DistilBertConfig` configuration class: `DistilBertForMaskedLM` (DistilBERT model)
  - `ElectraConfig` configuration class: `ElectraForPreTraining` (ELECTRA model)
  - `ErnieConfig` configuration class: `ErnieForPreTraining` (ERNIE model)
  - `EvollaConfig` configuration class: `EvollaForProteinText2Text` (Evolla model)
  - `Exaone4Config` configuration class: `Exaone4ForCausalLM` (EXAONE-4.0 model)
  - `FNetConfig` configuration class: `FNetForPreTraining` (FNet model)
  - `FSMTConfig` configuration class: `FSMTForConditionalGeneration` (FairSeq Machine-Translation model)
  - `FalconMambaConfig` configuration class: `FalconMambaForCausalLM` (FalconMamba model)
  - `FlaubertConfig` configuration class: `FlaubertWithLMHeadModel` (FlauBERT model)
  - `FlavaConfig` configuration class: `FlavaForPreTraining` (FLAVA model)
  - `Florence2Config` configuration class: `Florence2ForConditionalGeneration` (Florence2 model)
  - `FunnelConfig` configuration class: `FunnelForPreTraining` (Funnel Transformer model)
  - `GPT2Config` configuration class: `GPT2LMHeadModel` (OpenAI GPT-2 model)
  - `GPTBigCodeConfig` configuration class: `GPTBigCodeForCausalLM` (GPTBigCode model)
  - `GPTSanJapaneseConfig` configuration class: `GPTSanJapaneseForConditionalGeneration` (GPTSAN-japanese model)
  - `Gemma3Config` configuration class: `Gemma3ForConditionalGeneration` (Gemma3ForConditionalGeneration model)
  - `HieraConfig` configuration class: `HieraForPreTraining` (Hiera model)
  - `IBertConfig` configuration class: `IBertForMaskedLM` (I-BERT model)
  - `Idefics2Config` configuration class: `Idefics2ForConditionalGeneration` (Idefics2 model)
  - `Idefics3Config` configuration class: `Idefics3ForConditionalGeneration` (Idefics3 model)
  - `IdeficsConfig` configuration class: `IdeficsForVisionText2Text` (IDEFICS model)
  - `JanusConfig` configuration class: `JanusForConditionalGeneration` (Janus model)
  - `LayoutLMConfig` configuration class: `LayoutLMForMaskedLM` (LayoutLM model)
  - `LlavaConfig` configuration class: `LlavaForConditionalGeneration` (LLaVa model)
  - `LlavaNextConfig` configuration class: `LlavaNextForConditionalGeneration` (LLaVA-NeXT model)
  - `LlavaNextVideoConfig` configuration class: `LlavaNextVideoForConditionalGeneration` (LLaVa-NeXT-Video model)
  - `LlavaOnevisionConfig` configuration class: `LlavaOnevisionForConditionalGeneration` (LLaVA-Onevision model)
  - `LongformerConfig` configuration class: `LongformerForMaskedLM` (Longformer model)
  - `LukeConfig` configuration class: `LukeForMaskedLM` (LUKE model)
  - `LxmertConfig` configuration class: `LxmertForPreTraining` (LXMERT model)
  - `MPNetConfig` configuration class: `MPNetForMaskedLM` (MPNet model)
  - `Mamba2Config` configuration class: `Mamba2ForCausalLM` (mamba2 model)
  - `MambaConfig` configuration class: `MambaForCausalLM` (Mamba model)
  - `MegaConfig` configuration class: `MegaForMaskedLM` (MEGA model)
  - `MegatronBertConfig` configuration class: `MegatronBertForPreTraining` (Megatron-BERT model)
  - `Mistral3Config` configuration class: `Mistral3ForConditionalGeneration` (Mistral3 model)
  - `MllamaConfig` configuration class: `MllamaForConditionalGeneration` (Mllama model)
  - `MobileBertConfig` configuration class: `MobileBertForPreTraining` (MobileBERT model)
  - `MptConfig` configuration class: `MptForCausalLM` (MPT model)
  - `MraConfig` configuration class: `MraForMaskedLM` (MRA model)
  - `MvpConfig` configuration class: `MvpForConditionalGeneration` (MVP model)
  - `NezhaConfig` configuration class: `NezhaForPreTraining` (Nezha model)
  - `NllbMoeConfig` configuration class: `NllbMoeForConditionalGeneration` (NLLB-MOE model)
  - `OpenAIGPTConfig` configuration class: `OpenAIGPTLMHeadModel` (OpenAI GPT model)
  - `PaliGemmaConfig` configuration class: `PaliGemmaForConditionalGeneration` (PaliGemma model)
  - `Qwen2AudioConfig` configuration class: `Qwen2AudioForConditionalGeneration` (Qwen2Audio model)
  - `RetriBertConfig` configuration class: `RetriBertModel` (RetriBERT model)
  - `RoCBertConfig` configuration class: `RoCBertForPreTraining` (RoCBert model)
  - `RobertaConfig` configuration class: `RobertaForMaskedLM` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
  - `RwkvConfig` configuration class: `RwkvForCausalLM` (RWKV model)
  - `SplinterConfig` configuration class: `SplinterForPreTraining` (Splinter model)
  - `SqueezeBertConfig` configuration class: `SqueezeBertForMaskedLM` (SqueezeBERT model)
  - `SwitchTransformersConfig` configuration class: `SwitchTransformersForConditionalGeneration` (SwitchTransformers model)
  - `T5Config` configuration class: `T5ForConditionalGeneration` (T5 model)
  - `T5GemmaConfig` configuration class: `T5GemmaForConditionalGeneration` (T5Gemma model)
  - `TapasConfig` configuration class: `TapasForMaskedLM` (TAPAS model)
  - `TransfoXLConfig` configuration class: `TransfoXLLMHeadModel` (Transformer-XL model)
  - `TvltConfig` configuration class: `TvltForPreTraining` (TVLT model)
  - `UniSpeechConfig` configuration class: `UniSpeechForPreTraining` (UniSpeech model)
  - `UniSpeechSatConfig` configuration class: `UniSpeechSatForPreTraining` (UniSpeechSat model)
  - `ViTMAEConfig` configuration class: `ViTMAEForPreTraining` (ViTMAE model)
  - `VideoLlavaConfig` configuration class: `VideoLlavaForConditionalGeneration` (VideoLlava model)
  - `VideoMAEConfig` configuration class: `VideoMAEForPreTraining` (VideoMAE model)
  - `VipLlavaConfig` configuration class: `VipLlavaForConditionalGeneration` (VipLlava model)
  - `VisualBertConfig` configuration class: `VisualBertForPreTraining` (VisualBERT model)
  - `VoxtralConfig` configuration class: `VoxtralForConditionalGeneration` (Voxtral model)
  - `Wav2Vec2Config` configuration class: `Wav2Vec2ForPreTraining` (Wav2Vec2 model)
  - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerForPreTraining` (Wav2Vec2-Conformer model)
  - `XLMConfig` configuration class: `XLMWithLMHeadModel` (XLM model)
  - `XLMRobertaConfig` configuration class: `XLMRobertaForMaskedLM` (XLM-RoBERTa model)
  - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForMaskedLM` (XLM-RoBERTa-XL model)
  - `XLNetConfig` configuration class: `XLNetLMHeadModel` (XLNet model)
  - `XmodConfig` configuration class: `XmodForMaskedLM` (X-MOD model)
  - `xLSTMConfig` configuration class: `xLSTMForCausalLM` (xLSTM model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a pretraining head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForPreTraining

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForPreTraining.from_config(config)
```
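The same selection mechanism can be exercised without any download by building a configuration locally; this is a minimal sketch, and the tiny layer sizes are illustrative only:

```python
from transformers import AutoConfig, BertConfig, AutoModelForPreTraining

# A small, locally built config (no network access needed).
config = BertConfig(
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    vocab_size=100,
)
# The configuration class alone determines the architecture: BertConfig -> BertForPreTraining.
model = AutoModelForPreTraining.from_config(config)
print(type(model).__name__)  # BertForPreTraining
```

As the note above says, the weights of a model built this way are randomly initialized; use `from_pretrained()` to load trained weights.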

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class; see the configuration-to-model mapping listed above.

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForPreTraining.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [AlbertForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertForPreTraining) (ALBERT model)
- **bart** -- [BartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartForConditionalGeneration) (BART model)
- **bert** -- [BertForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertForPreTraining) (BERT model)
- **big_bird** -- [BigBirdForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdForPreTraining) (BigBird model)
- **bloom** -- [BloomForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomForCausalLM) (BLOOM model)
- **camembert** -- [CamembertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertForMaskedLM) (CamemBERT model)
- **colpali** -- `ColPaliForRetrieval` (ColPali model)
- **colqwen2** -- `ColQwen2ForRetrieval` (ColQwen2 model)
- **ctrl** -- [CTRLLMHeadModel](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLLMHeadModel) (CTRL model)
- **data2vec-text** -- [Data2VecTextForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextForMaskedLM) (Data2VecText model)
- **deberta** -- [DebertaForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaForMaskedLM) (DeBERTa model)
- **deberta-v2** -- [DebertaV2ForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2ForMaskedLM) (DeBERTa-v2 model)
- **distilbert** -- `DistilBertForMaskedLM` (DistilBERT model)
- **electra** -- `ElectraForPreTraining` (ELECTRA model)
- **ernie** -- `ErnieForPreTraining` (ERNIE model)
- **evolla** -- `EvollaForProteinText2Text` (Evolla model)
- **exaone4** -- `Exaone4ForCausalLM` (EXAONE-4.0 model)
- **falcon_mamba** -- `FalconMambaForCausalLM` (FalconMamba model)
- **flaubert** -- `FlaubertWithLMHeadModel` (FlauBERT model)
- **flava** -- `FlavaForPreTraining` (FLAVA model)
- **florence2** -- `Florence2ForConditionalGeneration` (Florence2 model)
- **fnet** -- `FNetForPreTraining` (FNet model)
- **fsmt** -- `FSMTForConditionalGeneration` (FairSeq Machine-Translation model)
- **funnel** -- `FunnelForPreTraining` (Funnel Transformer model)
- **gemma3** -- `Gemma3ForConditionalGeneration` (Gemma3ForConditionalGeneration model)
- **gpt-sw3** -- `GPT2LMHeadModel` (GPT-Sw3 model)
- **gpt2** -- `GPT2LMHeadModel` (OpenAI GPT-2 model)
- **gpt_bigcode** -- `GPTBigCodeForCausalLM` (GPTBigCode model)
- **gptsan-japanese** -- `GPTSanJapaneseForConditionalGeneration` (GPTSAN-japanese model)
- **hiera** -- `HieraForPreTraining` (Hiera model)
- **ibert** -- `IBertForMaskedLM` (I-BERT model)
- **idefics** -- `IdeficsForVisionText2Text` (IDEFICS model)
- **idefics2** -- `Idefics2ForConditionalGeneration` (Idefics2 model)
- **idefics3** -- `Idefics3ForConditionalGeneration` (Idefics3 model)
- **janus** -- `JanusForConditionalGeneration` (Janus model)
- **layoutlm** -- `LayoutLMForMaskedLM` (LayoutLM model)
- **llava** -- `LlavaForConditionalGeneration` (LLaVa model)
- **llava_next** -- `LlavaNextForConditionalGeneration` (LLaVA-NeXT model)
- **llava_next_video** -- `LlavaNextVideoForConditionalGeneration` (LLaVa-NeXT-Video model)
- **llava_onevision** -- `LlavaOnevisionForConditionalGeneration` (LLaVA-Onevision model)
- **longformer** -- `LongformerForMaskedLM` (Longformer model)
- **luke** -- `LukeForMaskedLM` (LUKE model)
- **lxmert** -- `LxmertForPreTraining` (LXMERT model)
- **mamba** -- `MambaForCausalLM` (Mamba model)
- **mamba2** -- `Mamba2ForCausalLM` (mamba2 model)
- **mega** -- `MegaForMaskedLM` (MEGA model)
- **megatron-bert** -- `MegatronBertForPreTraining` (Megatron-BERT model)
- **mistral3** -- `Mistral3ForConditionalGeneration` (Mistral3 model)
- **mllama** -- `MllamaForConditionalGeneration` (Mllama model)
- **mobilebert** -- `MobileBertForPreTraining` (MobileBERT model)
- **mpnet** -- `MPNetForMaskedLM` (MPNet model)
- **mpt** -- `MptForCausalLM` (MPT model)
- **mra** -- `MraForMaskedLM` (MRA model)
- **mvp** -- `MvpForConditionalGeneration` (MVP model)
- **nezha** -- `NezhaForPreTraining` (Nezha model)
- **nllb-moe** -- `NllbMoeForConditionalGeneration` (NLLB-MOE model)
- **openai-gpt** -- `OpenAIGPTLMHeadModel` (OpenAI GPT model)
- **paligemma** -- `PaliGemmaForConditionalGeneration` (PaliGemma model)
- **qwen2_audio** -- `Qwen2AudioForConditionalGeneration` (Qwen2Audio model)
- **retribert** -- `RetriBertModel` (RetriBERT model)
- **roberta** -- `RobertaForMaskedLM` (RoBERTa model)
- **roberta-prelayernorm** -- `RobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
- **roc_bert** -- `RoCBertForPreTraining` (RoCBert model)
- **rwkv** -- `RwkvForCausalLM` (RWKV model)
- **splinter** -- `SplinterForPreTraining` (Splinter model)
- **squeezebert** -- `SqueezeBertForMaskedLM` (SqueezeBERT model)
- **switch_transformers** -- `SwitchTransformersForConditionalGeneration` (SwitchTransformers model)
- **t5** -- `T5ForConditionalGeneration` (T5 model)
- **t5gemma** -- `T5GemmaForConditionalGeneration` (T5Gemma model)
- **tapas** -- `TapasForMaskedLM` (TAPAS model)
- **transfo-xl** -- `TransfoXLLMHeadModel` (Transformer-XL model)
- **tvlt** -- `TvltForPreTraining` (TVLT model)
- **unispeech** -- `UniSpeechForPreTraining` (UniSpeech model)
- **unispeech-sat** -- `UniSpeechSatForPreTraining` (UniSpeechSat model)
- **video_llava** -- `VideoLlavaForConditionalGeneration` (VideoLlava model)
- **videomae** -- `VideoMAEForPreTraining` (VideoMAE model)
- **vipllava** -- `VipLlavaForConditionalGeneration` (VipLlava model)
- **visual_bert** -- `VisualBertForPreTraining` (VisualBERT model)
- **vit_mae** -- `ViTMAEForPreTraining` (ViTMAE model)
- **voxtral** -- `VoxtralForConditionalGeneration` (Voxtral model)
- **wav2vec2** -- `Wav2Vec2ForPreTraining` (Wav2Vec2 model)
- **wav2vec2-conformer** -- `Wav2Vec2ConformerForPreTraining` (Wav2Vec2-Conformer model)
- **xlm** -- `XLMWithLMHeadModel` (XLM model)
- **xlm-roberta** -- `XLMRobertaForMaskedLM` (XLM-RoBERTa model)
- **xlm-roberta-xl** -- `XLMRobertaXLForMaskedLM` (XLM-RoBERTa-XL model)
- **xlnet** -- `XLNetLMHeadModel` (XLNet model)
- **xlstm** -- `xLSTMForCausalLM` (xLSTM model)
- **xmod** -- `XmodForMaskedLM` (X-MOD model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForPreTraining

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForPreTraining.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```
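The evaluation/training toggle mentioned above can be tried without a download by standing in a small, locally built model for a pretrained checkpoint (a sketch; the tiny configuration is illustrative only):

```python
from transformers import BertConfig, AutoModelForPreTraining

# A tiny local model stands in for a pretrained checkpoint here.
config = BertConfig(
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    vocab_size=100,
)
model = AutoModelForPreTraining.from_config(config)

model.eval()               # the state from_pretrained() returns the model in
assert not model.training  # dropout modules are deactivated
model.train()              # switch back before fine-tuning
assert model.training
```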

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
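The second kwargs path (no explicit `config`) can be exercised offline by round-tripping a tiny model through `save_pretrained()`; this is a sketch, and the small configuration is illustrative only:

```python
import tempfile

from transformers import BertConfig, BertForPreTraining, AutoModelForPreTraining

# Save a tiny model locally, then reload it with a config override via kwargs.
config = BertConfig(
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    vocab_size=100,
)
with tempfile.TemporaryDirectory() as tmp:
    BertForPreTraining(config).save_pretrained(tmp)
    # No explicit `config` is passed, so `output_attentions=True` is applied to
    # the automatically loaded configuration before the model is built.
    model = AutoModelForPreTraining.from_pretrained(tmp, output_attentions=True)

assert model.config.output_attentions
```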

### TFAutoModelForPreTraining[[transformers.TFAutoModelForPreTraining]]

#### transformers.TFAutoModelForPreTraining[[transformers.TFAutoModelForPreTraining]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L554)

This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForPreTraining.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [TFAlbertForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertForPreTraining) (ALBERT model)
  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [TFBartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.TFBartForConditionalGeneration) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForPreTraining) (BERT model)
  - [CTRLConfig](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLConfig) configuration class: [TFCTRLLMHeadModel](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.TFCTRLLMHeadModel) (CTRL model)
  - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [TFCamembertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForMaskedLM) (CamemBERT model)
  - `DistilBertConfig` configuration class: `TFDistilBertForMaskedLM` (DistilBERT model)
  - `ElectraConfig` configuration class: `TFElectraForPreTraining` (ELECTRA model)
  - `FlaubertConfig` configuration class: `TFFlaubertWithLMHeadModel` (FlauBERT model)
  - `FunnelConfig` configuration class: `TFFunnelForPreTraining` (Funnel Transformer model)
  - `GPT2Config` configuration class: `TFGPT2LMHeadModel` (OpenAI GPT-2 model)
  - `IdeficsConfig` configuration class: `TFIdeficsForVisionText2Text` (IDEFICS model)
  - `LayoutLMConfig` configuration class: `TFLayoutLMForMaskedLM` (LayoutLM model)
  - `LxmertConfig` configuration class: `TFLxmertForPreTraining` (LXMERT model)
  - `MPNetConfig` configuration class: `TFMPNetForMaskedLM` (MPNet model)
  - `MobileBertConfig` configuration class: `TFMobileBertForPreTraining` (MobileBERT model)
  - `OpenAIGPTConfig` configuration class: `TFOpenAIGPTLMHeadModel` (OpenAI GPT model)
  - `RobertaConfig` configuration class: `TFRobertaForMaskedLM` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
  - `T5Config` configuration class: `TFT5ForConditionalGeneration` (T5 model)
  - `TapasConfig` configuration class: `TFTapasForMaskedLM` (TAPAS model)
  - `TransfoXLConfig` configuration class: `TFTransfoXLLMHeadModel` (Transformer-XL model)
  - `ViTMAEConfig` configuration class: `TFViTMAEForPreTraining` (ViTMAE model)
  - `XLMConfig` configuration class: `TFXLMWithLMHeadModel` (XLM model)
  - `XLMRobertaConfig` configuration class: `TFXLMRobertaForMaskedLM` (XLM-RoBERTa model)
  - `XLNetConfig` configuration class: `TFXLNetLMHeadModel` (XLNet model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a pretraining head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForPreTraining

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForPreTraining.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [TFAlbertForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertForPreTraining) (ALBERT model) - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [TFBartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.TFBartForConditionalGeneration) (BART model) - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForPreTraining) (BERT model) - [CTRLConfig](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLConfig) configuration class: [TFCTRLLMHeadModel](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.TFCTRLLMHeadModel) (CTRL model) - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [TFCamembertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForMaskedLM) (CamemBERT model) - `DistilBertConfig` configuration class: `TFDistilBertForMaskedLM` (DistilBERT model) - `ElectraConfig` configuration class: `TFElectraForPreTraining` (ELECTRA model) - `FlaubertConfig` configuration class: `TFFlaubertWithLMHeadModel` (FlauBERT model) - `FunnelConfig` configuration class: `TFFunnelForPreTraining` (Funnel Transformer model) - `GPT2Config` configuration class: `TFGPT2LMHeadModel` (OpenAI GPT-2 model) - `IdeficsConfig` configuration class: `TFIdeficsForVisionText2Text` (IDEFICS model) - `LayoutLMConfig` configuration class: `TFLayoutLMForMaskedLM` (LayoutLM model) - `LxmertConfig` configuration class: 
`TFLxmertForPreTraining` (LXMERT model) - `MPNetConfig` configuration class: `TFMPNetForMaskedLM` (MPNet model) - `MobileBertConfig` configuration class: `TFMobileBertForPreTraining` (MobileBERT model) - `OpenAIGPTConfig` configuration class: `TFOpenAIGPTLMHeadModel` (OpenAI GPT model) - `RobertaConfig` configuration class: `TFRobertaForMaskedLM` (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model) - `T5Config` configuration class: `TFT5ForConditionalGeneration` (T5 model) - `TapasConfig` configuration class: `TFTapasForMaskedLM` (TAPAS model) - `TransfoXLConfig` configuration class: `TFTransfoXLLMHeadModel` (Transformer-XL model) - `ViTMAEConfig` configuration class: `TFViTMAEForPreTraining` (ViTMAE model) - `XLMConfig` configuration class: `TFXLMWithLMHeadModel` (XLM model) - `XLMRobertaConfig` configuration class: `TFXLMRobertaForMaskedLM` (XLM-RoBERTa model) - `XLNetConfig` configuration class: `TFXLNetLMHeadModel` (XLNet model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
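The default-selection rule in the `attn_implementation` description (SDPA when available on torch>=2.1.1, otherwise `"eager"`) can be sketched as a small version check. This is a hypothetical helper for illustration only, not the library's internal code:

```python
def default_attn_implementation(torch_version: str, sdpa_available: bool) -> str:
    """Pick the default attention implementation, mirroring the rule above:
    SDPA when available on torch >= 2.1.1, otherwise the manual "eager" path."""
    # Compare version tuples; assumes a plain "major.minor.patch" string.
    version = tuple(int(part) for part in torch_version.split(".")[:3])
    if sdpa_available and version >= (2, 1, 1):
        return "sdpa"
    return "eager"


print(default_attn_implementation("2.2.0", sdpa_available=True))   # sdpa
print(default_attn_implementation("2.0.1", sdpa_available=True))   # eager
```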
#### from_pretrained[[transformers.TFAutoModelForPreTraining.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [TFAlbertForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertForPreTraining) (ALBERT model)
- **bart** -- [TFBartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.TFBartForConditionalGeneration) (BART model)
- **bert** -- [TFBertForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForPreTraining) (BERT model)
- **camembert** -- [TFCamembertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForMaskedLM) (CamemBERT model)
- **ctrl** -- [TFCTRLLMHeadModel](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.TFCTRLLMHeadModel) (CTRL model)
- **distilbert** -- `TFDistilBertForMaskedLM` (DistilBERT model)
- **electra** -- `TFElectraForPreTraining` (ELECTRA model)
- **flaubert** -- `TFFlaubertWithLMHeadModel` (FlauBERT model)
- **funnel** -- `TFFunnelForPreTraining` (Funnel Transformer model)
- **gpt-sw3** -- `TFGPT2LMHeadModel` (GPT-Sw3 model)
- **gpt2** -- `TFGPT2LMHeadModel` (OpenAI GPT-2 model)
- **idefics** -- `TFIdeficsForVisionText2Text` (IDEFICS model)
- **layoutlm** -- `TFLayoutLMForMaskedLM` (LayoutLM model)
- **lxmert** -- `TFLxmertForPreTraining` (LXMERT model)
- **mobilebert** -- `TFMobileBertForPreTraining` (MobileBERT model)
- **mpnet** -- `TFMPNetForMaskedLM` (MPNet model)
- **openai-gpt** -- `TFOpenAIGPTLMHeadModel` (OpenAI GPT model)
- **roberta** -- `TFRobertaForMaskedLM` (RoBERTa model)
- **roberta-prelayernorm** -- `TFRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
- **t5** -- `TFT5ForConditionalGeneration` (T5 model)
- **tapas** -- `TFTapasForMaskedLM` (TAPAS model)
- **transfo-xl** -- `TFTransfoXLLMHeadModel` (Transformer-XL model)
- **vit_mae** -- `TFViTMAEForPreTraining` (ViTMAE model)
- **xlm** -- `TFXLMWithLMHeadModel` (XLM model)
- **xlm-roberta** -- `TFXLMRobertaForMaskedLM` (XLM-RoBERTa model)
- **xlnet** -- `TFXLNetLMHeadModel` (XLNet model)
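The `model_type`-based selection listed above is essentially a dictionary lookup keyed by the config's `model_type` string. The following is a trimmed-down, hypothetical illustration (the real mapping lives inside the library's lazy auto mappings, not in code like this):

```python
# Hypothetical subset of the model_type -> class-name mapping used by
# TFAutoModelForPreTraining.from_pretrained (illustration only).
TF_PRETRAINING_MAPPING = {
    "albert": "TFAlbertForPreTraining",
    "bert": "TFBertForPreTraining",
    "gpt2": "TFGPT2LMHeadModel",
    "t5": "TFT5ForConditionalGeneration",
}


def resolve_model_class(model_type: str) -> str:
    """Return the class name registered for a given config model_type."""
    if model_type not in TF_PRETRAINING_MAPPING:
        raise ValueError(f"Unrecognized model type: {model_type}")
    return TF_PRETRAINING_MAPPING[model_type]


print(resolve_model_class("bert"))  # TFBertForPreTraining
```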

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForPreTraining

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForPreTraining.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or URL to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
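The kwargs routing described for the no-`config` case (configuration attributes are overridden, everything else goes to the model's `__init__`) can be sketched as a simple partition. The helper name below is hypothetical; the library performs this split inside `from_pretrained` itself:

```python
def split_kwargs(config_attrs, kwargs):
    """Partition kwargs into configuration overrides and model __init__ kwargs,
    mirroring the behavior described above when no explicit config is passed."""
    config_updates = {k: v for k, v in kwargs.items() if k in config_attrs}
    model_kwargs = {k: v for k, v in kwargs.items() if k not in config_attrs}
    return config_updates, model_kwargs


# Example: output_attentions is a config attribute, from_pt is not.
updates, rest = split_kwargs(
    {"output_attentions", "output_hidden_states"},
    {"output_attentions": True, "from_pt": True},
)
print(updates)  # {'output_attentions': True}
print(rest)     # {'from_pt': True}
```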

### FlaxAutoModelForPreTraining[[transformers.FlaxAutoModelForPreTraining]]

#### transformers.FlaxAutoModelForPreTraining[[transformers.FlaxAutoModelForPreTraining]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L288)

This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForPreTraining.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [FlaxAlbertForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertForPreTraining) (ALBERT model)
  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.FlaxBartForConditionalGeneration) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForPreTraining) (BERT model)
  - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [FlaxBigBirdForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForPreTraining) (BigBird model)
  - `ElectraConfig` configuration class: `FlaxElectraForPreTraining` (ELECTRA model)
  - `LongT5Config` configuration class: `FlaxLongT5ForConditionalGeneration` (LongT5 model)
  - `MBartConfig` configuration class: `FlaxMBartForConditionalGeneration` (mBART model)
  - `MT5Config` configuration class: `FlaxMT5ForConditionalGeneration` (MT5 model)
  - `RoFormerConfig` configuration class: `FlaxRoFormerForMaskedLM` (RoFormer model)
  - `RobertaConfig` configuration class: `FlaxRobertaForMaskedLM` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
  - `T5Config` configuration class: `FlaxT5ForConditionalGeneration` (T5 model)
  - `Wav2Vec2Config` configuration class: `FlaxWav2Vec2ForPreTraining` (Wav2Vec2 model)
  - `WhisperConfig` configuration class: `FlaxWhisperForConditionalGeneration` (Whisper model)
  - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForMaskedLM` (XLM-RoBERTa model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a pretraining head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForPreTraining

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForPreTraining.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [FlaxAlbertForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertForPreTraining) (ALBERT model) - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.FlaxBartForConditionalGeneration) (BART model) - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForPreTraining) (BERT model) - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [FlaxBigBirdForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForPreTraining) (BigBird model) - `ElectraConfig` configuration class: `FlaxElectraForPreTraining` (ELECTRA model) - `LongT5Config` configuration class: `FlaxLongT5ForConditionalGeneration` (LongT5 model) - `MBartConfig` configuration class: `FlaxMBartForConditionalGeneration` (mBART model) - `MT5Config` configuration class: `FlaxMT5ForConditionalGeneration` (MT5 model) - `RoFormerConfig` configuration class: `FlaxRoFormerForMaskedLM` (RoFormer model) - `RobertaConfig` configuration class: `FlaxRobertaForMaskedLM` (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model) - `T5Config` configuration class: `FlaxT5ForConditionalGeneration` (T5 model) - `Wav2Vec2Config` configuration class: `FlaxWav2Vec2ForPreTraining` (Wav2Vec2 model) - `WhisperConfig` configuration class: 
`FlaxWhisperForConditionalGeneration` (Whisper model) - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForMaskedLM` (XLM-RoBERTa model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.FlaxAutoModelForPreTraining.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [FlaxAlbertForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertForPreTraining) (ALBERT model)
- **bart** -- [FlaxBartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.FlaxBartForConditionalGeneration) (BART model)
- **bert** -- [FlaxBertForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForPreTraining) (BERT model)
- **big_bird** -- [FlaxBigBirdForPreTraining](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForPreTraining) (BigBird model)
- **electra** -- `FlaxElectraForPreTraining` (ELECTRA model)
- **longt5** -- `FlaxLongT5ForConditionalGeneration` (LongT5 model)
- **mbart** -- `FlaxMBartForConditionalGeneration` (mBART model)
- **mt5** -- `FlaxMT5ForConditionalGeneration` (MT5 model)
- **roberta** -- `FlaxRobertaForMaskedLM` (RoBERTa model)
- **roberta-prelayernorm** -- `FlaxRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
- **roformer** -- `FlaxRoFormerForMaskedLM` (RoFormer model)
- **t5** -- `FlaxT5ForConditionalGeneration` (T5 model)
- **wav2vec2** -- `FlaxWav2Vec2ForPreTraining` (Wav2Vec2 model)
- **whisper** -- `FlaxWhisperForConditionalGeneration` (Whisper model)
- **xlm-roberta** -- `FlaxXLMRobertaForMaskedLM` (XLM-RoBERTa model)
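When the config has no usable `model_type`, the docstring above says selection falls back to pattern matching on `pretrained_model_name_or_path`. A hedged sketch of that idea (hypothetical helper and key list, not the library's actual code):

```python
# Hypothetical subset of the Flax pretraining model-type keys listed above.
FLAX_PRETRAINING_KEYS = [
    "albert", "bart", "bert", "big_bird", "electra", "t5", "whisper", "xlm-roberta",
]


def guess_model_type(pretrained_model_name_or_path):
    """Return the first known model-type key found in the checkpoint name."""
    name = pretrained_model_name_or_path.lower()
    # Longer keys first so "xlm-roberta" wins over "bert" for XLM-R checkpoints.
    for key in sorted(FLAX_PRETRAINING_KEYS, key=len, reverse=True):
        if key in name:
            return key
    return None


print(guess_model_type("google-bert/bert-base-cased"))  # bert
print(guess_model_type("FacebookAI/xlm-roberta-base"))  # xlm-roberta
```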

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForPreTraining

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForPreTraining.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or URL to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model to a Flax model and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

## Natural Language Processing

以下の自動クラスは、次の自然言語処理タスクに利用可能です。

### AutoModelForCausalLM[[transformers.AutoModelForCausalLM]]

#### transformers.AutoModelForCausalLM[[transformers.AutoModelForCausalLM]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L1962)

This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForCausalLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `ApertusConfig` configuration class: `ApertusForCausalLM` (Apertus model)
  - `ArceeConfig` configuration class: `ArceeForCausalLM` (Arcee model)
  - `AriaTextConfig` configuration class: `AriaTextForCausalLM` (AriaText model)
  - `BambaConfig` configuration class: `BambaForCausalLM` (Bamba model)
  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [BartForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartForCausalLM) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [BertLMHeadModel](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertLMHeadModel) (BERT model)
  - [BertGenerationConfig](/docs/transformers/v4.57.1/ja/model_doc/bert-generation#transformers.BertGenerationConfig) configuration class: [BertGenerationDecoder](/docs/transformers/v4.57.1/ja/model_doc/bert-generation#transformers.BertGenerationDecoder) (Bert Generation model)
  - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdForCausalLM) (BigBird model)
  - [BigBirdPegasusConfig](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusConfig) configuration class: [BigBirdPegasusForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForCausalLM) (BigBird-Pegasus model)
  - [BioGptConfig](/docs/transformers/v4.57.1/ja/model_doc/biogpt#transformers.BioGptConfig) configuration class: [BioGptForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/biogpt#transformers.BioGptForCausalLM) (BioGpt model)
  - `BitNetConfig` configuration class: `BitNetForCausalLM` (BitNet model)
  - [BlenderbotConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotConfig) configuration class: [BlenderbotForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotForCausalLM) (Blenderbot model)
  - [BlenderbotSmallConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallConfig) configuration class: [BlenderbotSmallForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallForCausalLM) (BlenderbotSmall model)
  - [BloomConfig](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomConfig) configuration class: [BloomForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomForCausalLM) (BLOOM model)
  - `BltConfig` configuration class: `BltForCausalLM` (Blt model)
  - [CTRLConfig](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLConfig) configuration class: [CTRLLMHeadModel](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLLMHeadModel) (CTRL model)
  - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertForCausalLM) (CamemBERT model)
  - [CodeGenConfig](/docs/transformers/v4.57.1/ja/model_doc/codegen#transformers.CodeGenConfig) configuration class: [CodeGenForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/codegen#transformers.CodeGenForCausalLM) (CodeGen model)
  - `Cohere2Config` configuration class: `Cohere2ForCausalLM` (Cohere2 model)
  - `CohereConfig` configuration class: `CohereForCausalLM` (Cohere model)
  - [CpmAntConfig](/docs/transformers/v4.57.1/ja/model_doc/cpmant#transformers.CpmAntConfig) configuration class: [CpmAntForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/cpmant#transformers.CpmAntForCausalLM) (CPM-Ant model)
  - [Data2VecTextConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextForCausalLM) (Data2VecText model)
  - `DbrxConfig` configuration class: `DbrxForCausalLM` (DBRX model)
  - `DeepseekV2Config` configuration class: `DeepseekV2ForCausalLM` (DeepSeek-V2 model)
  - `DeepseekV3Config` configuration class: `DeepseekV3ForCausalLM` (DeepSeek-V3 model)
  - `DiffLlamaConfig` configuration class: `DiffLlamaForCausalLM` (DiffLlama model)
  - `DogeConfig` configuration class: `DogeForCausalLM` (Doge model)
  - `Dots1Config` configuration class: `Dots1ForCausalLM` (dots1 model)
  - `ElectraConfig` configuration class: `ElectraForCausalLM` (ELECTRA model)
  - `Emu3Config` configuration class: `Emu3ForCausalLM` (Emu3 model)
  - `Ernie4_5Config` configuration class: `Ernie4_5ForCausalLM` (Ernie4_5 model)
  - `Ernie4_5_MoeConfig` configuration class: `Ernie4_5_MoeForCausalLM` (Ernie4_5_MoE model)
  - `ErnieConfig` configuration class: `ErnieForCausalLM` (ERNIE model)
  - `Exaone4Config` configuration class: `Exaone4ForCausalLM` (EXAONE-4.0 model)
  - `FalconConfig` configuration class: `FalconForCausalLM` (Falcon model)
  - `FalconH1Config` configuration class: `FalconH1ForCausalLM` (FalconH1 model)
  - `FalconMambaConfig` configuration class: `FalconMambaForCausalLM` (FalconMamba model)
  - `FlexOlmoConfig` configuration class: `FlexOlmoForCausalLM` (FlexOlmo model)
  - `FuyuConfig` configuration class: `FuyuForCausalLM` (Fuyu model)
  - `GPT2Config` configuration class: `GPT2LMHeadModel` (OpenAI GPT-2 model)
  - `GPTBigCodeConfig` configuration class: `GPTBigCodeForCausalLM` (GPTBigCode model)
  - `GPTJConfig` configuration class: `GPTJForCausalLM` (GPT-J model)
  - `GPTNeoConfig` configuration class: `GPTNeoForCausalLM` (GPT Neo model)
  - `GPTNeoXConfig` configuration class: `GPTNeoXForCausalLM` (GPT NeoX model)
  - `GPTNeoXJapaneseConfig` configuration class: `GPTNeoXJapaneseForCausalLM` (GPT NeoX Japanese model)
  - `Gemma2Config` configuration class: `Gemma2ForCausalLM` (Gemma2 model)
  - `Gemma3Config` configuration class: `Gemma3ForConditionalGeneration` (Gemma3ForConditionalGeneration model)
  - `Gemma3TextConfig` configuration class: `Gemma3ForCausalLM` (Gemma3ForCausalLM model)
  - `Gemma3nConfig` configuration class: `Gemma3nForConditionalGeneration` (Gemma3nForConditionalGeneration model)
  - `Gemma3nTextConfig` configuration class: `Gemma3nForCausalLM` (Gemma3nForCausalLM model)
  - `GemmaConfig` configuration class: `GemmaForCausalLM` (Gemma model)
  - `GitConfig` configuration class: `GitForCausalLM` (GIT model)
  - `Glm4Config` configuration class: `Glm4ForCausalLM` (GLM4 model)
  - `Glm4MoeConfig` configuration class: `Glm4MoeForCausalLM` (Glm4MoE model)
  - `GlmConfig` configuration class: `GlmForCausalLM` (GLM model)
  - `GotOcr2Config` configuration class: `GotOcr2ForConditionalGeneration` (GOT-OCR2 model)
  - `GptOssConfig` configuration class: `GptOssForCausalLM` (GptOss model)
  - `GraniteConfig` configuration class: `GraniteForCausalLM` (Granite model)
  - `GraniteMoeConfig` configuration class: `GraniteMoeForCausalLM` (GraniteMoe model)
  - `GraniteMoeHybridConfig` configuration class: `GraniteMoeHybridForCausalLM` (GraniteMoeHybrid model)
  - `GraniteMoeSharedConfig` configuration class: `GraniteMoeSharedForCausalLM` (GraniteMoeShared model)
  - `HeliumConfig` configuration class: `HeliumForCausalLM` (Helium model)
  - `HunYuanDenseV1Config` configuration class: `HunYuanDenseV1ForCausalLM` (HunYuanDenseV1 model)
  - `HunYuanMoEV1Config` configuration class: `HunYuanMoEV1ForCausalLM` (HunYuanMoeV1 model)
  - `JambaConfig` configuration class: `JambaForCausalLM` (Jamba model)
  - `JetMoeConfig` configuration class: `JetMoeForCausalLM` (JetMoe model)
  - `Lfm2Config` configuration class: `Lfm2ForCausalLM` (Lfm2 model)
  - `Llama4Config` configuration class: `Llama4ForCausalLM` (Llama4 model)
  - `Llama4TextConfig` configuration class: `Llama4ForCausalLM` (Llama4ForCausalLM model)
  - `LlamaConfig` configuration class: `LlamaForCausalLM` (LLaMA model)
  - `LongcatFlashConfig` configuration class: `LongcatFlashForCausalLM` (LongCatFlash model)
  - `MBartConfig` configuration class: `MBartForCausalLM` (mBART model)
  - `Mamba2Config` configuration class: `Mamba2ForCausalLM` (mamba2 model)
  - `MambaConfig` configuration class: `MambaForCausalLM` (Mamba model)
  - `MarianConfig` configuration class: `MarianForCausalLM` (Marian model)
  - `MegaConfig` configuration class: `MegaForCausalLM` (MEGA model)
  - `MegatronBertConfig` configuration class: `MegatronBertForCausalLM` (Megatron-BERT model)
  - `MiniMaxConfig` configuration class: `MiniMaxForCausalLM` (MiniMax model)
  - `MinistralConfig` configuration class: `MinistralForCausalLM` (Ministral model)
  - `MistralConfig` configuration class: `MistralForCausalLM` (Mistral model)
  - `MixtralConfig` configuration class: `MixtralForCausalLM` (Mixtral model)
  - `MllamaConfig` configuration class: `MllamaForCausalLM` (Mllama model)
  - `ModernBertDecoderConfig` configuration class: `ModernBertDecoderForCausalLM` (ModernBertDecoder model)
  - `MoshiConfig` configuration class: `MoshiForCausalLM` (Moshi model)
  - `MptConfig` configuration class: `MptForCausalLM` (MPT model)
  - `MusicgenConfig` configuration class: `MusicgenForCausalLM` (MusicGen model)
  - `MusicgenMelodyConfig` configuration class: `MusicgenMelodyForCausalLM` (MusicGen Melody model)
  - `MvpConfig` configuration class: `MvpForCausalLM` (MVP model)
  - `NemotronConfig` configuration class: `NemotronForCausalLM` (Nemotron model)
  - `OPTConfig` configuration class: `OPTForCausalLM` (OPT model)
  - `Olmo2Config` configuration class: `Olmo2ForCausalLM` (OLMo2 model)
  - `Olmo3Config` configuration class: `Olmo3ForCausalLM` (Olmo3 model)
  - `OlmoConfig` configuration class: `OlmoForCausalLM` (OLMo model)
  - `OlmoeConfig` configuration class: `OlmoeForCausalLM` (OLMoE model)
  - `OpenAIGPTConfig` configuration class: `OpenAIGPTLMHeadModel` (OpenAI GPT model)
  - `OpenLlamaConfig` configuration class: `OpenLlamaForCausalLM` (OpenLlama model)
  - `PLBartConfig` configuration class: `PLBartForCausalLM` (PLBart model)
  - `PegasusConfig` configuration class: `PegasusForCausalLM` (Pegasus model)
  - `PersimmonConfig` configuration class: `PersimmonForCausalLM` (Persimmon model)
  - `Phi3Config` configuration class: `Phi3ForCausalLM` (Phi3 model)
  - `Phi4MultimodalConfig` configuration class: `Phi4MultimodalForCausalLM` (Phi4Multimodal model)
  - `PhiConfig` configuration class: `PhiForCausalLM` (Phi model)
  - `PhimoeConfig` configuration class: `PhimoeForCausalLM` (Phimoe model)
  - `ProphetNetConfig` configuration class: `ProphetNetForCausalLM` (ProphetNet model)
  - `QDQBertConfig` configuration class: `QDQBertLMHeadModel` (QDQBert model)
  - `Qwen2Config` configuration class: `Qwen2ForCausalLM` (Qwen2 model)
  - `Qwen2MoeConfig` configuration class: `Qwen2MoeForCausalLM` (Qwen2MoE model)
  - `Qwen3Config` configuration class: `Qwen3ForCausalLM` (Qwen3 model)
  - `Qwen3MoeConfig` configuration class: `Qwen3MoeForCausalLM` (Qwen3MoE model)
  - `Qwen3NextConfig` configuration class: `Qwen3NextForCausalLM` (Qwen3Next model)
  - `RecurrentGemmaConfig` configuration class: `RecurrentGemmaForCausalLM` (RecurrentGemma model)
  - `ReformerConfig` configuration class: `ReformerModelWithLMHead` (Reformer model)
  - `RemBertConfig` configuration class: `RemBertForCausalLM` (RemBERT model)
  - `RoCBertConfig` configuration class: `RoCBertForCausalLM` (RoCBert model)
  - `RoFormerConfig` configuration class: `RoFormerForCausalLM` (RoFormer model)
  - `RobertaConfig` configuration class: `RobertaForCausalLM` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForCausalLM` (RoBERTa-PreLayerNorm model)
  - `RwkvConfig` configuration class: `RwkvForCausalLM` (RWKV model)
  - `SeedOssConfig` configuration class: `SeedOssForCausalLM` (SeedOss model)
  - `SmolLM3Config` configuration class: `SmolLM3ForCausalLM` (SmolLM3 model)
  - `Speech2Text2Config` configuration class: `Speech2Text2ForCausalLM` (Speech2Text2 model)
  - `StableLmConfig` configuration class: `StableLmForCausalLM` (StableLm model)
  - `Starcoder2Config` configuration class: `Starcoder2ForCausalLM` (Starcoder2 model)
  - `TrOCRConfig` configuration class: `TrOCRForCausalLM` (TrOCR model)
  - `TransfoXLConfig` configuration class: `TransfoXLLMHeadModel` (Transformer-XL model)
  - `VaultGemmaConfig` configuration class: `VaultGemmaForCausalLM` (VaultGemma model)
  - `WhisperConfig` configuration class: `WhisperForCausalLM` (Whisper model)
  - `XGLMConfig` configuration class: `XGLMForCausalLM` (XGLM model)
  - `XLMConfig` configuration class: `XLMWithLMHeadModel` (XLM model)
  - `XLMProphetNetConfig` configuration class: `XLMProphetNetForCausalLM` (XLM-ProphetNet model)
  - `XLMRobertaConfig` configuration class: `XLMRobertaForCausalLM` (XLM-RoBERTa model)
  - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForCausalLM` (XLM-RoBERTa-XL model)
  - `XLNetConfig` configuration class: `XLNetLMHeadModel` (XLNet model)
  - `XmodConfig` configuration class: `XmodForCausalLM` (X-MOD model)
  - `Zamba2Config` configuration class: `Zamba2ForCausalLM` (Zamba2 model)
  - `ZambaConfig` configuration class: `ZambaForCausalLM` (Zamba model)
  - `xLSTMConfig` configuration class: `xLSTMForCausalLM` (xLSTM model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForCausalLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForCausalLM.from_config(config)
```

#### from_pretrained[[transformers.AutoModelForCausalLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a causal language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **apertus** -- `ApertusForCausalLM` (Apertus model)
- **arcee** -- `ArceeForCausalLM` (Arcee model)
- **aria_text** -- `AriaTextForCausalLM` (AriaText model)
- **bamba** -- `BambaForCausalLM` (Bamba model)
- **bart** -- [BartForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartForCausalLM) (BART model)
- **bert** -- [BertLMHeadModel](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertLMHeadModel) (BERT model)
- **bert-generation** -- [BertGenerationDecoder](/docs/transformers/v4.57.1/ja/model_doc/bert-generation#transformers.BertGenerationDecoder) (Bert Generation model)
- **big_bird** -- [BigBirdForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdForCausalLM) (BigBird model)
- **bigbird_pegasus** -- [BigBirdPegasusForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForCausalLM) (BigBird-Pegasus model)
- **biogpt** -- [BioGptForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/biogpt#transformers.BioGptForCausalLM) (BioGpt model)
- **bitnet** -- `BitNetForCausalLM` (BitNet model)
- **blenderbot** -- [BlenderbotForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotForCausalLM) (Blenderbot model)
- **blenderbot-small** -- [BlenderbotSmallForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallForCausalLM) (BlenderbotSmall model)
- **bloom** -- [BloomForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomForCausalLM) (BLOOM model)
- **blt** -- `BltForCausalLM` (Blt model)
- **camembert** -- [CamembertForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertForCausalLM) (CamemBERT model)
- **code_llama** -- `LlamaForCausalLM` (CodeLlama model)
- **codegen** -- [CodeGenForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/codegen#transformers.CodeGenForCausalLM) (CodeGen model)
- **cohere** -- `CohereForCausalLM` (Cohere model)
- **cohere2** -- `Cohere2ForCausalLM` (Cohere2 model)
- **cpmant** -- [CpmAntForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/cpmant#transformers.CpmAntForCausalLM) (CPM-Ant model)
- **ctrl** -- [CTRLLMHeadModel](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLLMHeadModel) (CTRL model)
- **data2vec-text** -- [Data2VecTextForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextForCausalLM) (Data2VecText model)
- **dbrx** -- `DbrxForCausalLM` (DBRX model)
- **deepseek_v2** -- `DeepseekV2ForCausalLM` (DeepSeek-V2 model)
- **deepseek_v3** -- `DeepseekV3ForCausalLM` (DeepSeek-V3 model)
- **diffllama** -- `DiffLlamaForCausalLM` (DiffLlama model)
- **doge** -- `DogeForCausalLM` (Doge model)
- **dots1** -- `Dots1ForCausalLM` (dots1 model)
- **electra** -- `ElectraForCausalLM` (ELECTRA model)
- **emu3** -- `Emu3ForCausalLM` (Emu3 model)
- **ernie** -- `ErnieForCausalLM` (ERNIE model)
- **ernie4_5** -- `Ernie4_5ForCausalLM` (Ernie4_5 model)
- **ernie4_5_moe** -- `Ernie4_5_MoeForCausalLM` (Ernie4_5_MoE model)
- **exaone4** -- `Exaone4ForCausalLM` (EXAONE-4.0 model)
- **falcon** -- `FalconForCausalLM` (Falcon model)
- **falcon_h1** -- `FalconH1ForCausalLM` (FalconH1 model)
- **falcon_mamba** -- `FalconMambaForCausalLM` (FalconMamba model)
- **flex_olmo** -- `FlexOlmoForCausalLM` (FlexOlmo model)
- **fuyu** -- `FuyuForCausalLM` (Fuyu model)
- **gemma** -- `GemmaForCausalLM` (Gemma model)
- **gemma2** -- `Gemma2ForCausalLM` (Gemma2 model)
- **gemma3** -- `Gemma3ForConditionalGeneration` (Gemma3ForConditionalGeneration model)
- **gemma3_text** -- `Gemma3ForCausalLM` (Gemma3ForCausalLM model)
- **gemma3n** -- `Gemma3nForConditionalGeneration` (Gemma3nForConditionalGeneration model)
- **gemma3n_text** -- `Gemma3nForCausalLM` (Gemma3nForCausalLM model)
- **git** -- `GitForCausalLM` (GIT model)
- **glm** -- `GlmForCausalLM` (GLM model)
- **glm4** -- `Glm4ForCausalLM` (GLM4 model)
- **glm4_moe** -- `Glm4MoeForCausalLM` (Glm4MoE model)
- **got_ocr2** -- `GotOcr2ForConditionalGeneration` (GOT-OCR2 model)
- **gpt-sw3** -- `GPT2LMHeadModel` (GPT-Sw3 model)
- **gpt2** -- `GPT2LMHeadModel` (OpenAI GPT-2 model)
- **gpt_bigcode** -- `GPTBigCodeForCausalLM` (GPTBigCode model)
- **gpt_neo** -- `GPTNeoForCausalLM` (GPT Neo model)
- **gpt_neox** -- `GPTNeoXForCausalLM` (GPT NeoX model)
- **gpt_neox_japanese** -- `GPTNeoXJapaneseForCausalLM` (GPT NeoX Japanese model)
- **gpt_oss** -- `GptOssForCausalLM` (GptOss model)
- **gptj** -- `GPTJForCausalLM` (GPT-J model)
- **granite** -- `GraniteForCausalLM` (Granite model)
- **granitemoe** -- `GraniteMoeForCausalLM` (GraniteMoeMoe model)
- **granitemoehybrid** -- `GraniteMoeHybridForCausalLM` (GraniteMoeHybrid model)
- **granitemoeshared** -- `GraniteMoeSharedForCausalLM` (GraniteMoeSharedMoe model)
- **helium** -- `HeliumForCausalLM` (Helium model)
- **hunyuan_v1_dense** -- `HunYuanDenseV1ForCausalLM` (HunYuanDenseV1 model)
- **hunyuan_v1_moe** -- `HunYuanMoEV1ForCausalLM` (HunYuanMoeV1 model)
- **jamba** -- `JambaForCausalLM` (Jamba model)
- **jetmoe** -- `JetMoeForCausalLM` (JetMoe model)
- **lfm2** -- `Lfm2ForCausalLM` (Lfm2 model)
- **llama** -- `LlamaForCausalLM` (LLaMA model)
- **llama4** -- `Llama4ForCausalLM` (Llama4 model)
- **llama4_text** -- `Llama4ForCausalLM` (Llama4ForCausalLM model)
- **longcat_flash** -- `LongcatFlashForCausalLM` (LongCatFlash model)
- **mamba** -- `MambaForCausalLM` (Mamba model)
- **mamba2** -- `Mamba2ForCausalLM` (mamba2 model)
- **marian** -- `MarianForCausalLM` (Marian model)
- **mbart** -- `MBartForCausalLM` (mBART model)
- **mega** -- `MegaForCausalLM` (MEGA model)
- **megatron-bert** -- `MegatronBertForCausalLM` (Megatron-BERT model)
- **minimax** -- `MiniMaxForCausalLM` (MiniMax model)
- **ministral** -- `MinistralForCausalLM` (Ministral model)
- **mistral** -- `MistralForCausalLM` (Mistral model)
- **mixtral** -- `MixtralForCausalLM` (Mixtral model)
- **mllama** -- `MllamaForCausalLM` (Mllama model)
- **modernbert-decoder** -- `ModernBertDecoderForCausalLM` (ModernBertDecoder model)
- **moshi** -- `MoshiForCausalLM` (Moshi model)
- **mpt** -- `MptForCausalLM` (MPT model)
- **musicgen** -- `MusicgenForCausalLM` (MusicGen model)
- **musicgen_melody** -- `MusicgenMelodyForCausalLM` (MusicGen Melody model)
- **mvp** -- `MvpForCausalLM` (MVP model)
- **nemotron** -- `NemotronForCausalLM` (Nemotron model)
- **olmo** -- `OlmoForCausalLM` (OLMo model)
- **olmo2** -- `Olmo2ForCausalLM` (OLMo2 model)
- **olmo3** -- `Olmo3ForCausalLM` (Olmo3 model)
- **olmoe** -- `OlmoeForCausalLM` (OLMoE model)
- **open-llama** -- `OpenLlamaForCausalLM` (OpenLlama model)
- **openai-gpt** -- `OpenAIGPTLMHeadModel` (OpenAI GPT model)
- **opt** -- `OPTForCausalLM` (OPT model)
- **pegasus** -- `PegasusForCausalLM` (Pegasus model)
- **persimmon** -- `PersimmonForCausalLM` (Persimmon model)
- **phi** -- `PhiForCausalLM` (Phi model)
- **phi3** -- `Phi3ForCausalLM` (Phi3 model)
- **phi4_multimodal** -- `Phi4MultimodalForCausalLM` (Phi4Multimodal model)
- **phimoe** -- `PhimoeForCausalLM` (Phimoe model)
- **plbart** -- `PLBartForCausalLM` (PLBart model)
- **prophetnet** -- `ProphetNetForCausalLM` (ProphetNet model)
- **qdqbert** -- `QDQBertLMHeadModel` (QDQBert model)
- **qwen2** -- `Qwen2ForCausalLM` (Qwen2 model)
- **qwen2_moe** -- `Qwen2MoeForCausalLM` (Qwen2MoE model)
- **qwen3** -- `Qwen3ForCausalLM` (Qwen3 model)
- **qwen3_moe** -- `Qwen3MoeForCausalLM` (Qwen3MoE model)
- **qwen3_next** -- `Qwen3NextForCausalLM` (Qwen3Next model)
- **recurrent_gemma** -- `RecurrentGemmaForCausalLM` (RecurrentGemma model)
- **reformer** -- `ReformerModelWithLMHead` (Reformer model)
- **rembert** -- `RemBertForCausalLM` (RemBERT model)
- **roberta** -- `RobertaForCausalLM` (RoBERTa model)
- **roberta-prelayernorm** -- `RobertaPreLayerNormForCausalLM` (RoBERTa-PreLayerNorm model)
- **roc_bert** -- `RoCBertForCausalLM` (RoCBert model)
- **roformer** -- `RoFormerForCausalLM` (RoFormer model)
- **rwkv** -- `RwkvForCausalLM` (RWKV model)
- **seed_oss** -- `SeedOssForCausalLM` (SeedOss model)
- **smollm3** -- `SmolLM3ForCausalLM` (SmolLM3 model)
- **speech_to_text_2** -- `Speech2Text2ForCausalLM` (Speech2Text2 model)
- **stablelm** -- `StableLmForCausalLM` (StableLm model)
- **starcoder2** -- `Starcoder2ForCausalLM` (Starcoder2 model)
- **transfo-xl** -- `TransfoXLLMHeadModel` (Transformer-XL model)
- **trocr** -- `TrOCRForCausalLM` (TrOCR model)
- **vaultgemma** -- `VaultGemmaForCausalLM` (VaultGemma model)
- **whisper** -- `WhisperForCausalLM` (Whisper model)
- **xglm** -- `XGLMForCausalLM` (XGLM model)
- **xlm** -- `XLMWithLMHeadModel` (XLM model)
- **xlm-prophetnet** -- `XLMProphetNetForCausalLM` (XLM-ProphetNet model)
- **xlm-roberta** -- `XLMRobertaForCausalLM` (XLM-RoBERTa model)
- **xlm-roberta-xl** -- `XLMRobertaXLForCausalLM` (XLM-RoBERTa-XL model)
- **xlnet** -- `XLNetLMHeadModel` (XLNet model)
- **xlstm** -- `xLSTMForCausalLM` (xLSTM model)
- **xmod** -- `XmodForCausalLM` (X-MOD model)
- **zamba** -- `ZambaForCausalLM` (Zamba model)
- **zamba2** -- `Zamba2ForCausalLM` (Zamba2 model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForCausalLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForCausalLM.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```
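The evaluation/training toggle noted above is the standard `torch.nn.Module` flag. As a minimal sketch of the behavior — using a hypothetical stand-in class rather than a real `PreTrainedModel`, so no weights need downloading:

```python
# Minimal stand-in for the train/eval mode flag that torch.nn.Module
# (and therefore every PreTrainedModel) carries; from_pretrained()
# calls .eval() on the loaded model for you.
class TinyModule:
    def __init__(self):
        self.training = True   # torch modules start in training mode

    def eval(self):
        self.training = False  # dropout etc. would be deactivated here
        return self

    def train(self, mode: bool = True):
        self.training = mode   # switch back before fine-tuning
        return self

model = TinyModule().eval()    # what from_pretrained() does by default
assert model.training is False

model.train()                  # required before training the model
assert model.training is True
```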

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or URL to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
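The `kwargs` splitting described above can be sketched with a hypothetical helper (not the library's actual implementation): keys that match a configuration attribute override the config, and the remainder is forwarded to the model's `__init__`:

```python
# Hypothetical sketch of how from_pretrained() splits **kwargs when no
# explicit `config` is passed: configuration attributes are overridden,
# everything else goes to the model's __init__().
def split_kwargs(kwargs, config_attrs):
    config = dict(config_attrs)
    model_kwargs = {}
    for key, value in kwargs.items():
        if key in config:
            config[key] = value        # override a config attribute
        else:
            model_kwargs[key] = value  # forwarded to the model
    return config, model_kwargs

defaults = {"output_attentions": False, "hidden_size": 768}
config, model_kwargs = split_kwargs(
    {"output_attentions": True, "my_custom_arg": 1}, defaults
)
assert config == {"output_attentions": True, "hidden_size": 768}
assert model_kwargs == {"my_custom_arg": 1}
```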

### TFAutoModelForCausalLM[[transformers.TFAutoModelForCausalLM]]

#### transformers.TFAutoModelForCausalLM[[transformers.TFAutoModelForCausalLM]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L569)

This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForCausalLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [TFBertLMHeadModel](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertLMHeadModel) (BERT model)
  - [CTRLConfig](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLConfig) configuration class: [TFCTRLLMHeadModel](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.TFCTRLLMHeadModel) (CTRL model)
  - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [TFCamembertForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForCausalLM) (CamemBERT model)
  - `GPT2Config` configuration class: `TFGPT2LMHeadModel` (OpenAI GPT-2 model)
  - `GPTJConfig` configuration class: `TFGPTJForCausalLM` (GPT-J model)
  - `MistralConfig` configuration class: `TFMistralForCausalLM` (Mistral model)
  - `OPTConfig` configuration class: `TFOPTForCausalLM` (OPT model)
  - `OpenAIGPTConfig` configuration class: `TFOpenAIGPTLMHeadModel` (OpenAI GPT model)
  - `RemBertConfig` configuration class: `TFRemBertForCausalLM` (RemBERT model)
  - `RoFormerConfig` configuration class: `TFRoFormerForCausalLM` (RoFormer model)
  - `RobertaConfig` configuration class: `TFRobertaForCausalLM` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForCausalLM` (RoBERTa-PreLayerNorm model)
  - `TransfoXLConfig` configuration class: `TFTransfoXLLMHeadModel` (Transformer-XL model)
  - `XGLMConfig` configuration class: `TFXGLMForCausalLM` (XGLM model)
  - `XLMConfig` configuration class: `TFXLMWithLMHeadModel` (XLM model)
  - `XLMRobertaConfig` configuration class: `TFXLMRobertaForCausalLM` (XLM-RoBERTa model)
  - `XLNetConfig` configuration class: `TFXLNetLMHeadModel` (XLNet model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForCausalLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForCausalLM.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [TFBertLMHeadModel](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertLMHeadModel) (BERT model) - [CTRLConfig](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLConfig) configuration class: [TFCTRLLMHeadModel](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.TFCTRLLMHeadModel) (CTRL model) - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [TFCamembertForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForCausalLM) (CamemBERT model) - `GPT2Config` configuration class: `TFGPT2LMHeadModel` (OpenAI GPT-2 model) - `GPTJConfig` configuration class: `TFGPTJForCausalLM` (GPT-J model) - `MistralConfig` configuration class: `TFMistralForCausalLM` (Mistral model) - `OPTConfig` configuration class: `TFOPTForCausalLM` (OPT model) - `OpenAIGPTConfig` configuration class: `TFOpenAIGPTLMHeadModel` (OpenAI GPT model) - `RemBertConfig` configuration class: `TFRemBertForCausalLM` (RemBERT model) - `RoFormerConfig` configuration class: `TFRoFormerForCausalLM` (RoFormer model) - `RobertaConfig` configuration class: `TFRobertaForCausalLM` (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForCausalLM` (RoBERTa-PreLayerNorm model) - `TransfoXLConfig` configuration class: `TFTransfoXLLMHeadModel` (Transformer-XL model) - `XGLMConfig` configuration class: `TFXGLMForCausalLM` (XGLM model) - `XLMConfig` configuration class: `TFXLMWithLMHeadModel` (XLM model) - `XLMRobertaConfig` configuration class: `TFXLMRobertaForCausalLM` (XLM-RoBERTa model) - `XLNetConfig` configuration class: `TFXLNetLMHeadModel` (XLNet model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

#### from_pretrained[[transformers.TFAutoModelForCausalLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a causal language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **bert** -- [TFBertLMHeadModel](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertLMHeadModel) (BERT model)
- **camembert** -- [TFCamembertForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForCausalLM) (CamemBERT model)
- **ctrl** -- [TFCTRLLMHeadModel](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.TFCTRLLMHeadModel) (CTRL model)
- **gpt-sw3** -- `TFGPT2LMHeadModel` (GPT-Sw3 model)
- **gpt2** -- `TFGPT2LMHeadModel` (OpenAI GPT-2 model)
- **gptj** -- `TFGPTJForCausalLM` (GPT-J model)
- **mistral** -- `TFMistralForCausalLM` (Mistral model)
- **openai-gpt** -- `TFOpenAIGPTLMHeadModel` (OpenAI GPT model)
- **opt** -- `TFOPTForCausalLM` (OPT model)
- **rembert** -- `TFRemBertForCausalLM` (RemBERT model)
- **roberta** -- `TFRobertaForCausalLM` (RoBERTa model)
- **roberta-prelayernorm** -- `TFRobertaPreLayerNormForCausalLM` (RoBERTa-PreLayerNorm model)
- **roformer** -- `TFRoFormerForCausalLM` (RoFormer model)
- **transfo-xl** -- `TFTransfoXLLMHeadModel` (Transformer-XL model)
- **xglm** -- `TFXGLMForCausalLM` (XGLM model)
- **xlm** -- `TFXLMWithLMHeadModel` (XLM model)
- **xlm-roberta** -- `TFXLMRobertaForCausalLM` (XLM-RoBERTa model)
- **xlnet** -- `TFXLNetLMHeadModel` (XLNet model)
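The two-step selection above (the config's `model_type` first, then pattern matching on the name/path as a fallback) can be sketched with a toy mapping; the entries below are a small hypothetical subset of the real table:

```python
# Toy subset of the model_type -> class table used by the auto classes.
MODEL_MAPPING = {
    "gpt2": "TFGPT2LMHeadModel",
    "bert": "TFBertLMHeadModel",
    "xlm-roberta": "TFXLMRobertaForCausalLM",
}

def resolve(name_or_path: str) -> str:
    # 1) Exact model_type match (normally read from the config object).
    if name_or_path in MODEL_MAPPING:
        return MODEL_MAPPING[name_or_path]
    # 2) Fallback: pattern matching on the pretrained name/path,
    #    longest keys first so more specific names win.
    for key in sorted(MODEL_MAPPING, key=len, reverse=True):
        if key in name_or_path:
            return MODEL_MAPPING[key]
    raise ValueError(f"Unrecognized model: {name_or_path!r}")

assert resolve("gpt2") == "TFGPT2LMHeadModel"
assert resolve("google-bert/bert-base-cased") == "TFBertLMHeadModel"
assert resolve("FacebookAI/xlm-roberta-base") == "TFXLMRobertaForCausalLM"
```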

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForCausalLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForCausalLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or URL to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### FlaxAutoModelForCausalLM[[transformers.FlaxAutoModelForCausalLM]]

#### transformers.FlaxAutoModelForCausalLM[[transformers.FlaxAutoModelForCausalLM]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L295)

This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForCausalLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.FlaxBartForCausalLM) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForCausalLM) (BERT model)
  - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [FlaxBigBirdForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForCausalLM) (BigBird model)
  - [BloomConfig](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomConfig) configuration class: [FlaxBloomForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.FlaxBloomForCausalLM) (BLOOM model)
  - `ElectraConfig` configuration class: `FlaxElectraForCausalLM` (ELECTRA model)
  - `GPT2Config` configuration class: `FlaxGPT2LMHeadModel` (OpenAI GPT-2 model)
  - `GPTJConfig` configuration class: `FlaxGPTJForCausalLM` (GPT-J model)
  - `GPTNeoConfig` configuration class: `FlaxGPTNeoForCausalLM` (GPT Neo model)
  - `GemmaConfig` configuration class: `FlaxGemmaForCausalLM` (Gemma model)
  - `LlamaConfig` configuration class: `FlaxLlamaForCausalLM` (LLaMA model)
  - `MistralConfig` configuration class: `FlaxMistralForCausalLM` (Mistral model)
  - `OPTConfig` configuration class: `FlaxOPTForCausalLM` (OPT model)
  - `RobertaConfig` configuration class: `FlaxRobertaForCausalLM` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForCausalLM` (RoBERTa-PreLayerNorm model)
  - `XGLMConfig` configuration class: `FlaxXGLMForCausalLM` (XGLM model)
  - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForCausalLM` (XLM-RoBERTa model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForCausalLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForCausalLM.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.FlaxBartForCausalLM) (BART model) - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForCausalLM) (BERT model) - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [FlaxBigBirdForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForCausalLM) (BigBird model) - [BloomConfig](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomConfig) configuration class: [FlaxBloomForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.FlaxBloomForCausalLM) (BLOOM model) - `ElectraConfig` configuration class: `FlaxElectraForCausalLM` (ELECTRA model) - `GPT2Config` configuration class: `FlaxGPT2LMHeadModel` (OpenAI GPT-2 model) - `GPTJConfig` configuration class: `FlaxGPTJForCausalLM` (GPT-J model) - `GPTNeoConfig` configuration class: `FlaxGPTNeoForCausalLM` (GPT Neo model) - `GemmaConfig` configuration class: `FlaxGemmaForCausalLM` (Gemma model) - `LlamaConfig` configuration class: `FlaxLlamaForCausalLM` (LLaMA model) - `MistralConfig` configuration class: `FlaxMistralForCausalLM` (Mistral model) - `OPTConfig` configuration class: `FlaxOPTForCausalLM` (OPT model) - `RobertaConfig` configuration class: `FlaxRobertaForCausalLM` (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForCausalLM` (RoBERTa-PreLayerNorm model) - `XGLMConfig` configuration class: `FlaxXGLMForCausalLM` (XGLM model) - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForCausalLM` (XLM-RoBERTa model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.FlaxAutoModelForCausalLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a causal language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **bart** -- [FlaxBartForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.FlaxBartForCausalLM) (BART model)
- **bert** -- [FlaxBertForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForCausalLM) (BERT model)
- **big_bird** -- [FlaxBigBirdForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForCausalLM) (BigBird model)
- **bloom** -- [FlaxBloomForCausalLM](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.FlaxBloomForCausalLM) (BLOOM model)
- **electra** -- `FlaxElectraForCausalLM` (ELECTRA model)
- **gemma** -- `FlaxGemmaForCausalLM` (Gemma model)
- **gpt-sw3** -- `FlaxGPT2LMHeadModel` (GPT-Sw3 model)
- **gpt2** -- `FlaxGPT2LMHeadModel` (OpenAI GPT-2 model)
- **gpt_neo** -- `FlaxGPTNeoForCausalLM` (GPT Neo model)
- **gptj** -- `FlaxGPTJForCausalLM` (GPT-J model)
- **llama** -- `FlaxLlamaForCausalLM` (LLaMA model)
- **mistral** -- `FlaxMistralForCausalLM` (Mistral model)
- **opt** -- `FlaxOPTForCausalLM` (OPT model)
- **roberta** -- `FlaxRobertaForCausalLM` (RoBERTa model)
- **roberta-prelayernorm** -- `FlaxRobertaPreLayerNormForCausalLM` (RoBERTa-PreLayerNorm model)
- **xglm** -- `FlaxXGLMForCausalLM` (XGLM model)
- **xlm-roberta** -- `FlaxXLMRobertaForCausalLM` (XLM-RoBERTa model)

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForCausalLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForCausalLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForMaskedLM[[transformers.AutoModelForMaskedLM]]

#### transformers.AutoModelForMaskedLM[[transformers.AutoModelForMaskedLM]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L1979)

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForMaskedLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [AlbertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertForMaskedLM) (ALBERT model)
  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [BartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartForConditionalGeneration) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [BertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertForMaskedLM) (BERT model)
  - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdForMaskedLM) (BigBird model)
  - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertForMaskedLM) (CamemBERT model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertForMaskedLM) (ConvBERT model)
  - [Data2VecTextConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextForMaskedLM) (Data2VecText model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaForMaskedLM) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2ForMaskedLM) (DeBERTa-v2 model)
  - `DistilBertConfig` configuration class: `DistilBertForMaskedLM` (DistilBERT model)
  - `ElectraConfig` configuration class: `ElectraForMaskedLM` (ELECTRA model)
  - `ErnieConfig` configuration class: `ErnieForMaskedLM` (ERNIE model)
  - `EsmConfig` configuration class: `EsmForMaskedLM` (ESM model)
  - `FNetConfig` configuration class: `FNetForMaskedLM` (FNet model)
  - `FlaubertConfig` configuration class: `FlaubertWithLMHeadModel` (FlauBERT model)
  - `FunnelConfig` configuration class: `FunnelForMaskedLM` (Funnel Transformer model)
  - `IBertConfig` configuration class: `IBertForMaskedLM` (I-BERT model)
  - `LayoutLMConfig` configuration class: `LayoutLMForMaskedLM` (LayoutLM model)
  - `LongformerConfig` configuration class: `LongformerForMaskedLM` (Longformer model)
  - `LukeConfig` configuration class: `LukeForMaskedLM` (LUKE model)
  - `MBartConfig` configuration class: `MBartForConditionalGeneration` (mBART model)
  - `MPNetConfig` configuration class: `MPNetForMaskedLM` (MPNet model)
  - `MegaConfig` configuration class: `MegaForMaskedLM` (MEGA model)
  - `MegatronBertConfig` configuration class: `MegatronBertForMaskedLM` (Megatron-BERT model)
  - `MobileBertConfig` configuration class: `MobileBertForMaskedLM` (MobileBERT model)
  - `ModernBertConfig` configuration class: `ModernBertForMaskedLM` (ModernBERT model)
  - `MraConfig` configuration class: `MraForMaskedLM` (MRA model)
  - `MvpConfig` configuration class: `MvpForConditionalGeneration` (MVP model)
  - `NezhaConfig` configuration class: `NezhaForMaskedLM` (Nezha model)
  - `NystromformerConfig` configuration class: `NystromformerForMaskedLM` (Nyströmformer model)
  - `PerceiverConfig` configuration class: `PerceiverForMaskedLM` (Perceiver model)
  - `QDQBertConfig` configuration class: `QDQBertForMaskedLM` (QDQBert model)
  - `ReformerConfig` configuration class: `ReformerForMaskedLM` (Reformer model)
  - `RemBertConfig` configuration class: `RemBertForMaskedLM` (RemBERT model)
  - `RoCBertConfig` configuration class: `RoCBertForMaskedLM` (RoCBert model)
  - `RoFormerConfig` configuration class: `RoFormerForMaskedLM` (RoFormer model)
  - `RobertaConfig` configuration class: `RobertaForMaskedLM` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
  - `SqueezeBertConfig` configuration class: `SqueezeBertForMaskedLM` (SqueezeBERT model)
  - `TapasConfig` configuration class: `TapasForMaskedLM` (TAPAS model)
  - `Wav2Vec2Config` configuration class: `Wav2Vec2ForMaskedLM` (Wav2Vec2 model)
  - `XLMConfig` configuration class: `XLMWithLMHeadModel` (XLM model)
  - `XLMRobertaConfig` configuration class: `XLMRobertaForMaskedLM` (XLM-RoBERTa model)
  - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForMaskedLM` (XLM-RoBERTa-XL model)
  - `XmodConfig` configuration class: `XmodForMaskedLM` (X-MOD model)
  - `YosoConfig` configuration class: `YosoForMaskedLM` (YOSO model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForMaskedLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForMaskedLM.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [AlbertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertForMaskedLM) (ALBERT model) - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [BartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartForConditionalGeneration) (BART model) - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [BertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertForMaskedLM) (BERT model) - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdForMaskedLM) (BigBird model) - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertForMaskedLM) (CamemBERT model) - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertForMaskedLM) (ConvBERT model) - [Data2VecTextConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextForMaskedLM) (Data2VecText model) - [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) configuration class: 
[DebertaForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaForMaskedLM) (DeBERTa model) - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2ForMaskedLM) (DeBERTa-v2 model) - `DistilBertConfig` configuration class: `DistilBertForMaskedLM` (DistilBERT model) - `ElectraConfig` configuration class: `ElectraForMaskedLM` (ELECTRA model) - `ErnieConfig` configuration class: `ErnieForMaskedLM` (ERNIE model) - `EsmConfig` configuration class: `EsmForMaskedLM` (ESM model) - `FNetConfig` configuration class: `FNetForMaskedLM` (FNet model) - `FlaubertConfig` configuration class: `FlaubertWithLMHeadModel` (FlauBERT model) - `FunnelConfig` configuration class: `FunnelForMaskedLM` (Funnel Transformer model) - `IBertConfig` configuration class: `IBertForMaskedLM` (I-BERT model) - `LayoutLMConfig` configuration class: `LayoutLMForMaskedLM` (LayoutLM model) - `LongformerConfig` configuration class: `LongformerForMaskedLM` (Longformer model) - `LukeConfig` configuration class: `LukeForMaskedLM` (LUKE model) - `MBartConfig` configuration class: `MBartForConditionalGeneration` (mBART model) - `MPNetConfig` configuration class: `MPNetForMaskedLM` (MPNet model) - `MegaConfig` configuration class: `MegaForMaskedLM` (MEGA model) - `MegatronBertConfig` configuration class: `MegatronBertForMaskedLM` (Megatron-BERT model) - `MobileBertConfig` configuration class: `MobileBertForMaskedLM` (MobileBERT model) - `ModernBertConfig` configuration class: `ModernBertForMaskedLM` (ModernBERT model) - `MraConfig` configuration class: `MraForMaskedLM` (MRA model) - `MvpConfig` configuration class: `MvpForConditionalGeneration` (MVP model) - `NezhaConfig` configuration class: `NezhaForMaskedLM` (Nezha model) - `NystromformerConfig` configuration class: `NystromformerForMaskedLM` (Nyströmformer model) - 
`PerceiverConfig` configuration class: `PerceiverForMaskedLM` (Perceiver model) - `QDQBertConfig` configuration class: `QDQBertForMaskedLM` (QDQBert model) - `ReformerConfig` configuration class: `ReformerForMaskedLM` (Reformer model) - `RemBertConfig` configuration class: `RemBertForMaskedLM` (RemBERT model) - `RoCBertConfig` configuration class: `RoCBertForMaskedLM` (RoCBert model) - `RoFormerConfig` configuration class: `RoFormerForMaskedLM` (RoFormer model) - `RobertaConfig` configuration class: `RobertaForMaskedLM` (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model) - `SqueezeBertConfig` configuration class: `SqueezeBertForMaskedLM` (SqueezeBERT model) - `TapasConfig` configuration class: `TapasForMaskedLM` (TAPAS model) - `Wav2Vec2Config` configuration class: `Wav2Vec2ForMaskedLM` (Wav2Vec2 model) - `XLMConfig` configuration class: `XLMWithLMHeadModel` (XLM model) - `XLMRobertaConfig` configuration class: `XLMRobertaForMaskedLM` (XLM-RoBERTa model) - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForMaskedLM` (XLM-RoBERTa-XL model) - `XmodConfig` configuration class: `XmodForMaskedLM` (X-MOD model) - `YosoConfig` configuration class: `YosoForMaskedLM` (YOSO model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForMaskedLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a masked language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [AlbertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertForMaskedLM) (ALBERT model)
- **bart** -- [BartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartForConditionalGeneration) (BART model)
- **bert** -- [BertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertForMaskedLM) (BERT model)
- **big_bird** -- [BigBirdForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdForMaskedLM) (BigBird model)
- **camembert** -- [CamembertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertForMaskedLM) (CamemBERT model)
- **convbert** -- [ConvBertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertForMaskedLM) (ConvBERT model)
- **data2vec-text** -- [Data2VecTextForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextForMaskedLM) (Data2VecText model)
- **deberta** -- [DebertaForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaForMaskedLM) (DeBERTa model)
- **deberta-v2** -- [DebertaV2ForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2ForMaskedLM) (DeBERTa-v2 model)
- **distilbert** -- `DistilBertForMaskedLM` (DistilBERT model)
- **electra** -- `ElectraForMaskedLM` (ELECTRA model)
- **ernie** -- `ErnieForMaskedLM` (ERNIE model)
- **esm** -- `EsmForMaskedLM` (ESM model)
- **flaubert** -- `FlaubertWithLMHeadModel` (FlauBERT model)
- **fnet** -- `FNetForMaskedLM` (FNet model)
- **funnel** -- `FunnelForMaskedLM` (Funnel Transformer model)
- **ibert** -- `IBertForMaskedLM` (I-BERT model)
- **layoutlm** -- `LayoutLMForMaskedLM` (LayoutLM model)
- **longformer** -- `LongformerForMaskedLM` (Longformer model)
- **luke** -- `LukeForMaskedLM` (LUKE model)
- **mbart** -- `MBartForConditionalGeneration` (mBART model)
- **mega** -- `MegaForMaskedLM` (MEGA model)
- **megatron-bert** -- `MegatronBertForMaskedLM` (Megatron-BERT model)
- **mobilebert** -- `MobileBertForMaskedLM` (MobileBERT model)
- **modernbert** -- `ModernBertForMaskedLM` (ModernBERT model)
- **mpnet** -- `MPNetForMaskedLM` (MPNet model)
- **mra** -- `MraForMaskedLM` (MRA model)
- **mvp** -- `MvpForConditionalGeneration` (MVP model)
- **nezha** -- `NezhaForMaskedLM` (Nezha model)
- **nystromformer** -- `NystromformerForMaskedLM` (Nyströmformer model)
- **perceiver** -- `PerceiverForMaskedLM` (Perceiver model)
- **qdqbert** -- `QDQBertForMaskedLM` (QDQBert model)
- **reformer** -- `ReformerForMaskedLM` (Reformer model)
- **rembert** -- `RemBertForMaskedLM` (RemBERT model)
- **roberta** -- `RobertaForMaskedLM` (RoBERTa model)
- **roberta-prelayernorm** -- `RobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
- **roc_bert** -- `RoCBertForMaskedLM` (RoCBert model)
- **roformer** -- `RoFormerForMaskedLM` (RoFormer model)
- **squeezebert** -- `SqueezeBertForMaskedLM` (SqueezeBERT model)
- **tapas** -- `TapasForMaskedLM` (TAPAS model)
- **wav2vec2** -- `Wav2Vec2ForMaskedLM` (Wav2Vec2 model)
- **xlm** -- `XLMWithLMHeadModel` (XLM model)
- **xlm-roberta** -- `XLMRobertaForMaskedLM` (XLM-RoBERTa model)
- **xlm-roberta-xl** -- `XLMRobertaXLForMaskedLM` (XLM-RoBERTa-XL model)
- **xmod** -- `XmodForMaskedLM` (X-MOD model)
- **yoso** -- `YosoForMaskedLM` (YOSO model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForMaskedLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForMaskedLM.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```
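A minimal offline sketch of the evaluation/training toggle noted above (the tiny config is illustrative; a model loaded with `from_pretrained()` already starts in evaluation mode):

```python
from transformers import AutoModelForMaskedLM, BertConfig

# Tiny, randomly initialized model so the example runs without a download.
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64)
model = AutoModelForMaskedLM.from_config(config)

model.eval()   # disable dropout etc., as from_pretrained() does for you
assert not model.training
model.train()  # switch back before fine-tuning
assert model.training
```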

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (`dict[str, torch.Tensor]`, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
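The configuration-override behavior of `kwargs` described above can be checked offline by round-tripping a tiny model through `save_pretrained()`; the local temporary directory stands in for a Hub model id:

```python
import tempfile

from transformers import AutoModelForMaskedLM, BertConfig, BertForMaskedLM

# Tiny, randomly initialized model so the example runs without a download.
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64)
with tempfile.TemporaryDirectory() as tmp:
    BertForMaskedLM(config).save_pretrained(tmp)
    # No explicit `config` argument: kwargs that match configuration attributes
    # update the automatically loaded configuration.
    model = AutoModelForMaskedLM.from_pretrained(tmp, output_attentions=True)
print(model.config.output_attentions)  # True
```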

### TFAutoModelForMaskedLM[[transformers.TFAutoModelForMaskedLM]]

#### transformers.TFAutoModelForMaskedLM[[transformers.TFAutoModelForMaskedLM]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L619)

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForMaskedLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [TFAlbertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertForMaskedLM) (ALBERT model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForMaskedLM) (BERT model)
  - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [TFCamembertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForMaskedLM) (CamemBERT model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.TFConvBertForMaskedLM) (ConvBERT model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) configuration class: [TFDebertaForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.TFDebertaForMaskedLM) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2ForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.TFDebertaV2ForMaskedLM) (DeBERTa-v2 model)
  - `DistilBertConfig` configuration class: `TFDistilBertForMaskedLM` (DistilBERT model)
  - `ElectraConfig` configuration class: `TFElectraForMaskedLM` (ELECTRA model)
  - `EsmConfig` configuration class: `TFEsmForMaskedLM` (ESM model)
  - `FlaubertConfig` configuration class: `TFFlaubertWithLMHeadModel` (FlauBERT model)
  - `FunnelConfig` configuration class: `TFFunnelForMaskedLM` (Funnel Transformer model)
  - `LayoutLMConfig` configuration class: `TFLayoutLMForMaskedLM` (LayoutLM model)
  - `LongformerConfig` configuration class: `TFLongformerForMaskedLM` (Longformer model)
  - `MPNetConfig` configuration class: `TFMPNetForMaskedLM` (MPNet model)
  - `MobileBertConfig` configuration class: `TFMobileBertForMaskedLM` (MobileBERT model)
  - `RemBertConfig` configuration class: `TFRemBertForMaskedLM` (RemBERT model)
  - `RoFormerConfig` configuration class: `TFRoFormerForMaskedLM` (RoFormer model)
  - `RobertaConfig` configuration class: `TFRobertaForMaskedLM` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
  - `TapasConfig` configuration class: `TFTapasForMaskedLM` (TAPAS model)
  - `XLMConfig` configuration class: `TFXLMWithLMHeadModel` (XLM model)
  - `XLMRobertaConfig` configuration class: `TFXLMRobertaForMaskedLM` (XLM-RoBERTa model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForMaskedLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForMaskedLM.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:

- [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [TFAlbertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertForMaskedLM) (ALBERT model)
- [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForMaskedLM) (BERT model)
- [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [TFCamembertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForMaskedLM) (CamemBERT model)
- [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.TFConvBertForMaskedLM) (ConvBERT model)
- [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) configuration class: [TFDebertaForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.TFDebertaForMaskedLM) (DeBERTa model)
- [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2ForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.TFDebertaV2ForMaskedLM) (DeBERTa-v2 model)
- `DistilBertConfig` configuration class: `TFDistilBertForMaskedLM` (DistilBERT model)
- `ElectraConfig` configuration class: `TFElectraForMaskedLM` (ELECTRA model)
- `EsmConfig` configuration class: `TFEsmForMaskedLM` (ESM model)
- `FlaubertConfig` configuration class: `TFFlaubertWithLMHeadModel` (FlauBERT model)
- `FunnelConfig` configuration class: `TFFunnelForMaskedLM` (Funnel Transformer model)
- `LayoutLMConfig` configuration class: `TFLayoutLMForMaskedLM` (LayoutLM model)
- `LongformerConfig` configuration class: `TFLongformerForMaskedLM` (Longformer model)
- `MPNetConfig` configuration class: `TFMPNetForMaskedLM` (MPNet model)
- `MobileBertConfig` configuration class: `TFMobileBertForMaskedLM` (MobileBERT model)
- `RemBertConfig` configuration class: `TFRemBertForMaskedLM` (RemBERT model)
- `RoFormerConfig` configuration class: `TFRoFormerForMaskedLM` (RoFormer model)
- `RobertaConfig` configuration class: `TFRobertaForMaskedLM` (RoBERTa model)
- `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
- `TapasConfig` configuration class: `TFTapasForMaskedLM` (TAPAS model)
- `XLMConfig` configuration class: `TFXLMWithLMHeadModel` (XLM model)
- `XLMRobertaConfig` configuration class: `TFXLMRobertaForMaskedLM` (XLM-RoBERTa model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

#### from_pretrained[[transformers.TFAutoModelForMaskedLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a masked language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [TFAlbertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertForMaskedLM) (ALBERT model)
- **bert** -- [TFBertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForMaskedLM) (BERT model)
- **camembert** -- [TFCamembertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForMaskedLM) (CamemBERT model)
- **convbert** -- [TFConvBertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.TFConvBertForMaskedLM) (ConvBERT model)
- **deberta** -- [TFDebertaForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.TFDebertaForMaskedLM) (DeBERTa model)
- **deberta-v2** -- [TFDebertaV2ForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.TFDebertaV2ForMaskedLM) (DeBERTa-v2 model)
- **distilbert** -- `TFDistilBertForMaskedLM` (DistilBERT model)
- **electra** -- `TFElectraForMaskedLM` (ELECTRA model)
- **esm** -- `TFEsmForMaskedLM` (ESM model)
- **flaubert** -- `TFFlaubertWithLMHeadModel` (FlauBERT model)
- **funnel** -- `TFFunnelForMaskedLM` (Funnel Transformer model)
- **layoutlm** -- `TFLayoutLMForMaskedLM` (LayoutLM model)
- **longformer** -- `TFLongformerForMaskedLM` (Longformer model)
- **mobilebert** -- `TFMobileBertForMaskedLM` (MobileBERT model)
- **mpnet** -- `TFMPNetForMaskedLM` (MPNet model)
- **rembert** -- `TFRemBertForMaskedLM` (RemBERT model)
- **roberta** -- `TFRobertaForMaskedLM` (RoBERTa model)
- **roberta-prelayernorm** -- `TFRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
- **roformer** -- `TFRoFormerForMaskedLM` (RoFormer model)
- **tapas** -- `TFTapasForMaskedLM` (TAPAS model)
- **xlm** -- `TFXLMWithLMHeadModel` (XLM model)
- **xlm-roberta** -- `TFXLMRobertaForMaskedLM` (XLM-RoBERTa model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForMaskedLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForMaskedLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```
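The selection logic described above can be sketched in plain Python (a simplified illustration with hypothetical data, not the library's actual dispatch tables): prefer the `model_type` read from the config, and fall back to substring matching on the name or path.

```python
# Hypothetical, abbreviated mapping from model_type keys to class names;
# not the actual transformers dispatch table.
MODEL_FOR_MASKED_LM = {
    "albert": "TFAlbertForMaskedLM",
    "bert": "TFBertForMaskedLM",
    "roberta": "TFRobertaForMaskedLM",
    "xlm-roberta": "TFXLMRobertaForMaskedLM",
}

def resolve_class(name_or_path, model_type=None):
    # Prefer the explicit model_type from the loaded config.
    if model_type is not None:
        return MODEL_FOR_MASKED_LM[model_type]
    # Fallback: pattern-match on the name; try longer keys first so that
    # "xlm-roberta" wins over "roberta" for "xlm-roberta-base".
    for key in sorted(MODEL_FOR_MASKED_LM, key=len, reverse=True):
        if key in name_or_path:
            return MODEL_FOR_MASKED_LM[key]
    raise ValueError(f"Could not infer a model class for {name_or_path!r}")
```

For example, `resolve_class("google-bert/bert-base-cased")` picks the BERT class via pattern matching, while passing `model_type` short-circuits the fallback entirely.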

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:

- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.
- A path or URL to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. Since we use a git-based system for storing models and other artifacts on huggingface.co, `revision` can be any identifier allowed by git, such as a branch name, a tag name, or a commit id.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. Since we use a git-based system for storing models and other artifacts on huggingface.co, it can be any identifier allowed by git, such as a branch name, a tag name, or a commit id.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:

- If a configuration is provided with `config`, `**kwargs` will be passed directly to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
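The kwargs-splitting behavior when no `config` is passed can be sketched in plain Python (a simplified illustration with a hypothetical helper, not the actual implementation): keys matching configuration attributes update the config, and the rest are forwarded to the model.

```python
# Sketch of how extra keyword arguments are split when no `config` is given
# (hypothetical helper; the real logic lives in PretrainedConfig.from_pretrained).
def split_kwargs(config_attrs, **kwargs):
    config_updates, model_kwargs = {}, {}
    for key, value in kwargs.items():
        if key in config_attrs:
            config_updates[key] = value  # overrides a config attribute
        else:
            model_kwargs[key] = value    # forwarded to the model's __init__
    return config_updates, model_kwargs
```

For instance, with `config_attrs = {"output_attentions", "hidden_size"}`, passing `output_attentions=True` updates the configuration while any unrecognized key is handed to the model constructor.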

### FlaxAutoModelForMaskedLM[[transformers.FlaxAutoModelForMaskedLM]]

#### transformers.FlaxAutoModelForMaskedLM[[transformers.FlaxAutoModelForMaskedLM]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L302)

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForMaskedLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [FlaxAlbertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertForMaskedLM) (ALBERT model)
  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.FlaxBartForConditionalGeneration) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForMaskedLM) (BERT model)
  - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [FlaxBigBirdForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForMaskedLM) (BigBird model)
  - `DistilBertConfig` configuration class: `FlaxDistilBertForMaskedLM` (DistilBERT model)
  - `ElectraConfig` configuration class: `FlaxElectraForMaskedLM` (ELECTRA model)
  - `MBartConfig` configuration class: `FlaxMBartForConditionalGeneration` (mBART model)
  - `RoFormerConfig` configuration class: `FlaxRoFormerForMaskedLM` (RoFormer model)
  - `RobertaConfig` configuration class: `FlaxRobertaForMaskedLM` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
  - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForMaskedLM` (XLM-RoBERTa model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForMaskedLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForMaskedLM.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:

- [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [FlaxAlbertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertForMaskedLM) (ALBERT model)
- [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.FlaxBartForConditionalGeneration) (BART model)
- [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForMaskedLM) (BERT model)
- [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [FlaxBigBirdForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForMaskedLM) (BigBird model)
- `DistilBertConfig` configuration class: `FlaxDistilBertForMaskedLM` (DistilBERT model)
- `ElectraConfig` configuration class: `FlaxElectraForMaskedLM` (ELECTRA model)
- `MBartConfig` configuration class: `FlaxMBartForConditionalGeneration` (mBART model)
- `RoFormerConfig` configuration class: `FlaxRoFormerForMaskedLM` (RoFormer model)
- `RobertaConfig` configuration class: `FlaxRobertaForMaskedLM` (RoBERTa model)
- `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
- `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForMaskedLM` (XLM-RoBERTa model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

#### from_pretrained[[transformers.FlaxAutoModelForMaskedLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a masked language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [FlaxAlbertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertForMaskedLM) (ALBERT model)
- **bart** -- [FlaxBartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.FlaxBartForConditionalGeneration) (BART model)
- **bert** -- [FlaxBertForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForMaskedLM) (BERT model)
- **big_bird** -- [FlaxBigBirdForMaskedLM](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForMaskedLM) (BigBird model)
- **distilbert** -- `FlaxDistilBertForMaskedLM` (DistilBERT model)
- **electra** -- `FlaxElectraForMaskedLM` (ELECTRA model)
- **mbart** -- `FlaxMBartForConditionalGeneration` (mBART model)
- **roberta** -- `FlaxRobertaForMaskedLM` (RoBERTa model)
- **roberta-prelayernorm** -- `FlaxRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
- **roformer** -- `FlaxRoFormerForMaskedLM` (RoFormer model)
- **xlm-roberta** -- `FlaxXLMRobertaForMaskedLM` (XLM-RoBERTa model)

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForMaskedLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForMaskedLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:

- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.
- A path or URL to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. Since we use a git-based system for storing models and other artifacts on huggingface.co, `revision` can be any identifier allowed by git, such as a branch name, a tag name, or a commit id.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. Since we use a git-based system for storing models and other artifacts on huggingface.co, it can be any identifier allowed by git, such as a branch name, a tag name, or a commit id.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:

- If a configuration is provided with `config`, `**kwargs` will be passed directly to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForMaskGeneration[[transformers.AutoModelForMaskGeneration]]

#### transformers.AutoModelForMaskGeneration[[transformers.AutoModelForMaskGeneration]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L1920)

### TFAutoModelForMaskGeneration[[transformers.TFAutoModelForMaskGeneration]]

#### transformers.TFAutoModelForMaskGeneration[[transformers.TFAutoModelForMaskGeneration]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L530)

### AutoModelForSeq2SeqLM[[transformers.AutoModelForSeq2SeqLM]]

#### transformers.AutoModelForSeq2SeqLM[[transformers.AutoModelForSeq2SeqLM]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L1986)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForSeq2SeqLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [BartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartForConditionalGeneration) (BART model)
  - [BigBirdPegasusConfig](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusConfig) configuration class: [BigBirdPegasusForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForConditionalGeneration) (BigBird-Pegasus model)
  - [BlenderbotConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotConfig) configuration class: [BlenderbotForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotForConditionalGeneration) (Blenderbot model)
  - [BlenderbotSmallConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallConfig) configuration class: [BlenderbotSmallForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallForConditionalGeneration) (BlenderbotSmall model)
  - `EncoderDecoderConfig` configuration class: `EncoderDecoderModel` (Encoder decoder model)
  - `FSMTConfig` configuration class: `FSMTForConditionalGeneration` (FairSeq Machine-Translation model)
  - `GPTSanJapaneseConfig` configuration class: `GPTSanJapaneseForConditionalGeneration` (GPTSAN-japanese model)
  - `GraniteSpeechConfig` configuration class: `GraniteSpeechForConditionalGeneration` (GraniteSpeech model)
  - `LEDConfig` configuration class: `LEDForConditionalGeneration` (LED model)
  - `LongT5Config` configuration class: `LongT5ForConditionalGeneration` (LongT5 model)
  - `M2M100Config` configuration class: `M2M100ForConditionalGeneration` (M2M100 model)
  - `MBartConfig` configuration class: `MBartForConditionalGeneration` (mBART model)
  - `MT5Config` configuration class: `MT5ForConditionalGeneration` (MT5 model)
  - `MarianConfig` configuration class: `MarianMTModel` (Marian model)
  - `MvpConfig` configuration class: `MvpForConditionalGeneration` (MVP model)
  - `NllbMoeConfig` configuration class: `NllbMoeForConditionalGeneration` (NLLB-MOE model)
  - `PLBartConfig` configuration class: `PLBartForConditionalGeneration` (PLBart model)
  - `PegasusConfig` configuration class: `PegasusForConditionalGeneration` (Pegasus model)
  - `PegasusXConfig` configuration class: `PegasusXForConditionalGeneration` (PEGASUS-X model)
  - `ProphetNetConfig` configuration class: `ProphetNetForConditionalGeneration` (ProphetNet model)
  - `Qwen2AudioConfig` configuration class: `Qwen2AudioForConditionalGeneration` (Qwen2Audio model)
  - `SeamlessM4TConfig` configuration class: `SeamlessM4TForTextToText` (SeamlessM4T model)
  - `SeamlessM4Tv2Config` configuration class: `SeamlessM4Tv2ForTextToText` (SeamlessM4Tv2 model)
  - `SwitchTransformersConfig` configuration class: `SwitchTransformersForConditionalGeneration` (SwitchTransformers model)
  - `T5Config` configuration class: `T5ForConditionalGeneration` (T5 model)
  - `T5GemmaConfig` configuration class: `T5GemmaForConditionalGeneration` (T5Gemma model)
  - `UMT5Config` configuration class: `UMT5ForConditionalGeneration` (UMT5 model)
  - `VoxtralConfig` configuration class: `VoxtralForConditionalGeneration` (Voxtral model)
  - `XLMProphetNetConfig` configuration class: `XLMProphetNetForConditionalGeneration` (XLM-ProphetNet model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-t5/t5-base")
>>> model = AutoModelForSeq2SeqLM.from_config(config)
```
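As the note below states, `from_config` builds the architecture without loading any pretrained weights. A minimal sketch that needs no download (the tiny dimensions are illustrative only):

```python
from transformers import AutoModelForSeq2SeqLM, T5Config, T5ForConditionalGeneration

# Illustrative tiny configuration -- from_config() builds the architecture
# from it with randomly initialized weights; nothing is downloaded.
config = T5Config(d_model=32, d_ff=64, num_layers=2, num_heads=2, vocab_size=100)
model = AutoModelForSeq2SeqLM.from_config(config)

# Dispatch selected the T5 conditional-generation class from the config class.
assert isinstance(model, T5ForConditionalGeneration)
```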

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [BartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartForConditionalGeneration) (BART model) - [BigBirdPegasusConfig](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusConfig) configuration class: [BigBirdPegasusForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForConditionalGeneration) (BigBird-Pegasus model) - [BlenderbotConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotConfig) configuration class: [BlenderbotForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotForConditionalGeneration) (Blenderbot model) - [BlenderbotSmallConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallConfig) configuration class: [BlenderbotSmallForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallForConditionalGeneration) (BlenderbotSmall model) - `EncoderDecoderConfig` configuration class: `EncoderDecoderModel` (Encoder decoder model) - `FSMTConfig` configuration class: `FSMTForConditionalGeneration` (FairSeq Machine-Translation model) - `GPTSanJapaneseConfig` configuration class: `GPTSanJapaneseForConditionalGeneration` (GPTSAN-japanese model) - `GraniteSpeechConfig` configuration class: `GraniteSpeechForConditionalGeneration` (GraniteSpeech model) - `LEDConfig` configuration class: `LEDForConditionalGeneration` (LED model) - `LongT5Config` configuration class: `LongT5ForConditionalGeneration` (LongT5 model) - `M2M100Config` configuration class: `M2M100ForConditionalGeneration` (M2M100 
model) - `MBartConfig` configuration class: `MBartForConditionalGeneration` (mBART model) - `MT5Config` configuration class: `MT5ForConditionalGeneration` (MT5 model) - `MarianConfig` configuration class: `MarianMTModel` (Marian model) - `MvpConfig` configuration class: `MvpForConditionalGeneration` (MVP model) - `NllbMoeConfig` configuration class: `NllbMoeForConditionalGeneration` (NLLB-MOE model) - `PLBartConfig` configuration class: `PLBartForConditionalGeneration` (PLBart model) - `PegasusConfig` configuration class: `PegasusForConditionalGeneration` (Pegasus model) - `PegasusXConfig` configuration class: `PegasusXForConditionalGeneration` (PEGASUS-X model) - `ProphetNetConfig` configuration class: `ProphetNetForConditionalGeneration` (ProphetNet model) - `Qwen2AudioConfig` configuration class: `Qwen2AudioForConditionalGeneration` (Qwen2Audio model) - `SeamlessM4TConfig` configuration class: `SeamlessM4TForTextToText` (SeamlessM4T model) - `SeamlessM4Tv2Config` configuration class: `SeamlessM4Tv2ForTextToText` (SeamlessM4Tv2 model) - `SwitchTransformersConfig` configuration class: `SwitchTransformersForConditionalGeneration` (SwitchTransformers model) - `T5Config` configuration class: `T5ForConditionalGeneration` (T5 model) - `T5GemmaConfig` configuration class: `T5GemmaForConditionalGeneration` (T5Gemma model) - `UMT5Config` configuration class: `UMT5ForConditionalGeneration` (UMT5 model) - `VoxtralConfig` configuration class: `VoxtralForConditionalGeneration` (Voxtral model) - `XLMProphetNetConfig` configuration class: `XLMProphetNetForConditionalGeneration` (XLM-ProphetNet model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForSeq2SeqLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **bart** -- [BartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartForConditionalGeneration) (BART model)
- **bigbird_pegasus** -- [BigBirdPegasusForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForConditionalGeneration) (BigBird-Pegasus model)
- **blenderbot** -- [BlenderbotForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotForConditionalGeneration) (Blenderbot model)
- **blenderbot-small** -- [BlenderbotSmallForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallForConditionalGeneration) (BlenderbotSmall model)
- **encoder-decoder** -- `EncoderDecoderModel` (Encoder decoder model)
- **fsmt** -- `FSMTForConditionalGeneration` (FairSeq Machine-Translation model)
- **gptsan-japanese** -- `GPTSanJapaneseForConditionalGeneration` (GPTSAN-japanese model)
- **granite_speech** -- `GraniteSpeechForConditionalGeneration` (GraniteSpeech model)
- **led** -- `LEDForConditionalGeneration` (LED model)
- **longt5** -- `LongT5ForConditionalGeneration` (LongT5 model)
- **m2m_100** -- `M2M100ForConditionalGeneration` (M2M100 model)
- **marian** -- `MarianMTModel` (Marian model)
- **mbart** -- `MBartForConditionalGeneration` (mBART model)
- **mt5** -- `MT5ForConditionalGeneration` (MT5 model)
- **mvp** -- `MvpForConditionalGeneration` (MVP model)
- **nllb-moe** -- `NllbMoeForConditionalGeneration` (NLLB-MOE model)
- **pegasus** -- `PegasusForConditionalGeneration` (Pegasus model)
- **pegasus_x** -- `PegasusXForConditionalGeneration` (PEGASUS-X model)
- **plbart** -- `PLBartForConditionalGeneration` (PLBart model)
- **prophetnet** -- `ProphetNetForConditionalGeneration` (ProphetNet model)
- **qwen2_audio** -- `Qwen2AudioForConditionalGeneration` (Qwen2Audio model)
- **seamless_m4t** -- `SeamlessM4TForTextToText` (SeamlessM4T model)
- **seamless_m4t_v2** -- `SeamlessM4Tv2ForTextToText` (SeamlessM4Tv2 model)
- **switch_transformers** -- `SwitchTransformersForConditionalGeneration` (SwitchTransformers model)
- **t5** -- `T5ForConditionalGeneration` (T5 model)
- **t5gemma** -- `T5GemmaForConditionalGeneration` (T5Gemma model)
- **umt5** -- `UMT5ForConditionalGeneration` (UMT5 model)
- **voxtral** -- `VoxtralForConditionalGeneration` (Voxtral model)
- **xlm-prophetnet** -- `XLMProphetNetForConditionalGeneration` (XLM-ProphetNet model)
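
The dispatch keys above are the `model_type` attributes of the corresponding configuration classes, which is what `from_pretrained()` reads from the loaded config. For example:

```python
from transformers import BartConfig, MarianConfig, T5Config

# Each config class carries the model_type key used for dispatch.
assert T5Config.model_type == "t5"
assert BartConfig.model_type == "bart"
assert MarianConfig.model_type == "marian"
```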

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.
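
A sketch of that toggle (using a randomly initialized tiny model here so no download is needed; a model loaded with `from_pretrained()` behaves the same way):

```python
from transformers import AutoModelForSeq2SeqLM, T5Config

# Illustrative tiny model; from_pretrained() would hand it back already in eval mode.
model = AutoModelForSeq2SeqLM.from_config(
    T5Config(d_model=32, d_ff=64, num_layers=2, num_heads=2, vocab_size=100)
)

model.eval()            # inference mode: dropout disabled
assert not model.training
model.train()           # switch back before fine-tuning
assert model.training
```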

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

>>> # Update configuration during loading
>>> model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/t5_tf_model_config.json")
>>> model = AutoModelForSeq2SeqLM.from_pretrained(
...     "./tf_model/t5_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### TFAutoModelForSeq2SeqLM[[transformers.TFAutoModelForSeq2SeqLM]]

#### transformers.TFAutoModelForSeq2SeqLM[[transformers.TFAutoModelForSeq2SeqLM]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L626)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForSeq2SeqLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [TFBartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.TFBartForConditionalGeneration) (BART model)
  - [BlenderbotConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotConfig) configuration class: [TFBlenderbotForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.TFBlenderbotForConditionalGeneration) (Blenderbot model)
  - [BlenderbotSmallConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallConfig) configuration class: [TFBlenderbotSmallForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.TFBlenderbotSmallForConditionalGeneration) (BlenderbotSmall model)
  - `EncoderDecoderConfig` configuration class: `TFEncoderDecoderModel` (Encoder decoder model)
  - `LEDConfig` configuration class: `TFLEDForConditionalGeneration` (LED model)
  - `MBartConfig` configuration class: `TFMBartForConditionalGeneration` (mBART model)
  - `MT5Config` configuration class: `TFMT5ForConditionalGeneration` (MT5 model)
  - `MarianConfig` configuration class: `TFMarianMTModel` (Marian model)
  - `PegasusConfig` configuration class: `TFPegasusForConditionalGeneration` (Pegasus model)
  - `T5Config` configuration class: `TFT5ForConditionalGeneration` (T5 model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForSeq2SeqLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-t5/t5-base")
>>> model = TFAutoModelForSeq2SeqLM.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [TFBartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.TFBartForConditionalGeneration) (BART model) - [BlenderbotConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotConfig) configuration class: [TFBlenderbotForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.TFBlenderbotForConditionalGeneration) (Blenderbot model) - [BlenderbotSmallConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallConfig) configuration class: [TFBlenderbotSmallForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.TFBlenderbotSmallForConditionalGeneration) (BlenderbotSmall model) - `EncoderDecoderConfig` configuration class: `TFEncoderDecoderModel` (Encoder decoder model) - `LEDConfig` configuration class: `TFLEDForConditionalGeneration` (LED model) - `MBartConfig` configuration class: `TFMBartForConditionalGeneration` (mBART model) - `MT5Config` configuration class: `TFMT5ForConditionalGeneration` (MT5 model) - `MarianConfig` configuration class: `TFMarianMTModel` (Marian model) - `PegasusConfig` configuration class: `TFPegasusForConditionalGeneration` (Pegasus model) - `T5Config` configuration class: `TFT5ForConditionalGeneration` (T5 model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForSeq2SeqLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **bart** -- [TFBartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.TFBartForConditionalGeneration) (BART model)
- **blenderbot** -- [TFBlenderbotForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.TFBlenderbotForConditionalGeneration) (Blenderbot model)
- **blenderbot-small** -- [TFBlenderbotSmallForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.TFBlenderbotSmallForConditionalGeneration) (BlenderbotSmall model)
- **encoder-decoder** -- `TFEncoderDecoderModel` (Encoder decoder model)
- **led** -- `TFLEDForConditionalGeneration` (LED model)
- **marian** -- `TFMarianMTModel` (Marian model)
- **mbart** -- `TFMBartForConditionalGeneration` (mBART model)
- **mt5** -- `TFMT5ForConditionalGeneration` (MT5 model)
- **pegasus** -- `TFPegasusForConditionalGeneration` (Pegasus model)
- **t5** -- `TFT5ForConditionalGeneration` (T5 model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForSeq2SeqLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

>>> # Update configuration during loading
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/t5_pt_model_config.json")
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained(
...     "./pt_model/t5_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g, `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model in a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### FlaxAutoModelForSeq2SeqLM[[transformers.FlaxAutoModelForSeq2SeqLM]]

#### transformers.FlaxAutoModelForSeq2SeqLM[[transformers.FlaxAutoModelForSeq2SeqLM]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L309)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForSeq2SeqLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.FlaxBartForConditionalGeneration) (BART model)
  - [BlenderbotConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.BlenderbotConfig) configuration class: [FlaxBlenderbotForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.FlaxBlenderbotForConditionalGeneration) (Blenderbot model)
  - [BlenderbotSmallConfig](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.BlenderbotSmallConfig) configuration class: [FlaxBlenderbotSmallForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.FlaxBlenderbotSmallForConditionalGeneration) (BlenderbotSmall model)
  - `EncoderDecoderConfig` configuration class: `FlaxEncoderDecoderModel` (Encoder decoder model)
  - `LongT5Config` configuration class: `FlaxLongT5ForConditionalGeneration` (LongT5 model)
  - `MBartConfig` configuration class: `FlaxMBartForConditionalGeneration` (mBART model)
  - `MT5Config` configuration class: `FlaxMT5ForConditionalGeneration` (MT5 model)
  - `MarianConfig` configuration class: `FlaxMarianMTModel` (Marian model)
  - `PegasusConfig` configuration class: `FlaxPegasusForConditionalGeneration` (Pegasus model)
  - `T5Config` configuration class: `FlaxT5ForConditionalGeneration` (T5 model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForSeq2SeqLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-t5/t5-base")
>>> model = FlaxAutoModelForSeq2SeqLM.from_config(config)
```

#### from_pretrained[[transformers.FlaxAutoModelForSeq2SeqLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **bart** -- [FlaxBartForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.FlaxBartForConditionalGeneration) (BART model)
- **blenderbot** -- [FlaxBlenderbotForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blenderbot#transformers.FlaxBlenderbotForConditionalGeneration) (Blenderbot model)
- **blenderbot-small** -- [FlaxBlenderbotSmallForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blenderbot-small#transformers.FlaxBlenderbotSmallForConditionalGeneration) (BlenderbotSmall model)
- **encoder-decoder** -- `FlaxEncoderDecoderModel` (Encoder decoder model)
- **longt5** -- `FlaxLongT5ForConditionalGeneration` (LongT5 model)
- **marian** -- `FlaxMarianMTModel` (Marian model)
- **mbart** -- `FlaxMBartForConditionalGeneration` (mBART model)
- **mt5** -- `FlaxMT5ForConditionalGeneration` (MT5 model)
- **pegasus** -- `FlaxPegasusForConditionalGeneration` (Pegasus model)
- **t5** -- `FlaxT5ForConditionalGeneration` (T5 model)

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForSeq2SeqLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/t5_pt_model_config.json")
>>> model = FlaxAutoModelForSeq2SeqLM.from_pretrained(
...     "./pt_model/t5_pytorch_model.bin", from_pt=True, config=config
... )
```
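The selection logic described above (dispatch on the config's `model_type`, with a fallback to pattern matching on the checkpoint name) can be sketched as a simplified, hypothetical illustration; the mapping below contains invented entries and is not the library's registry:

```python
# Simplified sketch: resolve a model class name from a config's model_type,
# falling back to substring matching on the checkpoint name or path.
MODEL_MAPPING = {
    "t5": "FlaxT5ForConditionalGeneration",
    "bart": "FlaxBartForConditionalGeneration",
    "marian": "FlaxMarianMTModel",
}

def resolve_model_class(name_or_path, model_type=None):
    # Prefer the explicit model_type taken from the loaded config.
    if model_type is not None:
        return MODEL_MAPPING[model_type]
    # Fallback: pattern-match on the repo name or local path.
    for key, cls in MODEL_MAPPING.items():
        if key in name_or_path.lower():
            return cls
    raise ValueError(f"Could not infer model type from {name_or_path!r}")
```

For example, `resolve_model_class("google-t5/t5-base")` falls back to pattern matching on the name, while passing `model_type` explicitly mirrors the config-driven path.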

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g, `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
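The kwargs-splitting behavior described for this parameter, where keys matching configuration attributes update the config and the remainder go to the model's `__init__`, can be sketched roughly like this (a hypothetical helper, not the library's code):

```python
# Rough sketch: when no explicit config is passed, kwargs that match
# configuration attributes become config overrides; the rest are passed
# on to the underlying model's __init__.
def split_kwargs(config_attrs, **kwargs):
    config_updates, model_kwargs = {}, {}
    for key, value in kwargs.items():
        if key in config_attrs:
            config_updates[key] = value
        else:
            model_kwargs[key] = value
    return config_updates, model_kwargs
```

For instance, with a config exposing `output_attentions`, calling `split_kwargs({"output_attentions"}, output_attentions=True, some_model_arg=1)` would route `output_attentions` to the config and `some_model_arg` (a hypothetical name) to the model.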

### AutoModelForSequenceClassification[[transformers.AutoModelForSequenceClassification]]

#### transformers.AutoModelForSequenceClassification[[transformers.AutoModelForSequenceClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L1997)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForSequenceClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [AlbertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertForSequenceClassification) (ALBERT model)
  - `ArceeConfig` configuration class: `ArceeForSequenceClassification` (Arcee model)
  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [BartForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartForSequenceClassification) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [BertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertForSequenceClassification) (BERT model)
  - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdForSequenceClassification) (BigBird model)
  - [BigBirdPegasusConfig](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusConfig) configuration class: [BigBirdPegasusForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForSequenceClassification) (BigBird-Pegasus model)
  - [BioGptConfig](/docs/transformers/v4.57.1/ja/model_doc/biogpt#transformers.BioGptConfig) configuration class: [BioGptForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/biogpt#transformers.BioGptForSequenceClassification) (BioGpt model)
  - [BloomConfig](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomConfig) configuration class: [BloomForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomForSequenceClassification) (BLOOM model)
  - [CTRLConfig](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLConfig) configuration class: [CTRLForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLForSequenceClassification) (CTRL model)
  - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertForSequenceClassification) (CamemBERT model)
  - [CanineConfig](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineConfig) configuration class: [CanineForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineForSequenceClassification) (CANINE model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertForSequenceClassification) (ConvBERT model)
  - [Data2VecTextConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextForSequenceClassification) (Data2VecText model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaForSequenceClassification) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2ForSequenceClassification) (DeBERTa-v2 model)
  - `DeepseekV2Config` configuration class: `DeepseekV2ForSequenceClassification` (DeepSeek-V2 model)
  - `DeepseekV3Config` configuration class: `DeepseekV3ForSequenceClassification` (DeepSeek-V3 model)
  - `DiffLlamaConfig` configuration class: `DiffLlamaForSequenceClassification` (DiffLlama model)
  - `DistilBertConfig` configuration class: `DistilBertForSequenceClassification` (DistilBERT model)
  - `DogeConfig` configuration class: `DogeForSequenceClassification` (Doge model)
  - `ElectraConfig` configuration class: `ElectraForSequenceClassification` (ELECTRA model)
  - `ErnieConfig` configuration class: `ErnieForSequenceClassification` (ERNIE model)
  - `ErnieMConfig` configuration class: `ErnieMForSequenceClassification` (ErnieM model)
  - `EsmConfig` configuration class: `EsmForSequenceClassification` (ESM model)
  - `Exaone4Config` configuration class: `Exaone4ForSequenceClassification` (EXAONE-4.0 model)
  - `FNetConfig` configuration class: `FNetForSequenceClassification` (FNet model)
  - `FalconConfig` configuration class: `FalconForSequenceClassification` (Falcon model)
  - `FlaubertConfig` configuration class: `FlaubertForSequenceClassification` (FlauBERT model)
  - `FunnelConfig` configuration class: `FunnelForSequenceClassification` (Funnel Transformer model)
  - `GPT2Config` configuration class: `GPT2ForSequenceClassification` (OpenAI GPT-2 model)
  - `GPTBigCodeConfig` configuration class: `GPTBigCodeForSequenceClassification` (GPTBigCode model)
  - `GPTJConfig` configuration class: `GPTJForSequenceClassification` (GPT-J model)
  - `GPTNeoConfig` configuration class: `GPTNeoForSequenceClassification` (GPT Neo model)
  - `GPTNeoXConfig` configuration class: `GPTNeoXForSequenceClassification` (GPT NeoX model)
  - `Gemma2Config` configuration class: `Gemma2ForSequenceClassification` (Gemma2 model)
  - `Gemma3Config` configuration class: `Gemma3ForSequenceClassification` (Gemma3ForConditionalGeneration model)
  - `Gemma3TextConfig` configuration class: `Gemma3TextForSequenceClassification` (Gemma3ForCausalLM model)
  - `GemmaConfig` configuration class: `GemmaForSequenceClassification` (Gemma model)
  - `Glm4Config` configuration class: `Glm4ForSequenceClassification` (GLM4 model)
  - `GlmConfig` configuration class: `GlmForSequenceClassification` (GLM model)
  - `GptOssConfig` configuration class: `GptOssForSequenceClassification` (GptOss model)
  - `HeliumConfig` configuration class: `HeliumForSequenceClassification` (Helium model)
  - `HunYuanDenseV1Config` configuration class: `HunYuanDenseV1ForSequenceClassification` (HunYuanDenseV1 model)
  - `HunYuanMoEV1Config` configuration class: `HunYuanMoEV1ForSequenceClassification` (HunYuanMoeV1 model)
  - `IBertConfig` configuration class: `IBertForSequenceClassification` (I-BERT model)
  - `JambaConfig` configuration class: `JambaForSequenceClassification` (Jamba model)
  - `JetMoeConfig` configuration class: `JetMoeForSequenceClassification` (JetMoe model)
  - `LEDConfig` configuration class: `LEDForSequenceClassification` (LED model)
  - `LayoutLMConfig` configuration class: `LayoutLMForSequenceClassification` (LayoutLM model)
  - `LayoutLMv2Config` configuration class: `LayoutLMv2ForSequenceClassification` (LayoutLMv2 model)
  - `LayoutLMv3Config` configuration class: `LayoutLMv3ForSequenceClassification` (LayoutLMv3 model)
  - `LiltConfig` configuration class: `LiltForSequenceClassification` (LiLT model)
  - `LlamaConfig` configuration class: `LlamaForSequenceClassification` (LLaMA model)
  - `LongformerConfig` configuration class: `LongformerForSequenceClassification` (Longformer model)
  - `LukeConfig` configuration class: `LukeForSequenceClassification` (LUKE model)
  - `MBartConfig` configuration class: `MBartForSequenceClassification` (mBART model)
  - `MPNetConfig` configuration class: `MPNetForSequenceClassification` (MPNet model)
  - `MT5Config` configuration class: `MT5ForSequenceClassification` (MT5 model)
  - `MarkupLMConfig` configuration class: `MarkupLMForSequenceClassification` (MarkupLM model)
  - `MegaConfig` configuration class: `MegaForSequenceClassification` (MEGA model)
  - `MegatronBertConfig` configuration class: `MegatronBertForSequenceClassification` (Megatron-BERT model)
  - `MiniMaxConfig` configuration class: `MiniMaxForSequenceClassification` (MiniMax model)
  - `MinistralConfig` configuration class: `MinistralForSequenceClassification` (Ministral model)
  - `MistralConfig` configuration class: `MistralForSequenceClassification` (Mistral model)
  - `MixtralConfig` configuration class: `MixtralForSequenceClassification` (Mixtral model)
  - `MobileBertConfig` configuration class: `MobileBertForSequenceClassification` (MobileBERT model)
  - `ModernBertConfig` configuration class: `ModernBertForSequenceClassification` (ModernBERT model)
  - `ModernBertDecoderConfig` configuration class: `ModernBertDecoderForSequenceClassification` (ModernBertDecoder model)
  - `MptConfig` configuration class: `MptForSequenceClassification` (MPT model)
  - `MraConfig` configuration class: `MraForSequenceClassification` (MRA model)
  - `MvpConfig` configuration class: `MvpForSequenceClassification` (MVP model)
  - `NemotronConfig` configuration class: `NemotronForSequenceClassification` (Nemotron model)
  - `NezhaConfig` configuration class: `NezhaForSequenceClassification` (Nezha model)
  - `NystromformerConfig` configuration class: `NystromformerForSequenceClassification` (Nyströmformer model)
  - `OPTConfig` configuration class: `OPTForSequenceClassification` (OPT model)
  - `OpenAIGPTConfig` configuration class: `OpenAIGPTForSequenceClassification` (OpenAI GPT model)
  - `OpenLlamaConfig` configuration class: `OpenLlamaForSequenceClassification` (OpenLlama model)
  - `PLBartConfig` configuration class: `PLBartForSequenceClassification` (PLBart model)
  - `PerceiverConfig` configuration class: `PerceiverForSequenceClassification` (Perceiver model)
  - `PersimmonConfig` configuration class: `PersimmonForSequenceClassification` (Persimmon model)
  - `Phi3Config` configuration class: `Phi3ForSequenceClassification` (Phi3 model)
  - `PhiConfig` configuration class: `PhiForSequenceClassification` (Phi model)
  - `PhimoeConfig` configuration class: `PhimoeForSequenceClassification` (Phimoe model)
  - `QDQBertConfig` configuration class: `QDQBertForSequenceClassification` (QDQBert model)
  - `Qwen2Config` configuration class: `Qwen2ForSequenceClassification` (Qwen2 model)
  - `Qwen2MoeConfig` configuration class: `Qwen2MoeForSequenceClassification` (Qwen2MoE model)
  - `Qwen3Config` configuration class: `Qwen3ForSequenceClassification` (Qwen3 model)
  - `Qwen3MoeConfig` configuration class: `Qwen3MoeForSequenceClassification` (Qwen3MoE model)
  - `Qwen3NextConfig` configuration class: `Qwen3NextForSequenceClassification` (Qwen3Next model)
  - `ReformerConfig` configuration class: `ReformerForSequenceClassification` (Reformer model)
  - `RemBertConfig` configuration class: `RemBertForSequenceClassification` (RemBERT model)
  - `RoCBertConfig` configuration class: `RoCBertForSequenceClassification` (RoCBert model)
  - `RoFormerConfig` configuration class: `RoFormerForSequenceClassification` (RoFormer model)
  - `RobertaConfig` configuration class: `RobertaForSequenceClassification` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForSequenceClassification` (RoBERTa-PreLayerNorm model)
  - `SeedOssConfig` configuration class: `SeedOssForSequenceClassification` (SeedOss model)
  - `SmolLM3Config` configuration class: `SmolLM3ForSequenceClassification` (SmolLM3 model)
  - `SqueezeBertConfig` configuration class: `SqueezeBertForSequenceClassification` (SqueezeBERT model)
  - `StableLmConfig` configuration class: `StableLmForSequenceClassification` (StableLm model)
  - `Starcoder2Config` configuration class: `Starcoder2ForSequenceClassification` (Starcoder2 model)
  - `T5Config` configuration class: `T5ForSequenceClassification` (T5 model)
  - `T5GemmaConfig` configuration class: `T5GemmaForSequenceClassification` (T5Gemma model)
  - `TapasConfig` configuration class: `TapasForSequenceClassification` (TAPAS model)
  - `TransfoXLConfig` configuration class: `TransfoXLForSequenceClassification` (Transformer-XL model)
  - `UMT5Config` configuration class: `UMT5ForSequenceClassification` (UMT5 model)
  - `XLMConfig` configuration class: `XLMForSequenceClassification` (XLM model)
  - `XLMRobertaConfig` configuration class: `XLMRobertaForSequenceClassification` (XLM-RoBERTa model)
  - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForSequenceClassification` (XLM-RoBERTa-XL model)
  - `XLNetConfig` configuration class: `XLNetForSequenceClassification` (XLNet model)
  - `XmodConfig` configuration class: `XmodForSequenceClassification` (X-MOD model)
  - `YosoConfig` configuration class: `YosoForSequenceClassification` (YOSO model)
  - `Zamba2Config` configuration class: `Zamba2ForSequenceClassification` (Zamba2 model)
  - `ZambaConfig` configuration class: `ZambaForSequenceClassification` (Zamba model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSequenceClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForSequenceClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [AlbertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertForSequenceClassification) (ALBERT model) - `ArceeConfig` configuration class: `ArceeForSequenceClassification` (Arcee model) - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [BartForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartForSequenceClassification) (BART model) - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [BertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertForSequenceClassification) (BERT model) - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdForSequenceClassification) (BigBird model) - [BigBirdPegasusConfig](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusConfig) configuration class: [BigBirdPegasusForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForSequenceClassification) (BigBird-Pegasus model) - [BioGptConfig](/docs/transformers/v4.57.1/ja/model_doc/biogpt#transformers.BioGptConfig) configuration class: [BioGptForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/biogpt#transformers.BioGptForSequenceClassification) (BioGpt model) - [BloomConfig](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomConfig) configuration class: 
[BloomForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomForSequenceClassification) (BLOOM model) - [CTRLConfig](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLConfig) configuration class: [CTRLForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLForSequenceClassification) (CTRL model) - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertForSequenceClassification) (CamemBERT model) - [CanineConfig](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineConfig) configuration class: [CanineForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineForSequenceClassification) (CANINE model) - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertForSequenceClassification) (ConvBERT model) - [Data2VecTextConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextForSequenceClassification) (Data2VecText model) - [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaForSequenceClassification) (DeBERTa model) - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2ForSequenceClassification) (DeBERTa-v2 
model) - `DeepseekV2Config` configuration class: `DeepseekV2ForSequenceClassification` (DeepSeek-V2 model) - `DeepseekV3Config` configuration class: `DeepseekV3ForSequenceClassification` (DeepSeek-V3 model) - `DiffLlamaConfig` configuration class: `DiffLlamaForSequenceClassification` (DiffLlama model) - `DistilBertConfig` configuration class: `DistilBertForSequenceClassification` (DistilBERT model) - `DogeConfig` configuration class: `DogeForSequenceClassification` (Doge model) - `ElectraConfig` configuration class: `ElectraForSequenceClassification` (ELECTRA model) - `ErnieConfig` configuration class: `ErnieForSequenceClassification` (ERNIE model) - `ErnieMConfig` configuration class: `ErnieMForSequenceClassification` (ErnieM model) - `EsmConfig` configuration class: `EsmForSequenceClassification` (ESM model) - `Exaone4Config` configuration class: `Exaone4ForSequenceClassification` (EXAONE-4.0 model) - `FNetConfig` configuration class: `FNetForSequenceClassification` (FNet model) - `FalconConfig` configuration class: `FalconForSequenceClassification` (Falcon model) - `FlaubertConfig` configuration class: `FlaubertForSequenceClassification` (FlauBERT model) - `FunnelConfig` configuration class: `FunnelForSequenceClassification` (Funnel Transformer model) - `GPT2Config` configuration class: `GPT2ForSequenceClassification` (OpenAI GPT-2 model) - `GPTBigCodeConfig` configuration class: `GPTBigCodeForSequenceClassification` (GPTBigCode model) - `GPTJConfig` configuration class: `GPTJForSequenceClassification` (GPT-J model) - `GPTNeoConfig` configuration class: `GPTNeoForSequenceClassification` (GPT Neo model) - `GPTNeoXConfig` configuration class: `GPTNeoXForSequenceClassification` (GPT NeoX model) - `Gemma2Config` configuration class: `Gemma2ForSequenceClassification` (Gemma2 model) - `Gemma3Config` configuration class: `Gemma3ForSequenceClassification` (Gemma3ForConditionalGeneration model) - `Gemma3TextConfig` configuration class: 
`Gemma3TextForSequenceClassification` (Gemma3ForCausalLM model) - `GemmaConfig` configuration class: `GemmaForSequenceClassification` (Gemma model) - `Glm4Config` configuration class: `Glm4ForSequenceClassification` (GLM4 model) - `GlmConfig` configuration class: `GlmForSequenceClassification` (GLM model) - `GptOssConfig` configuration class: `GptOssForSequenceClassification` (GptOss model) - `HeliumConfig` configuration class: `HeliumForSequenceClassification` (Helium model) - `HunYuanDenseV1Config` configuration class: `HunYuanDenseV1ForSequenceClassification` (HunYuanDenseV1 model) - `HunYuanMoEV1Config` configuration class: `HunYuanMoEV1ForSequenceClassification` (HunYuanMoeV1 model) - `IBertConfig` configuration class: `IBertForSequenceClassification` (I-BERT model) - `JambaConfig` configuration class: `JambaForSequenceClassification` (Jamba model) - `JetMoeConfig` configuration class: `JetMoeForSequenceClassification` (JetMoe model) - `LEDConfig` configuration class: `LEDForSequenceClassification` (LED model) - `LayoutLMConfig` configuration class: `LayoutLMForSequenceClassification` (LayoutLM model) - `LayoutLMv2Config` configuration class: `LayoutLMv2ForSequenceClassification` (LayoutLMv2 model) - `LayoutLMv3Config` configuration class: `LayoutLMv3ForSequenceClassification` (LayoutLMv3 model) - `LiltConfig` configuration class: `LiltForSequenceClassification` (LiLT model) - `LlamaConfig` configuration class: `LlamaForSequenceClassification` (LLaMA model) - `LongformerConfig` configuration class: `LongformerForSequenceClassification` (Longformer model) - `LukeConfig` configuration class: `LukeForSequenceClassification` (LUKE model) - `MBartConfig` configuration class: `MBartForSequenceClassification` (mBART model) - `MPNetConfig` configuration class: `MPNetForSequenceClassification` (MPNet model) - `MT5Config` configuration class: `MT5ForSequenceClassification` (MT5 model) - `MarkupLMConfig` configuration class: `MarkupLMForSequenceClassification` (MarkupLM 
model) - `MegaConfig` configuration class: `MegaForSequenceClassification` (MEGA model) - `MegatronBertConfig` configuration class: `MegatronBertForSequenceClassification` (Megatron-BERT model) - `MiniMaxConfig` configuration class: `MiniMaxForSequenceClassification` (MiniMax model) - `MinistralConfig` configuration class: `MinistralForSequenceClassification` (Ministral model) - `MistralConfig` configuration class: `MistralForSequenceClassification` (Mistral model) - `MixtralConfig` configuration class: `MixtralForSequenceClassification` (Mixtral model) - `MobileBertConfig` configuration class: `MobileBertForSequenceClassification` (MobileBERT model) - `ModernBertConfig` configuration class: `ModernBertForSequenceClassification` (ModernBERT model) - `ModernBertDecoderConfig` configuration class: `ModernBertDecoderForSequenceClassification` (ModernBertDecoder model) - `MptConfig` configuration class: `MptForSequenceClassification` (MPT model) - `MraConfig` configuration class: `MraForSequenceClassification` (MRA model) - `MvpConfig` configuration class: `MvpForSequenceClassification` (MVP model) - `NemotronConfig` configuration class: `NemotronForSequenceClassification` (Nemotron model) - `NezhaConfig` configuration class: `NezhaForSequenceClassification` (Nezha model) - `NystromformerConfig` configuration class: `NystromformerForSequenceClassification` (Nyströmformer model) - `OPTConfig` configuration class: `OPTForSequenceClassification` (OPT model) - `OpenAIGPTConfig` configuration class: `OpenAIGPTForSequenceClassification` (OpenAI GPT model) - `OpenLlamaConfig` configuration class: `OpenLlamaForSequenceClassification` (OpenLlama model) - `PLBartConfig` configuration class: `PLBartForSequenceClassification` (PLBart model) - `PerceiverConfig` configuration class: `PerceiverForSequenceClassification` (Perceiver model) - `PersimmonConfig` configuration class: `PersimmonForSequenceClassification` (Persimmon model) - `Phi3Config` configuration class: 
`Phi3ForSequenceClassification` (Phi3 model) - `PhiConfig` configuration class: `PhiForSequenceClassification` (Phi model) - `PhimoeConfig` configuration class: `PhimoeForSequenceClassification` (Phimoe model) - `QDQBertConfig` configuration class: `QDQBertForSequenceClassification` (QDQBert model) - `Qwen2Config` configuration class: `Qwen2ForSequenceClassification` (Qwen2 model) - `Qwen2MoeConfig` configuration class: `Qwen2MoeForSequenceClassification` (Qwen2MoE model) - `Qwen3Config` configuration class: `Qwen3ForSequenceClassification` (Qwen3 model) - `Qwen3MoeConfig` configuration class: `Qwen3MoeForSequenceClassification` (Qwen3MoE model) - `Qwen3NextConfig` configuration class: `Qwen3NextForSequenceClassification` (Qwen3Next model) - `ReformerConfig` configuration class: `ReformerForSequenceClassification` (Reformer model) - `RemBertConfig` configuration class: `RemBertForSequenceClassification` (RemBERT model) - `RoCBertConfig` configuration class: `RoCBertForSequenceClassification` (RoCBert model) - `RoFormerConfig` configuration class: `RoFormerForSequenceClassification` (RoFormer model) - `RobertaConfig` configuration class: `RobertaForSequenceClassification` (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForSequenceClassification` (RoBERTa-PreLayerNorm model) - `SeedOssConfig` configuration class: `SeedOssForSequenceClassification` (SeedOss model) - `SmolLM3Config` configuration class: `SmolLM3ForSequenceClassification` (SmolLM3 model) - `SqueezeBertConfig` configuration class: `SqueezeBertForSequenceClassification` (SqueezeBERT model) - `StableLmConfig` configuration class: `StableLmForSequenceClassification` (StableLm model) - `Starcoder2Config` configuration class: `Starcoder2ForSequenceClassification` (Starcoder2 model) - `T5Config` configuration class: `T5ForSequenceClassification` (T5 model) - `T5GemmaConfig` configuration class: `T5GemmaForSequenceClassification` (T5Gemma model) - `TapasConfig` 
configuration class: `TapasForSequenceClassification` (TAPAS model) - `TransfoXLConfig` configuration class: `TransfoXLForSequenceClassification` (Transformer-XL model) - `UMT5Config` configuration class: `UMT5ForSequenceClassification` (UMT5 model) - `XLMConfig` configuration class: `XLMForSequenceClassification` (XLM model) - `XLMRobertaConfig` configuration class: `XLMRobertaForSequenceClassification` (XLM-RoBERTa model) - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForSequenceClassification` (XLM-RoBERTa-XL model) - `XLNetConfig` configuration class: `XLNetForSequenceClassification` (XLNet model) - `XmodConfig` configuration class: `XmodForSequenceClassification` (X-MOD model) - `YosoConfig` configuration class: `YosoForSequenceClassification` (YOSO model) - `Zamba2Config` configuration class: `Zamba2ForSequenceClassification` (Zamba2 model) - `ZambaConfig` configuration class: `ZambaForSequenceClassification` (Zamba model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForSequenceClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a sequence classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [AlbertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertForSequenceClassification) (ALBERT model)
- **arcee** -- `ArceeForSequenceClassification` (Arcee model)
- **bart** -- [BartForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartForSequenceClassification) (BART model)
- **bert** -- [BertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertForSequenceClassification) (BERT model)
- **big_bird** -- [BigBirdForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdForSequenceClassification) (BigBird model)
- **bigbird_pegasus** -- [BigBirdPegasusForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForSequenceClassification) (BigBird-Pegasus model)
- **biogpt** -- [BioGptForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/biogpt#transformers.BioGptForSequenceClassification) (BioGpt model)
- **bloom** -- [BloomForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomForSequenceClassification) (BLOOM model)
- **camembert** -- [CamembertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertForSequenceClassification) (CamemBERT model)
- **canine** -- [CanineForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineForSequenceClassification) (CANINE model)
- **code_llama** -- `LlamaForSequenceClassification` (CodeLlama model)
- **convbert** -- [ConvBertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertForSequenceClassification) (ConvBERT model)
- **ctrl** -- [CTRLForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLForSequenceClassification) (CTRL model)
- **data2vec-text** -- [Data2VecTextForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextForSequenceClassification) (Data2VecText model)
- **deberta** -- [DebertaForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaForSequenceClassification) (DeBERTa model)
- **deberta-v2** -- [DebertaV2ForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2ForSequenceClassification) (DeBERTa-v2 model)
- **deepseek_v2** -- `DeepseekV2ForSequenceClassification` (DeepSeek-V2 model)
- **deepseek_v3** -- `DeepseekV3ForSequenceClassification` (DeepSeek-V3 model)
- **diffllama** -- `DiffLlamaForSequenceClassification` (DiffLlama model)
- **distilbert** -- `DistilBertForSequenceClassification` (DistilBERT model)
- **doge** -- `DogeForSequenceClassification` (Doge model)
- **electra** -- `ElectraForSequenceClassification` (ELECTRA model)
- **ernie** -- `ErnieForSequenceClassification` (ERNIE model)
- **ernie_m** -- `ErnieMForSequenceClassification` (ErnieM model)
- **esm** -- `EsmForSequenceClassification` (ESM model)
- **exaone4** -- `Exaone4ForSequenceClassification` (EXAONE-4.0 model)
- **falcon** -- `FalconForSequenceClassification` (Falcon model)
- **flaubert** -- `FlaubertForSequenceClassification` (FlauBERT model)
- **fnet** -- `FNetForSequenceClassification` (FNet model)
- **funnel** -- `FunnelForSequenceClassification` (Funnel Transformer model)
- **gemma** -- `GemmaForSequenceClassification` (Gemma model)
- **gemma2** -- `Gemma2ForSequenceClassification` (Gemma2 model)
- **gemma3** -- `Gemma3ForSequenceClassification` (Gemma3ForConditionalGeneration model)
- **gemma3_text** -- `Gemma3TextForSequenceClassification` (Gemma3ForCausalLM model)
- **glm** -- `GlmForSequenceClassification` (GLM model)
- **glm4** -- `Glm4ForSequenceClassification` (GLM4 model)
- **gpt-sw3** -- `GPT2ForSequenceClassification` (GPT-Sw3 model)
- **gpt2** -- `GPT2ForSequenceClassification` (OpenAI GPT-2 model)
- **gpt_bigcode** -- `GPTBigCodeForSequenceClassification` (GPTBigCode model)
- **gpt_neo** -- `GPTNeoForSequenceClassification` (GPT Neo model)
- **gpt_neox** -- `GPTNeoXForSequenceClassification` (GPT NeoX model)
- **gpt_oss** -- `GptOssForSequenceClassification` (GptOss model)
- **gptj** -- `GPTJForSequenceClassification` (GPT-J model)
- **helium** -- `HeliumForSequenceClassification` (Helium model)
- **hunyuan_v1_dense** -- `HunYuanDenseV1ForSequenceClassification` (HunYuanDenseV1 model)
- **hunyuan_v1_moe** -- `HunYuanMoEV1ForSequenceClassification` (HunYuanMoeV1 model)
- **ibert** -- `IBertForSequenceClassification` (I-BERT model)
- **jamba** -- `JambaForSequenceClassification` (Jamba model)
- **jetmoe** -- `JetMoeForSequenceClassification` (JetMoe model)
- **layoutlm** -- `LayoutLMForSequenceClassification` (LayoutLM model)
- **layoutlmv2** -- `LayoutLMv2ForSequenceClassification` (LayoutLMv2 model)
- **layoutlmv3** -- `LayoutLMv3ForSequenceClassification` (LayoutLMv3 model)
- **led** -- `LEDForSequenceClassification` (LED model)
- **lilt** -- `LiltForSequenceClassification` (LiLT model)
- **llama** -- `LlamaForSequenceClassification` (LLaMA model)
- **longformer** -- `LongformerForSequenceClassification` (Longformer model)
- **luke** -- `LukeForSequenceClassification` (LUKE model)
- **markuplm** -- `MarkupLMForSequenceClassification` (MarkupLM model)
- **mbart** -- `MBartForSequenceClassification` (mBART model)
- **mega** -- `MegaForSequenceClassification` (MEGA model)
- **megatron-bert** -- `MegatronBertForSequenceClassification` (Megatron-BERT model)
- **minimax** -- `MiniMaxForSequenceClassification` (MiniMax model)
- **ministral** -- `MinistralForSequenceClassification` (Ministral model)
- **mistral** -- `MistralForSequenceClassification` (Mistral model)
- **mixtral** -- `MixtralForSequenceClassification` (Mixtral model)
- **mobilebert** -- `MobileBertForSequenceClassification` (MobileBERT model)
- **modernbert** -- `ModernBertForSequenceClassification` (ModernBERT model)
- **modernbert-decoder** -- `ModernBertDecoderForSequenceClassification` (ModernBertDecoder model)
- **mpnet** -- `MPNetForSequenceClassification` (MPNet model)
- **mpt** -- `MptForSequenceClassification` (MPT model)
- **mra** -- `MraForSequenceClassification` (MRA model)
- **mt5** -- `MT5ForSequenceClassification` (MT5 model)
- **mvp** -- `MvpForSequenceClassification` (MVP model)
- **nemotron** -- `NemotronForSequenceClassification` (Nemotron model)
- **nezha** -- `NezhaForSequenceClassification` (Nezha model)
- **nystromformer** -- `NystromformerForSequenceClassification` (Nyströmformer model)
- **open-llama** -- `OpenLlamaForSequenceClassification` (OpenLlama model)
- **openai-gpt** -- `OpenAIGPTForSequenceClassification` (OpenAI GPT model)
- **opt** -- `OPTForSequenceClassification` (OPT model)
- **perceiver** -- `PerceiverForSequenceClassification` (Perceiver model)
- **persimmon** -- `PersimmonForSequenceClassification` (Persimmon model)
- **phi** -- `PhiForSequenceClassification` (Phi model)
- **phi3** -- `Phi3ForSequenceClassification` (Phi3 model)
- **phimoe** -- `PhimoeForSequenceClassification` (Phimoe model)
- **plbart** -- `PLBartForSequenceClassification` (PLBart model)
- **qdqbert** -- `QDQBertForSequenceClassification` (QDQBert model)
- **qwen2** -- `Qwen2ForSequenceClassification` (Qwen2 model)
- **qwen2_moe** -- `Qwen2MoeForSequenceClassification` (Qwen2MoE model)
- **qwen3** -- `Qwen3ForSequenceClassification` (Qwen3 model)
- **qwen3_moe** -- `Qwen3MoeForSequenceClassification` (Qwen3MoE model)
- **qwen3_next** -- `Qwen3NextForSequenceClassification` (Qwen3Next model)
- **reformer** -- `ReformerForSequenceClassification` (Reformer model)
- **rembert** -- `RemBertForSequenceClassification` (RemBERT model)
- **roberta** -- `RobertaForSequenceClassification` (RoBERTa model)
- **roberta-prelayernorm** -- `RobertaPreLayerNormForSequenceClassification` (RoBERTa-PreLayerNorm model)
- **roc_bert** -- `RoCBertForSequenceClassification` (RoCBert model)
- **roformer** -- `RoFormerForSequenceClassification` (RoFormer model)
- **seed_oss** -- `SeedOssForSequenceClassification` (SeedOss model)
- **smollm3** -- `SmolLM3ForSequenceClassification` (SmolLM3 model)
- **squeezebert** -- `SqueezeBertForSequenceClassification` (SqueezeBERT model)
- **stablelm** -- `StableLmForSequenceClassification` (StableLm model)
- **starcoder2** -- `Starcoder2ForSequenceClassification` (Starcoder2 model)
- **t5** -- `T5ForSequenceClassification` (T5 model)
- **t5gemma** -- `T5GemmaForSequenceClassification` (T5Gemma model)
- **tapas** -- `TapasForSequenceClassification` (TAPAS model)
- **transfo-xl** -- `TransfoXLForSequenceClassification` (Transformer-XL model)
- **umt5** -- `UMT5ForSequenceClassification` (UMT5 model)
- **xlm** -- `XLMForSequenceClassification` (XLM model)
- **xlm-roberta** -- `XLMRobertaForSequenceClassification` (XLM-RoBERTa model)
- **xlm-roberta-xl** -- `XLMRobertaXLForSequenceClassification` (XLM-RoBERTa-XL model)
- **xlnet** -- `XLNetForSequenceClassification` (XLNet model)
- **xmod** -- `XmodForSequenceClassification` (X-MOD model)
- **yoso** -- `YosoForSequenceClassification` (YOSO model)
- **zamba** -- `ZambaForSequenceClassification` (Zamba model)
- **zamba2** -- `Zamba2ForSequenceClassification` (Zamba2 model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSequenceClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForSequenceClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.
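As a minimal offline sketch of the `state_dict` option (a tiny hypothetical model saved to a hypothetical local directory), a custom state dictionary can replace the weights stored on disk:

```python
import torch
from transformers import AutoModelForSequenceClassification, BertConfig, BertForSequenceClassification

# Save a tiny randomly initialized model locally (hypothetical directory name).
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64)
base = BertForSequenceClassification(config)
base.save_pretrained("./tiny-bert-sd")

# Supply a modified state dict instead of the weights saved on disk:
# here the classification head is zeroed out before loading.
state_dict = base.state_dict()
state_dict["classifier.weight"] = torch.zeros_like(state_dict["classifier.weight"])
model = AutoModelForSequenceClassification.from_pretrained("./tiny-bert-sd", state_dict=state_dict)
```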

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
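As a minimal sketch of the no-`config` path (a tiny hypothetical checkpoint saved to a hypothetical local directory): a kwarg whose name matches a configuration attribute overrides that attribute before the model is built, rather than being passed to `__init__` directly:

```python
from transformers import AutoModelForSequenceClassification, BertConfig, BertForSequenceClassification

# Save a tiny randomly initialized model locally (hypothetical directory name).
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64)
BertForSequenceClassification(config).save_pretrained("./tiny-bert-kwargs")

# `hidden_dropout_prob` matches a BertConfig attribute, so it updates the
# automatically loaded configuration before the model is instantiated.
model = AutoModelForSequenceClassification.from_pretrained(
    "./tiny-bert-kwargs", hidden_dropout_prob=0.2
)
```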

### TFAutoModelForSequenceClassification[[transformers.TFAutoModelForSequenceClassification]]

#### transformers.TFAutoModelForSequenceClassification[[transformers.TFAutoModelForSequenceClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L637)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForSequenceClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [TFAlbertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertForSequenceClassification) (ALBERT model)
  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [TFBartForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.TFBartForSequenceClassification) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForSequenceClassification) (BERT model)
  - [CTRLConfig](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLConfig) configuration class: [TFCTRLForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.TFCTRLForSequenceClassification) (CTRL model)
  - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [TFCamembertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForSequenceClassification) (CamemBERT model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.TFConvBertForSequenceClassification) (ConvBERT model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) configuration class: [TFDebertaForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.TFDebertaForSequenceClassification) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2ForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.TFDebertaV2ForSequenceClassification) (DeBERTa-v2 model)
  - `DistilBertConfig` configuration class: `TFDistilBertForSequenceClassification` (DistilBERT model)
  - `ElectraConfig` configuration class: `TFElectraForSequenceClassification` (ELECTRA model)
  - `EsmConfig` configuration class: `TFEsmForSequenceClassification` (ESM model)
  - `FlaubertConfig` configuration class: `TFFlaubertForSequenceClassification` (FlauBERT model)
  - `FunnelConfig` configuration class: `TFFunnelForSequenceClassification` (Funnel Transformer model)
  - `GPT2Config` configuration class: `TFGPT2ForSequenceClassification` (OpenAI GPT-2 model)
  - `GPTJConfig` configuration class: `TFGPTJForSequenceClassification` (GPT-J model)
  - `LayoutLMConfig` configuration class: `TFLayoutLMForSequenceClassification` (LayoutLM model)
  - `LayoutLMv3Config` configuration class: `TFLayoutLMv3ForSequenceClassification` (LayoutLMv3 model)
  - `LongformerConfig` configuration class: `TFLongformerForSequenceClassification` (Longformer model)
  - `MPNetConfig` configuration class: `TFMPNetForSequenceClassification` (MPNet model)
  - `MistralConfig` configuration class: `TFMistralForSequenceClassification` (Mistral model)
  - `MobileBertConfig` configuration class: `TFMobileBertForSequenceClassification` (MobileBERT model)
  - `OpenAIGPTConfig` configuration class: `TFOpenAIGPTForSequenceClassification` (OpenAI GPT model)
  - `RemBertConfig` configuration class: `TFRemBertForSequenceClassification` (RemBERT model)
  - `RoFormerConfig` configuration class: `TFRoFormerForSequenceClassification` (RoFormer model)
  - `RobertaConfig` configuration class: `TFRobertaForSequenceClassification` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForSequenceClassification` (RoBERTa-PreLayerNorm model)
  - `TapasConfig` configuration class: `TFTapasForSequenceClassification` (TAPAS model)
  - `TransfoXLConfig` configuration class: `TFTransfoXLForSequenceClassification` (Transformer-XL model)
  - `XLMConfig` configuration class: `TFXLMForSequenceClassification` (XLM model)
  - `XLMRobertaConfig` configuration class: `TFXLMRobertaForSequenceClassification` (XLM-RoBERTa model)
  - `XLNetConfig` configuration class: `TFXLNetForSequenceClassification` (XLNet model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForSequenceClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForSequenceClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [TFAlbertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertForSequenceClassification) (ALBERT model) - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [TFBartForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.TFBartForSequenceClassification) (BART model) - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForSequenceClassification) (BERT model) - [CTRLConfig](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.CTRLConfig) configuration class: [TFCTRLForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.TFCTRLForSequenceClassification) (CTRL model) - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [TFCamembertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForSequenceClassification) (CamemBERT model) - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.TFConvBertForSequenceClassification) (ConvBERT model) - [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) configuration class: [TFDebertaForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.TFDebertaForSequenceClassification) (DeBERTa 
model) - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2ForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.TFDebertaV2ForSequenceClassification) (DeBERTa-v2 model) - `DistilBertConfig` configuration class: `TFDistilBertForSequenceClassification` (DistilBERT model) - `ElectraConfig` configuration class: `TFElectraForSequenceClassification` (ELECTRA model) - `EsmConfig` configuration class: `TFEsmForSequenceClassification` (ESM model) - `FlaubertConfig` configuration class: `TFFlaubertForSequenceClassification` (FlauBERT model) - `FunnelConfig` configuration class: `TFFunnelForSequenceClassification` (Funnel Transformer model) - `GPT2Config` configuration class: `TFGPT2ForSequenceClassification` (OpenAI GPT-2 model) - `GPTJConfig` configuration class: `TFGPTJForSequenceClassification` (GPT-J model) - `LayoutLMConfig` configuration class: `TFLayoutLMForSequenceClassification` (LayoutLM model) - `LayoutLMv3Config` configuration class: `TFLayoutLMv3ForSequenceClassification` (LayoutLMv3 model) - `LongformerConfig` configuration class: `TFLongformerForSequenceClassification` (Longformer model) - `MPNetConfig` configuration class: `TFMPNetForSequenceClassification` (MPNet model) - `MistralConfig` configuration class: `TFMistralForSequenceClassification` (Mistral model) - `MobileBertConfig` configuration class: `TFMobileBertForSequenceClassification` (MobileBERT model) - `OpenAIGPTConfig` configuration class: `TFOpenAIGPTForSequenceClassification` (OpenAI GPT model) - `RemBertConfig` configuration class: `TFRemBertForSequenceClassification` (RemBERT model) - `RoFormerConfig` configuration class: `TFRoFormerForSequenceClassification` (RoFormer model) - `RobertaConfig` configuration class: `TFRobertaForSequenceClassification` (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForSequenceClassification` 
(RoBERTa-PreLayerNorm model) - `TapasConfig` configuration class: `TFTapasForSequenceClassification` (TAPAS model) - `TransfoXLConfig` configuration class: `TFTransfoXLForSequenceClassification` (Transformer-XL model) - `XLMConfig` configuration class: `TFXLMForSequenceClassification` (XLM model) - `XLMRobertaConfig` configuration class: `TFXLMRobertaForSequenceClassification` (XLM-RoBERTa model) - `XLNetConfig` configuration class: `TFXLNetForSequenceClassification` (XLNet model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
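The default described above (SDPA when available on torch>=2.1.1, otherwise eager) can be sketched as follows. This is a simplified illustration only; `pick_attn_implementation`, `torch_version`, and `sdpa_supported` are hypothetical stand-ins, not transformers internals.

```python
def pick_attn_implementation(requested=None, torch_version=(2, 1, 1), sdpa_supported=True):
    """Return one of "eager", "sdpa", or "flash_attention_2" (simplified sketch)."""
    allowed = {"eager", "sdpa", "flash_attention_2"}
    if requested is not None:
        if requested not in allowed:
            raise ValueError(f"unknown attention implementation: {requested!r}")
        return requested  # an explicit choice always wins
    # Default: SDPA when supported on torch>=2.1.1, otherwise manual eager.
    if sdpa_supported and torch_version >= (2, 1, 1):
        return "sdpa"
    return "eager"
```

An explicit `attn_implementation="flash_attention_2"` bypasses the default entirely, which is why passing it on hardware without flash-attention installed raises an error in practice.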
#### from_pretrained[[transformers.TFAutoModelForSequenceClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a sequence classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [TFAlbertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertForSequenceClassification) (ALBERT model)
- **bart** -- [TFBartForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.TFBartForSequenceClassification) (BART model)
- **bert** -- [TFBertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForSequenceClassification) (BERT model)
- **camembert** -- [TFCamembertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForSequenceClassification) (CamemBERT model)
- **convbert** -- [TFConvBertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.TFConvBertForSequenceClassification) (ConvBERT model)
- **ctrl** -- [TFCTRLForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/ctrl#transformers.TFCTRLForSequenceClassification) (CTRL model)
- **deberta** -- [TFDebertaForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.TFDebertaForSequenceClassification) (DeBERTa model)
- **deberta-v2** -- [TFDebertaV2ForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.TFDebertaV2ForSequenceClassification) (DeBERTa-v2 model)
- **distilbert** -- `TFDistilBertForSequenceClassification` (DistilBERT model)
- **electra** -- `TFElectraForSequenceClassification` (ELECTRA model)
- **esm** -- `TFEsmForSequenceClassification` (ESM model)
- **flaubert** -- `TFFlaubertForSequenceClassification` (FlauBERT model)
- **funnel** -- `TFFunnelForSequenceClassification` (Funnel Transformer model)
- **gpt-sw3** -- `TFGPT2ForSequenceClassification` (GPT-Sw3 model)
- **gpt2** -- `TFGPT2ForSequenceClassification` (OpenAI GPT-2 model)
- **gptj** -- `TFGPTJForSequenceClassification` (GPT-J model)
- **layoutlm** -- `TFLayoutLMForSequenceClassification` (LayoutLM model)
- **layoutlmv3** -- `TFLayoutLMv3ForSequenceClassification` (LayoutLMv3 model)
- **longformer** -- `TFLongformerForSequenceClassification` (Longformer model)
- **mistral** -- `TFMistralForSequenceClassification` (Mistral model)
- **mobilebert** -- `TFMobileBertForSequenceClassification` (MobileBERT model)
- **mpnet** -- `TFMPNetForSequenceClassification` (MPNet model)
- **openai-gpt** -- `TFOpenAIGPTForSequenceClassification` (OpenAI GPT model)
- **rembert** -- `TFRemBertForSequenceClassification` (RemBERT model)
- **roberta** -- `TFRobertaForSequenceClassification` (RoBERTa model)
- **roberta-prelayernorm** -- `TFRobertaPreLayerNormForSequenceClassification` (RoBERTa-PreLayerNorm model)
- **roformer** -- `TFRoFormerForSequenceClassification` (RoFormer model)
- **tapas** -- `TFTapasForSequenceClassification` (TAPAS model)
- **transfo-xl** -- `TFTransfoXLForSequenceClassification` (Transformer-XL model)
- **xlm** -- `TFXLMForSequenceClassification` (XLM model)
- **xlm-roberta** -- `TFXLMRobertaForSequenceClassification` (XLM-RoBERTa model)
- **xlnet** -- `TFXLNetForSequenceClassification` (XLNet model)
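The selection logic above — prefer the config's `model_type`, then fall back to pattern matching on the name or path — can be sketched roughly as follows. This is a simplified illustration with a tiny subset of the mapping, not the library's actual implementation:

```python
# Illustrative subset of the model_type -> class mapping listed above.
MODEL_MAPPING = {
    "bert": "TFBertForSequenceClassification",
    "distilbert": "TFDistilBertForSequenceClassification",
    "roberta": "TFRobertaForSequenceClassification",
    "xlm-roberta": "TFXLMRobertaForSequenceClassification",
}

def resolve_model_class(name_or_path, model_type=None):
    """Pick a model class from an explicit model_type, else by name matching."""
    if model_type is not None:
        return MODEL_MAPPING[model_type]
    # Longest keys first, so "xlm-roberta" wins over "roberta" and
    # "distilbert" over "bert" when both appear in the name.
    for key in sorted(MODEL_MAPPING, key=len, reverse=True):
        if key in name_or_path.lower():
            return MODEL_MAPPING[key]
    raise ValueError(f"could not infer model type from {name_or_path!r}")
```

This is why a checkpoint whose config carries a `model_type` loads reliably regardless of its repo name, while pattern matching is only a best-effort fallback.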

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForSequenceClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForSequenceClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
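The second kwargs path — keys matching configuration attributes update the config, the rest reach the model's `__init__` — can be sketched as below. The helper name and shapes are hypothetical; the real splitting happens inside the library's loading code.

```python
def split_kwargs(config_attrs, kwargs):
    """Split kwargs into config overrides vs. leftover model-__init__ kwargs."""
    config_updates, model_kwargs = {}, {}
    for key, value in kwargs.items():
        # Keys that name a known configuration attribute override the config;
        # everything else is forwarded to the model constructor.
        (config_updates if key in config_attrs else model_kwargs)[key] = value
    return config_updates, model_kwargs
```

So `from_pretrained("...", output_attentions=True)` updates the auto-loaded config, whereas the same call with an explicit `config=` passes `output_attentions` straight to the model instead.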

### FlaxAutoModelForSequenceClassification[[transformers.FlaxAutoModelForSequenceClassification]]

#### transformers.FlaxAutoModelForSequenceClassification[[transformers.FlaxAutoModelForSequenceClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L320)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForSequenceClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [FlaxAlbertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertForSequenceClassification) (ALBERT model)
  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.FlaxBartForSequenceClassification) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForSequenceClassification) (BERT model)
  - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [FlaxBigBirdForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForSequenceClassification) (BigBird model)
  - `DistilBertConfig` configuration class: `FlaxDistilBertForSequenceClassification` (DistilBERT model)
  - `ElectraConfig` configuration class: `FlaxElectraForSequenceClassification` (ELECTRA model)
  - `MBartConfig` configuration class: `FlaxMBartForSequenceClassification` (mBART model)
  - `RoFormerConfig` configuration class: `FlaxRoFormerForSequenceClassification` (RoFormer model)
  - `RobertaConfig` configuration class: `FlaxRobertaForSequenceClassification` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForSequenceClassification` (RoBERTa-PreLayerNorm model)
  - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForSequenceClassification` (XLM-RoBERTa model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForSequenceClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForSequenceClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [FlaxAlbertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertForSequenceClassification) (ALBERT model) - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.FlaxBartForSequenceClassification) (BART model) - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForSequenceClassification) (BERT model) - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [FlaxBigBirdForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForSequenceClassification) (BigBird model) - `DistilBertConfig` configuration class: `FlaxDistilBertForSequenceClassification` (DistilBERT model) - `ElectraConfig` configuration class: `FlaxElectraForSequenceClassification` (ELECTRA model) - `MBartConfig` configuration class: `FlaxMBartForSequenceClassification` (mBART model) - `RoFormerConfig` configuration class: `FlaxRoFormerForSequenceClassification` (RoFormer model) - `RobertaConfig` configuration class: `FlaxRobertaForSequenceClassification` (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForSequenceClassification` (RoBERTa-PreLayerNorm model) - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForSequenceClassification` (XLM-RoBERTa model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.FlaxAutoModelForSequenceClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a sequence classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [FlaxAlbertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertForSequenceClassification) (ALBERT model)
- **bart** -- [FlaxBartForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.FlaxBartForSequenceClassification) (BART model)
- **bert** -- [FlaxBertForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForSequenceClassification) (BERT model)
- **big_bird** -- [FlaxBigBirdForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForSequenceClassification) (BigBird model)
- **distilbert** -- `FlaxDistilBertForSequenceClassification` (DistilBERT model)
- **electra** -- `FlaxElectraForSequenceClassification` (ELECTRA model)
- **mbart** -- `FlaxMBartForSequenceClassification` (mBART model)
- **roberta** -- `FlaxRobertaForSequenceClassification` (RoBERTa model)
- **roberta-prelayernorm** -- `FlaxRobertaPreLayerNormForSequenceClassification` (RoBERTa-PreLayerNorm model)
- **roformer** -- `FlaxRoFormerForSequenceClassification` (RoFormer model)
- **xlm-roberta** -- `FlaxXLMRobertaForSequenceClassification` (XLM-RoBERTa model)

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForSequenceClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForSequenceClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForMultipleChoice[[transformers.AutoModelForMultipleChoice]]

#### transformers.AutoModelForMultipleChoice[[transformers.AutoModelForMultipleChoice]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2053)

This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForMultipleChoice.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [AlbertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertForMultipleChoice) (ALBERT model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [BertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertForMultipleChoice) (BERT model)
  - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdForMultipleChoice) (BigBird model)
  - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertForMultipleChoice) (CamemBERT model)
  - [CanineConfig](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineConfig) configuration class: [CanineForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineForMultipleChoice) (CANINE model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertForMultipleChoice) (ConvBERT model)
  - [Data2VecTextConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextForMultipleChoice) (Data2VecText model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2ForMultipleChoice) (DeBERTa-v2 model)
  - `DistilBertConfig` configuration class: `DistilBertForMultipleChoice` (DistilBERT model)
  - `ElectraConfig` configuration class: `ElectraForMultipleChoice` (ELECTRA model)
  - `ErnieConfig` configuration class: `ErnieForMultipleChoice` (ERNIE model)
  - `ErnieMConfig` configuration class: `ErnieMForMultipleChoice` (ErnieM model)
  - `FNetConfig` configuration class: `FNetForMultipleChoice` (FNet model)
  - `FlaubertConfig` configuration class: `FlaubertForMultipleChoice` (FlauBERT model)
  - `FunnelConfig` configuration class: `FunnelForMultipleChoice` (Funnel Transformer model)
  - `IBertConfig` configuration class: `IBertForMultipleChoice` (I-BERT model)
  - `LongformerConfig` configuration class: `LongformerForMultipleChoice` (Longformer model)
  - `LukeConfig` configuration class: `LukeForMultipleChoice` (LUKE model)
  - `MPNetConfig` configuration class: `MPNetForMultipleChoice` (MPNet model)
  - `MegaConfig` configuration class: `MegaForMultipleChoice` (MEGA model)
  - `MegatronBertConfig` configuration class: `MegatronBertForMultipleChoice` (Megatron-BERT model)
  - `MobileBertConfig` configuration class: `MobileBertForMultipleChoice` (MobileBERT model)
  - `ModernBertConfig` configuration class: `ModernBertForMultipleChoice` (ModernBERT model)
  - `MraConfig` configuration class: `MraForMultipleChoice` (MRA model)
  - `NezhaConfig` configuration class: `NezhaForMultipleChoice` (Nezha model)
  - `NystromformerConfig` configuration class: `NystromformerForMultipleChoice` (Nyströmformer model)
  - `QDQBertConfig` configuration class: `QDQBertForMultipleChoice` (QDQBert model)
  - `RemBertConfig` configuration class: `RemBertForMultipleChoice` (RemBERT model)
  - `RoCBertConfig` configuration class: `RoCBertForMultipleChoice` (RoCBert model)
  - `RoFormerConfig` configuration class: `RoFormerForMultipleChoice` (RoFormer model)
  - `RobertaConfig` configuration class: `RobertaForMultipleChoice` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForMultipleChoice` (RoBERTa-PreLayerNorm model)
  - `SqueezeBertConfig` configuration class: `SqueezeBertForMultipleChoice` (SqueezeBERT model)
  - `XLMConfig` configuration class: `XLMForMultipleChoice` (XLM model)
  - `XLMRobertaConfig` configuration class: `XLMRobertaForMultipleChoice` (XLM-RoBERTa model)
  - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForMultipleChoice` (XLM-RoBERTa-XL model)
  - `XLNetConfig` configuration class: `XLNetForMultipleChoice` (XLNet model)
  - `XmodConfig` configuration class: `XmodForMultipleChoice` (X-MOD model)
  - `YosoConfig` configuration class: `YosoForMultipleChoice` (YOSO model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForMultipleChoice

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForMultipleChoice.from_config(config)
```
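To make the dispatch concrete, here is a small self-contained sketch (it builds deliberately tiny, randomly initialized configs so nothing is downloaded; the sizes are arbitrary): the configuration class alone determines which architecture `from_config()` returns.

```python
from transformers import (
    AutoModelForMultipleChoice,
    BertConfig,
    BertForMultipleChoice,
    RobertaConfig,
    RobertaForMultipleChoice,
)

# Tiny configs so instantiation is fast; the weights are random, not pretrained.
tiny = dict(vocab_size=100, hidden_size=32, num_hidden_layers=1,
            num_attention_heads=2, intermediate_size=64)

# The configuration class alone selects the architecture.
bert_model = AutoModelForMultipleChoice.from_config(BertConfig(**tiny))
roberta_model = AutoModelForMultipleChoice.from_config(RobertaConfig(**tiny))

assert isinstance(bert_model, BertForMultipleChoice)
assert isinstance(roberta_model, RobertaForMultipleChoice)
```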

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [AlbertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertForMultipleChoice) (ALBERT model) - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [BertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertForMultipleChoice) (BERT model) - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdForMultipleChoice) (BigBird model) - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertForMultipleChoice) (CamemBERT model) - [CanineConfig](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineConfig) configuration class: [CanineForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineForMultipleChoice) (CANINE model) - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertForMultipleChoice) (ConvBERT model) - [Data2VecTextConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextForMultipleChoice) (Data2VecText model) - 
[DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2ForMultipleChoice) (DeBERTa-v2 model) - `DistilBertConfig` configuration class: `DistilBertForMultipleChoice` (DistilBERT model) - `ElectraConfig` configuration class: `ElectraForMultipleChoice` (ELECTRA model) - `ErnieConfig` configuration class: `ErnieForMultipleChoice` (ERNIE model) - `ErnieMConfig` configuration class: `ErnieMForMultipleChoice` (ErnieM model) - `FNetConfig` configuration class: `FNetForMultipleChoice` (FNet model) - `FlaubertConfig` configuration class: `FlaubertForMultipleChoice` (FlauBERT model) - `FunnelConfig` configuration class: `FunnelForMultipleChoice` (Funnel Transformer model) - `IBertConfig` configuration class: `IBertForMultipleChoice` (I-BERT model) - `LongformerConfig` configuration class: `LongformerForMultipleChoice` (Longformer model) - `LukeConfig` configuration class: `LukeForMultipleChoice` (LUKE model) - `MPNetConfig` configuration class: `MPNetForMultipleChoice` (MPNet model) - `MegaConfig` configuration class: `MegaForMultipleChoice` (MEGA model) - `MegatronBertConfig` configuration class: `MegatronBertForMultipleChoice` (Megatron-BERT model) - `MobileBertConfig` configuration class: `MobileBertForMultipleChoice` (MobileBERT model) - `ModernBertConfig` configuration class: `ModernBertForMultipleChoice` (ModernBERT model) - `MraConfig` configuration class: `MraForMultipleChoice` (MRA model) - `NezhaConfig` configuration class: `NezhaForMultipleChoice` (Nezha model) - `NystromformerConfig` configuration class: `NystromformerForMultipleChoice` (Nyströmformer model) - `QDQBertConfig` configuration class: `QDQBertForMultipleChoice` (QDQBert model) - `RemBertConfig` configuration class: `RemBertForMultipleChoice` (RemBERT model) - `RoCBertConfig` configuration class: `RoCBertForMultipleChoice` (RoCBert model) 
- `RoFormerConfig` configuration class: `RoFormerForMultipleChoice` (RoFormer model) - `RobertaConfig` configuration class: `RobertaForMultipleChoice` (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForMultipleChoice` (RoBERTa-PreLayerNorm model) - `SqueezeBertConfig` configuration class: `SqueezeBertForMultipleChoice` (SqueezeBERT model) - `XLMConfig` configuration class: `XLMForMultipleChoice` (XLM model) - `XLMRobertaConfig` configuration class: `XLMRobertaForMultipleChoice` (XLM-RoBERTa model) - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForMultipleChoice` (XLM-RoBERTa-XL model) - `XLNetConfig` configuration class: `XLNetForMultipleChoice` (XLNet model) - `XmodConfig` configuration class: `XmodForMultipleChoice` (X-MOD model) - `YosoConfig` configuration class: `YosoForMultipleChoice` (YOSO model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForMultipleChoice.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a multiple choice head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [AlbertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertForMultipleChoice) (ALBERT model)
- **bert** -- [BertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertForMultipleChoice) (BERT model)
- **big_bird** -- [BigBirdForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdForMultipleChoice) (BigBird model)
- **camembert** -- [CamembertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertForMultipleChoice) (CamemBERT model)
- **canine** -- [CanineForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineForMultipleChoice) (CANINE model)
- **convbert** -- [ConvBertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertForMultipleChoice) (ConvBERT model)
- **data2vec-text** -- [Data2VecTextForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextForMultipleChoice) (Data2VecText model)
- **deberta-v2** -- [DebertaV2ForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2ForMultipleChoice) (DeBERTa-v2 model)
- **distilbert** -- `DistilBertForMultipleChoice` (DistilBERT model)
- **electra** -- `ElectraForMultipleChoice` (ELECTRA model)
- **ernie** -- `ErnieForMultipleChoice` (ERNIE model)
- **ernie_m** -- `ErnieMForMultipleChoice` (ErnieM model)
- **flaubert** -- `FlaubertForMultipleChoice` (FlauBERT model)
- **fnet** -- `FNetForMultipleChoice` (FNet model)
- **funnel** -- `FunnelForMultipleChoice` (Funnel Transformer model)
- **ibert** -- `IBertForMultipleChoice` (I-BERT model)
- **longformer** -- `LongformerForMultipleChoice` (Longformer model)
- **luke** -- `LukeForMultipleChoice` (LUKE model)
- **mega** -- `MegaForMultipleChoice` (MEGA model)
- **megatron-bert** -- `MegatronBertForMultipleChoice` (Megatron-BERT model)
- **mobilebert** -- `MobileBertForMultipleChoice` (MobileBERT model)
- **modernbert** -- `ModernBertForMultipleChoice` (ModernBERT model)
- **mpnet** -- `MPNetForMultipleChoice` (MPNet model)
- **mra** -- `MraForMultipleChoice` (MRA model)
- **nezha** -- `NezhaForMultipleChoice` (Nezha model)
- **nystromformer** -- `NystromformerForMultipleChoice` (Nyströmformer model)
- **qdqbert** -- `QDQBertForMultipleChoice` (QDQBert model)
- **rembert** -- `RemBertForMultipleChoice` (RemBERT model)
- **roberta** -- `RobertaForMultipleChoice` (RoBERTa model)
- **roberta-prelayernorm** -- `RobertaPreLayerNormForMultipleChoice` (RoBERTa-PreLayerNorm model)
- **roc_bert** -- `RoCBertForMultipleChoice` (RoCBert model)
- **roformer** -- `RoFormerForMultipleChoice` (RoFormer model)
- **squeezebert** -- `SqueezeBertForMultipleChoice` (SqueezeBERT model)
- **xlm** -- `XLMForMultipleChoice` (XLM model)
- **xlm-roberta** -- `XLMRobertaForMultipleChoice` (XLM-RoBERTa model)
- **xlm-roberta-xl** -- `XLMRobertaXLForMultipleChoice` (XLM-RoBERTa-XL model)
- **xlnet** -- `XLNetForMultipleChoice` (XLNet model)
- **xmod** -- `XmodForMultipleChoice` (X-MOD model)
- **yoso** -- `YosoForMultipleChoice` (YOSO model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForMultipleChoice

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForMultipleChoice.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```
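The note above about evaluation mode, together with the shape contract of a multiple-choice head, can be sketched with a tiny randomly initialized model (no download involved; the sizes below are arbitrary):

```python
import torch
from transformers import AutoModelForMultipleChoice, BertConfig

config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64)
model = AutoModelForMultipleChoice.from_config(config)

model.eval()  # from_pretrained() does this for you; dropout is now disabled
assert not model.training

# Multiple-choice inputs are shaped (batch, num_choices, seq_len);
# the head emits one logit per choice: (batch, num_choices).
input_ids = torch.randint(0, 100, (2, 4, 8))
with torch.no_grad():
    logits = model(input_ids=input_ids).logits
assert logits.shape == (2, 4)

model.train()  # switch back before fine-tuning
assert model.training
```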

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
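As a sketch of the second behavior (no `config` argument, so `kwargs` first update the automatically loaded configuration), using a tiny randomly initialized model saved to a temporary directory in place of a Hub checkpoint:

```python
import tempfile
from transformers import AutoModelForMultipleChoice, BertConfig

# A tiny randomly initialized model stands in for a real checkpoint.
model = AutoModelForMultipleChoice.from_config(
    BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=1,
               num_attention_heads=2, intermediate_size=64)
)

with tempfile.TemporaryDirectory() as tmp:
    model.save_pretrained(tmp)
    # No `config` argument is passed, so `output_attentions=True` first
    # overrides that attribute on the configuration loaded from `tmp`,
    # then the model is built from the updated configuration.
    reloaded = AutoModelForMultipleChoice.from_pretrained(tmp, output_attentions=True)

assert reloaded.config.output_attentions
```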

### TFAutoModelForMultipleChoice[[transformers.TFAutoModelForMultipleChoice]]

#### transformers.TFAutoModelForMultipleChoice[[transformers.TFAutoModelForMultipleChoice]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L684)

This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForMultipleChoice.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [TFAlbertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertForMultipleChoice) (ALBERT model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForMultipleChoice) (BERT model)
  - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [TFCamembertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForMultipleChoice) (CamemBERT model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.TFConvBertForMultipleChoice) (ConvBERT model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2ForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.TFDebertaV2ForMultipleChoice) (DeBERTa-v2 model)
  - `DistilBertConfig` configuration class: `TFDistilBertForMultipleChoice` (DistilBERT model)
  - `ElectraConfig` configuration class: `TFElectraForMultipleChoice` (ELECTRA model)
  - `FlaubertConfig` configuration class: `TFFlaubertForMultipleChoice` (FlauBERT model)
  - `FunnelConfig` configuration class: `TFFunnelForMultipleChoice` (Funnel Transformer model)
  - `LongformerConfig` configuration class: `TFLongformerForMultipleChoice` (Longformer model)
  - `MPNetConfig` configuration class: `TFMPNetForMultipleChoice` (MPNet model)
  - `MobileBertConfig` configuration class: `TFMobileBertForMultipleChoice` (MobileBERT model)
  - `RemBertConfig` configuration class: `TFRemBertForMultipleChoice` (RemBERT model)
  - `RoFormerConfig` configuration class: `TFRoFormerForMultipleChoice` (RoFormer model)
  - `RobertaConfig` configuration class: `TFRobertaForMultipleChoice` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForMultipleChoice` (RoBERTa-PreLayerNorm model)
  - `XLMConfig` configuration class: `TFXLMForMultipleChoice` (XLM model)
  - `XLMRobertaConfig` configuration class: `TFXLMRobertaForMultipleChoice` (XLM-RoBERTa model)
  - `XLNetConfig` configuration class: `TFXLNetForMultipleChoice` (XLNet model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForMultipleChoice

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForMultipleChoice.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [TFAlbertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertForMultipleChoice) (ALBERT model) - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForMultipleChoice) (BERT model) - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [TFCamembertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForMultipleChoice) (CamemBERT model) - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.TFConvBertForMultipleChoice) (ConvBERT model) - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2ForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.TFDebertaV2ForMultipleChoice) (DeBERTa-v2 model) - `DistilBertConfig` configuration class: `TFDistilBertForMultipleChoice` (DistilBERT model) - `ElectraConfig` configuration class: `TFElectraForMultipleChoice` (ELECTRA model) - `FlaubertConfig` configuration class: `TFFlaubertForMultipleChoice` (FlauBERT model) - `FunnelConfig` configuration class: `TFFunnelForMultipleChoice` (Funnel Transformer model) - `LongformerConfig` configuration class: `TFLongformerForMultipleChoice` (Longformer model) - `MPNetConfig` configuration class: `TFMPNetForMultipleChoice` (MPNet model) - `MobileBertConfig` 
configuration class: `TFMobileBertForMultipleChoice` (MobileBERT model) - `RemBertConfig` configuration class: `TFRemBertForMultipleChoice` (RemBERT model) - `RoFormerConfig` configuration class: `TFRoFormerForMultipleChoice` (RoFormer model) - `RobertaConfig` configuration class: `TFRobertaForMultipleChoice` (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForMultipleChoice` (RoBERTa-PreLayerNorm model) - `XLMConfig` configuration class: `TFXLMForMultipleChoice` (XLM model) - `XLMRobertaConfig` configuration class: `TFXLMRobertaForMultipleChoice` (XLM-RoBERTa model) - `XLNetConfig` configuration class: `TFXLNetForMultipleChoice` (XLNet model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForMultipleChoice.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a multiple choice head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [TFAlbertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertForMultipleChoice) (ALBERT model)
- **bert** -- [TFBertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForMultipleChoice) (BERT model)
- **camembert** -- [TFCamembertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForMultipleChoice) (CamemBERT model)
- **convbert** -- [TFConvBertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.TFConvBertForMultipleChoice) (ConvBERT model)
- **deberta-v2** -- [TFDebertaV2ForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.TFDebertaV2ForMultipleChoice) (DeBERTa-v2 model)
- **distilbert** -- `TFDistilBertForMultipleChoice` (DistilBERT model)
- **electra** -- `TFElectraForMultipleChoice` (ELECTRA model)
- **flaubert** -- `TFFlaubertForMultipleChoice` (FlauBERT model)
- **funnel** -- `TFFunnelForMultipleChoice` (Funnel Transformer model)
- **longformer** -- `TFLongformerForMultipleChoice` (Longformer model)
- **mobilebert** -- `TFMobileBertForMultipleChoice` (MobileBERT model)
- **mpnet** -- `TFMPNetForMultipleChoice` (MPNet model)
- **rembert** -- `TFRemBertForMultipleChoice` (RemBERT model)
- **roberta** -- `TFRobertaForMultipleChoice` (RoBERTa model)
- **roberta-prelayernorm** -- `TFRobertaPreLayerNormForMultipleChoice` (RoBERTa-PreLayerNorm model)
- **roformer** -- `TFRoFormerForMultipleChoice` (RoFormer model)
- **xlm** -- `TFXLMForMultipleChoice` (XLM model)
- **xlm-roberta** -- `TFXLMRobertaForMultipleChoice` (XLM-RoBERTa model)
- **xlnet** -- `TFXLNetForMultipleChoice` (XLNet model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForMultipleChoice

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForMultipleChoice.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g, `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model in a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
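The `kwargs` routing described above (config keys update the configuration, leftover keys reach the model's `__init__`) can be sketched in plain Python. This is an illustrative sketch only; `ToyConfig` and `split_kwargs` are hypothetical names, not part of the transformers API.

```python
# Hypothetical sketch of how `from_pretrained` routes **kwargs when no
# explicit `config` is passed: keys matching an existing configuration
# attribute override it; the rest are forwarded to the model's __init__.

class ToyConfig:
    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768

def split_kwargs(config, **kwargs):
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # override a config attribute
        else:
            model_kwargs[key] = value    # left for the model __init__
    return config, model_kwargs

config, model_kwargs = split_kwargs(ToyConfig(), output_attentions=True, foo=1)
print(config.output_attentions)  # True
print(model_kwargs)              # {'foo': 1}
```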

### FlaxAutoModelForMultipleChoice[[transformers.FlaxAutoModelForMultipleChoice]]

#### transformers.FlaxAutoModelForMultipleChoice[[transformers.FlaxAutoModelForMultipleChoice]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L345)

This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForMultipleChoice.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [FlaxAlbertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertForMultipleChoice) (ALBERT model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForMultipleChoice) (BERT model)
  - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [FlaxBigBirdForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForMultipleChoice) (BigBird model)
  - `DistilBertConfig` configuration class: `FlaxDistilBertForMultipleChoice` (DistilBERT model)
  - `ElectraConfig` configuration class: `FlaxElectraForMultipleChoice` (ELECTRA model)
  - `RoFormerConfig` configuration class: `FlaxRoFormerForMultipleChoice` (RoFormer model)
  - `RobertaConfig` configuration class: `FlaxRobertaForMultipleChoice` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForMultipleChoice` (RoBERTa-PreLayerNorm model)
  - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForMultipleChoice` (XLM-RoBERTa model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForMultipleChoice

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForMultipleChoice.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [FlaxAlbertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertForMultipleChoice) (ALBERT model) - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForMultipleChoice) (BERT model) - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [FlaxBigBirdForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForMultipleChoice) (BigBird model) - `DistilBertConfig` configuration class: `FlaxDistilBertForMultipleChoice` (DistilBERT model) - `ElectraConfig` configuration class: `FlaxElectraForMultipleChoice` (ELECTRA model) - `RoFormerConfig` configuration class: `FlaxRoFormerForMultipleChoice` (RoFormer model) - `RobertaConfig` configuration class: `FlaxRobertaForMultipleChoice` (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForMultipleChoice` (RoBERTa-PreLayerNorm model) - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForMultipleChoice` (XLM-RoBERTa model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.FlaxAutoModelForMultipleChoice.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a multiple choice head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [FlaxAlbertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertForMultipleChoice) (ALBERT model)
- **bert** -- [FlaxBertForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForMultipleChoice) (BERT model)
- **big_bird** -- [FlaxBigBirdForMultipleChoice](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForMultipleChoice) (BigBird model)
- **distilbert** -- `FlaxDistilBertForMultipleChoice` (DistilBERT model)
- **electra** -- `FlaxElectraForMultipleChoice` (ELECTRA model)
- **roberta** -- `FlaxRobertaForMultipleChoice` (RoBERTa model)
- **roberta-prelayernorm** -- `FlaxRobertaPreLayerNormForMultipleChoice` (RoBERTa-PreLayerNorm model)
- **roformer** -- `FlaxRoFormerForMultipleChoice` (RoFormer model)
- **xlm-roberta** -- `FlaxXLMRobertaForMultipleChoice` (XLM-RoBERTa model)
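The selection logic above can be sketched as a plain-Python dispatch (illustrative only; `MODEL_MAPPING` and `resolve_model_class` are hypothetical names, not the actual transformers internals): prefer the config's `model_type`, then fall back to pattern matching on the pretrained name or path.

```python
# Illustrative sketch of the auto-class dispatch described above
# (not the real transformers implementation).

MODEL_MAPPING = {
    "albert": "FlaxAlbertForMultipleChoice",
    "bert": "FlaxBertForMultipleChoice",
    "big_bird": "FlaxBigBirdForMultipleChoice",
}

def resolve_model_class(name_or_path, model_type=None):
    # Primary path: the `model_type` property of the loaded config.
    if model_type is not None:
        return MODEL_MAPPING[model_type]
    # Fallback: pattern-match the name/path. Longer keys first, so more
    # specific types such as "big_bird" win over shorter substrings.
    for key in sorted(MODEL_MAPPING, key=len, reverse=True):
        if key in name_or_path:
            return MODEL_MAPPING[key]
    raise ValueError(f"Could not infer a model class from {name_or_path!r}")

print(resolve_model_class("google-bert/bert-base-cased"))  # FlaxBertForMultipleChoice
```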

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForMultipleChoice

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForMultipleChoice.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForNextSentencePrediction[[transformers.AutoModelForNextSentencePrediction]]

#### transformers.AutoModelForNextSentencePrediction[[transformers.AutoModelForNextSentencePrediction]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2060)

This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForNextSentencePrediction.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [BertForNextSentencePrediction](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertForNextSentencePrediction) (BERT model)
  - `ErnieConfig` configuration class: `ErnieForNextSentencePrediction` (ERNIE model)
  - `FNetConfig` configuration class: `FNetForNextSentencePrediction` (FNet model)
  - `MegatronBertConfig` configuration class: `MegatronBertForNextSentencePrediction` (Megatron-BERT model)
  - `MobileBertConfig` configuration class: `MobileBertForNextSentencePrediction` (MobileBERT model)
  - `NezhaConfig` configuration class: `NezhaForNextSentencePrediction` (Nezha model)
  - `QDQBertConfig` configuration class: `QDQBertForNextSentencePrediction` (QDQBert model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForNextSentencePrediction.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [BertForNextSentencePrediction](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertForNextSentencePrediction) (BERT model) - `ErnieConfig` configuration class: `ErnieForNextSentencePrediction` (ERNIE model) - `FNetConfig` configuration class: `FNetForNextSentencePrediction` (FNet model) - `MegatronBertConfig` configuration class: `MegatronBertForNextSentencePrediction` (Megatron-BERT model) - `MobileBertConfig` configuration class: `MobileBertForNextSentencePrediction` (MobileBERT model) - `NezhaConfig` configuration class: `NezhaForNextSentencePrediction` (Nezha model) - `QDQBertConfig` configuration class: `QDQBertForNextSentencePrediction` (QDQBert model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForNextSentencePrediction.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a next sentence prediction head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **bert** -- [BertForNextSentencePrediction](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertForNextSentencePrediction) (BERT model)
- **ernie** -- `ErnieForNextSentencePrediction` (ERNIE model)
- **fnet** -- `FNetForNextSentencePrediction` (FNet model)
- **megatron-bert** -- `MegatronBertForNextSentencePrediction` (Megatron-BERT model)
- **mobilebert** -- `MobileBertForNextSentencePrediction` (MobileBERT model)
- **nezha** -- `NezhaForNextSentencePrediction` (Nezha model)
- **qdqbert** -- `QDQBertForNextSentencePrediction` (QDQBert model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.
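Why the eval-mode default matters can be illustrated with a toy dropout layer (a sketch only; `ToyDropout` is hypothetical and not the real `torch.nn.Dropout`): dropout is stochastic in training mode but a deterministic no-op in eval mode, so inference on a freshly loaded model is reproducible.

```python
import random

# Toy sketch of train/eval mode semantics (not the torch implementation):
# in training mode, dropout randomly zeroes values; in eval mode it is
# the identity, which is what you want for inference.

class ToyDropout:
    def __init__(self, p=0.5):
        self.p = p
        self.training = True  # modules start in training mode

    def eval(self):
        self.training = False
        return self

    def train(self):
        self.training = True
        return self

    def __call__(self, values):
        if not self.training:
            return list(values)  # eval mode: deterministic no-op
        return [0.0 if random.random() < self.p else v for v in values]

layer = ToyDropout(p=0.5).eval()  # mirrors from_pretrained's default
print(layer([1.0, 2.0, 3.0]))     # [1.0, 2.0, 3.0] -- unchanged in eval mode
```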

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForNextSentencePrediction.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### TFAutoModelForNextSentencePrediction[[transformers.TFAutoModelForNextSentencePrediction]]

#### transformers.TFAutoModelForNextSentencePrediction[[transformers.TFAutoModelForNextSentencePrediction]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L691)

This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForNextSentencePrediction.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForNextSentencePrediction](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForNextSentencePrediction) (BERT model)
  - `MobileBertConfig` configuration class: `TFMobileBertForNextSentencePrediction` (MobileBERT model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForNextSentencePrediction

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForNextSentencePrediction.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForNextSentencePrediction](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForNextSentencePrediction) (BERT model) - `MobileBertConfig` configuration class: `TFMobileBertForNextSentencePrediction` (MobileBERT model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForNextSentencePrediction.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a next sentence prediction head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **bert** -- [TFBertForNextSentencePrediction](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForNextSentencePrediction) (BERT model)
- **mobilebert** -- `TFMobileBertForNextSentencePrediction` (MobileBERT model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForNextSentencePrediction

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForNextSentencePrediction.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
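The kwargs routing described above — configuration attributes are overridden first, and only the remaining keys reach the model's `__init__` — can be sketched as a minimal illustration. This is not the library's actual implementation; `split_kwargs` and `config_attrs` are hypothetical names:

```python
# Hypothetical sketch: split user kwargs into configuration overrides
# and arguments forwarded to the model's __init__.
def split_kwargs(config_attrs, kwargs):
    config_updates, model_kwargs = {}, {}
    for key, value in kwargs.items():
        # Keys naming a configuration attribute override that attribute;
        # everything else is passed through to the model constructor.
        if key in config_attrs:
            config_updates[key] = value
        else:
            model_kwargs[key] = value
    return config_updates, model_kwargs
```

Under this sketch, calling with `output_attentions=True` would update the loaded configuration, while an unrelated key would be forwarded to the model's `__init__` untouched.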

### FlaxAutoModelForNextSentencePrediction[[transformers.FlaxAutoModelForNextSentencePrediction]]

#### transformers.FlaxAutoModelForNextSentencePrediction[[transformers.FlaxAutoModelForNextSentencePrediction]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L352)

This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForNextSentencePrediction.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForNextSentencePrediction](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForNextSentencePrediction) (BERT model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForNextSentencePrediction

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForNextSentencePrediction.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForNextSentencePrediction](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForNextSentencePrediction) (BERT model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.FlaxAutoModelForNextSentencePrediction.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a next sentence prediction head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **bert** -- [FlaxBertForNextSentencePrediction](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForNextSentencePrediction) (BERT model)

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForNextSentencePrediction

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForNextSentencePrediction.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForTokenClassification[[transformers.AutoModelForTokenClassification]]

#### transformers.AutoModelForTokenClassification[[transformers.AutoModelForTokenClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2046)

This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForTokenClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [AlbertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertForTokenClassification) (ALBERT model)
  - `ApertusConfig` configuration class: `ApertusForTokenClassification` (Apertus model)
  - `ArceeConfig` configuration class: `ArceeForTokenClassification` (Arcee model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [BertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertForTokenClassification) (BERT model)
  - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdForTokenClassification) (BigBird model)
  - [BioGptConfig](/docs/transformers/v4.57.1/ja/model_doc/biogpt#transformers.BioGptConfig) configuration class: [BioGptForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/biogpt#transformers.BioGptForTokenClassification) (BioGpt model)
  - [BloomConfig](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomConfig) configuration class: [BloomForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomForTokenClassification) (BLOOM model)
  - [BrosConfig](/docs/transformers/v4.57.1/ja/model_doc/bros#transformers.BrosConfig) configuration class: [BrosForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/bros#transformers.BrosForTokenClassification) (BROS model)
  - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertForTokenClassification) (CamemBERT model)
  - [CanineConfig](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineConfig) configuration class: [CanineForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineForTokenClassification) (CANINE model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertForTokenClassification) (ConvBERT model)
  - [Data2VecTextConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextForTokenClassification) (Data2VecText model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaForTokenClassification) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2ForTokenClassification) (DeBERTa-v2 model)
  - `DeepseekV3Config` configuration class: `DeepseekV3ForTokenClassification` (DeepSeek-V3 model)
  - `DiffLlamaConfig` configuration class: `DiffLlamaForTokenClassification` (DiffLlama model)
  - `DistilBertConfig` configuration class: `DistilBertForTokenClassification` (DistilBERT model)
  - `ElectraConfig` configuration class: `ElectraForTokenClassification` (ELECTRA model)
  - `ErnieConfig` configuration class: `ErnieForTokenClassification` (ERNIE model)
  - `ErnieMConfig` configuration class: `ErnieMForTokenClassification` (ErnieM model)
  - `EsmConfig` configuration class: `EsmForTokenClassification` (ESM model)
  - `Exaone4Config` configuration class: `Exaone4ForTokenClassification` (EXAONE-4.0 model)
  - `FNetConfig` configuration class: `FNetForTokenClassification` (FNet model)
  - `FalconConfig` configuration class: `FalconForTokenClassification` (Falcon model)
  - `FlaubertConfig` configuration class: `FlaubertForTokenClassification` (FlauBERT model)
  - `FunnelConfig` configuration class: `FunnelForTokenClassification` (Funnel Transformer model)
  - `GPT2Config` configuration class: `GPT2ForTokenClassification` (OpenAI GPT-2 model)
  - `GPTBigCodeConfig` configuration class: `GPTBigCodeForTokenClassification` (GPTBigCode model)
  - `GPTNeoConfig` configuration class: `GPTNeoForTokenClassification` (GPT Neo model)
  - `GPTNeoXConfig` configuration class: `GPTNeoXForTokenClassification` (GPT NeoX model)
  - `Gemma2Config` configuration class: `Gemma2ForTokenClassification` (Gemma2 model)
  - `GemmaConfig` configuration class: `GemmaForTokenClassification` (Gemma model)
  - `Glm4Config` configuration class: `Glm4ForTokenClassification` (GLM4 model)
  - `GlmConfig` configuration class: `GlmForTokenClassification` (GLM model)
  - `GptOssConfig` configuration class: `GptOssForTokenClassification` (GptOss model)
  - `HeliumConfig` configuration class: `HeliumForTokenClassification` (Helium model)
  - `IBertConfig` configuration class: `IBertForTokenClassification` (I-BERT model)
  - `LayoutLMConfig` configuration class: `LayoutLMForTokenClassification` (LayoutLM model)
  - `LayoutLMv2Config` configuration class: `LayoutLMv2ForTokenClassification` (LayoutLMv2 model)
  - `LayoutLMv3Config` configuration class: `LayoutLMv3ForTokenClassification` (LayoutLMv3 model)
  - `LiltConfig` configuration class: `LiltForTokenClassification` (LiLT model)
  - `LlamaConfig` configuration class: `LlamaForTokenClassification` (LLaMA model)
  - `LongformerConfig` configuration class: `LongformerForTokenClassification` (Longformer model)
  - `LukeConfig` configuration class: `LukeForTokenClassification` (LUKE model)
  - `MPNetConfig` configuration class: `MPNetForTokenClassification` (MPNet model)
  - `MT5Config` configuration class: `MT5ForTokenClassification` (MT5 model)
  - `MarkupLMConfig` configuration class: `MarkupLMForTokenClassification` (MarkupLM model)
  - `MegaConfig` configuration class: `MegaForTokenClassification` (MEGA model)
  - `MegatronBertConfig` configuration class: `MegatronBertForTokenClassification` (Megatron-BERT model)
  - `MiniMaxConfig` configuration class: `MiniMaxForTokenClassification` (MiniMax model)
  - `MinistralConfig` configuration class: `MinistralForTokenClassification` (Ministral model)
  - `MistralConfig` configuration class: `MistralForTokenClassification` (Mistral model)
  - `MixtralConfig` configuration class: `MixtralForTokenClassification` (Mixtral model)
  - `MobileBertConfig` configuration class: `MobileBertForTokenClassification` (MobileBERT model)
  - `ModernBertConfig` configuration class: `ModernBertForTokenClassification` (ModernBERT model)
  - `MptConfig` configuration class: `MptForTokenClassification` (MPT model)
  - `MraConfig` configuration class: `MraForTokenClassification` (MRA model)
  - `NemotronConfig` configuration class: `NemotronForTokenClassification` (Nemotron model)
  - `NezhaConfig` configuration class: `NezhaForTokenClassification` (Nezha model)
  - `NystromformerConfig` configuration class: `NystromformerForTokenClassification` (Nyströmformer model)
  - `PersimmonConfig` configuration class: `PersimmonForTokenClassification` (Persimmon model)
  - `Phi3Config` configuration class: `Phi3ForTokenClassification` (Phi3 model)
  - `PhiConfig` configuration class: `PhiForTokenClassification` (Phi model)
  - `QDQBertConfig` configuration class: `QDQBertForTokenClassification` (QDQBert model)
  - `Qwen2Config` configuration class: `Qwen2ForTokenClassification` (Qwen2 model)
  - `Qwen2MoeConfig` configuration class: `Qwen2MoeForTokenClassification` (Qwen2MoE model)
  - `Qwen3Config` configuration class: `Qwen3ForTokenClassification` (Qwen3 model)
  - `Qwen3MoeConfig` configuration class: `Qwen3MoeForTokenClassification` (Qwen3MoE model)
  - `Qwen3NextConfig` configuration class: `Qwen3NextForTokenClassification` (Qwen3Next model)
  - `RemBertConfig` configuration class: `RemBertForTokenClassification` (RemBERT model)
  - `RoCBertConfig` configuration class: `RoCBertForTokenClassification` (RoCBert model)
  - `RoFormerConfig` configuration class: `RoFormerForTokenClassification` (RoFormer model)
  - `RobertaConfig` configuration class: `RobertaForTokenClassification` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForTokenClassification` (RoBERTa-PreLayerNorm model)
  - `SeedOssConfig` configuration class: `SeedOssForTokenClassification` (SeedOss model)
  - `SmolLM3Config` configuration class: `SmolLM3ForTokenClassification` (SmolLM3 model)
  - `SqueezeBertConfig` configuration class: `SqueezeBertForTokenClassification` (SqueezeBERT model)
  - `StableLmConfig` configuration class: `StableLmForTokenClassification` (StableLm model)
  - `Starcoder2Config` configuration class: `Starcoder2ForTokenClassification` (Starcoder2 model)
  - `T5Config` configuration class: `T5ForTokenClassification` (T5 model)
  - `T5GemmaConfig` configuration class: `T5GemmaForTokenClassification` (T5Gemma model)
  - `UMT5Config` configuration class: `UMT5ForTokenClassification` (UMT5 model)
  - `XLMConfig` configuration class: `XLMForTokenClassification` (XLM model)
  - `XLMRobertaConfig` configuration class: `XLMRobertaForTokenClassification` (XLM-RoBERTa model)
  - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForTokenClassification` (XLM-RoBERTa-XL model)
  - `XLNetConfig` configuration class: `XLNetForTokenClassification` (XLNet model)
  - `XmodConfig` configuration class: `XmodForTokenClassification` (X-MOD model)
  - `YosoConfig` configuration class: `YosoForTokenClassification` (YOSO model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a token classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTokenClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForTokenClassification.from_config(config)
```
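A token classification head produces one logit vector per input token, and the predicted tag for each token is the argmax over the label dimension. The following minimal sketch illustrates that decoding step with plain Python lists; the `id2label` mapping and `decode_token_logits` helper are hypothetical, not part of the library:

```python
# Hypothetical example label set, in the style of a NER checkpoint's
# id2label mapping.
id2label = {0: "O", 1: "B-PER", 2: "I-PER"}


def decode_token_logits(logits):
    """Map per-token logits (seq_len x num_labels) to label strings via argmax."""
    return [id2label[max(range(len(row)), key=row.__getitem__)] for row in logits]
```

In practice the logits would come from a model loaded with `AutoModelForTokenClassification.from_pretrained(...)`, and the `id2label` mapping from its configuration.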

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class; see the configuration-class-to-model-class mapping listed above (from [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) / [AlbertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertForTokenClassification) through `YosoConfig` / `YosoForTokenClassification`).

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
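The default-selection rule described above can be pictured with a small sketch. This is assumed, simplified logic for illustration only, not the library's actual internals; `pick_attn_implementation` is a hypothetical helper name:

```python
def pick_attn_implementation(torch_version, requested=None):
    """Sketch of the selection described above: honor an explicit request,
    otherwise prefer SDPA for torch >= 2.1.1 and fall back to eager."""
    valid = {"eager", "sdpa", "flash_attention_2"}
    if requested is not None:
        if requested not in valid:
            raise ValueError(f"Unknown attention implementation: {requested!r}")
        return requested
    # SDPA is the default when available; otherwise the manual implementation.
    return "sdpa" if torch_version >= (2, 1, 1) else "eager"

print(pick_attn_implementation((2, 2, 0)))           # sdpa
print(pick_attn_implementation((2, 0, 0)))           # eager
print(pick_attn_implementation((2, 2, 0), "eager"))  # eager
```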
#### from_pretrained[[transformers.AutoModelForTokenClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a token classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [AlbertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertForTokenClassification) (ALBERT model)
- **apertus** -- `ApertusForTokenClassification` (Apertus model)
- **arcee** -- `ArceeForTokenClassification` (Arcee model)
- **bert** -- [BertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertForTokenClassification) (BERT model)
- **big_bird** -- [BigBirdForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdForTokenClassification) (BigBird model)
- **biogpt** -- [BioGptForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/biogpt#transformers.BioGptForTokenClassification) (BioGpt model)
- **bloom** -- [BloomForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomForTokenClassification) (BLOOM model)
- **bros** -- [BrosForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/bros#transformers.BrosForTokenClassification) (BROS model)
- **camembert** -- [CamembertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertForTokenClassification) (CamemBERT model)
- **canine** -- [CanineForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineForTokenClassification) (CANINE model)
- **convbert** -- [ConvBertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertForTokenClassification) (ConvBERT model)
- **data2vec-text** -- [Data2VecTextForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextForTokenClassification) (Data2VecText model)
- **deberta** -- [DebertaForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaForTokenClassification) (DeBERTa model)
- **deberta-v2** -- [DebertaV2ForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2ForTokenClassification) (DeBERTa-v2 model)
- **deepseek_v3** -- `DeepseekV3ForTokenClassification` (DeepSeek-V3 model)
- **diffllama** -- `DiffLlamaForTokenClassification` (DiffLlama model)
- **distilbert** -- `DistilBertForTokenClassification` (DistilBERT model)
- **electra** -- `ElectraForTokenClassification` (ELECTRA model)
- **ernie** -- `ErnieForTokenClassification` (ERNIE model)
- **ernie_m** -- `ErnieMForTokenClassification` (ErnieM model)
- **esm** -- `EsmForTokenClassification` (ESM model)
- **exaone4** -- `Exaone4ForTokenClassification` (EXAONE-4.0 model)
- **falcon** -- `FalconForTokenClassification` (Falcon model)
- **flaubert** -- `FlaubertForTokenClassification` (FlauBERT model)
- **fnet** -- `FNetForTokenClassification` (FNet model)
- **funnel** -- `FunnelForTokenClassification` (Funnel Transformer model)
- **gemma** -- `GemmaForTokenClassification` (Gemma model)
- **gemma2** -- `Gemma2ForTokenClassification` (Gemma2 model)
- **glm** -- `GlmForTokenClassification` (GLM model)
- **glm4** -- `Glm4ForTokenClassification` (GLM4 model)
- **gpt-sw3** -- `GPT2ForTokenClassification` (GPT-Sw3 model)
- **gpt2** -- `GPT2ForTokenClassification` (OpenAI GPT-2 model)
- **gpt_bigcode** -- `GPTBigCodeForTokenClassification` (GPTBigCode model)
- **gpt_neo** -- `GPTNeoForTokenClassification` (GPT Neo model)
- **gpt_neox** -- `GPTNeoXForTokenClassification` (GPT NeoX model)
- **gpt_oss** -- `GptOssForTokenClassification` (GptOss model)
- **helium** -- `HeliumForTokenClassification` (Helium model)
- **ibert** -- `IBertForTokenClassification` (I-BERT model)
- **layoutlm** -- `LayoutLMForTokenClassification` (LayoutLM model)
- **layoutlmv2** -- `LayoutLMv2ForTokenClassification` (LayoutLMv2 model)
- **layoutlmv3** -- `LayoutLMv3ForTokenClassification` (LayoutLMv3 model)
- **lilt** -- `LiltForTokenClassification` (LiLT model)
- **llama** -- `LlamaForTokenClassification` (LLaMA model)
- **longformer** -- `LongformerForTokenClassification` (Longformer model)
- **luke** -- `LukeForTokenClassification` (LUKE model)
- **markuplm** -- `MarkupLMForTokenClassification` (MarkupLM model)
- **mega** -- `MegaForTokenClassification` (MEGA model)
- **megatron-bert** -- `MegatronBertForTokenClassification` (Megatron-BERT model)
- **minimax** -- `MiniMaxForTokenClassification` (MiniMax model)
- **ministral** -- `MinistralForTokenClassification` (Ministral model)
- **mistral** -- `MistralForTokenClassification` (Mistral model)
- **mixtral** -- `MixtralForTokenClassification` (Mixtral model)
- **mobilebert** -- `MobileBertForTokenClassification` (MobileBERT model)
- **modernbert** -- `ModernBertForTokenClassification` (ModernBERT model)
- **mpnet** -- `MPNetForTokenClassification` (MPNet model)
- **mpt** -- `MptForTokenClassification` (MPT model)
- **mra** -- `MraForTokenClassification` (MRA model)
- **mt5** -- `MT5ForTokenClassification` (MT5 model)
- **nemotron** -- `NemotronForTokenClassification` (Nemotron model)
- **nezha** -- `NezhaForTokenClassification` (Nezha model)
- **nystromformer** -- `NystromformerForTokenClassification` (Nyströmformer model)
- **persimmon** -- `PersimmonForTokenClassification` (Persimmon model)
- **phi** -- `PhiForTokenClassification` (Phi model)
- **phi3** -- `Phi3ForTokenClassification` (Phi3 model)
- **qdqbert** -- `QDQBertForTokenClassification` (QDQBert model)
- **qwen2** -- `Qwen2ForTokenClassification` (Qwen2 model)
- **qwen2_moe** -- `Qwen2MoeForTokenClassification` (Qwen2MoE model)
- **qwen3** -- `Qwen3ForTokenClassification` (Qwen3 model)
- **qwen3_moe** -- `Qwen3MoeForTokenClassification` (Qwen3MoE model)
- **qwen3_next** -- `Qwen3NextForTokenClassification` (Qwen3Next model)
- **rembert** -- `RemBertForTokenClassification` (RemBERT model)
- **roberta** -- `RobertaForTokenClassification` (RoBERTa model)
- **roberta-prelayernorm** -- `RobertaPreLayerNormForTokenClassification` (RoBERTa-PreLayerNorm model)
- **roc_bert** -- `RoCBertForTokenClassification` (RoCBert model)
- **roformer** -- `RoFormerForTokenClassification` (RoFormer model)
- **seed_oss** -- `SeedOssForTokenClassification` (SeedOss model)
- **smollm3** -- `SmolLM3ForTokenClassification` (SmolLM3 model)
- **squeezebert** -- `SqueezeBertForTokenClassification` (SqueezeBERT model)
- **stablelm** -- `StableLmForTokenClassification` (StableLm model)
- **starcoder2** -- `Starcoder2ForTokenClassification` (Starcoder2 model)
- **t5** -- `T5ForTokenClassification` (T5 model)
- **t5gemma** -- `T5GemmaForTokenClassification` (T5Gemma model)
- **umt5** -- `UMT5ForTokenClassification` (UMT5 model)
- **xlm** -- `XLMForTokenClassification` (XLM model)
- **xlm-roberta** -- `XLMRobertaForTokenClassification` (XLM-RoBERTa model)
- **xlm-roberta-xl** -- `XLMRobertaXLForTokenClassification` (XLM-RoBERTa-XL model)
- **xlnet** -- `XLNetForTokenClassification` (XLNet model)
- **xmod** -- `XmodForTokenClassification` (X-MOD model)
- **yoso** -- `YosoForTokenClassification` (YOSO model)
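Conceptually, the mapping above amounts to a dictionary lookup keyed on the config's `model_type`. The sketch below illustrates that idea with a few entries; it is not the actual transformers implementation, which uses lazy module mappings:

```python
# Illustrative subset of the model_type -> model class mapping listed above.
_TOKEN_CLASSIFICATION_MAPPING = {
    "bert": "BertForTokenClassification",
    "roberta": "RobertaForTokenClassification",
    "xlm-roberta": "XLMRobertaForTokenClassification",
}

def resolve_model_class(model_type):
    """Return the model class name registered for a given model_type."""
    try:
        return _TOKEN_CLASSIFICATION_MAPPING[model_type]
    except KeyError:
        raise ValueError(
            f"Unrecognized model_type {model_type!r} for AutoModelForTokenClassification"
        )

print(resolve_model_class("bert"))  # BertForTokenClassification
```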

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTokenClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForTokenClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *TensorFlow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
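The kwargs-splitting behaviour described above can be sketched as follows. This is a simplified illustration of the documented behaviour when no explicit `config` is passed, not the library's actual code:

```python
# Keys that match configuration attributes update the config;
# the rest are forwarded to the model's __init__.
config_attrs = {"output_attentions": False, "num_labels": 2}
kwargs = {"output_attentions": True, "torch_dtype": "float16"}

config_updates = {k: v for k, v in kwargs.items() if k in config_attrs}
model_init_kwargs = {k: v for k, v in kwargs.items() if k not in config_attrs}

print(config_updates)      # {'output_attentions': True}
print(model_init_kwargs)   # {'torch_dtype': 'float16'}
```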

### TFAutoModelForTokenClassification[[transformers.TFAutoModelForTokenClassification]]

#### transformers.TFAutoModelForTokenClassification[[transformers.TFAutoModelForTokenClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L675)

This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForTokenClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [TFAlbertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertForTokenClassification) (ALBERT model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForTokenClassification) (BERT model)
  - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [TFCamembertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForTokenClassification) (CamemBERT model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.TFConvBertForTokenClassification) (ConvBERT model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) configuration class: [TFDebertaForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.TFDebertaForTokenClassification) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2ForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.TFDebertaV2ForTokenClassification) (DeBERTa-v2 model)
  - `DistilBertConfig` configuration class: `TFDistilBertForTokenClassification` (DistilBERT model)
  - `ElectraConfig` configuration class: `TFElectraForTokenClassification` (ELECTRA model)
  - `EsmConfig` configuration class: `TFEsmForTokenClassification` (ESM model)
  - `FlaubertConfig` configuration class: `TFFlaubertForTokenClassification` (FlauBERT model)
  - `FunnelConfig` configuration class: `TFFunnelForTokenClassification` (Funnel Transformer model)
  - `LayoutLMConfig` configuration class: `TFLayoutLMForTokenClassification` (LayoutLM model)
  - `LayoutLMv3Config` configuration class: `TFLayoutLMv3ForTokenClassification` (LayoutLMv3 model)
  - `LongformerConfig` configuration class: `TFLongformerForTokenClassification` (Longformer model)
  - `MPNetConfig` configuration class: `TFMPNetForTokenClassification` (MPNet model)
  - `MobileBertConfig` configuration class: `TFMobileBertForTokenClassification` (MobileBERT model)
  - `RemBertConfig` configuration class: `TFRemBertForTokenClassification` (RemBERT model)
  - `RoFormerConfig` configuration class: `TFRoFormerForTokenClassification` (RoFormer model)
  - `RobertaConfig` configuration class: `TFRobertaForTokenClassification` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForTokenClassification` (RoBERTa-PreLayerNorm model)
  - `XLMConfig` configuration class: `TFXLMForTokenClassification` (XLM model)
  - `XLMRobertaConfig` configuration class: `TFXLMRobertaForTokenClassification` (XLM-RoBERTa model)
  - `XLNetConfig` configuration class: `TFXLNetForTokenClassification` (XLNet model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a token classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForTokenClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForTokenClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [TFAlbertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertForTokenClassification) (ALBERT model) - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForTokenClassification) (BERT model) - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [TFCamembertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForTokenClassification) (CamemBERT model) - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.TFConvBertForTokenClassification) (ConvBERT model) - [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) configuration class: [TFDebertaForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.TFDebertaForTokenClassification) (DeBERTa model) - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2ForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.TFDebertaV2ForTokenClassification) (DeBERTa-v2 model) - `DistilBertConfig` configuration class: `TFDistilBertForTokenClassification` (DistilBERT model) - `ElectraConfig` configuration class: `TFElectraForTokenClassification` (ELECTRA model) - `EsmConfig` configuration class: 
`TFEsmForTokenClassification` (ESM model) - `FlaubertConfig` configuration class: `TFFlaubertForTokenClassification` (FlauBERT model) - `FunnelConfig` configuration class: `TFFunnelForTokenClassification` (Funnel Transformer model) - `LayoutLMConfig` configuration class: `TFLayoutLMForTokenClassification` (LayoutLM model) - `LayoutLMv3Config` configuration class: `TFLayoutLMv3ForTokenClassification` (LayoutLMv3 model) - `LongformerConfig` configuration class: `TFLongformerForTokenClassification` (Longformer model) - `MPNetConfig` configuration class: `TFMPNetForTokenClassification` (MPNet model) - `MobileBertConfig` configuration class: `TFMobileBertForTokenClassification` (MobileBERT model) - `RemBertConfig` configuration class: `TFRemBertForTokenClassification` (RemBERT model) - `RoFormerConfig` configuration class: `TFRoFormerForTokenClassification` (RoFormer model) - `RobertaConfig` configuration class: `TFRobertaForTokenClassification` (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForTokenClassification` (RoBERTa-PreLayerNorm model) - `XLMConfig` configuration class: `TFXLMForTokenClassification` (XLM model) - `XLMRobertaConfig` configuration class: `TFXLMRobertaForTokenClassification` (XLM-RoBERTa model) - `XLNetConfig` configuration class: `TFXLNetForTokenClassification` (XLNet model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForTokenClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a token classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [TFAlbertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertForTokenClassification) (ALBERT model)
- **bert** -- [TFBertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForTokenClassification) (BERT model)
- **camembert** -- [TFCamembertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForTokenClassification) (CamemBERT model)
- **convbert** -- [TFConvBertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.TFConvBertForTokenClassification) (ConvBERT model)
- **deberta** -- [TFDebertaForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.TFDebertaForTokenClassification) (DeBERTa model)
- **deberta-v2** -- [TFDebertaV2ForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.TFDebertaV2ForTokenClassification) (DeBERTa-v2 model)
- **distilbert** -- `TFDistilBertForTokenClassification` (DistilBERT model)
- **electra** -- `TFElectraForTokenClassification` (ELECTRA model)
- **esm** -- `TFEsmForTokenClassification` (ESM model)
- **flaubert** -- `TFFlaubertForTokenClassification` (FlauBERT model)
- **funnel** -- `TFFunnelForTokenClassification` (Funnel Transformer model)
- **layoutlm** -- `TFLayoutLMForTokenClassification` (LayoutLM model)
- **layoutlmv3** -- `TFLayoutLMv3ForTokenClassification` (LayoutLMv3 model)
- **longformer** -- `TFLongformerForTokenClassification` (Longformer model)
- **mobilebert** -- `TFMobileBertForTokenClassification` (MobileBERT model)
- **mpnet** -- `TFMPNetForTokenClassification` (MPNet model)
- **rembert** -- `TFRemBertForTokenClassification` (RemBERT model)
- **roberta** -- `TFRobertaForTokenClassification` (RoBERTa model)
- **roberta-prelayernorm** -- `TFRobertaPreLayerNormForTokenClassification` (RoBERTa-PreLayerNorm model)
- **roformer** -- `TFRoFormerForTokenClassification` (RoFormer model)
- **xlm** -- `TFXLMForTokenClassification` (XLM model)
- **xlm-roberta** -- `TFXLMRobertaForTokenClassification` (XLM-RoBERTa model)
- **xlnet** -- `TFXLNetForTokenClassification` (XLNet model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForTokenClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForTokenClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
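As a toy sketch (not the actual transformers code), the kwargs-splitting behavior described above, where keys matching a configuration attribute override the config and the remaining keys are forwarded to the model's `__init__`, can be illustrated like this; `ToyConfig` and `split_kwargs` are hypothetical names used only for this illustration:

```python
class ToyConfig:
    """Stand-in for a PretrainedConfig with a couple of attributes."""

    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768


def split_kwargs(config, **kwargs):
    """Route each kwarg: config attributes are overridden in place,
    everything else is collected for the model's __init__."""
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # matches a config attribute: override it
        else:
            model_kwargs[key] = value  # unknown to the config: pass to the model
    return config, model_kwargs


config, extra = split_kwargs(ToyConfig(), output_attentions=True, custom_arg=1)
```

Here `output_attentions` updates the config (as in `from_pretrained("...", output_attentions=True)`), while `custom_arg` would be handed to the model constructor.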

### FlaxAutoModelForTokenClassification[[transformers.FlaxAutoModelForTokenClassification]]

#### transformers.FlaxAutoModelForTokenClassification[[transformers.FlaxAutoModelForTokenClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L336)

This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForTokenClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [FlaxAlbertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertForTokenClassification) (ALBERT model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForTokenClassification) (BERT model)
  - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [FlaxBigBirdForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForTokenClassification) (BigBird model)
  - `DistilBertConfig` configuration class: `FlaxDistilBertForTokenClassification` (DistilBERT model)
  - `ElectraConfig` configuration class: `FlaxElectraForTokenClassification` (ELECTRA model)
  - `RoFormerConfig` configuration class: `FlaxRoFormerForTokenClassification` (RoFormer model)
  - `RobertaConfig` configuration class: `FlaxRobertaForTokenClassification` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForTokenClassification` (RoBERTa-PreLayerNorm model)
  - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForTokenClassification` (XLM-RoBERTa model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a token classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForTokenClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForTokenClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [FlaxAlbertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertForTokenClassification) (ALBERT model) - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForTokenClassification) (BERT model) - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [FlaxBigBirdForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForTokenClassification) (BigBird model) - `DistilBertConfig` configuration class: `FlaxDistilBertForTokenClassification` (DistilBERT model) - `ElectraConfig` configuration class: `FlaxElectraForTokenClassification` (ELECTRA model) - `RoFormerConfig` configuration class: `FlaxRoFormerForTokenClassification` (RoFormer model) - `RobertaConfig` configuration class: `FlaxRobertaForTokenClassification` (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForTokenClassification` (RoBERTa-PreLayerNorm model) - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForTokenClassification` (XLM-RoBERTa model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.FlaxAutoModelForTokenClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a token classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [FlaxAlbertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertForTokenClassification) (ALBERT model)
- **bert** -- [FlaxBertForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForTokenClassification) (BERT model)
- **big_bird** -- [FlaxBigBirdForTokenClassification](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForTokenClassification) (BigBird model)
- **distilbert** -- `FlaxDistilBertForTokenClassification` (DistilBERT model)
- **electra** -- `FlaxElectraForTokenClassification` (ELECTRA model)
- **roberta** -- `FlaxRobertaForTokenClassification` (RoBERTa model)
- **roberta-prelayernorm** -- `FlaxRobertaPreLayerNormForTokenClassification` (RoBERTa-PreLayerNorm model)
- **roformer** -- `FlaxRoFormerForTokenClassification` (RoFormer model)
- **xlm-roberta** -- `FlaxXLMRobertaForTokenClassification` (XLM-RoBERTa model)

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForTokenClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForTokenClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
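The `from_pretrained` dispatch described above, which selects a model class first by the config's `model_type` and only then by pattern matching on the checkpoint name, can be sketched with a toy registry (not the actual transformers implementation; `MODEL_MAPPING` and `resolve_model_class` are hypothetical names for this illustration):

```python
# Toy mapping from model_type key to model class name, mirroring the
# "**bert** -- FlaxBertForTokenClassification" style listings above.
MODEL_MAPPING = {
    "albert": "FlaxAlbertForTokenClassification",
    "bert": "FlaxBertForTokenClassification",
    "roberta": "FlaxRobertaForTokenClassification",
}


def resolve_model_class(name_or_path, model_type=None):
    """Resolve a model class name for a checkpoint.

    1. Prefer the explicit `model_type` from the loaded config.
    2. Otherwise fall back to pattern matching on the name/path;
       longer keys are tried first so "roberta" wins over "bert".
    """
    if model_type is not None and model_type in MODEL_MAPPING:
        return MODEL_MAPPING[model_type]
    for key in sorted(MODEL_MAPPING, key=len, reverse=True):
        if key in name_or_path:
            return MODEL_MAPPING[key]
    raise ValueError(f"Could not infer model type from {name_or_path!r}")
```

For example, `resolve_model_class("google-bert/bert-base-cased")` falls through to pattern matching and picks the `bert` entry, just as the real auto classes would when a config's `model_type` is missing.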

### AutoModelForQuestionAnswering[[transformers.AutoModelForQuestionAnswering]]

#### transformers.AutoModelForQuestionAnswering[[transformers.AutoModelForQuestionAnswering]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2006)

This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForQuestionAnswering.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [AlbertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertForQuestionAnswering) (ALBERT model)
  - `ArceeConfig` configuration class: `ArceeForQuestionAnswering` (Arcee model)
  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [BartForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartForQuestionAnswering) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [BertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertForQuestionAnswering) (BERT model)
  - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdForQuestionAnswering) (BigBird model)
  - [BigBirdPegasusConfig](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusConfig) configuration class: [BigBirdPegasusForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForQuestionAnswering) (BigBird-Pegasus model)
  - [BloomConfig](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomConfig) configuration class: [BloomForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomForQuestionAnswering) (BLOOM model)
  - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertForQuestionAnswering) (CamemBERT model)
  - [CanineConfig](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineConfig) configuration class: [CanineForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineForQuestionAnswering) (CANINE model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertForQuestionAnswering) (ConvBERT model)
  - [Data2VecTextConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextForQuestionAnswering) (Data2VecText model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaForQuestionAnswering) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2ForQuestionAnswering) (DeBERTa-v2 model)
  - `DiffLlamaConfig` configuration class: `DiffLlamaForQuestionAnswering` (DiffLlama model)
  - `DistilBertConfig` configuration class: `DistilBertForQuestionAnswering` (DistilBERT model)
  - `ElectraConfig` configuration class: `ElectraForQuestionAnswering` (ELECTRA model)
  - `ErnieConfig` configuration class: `ErnieForQuestionAnswering` (ERNIE model)
  - `ErnieMConfig` configuration class: `ErnieMForQuestionAnswering` (ErnieM model)
  - `Exaone4Config` configuration class: `Exaone4ForQuestionAnswering` (EXAONE-4.0 model)
  - `FNetConfig` configuration class: `FNetForQuestionAnswering` (FNet model)
  - `FalconConfig` configuration class: `FalconForQuestionAnswering` (Falcon model)
  - `FlaubertConfig` configuration class: `FlaubertForQuestionAnsweringSimple` (FlauBERT model)
  - `FunnelConfig` configuration class: `FunnelForQuestionAnswering` (Funnel Transformer model)
  - `GPT2Config` configuration class: `GPT2ForQuestionAnswering` (OpenAI GPT-2 model)
  - `GPTJConfig` configuration class: `GPTJForQuestionAnswering` (GPT-J model)
  - `GPTNeoConfig` configuration class: `GPTNeoForQuestionAnswering` (GPT Neo model)
  - `GPTNeoXConfig` configuration class: `GPTNeoXForQuestionAnswering` (GPT NeoX model)
  - `IBertConfig` configuration class: `IBertForQuestionAnswering` (I-BERT model)
  - `LEDConfig` configuration class: `LEDForQuestionAnswering` (LED model)
  - `LayoutLMv2Config` configuration class: `LayoutLMv2ForQuestionAnswering` (LayoutLMv2 model)
  - `LayoutLMv3Config` configuration class: `LayoutLMv3ForQuestionAnswering` (LayoutLMv3 model)
  - `LiltConfig` configuration class: `LiltForQuestionAnswering` (LiLT model)
  - `LlamaConfig` configuration class: `LlamaForQuestionAnswering` (LLaMA model)
  - `LongformerConfig` configuration class: `LongformerForQuestionAnswering` (Longformer model)
  - `LukeConfig` configuration class: `LukeForQuestionAnswering` (LUKE model)
  - `LxmertConfig` configuration class: `LxmertForQuestionAnswering` (LXMERT model)
  - `MBartConfig` configuration class: `MBartForQuestionAnswering` (mBART model)
  - `MPNetConfig` configuration class: `MPNetForQuestionAnswering` (MPNet model)
  - `MT5Config` configuration class: `MT5ForQuestionAnswering` (MT5 model)
  - `MarkupLMConfig` configuration class: `MarkupLMForQuestionAnswering` (MarkupLM model)
  - `MegaConfig` configuration class: `MegaForQuestionAnswering` (MEGA model)
  - `MegatronBertConfig` configuration class: `MegatronBertForQuestionAnswering` (Megatron-BERT model)
  - `MiniMaxConfig` configuration class: `MiniMaxForQuestionAnswering` (MiniMax model)
  - `MinistralConfig` configuration class: `MinistralForQuestionAnswering` (Ministral model)
  - `MistralConfig` configuration class: `MistralForQuestionAnswering` (Mistral model)
  - `MixtralConfig` configuration class: `MixtralForQuestionAnswering` (Mixtral model)
  - `MobileBertConfig` configuration class: `MobileBertForQuestionAnswering` (MobileBERT model)
  - `ModernBertConfig` configuration class: `ModernBertForQuestionAnswering` (ModernBERT model)
  - `MptConfig` configuration class: `MptForQuestionAnswering` (MPT model)
  - `MraConfig` configuration class: `MraForQuestionAnswering` (MRA model)
  - `MvpConfig` configuration class: `MvpForQuestionAnswering` (MVP model)
  - `NemotronConfig` configuration class: `NemotronForQuestionAnswering` (Nemotron model)
  - `NezhaConfig` configuration class: `NezhaForQuestionAnswering` (Nezha model)
  - `NystromformerConfig` configuration class: `NystromformerForQuestionAnswering` (Nyströmformer model)
  - `OPTConfig` configuration class: `OPTForQuestionAnswering` (OPT model)
  - `QDQBertConfig` configuration class: `QDQBertForQuestionAnswering` (QDQBert model)
  - `Qwen2Config` configuration class: `Qwen2ForQuestionAnswering` (Qwen2 model)
  - `Qwen2MoeConfig` configuration class: `Qwen2MoeForQuestionAnswering` (Qwen2MoE model)
  - `Qwen3Config` configuration class: `Qwen3ForQuestionAnswering` (Qwen3 model)
  - `Qwen3MoeConfig` configuration class: `Qwen3MoeForQuestionAnswering` (Qwen3MoE model)
  - `Qwen3NextConfig` configuration class: `Qwen3NextForQuestionAnswering` (Qwen3Next model)
  - `ReformerConfig` configuration class: `ReformerForQuestionAnswering` (Reformer model)
  - `RemBertConfig` configuration class: `RemBertForQuestionAnswering` (RemBERT model)
  - `RoCBertConfig` configuration class: `RoCBertForQuestionAnswering` (RoCBert model)
  - `RoFormerConfig` configuration class: `RoFormerForQuestionAnswering` (RoFormer model)
  - `RobertaConfig` configuration class: `RobertaForQuestionAnswering` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForQuestionAnswering` (RoBERTa-PreLayerNorm model)
  - `SeedOssConfig` configuration class: `SeedOssForQuestionAnswering` (SeedOss model)
  - `SmolLM3Config` configuration class: `SmolLM3ForQuestionAnswering` (SmolLM3 model)
  - `SplinterConfig` configuration class: `SplinterForQuestionAnswering` (Splinter model)
  - `SqueezeBertConfig` configuration class: `SqueezeBertForQuestionAnswering` (SqueezeBERT model)
  - `T5Config` configuration class: `T5ForQuestionAnswering` (T5 model)
  - `UMT5Config` configuration class: `UMT5ForQuestionAnswering` (UMT5 model)
  - `XLMConfig` configuration class: `XLMForQuestionAnsweringSimple` (XLM model)
  - `XLMRobertaConfig` configuration class: `XLMRobertaForQuestionAnswering` (XLM-RoBERTa model)
  - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForQuestionAnswering` (XLM-RoBERTa-XL model)
  - `XLNetConfig` configuration class: `XLNetForQuestionAnsweringSimple` (XLNet model)
  - `XmodConfig` configuration class: `XmodForQuestionAnswering` (X-MOD model)
  - `YosoConfig` configuration class: `YosoForQuestionAnswering` (YOSO model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a question answering head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForQuestionAnswering.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [AlbertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertForQuestionAnswering) (ALBERT model) - `ArceeConfig` configuration class: `ArceeForQuestionAnswering` (Arcee model) - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [BartForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartForQuestionAnswering) (BART model) - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [BertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertForQuestionAnswering) (BERT model) - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdForQuestionAnswering) (BigBird model) - [BigBirdPegasusConfig](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusConfig) configuration class: [BigBirdPegasusForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForQuestionAnswering) (BigBird-Pegasus model) - [BloomConfig](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomConfig) configuration class: [BloomForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomForQuestionAnswering) (BLOOM model) - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: 
[CamembertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertForQuestionAnswering) (CamemBERT model) - [CanineConfig](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineConfig) configuration class: [CanineForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineForQuestionAnswering) (CANINE model) - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertForQuestionAnswering) (ConvBERT model) - [Data2VecTextConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextForQuestionAnswering) (Data2VecText model) - [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaForQuestionAnswering) (DeBERTa model) - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2ForQuestionAnswering) (DeBERTa-v2 model) - `DiffLlamaConfig` configuration class: `DiffLlamaForQuestionAnswering` (DiffLlama model) - `DistilBertConfig` configuration class: `DistilBertForQuestionAnswering` (DistilBERT model) - `ElectraConfig` configuration class: `ElectraForQuestionAnswering` (ELECTRA model) - `ErnieConfig` configuration class: `ErnieForQuestionAnswering` (ERNIE model) - `ErnieMConfig` configuration class: `ErnieMForQuestionAnswering` (ErnieM model) - `Exaone4Config` configuration class: `Exaone4ForQuestionAnswering` (EXAONE-4.0 model) - `FNetConfig` configuration class: 
`FNetForQuestionAnswering` (FNet model) - `FalconConfig` configuration class: `FalconForQuestionAnswering` (Falcon model) - `FlaubertConfig` configuration class: `FlaubertForQuestionAnsweringSimple` (FlauBERT model) - `FunnelConfig` configuration class: `FunnelForQuestionAnswering` (Funnel Transformer model) - `GPT2Config` configuration class: `GPT2ForQuestionAnswering` (OpenAI GPT-2 model) - `GPTJConfig` configuration class: `GPTJForQuestionAnswering` (GPT-J model) - `GPTNeoConfig` configuration class: `GPTNeoForQuestionAnswering` (GPT Neo model) - `GPTNeoXConfig` configuration class: `GPTNeoXForQuestionAnswering` (GPT NeoX model) - `IBertConfig` configuration class: `IBertForQuestionAnswering` (I-BERT model) - `LEDConfig` configuration class: `LEDForQuestionAnswering` (LED model) - `LayoutLMv2Config` configuration class: `LayoutLMv2ForQuestionAnswering` (LayoutLMv2 model) - `LayoutLMv3Config` configuration class: `LayoutLMv3ForQuestionAnswering` (LayoutLMv3 model) - `LiltConfig` configuration class: `LiltForQuestionAnswering` (LiLT model) - `LlamaConfig` configuration class: `LlamaForQuestionAnswering` (LLaMA model) - `LongformerConfig` configuration class: `LongformerForQuestionAnswering` (Longformer model) - `LukeConfig` configuration class: `LukeForQuestionAnswering` (LUKE model) - `LxmertConfig` configuration class: `LxmertForQuestionAnswering` (LXMERT model) - `MBartConfig` configuration class: `MBartForQuestionAnswering` (mBART model) - `MPNetConfig` configuration class: `MPNetForQuestionAnswering` (MPNet model) - `MT5Config` configuration class: `MT5ForQuestionAnswering` (MT5 model) - `MarkupLMConfig` configuration class: `MarkupLMForQuestionAnswering` (MarkupLM model) - `MegaConfig` configuration class: `MegaForQuestionAnswering` (MEGA model) - `MegatronBertConfig` configuration class: `MegatronBertForQuestionAnswering` (Megatron-BERT model) - `MiniMaxConfig` configuration class: `MiniMaxForQuestionAnswering` (MiniMax model) - `MinistralConfig` 
configuration class: `MinistralForQuestionAnswering` (Ministral model) - `MistralConfig` configuration class: `MistralForQuestionAnswering` (Mistral model) - `MixtralConfig` configuration class: `MixtralForQuestionAnswering` (Mixtral model) - `MobileBertConfig` configuration class: `MobileBertForQuestionAnswering` (MobileBERT model) - `ModernBertConfig` configuration class: `ModernBertForQuestionAnswering` (ModernBERT model) - `MptConfig` configuration class: `MptForQuestionAnswering` (MPT model) - `MraConfig` configuration class: `MraForQuestionAnswering` (MRA model) - `MvpConfig` configuration class: `MvpForQuestionAnswering` (MVP model) - `NemotronConfig` configuration class: `NemotronForQuestionAnswering` (Nemotron model) - `NezhaConfig` configuration class: `NezhaForQuestionAnswering` (Nezha model) - `NystromformerConfig` configuration class: `NystromformerForQuestionAnswering` (Nyströmformer model) - `OPTConfig` configuration class: `OPTForQuestionAnswering` (OPT model) - `QDQBertConfig` configuration class: `QDQBertForQuestionAnswering` (QDQBert model) - `Qwen2Config` configuration class: `Qwen2ForQuestionAnswering` (Qwen2 model) - `Qwen2MoeConfig` configuration class: `Qwen2MoeForQuestionAnswering` (Qwen2MoE model) - `Qwen3Config` configuration class: `Qwen3ForQuestionAnswering` (Qwen3 model) - `Qwen3MoeConfig` configuration class: `Qwen3MoeForQuestionAnswering` (Qwen3MoE model) - `Qwen3NextConfig` configuration class: `Qwen3NextForQuestionAnswering` (Qwen3Next model) - `ReformerConfig` configuration class: `ReformerForQuestionAnswering` (Reformer model) - `RemBertConfig` configuration class: `RemBertForQuestionAnswering` (RemBERT model) - `RoCBertConfig` configuration class: `RoCBertForQuestionAnswering` (RoCBert model) - `RoFormerConfig` configuration class: `RoFormerForQuestionAnswering` (RoFormer model) - `RobertaConfig` configuration class: `RobertaForQuestionAnswering` (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: 
`RobertaPreLayerNormForQuestionAnswering` (RoBERTa-PreLayerNorm model) - `SeedOssConfig` configuration class: `SeedOssForQuestionAnswering` (SeedOss model) - `SmolLM3Config` configuration class: `SmolLM3ForQuestionAnswering` (SmolLM3 model) - `SplinterConfig` configuration class: `SplinterForQuestionAnswering` (Splinter model) - `SqueezeBertConfig` configuration class: `SqueezeBertForQuestionAnswering` (SqueezeBERT model) - `T5Config` configuration class: `T5ForQuestionAnswering` (T5 model) - `UMT5Config` configuration class: `UMT5ForQuestionAnswering` (UMT5 model) - `XLMConfig` configuration class: `XLMForQuestionAnsweringSimple` (XLM model) - `XLMRobertaConfig` configuration class: `XLMRobertaForQuestionAnswering` (XLM-RoBERTa model) - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForQuestionAnswering` (XLM-RoBERTa-XL model) - `XLNetConfig` configuration class: `XLNetForQuestionAnsweringSimple` (XLNet model) - `XmodConfig` configuration class: `XmodForQuestionAnswering` (X-MOD model) - `YosoConfig` configuration class: `YosoForQuestionAnswering` (YOSO model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForQuestionAnswering.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [AlbertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertForQuestionAnswering) (ALBERT model)
- **arcee** -- `ArceeForQuestionAnswering` (Arcee model)
- **bart** -- [BartForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartForQuestionAnswering) (BART model)
- **bert** -- [BertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertForQuestionAnswering) (BERT model)
- **big_bird** -- [BigBirdForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdForQuestionAnswering) (BigBird model)
- **bigbird_pegasus** -- [BigBirdPegasusForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForQuestionAnswering) (BigBird-Pegasus model)
- **bloom** -- [BloomForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bloom#transformers.BloomForQuestionAnswering) (BLOOM model)
- **camembert** -- [CamembertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertForQuestionAnswering) (CamemBERT model)
- **canine** -- [CanineForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/canine#transformers.CanineForQuestionAnswering) (CANINE model)
- **convbert** -- [ConvBertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertForQuestionAnswering) (ConvBERT model)
- **data2vec-text** -- [Data2VecTextForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecTextForQuestionAnswering) (Data2VecText model)
- **deberta** -- [DebertaForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaForQuestionAnswering) (DeBERTa model)
- **deberta-v2** -- [DebertaV2ForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2ForQuestionAnswering) (DeBERTa-v2 model)
- **diffllama** -- `DiffLlamaForQuestionAnswering` (DiffLlama model)
- **distilbert** -- `DistilBertForQuestionAnswering` (DistilBERT model)
- **electra** -- `ElectraForQuestionAnswering` (ELECTRA model)
- **ernie** -- `ErnieForQuestionAnswering` (ERNIE model)
- **ernie_m** -- `ErnieMForQuestionAnswering` (ErnieM model)
- **exaone4** -- `Exaone4ForQuestionAnswering` (EXAONE-4.0 model)
- **falcon** -- `FalconForQuestionAnswering` (Falcon model)
- **flaubert** -- `FlaubertForQuestionAnsweringSimple` (FlauBERT model)
- **fnet** -- `FNetForQuestionAnswering` (FNet model)
- **funnel** -- `FunnelForQuestionAnswering` (Funnel Transformer model)
- **gpt2** -- `GPT2ForQuestionAnswering` (OpenAI GPT-2 model)
- **gpt_neo** -- `GPTNeoForQuestionAnswering` (GPT Neo model)
- **gpt_neox** -- `GPTNeoXForQuestionAnswering` (GPT NeoX model)
- **gptj** -- `GPTJForQuestionAnswering` (GPT-J model)
- **ibert** -- `IBertForQuestionAnswering` (I-BERT model)
- **layoutlmv2** -- `LayoutLMv2ForQuestionAnswering` (LayoutLMv2 model)
- **layoutlmv3** -- `LayoutLMv3ForQuestionAnswering` (LayoutLMv3 model)
- **led** -- `LEDForQuestionAnswering` (LED model)
- **lilt** -- `LiltForQuestionAnswering` (LiLT model)
- **llama** -- `LlamaForQuestionAnswering` (LLaMA model)
- **longformer** -- `LongformerForQuestionAnswering` (Longformer model)
- **luke** -- `LukeForQuestionAnswering` (LUKE model)
- **lxmert** -- `LxmertForQuestionAnswering` (LXMERT model)
- **markuplm** -- `MarkupLMForQuestionAnswering` (MarkupLM model)
- **mbart** -- `MBartForQuestionAnswering` (mBART model)
- **mega** -- `MegaForQuestionAnswering` (MEGA model)
- **megatron-bert** -- `MegatronBertForQuestionAnswering` (Megatron-BERT model)
- **minimax** -- `MiniMaxForQuestionAnswering` (MiniMax model)
- **ministral** -- `MinistralForQuestionAnswering` (Ministral model)
- **mistral** -- `MistralForQuestionAnswering` (Mistral model)
- **mixtral** -- `MixtralForQuestionAnswering` (Mixtral model)
- **mobilebert** -- `MobileBertForQuestionAnswering` (MobileBERT model)
- **modernbert** -- `ModernBertForQuestionAnswering` (ModernBERT model)
- **mpnet** -- `MPNetForQuestionAnswering` (MPNet model)
- **mpt** -- `MptForQuestionAnswering` (MPT model)
- **mra** -- `MraForQuestionAnswering` (MRA model)
- **mt5** -- `MT5ForQuestionAnswering` (MT5 model)
- **mvp** -- `MvpForQuestionAnswering` (MVP model)
- **nemotron** -- `NemotronForQuestionAnswering` (Nemotron model)
- **nezha** -- `NezhaForQuestionAnswering` (Nezha model)
- **nystromformer** -- `NystromformerForQuestionAnswering` (Nyströmformer model)
- **opt** -- `OPTForQuestionAnswering` (OPT model)
- **qdqbert** -- `QDQBertForQuestionAnswering` (QDQBert model)
- **qwen2** -- `Qwen2ForQuestionAnswering` (Qwen2 model)
- **qwen2_moe** -- `Qwen2MoeForQuestionAnswering` (Qwen2MoE model)
- **qwen3** -- `Qwen3ForQuestionAnswering` (Qwen3 model)
- **qwen3_moe** -- `Qwen3MoeForQuestionAnswering` (Qwen3MoE model)
- **qwen3_next** -- `Qwen3NextForQuestionAnswering` (Qwen3Next model)
- **reformer** -- `ReformerForQuestionAnswering` (Reformer model)
- **rembert** -- `RemBertForQuestionAnswering` (RemBERT model)
- **roberta** -- `RobertaForQuestionAnswering` (RoBERTa model)
- **roberta-prelayernorm** -- `RobertaPreLayerNormForQuestionAnswering` (RoBERTa-PreLayerNorm model)
- **roc_bert** -- `RoCBertForQuestionAnswering` (RoCBert model)
- **roformer** -- `RoFormerForQuestionAnswering` (RoFormer model)
- **seed_oss** -- `SeedOssForQuestionAnswering` (SeedOss model)
- **smollm3** -- `SmolLM3ForQuestionAnswering` (SmolLM3 model)
- **splinter** -- `SplinterForQuestionAnswering` (Splinter model)
- **squeezebert** -- `SqueezeBertForQuestionAnswering` (SqueezeBERT model)
- **t5** -- `T5ForQuestionAnswering` (T5 model)
- **umt5** -- `UMT5ForQuestionAnswering` (UMT5 model)
- **xlm** -- `XLMForQuestionAnsweringSimple` (XLM model)
- **xlm-roberta** -- `XLMRobertaForQuestionAnswering` (XLM-RoBERTa model)
- **xlm-roberta-xl** -- `XLMRobertaXLForQuestionAnswering` (XLM-RoBERTa-XL model)
- **xlnet** -- `XLNetForQuestionAnsweringSimple` (XLNet model)
- **xmod** -- `XmodForQuestionAnswering` (X-MOD model)
- **yoso** -- `YosoForQuestionAnswering` (YOSO model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.
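The selection order described above (config `model_type` first, then pattern matching on the checkpoint name) can be sketched in plain Python. This is a hypothetical simplification, not the actual transformers implementation; `QA_MAPPING` and `resolve_qa_class` are illustrative names only:

```python
# Hypothetical sketch of how an auto class picks its concrete model class:
# first via the config's `model_type`, then by pattern matching on the
# checkpoint name as a fallback.
QA_MAPPING = {
    "bert": "BertForQuestionAnswering",
    "roberta": "RobertaForQuestionAnswering",
    "xlm-roberta": "XLMRobertaForQuestionAnswering",
}


def resolve_qa_class(name_or_path, model_type=None):
    """Return the mapped class name, mimicking the documented selection order."""
    if model_type is not None:
        return QA_MAPPING[model_type]
    # Fallback: pattern matching on the name; try longer keys first so that
    # "xlm-roberta" wins over its substring "roberta".
    for key in sorted(QA_MAPPING, key=len, reverse=True):
        if key in name_or_path:
            return QA_MAPPING[key]
    raise ValueError(f"Could not infer model type from {name_or_path!r}")
```

Note why pattern matching alone is fragile: `"FacebookAI/xlm-roberta-base"` contains both `"roberta"` and `"xlm-roberta"`, so ordering matters; the `model_type` stored in the config is the authoritative signal.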

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForQuestionAnswering.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
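The two-branch `kwargs` routing described above can be sketched in plain Python. This is a hypothetical simplification of the documented behavior, not the actual transformers code; `split_kwargs` is an illustrative name:

```python
# Hypothetical sketch of the documented kwargs routing: when an explicit
# `config` is provided, all kwargs go to the model's __init__; otherwise,
# keys matching config attributes update the configuration and only the
# remainder reaches the model.
def split_kwargs(config_attrs, kwargs, config_provided):
    """Return (config_updates, model_kwargs) per the documented rules."""
    if config_provided:
        # All relevant config updates are assumed to be done already.
        return {}, dict(kwargs)
    config_updates = {k: v for k, v in kwargs.items() if k in config_attrs}
    model_kwargs = {k: v for k, v in kwargs.items() if k not in config_attrs}
    return config_updates, model_kwargs
```

For example, with no `config` argument, `output_attentions=True` would be applied to the loaded configuration rather than passed to the model constructor.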

### TFAutoModelForQuestionAnswering[[transformers.TFAutoModelForQuestionAnswering]]

#### transformers.TFAutoModelForQuestionAnswering[[transformers.TFAutoModelForQuestionAnswering]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L646)

This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForQuestionAnswering.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [TFAlbertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertForQuestionAnswering) (ALBERT model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForQuestionAnswering) (BERT model)
  - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [TFCamembertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForQuestionAnswering) (CamemBERT model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.TFConvBertForQuestionAnswering) (ConvBERT model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) configuration class: [TFDebertaForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.TFDebertaForQuestionAnswering) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2ForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.TFDebertaV2ForQuestionAnswering) (DeBERTa-v2 model)
  - `DistilBertConfig` configuration class: `TFDistilBertForQuestionAnswering` (DistilBERT model)
  - `ElectraConfig` configuration class: `TFElectraForQuestionAnswering` (ELECTRA model)
  - `FlaubertConfig` configuration class: `TFFlaubertForQuestionAnsweringSimple` (FlauBERT model)
  - `FunnelConfig` configuration class: `TFFunnelForQuestionAnswering` (Funnel Transformer model)
  - `GPTJConfig` configuration class: `TFGPTJForQuestionAnswering` (GPT-J model)
  - `LayoutLMv3Config` configuration class: `TFLayoutLMv3ForQuestionAnswering` (LayoutLMv3 model)
  - `LongformerConfig` configuration class: `TFLongformerForQuestionAnswering` (Longformer model)
  - `MPNetConfig` configuration class: `TFMPNetForQuestionAnswering` (MPNet model)
  - `MobileBertConfig` configuration class: `TFMobileBertForQuestionAnswering` (MobileBERT model)
  - `RemBertConfig` configuration class: `TFRemBertForQuestionAnswering` (RemBERT model)
  - `RoFormerConfig` configuration class: `TFRoFormerForQuestionAnswering` (RoFormer model)
  - `RobertaConfig` configuration class: `TFRobertaForQuestionAnswering` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForQuestionAnswering` (RoBERTa-PreLayerNorm model)
  - `XLMConfig` configuration class: `TFXLMForQuestionAnsweringSimple` (XLM model)
  - `XLMRobertaConfig` configuration class: `TFXLMRobertaForQuestionAnswering` (XLM-RoBERTa model)
  - `XLNetConfig` configuration class: `TFXLNetForQuestionAnsweringSimple` (XLNet model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a question answering head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForQuestionAnswering.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [TFAlbertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertForQuestionAnswering) (ALBERT model) - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForQuestionAnswering) (BERT model) - [CamembertConfig](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.CamembertConfig) configuration class: [TFCamembertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForQuestionAnswering) (CamemBERT model) - [ConvBertConfig](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.TFConvBertForQuestionAnswering) (ConvBERT model) - [DebertaConfig](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.DebertaConfig) configuration class: [TFDebertaForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.TFDebertaForQuestionAnswering) (DeBERTa model) - [DebertaV2Config](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2ForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.TFDebertaV2ForQuestionAnswering) (DeBERTa-v2 model) - `DistilBertConfig` configuration class: `TFDistilBertForQuestionAnswering` (DistilBERT model) - `ElectraConfig` configuration class: `TFElectraForQuestionAnswering` (ELECTRA model) - `FlaubertConfig` configuration class: `TFFlaubertForQuestionAnsweringSimple` 
(FlauBERT model) - `FunnelConfig` configuration class: `TFFunnelForQuestionAnswering` (Funnel Transformer model) - `GPTJConfig` configuration class: `TFGPTJForQuestionAnswering` (GPT-J model) - `LayoutLMv3Config` configuration class: `TFLayoutLMv3ForQuestionAnswering` (LayoutLMv3 model) - `LongformerConfig` configuration class: `TFLongformerForQuestionAnswering` (Longformer model) - `MPNetConfig` configuration class: `TFMPNetForQuestionAnswering` (MPNet model) - `MobileBertConfig` configuration class: `TFMobileBertForQuestionAnswering` (MobileBERT model) - `RemBertConfig` configuration class: `TFRemBertForQuestionAnswering` (RemBERT model) - `RoFormerConfig` configuration class: `TFRoFormerForQuestionAnswering` (RoFormer model) - `RobertaConfig` configuration class: `TFRobertaForQuestionAnswering` (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForQuestionAnswering` (RoBERTa-PreLayerNorm model) - `XLMConfig` configuration class: `TFXLMForQuestionAnsweringSimple` (XLM model) - `XLMRobertaConfig` configuration class: `TFXLMRobertaForQuestionAnswering` (XLM-RoBERTa model) - `XLNetConfig` configuration class: `TFXLNetForQuestionAnsweringSimple` (XLNet model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForQuestionAnswering.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [TFAlbertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.TFAlbertForQuestionAnswering) (ALBERT model)
- **bert** -- [TFBertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.TFBertForQuestionAnswering) (BERT model)
- **camembert** -- [TFCamembertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/camembert#transformers.TFCamembertForQuestionAnswering) (CamemBERT model)
- **convbert** -- [TFConvBertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/convbert#transformers.TFConvBertForQuestionAnswering) (ConvBERT model)
- **deberta** -- [TFDebertaForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/deberta#transformers.TFDebertaForQuestionAnswering) (DeBERTa model)
- **deberta-v2** -- [TFDebertaV2ForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/deberta-v2#transformers.TFDebertaV2ForQuestionAnswering) (DeBERTa-v2 model)
- **distilbert** -- `TFDistilBertForQuestionAnswering` (DistilBERT model)
- **electra** -- `TFElectraForQuestionAnswering` (ELECTRA model)
- **flaubert** -- `TFFlaubertForQuestionAnsweringSimple` (FlauBERT model)
- **funnel** -- `TFFunnelForQuestionAnswering` (Funnel Transformer model)
- **gptj** -- `TFGPTJForQuestionAnswering` (GPT-J model)
- **layoutlmv3** -- `TFLayoutLMv3ForQuestionAnswering` (LayoutLMv3 model)
- **longformer** -- `TFLongformerForQuestionAnswering` (Longformer model)
- **mobilebert** -- `TFMobileBertForQuestionAnswering` (MobileBERT model)
- **mpnet** -- `TFMPNetForQuestionAnswering` (MPNet model)
- **rembert** -- `TFRemBertForQuestionAnswering` (RemBERT model)
- **roberta** -- `TFRobertaForQuestionAnswering` (RoBERTa model)
- **roberta-prelayernorm** -- `TFRobertaPreLayerNormForQuestionAnswering` (RoBERTa-PreLayerNorm model)
- **roformer** -- `TFRoFormerForQuestionAnswering` (RoFormer model)
- **xlm** -- `TFXLMForQuestionAnsweringSimple` (XLM model)
- **xlm-roberta** -- `TFXLMRobertaForQuestionAnswering` (XLM-RoBERTa model)
- **xlnet** -- `TFXLNetForQuestionAnsweringSimple` (XLNet model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForQuestionAnswering.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g, `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model in a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### FlaxAutoModelForQuestionAnswering[[transformers.FlaxAutoModelForQuestionAnswering]]

#### transformers.FlaxAutoModelForQuestionAnswering[[transformers.FlaxAutoModelForQuestionAnswering]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L329)

This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForQuestionAnswering.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [FlaxAlbertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertForQuestionAnswering) (ALBERT model)
  - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.FlaxBartForQuestionAnswering) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForQuestionAnswering) (BERT model)
  - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [FlaxBigBirdForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForQuestionAnswering) (BigBird model)
  - `DistilBertConfig` configuration class: `FlaxDistilBertForQuestionAnswering` (DistilBERT model)
  - `ElectraConfig` configuration class: `FlaxElectraForQuestionAnswering` (ELECTRA model)
  - `MBartConfig` configuration class: `FlaxMBartForQuestionAnswering` (mBART model)
  - `RoFormerConfig` configuration class: `FlaxRoFormerForQuestionAnswering` (RoFormer model)
  - `RobertaConfig` configuration class: `FlaxRobertaForQuestionAnswering` (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForQuestionAnswering` (RoBERTa-PreLayerNorm model)
  - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForQuestionAnswering` (XLM-RoBERTa model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a question answering head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForQuestionAnswering.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.AlbertConfig) configuration class: [FlaxAlbertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertForQuestionAnswering) (ALBERT model) - [BartConfig](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.FlaxBartForQuestionAnswering) (BART model) - [BertConfig](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForQuestionAnswering) (BERT model) - [BigBirdConfig](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [FlaxBigBirdForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForQuestionAnswering) (BigBird model) - `DistilBertConfig` configuration class: `FlaxDistilBertForQuestionAnswering` (DistilBERT model) - `ElectraConfig` configuration class: `FlaxElectraForQuestionAnswering` (ELECTRA model) - `MBartConfig` configuration class: `FlaxMBartForQuestionAnswering` (mBART model) - `RoFormerConfig` configuration class: `FlaxRoFormerForQuestionAnswering` (RoFormer model) - `RobertaConfig` configuration class: `FlaxRobertaForQuestionAnswering` (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForQuestionAnswering` (RoBERTa-PreLayerNorm model) - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForQuestionAnswering` (XLM-RoBERTa model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
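The default behavior described for `attn_implementation` can be sketched as a small helper. This is a hypothetical simplification (the helper name and signature are invented for illustration; the actual check in the library also considers hardware and per-model backend support):

```python
def select_attn_implementation(requested=None, torch_version=(2, 2, 0), sdpa_available=True):
    """Sketch of the default attention-backend choice: an explicit request
    always wins; otherwise SDPA is preferred on torch >= 2.1.1 when it is
    available, and the manual eager implementation is the final fallback."""
    if requested is not None:
        # One of "eager", "sdpa", or "flash_attention_2".
        return requested
    if sdpa_available and torch_version >= (2, 1, 1):
        return "sdpa"
    return "eager"
```

For example, under these assumptions an older torch such as `(2, 0, 1)` falls back to `"eager"`, while an explicit `"flash_attention_2"` request is passed through unchanged.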
#### from_pretrained[[transformers.FlaxAutoModelForQuestionAnswering.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [FlaxAlbertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/albert#transformers.FlaxAlbertForQuestionAnswering) (ALBERT model)
- **bart** -- [FlaxBartForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bart#transformers.FlaxBartForQuestionAnswering) (BART model)
- **bert** -- [FlaxBertForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/bert#transformers.FlaxBertForQuestionAnswering) (BERT model)
- **big_bird** -- [FlaxBigBirdForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/big_bird#transformers.FlaxBigBirdForQuestionAnswering) (BigBird model)
- **distilbert** -- `FlaxDistilBertForQuestionAnswering` (DistilBERT model)
- **electra** -- `FlaxElectraForQuestionAnswering` (ELECTRA model)
- **mbart** -- `FlaxMBartForQuestionAnswering` (mBART model)
- **roberta** -- `FlaxRobertaForQuestionAnswering` (RoBERTa model)
- **roberta-prelayernorm** -- `FlaxRobertaPreLayerNormForQuestionAnswering` (RoBERTa-PreLayerNorm model)
- **roformer** -- `FlaxRoFormerForQuestionAnswering` (RoFormer model)
- **xlm-roberta** -- `FlaxXLMRobertaForQuestionAnswering` (XLM-RoBERTa model)
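The two-step selection described above — first the config's `model_type`, then pattern matching on the checkpoint name — can be sketched in plain Python. The mapping and function below are hypothetical simplifications for illustration only; the real dispatch lives in `auto_factory.py` and covers the full model registry:

```python
# Hypothetical, trimmed-down registry for the sketch. More specific
# patterns are listed first so "roberta" is not shadowed by "bert".
QA_MAPPING = {
    "albert": "FlaxAlbertForQuestionAnswering",
    "roberta": "FlaxRobertaForQuestionAnswering",
    "bert": "FlaxBertForQuestionAnswering",
}

def resolve_model_class(model_type=None, name_or_path=""):
    # Step 1: trust the config's `model_type` when it is known.
    if model_type in QA_MAPPING:
        return QA_MAPPING[model_type]
    # Step 2: fall back to pattern matching on the model name or path.
    for pattern, cls_name in QA_MAPPING.items():
        if pattern in name_or_path.lower():
            return cls_name
    raise ValueError(f"Unrecognized model for {name_or_path!r}")
```

Under these assumptions, `resolve_model_class(name_or_path="google-bert/bert-base-cased")` resolves to `"FlaxBertForQuestionAnswering"` even when no config is available.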

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForQuestionAnswering.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model into a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForTextEncoding[[transformers.AutoModelForTextEncoding]]

#### transformers.AutoModelForTextEncoding[[transformers.AutoModelForTextEncoding]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L1932)

### TFAutoModelForTextEncoding[[transformers.TFAutoModelForTextEncoding]]

#### transformers.TFAutoModelForTextEncoding[[transformers.TFAutoModelForTextEncoding]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L534)

## Computer vision

以下の自動クラスは、次のコンピュータービジョンタスクに利用可能です。

### AutoModelForDepthEstimation[[transformers.AutoModelForDepthEstimation]]

#### transformers.AutoModelForDepthEstimation[[transformers.AutoModelForDepthEstimation]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2144)

This is a generic model class that will be instantiated as one of the model classes of the library (with a depth estimation head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForDepthEstimation.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `DPTConfig` configuration class: `DPTForDepthEstimation` (DPT model)
  - `DepthAnythingConfig` configuration class: `DepthAnythingForDepthEstimation` (Depth Anything model)
  - `DepthProConfig` configuration class: `DepthProForDepthEstimation` (DepthPro model)
  - `GLPNConfig` configuration class: `GLPNForDepthEstimation` (GLPN model)
  - `PromptDepthAnythingConfig` configuration class: `PromptDepthAnythingForDepthEstimation` (PromptDepthAnything model)
  - `ZoeDepthConfig` configuration class: `ZoeDepthForDepthEstimation` (ZoeDepth model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a depth estimation head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForDepthEstimation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForDepthEstimation.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `DPTConfig` configuration class: `DPTForDepthEstimation` (DPT model) - `DepthAnythingConfig` configuration class: `DepthAnythingForDepthEstimation` (Depth Anything model) - `DepthProConfig` configuration class: `DepthProForDepthEstimation` (DepthPro model) - `GLPNConfig` configuration class: `GLPNForDepthEstimation` (GLPN model) - `PromptDepthAnythingConfig` configuration class: `PromptDepthAnythingForDepthEstimation` (PromptDepthAnything model) - `ZoeDepthConfig` configuration class: `ZoeDepthForDepthEstimation` (ZoeDepth model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForDepthEstimation.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a depth estimation head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **depth_anything** -- `DepthAnythingForDepthEstimation` (Depth Anything model)
- **depth_pro** -- `DepthProForDepthEstimation` (DepthPro model)
- **dpt** -- `DPTForDepthEstimation` (DPT model)
- **glpn** -- `GLPNForDepthEstimation` (GLPN model)
- **prompt_depth_anything** -- `PromptDepthAnythingForDepthEstimation` (PromptDepthAnything model)
- **zoedepth** -- `ZoeDepthForDepthEstimation` (ZoeDepth model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForDepthEstimation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForDepthEstimation.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForDepthEstimation.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForDepthEstimation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForImageClassification[[transformers.AutoModelForImageClassification]]

#### transformers.AutoModelForImageClassification[[transformers.AutoModelForImageClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2069)

This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

from_configtransformers.AutoModelForImageClassification.from_confighttps://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424[{"name": "**kwargs", "val": ""}]- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BeitConfig](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitConfig) configuration class: [BeitForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitForImageClassification) (BEiT model)
  - [BitConfig](/docs/transformers/v4.57.1/ja/model_doc/bit#transformers.BitConfig) configuration class: [BitForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/bit#transformers.BitForImageClassification) (BiT model)
  - [CLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPConfig) configuration class: `CLIPForImageClassification` (CLIP model)
  - [ConvNextConfig](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextConfig) configuration class: [ConvNextForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextForImageClassification) (ConvNeXT model)
  - [ConvNextV2Config](/docs/transformers/v4.57.1/ja/model_doc/convnextv2#transformers.ConvNextV2Config) configuration class: [ConvNextV2ForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/convnextv2#transformers.ConvNextV2ForImageClassification) (ConvNeXTV2 model)
  - [CvtConfig](/docs/transformers/v4.57.1/ja/model_doc/cvt#transformers.CvtConfig) configuration class: [CvtForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/cvt#transformers.CvtForImageClassification) (CvT model)
  - [Data2VecVisionConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecVisionConfig) configuration class: [Data2VecVisionForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecVisionForImageClassification) (Data2VecVision model)
  - [DeiTConfig](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTConfig) configuration class: [DeiTForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTForImageClassification) or [DeiTForImageClassificationWithTeacher](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTForImageClassificationWithTeacher) (DeiT model)
  - [DinatConfig](/docs/transformers/v4.57.1/ja/model_doc/dinat#transformers.DinatConfig) configuration class: [DinatForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/dinat#transformers.DinatForImageClassification) (DiNAT model)
  - `Dinov2Config` configuration class: `Dinov2ForImageClassification` (DINOv2 model)
  - `Dinov2WithRegistersConfig` configuration class: `Dinov2WithRegistersForImageClassification` (DINOv2 with Registers model)
  - `DonutSwinConfig` configuration class: `DonutSwinForImageClassification` (DonutSwin model)
  - `EfficientFormerConfig` configuration class: `EfficientFormerForImageClassification` or `EfficientFormerForImageClassificationWithTeacher` (EfficientFormer model)
  - `EfficientNetConfig` configuration class: `EfficientNetForImageClassification` (EfficientNet model)
  - `FocalNetConfig` configuration class: `FocalNetForImageClassification` (FocalNet model)
  - `HGNetV2Config` configuration class: `HGNetV2ForImageClassification` (HGNet-V2 model)
  - `HieraConfig` configuration class: `HieraForImageClassification` (Hiera model)
  - `IJepaConfig` configuration class: `IJepaForImageClassification` (I-JEPA model)
  - `ImageGPTConfig` configuration class: `ImageGPTForImageClassification` (ImageGPT model)
  - `LevitConfig` configuration class: `LevitForImageClassification` or `LevitForImageClassificationWithTeacher` (LeViT model)
  - `MetaClip2Config` configuration class: `MetaClip2ForImageClassification` (MetaCLIP 2 model)
  - `MobileNetV1Config` configuration class: `MobileNetV1ForImageClassification` (MobileNetV1 model)
  - `MobileNetV2Config` configuration class: `MobileNetV2ForImageClassification` (MobileNetV2 model)
  - `MobileViTConfig` configuration class: `MobileViTForImageClassification` (MobileViT model)
  - `MobileViTV2Config` configuration class: `MobileViTV2ForImageClassification` (MobileViTV2 model)
  - `NatConfig` configuration class: `NatForImageClassification` (NAT model)
  - `PerceiverConfig` configuration class: `PerceiverForImageClassificationLearned` or `PerceiverForImageClassificationFourier` or `PerceiverForImageClassificationConvProcessing` (Perceiver model)
  - `PoolFormerConfig` configuration class: `PoolFormerForImageClassification` (PoolFormer model)
  - `PvtConfig` configuration class: `PvtForImageClassification` (PVT model)
  - `PvtV2Config` configuration class: `PvtV2ForImageClassification` (PVTv2 model)
  - `RegNetConfig` configuration class: `RegNetForImageClassification` (RegNet model)
  - `ResNetConfig` configuration class: `ResNetForImageClassification` (ResNet model)
  - `SegformerConfig` configuration class: `SegformerForImageClassification` (SegFormer model)
  - `ShieldGemma2Config` configuration class: `ShieldGemma2ForImageClassification` (Shieldgemma2 model)
  - `Siglip2Config` configuration class: `Siglip2ForImageClassification` (SigLIP2 model)
  - `SiglipConfig` configuration class: `SiglipForImageClassification` (SigLIP model)
  - `SwiftFormerConfig` configuration class: `SwiftFormerForImageClassification` (SwiftFormer model)
  - `SwinConfig` configuration class: `SwinForImageClassification` (Swin Transformer model)
  - `Swinv2Config` configuration class: `Swinv2ForImageClassification` (Swin Transformer V2 model)
  - `TextNetConfig` configuration class: `TextNetForImageClassification` (TextNet model)
  - `TimmWrapperConfig` configuration class: `TimmWrapperForImageClassification` (TimmWrapperModel model)
  - `VanConfig` configuration class: `VanForImageClassification` (VAN model)
  - `ViTConfig` configuration class: `ViTForImageClassification` (ViT model)
  - `ViTHybridConfig` configuration class: `ViTHybridForImageClassification` (ViT Hybrid model)
  - `ViTMSNConfig` configuration class: `ViTMSNForImageClassification` (ViTMSN model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an image classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/vit-base-patch16-224")
>>> model = AutoModelForImageClassification.from_config(config)
```
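To see the config-to-class selection without any download, a tiny configuration built locally works as well. This is a sketch with arbitrary small hyperparameter values chosen only for illustration; `from_config()` returns randomly initialized weights:

```python
from transformers import ViTConfig, AutoModelForImageClassification

# A deliberately tiny ViT configuration built locally (no download needed).
# from_config() selects the model class from the config class -- here,
# ViTConfig maps to ViTForImageClassification -- with random weights.
config = ViTConfig(
    image_size=32, patch_size=8, hidden_size=64,
    num_hidden_layers=2, num_attention_heads=2,
    intermediate_size=128, num_labels=3,
)
model = AutoModelForImageClassification.from_config(config)
print(type(model).__name__)  # ViTForImageClassification
```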

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class; see the mapping from configuration classes to model classes listed above.

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForImageClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an image classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **beit** -- [BeitForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitForImageClassification) (BEiT model)
- **bit** -- [BitForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/bit#transformers.BitForImageClassification) (BiT model)
- **clip** -- `CLIPForImageClassification` (CLIP model)
- **convnext** -- [ConvNextForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextForImageClassification) (ConvNeXT model)
- **convnextv2** -- [ConvNextV2ForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/convnextv2#transformers.ConvNextV2ForImageClassification) (ConvNeXTV2 model)
- **cvt** -- [CvtForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/cvt#transformers.CvtForImageClassification) (CvT model)
- **data2vec-vision** -- [Data2VecVisionForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecVisionForImageClassification) (Data2VecVision model)
- **deit** -- [DeiTForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTForImageClassification) or [DeiTForImageClassificationWithTeacher](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTForImageClassificationWithTeacher) (DeiT model)
- **dinat** -- [DinatForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/dinat#transformers.DinatForImageClassification) (DiNAT model)
- **dinov2** -- `Dinov2ForImageClassification` (DINOv2 model)
- **dinov2_with_registers** -- `Dinov2WithRegistersForImageClassification` (DINOv2 with Registers model)
- **donut-swin** -- `DonutSwinForImageClassification` (DonutSwin model)
- **efficientformer** -- `EfficientFormerForImageClassification` or `EfficientFormerForImageClassificationWithTeacher` (EfficientFormer model)
- **efficientnet** -- `EfficientNetForImageClassification` (EfficientNet model)
- **focalnet** -- `FocalNetForImageClassification` (FocalNet model)
- **hgnet_v2** -- `HGNetV2ForImageClassification` (HGNet-V2 model)
- **hiera** -- `HieraForImageClassification` (Hiera model)
- **ijepa** -- `IJepaForImageClassification` (I-JEPA model)
- **imagegpt** -- `ImageGPTForImageClassification` (ImageGPT model)
- **levit** -- `LevitForImageClassification` or `LevitForImageClassificationWithTeacher` (LeViT model)
- **metaclip_2** -- `MetaClip2ForImageClassification` (MetaCLIP 2 model)
- **mobilenet_v1** -- `MobileNetV1ForImageClassification` (MobileNetV1 model)
- **mobilenet_v2** -- `MobileNetV2ForImageClassification` (MobileNetV2 model)
- **mobilevit** -- `MobileViTForImageClassification` (MobileViT model)
- **mobilevitv2** -- `MobileViTV2ForImageClassification` (MobileViTV2 model)
- **nat** -- `NatForImageClassification` (NAT model)
- **perceiver** -- `PerceiverForImageClassificationLearned` or `PerceiverForImageClassificationFourier` or `PerceiverForImageClassificationConvProcessing` (Perceiver model)
- **poolformer** -- `PoolFormerForImageClassification` (PoolFormer model)
- **pvt** -- `PvtForImageClassification` (PVT model)
- **pvt_v2** -- `PvtV2ForImageClassification` (PVTv2 model)
- **regnet** -- `RegNetForImageClassification` (RegNet model)
- **resnet** -- `ResNetForImageClassification` (ResNet model)
- **segformer** -- `SegformerForImageClassification` (SegFormer model)
- **shieldgemma2** -- `ShieldGemma2ForImageClassification` (Shieldgemma2 model)
- **siglip** -- `SiglipForImageClassification` (SigLIP model)
- **siglip2** -- `Siglip2ForImageClassification` (SigLIP2 model)
- **swiftformer** -- `SwiftFormerForImageClassification` (SwiftFormer model)
- **swin** -- `SwinForImageClassification` (Swin Transformer model)
- **swinv2** -- `Swinv2ForImageClassification` (Swin Transformer V2 model)
- **textnet** -- `TextNetForImageClassification` (TextNet model)
- **timm_wrapper** -- `TimmWrapperForImageClassification` (TimmWrapperModel model)
- **van** -- `VanForImageClassification` (VAN model)
- **vit** -- `ViTForImageClassification` (ViT model)
- **vit_hybrid** -- `ViTHybridForImageClassification` (ViT Hybrid model)
- **vit_msn** -- `ViTMSNForImageClassification` (ViTMSN model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")

>>> # Update configuration during loading
>>> model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/tf_model_config.json")
>>> model = AutoModelForImageClassification.from_pretrained(
...     "./tf_model/tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```
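The input/output contract of the instantiated classifier can be sketched offline as well. This example uses a tiny, randomly initialized ViT configuration (arbitrary values, for illustration only) rather than pretrained weights, just to show that the model maps `pixel_values` to `logits` of shape `(batch, num_labels)`:

```python
import torch
from transformers import ViTConfig, AutoModelForImageClassification

# Offline sketch of the forward-pass contract: pixel_values in, logits out.
# The tiny config avoids any download; values are arbitrary for illustration.
config = ViTConfig(
    image_size=32, patch_size=8, hidden_size=64,
    num_hidden_layers=2, num_attention_heads=2,
    intermediate_size=128, num_labels=5,
)
model = AutoModelForImageClassification.from_config(config)
model.eval()  # from_pretrained() already puts the model in evaluation mode

pixel_values = torch.rand(2, 3, 32, 32)  # (batch, channels, height, width)
with torch.no_grad():
    logits = model(pixel_values=pixel_values).logits
print(logits.shape)  # torch.Size([2, 5])
```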

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### TFAutoModelForImageClassification[[transformers.TFAutoModelForImageClassification]]

#### transformers.TFAutoModelForImageClassification[[transformers.TFAutoModelForImageClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L585)

This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForImageClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:
  The model class to instantiate is selected based on the configuration class:

  - [ConvNextConfig](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.ConvNextConfig) configuration class: [TFConvNextForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.TFConvNextForImageClassification) (ConvNeXT model)
  - [ConvNextV2Config](/docs/transformers/v4.57.1/ja/model_doc/convnextv2#transformers.ConvNextV2Config) configuration class: [TFConvNextV2ForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/convnextv2#transformers.TFConvNextV2ForImageClassification) (ConvNeXTV2 model)
  - [CvtConfig](/docs/transformers/v4.57.1/ja/model_doc/cvt#transformers.CvtConfig) configuration class: [TFCvtForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/cvt#transformers.TFCvtForImageClassification) (CvT model)
  - [Data2VecVisionConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecVisionConfig) configuration class: [TFData2VecVisionForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.TFData2VecVisionForImageClassification) (Data2VecVision model)
  - [DeiTConfig](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTConfig) configuration class: [TFDeiTForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.TFDeiTForImageClassification) or [TFDeiTForImageClassificationWithTeacher](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.TFDeiTForImageClassificationWithTeacher) (DeiT model)
  - `EfficientFormerConfig` configuration class: `TFEfficientFormerForImageClassification` or `TFEfficientFormerForImageClassificationWithTeacher` (EfficientFormer model)
  - `MobileViTConfig` configuration class: `TFMobileViTForImageClassification` (MobileViT model)
  - `RegNetConfig` configuration class: `TFRegNetForImageClassification` (RegNet model)
  - `ResNetConfig` configuration class: `TFResNetForImageClassification` (ResNet model)
  - `SegformerConfig` configuration class: `TFSegformerForImageClassification` (SegFormer model)
  - `SwiftFormerConfig` configuration class: `TFSwiftFormerForImageClassification` (SwiftFormer model)
  - `SwinConfig` configuration class: `TFSwinForImageClassification` (Swin Transformer model)
  - `ViTConfig` configuration class: `TFViTForImageClassification` (ViT model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an image classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/vit-base-patch16-224")
>>> model = TFAutoModelForImageClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class; see the mapping from configuration classes to model classes listed above.

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForImageClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an image classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **convnext** -- [TFConvNextForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/convnext#transformers.TFConvNextForImageClassification) (ConvNeXT model)
- **convnextv2** -- [TFConvNextV2ForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/convnextv2#transformers.TFConvNextV2ForImageClassification) (ConvNeXTV2 model)
- **cvt** -- [TFCvtForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/cvt#transformers.TFCvtForImageClassification) (CvT model)
- **data2vec-vision** -- [TFData2VecVisionForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.TFData2VecVisionForImageClassification) (Data2VecVision model)
- **deit** -- [TFDeiTForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.TFDeiTForImageClassification) or [TFDeiTForImageClassificationWithTeacher](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.TFDeiTForImageClassificationWithTeacher) (DeiT model)
- **efficientformer** -- `TFEfficientFormerForImageClassification` or `TFEfficientFormerForImageClassificationWithTeacher` (EfficientFormer model)
- **mobilevit** -- `TFMobileViTForImageClassification` (MobileViT model)
- **regnet** -- `TFRegNetForImageClassification` (RegNet model)
- **resnet** -- `TFResNetForImageClassification` (ResNet model)
- **segformer** -- `TFSegformerForImageClassification` (SegFormer model)
- **swiftformer** -- `TFSwiftFormerForImageClassification` (SwiftFormer model)
- **swin** -- `TFSwinForImageClassification` (Swin Transformer model)
- **vit** -- `TFViTForImageClassification` (ViT model)
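
The dispatch described above (explicit `model_type` first, then pattern matching on the name/path) can be pictured as a plain dictionary lookup. The sketch below is an illustrative toy covering only a subset of the table, not the actual logic in `auto_factory.py`:

```python
from typing import Optional

# Toy subset of the model_type -> class table above (illustrative only).
IMAGE_CLASSIFICATION_MAPPING = {
    "convnext": "TFConvNextForImageClassification",
    "swin": "TFSwinForImageClassification",
    "vit": "TFViTForImageClassification",
}

def resolve_model_class(name_or_path: str, model_type: Optional[str] = None) -> str:
    """Pick a class: prefer the config's model_type, else pattern-match the path."""
    if model_type is not None:
        return IMAGE_CLASSIFICATION_MAPPING[model_type]
    # Fallback: pattern matching on the checkpoint name or path.
    for key, cls in IMAGE_CLASSIFICATION_MAPPING.items():
        if key in name_or_path:
            return cls
    raise ValueError(f"Could not infer a model type from {name_or_path!r}")

print(resolve_model_class("google/vit-base-patch16-224"))       # TFViTForImageClassification
print(resolve_model_class("my-checkpoint", model_type="swin"))  # TFSwinForImageClassification
```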

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")

>>> # Update configuration during loading
>>> model = TFAutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/vit_pt_model_config.json")
>>> model = TFAutoModelForImageClassification.from_pretrained(
...     "./pt_model/vit_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
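
The kwargs-splitting rule described above can be sketched as follows; `split_kwargs` and the attribute set are hypothetical illustrations, not transformers internals:

```python
# Sketch of how **kwargs are split when no `config` is passed: keys that
# match config attributes update the configuration, the rest are forwarded
# to the model's __init__. Illustrative only.
def split_kwargs(config_attributes, kwargs):
    config_updates = {k: v for k, v in kwargs.items() if k in config_attributes}
    model_kwargs = {k: v for k, v in kwargs.items() if k not in config_attributes}
    return config_updates, model_kwargs

config_updates, model_kwargs = split_kwargs(
    {"output_attentions", "hidden_size"},
    {"output_attentions": True, "some_model_arg": 1},
)
print(config_updates)  # {'output_attentions': True}
print(model_kwargs)    # {'some_model_arg': 1}
```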

### FlaxAutoModelForImageClassification[[transformers.FlaxAutoModelForImageClassification]]

#### transformers.FlaxAutoModelForImageClassification[[transformers.FlaxAutoModelForImageClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L361)

This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForImageClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BeitConfig](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitConfig) configuration class: [FlaxBeitForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.FlaxBeitForImageClassification) (BEiT model)
  - `Dinov2Config` configuration class: `FlaxDinov2ForImageClassification` (DINOv2 model)
  - `RegNetConfig` configuration class: `FlaxRegNetForImageClassification` (RegNet model)
  - `ResNetConfig` configuration class: `FlaxResNetForImageClassification` (ResNet model)
  - `ViTConfig` configuration class: `FlaxViTForImageClassification` (ViT model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an image classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/vit-base-patch16-224")
>>> model = FlaxAutoModelForImageClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BeitConfig](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitConfig) configuration class: [FlaxBeitForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.FlaxBeitForImageClassification) (BEiT model) - `Dinov2Config` configuration class: `FlaxDinov2ForImageClassification` (DINOv2 model) - `RegNetConfig` configuration class: `FlaxRegNetForImageClassification` (RegNet model) - `ResNetConfig` configuration class: `FlaxResNetForImageClassification` (ResNet model) - `ViTConfig` configuration class: `FlaxViTForImageClassification` (ViT model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.FlaxAutoModelForImageClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an image classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **beit** -- [FlaxBeitForImageClassification](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.FlaxBeitForImageClassification) (BEiT model)
- **dinov2** -- `FlaxDinov2ForImageClassification` (DINOv2 model)
- **regnet** -- `FlaxRegNetForImageClassification` (RegNet model)
- **resnet** -- `FlaxResNetForImageClassification` (ResNet model)
- **vit** -- `FlaxViTForImageClassification` (ViT model)

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/vit_pt_model_config.json")
>>> model = FlaxAutoModelForImageClassification.from_pretrained(
...     "./pt_model/vit_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model into a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForVideoClassification[[transformers.AutoModelForVideoClassification]]

#### transformers.AutoModelForVideoClassification[[transformers.AutoModelForVideoClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2151)

This is a generic model class that will be instantiated as one of the model classes of the library (with a video classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForVideoClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `TimesformerConfig` configuration class: `TimesformerForVideoClassification` (TimeSformer model)
  - `VJEPA2Config` configuration class: `VJEPA2ForVideoClassification` (VJEPA2Model model)
  - `VideoMAEConfig` configuration class: `VideoMAEForVideoClassification` (VideoMAE model)
  - `VivitConfig` configuration class: `VivitForVideoClassification` (ViViT model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a video classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForVideoClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("MCG-NJU/videomae-base")
>>> model = AutoModelForVideoClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `TimesformerConfig` configuration class: `TimesformerForVideoClassification` (TimeSformer model) - `VJEPA2Config` configuration class: `VJEPA2ForVideoClassification` (VJEPA2Model model) - `VideoMAEConfig` configuration class: `VideoMAEForVideoClassification` (VideoMAE model) - `VivitConfig` configuration class: `VivitForVideoClassification` (ViViT model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForVideoClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a video classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **timesformer** -- `TimesformerForVideoClassification` (TimeSformer model)
- **videomae** -- `VideoMAEForVideoClassification` (VideoMAE model)
- **vivit** -- `VivitForVideoClassification` (ViViT model)
- **vjepa2** -- `VJEPA2ForVideoClassification` (VJEPA2Model model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForVideoClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForVideoClassification.from_pretrained("MCG-NJU/videomae-base")

>>> # Update configuration during loading
>>> model = AutoModelForVideoClassification.from_pretrained("MCG-NJU/videomae-base", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/videomae_tf_model_config.json")
>>> model = AutoModelForVideoClassification.from_pretrained(
...     "./tf_model/videomae_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```
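
The eval-by-default behavior noted above is standard `torch.nn.Module` semantics; a minimal sketch with a stand-in module (no checkpoint download; assumes PyTorch is installed):

```python
import torch.nn as nn

# Stand-in for any model returned by from_pretrained().
model = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5))

model.eval()               # from_pretrained() does this for you
assert not model.training  # dropout and similar layers are now inactive

model.train()              # switch back before fine-tuning
assert model.training
```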

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForMaskedImageModeling[[transformers.AutoModelForMaskedImageModeling]]

#### transformers.AutoModelForMaskedImageModeling[[transformers.AutoModelForMaskedImageModeling]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2234)

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked image modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForMaskedImageModeling.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [DeiTConfig](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTConfig) configuration class: [DeiTForMaskedImageModeling](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTForMaskedImageModeling) (DeiT model)
  - `FocalNetConfig` configuration class: `FocalNetForMaskedImageModeling` (FocalNet model)
  - `SwinConfig` configuration class: `SwinForMaskedImageModeling` (Swin Transformer model)
  - `Swinv2Config` configuration class: `Swinv2ForMaskedImageModeling` (Swin Transformer V2 model)
  - `ViTConfig` configuration class: `ViTForMaskedImageModeling` (ViT model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a masked image modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForMaskedImageModeling

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForMaskedImageModeling.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [DeiTConfig](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTConfig) configuration class: [DeiTForMaskedImageModeling](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTForMaskedImageModeling) (DeiT model) - `FocalNetConfig` configuration class: `FocalNetForMaskedImageModeling` (FocalNet model) - `SwinConfig` configuration class: `SwinForMaskedImageModeling` (Swin Transformer model) - `Swinv2Config` configuration class: `Swinv2ForMaskedImageModeling` (Swin Transformer V2 model) - `ViTConfig` configuration class: `ViTForMaskedImageModeling` (ViT model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForMaskedImageModeling.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a masked image modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **deit** -- [DeiTForMaskedImageModeling](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTForMaskedImageModeling) (DeiT model)
- **focalnet** -- `FocalNetForMaskedImageModeling` (FocalNet model)
- **swin** -- `SwinForMaskedImageModeling` (Swin Transformer model)
- **swinv2** -- `Swinv2ForMaskedImageModeling` (Swin Transformer V2 model)
- **vit** -- `ViTForMaskedImageModeling` (ViT model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForMaskedImageModeling

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMaskedImageModeling.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForMaskedImageModeling.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForMaskedImageModeling.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### TFAutoModelForMaskedImageModeling[[transformers.TFAutoModelForMaskedImageModeling]]

#### transformers.TFAutoModelForMaskedImageModeling[[transformers.TFAutoModelForMaskedImageModeling]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L576)

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked image modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForMaskedImageModeling.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [DeiTConfig](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTConfig) configuration class: [TFDeiTForMaskedImageModeling](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.TFDeiTForMaskedImageModeling) (DeiT model)
  - `SwinConfig` configuration class: `TFSwinForMaskedImageModeling` (Swin Transformer model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a masked image modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForMaskedImageModeling

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForMaskedImageModeling.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [DeiTConfig](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.DeiTConfig) configuration class: [TFDeiTForMaskedImageModeling](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.TFDeiTForMaskedImageModeling) (DeiT model) - `SwinConfig` configuration class: `TFSwinForMaskedImageModeling` (Swin Transformer model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForMaskedImageModeling.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a masked image modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **deit** -- [TFDeiTForMaskedImageModeling](/docs/transformers/v4.57.1/ja/model_doc/deit#transformers.TFDeiTForMaskedImageModeling) (DeiT model)
- **swin** -- `TFSwinForMaskedImageModeling` (Swin Transformer model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForMaskedImageModeling

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForMaskedImageModeling.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForMaskedImageModeling.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForMaskedImageModeling.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g, `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model in a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForObjectDetection[[transformers.AutoModelForObjectDetection]]

#### transformers.AutoModelForObjectDetection[[transformers.AutoModelForObjectDetection]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2128)

This is a generic model class that will be instantiated as one of the model classes of the library (with an object detection head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForObjectDetection.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [ConditionalDetrConfig](/docs/transformers/v4.57.1/ja/model_doc/conditional_detr#transformers.ConditionalDetrConfig) configuration class: [ConditionalDetrForObjectDetection](/docs/transformers/v4.57.1/ja/model_doc/conditional_detr#transformers.ConditionalDetrForObjectDetection) (Conditional DETR model)
  - `DFineConfig` configuration class: `DFineForObjectDetection` (D-FINE model)
  - `DabDetrConfig` configuration class: `DabDetrForObjectDetection` (DAB-DETR model)
  - [DeformableDetrConfig](/docs/transformers/v4.57.1/ja/model_doc/deformable_detr#transformers.DeformableDetrConfig) configuration class: [DeformableDetrForObjectDetection](/docs/transformers/v4.57.1/ja/model_doc/deformable_detr#transformers.DeformableDetrForObjectDetection) (Deformable DETR model)
  - [DetaConfig](/docs/transformers/v4.57.1/ja/model_doc/deta#transformers.DetaConfig) configuration class: [DetaForObjectDetection](/docs/transformers/v4.57.1/ja/model_doc/deta#transformers.DetaForObjectDetection) (DETA model)
  - [DetrConfig](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrConfig) configuration class: [DetrForObjectDetection](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrForObjectDetection) (DETR model)
  - `RTDetrConfig` configuration class: `RTDetrForObjectDetection` (RT-DETR model)
  - `RTDetrV2Config` configuration class: `RTDetrV2ForObjectDetection` (RT-DETRv2 model)
  - `TableTransformerConfig` configuration class: `TableTransformerForObjectDetection` (Table Transformer model)
  - `YolosConfig` configuration class: `YolosForObjectDetection` (YOLOS model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an object detection head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForObjectDetection

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForObjectDetection.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [ConditionalDetrConfig](/docs/transformers/v4.57.1/ja/model_doc/conditional_detr#transformers.ConditionalDetrConfig) configuration class: [ConditionalDetrForObjectDetection](/docs/transformers/v4.57.1/ja/model_doc/conditional_detr#transformers.ConditionalDetrForObjectDetection) (Conditional DETR model) - `DFineConfig` configuration class: `DFineForObjectDetection` (D-FINE model) - `DabDetrConfig` configuration class: `DabDetrForObjectDetection` (DAB-DETR model) - [DeformableDetrConfig](/docs/transformers/v4.57.1/ja/model_doc/deformable_detr#transformers.DeformableDetrConfig) configuration class: [DeformableDetrForObjectDetection](/docs/transformers/v4.57.1/ja/model_doc/deformable_detr#transformers.DeformableDetrForObjectDetection) (Deformable DETR model) - [DetaConfig](/docs/transformers/v4.57.1/ja/model_doc/deta#transformers.DetaConfig) configuration class: [DetaForObjectDetection](/docs/transformers/v4.57.1/ja/model_doc/deta#transformers.DetaForObjectDetection) (DETA model) - [DetrConfig](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrConfig) configuration class: [DetrForObjectDetection](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrForObjectDetection) (DETR model) - `RTDetrConfig` configuration class: `RTDetrForObjectDetection` (RT-DETR model) - `RTDetrV2Config` configuration class: `RTDetrV2ForObjectDetection` (RT-DETRv2 model) - `TableTransformerConfig` configuration class: `TableTransformerForObjectDetection` (Table Transformer model) - `YolosConfig` configuration class: `YolosForObjectDetection` (YOLOS model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForObjectDetection.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an object detection head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **conditional_detr** -- [ConditionalDetrForObjectDetection](/docs/transformers/v4.57.1/ja/model_doc/conditional_detr#transformers.ConditionalDetrForObjectDetection) (Conditional DETR model)
- **d_fine** -- `DFineForObjectDetection` (D-FINE model)
- **dab-detr** -- `DabDetrForObjectDetection` (DAB-DETR model)
- **deformable_detr** -- [DeformableDetrForObjectDetection](/docs/transformers/v4.57.1/ja/model_doc/deformable_detr#transformers.DeformableDetrForObjectDetection) (Deformable DETR model)
- **deta** -- [DetaForObjectDetection](/docs/transformers/v4.57.1/ja/model_doc/deta#transformers.DetaForObjectDetection) (DETA model)
- **detr** -- [DetrForObjectDetection](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrForObjectDetection) (DETR model)
- **rt_detr** -- `RTDetrForObjectDetection` (RT-DETR model)
- **rt_detr_v2** -- `RTDetrV2ForObjectDetection` (RT-DETRv2 model)
- **table-transformer** -- `TableTransformerForObjectDetection` (Table Transformer model)
- **yolos** -- `YolosForObjectDetection` (YOLOS model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForObjectDetection

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForObjectDetection.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForObjectDetection.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForObjectDetection.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
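The `kwargs` routing described above (configuration attributes vs. model `__init__` arguments) can be illustrated with a small, self-contained sketch. `TinyConfig`, `split_kwargs`, and the attribute names are hypothetical stand-ins for illustration only, not part of the transformers API:

```python
# Conceptual sketch of how `from_pretrained` routes **kwargs when no explicit
# `config` is passed: keys matching configuration attributes override the
# config, the rest are forwarded to the model's __init__.
class TinyConfig:
    """Hypothetical stand-in for a PretrainedConfig subclass."""

    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768


def split_kwargs(config, kwargs):
    config_overrides, model_kwargs = {}, {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # override the config attribute
            config_overrides[key] = value
        else:
            model_kwargs[key] = value  # passed on to the model __init__
    return config_overrides, model_kwargs


config = TinyConfig()
overrides, remaining = split_kwargs(config, {"output_attentions": True, "custom_flag": 1})
print(overrides)   # {'output_attentions': True}
print(remaining)   # {'custom_flag': 1}
print(config.output_attentions)  # True
```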

### AutoModelForImageSegmentation[[transformers.AutoModelForImageSegmentation]]

#### transformers.AutoModelForImageSegmentation[[transformers.AutoModelForImageSegmentation]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2085)

This is a generic model class that will be instantiated as one of the model classes of the library (with an image segmentation head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForImageSegmentation.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [DetrConfig](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrConfig) configuration class: [DetrForSegmentation](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrForSegmentation) (DETR model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an image segmentation head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForImageSegmentation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("facebook/detr-resnet-50-panoptic")
>>> model = AutoModelForImageSegmentation.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [DetrConfig](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrConfig) configuration class: [DetrForSegmentation](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrForSegmentation) (DETR model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
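The selection rule documented above (configuration class → model class) amounts to a lookup keyed on the config's type. A minimal, self-contained sketch of that dispatch follows; the two classes are bare placeholders mirroring the names in the mapping above, not the real transformers implementations:

```python
# Conceptual sketch of the from_config dispatch: the auto class keeps a
# mapping from configuration classes to model classes and instantiates
# the one matching the type of the given config.
class DetrConfig:
    """Placeholder for the real DetrConfig."""


class DetrForSegmentation:
    """Placeholder for the real DetrForSegmentation."""

    def __init__(self, config):
        self.config = config


MODEL_MAPPING = {DetrConfig: DetrForSegmentation}


def from_config(config):
    for config_cls, model_cls in MODEL_MAPPING.items():
        if isinstance(config, config_cls):
            return model_cls(config)
    raise ValueError(f"Unrecognized configuration class {type(config).__name__}")


model = from_config(DetrConfig())
print(type(model).__name__)  # DetrForSegmentation
```

An unrecognized configuration class raises a `ValueError`, which matches the behavior users see when passing, say, a BERT config to a segmentation auto class.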
#### from_pretrained[[transformers.AutoModelForImageSegmentation.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an image segmentation head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **detr** -- [DetrForSegmentation](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrForSegmentation) (DETR model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.
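The eval/train toggle described above follows PyTorch's `nn.Module` semantics. A torch-free conceptual sketch of the flag being flipped (`TinyModule` is a hypothetical stand-in, not a transformers class):

```python
# Conceptual sketch: from_pretrained hands back the model with training=False
# (so dropout and similar modules are inactive); model.train() re-enables
# training behavior before fine-tuning.
class TinyModule:
    def __init__(self):
        self.training = True

    def eval(self):
        self.training = False
        return self

    def train(self, mode=True):
        self.training = mode
        return self


model = TinyModule().eval()  # what from_pretrained does by default
print(model.training)        # False
model.train()                # switch back before fine-tuning
print(model.training)        # True
```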

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForImageSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageSegmentation.from_pretrained("facebook/detr-resnet-50-panoptic")

>>> # Update configuration during loading
>>> model = AutoModelForImageSegmentation.from_pretrained("facebook/detr-resnet-50-panoptic", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/model_config.json")
>>> model = AutoModelForImageSegmentation.from_pretrained(
...     "./tf_model/model.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForImageToImage[[transformers.AutoModelForImageToImage]]

#### transformers.AutoModelForImageToImage[[transformers.AutoModelForImageToImage]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L1936)

This is a generic model class that will be instantiated as one of the model classes of the library (with an image-to-image head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

### AutoModelForSemanticSegmentation[[transformers.AutoModelForSemanticSegmentation]]

#### transformers.AutoModelForSemanticSegmentation[[transformers.AutoModelForSemanticSegmentation]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2092)

This is a generic model class that will be instantiated as one of the model classes of the library (with a semantic segmentation head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForSemanticSegmentation.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BeitConfig](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitConfig) configuration class: [BeitForSemanticSegmentation](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitForSemanticSegmentation) (BEiT model)
  - `DPTConfig` configuration class: `DPTForSemanticSegmentation` (DPT model)
  - [Data2VecVisionConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecVisionConfig) configuration class: [Data2VecVisionForSemanticSegmentation](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecVisionForSemanticSegmentation) (Data2VecVision model)
  - `MobileNetV2Config` configuration class: `MobileNetV2ForSemanticSegmentation` (MobileNetV2 model)
  - `MobileViTConfig` configuration class: `MobileViTForSemanticSegmentation` (MobileViT model)
  - `MobileViTV2Config` configuration class: `MobileViTV2ForSemanticSegmentation` (MobileViTV2 model)
  - `SegformerConfig` configuration class: `SegformerForSemanticSegmentation` (SegFormer model)
  - `UperNetConfig` configuration class: `UperNetForSemanticSegmentation` (UPerNet model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a semantic segmentation head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSemanticSegmentation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")
>>> model = AutoModelForSemanticSegmentation.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BeitConfig](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitConfig) configuration class: [BeitForSemanticSegmentation](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitForSemanticSegmentation) (BEiT model) - `DPTConfig` configuration class: `DPTForSemanticSegmentation` (DPT model) - [Data2VecVisionConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecVisionConfig) configuration class: [Data2VecVisionForSemanticSegmentation](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecVisionForSemanticSegmentation) (Data2VecVision model) - `MobileNetV2Config` configuration class: `MobileNetV2ForSemanticSegmentation` (MobileNetV2 model) - `MobileViTConfig` configuration class: `MobileViTForSemanticSegmentation` (MobileViT model) - `MobileViTV2Config` configuration class: `MobileViTV2ForSemanticSegmentation` (MobileViTV2 model) - `SegformerConfig` configuration class: `SegformerForSemanticSegmentation` (SegFormer model) - `UperNetConfig` configuration class: `UperNetForSemanticSegmentation` (UPerNet model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForSemanticSegmentation.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a semantic segmentation head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **beit** -- [BeitForSemanticSegmentation](/docs/transformers/v4.57.1/ja/model_doc/beit#transformers.BeitForSemanticSegmentation) (BEiT model)
- **data2vec-vision** -- [Data2VecVisionForSemanticSegmentation](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecVisionForSemanticSegmentation) (Data2VecVision model)
- **dpt** -- `DPTForSemanticSegmentation` (DPT model)
- **mobilenet_v2** -- `MobileNetV2ForSemanticSegmentation` (MobileNetV2 model)
- **mobilevit** -- `MobileViTForSemanticSegmentation` (MobileViT model)
- **mobilevitv2** -- `MobileViTV2ForSemanticSegmentation` (MobileViTV2 model)
- **segformer** -- `SegformerForSemanticSegmentation` (SegFormer model)
- **upernet** -- `UperNetForSemanticSegmentation` (UPerNet model)
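The `model_type` lookup with its pattern-matching fallback can be sketched in a few lines; this is a simplified illustration of the resolution order (config's `model_type` first, then substring matching on the name/path), not the real registry in `transformers.models.auto`:

```python
# Conceptual sketch of from_pretrained model selection for this auto class.
SEMANTIC_SEG_MAPPING = {
    "beit": "BeitForSemanticSegmentation",
    "data2vec-vision": "Data2VecVisionForSemanticSegmentation",
    "dpt": "DPTForSemanticSegmentation",
    "mobilenet_v2": "MobileNetV2ForSemanticSegmentation",
    "mobilevit": "MobileViTForSemanticSegmentation",
    "mobilevitv2": "MobileViTV2ForSemanticSegmentation",
    "segformer": "SegformerForSemanticSegmentation",
    "upernet": "UperNetForSemanticSegmentation",
}


def resolve_model_class(name_or_path, model_type=None):
    if model_type is not None:  # the config's model_type, if available, wins
        return SEMANTIC_SEG_MAPPING[model_type]
    # Fallback: pattern matching on the name/path; check longer keys first so
    # "mobilevitv2" is not shadowed by its substring "mobilevit".
    for key in sorted(SEMANTIC_SEG_MAPPING, key=len, reverse=True):
        if key in name_or_path:
            return SEMANTIC_SEG_MAPPING[key]
    raise ValueError(f"Could not infer a model class from {name_or_path!r}")


print(resolve_model_class("nvidia/segformer-b0-finetuned-ade-512-512"))
# SegformerForSemanticSegmentation
```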

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSemanticSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")

>>> # Update configuration during loading
>>> model = AutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/model_config.json")
>>> model = AutoModelForSemanticSegmentation.from_pretrained(
...     "./tf_model/model.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### TFAutoModelForSemanticSegmentation[[transformers.TFAutoModelForSemanticSegmentation]]

#### transformers.TFAutoModelForSemanticSegmentation[[transformers.TFAutoModelForSemanticSegmentation]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L603)

This is a generic model class that will be instantiated as one of the model classes of the library (with a semantic segmentation head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForSemanticSegmentation.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [Data2VecVisionConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecVisionConfig) configuration class: [TFData2VecVisionForSemanticSegmentation](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.TFData2VecVisionForSemanticSegmentation) (Data2VecVision model)
  - `MobileViTConfig` configuration class: `TFMobileViTForSemanticSegmentation` (MobileViT model)
  - `SegformerConfig` configuration class: `TFSegformerForSemanticSegmentation` (SegFormer model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a semantic segmentation head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForSemanticSegmentation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")
>>> model = TFAutoModelForSemanticSegmentation.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [Data2VecVisionConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecVisionConfig) configuration class: [TFData2VecVisionForSemanticSegmentation](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.TFData2VecVisionForSemanticSegmentation) (Data2VecVision model) - `MobileViTConfig` configuration class: `TFMobileViTForSemanticSegmentation` (MobileViT model) - `SegformerConfig` configuration class: `TFSegformerForSemanticSegmentation` (SegFormer model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForSemanticSegmentation.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a semantic segmentation head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **data2vec-vision** -- [TFData2VecVisionForSemanticSegmentation](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.TFData2VecVisionForSemanticSegmentation) (Data2VecVision model)
- **mobilevit** -- `TFMobileViTForSemanticSegmentation` (MobileViT model)
- **segformer** -- `TFSegformerForSemanticSegmentation` (SegFormer model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForSemanticSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")

>>> # Update configuration during loading
>>> model = TFAutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/model_config.json")
>>> model = TFAutoModelForSemanticSegmentation.from_pretrained(
...     "./pt_model/pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only(`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForInstanceSegmentation[[transformers.AutoModelForInstanceSegmentation]]

#### transformers.AutoModelForInstanceSegmentation[[transformers.AutoModelForInstanceSegmentation]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2119)

This is a generic model class that will be instantiated as one of the model classes of the library (with an instance segmentation head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForInstanceSegmentation.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `MaskFormerConfig` configuration class: `MaskFormerForInstanceSegmentation` (MaskFormer model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an instance segmentation head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForInstanceSegmentation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForInstanceSegmentation.from_config(config)
```
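The note above is worth internalizing: `from_config()` only consumes hyperparameters, so two models built from the same configuration get independent, randomly initialized weights. A minimal sketch with hypothetical toy classes (no transformers dependency):

```python
import random


class ToyConfig:
    """Holds hyperparameters only -- no weights."""

    def __init__(self, hidden_size=4):
        self.hidden_size = hidden_size


class ToyModel:
    def __init__(self, config):
        self.config = config
        # from_config never touches saved weights: initialization is random
        self.weights = [random.random() for _ in range(config.hidden_size)]

    @classmethod
    def from_config(cls, config):
        return cls(config)


m1 = ToyModel.from_config(ToyConfig())
m2 = ToyModel.from_config(ToyConfig())
print(m1.config.hidden_size == m2.config.hidden_size)  # True: same architecture
print(m1.weights == m2.weights)  # False (almost surely): different random weights
```

To get trained weights, use `from_pretrained()` instead.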

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `MaskFormerConfig` configuration class: `MaskFormerForInstanceSegmentation` (MaskFormer model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForInstanceSegmentation.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an instance segmentation head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **maskformer** -- `MaskFormerForInstanceSegmentation` (MaskFormer model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.
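The evaluation-mode default can be illustrated with a toy sketch (hypothetical classes; the real behavior lives in `PreTrainedModel.from_pretrained`, which toggles the `training` flag on every submodule via `model.eval()`):

```python
class ToyDropout:
    """Stand-in for a module whose behavior depends on training mode."""

    def __init__(self):
        self.training = True  # modules start in training mode


class ToyModel:
    def __init__(self):
        self.modules = [ToyDropout(), ToyDropout()]

    def eval(self):
        for m in self.modules:
            m.training = False
        return self

    def train(self):
        for m in self.modules:
            m.training = True
        return self

    @classmethod
    def from_pretrained(cls, name_or_path):
        model = cls()
        # ... weight loading would happen here ...
        return model.eval()  # evaluation mode by default


model = ToyModel.from_pretrained("some-checkpoint")
print(any(m.training for m in model.modules))  # False: dropout deactivated
model.train()  # switch back before fine-tuning
```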

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForInstanceSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForInstanceSegmentation.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForInstanceSegmentation.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForInstanceSegmentation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or URL to a *TensorFlow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only(`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForUniversalSegmentation[[transformers.AutoModelForUniversalSegmentation]]

#### transformers.AutoModelForUniversalSegmentation[[transformers.AutoModelForUniversalSegmentation]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2110)

This is a generic model class that will be instantiated as one of the model classes of the library (with a universal image segmentation head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForUniversalSegmentation.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [DetrConfig](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrConfig) configuration class: [DetrForSegmentation](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrForSegmentation) (DETR model)
  - `EomtConfig` configuration class: `EomtForUniversalSegmentation` (EoMT model)
  - `Mask2FormerConfig` configuration class: `Mask2FormerForUniversalSegmentation` (Mask2Former model)
  - `MaskFormerConfig` configuration class: `MaskFormerForInstanceSegmentation` (MaskFormer model)
  - `OneFormerConfig` configuration class: `OneFormerForUniversalSegmentation` (OneFormer model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a universal image segmentation head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForUniversalSegmentation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForUniversalSegmentation.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [DetrConfig](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrConfig) configuration class: [DetrForSegmentation](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrForSegmentation) (DETR model) - `EomtConfig` configuration class: `EomtForUniversalSegmentation` (EoMT model) - `Mask2FormerConfig` configuration class: `Mask2FormerForUniversalSegmentation` (Mask2Former model) - `MaskFormerConfig` configuration class: `MaskFormerForInstanceSegmentation` (MaskFormer model) - `OneFormerConfig` configuration class: `OneFormerForUniversalSegmentation` (OneFormer model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
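The default selection rule for `attn_implementation` described above (SDPA when available on torch>=2.1.1, otherwise `"eager"`) can be sketched as follows; the function name and the version-tuple check are illustrative, not the library's actual internals:

```python
def pick_attn_implementation(requested=None, torch_version=(2, 2, 0), sdpa_available=True):
    """Sketch of the documented default: explicit choice wins, then SDPA, then eager."""
    if requested is not None:
        return requested  # user-supplied attn_implementation always wins
    if sdpa_available and torch_version >= (2, 1, 1):
        return "sdpa"
    return "eager"


print(pick_attn_implementation())  # sdpa
print(pick_attn_implementation(torch_version=(2, 0, 0)))  # eager
print(pick_attn_implementation(requested="flash_attention_2"))  # flash_attention_2
```

Note that `"flash_attention_2"` is never chosen automatically; it must be requested explicitly and requires the flash-attention package.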
#### from_pretrained[[transformers.AutoModelForUniversalSegmentation.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a universal image segmentation head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **detr** -- [DetrForSegmentation](/docs/transformers/v4.57.1/ja/model_doc/detr#transformers.DetrForSegmentation) (DETR model)
- **eomt** -- `EomtForUniversalSegmentation` (EoMT model)
- **mask2former** -- `Mask2FormerForUniversalSegmentation` (Mask2Former model)
- **maskformer** -- `MaskFormerForInstanceSegmentation` (MaskFormer model)
- **oneformer** -- `OneFormerForUniversalSegmentation` (OneFormer model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForUniversalSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForUniversalSegmentation.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForUniversalSegmentation.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForUniversalSegmentation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or URL to a *TensorFlow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only(`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForZeroShotImageClassification[[transformers.AutoModelForZeroShotImageClassification]]

#### transformers.AutoModelForZeroShotImageClassification[[transformers.AutoModelForZeroShotImageClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2076)

This is a generic model class that will be instantiated as one of the model classes of the library (with a zero-shot image classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForZeroShotImageClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlignConfig](/docs/transformers/v4.57.1/ja/model_doc/align#transformers.AlignConfig) configuration class: [AlignModel](/docs/transformers/v4.57.1/ja/model_doc/align#transformers.AlignModel) (ALIGN model)
  - [AltCLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/altclip#transformers.AltCLIPConfig) configuration class: [AltCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/altclip#transformers.AltCLIPModel) (AltCLIP model)
  - [Blip2Config](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2Config) configuration class: `Blip2ForImageTextRetrieval` (BLIP-2 model)
  - [BlipConfig](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipConfig) configuration class: [BlipModel](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipModel) (BLIP model)
  - [CLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPConfig) configuration class: [CLIPModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPModel) (CLIP model)
  - [CLIPSegConfig](/docs/transformers/v4.57.1/ja/model_doc/clipseg#transformers.CLIPSegConfig) configuration class: [CLIPSegModel](/docs/transformers/v4.57.1/ja/model_doc/clipseg#transformers.CLIPSegModel) (CLIPSeg model)
  - [ChineseCLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPConfig) configuration class: [ChineseCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPModel) (Chinese-CLIP model)
  - `MetaClip2Config` configuration class: `MetaClip2Model` (MetaCLIP 2 model)
  - `Siglip2Config` configuration class: `Siglip2Model` (SigLIP2 model)
  - `SiglipConfig` configuration class: `SiglipModel` (SigLIP model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a zero-shot image classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForZeroShotImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForZeroShotImageClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlignConfig](/docs/transformers/v4.57.1/ja/model_doc/align#transformers.AlignConfig) configuration class: [AlignModel](/docs/transformers/v4.57.1/ja/model_doc/align#transformers.AlignModel) (ALIGN model) - [AltCLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/altclip#transformers.AltCLIPConfig) configuration class: [AltCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/altclip#transformers.AltCLIPModel) (AltCLIP model) - [Blip2Config](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2Config) configuration class: `Blip2ForImageTextRetrieval` (BLIP-2 model) - [BlipConfig](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipConfig) configuration class: [BlipModel](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipModel) (BLIP model) - [CLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPConfig) configuration class: [CLIPModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPModel) (CLIP model) - [CLIPSegConfig](/docs/transformers/v4.57.1/ja/model_doc/clipseg#transformers.CLIPSegConfig) configuration class: [CLIPSegModel](/docs/transformers/v4.57.1/ja/model_doc/clipseg#transformers.CLIPSegModel) (CLIPSeg model) - [ChineseCLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPConfig) configuration class: [ChineseCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPModel) (Chinese-CLIP model) - `MetaClip2Config` configuration class: `MetaClip2Model` (MetaCLIP 2 model) - `Siglip2Config` configuration class: `Siglip2Model` (SigLIP2 model) - `SiglipConfig` configuration class: `SiglipModel` (SigLIP model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
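As a rough illustration of the default backend selection just described, here is a minimal sketch (an assumption and a simplification; the actual selection also checks whether the specific model class supports SDPA and which libraries are installed):

```python
# Illustrative only: picks "sdpa" for torch>=2.1.1 when available,
# otherwise falls back to the manual "eager" implementation.
# Assumes plain "X.Y.Z" version strings (no local suffixes like "+cu118").
def default_attn_implementation(torch_version: str, sdpa_supported: bool) -> str:
    version_tuple = tuple(int(part) for part in torch_version.split("."))
    if sdpa_supported and version_tuple >= (2, 1, 1):
        return "sdpa"
    return "eager"
```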
#### from_pretrained[[transformers.AutoModelForZeroShotImageClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a zero-shot image classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **align** -- [AlignModel](/docs/transformers/v4.57.1/ja/model_doc/align#transformers.AlignModel) (ALIGN model)
- **altclip** -- [AltCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/altclip#transformers.AltCLIPModel) (AltCLIP model)
- **blip** -- [BlipModel](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipModel) (BLIP model)
- **blip-2** -- `Blip2ForImageTextRetrieval` (BLIP-2 model)
- **chinese_clip** -- [ChineseCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/chinese_clip#transformers.ChineseCLIPModel) (Chinese-CLIP model)
- **clip** -- [CLIPModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPModel) (CLIP model)
- **clipseg** -- [CLIPSegModel](/docs/transformers/v4.57.1/ja/model_doc/clipseg#transformers.CLIPSegModel) (CLIPSeg model)
- **metaclip_2** -- `MetaClip2Model` (MetaCLIP 2 model)
- **siglip** -- `SiglipModel` (SigLIP model)
- **siglip2** -- `Siglip2Model` (SigLIP2 model)
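The two-step resolution described above (the config's `model_type` first, then pattern matching on the name or path) can be sketched in plain Python; the mapping and helper below are illustrative only, not the actual transformers implementation:

```python
# Illustrative subset of a model_type -> model class mapping.
MODEL_MAPPING = {
    "clip": "CLIPModel",
    "siglip": "SiglipModel",
    "siglip2": "Siglip2Model",
    "owlvit": "OwlViTForObjectDetection",
}

def resolve_model_class(model_type, pretrained_model_name_or_path):
    # Exact lookup on the config's model_type takes priority.
    if model_type is not None and model_type in MODEL_MAPPING:
        return MODEL_MAPPING[model_type]
    # Fallback: substring matching on the name/path, longest keys first
    # so that e.g. "siglip2" wins over "siglip" for a siglip2 checkpoint.
    for key in sorted(MODEL_MAPPING, key=len, reverse=True):
        if key in pretrained_model_name_or_path:
            return MODEL_MAPPING[key]
    raise ValueError(f"Unrecognized model in {pretrained_model_name_or_path!r}")
```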

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForZeroShotImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32")

>>> # Update configuration during loading
>>> model = AutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/clip_tf_model_config.json")
>>> model = AutoModelForZeroShotImageClassification.from_pretrained(
...     "./tf_model/clip_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
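The kwargs splitting described above, for the case where no explicit `config` is passed, can be sketched as follows (a simplified illustration, not the transformers source; a real config exposes attributes rather than a plain set of names):

```python
# Keys matching configuration attributes update the config;
# the remaining keys are forwarded to the model's __init__.
def split_kwargs(config_attrs, kwargs):
    config_updates = {k: v for k, v in kwargs.items() if k in config_attrs}
    model_kwargs = {k: v for k, v in kwargs.items() if k not in config_attrs}
    return config_updates, model_kwargs

# `output_attentions` is a config attribute, `custom_arg` is not.
cfg_updates, model_kwargs = split_kwargs(
    {"output_attentions", "hidden_size"},
    {"output_attentions": True, "custom_arg": 1},
)
```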

### TFAutoModelForZeroShotImageClassification[[transformers.TFAutoModelForZeroShotImageClassification]]

#### transformers.TFAutoModelForZeroShotImageClassification[[transformers.TFAutoModelForZeroShotImageClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L594)

This is a generic model class that will be instantiated as one of the model classes of the library (with a zero-shot image classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForZeroShotImageClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BlipConfig](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipConfig) configuration class: [TFBlipModel](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.TFBlipModel) (BLIP model)
  - [CLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPConfig) configuration class: [TFCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.TFCLIPModel) (CLIP model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a zero-shot image classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForZeroShotImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("openai/clip-vit-base-patch32")
>>> model = TFAutoModelForZeroShotImageClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BlipConfig](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipConfig) configuration class: [TFBlipModel](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.TFBlipModel) (BLIP model) - [CLIPConfig](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.CLIPConfig) configuration class: [TFCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.TFCLIPModel) (CLIP model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForZeroShotImageClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a zero-shot image classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **blip** -- [TFBlipModel](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.TFBlipModel) (BLIP model)
- **clip** -- [TFCLIPModel](/docs/transformers/v4.57.1/ja/model_doc/clip#transformers.TFCLIPModel) (CLIP model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForZeroShotImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32")

>>> # Update configuration during loading
>>> model = TFAutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/clip_pt_model_config.json")
>>> model = TFAutoModelForZeroShotImageClassification.from_pretrained(
...     "./pt_model/clip_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g, `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model in a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForZeroShotObjectDetection[[transformers.AutoModelForZeroShotObjectDetection]]

#### transformers.AutoModelForZeroShotObjectDetection[[transformers.AutoModelForZeroShotObjectDetection]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2135)

This is a generic model class that will be instantiated as one of the model classes of the library (with a zero-shot object detection head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForZeroShotObjectDetection.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `GroundingDinoConfig` configuration class: `GroundingDinoForObjectDetection` (Grounding DINO model)
  - `MMGroundingDinoConfig` configuration class: `MMGroundingDinoForObjectDetection` (MM Grounding DINO model)
  - `OmDetTurboConfig` configuration class: `OmDetTurboForObjectDetection` (OmDet-Turbo model)
  - `OwlViTConfig` configuration class: `OwlViTForObjectDetection` (OWL-ViT model)
  - `Owlv2Config` configuration class: `Owlv2ForObjectDetection` (OWLv2 model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a zero-shot object detection head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForZeroShotObjectDetection

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/owlvit-base-patch32")
>>> model = AutoModelForZeroShotObjectDetection.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `GroundingDinoConfig` configuration class: `GroundingDinoForObjectDetection` (Grounding DINO model) - `MMGroundingDinoConfig` configuration class: `MMGroundingDinoForObjectDetection` (MM Grounding DINO model) - `OmDetTurboConfig` configuration class: `OmDetTurboForObjectDetection` (OmDet-Turbo model) - `OwlViTConfig` configuration class: `OwlViTForObjectDetection` (OWL-ViT model) - `Owlv2Config` configuration class: `Owlv2ForObjectDetection` (OWLv2 model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForZeroShotObjectDetection.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a zero-shot object detection head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **grounding-dino** -- `GroundingDinoForObjectDetection` (Grounding DINO model)
- **mm-grounding-dino** -- `MMGroundingDinoForObjectDetection` (MM Grounding DINO model)
- **omdet-turbo** -- `OmDetTurboForObjectDetection` (OmDet-Turbo model)
- **owlv2** -- `Owlv2ForObjectDetection` (OWLv2 model)
- **owlvit** -- `OwlViTForObjectDetection` (OWL-ViT model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForZeroShotObjectDetection

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForZeroShotObjectDetection.from_pretrained("google/owlvit-base-patch32")

>>> # Update configuration during loading
>>> model = AutoModelForZeroShotObjectDetection.from_pretrained("google/owlvit-base-patch32", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/owlvit_tf_model_config.json")
>>> model = AutoModelForZeroShotObjectDetection.from_pretrained(
...     "./tf_model/owlvit_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

## Audio

以下の自動クラスは、次の音声タスクに利用可能です。

### AutoModelForAudioClassification[[transformers.AutoModelForAudioClassification]]

#### transformers.AutoModelForAudioClassification[[transformers.AutoModelForAudioClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2183)

This is a generic model class that will be instantiated as one of the model classes of the library (with an audio classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForAudioClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [ASTConfig](/docs/transformers/v4.57.1/ja/model_doc/audio-spectrogram-transformer#transformers.ASTConfig) configuration class: [ASTForAudioClassification](/docs/transformers/v4.57.1/ja/model_doc/audio-spectrogram-transformer#transformers.ASTForAudioClassification) (Audio Spectrogram Transformer model)
  - [Data2VecAudioConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioConfig) configuration class: [Data2VecAudioForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioForSequenceClassification) (Data2VecAudio model)
  - `HubertConfig` configuration class: `HubertForSequenceClassification` (Hubert model)
  - `SEWConfig` configuration class: `SEWForSequenceClassification` (SEW model)
  - `SEWDConfig` configuration class: `SEWDForSequenceClassification` (SEW-D model)
  - `UniSpeechConfig` configuration class: `UniSpeechForSequenceClassification` (UniSpeech model)
  - `UniSpeechSatConfig` configuration class: `UniSpeechSatForSequenceClassification` (UniSpeechSat model)
  - `Wav2Vec2BertConfig` configuration class: `Wav2Vec2BertForSequenceClassification` (Wav2Vec2-BERT model)
  - `Wav2Vec2Config` configuration class: `Wav2Vec2ForSequenceClassification` (Wav2Vec2 model)
  - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerForSequenceClassification` (Wav2Vec2-Conformer model)
  - `WavLMConfig` configuration class: `WavLMForSequenceClassification` (WavLM model)
  - `WhisperConfig` configuration class: `WhisperForAudioClassification` (Whisper model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an audio classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForAudioClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForAudioClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [ASTConfig](/docs/transformers/v4.57.1/ja/model_doc/audio-spectrogram-transformer#transformers.ASTConfig) configuration class: [ASTForAudioClassification](/docs/transformers/v4.57.1/ja/model_doc/audio-spectrogram-transformer#transformers.ASTForAudioClassification) (Audio Spectrogram Transformer model) - [Data2VecAudioConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioConfig) configuration class: [Data2VecAudioForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioForSequenceClassification) (Data2VecAudio model) - `HubertConfig` configuration class: `HubertForSequenceClassification` (Hubert model) - `SEWConfig` configuration class: `SEWForSequenceClassification` (SEW model) - `SEWDConfig` configuration class: `SEWDForSequenceClassification` (SEW-D model) - `UniSpeechConfig` configuration class: `UniSpeechForSequenceClassification` (UniSpeech model) - `UniSpeechSatConfig` configuration class: `UniSpeechSatForSequenceClassification` (UniSpeechSat model) - `Wav2Vec2BertConfig` configuration class: `Wav2Vec2BertForSequenceClassification` (Wav2Vec2-BERT model) - `Wav2Vec2Config` configuration class: `Wav2Vec2ForSequenceClassification` (Wav2Vec2 model) - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerForSequenceClassification` (Wav2Vec2-Conformer model) - `WavLMConfig` configuration class: `WavLMForSequenceClassification` (WavLM model) - `WhisperConfig` configuration class: `WhisperForAudioClassification` (Whisper model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForAudioClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an audio classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **audio-spectrogram-transformer** -- [ASTForAudioClassification](/docs/transformers/v4.57.1/ja/model_doc/audio-spectrogram-transformer#transformers.ASTForAudioClassification) (Audio Spectrogram Transformer model)
- **data2vec-audio** -- [Data2VecAudioForSequenceClassification](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioForSequenceClassification) (Data2VecAudio model)
- **hubert** -- `HubertForSequenceClassification` (Hubert model)
- **sew** -- `SEWForSequenceClassification` (SEW model)
- **sew-d** -- `SEWDForSequenceClassification` (SEW-D model)
- **unispeech** -- `UniSpeechForSequenceClassification` (UniSpeech model)
- **unispeech-sat** -- `UniSpeechSatForSequenceClassification` (UniSpeechSat model)
- **wav2vec2** -- `Wav2Vec2ForSequenceClassification` (Wav2Vec2 model)
- **wav2vec2-bert** -- `Wav2Vec2BertForSequenceClassification` (Wav2Vec2-BERT model)
- **wav2vec2-conformer** -- `Wav2Vec2ConformerForSequenceClassification` (Wav2Vec2-Conformer model)
- **wavlm** -- `WavLMForSequenceClassification` (WavLM model)
- **whisper** -- `WhisperForAudioClassification` (Whisper model)
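The fallback dispatch described above can be sketched in plain Python. This is a simplified illustration, not the library's actual auto-factory code; the `_MODEL_MAPPING` dict and `resolve_model_class` helper are hypothetical names:

```python
from typing import Optional

# Simplified sketch of how an Auto class picks a concrete model class.
# The mapping below is a tiny illustrative subset, not the real registry.
_MODEL_MAPPING = {
    "wav2vec2": "Wav2Vec2ForSequenceClassification",
    "wav2vec2-bert": "Wav2Vec2BertForSequenceClassification",
    "whisper": "WhisperForAudioClassification",
    "hubert": "HubertForSequenceClassification",
}


def resolve_model_class(model_type: Optional[str], name_or_path: str) -> str:
    """Prefer the config's `model_type`; fall back to matching on the name."""
    if model_type is not None:
        return _MODEL_MAPPING[model_type]
    # Fallback: the longest matching pattern wins, so "wav2vec2-bert"
    # takes precedence over the shorter "wav2vec2".
    matches = [key for key in _MODEL_MAPPING if key in name_or_path]
    if not matches:
        raise ValueError(f"Could not infer model type from {name_or_path!r}")
    return _MODEL_MAPPING[max(matches, key=len)]
```

In the real library the mapping values are (lazily loaded) classes and `model_type` comes from the loaded config, but the precedence is the same: the config's `model_type` first, name-based pattern matching only as a fallback.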

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForAudioClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForAudioClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForAudioClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
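The two-stage handling of `kwargs` described above (keys matching configuration attributes are consumed first, the remainder is forwarded to the model's `__init__`) can be illustrated with a small stand-alone sketch; `SimpleConfig` and `split_kwargs` are hypothetical names for illustration only, not library API:

```python
# Illustrative sketch of how loading kwargs are split when no `config` is
# passed: keys that match configuration attributes update the config, the
# rest are forwarded to the model's __init__. Names here are hypothetical.
class SimpleConfig:
    def __init__(self):
        self.output_attentions = False
        self.num_labels = 2


def split_kwargs(config, kwargs):
    """Apply config-attribute overrides; return the leftover model kwargs."""
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)   # overrides a config attribute
        else:
            model_kwargs[key] = value     # passed on to the model __init__
    return model_kwargs


config = SimpleConfig()
leftover = split_kwargs(config, {"output_attentions": True, "custom_flag": 1})
```

After this runs, `config.output_attentions` is `True` and only `{"custom_flag": 1}` remains for the model constructor, mirroring the behavior documented for `from_pretrained()`.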

### TFAutoModelForAudioClassification[[transformers.TFAutoModelForAudioClassification]]

#### transformers.TFAutoModelForAudioClassification[[transformers.TFAutoModelForAudioClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L545)

This is a generic model class that will be instantiated as one of the model classes of the library (with an audio classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForAudioClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `Wav2Vec2Config` configuration class: `TFWav2Vec2ForSequenceClassification` (Wav2Vec2 model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an audio classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForAudioClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForAudioClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `Wav2Vec2Config` configuration class: `TFWav2Vec2ForSequenceClassification` (Wav2Vec2 model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForAudioClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an audio classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **wav2vec2** -- `TFWav2Vec2ForSequenceClassification` (Wav2Vec2 model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForAudioClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForAudioClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForAudioClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForAudioClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForAudioFrameClassification[[transformers.AutoModelForAudioFrameClassification]]

#### transformers.AutoModelForAudioFrameClassification[[transformers.AutoModelForAudioFrameClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2206)

This is a generic model class that will be instantiated as one of the model classes of the library (with an audio frame (token) classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForAudioFrameClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [Data2VecAudioConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioConfig) configuration class: [Data2VecAudioForAudioFrameClassification](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioForAudioFrameClassification) (Data2VecAudio model)
  - `UniSpeechSatConfig` configuration class: `UniSpeechSatForAudioFrameClassification` (UniSpeechSat model)
  - `Wav2Vec2BertConfig` configuration class: `Wav2Vec2BertForAudioFrameClassification` (Wav2Vec2-BERT model)
  - `Wav2Vec2Config` configuration class: `Wav2Vec2ForAudioFrameClassification` (Wav2Vec2 model)
  - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerForAudioFrameClassification` (Wav2Vec2-Conformer model)
  - `WavLMConfig` configuration class: `WavLMForAudioFrameClassification` (WavLM model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an audio frame (token) classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForAudioFrameClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForAudioFrameClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [Data2VecAudioConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioConfig) configuration class: [Data2VecAudioForAudioFrameClassification](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioForAudioFrameClassification) (Data2VecAudio model) - `UniSpeechSatConfig` configuration class: `UniSpeechSatForAudioFrameClassification` (UniSpeechSat model) - `Wav2Vec2BertConfig` configuration class: `Wav2Vec2BertForAudioFrameClassification` (Wav2Vec2-BERT model) - `Wav2Vec2Config` configuration class: `Wav2Vec2ForAudioFrameClassification` (Wav2Vec2 model) - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerForAudioFrameClassification` (Wav2Vec2-Conformer model) - `WavLMConfig` configuration class: `WavLMForAudioFrameClassification` (WavLM model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForAudioFrameClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an audio frame (token) classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **data2vec-audio** -- [Data2VecAudioForAudioFrameClassification](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioForAudioFrameClassification) (Data2VecAudio model)
- **unispeech-sat** -- `UniSpeechSatForAudioFrameClassification` (UniSpeechSat model)
- **wav2vec2** -- `Wav2Vec2ForAudioFrameClassification` (Wav2Vec2 model)
- **wav2vec2-bert** -- `Wav2Vec2BertForAudioFrameClassification` (Wav2Vec2-BERT model)
- **wav2vec2-conformer** -- `Wav2Vec2ConformerForAudioFrameClassification` (Wav2Vec2-Conformer model)
- **wavlm** -- `WavLMForAudioFrameClassification` (WavLM model)
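Unlike the sequence-level classes earlier in this section, a frame (token) classification head emits one prediction per audio frame, which is useful for tasks such as speaker diarization or voice-activity detection. A plain-Python sketch of turning per-frame scores into labels; the scores and label names below are made up for illustration:

```python
# Per-frame classification: one label per time frame. The label set and
# score values below are illustrative, not model output.
LABELS = ["silence", "speech"]


def frames_to_labels(frame_scores):
    """Map each frame's class scores to the highest-scoring label."""
    return [LABELS[max(range(len(s)), key=s.__getitem__)] for s in frame_scores]


scores = [
    [0.9, 0.1],  # frame 0: mostly silence
    [0.2, 0.8],  # frame 1: speech
    [0.4, 0.6],  # frame 2: speech
]
labels = frames_to_labels(scores)
```

A real model returns logits of shape `(batch, num_frames, num_labels)`; the per-frame argmax step is the same idea as this sketch.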

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForAudioFrameClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioFrameClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForAudioFrameClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForAudioFrameClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
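As a minimal sketch of the kwargs-override behavior described above, the snippet below saves a small config locally and reloads it with one attribute overridden (the tiny `BertConfig` values are arbitrary, chosen only for illustration):

```python
import tempfile

from transformers import AutoConfig, BertConfig

with tempfile.TemporaryDirectory() as tmp:
    # Save a small config to a local directory (writes config.json).
    BertConfig(hidden_size=128, num_hidden_layers=2).save_pretrained(tmp)

    # `hidden_size` matches a configuration attribute, so it overrides the
    # saved value; attributes not mentioned keep the values from config.json.
    config = AutoConfig.from_pretrained(tmp, hidden_size=256)
    assert config.hidden_size == 256
    assert config.num_hidden_layers == 2
```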

### AutoModelForCTC[[transformers.AutoModelForCTC]]

#### transformers.AutoModelForCTC[[transformers.AutoModelForCTC]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2190)

This is a generic model class that will be instantiated as one of the model classes of the library (with a connectionist temporal classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForCTC.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [Data2VecAudioConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioConfig) configuration class: [Data2VecAudioForCTC](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioForCTC) (Data2VecAudio model)
  - `HubertConfig` configuration class: `HubertForCTC` (Hubert model)
  - `MCTCTConfig` configuration class: `MCTCTForCTC` (M-CTC-T model)
  - `ParakeetCTCConfig` configuration class: `ParakeetForCTC` (Parakeet model)
  - `SEWConfig` configuration class: `SEWForCTC` (SEW model)
  - `SEWDConfig` configuration class: `SEWDForCTC` (SEW-D model)
  - `UniSpeechConfig` configuration class: `UniSpeechForCTC` (UniSpeech model)
  - `UniSpeechSatConfig` configuration class: `UniSpeechSatForCTC` (UniSpeechSat model)
  - `Wav2Vec2BertConfig` configuration class: `Wav2Vec2BertForCTC` (Wav2Vec2-BERT model)
  - `Wav2Vec2Config` configuration class: `Wav2Vec2ForCTC` (Wav2Vec2 model)
  - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerForCTC` (Wav2Vec2-Conformer model)
  - `WavLMConfig` configuration class: `WavLMForCTC` (WavLM model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a connectionist temporal classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForCTC

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("facebook/wav2vec2-base-960h")
>>> model = AutoModelForCTC.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [Data2VecAudioConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioConfig) configuration class: [Data2VecAudioForCTC](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioForCTC) (Data2VecAudio model) - `HubertConfig` configuration class: `HubertForCTC` (Hubert model) - `MCTCTConfig` configuration class: `MCTCTForCTC` (M-CTC-T model) - `ParakeetCTCConfig` configuration class: `ParakeetForCTC` (Parakeet model) - `SEWConfig` configuration class: `SEWForCTC` (SEW model) - `SEWDConfig` configuration class: `SEWDForCTC` (SEW-D model) - `UniSpeechConfig` configuration class: `UniSpeechForCTC` (UniSpeech model) - `UniSpeechSatConfig` configuration class: `UniSpeechSatForCTC` (UniSpeechSat model) - `Wav2Vec2BertConfig` configuration class: `Wav2Vec2BertForCTC` (Wav2Vec2-BERT model) - `Wav2Vec2Config` configuration class: `Wav2Vec2ForCTC` (Wav2Vec2 model) - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerForCTC` (Wav2Vec2-Conformer model) - `WavLMConfig` configuration class: `WavLMForCTC` (WavLM model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForCTC.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a connectionist temporal classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **data2vec-audio** -- [Data2VecAudioForCTC](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioForCTC) (Data2VecAudio model)
- **hubert** -- `HubertForCTC` (Hubert model)
- **mctct** -- `MCTCTForCTC` (M-CTC-T model)
- **parakeet_ctc** -- `ParakeetForCTC` (Parakeet model)
- **sew** -- `SEWForCTC` (SEW model)
- **sew-d** -- `SEWDForCTC` (SEW-D model)
- **unispeech** -- `UniSpeechForCTC` (UniSpeech model)
- **unispeech-sat** -- `UniSpeechSatForCTC` (UniSpeechSat model)
- **wav2vec2** -- `Wav2Vec2ForCTC` (Wav2Vec2 model)
- **wav2vec2-bert** -- `Wav2Vec2BertForCTC` (Wav2Vec2-BERT model)
- **wav2vec2-conformer** -- `Wav2Vec2ConformerForCTC` (Wav2Vec2-Conformer model)
- **wavlm** -- `WavLMForCTC` (WavLM model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForCTC

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForCTC.from_pretrained("facebook/wav2vec2-base-960h")

>>> # Update configuration during loading
>>> model = AutoModelForCTC.from_pretrained("facebook/wav2vec2-base-960h", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForCTC.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```
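Both the class selection and the eval-mode default can be sketched without a download by saving a tiny, randomly initialized checkpoint locally (the `Wav2Vec2Config` values below are arbitrary):

```python
import tempfile

from transformers import AutoModelForCTC, Wav2Vec2Config, Wav2Vec2ForCTC

# Tiny random checkpoint; its model_type ("wav2vec2") drives class selection.
tiny = Wav2Vec2Config(
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    vocab_size=32,
)
with tempfile.TemporaryDirectory() as tmp:
    Wav2Vec2ForCTC(tiny).save_pretrained(tmp)
    model = AutoModelForCTC.from_pretrained(tmp)
    assert isinstance(model, Wav2Vec2ForCTC)
    assert not model.training  # from_pretrained() calls model.eval()
    model.train()              # switch back before fine-tuning
    assert model.training
```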

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForSpeechSeq2Seq[[transformers.AutoModelForSpeechSeq2Seq]]

#### transformers.AutoModelForSpeechSeq2Seq[[transformers.AutoModelForSpeechSeq2Seq]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2197)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForSpeechSeq2Seq.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `DiaConfig` configuration class: `DiaForConditionalGeneration` (Dia model)
  - `GraniteSpeechConfig` configuration class: `GraniteSpeechForConditionalGeneration` (GraniteSpeech model)
  - `KyutaiSpeechToTextConfig` configuration class: `KyutaiSpeechToTextForConditionalGeneration` (KyutaiSpeechToText model)
  - `MoonshineConfig` configuration class: `MoonshineForConditionalGeneration` (Moonshine model)
  - `Pop2PianoConfig` configuration class: `Pop2PianoForConditionalGeneration` (Pop2Piano model)
  - `SeamlessM4TConfig` configuration class: `SeamlessM4TForSpeechToText` (SeamlessM4T model)
  - `SeamlessM4Tv2Config` configuration class: `SeamlessM4Tv2ForSpeechToText` (SeamlessM4Tv2 model)
  - `Speech2TextConfig` configuration class: `Speech2TextForConditionalGeneration` (Speech2Text model)
  - `SpeechEncoderDecoderConfig` configuration class: `SpeechEncoderDecoderModel` (Speech Encoder decoder model)
  - `SpeechT5Config` configuration class: `SpeechT5ForSpeechToText` (SpeechT5 model)
  - `WhisperConfig` configuration class: `WhisperForConditionalGeneration` (Whisper model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSpeechSeq2Seq

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("openai/whisper-tiny")
>>> model = AutoModelForSpeechSeq2Seq.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `DiaConfig` configuration class: `DiaForConditionalGeneration` (Dia model) - `GraniteSpeechConfig` configuration class: `GraniteSpeechForConditionalGeneration` (GraniteSpeech model) - `KyutaiSpeechToTextConfig` configuration class: `KyutaiSpeechToTextForConditionalGeneration` (KyutaiSpeechToText model) - `MoonshineConfig` configuration class: `MoonshineForConditionalGeneration` (Moonshine model) - `Pop2PianoConfig` configuration class: `Pop2PianoForConditionalGeneration` (Pop2Piano model) - `SeamlessM4TConfig` configuration class: `SeamlessM4TForSpeechToText` (SeamlessM4T model) - `SeamlessM4Tv2Config` configuration class: `SeamlessM4Tv2ForSpeechToText` (SeamlessM4Tv2 model) - `Speech2TextConfig` configuration class: `Speech2TextForConditionalGeneration` (Speech2Text model) - `SpeechEncoderDecoderConfig` configuration class: `SpeechEncoderDecoderModel` (Speech Encoder decoder model) - `SpeechT5Config` configuration class: `SpeechT5ForSpeechToText` (SpeechT5 model) - `WhisperConfig` configuration class: `WhisperForConditionalGeneration` (Whisper model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForSpeechSeq2Seq.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **dia** -- `DiaForConditionalGeneration` (Dia model)
- **granite_speech** -- `GraniteSpeechForConditionalGeneration` (GraniteSpeech model)
- **kyutai_speech_to_text** -- `KyutaiSpeechToTextForConditionalGeneration` (KyutaiSpeechToText model)
- **moonshine** -- `MoonshineForConditionalGeneration` (Moonshine model)
- **pop2piano** -- `Pop2PianoForConditionalGeneration` (Pop2Piano model)
- **seamless_m4t** -- `SeamlessM4TForSpeechToText` (SeamlessM4T model)
- **seamless_m4t_v2** -- `SeamlessM4Tv2ForSpeechToText` (SeamlessM4Tv2 model)
- **speech-encoder-decoder** -- `SpeechEncoderDecoderModel` (Speech Encoder decoder model)
- **speech_to_text** -- `Speech2TextForConditionalGeneration` (Speech2Text model)
- **speecht5** -- `SpeechT5ForSpeechToText` (SpeechT5 model)
- **whisper** -- `WhisperForConditionalGeneration` (Whisper model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSpeechSeq2Seq

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny")

>>> # Update configuration during loading
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### TFAutoModelForSpeechSeq2Seq[[transformers.TFAutoModelForSpeechSeq2Seq]]

#### transformers.TFAutoModelForSpeechSeq2Seq[[transformers.TFAutoModelForSpeechSeq2Seq]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L700)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForSpeechSeq2Seq.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `Speech2TextConfig` configuration class: `TFSpeech2TextForConditionalGeneration` (Speech2Text model)
  - `WhisperConfig` configuration class: `TFWhisperForConditionalGeneration` (Whisper model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForSpeechSeq2Seq

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("openai/whisper-tiny")
>>> model = TFAutoModelForSpeechSeq2Seq.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `Speech2TextConfig` configuration class: `TFSpeech2TextForConditionalGeneration` (Speech2Text model) - `WhisperConfig` configuration class: `TFWhisperForConditionalGeneration` (Whisper model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForSpeechSeq2Seq.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **speech_to_text** -- `TFSpeech2TextForConditionalGeneration` (Speech2Text model)
- **whisper** -- `TFWhisperForConditionalGeneration` (Whisper model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForSpeechSeq2Seq

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny")

>>> # Update configuration during loading
>>> model = TFAutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForSpeechSeq2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### FlaxAutoModelForSpeechSeq2Seq[[transformers.FlaxAutoModelForSpeechSeq2Seq]]

#### transformers.FlaxAutoModelForSpeechSeq2Seq[[transformers.FlaxAutoModelForSpeechSeq2Seq]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L377)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForSpeechSeq2Seq.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `SpeechEncoderDecoderConfig` configuration class: `FlaxSpeechEncoderDecoderModel` (Speech Encoder decoder model)
  - `WhisperConfig` configuration class: `FlaxWhisperForConditionalGeneration` (Whisper model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForSpeechSeq2Seq

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("openai/whisper-tiny")
>>> model = FlaxAutoModelForSpeechSeq2Seq.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `SpeechEncoderDecoderConfig` configuration class: `FlaxSpeechEncoderDecoderModel` (Speech Encoder decoder model) - `WhisperConfig` configuration class: `FlaxWhisperForConditionalGeneration` (Whisper model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.FlaxAutoModelForSpeechSeq2Seq.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **speech-encoder-decoder** -- `FlaxSpeechEncoderDecoderModel` (Speech Encoder decoder model)
- **whisper** -- `FlaxWhisperForConditionalGeneration` (Whisper model)

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForSpeechSeq2Seq

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForSpeechSeq2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model into a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForAudioXVector[[transformers.AutoModelForAudioXVector]]

#### transformers.AutoModelForAudioXVector[[transformers.AutoModelForAudioXVector]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2215)

This is a generic model class that will be instantiated as one of the model classes of the library (with an audio retrieval via x-vector head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForAudioXVector.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [Data2VecAudioConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioConfig) configuration class: [Data2VecAudioForXVector](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioForXVector) (Data2VecAudio model)
  - `UniSpeechSatConfig` configuration class: `UniSpeechSatForXVector` (UniSpeechSat model)
  - `Wav2Vec2BertConfig` configuration class: `Wav2Vec2BertForXVector` (Wav2Vec2-BERT model)
  - `Wav2Vec2Config` configuration class: `Wav2Vec2ForXVector` (Wav2Vec2 model)
  - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerForXVector` (Wav2Vec2-Conformer model)
  - `WavLMConfig` configuration class: `WavLMForXVector` (WavLM model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an audio retrieval via x-vector head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForAudioXVector

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("microsoft/wavlm-base-plus-sv")
>>> model = AutoModelForAudioXVector.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [Data2VecAudioConfig](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioConfig) configuration class: [Data2VecAudioForXVector](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioForXVector) (Data2VecAudio model) - `UniSpeechSatConfig` configuration class: `UniSpeechSatForXVector` (UniSpeechSat model) - `Wav2Vec2BertConfig` configuration class: `Wav2Vec2BertForXVector` (Wav2Vec2-BERT model) - `Wav2Vec2Config` configuration class: `Wav2Vec2ForXVector` (Wav2Vec2 model) - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerForXVector` (Wav2Vec2-Conformer model) - `WavLMConfig` configuration class: `WavLMForXVector` (WavLM model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForAudioXVector.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an audio retrieval via x-vector head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **data2vec-audio** -- [Data2VecAudioForXVector](/docs/transformers/v4.57.1/ja/model_doc/data2vec#transformers.Data2VecAudioForXVector) (Data2VecAudio model)
- **unispeech-sat** -- `UniSpeechSatForXVector` (UniSpeechSat model)
- **wav2vec2** -- `Wav2Vec2ForXVector` (Wav2Vec2 model)
- **wav2vec2-bert** -- `Wav2Vec2BertForXVector` (Wav2Vec2-BERT model)
- **wav2vec2-conformer** -- `Wav2Vec2ConformerForXVector` (Wav2Vec2-Conformer model)
- **wavlm** -- `WavLMForXVector` (WavLM model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForAudioXVector

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioXVector.from_pretrained("microsoft/wavlm-base-plus-sv")

>>> # Update configuration during loading
>>> model = AutoModelForAudioXVector.from_pretrained("microsoft/wavlm-base-plus-sv", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForAudioXVector.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForTextToSpectrogram[[transformers.AutoModelForTextToSpectrogram]]

#### transformers.AutoModelForTextToSpectrogram[[transformers.AutoModelForTextToSpectrogram]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2219)

### AutoModelForTextToWaveform[[transformers.AutoModelForTextToWaveform]]

#### transformers.AutoModelForTextToWaveform[[transformers.AutoModelForTextToWaveform]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2223)

## Multimodal

The following auto classes are available for the multimodal tasks below.

### AutoModelForTableQuestionAnswering[[transformers.AutoModelForTableQuestionAnswering]]

#### transformers.AutoModelForTableQuestionAnswering[[transformers.AutoModelForTableQuestionAnswering]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2013)

This is a generic model class that will be instantiated as one of the model classes of the library (with a table question answering head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForTableQuestionAnswering.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `TapasConfig` configuration class: `TapasForQuestionAnswering` (TAPAS model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a table question answering head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTableQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/tapas-base-finetuned-wtq")
>>> model = AutoModelForTableQuestionAnswering.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `TapasConfig` configuration class: `TapasForQuestionAnswering` (TAPAS model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForTableQuestionAnswering.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a table question answering head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **tapas** -- `TapasForQuestionAnswering` (TAPAS model)
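The selection logic above (the config's `model_type` first, then pattern matching on the name or path) can be sketched in plain Python. This is a simplified illustration, not the actual transformers implementation; the mapping and helper names are made up:

```python
# Simplified sketch of auto-class resolution (not the real transformers code).
# The config's model_type is tried first; if it is missing, the loader falls
# back to substring matching on the pretrained name or path.
MODEL_MAPPING = {"tapas": "TapasForQuestionAnswering"}

def resolve_model_class(model_type=None, name_or_path=""):
    if model_type in MODEL_MAPPING:
        return MODEL_MAPPING[model_type]
    for pattern, cls in MODEL_MAPPING.items():  # fallback: pattern matching
        if pattern in name_or_path.lower():
            return cls
    raise ValueError(f"Unrecognized model in {name_or_path!r}")

print(resolve_model_class(model_type="tapas"))
# -> TapasForQuestionAnswering
print(resolve_model_class(name_or_path="google/tapas-base-finetuned-wtq"))
# -> TapasForQuestionAnswering
```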

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTableQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq")

>>> # Update configuration during loading
>>> model = AutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/tapas_tf_model_config.json")
>>> model = AutoModelForTableQuestionAnswering.from_pretrained(
...     "./tf_model/tapas_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
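The second code path (no explicit `config` passed) can be illustrated with a toy sketch. The class and helper below are hypothetical, not the transformers internals: keys that match configuration attributes override the config, and leftover keys are forwarded to the model's `__init__`.

```python
# Toy illustration (not the real transformers logic) of how kwargs are split
# when no explicit config is provided: keys matching configuration attributes
# override the config; remaining keys go to the model's __init__.
class DummyConfig:
    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768

def split_kwargs(config, **kwargs):
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # config attribute: override it
        else:
            model_kwargs[key] = value    # unknown key: pass to __init__
    return config, model_kwargs

cfg, extra = split_kwargs(DummyConfig(), output_attentions=True, custom_flag=1)
print(cfg.output_attentions, extra)
# -> True {'custom_flag': 1}
```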

### TFAutoModelForTableQuestionAnswering[[transformers.TFAutoModelForTableQuestionAnswering]]

#### transformers.TFAutoModelForTableQuestionAnswering[[transformers.TFAutoModelForTableQuestionAnswering]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L664)

This is a generic model class that will be instantiated as one of the model classes of the library (with a table question answering head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForTableQuestionAnswering.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

Instantiates one of the model classes of the library (with a table question answering head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForTableQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/tapas-base-finetuned-wtq")
>>> model = TFAutoModelForTableQuestionAnswering.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `TapasConfig` configuration class: `TFTapasForQuestionAnswering` (TAPAS model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForTableQuestionAnswering.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a table question answering head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **tapas** -- `TFTapasForQuestionAnswering` (TAPAS model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForTableQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq")

>>> # Update configuration during loading
>>> model = TFAutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/tapas_pt_model_config.json")
>>> model = TFAutoModelForTableQuestionAnswering.from_pretrained(
...     "./pt_model/tapas_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForDocumentQuestionAnswering[[transformers.AutoModelForDocumentQuestionAnswering]]

#### transformers.AutoModelForDocumentQuestionAnswering[[transformers.AutoModelForDocumentQuestionAnswering]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2035)

This is a generic model class that will be instantiated as one of the model classes of the library (with a document question answering head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForDocumentQuestionAnswering.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

Instantiates one of the model classes of the library (with a document question answering head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForDocumentQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")
>>> model = AutoModelForDocumentQuestionAnswering.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `LayoutLMConfig` configuration class: `LayoutLMForQuestionAnswering` (LayoutLM model) - `LayoutLMv2Config` configuration class: `LayoutLMv2ForQuestionAnswering` (LayoutLMv2 model) - `LayoutLMv3Config` configuration class: `LayoutLMv3ForQuestionAnswering` (LayoutLMv3 model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForDocumentQuestionAnswering.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a document question answering head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **layoutlm** -- `LayoutLMForQuestionAnswering` (LayoutLM model)
- **layoutlmv2** -- `LayoutLMv2ForQuestionAnswering` (LayoutLMv2 model)
- **layoutlmv3** -- `LayoutLMv3ForQuestionAnswering` (LayoutLMv3 model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForDocumentQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")

>>> # Update configuration during loading
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/layoutlm_tf_model_config.json")
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained(
...     "./tf_model/layoutlm_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### TFAutoModelForDocumentQuestionAnswering[[transformers.TFAutoModelForDocumentQuestionAnswering]]

#### transformers.TFAutoModelForDocumentQuestionAnswering[[transformers.TFAutoModelForDocumentQuestionAnswering]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L653)

This is a generic model class that will be instantiated as one of the model classes of the library (with a document question answering head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForDocumentQuestionAnswering.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

Instantiates one of the model classes of the library (with a document question answering head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForDocumentQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")
>>> model = TFAutoModelForDocumentQuestionAnswering.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `LayoutLMConfig` configuration class: `TFLayoutLMForQuestionAnswering` (LayoutLM model) - `LayoutLMv3Config` configuration class: `TFLayoutLMv3ForQuestionAnswering` (LayoutLMv3 model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForDocumentQuestionAnswering.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a document question answering head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **layoutlm** -- `TFLayoutLMForQuestionAnswering` (LayoutLM model)
- **layoutlmv3** -- `TFLayoutLMv3ForQuestionAnswering` (LayoutLMv3 model)
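This resolution order can be illustrated with a minimal sketch (a hypothetical helper, not the actual transformers internals): the explicit `model_type` from the config wins, and substring matching on the name/path is only a fallback. Longer keys are checked first so `layoutlmv3` is not shadowed by its prefix `layoutlm`.

```python
# Hypothetical mapping mirroring the list above; the real registry lives
# inside transformers' auto classes.
MODEL_MAPPING = {
    "layoutlm": "TFLayoutLMForQuestionAnswering",
    "layoutlmv3": "TFLayoutLMv3ForQuestionAnswering",
}


def resolve_model_class(name_or_path, model_type=None):
    # 1) Prefer the explicit `model_type` property of the config object.
    if model_type in MODEL_MAPPING:
        return MODEL_MAPPING[model_type]
    # 2) Fall back to pattern matching on `pretrained_model_name_or_path`;
    #    try longer keys first so "layoutlmv3" beats its prefix "layoutlm".
    for key in sorted(MODEL_MAPPING, key=len, reverse=True):
        if key in name_or_path:
            return MODEL_MAPPING[key]
    raise ValueError(f"Could not infer model type from {name_or_path!r}")
```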

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForDocumentQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")

>>> # Update configuration during loading
>>> model = TFAutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/layoutlm_pt_model_config.json")
>>> model = TFAutoModelForDocumentQuestionAnswering.from_pretrained(
...     "./pt_model/layoutlm_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
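The two-branch `kwargs` behavior in the auto-loaded case can be sketched with a small hypothetical helper (not the actual transformers code): keys naming configuration attributes become config overrides, and the remainder is forwarded to the underlying model's `__init__`.

```python
def split_kwargs(config_attrs, kwargs):
    """Split kwargs into config overrides and model __init__ arguments.

    `config_attrs` is the set of attribute names the configuration class
    knows about; anything else is assumed to belong to the model.
    """
    config_updates, model_kwargs = {}, {}
    for key, value in kwargs.items():
        target = config_updates if key in config_attrs else model_kwargs
        target[key] = value
    return config_updates, model_kwargs
```

For example, with `output_attentions=True` in the kwargs, `output_attentions` is a configuration attribute and so overrides the config, while loading-only arguments would be left for the model/loader.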

### AutoModelForVisualQuestionAnswering[[transformers.AutoModelForVisualQuestionAnswering]]

#### transformers.AutoModelForVisualQuestionAnswering[[transformers.AutoModelForVisualQuestionAnswering]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2024)

This is a generic model class that will be instantiated as one of the model classes of the library (with a visual question answering head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForVisualQuestionAnswering.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

Instantiates one of the model classes of the library (with a visual question answering head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForVisualQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("dandelin/vilt-b32-finetuned-vqa")
>>> model = AutoModelForVisualQuestionAnswering.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [Blip2Config](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2Config) configuration class: [Blip2ForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2ForConditionalGeneration) (BLIP-2 model) - [BlipConfig](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipConfig) configuration class: [BlipForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipForQuestionAnswering) (BLIP model) - `ViltConfig` configuration class: `ViltForQuestionAnswering` (ViLT model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForVisualQuestionAnswering.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a visual question answering head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **blip** -- [BlipForQuestionAnswering](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipForQuestionAnswering) (BLIP model)
- **blip-2** -- [Blip2ForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2ForConditionalGeneration) (BLIP-2 model)
- **vilt** -- `ViltForQuestionAnswering` (ViLT model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForVisualQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa")

>>> # Update configuration during loading
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/vilt_tf_model_config.json")
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained(
...     "./tf_model/vilt_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForVision2Seq[[transformers.AutoModelForVision2Seq]]

#### transformers.AutoModelForVision2Seq[[transformers.AutoModelForVision2Seq]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2272)

This is a generic model class that will be instantiated as one of the model classes of the library (with a vision-to-text modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

### TFAutoModelForVision2Seq[[transformers.TFAutoModelForVision2Seq]]

#### transformers.TFAutoModelForVision2Seq[[transformers.TFAutoModelForVision2Seq]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L612)

This is a generic model class that will be instantiated as one of the model classes of the library (with a vision-to-text modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForVision2Seq.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForVision2Seq

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForVision2Seq.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BlipConfig](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipConfig) configuration class: [TFBlipForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.TFBlipForConditionalGeneration) (BLIP model) - `VisionEncoderDecoderConfig` configuration class: `TFVisionEncoderDecoderModel` (Vision Encoder decoder model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForVision2Seq.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a vision-to-text modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **blip** -- [TFBlipForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.TFBlipForConditionalGeneration) (BLIP model)
- **vision-encoder-decoder** -- `TFVisionEncoderDecoderModel` (Vision Encoder decoder model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForVision2Seq

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForVision2Seq.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForVision2Seq.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForVision2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### FlaxAutoModelForVision2Seq[[transformers.FlaxAutoModelForVision2Seq]]

#### transformers.FlaxAutoModelForVision2Seq[[transformers.FlaxAutoModelForVision2Seq]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L370)

This is a generic model class that will be instantiated as one of the model classes of the library (with a vision-to-text modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForVision2Seq.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForVision2Seq

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForVision2Seq.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `VisionEncoderDecoderConfig` configuration class: `FlaxVisionEncoderDecoderModel` (Vision Encoder decoder model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.FlaxAutoModelForVision2Seq.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a vision-to-text modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **vision-encoder-decoder** -- `FlaxVisionEncoderDecoderModel` (Vision Encoder decoder model)

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForVision2Seq

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForVision2Seq.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForVision2Seq.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForVision2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
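
The two-branch `kwargs` behavior described above can be sketched in a few lines. This is only an illustration of the splitting rule for the case where no `config` is provided, not the library's actual code; `split_kwargs` and the attribute set are hypothetical:

```python
# Hypothetical sketch (not the actual transformers implementation) of the
# kwargs behavior described above: when no `config` is provided, keys that
# match configuration attributes update the config, and the remaining keys
# are forwarded to the model's __init__.
def split_kwargs(config_attrs, kwargs):
    """Split kwargs into config updates and model-init kwargs."""
    config_updates = {k: v for k, v in kwargs.items() if k in config_attrs}
    model_kwargs = {k: v for k, v in kwargs.items() if k not in config_attrs}
    return config_updates, model_kwargs

# Pretend these are the attributes of an automatically loaded config.
config_attrs = {"output_attentions", "hidden_size", "num_attention_heads"}
config_updates, model_kwargs = split_kwargs(
    config_attrs, {"output_attentions": True, "custom_flag": 1}
)
print(config_updates)  # {'output_attentions': True}
print(model_kwargs)    # {'custom_flag': 1}
```

When a `config` object is passed explicitly, no such split happens: all `kwargs` go straight to the model's `__init__`.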

### AutoModelForImageTextToText[[transformers.AutoModelForImageTextToText]]

#### transformers.AutoModelForImageTextToText[[transformers.AutoModelForImageTextToText]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2166)

This is a generic model class that will be instantiated as one of the model classes of the library (with an image-text-to-text modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForImageTextToText.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `AriaConfig` configuration class: `AriaForConditionalGeneration` (Aria model)
  - `AyaVisionConfig` configuration class: `AyaVisionForConditionalGeneration` (AyaVision model)
  - [Blip2Config](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2Config) configuration class: [Blip2ForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2ForConditionalGeneration) (BLIP-2 model)
  - [BlipConfig](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipConfig) configuration class: [BlipForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipForConditionalGeneration) (BLIP model)
  - `ChameleonConfig` configuration class: `ChameleonForConditionalGeneration` (Chameleon model)
  - `Cohere2VisionConfig` configuration class: `Cohere2VisionForConditionalGeneration` (Cohere2Vision model)
  - `DeepseekVLConfig` configuration class: `DeepseekVLForConditionalGeneration` (DeepseekVL model)
  - `DeepseekVLHybridConfig` configuration class: `DeepseekVLHybridForConditionalGeneration` (DeepseekVLHybrid model)
  - `Emu3Config` configuration class: `Emu3ForConditionalGeneration` (Emu3 model)
  - `EvollaConfig` configuration class: `EvollaForProteinText2Text` (Evolla model)
  - `Florence2Config` configuration class: `Florence2ForConditionalGeneration` (Florence2 model)
  - `FuyuConfig` configuration class: `FuyuForCausalLM` (Fuyu model)
  - `Gemma3Config` configuration class: `Gemma3ForConditionalGeneration` (Gemma3ForConditionalGeneration model)
  - `Gemma3nConfig` configuration class: `Gemma3nForConditionalGeneration` (Gemma3nForConditionalGeneration model)
  - `GitConfig` configuration class: `GitForCausalLM` (GIT model)
  - `Glm4vConfig` configuration class: `Glm4vForConditionalGeneration` (GLM4V model)
  - `Glm4vMoeConfig` configuration class: `Glm4vMoeForConditionalGeneration` (GLM4VMOE model)
  - `GotOcr2Config` configuration class: `GotOcr2ForConditionalGeneration` (GOT-OCR2 model)
  - `Idefics2Config` configuration class: `Idefics2ForConditionalGeneration` (Idefics2 model)
  - `Idefics3Config` configuration class: `Idefics3ForConditionalGeneration` (Idefics3 model)
  - `IdeficsConfig` configuration class: `IdeficsForVisionText2Text` (IDEFICS model)
  - `InstructBlipConfig` configuration class: `InstructBlipForConditionalGeneration` (InstructBLIP model)
  - `InternVLConfig` configuration class: `InternVLForConditionalGeneration` (InternVL model)
  - `JanusConfig` configuration class: `JanusForConditionalGeneration` (Janus model)
  - `Kosmos2Config` configuration class: `Kosmos2ForConditionalGeneration` (KOSMOS-2 model)
  - `Kosmos2_5Config` configuration class: `Kosmos2_5ForConditionalGeneration` (KOSMOS-2.5 model)
  - `Lfm2VlConfig` configuration class: `Lfm2VlForConditionalGeneration` (Lfm2Vl model)
  - `Llama4Config` configuration class: `Llama4ForConditionalGeneration` (Llama4 model)
  - `LlavaConfig` configuration class: `LlavaForConditionalGeneration` (LLaVa model)
  - `LlavaNextConfig` configuration class: `LlavaNextForConditionalGeneration` (LLaVA-NeXT model)
  - `LlavaNextVideoConfig` configuration class: `LlavaNextVideoForConditionalGeneration` (LLaVa-NeXT-Video model)
  - `LlavaOnevisionConfig` configuration class: `LlavaOnevisionForConditionalGeneration` (LLaVA-Onevision model)
  - `Mistral3Config` configuration class: `Mistral3ForConditionalGeneration` (Mistral3 model)
  - `MllamaConfig` configuration class: `MllamaForConditionalGeneration` (Mllama model)
  - `Ovis2Config` configuration class: `Ovis2ForConditionalGeneration` (Ovis2 model)
  - `PaliGemmaConfig` configuration class: `PaliGemmaForConditionalGeneration` (PaliGemma model)
  - `PerceptionLMConfig` configuration class: `PerceptionLMForConditionalGeneration` (PerceptionLM model)
  - `Pix2StructConfig` configuration class: `Pix2StructForConditionalGeneration` (Pix2Struct model)
  - `PixtralVisionConfig` configuration class: `LlavaForConditionalGeneration` (Pixtral model)
  - `Qwen2VLConfig` configuration class: `Qwen2VLForConditionalGeneration` (Qwen2VL model)
  - `Qwen2_5_VLConfig` configuration class: `Qwen2_5_VLForConditionalGeneration` (Qwen2_5_VL model)
  - `Qwen3VLConfig` configuration class: `Qwen3VLForConditionalGeneration` (Qwen3VL model)
  - `Qwen3VLMoeConfig` configuration class: `Qwen3VLMoeForConditionalGeneration` (Qwen3VLMoe model)
  - `ShieldGemma2Config` configuration class: `Gemma3ForConditionalGeneration` (Shieldgemma2 model)
  - `SmolVLMConfig` configuration class: `SmolVLMForConditionalGeneration` (SmolVLM model)
  - `UdopConfig` configuration class: `UdopForConditionalGeneration` (UDOP model)
  - `VipLlavaConfig` configuration class: `VipLlavaForConditionalGeneration` (VipLlava model)
  - `VisionEncoderDecoderConfig` configuration class: `VisionEncoderDecoderModel` (Vision Encoder decoder model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an image-text-to-text modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForImageTextToText

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForImageTextToText.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:

- `AriaConfig` configuration class: `AriaForConditionalGeneration` (Aria model)
- `AyaVisionConfig` configuration class: `AyaVisionForConditionalGeneration` (AyaVision model)
- [Blip2Config](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2Config) configuration class: [Blip2ForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2ForConditionalGeneration) (BLIP-2 model)
- [BlipConfig](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipConfig) configuration class: [BlipForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipForConditionalGeneration) (BLIP model)
- `ChameleonConfig` configuration class: `ChameleonForConditionalGeneration` (Chameleon model)
- `Cohere2VisionConfig` configuration class: `Cohere2VisionForConditionalGeneration` (Cohere2Vision model)
- `DeepseekVLConfig` configuration class: `DeepseekVLForConditionalGeneration` (DeepseekVL model)
- `DeepseekVLHybridConfig` configuration class: `DeepseekVLHybridForConditionalGeneration` (DeepseekVLHybrid model)
- `Emu3Config` configuration class: `Emu3ForConditionalGeneration` (Emu3 model)
- `EvollaConfig` configuration class: `EvollaForProteinText2Text` (Evolla model)
- `Florence2Config` configuration class: `Florence2ForConditionalGeneration` (Florence2 model)
- `FuyuConfig` configuration class: `FuyuForCausalLM` (Fuyu model)
- `Gemma3Config` configuration class: `Gemma3ForConditionalGeneration` (Gemma3ForConditionalGeneration model)
- `Gemma3nConfig` configuration class: `Gemma3nForConditionalGeneration` (Gemma3nForConditionalGeneration model)
- `GitConfig` configuration class: `GitForCausalLM` (GIT model)
- `Glm4vConfig` configuration class: `Glm4vForConditionalGeneration` (GLM4V model)
- `Glm4vMoeConfig` configuration class: `Glm4vMoeForConditionalGeneration` (GLM4VMOE model)
- `GotOcr2Config` configuration class: `GotOcr2ForConditionalGeneration` (GOT-OCR2 model)
- `Idefics2Config` configuration class: `Idefics2ForConditionalGeneration` (Idefics2 model)
- `Idefics3Config` configuration class: `Idefics3ForConditionalGeneration` (Idefics3 model)
- `IdeficsConfig` configuration class: `IdeficsForVisionText2Text` (IDEFICS model)
- `InstructBlipConfig` configuration class: `InstructBlipForConditionalGeneration` (InstructBLIP model)
- `InternVLConfig` configuration class: `InternVLForConditionalGeneration` (InternVL model)
- `JanusConfig` configuration class: `JanusForConditionalGeneration` (Janus model)
- `Kosmos2Config` configuration class: `Kosmos2ForConditionalGeneration` (KOSMOS-2 model)
- `Kosmos2_5Config` configuration class: `Kosmos2_5ForConditionalGeneration` (KOSMOS-2.5 model)
- `Lfm2VlConfig` configuration class: `Lfm2VlForConditionalGeneration` (Lfm2Vl model)
- `Llama4Config` configuration class: `Llama4ForConditionalGeneration` (Llama4 model)
- `LlavaConfig` configuration class: `LlavaForConditionalGeneration` (LLaVa model)
- `LlavaNextConfig` configuration class: `LlavaNextForConditionalGeneration` (LLaVA-NeXT model)
- `LlavaNextVideoConfig` configuration class: `LlavaNextVideoForConditionalGeneration` (LLaVa-NeXT-Video model)
- `LlavaOnevisionConfig` configuration class: `LlavaOnevisionForConditionalGeneration` (LLaVA-Onevision model)
- `Mistral3Config` configuration class: `Mistral3ForConditionalGeneration` (Mistral3 model)
- `MllamaConfig` configuration class: `MllamaForConditionalGeneration` (Mllama model)
- `Ovis2Config` configuration class: `Ovis2ForConditionalGeneration` (Ovis2 model)
- `PaliGemmaConfig` configuration class: `PaliGemmaForConditionalGeneration` (PaliGemma model)
- `PerceptionLMConfig` configuration class: `PerceptionLMForConditionalGeneration` (PerceptionLM model)
- `Pix2StructConfig` configuration class: `Pix2StructForConditionalGeneration` (Pix2Struct model)
- `PixtralVisionConfig` configuration class: `LlavaForConditionalGeneration` (Pixtral model)
- `Qwen2VLConfig` configuration class: `Qwen2VLForConditionalGeneration` (Qwen2VL model)
- `Qwen2_5_VLConfig` configuration class: `Qwen2_5_VLForConditionalGeneration` (Qwen2_5_VL model)
- `Qwen3VLConfig` configuration class: `Qwen3VLForConditionalGeneration` (Qwen3VL model)
- `Qwen3VLMoeConfig` configuration class: `Qwen3VLMoeForConditionalGeneration` (Qwen3VLMoe model)
- `ShieldGemma2Config` configuration class: `Gemma3ForConditionalGeneration` (Shieldgemma2 model)
- `SmolVLMConfig` configuration class: `SmolVLMForConditionalGeneration` (SmolVLM model)
- `UdopConfig` configuration class: `UdopForConditionalGeneration` (UDOP model)
- `VipLlavaConfig` configuration class: `VipLlavaForConditionalGeneration` (VipLlava model)
- `VisionEncoderDecoderConfig` configuration class: `VisionEncoderDecoderModel` (Vision Encoder decoder model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForImageTextToText.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an image-text-to-text modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **aria** -- `AriaForConditionalGeneration` (Aria model)
- **aya_vision** -- `AyaVisionForConditionalGeneration` (AyaVision model)
- **blip** -- [BlipForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blip#transformers.BlipForConditionalGeneration) (BLIP model)
- **blip-2** -- [Blip2ForConditionalGeneration](/docs/transformers/v4.57.1/ja/model_doc/blip-2#transformers.Blip2ForConditionalGeneration) (BLIP-2 model)
- **chameleon** -- `ChameleonForConditionalGeneration` (Chameleon model)
- **cohere2_vision** -- `Cohere2VisionForConditionalGeneration` (Cohere2Vision model)
- **deepseek_vl** -- `DeepseekVLForConditionalGeneration` (DeepseekVL model)
- **deepseek_vl_hybrid** -- `DeepseekVLHybridForConditionalGeneration` (DeepseekVLHybrid model)
- **emu3** -- `Emu3ForConditionalGeneration` (Emu3 model)
- **evolla** -- `EvollaForProteinText2Text` (Evolla model)
- **florence2** -- `Florence2ForConditionalGeneration` (Florence2 model)
- **fuyu** -- `FuyuForCausalLM` (Fuyu model)
- **gemma3** -- `Gemma3ForConditionalGeneration` (Gemma3ForConditionalGeneration model)
- **gemma3n** -- `Gemma3nForConditionalGeneration` (Gemma3nForConditionalGeneration model)
- **git** -- `GitForCausalLM` (GIT model)
- **glm4v** -- `Glm4vForConditionalGeneration` (GLM4V model)
- **glm4v_moe** -- `Glm4vMoeForConditionalGeneration` (GLM4VMOE model)
- **got_ocr2** -- `GotOcr2ForConditionalGeneration` (GOT-OCR2 model)
- **idefics** -- `IdeficsForVisionText2Text` (IDEFICS model)
- **idefics2** -- `Idefics2ForConditionalGeneration` (Idefics2 model)
- **idefics3** -- `Idefics3ForConditionalGeneration` (Idefics3 model)
- **instructblip** -- `InstructBlipForConditionalGeneration` (InstructBLIP model)
- **internvl** -- `InternVLForConditionalGeneration` (InternVL model)
- **janus** -- `JanusForConditionalGeneration` (Janus model)
- **kosmos-2** -- `Kosmos2ForConditionalGeneration` (KOSMOS-2 model)
- **kosmos-2.5** -- `Kosmos2_5ForConditionalGeneration` (KOSMOS-2.5 model)
- **lfm2_vl** -- `Lfm2VlForConditionalGeneration` (Lfm2Vl model)
- **llama4** -- `Llama4ForConditionalGeneration` (Llama4 model)
- **llava** -- `LlavaForConditionalGeneration` (LLaVa model)
- **llava_next** -- `LlavaNextForConditionalGeneration` (LLaVA-NeXT model)
- **llava_next_video** -- `LlavaNextVideoForConditionalGeneration` (LLaVa-NeXT-Video model)
- **llava_onevision** -- `LlavaOnevisionForConditionalGeneration` (LLaVA-Onevision model)
- **mistral3** -- `Mistral3ForConditionalGeneration` (Mistral3 model)
- **mllama** -- `MllamaForConditionalGeneration` (Mllama model)
- **ovis2** -- `Ovis2ForConditionalGeneration` (Ovis2 model)
- **paligemma** -- `PaliGemmaForConditionalGeneration` (PaliGemma model)
- **perception_lm** -- `PerceptionLMForConditionalGeneration` (PerceptionLM model)
- **pix2struct** -- `Pix2StructForConditionalGeneration` (Pix2Struct model)
- **pixtral** -- `LlavaForConditionalGeneration` (Pixtral model)
- **qwen2_5_vl** -- `Qwen2_5_VLForConditionalGeneration` (Qwen2_5_VL model)
- **qwen2_vl** -- `Qwen2VLForConditionalGeneration` (Qwen2VL model)
- **qwen3_vl** -- `Qwen3VLForConditionalGeneration` (Qwen3VL model)
- **qwen3_vl_moe** -- `Qwen3VLMoeForConditionalGeneration` (Qwen3VLMoe model)
- **shieldgemma2** -- `Gemma3ForConditionalGeneration` (Shieldgemma2 model)
- **smolvlm** -- `SmolVLMForConditionalGeneration` (SmolVLM model)
- **udop** -- `UdopForConditionalGeneration` (UDOP model)
- **vipllava** -- `VipLlavaForConditionalGeneration` (VipLlava model)
- **vision-encoder-decoder** -- `VisionEncoderDecoderModel` (Vision Encoder decoder model)
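
The selection order above (exact `model_type` lookup first, then pattern matching on the name or path) can be sketched as follows; `resolve_model_class` and the two-entry mapping are illustrative only, not part of the transformers API:

```python
# Illustrative sketch of the resolution order described above, using a tiny
# subset of the real model_type -> class mapping.
MODEL_MAPPING = {
    "llava": "LlavaForConditionalGeneration",
    "paligemma": "PaliGemmaForConditionalGeneration",
}

def resolve_model_class(model_type, name_or_path):
    # 1) Exact lookup on the config's `model_type`.
    if model_type in MODEL_MAPPING:
        return MODEL_MAPPING[model_type]
    # 2) Fallback: pattern matching on the model name or path.
    for key, cls in MODEL_MAPPING.items():
        if key in name_or_path:
            return cls
    raise ValueError(f"Unrecognized model identifier: {name_or_path!r}")

print(resolve_model_class(None, "llava-hf/llava-1.5-7b-hf"))  # LlavaForConditionalGeneration
```

The real mapping is much larger and orders its patterns so that more specific keys (e.g. `llava_next`) are matched before shorter ones.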

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.
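
The difference between the two modes can be illustrated with a toy stand-in for a dropout module (pure Python; the real modules are PyTorch `nn.Module`s, and `ToyDropout` is hypothetical):

```python
# Toy illustration of why eval mode matters: this stand-in "dropout" only
# zeroes activations while in training mode, mirroring the behavior of a
# real dropout module.
import random

class ToyDropout:
    def __init__(self, p=0.5):
        self.p = p
        self.training = True  # frameworks construct modules in train mode

    def eval(self):
        self.training = False

    def train(self):
        self.training = True

    def __call__(self, x):
        if not self.training:
            return x  # eval mode: deterministic, dropout disabled
        return [v if random.random() > self.p else 0.0 for v in x]

layer = ToyDropout()
layer.eval()
print(layer([1.0, 2.0]))  # [1.0, 2.0] -- unchanged in eval mode
```

`from_pretrained()` leaves the model in the equivalent of `eval()`; call `train()` before fine-tuning so stochastic layers behave as intended.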

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForImageTextToText

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageTextToText.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForImageTextToText.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForImageTextToText.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

## Time Series

### AutoModelForTimeSeriesPrediction[[transformers.AutoModelForTimeSeriesPrediction]]

#### transformers.AutoModelForTimeSeriesPrediction[[transformers.AutoModelForTimeSeriesPrediction]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2101)

This is a generic model class that will be instantiated as one of the model classes of the library (with a time-series prediction head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForTimeSeriesPrediction.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `TimesFmConfig` configuration class: `TimesFmModelForPrediction` (TimesFm model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a time-series prediction head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ja/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTimeSeriesPrediction

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/timesfm-2.0-500m-pytorch")
>>> model = AutoModelForTimeSeriesPrediction.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `TimesFmConfig` configuration class: `TimesFmModelForPrediction` (TimesFm model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForTimeSeriesPrediction.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a time-series prediction head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **timesfm** -- `TimesFmModelForPrediction` (TimesFm model)
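
A minimal sketch of that dispatch (a simplified schematic, not the actual transformers internals): the auto class keeps a mapping from `model_type` strings to model classes and instantiates the matching one.

```python
# Schematic of Auto-class dispatch on config.model_type (not the actual
# transformers implementation; the classes below are simplified stand-ins).
class TimesFmConfig:
    model_type = "timesfm"

class TimesFmModelForPrediction:
    def __init__(self, config):
        self.config = config

# mapping the auto class consults (simplified stand-in)
MODEL_MAPPING = {"timesfm": TimesFmModelForPrediction}

def resolve_and_instantiate(config):
    # select the model class from the config's model_type property
    model_cls = MODEL_MAPPING[config.model_type]
    return model_cls(config)

model = resolve_and_instantiate(TimesFmConfig())
print(type(model).__name__)  # TimesFmModelForPrediction
```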

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTimeSeriesPrediction

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTimeSeriesPrediction.from_pretrained("google/timesfm-2.0-500m-pytorch")

>>> # Update configuration during loading
>>> model = AutoModelForTimeSeriesPrediction.from_pretrained("google/timesfm-2.0-500m-pytorch", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForTimeSeriesPrediction.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ja/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
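
That two-way kwargs behavior can be sketched as follows (a simplified schematic, not the actual transformers logic): keys that match existing config attributes override the config, and everything left over is forwarded to the model's `__init__`.

```python
# Simplified schematic of the kwargs split described above (not the actual
# transformers logic): keys matching config attributes update the config;
# the remainder is passed through to the model's __init__.
class Config:
    def __init__(self):
        self.output_attentions = False

def split_kwargs(config, kwargs):
    for key in list(kwargs):
        if hasattr(config, key):
            setattr(config, key, kwargs.pop(key))
    return config, kwargs  # leftover kwargs go to the model __init__

config, model_kwargs = split_kwargs(
    Config(), {"output_attentions": True, "custom_arg": 1}
)
print(config.output_attentions, model_kwargs)  # True {'custom_arg': 1}
```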

