Auto Classes
In many cases, the architecture you want to use can be guessed from the name or the path of the pretrained model you
are supplying to the from_pretrained() method. AutoClasses are here to do this job for you so that you
automatically retrieve the relevant model given the name/path to the pretrained weights/config/vocabulary.
Instantiating one of AutoConfig, AutoModel, and
AutoTokenizer will directly create a class of the relevant architecture. For instance
model = AutoModel.from_pretrained('bert-base-cased')
will create a model that is an instance of BertModel.
There is one class of AutoModel for each task, and for each backend (PyTorch or TensorFlow).
AutoConfig

class transformers.AutoConfig
This is a generic configuration class that will be instantiated as one of the configuration classes of the library when created with the from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_pretrained(pretrained_model_name_or_path, **kwargs)
Instantiate one of the configuration classes of the library from a pretrained model configuration.
The configuration class to instantiate is selected based on the model_type property of the config object that is loaded, or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- speech_to_text: Speech2TextConfig (Speech2Text model)
- wav2vec2: Wav2Vec2Config (Wav2Vec2 model)
- m2m_100: M2M100Config (M2M100 model)
- convbert: ConvBertConfig (ConvBERT model)
- led: LEDConfig (LED model)
- blenderbot-small: BlenderbotSmallConfig (BlenderbotSmall model)
- retribert: RetriBertConfig (RetriBERT model)
- ibert: IBertConfig (I-BERT model)
- mt5: MT5Config (mT5 model)
- t5: T5Config (T5 model)
- mobilebert: MobileBertConfig (MobileBERT model)
- distilbert: DistilBertConfig (DistilBERT model)
- albert: AlbertConfig (ALBERT model)
- bert-generation: BertGenerationConfig (Bert Generation model)
- camembert: CamembertConfig (CamemBERT model)
- xlm-roberta: XLMRobertaConfig (XLM-RoBERTa model)
- pegasus: PegasusConfig (Pegasus model)
- marian: MarianConfig (Marian model)
- mbart: MBartConfig (mBART model)
- mpnet: MPNetConfig (MPNet model)
- bart: BartConfig (BART model)
- blenderbot: BlenderbotConfig (Blenderbot model)
- reformer: ReformerConfig (Reformer model)
- longformer: LongformerConfig (Longformer model)
- roberta: RobertaConfig (RoBERTa model)
- deberta-v2: DebertaV2Config (DeBERTa-v2 model)
- deberta: DebertaConfig (DeBERTa model)
- flaubert: FlaubertConfig (FlauBERT model)
- fsmt: FSMTConfig (FairSeq Machine-Translation model)
- squeezebert: SqueezeBertConfig (SqueezeBERT model)
- bert: BertConfig (BERT model)
- openai-gpt: OpenAIGPTConfig (OpenAI GPT model)
- gpt2: GPT2Config (OpenAI GPT-2 model)
- transfo-xl: TransfoXLConfig (Transformer-XL model)
- xlnet: XLNetConfig (XLNet model)
- xlm-prophetnet: XLMProphetNetConfig (XLMProphetNet model)
- prophetnet: ProphetNetConfig (ProphetNet model)
- xlm: XLMConfig (XLM model)
- ctrl: CTRLConfig (CTRL model)
- electra: ElectraConfig (ELECTRA model)
- encoder-decoder: EncoderDecoderConfig (Encoder decoder model)
- funnel: FunnelConfig (Funnel Transformer model)
- lxmert: LxmertConfig (LXMERT model)
- dpr: DPRConfig (DPR model)
- layoutlm: LayoutLMConfig (LayoutLM model)
- rag: RagConfig (RAG model)
- tapas: TapasConfig (TAPAS model)
Parameters
- pretrained_model_name_or_path (str or os.PathLike): Can be either:
    - A string, the model id of a pretrained model configuration hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    - A path to a directory containing a configuration file saved using the save_pretrained() method, e.g., ./my_model_directory/.
    - A path or url to a saved configuration JSON file, e.g., ./my_model_directory/configuration.json.
- cache_dir (str or os.PathLike, optional): Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False): Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- resume_download (bool, optional, defaults to False): Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
- proxies (Dict[str, str], optional): A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- revision (str, optional, defaults to "main"): The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
- return_unused_kwargs (bool, optional, defaults to False): If False, then this function returns just the final configuration object. If True, then this function returns a Tuple(config, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not configuration attributes: i.e., the part of kwargs which has not been used to update config and is otherwise ignored.
- kwargs (additional keyword arguments, optional): The values in kwargs of any keys which are configuration attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not configuration attributes is controlled by the return_unused_kwargs keyword parameter.
Examples:
>>> from transformers import AutoConfig

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')

>>> # Download configuration from huggingface.co (user-uploaded) and cache.
>>> config = AutoConfig.from_pretrained('dbmdz/bert-base-german-cased')

>>> # If configuration file is in a directory (e.g., was saved using `save_pretrained('./test/bert_saved_model/')`).
>>> config = AutoConfig.from_pretrained('./test/bert_saved_model/')

>>> # Load a specific configuration file.
>>> config = AutoConfig.from_pretrained('./test/bert_saved_model/my_configuration.json')

>>> # Change some config attributes when loading a pretrained config.
>>> config = AutoConfig.from_pretrained('bert-base-uncased', output_attentions=True, foo=False)
>>> config.output_attentions
True
>>> config, unused_kwargs = AutoConfig.from_pretrained('bert-base-uncased', output_attentions=True, foo=False, return_unused_kwargs=True)
>>> config.output_attentions
True
>>> unused_kwargs
{'foo': False}
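The selection rule described above (exact lookup on model_type, then pattern matching on the name or path) can be illustrated with a small standalone sketch. This is not the library's code: the registry is a hypothetical, shortened copy of the list above that stores class names as strings, and select_config_name is a name invented for this illustration. It assumes the fallback takes the first pattern found in the name, which is consistent with the list placing more specific names (distilbert, xlm-roberta) before the names they contain (bert, roberta).

```python
from collections import OrderedDict
from typing import Optional

# Hypothetical, shortened registry for illustration only.
# Order matters for the fallback: more specific patterns come first.
CONFIG_NAMES = OrderedDict([
    ("distilbert", "DistilBertConfig"),
    ("xlm-roberta", "XLMRobertaConfig"),
    ("roberta", "RobertaConfig"),
    ("bert", "BertConfig"),
    ("gpt2", "GPT2Config"),
])


def select_config_name(config_dict: dict, name_or_path: str) -> str:
    """Return the configuration class name for a loaded config dictionary."""
    model_type: Optional[str] = config_dict.get("model_type")
    if model_type is not None:
        # Preferred route: exact lookup on the declared model type.
        return CONFIG_NAMES[model_type]
    # Fallback: the first pattern that occurs in the name or path wins.
    for pattern, class_name in CONFIG_NAMES.items():
        if pattern in name_or_path:
            return class_name
    raise ValueError(f"Unrecognized model in {name_or_path!r}")


print(select_config_name({"model_type": "gpt2"}, "some/local/path"))
print(select_config_name({}, "distilbert-base-uncased"))
```

With this ordering, a name such as distilbert-base-uncased resolves to the DistilBERT entry even though it also contains the substring bert.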
AutoTokenizer

class transformers.AutoTokenizer
This is a generic tokenizer class that will be instantiated as one of the tokenizer classes of the library when created with the AutoTokenizer.from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
Instantiate one of the tokenizer classes of the library from a pretrained model vocabulary.
The tokenizer class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- speech_to_text: Speech2TextTokenizer (Speech2Text model)
- wav2vec2: Wav2Vec2CTCTokenizer (Wav2Vec2 model)
- m2m_100: M2M100Tokenizer (M2M100 model)
- convbert: ConvBertTokenizer (ConvBERT model)
- led: LEDTokenizer (LED model)
- blenderbot-small: BlenderbotSmallTokenizer (BlenderbotSmall model)
- retribert: RetriBertTokenizer (RetriBERT model)
- ibert: RobertaTokenizer (I-BERT model)
- mt5: T5Tokenizer (mT5 model)
- t5: T5Tokenizer (T5 model)
- mobilebert: MobileBertTokenizer (MobileBERT model)
- distilbert: DistilBertTokenizer (DistilBERT model)
- albert: AlbertTokenizer (ALBERT model)
- bert-generation: BertGenerationTokenizer (Bert Generation model)
- camembert: CamembertTokenizer (CamemBERT model)
- xlm-roberta: XLMRobertaTokenizer (XLM-RoBERTa model)
- pegasus: PegasusTokenizer (Pegasus model)
- marian: MarianTokenizer (Marian model)
- mbart: MBartTokenizer (mBART model)
- mpnet: MPNetTokenizer (MPNet model)
- bart: BartTokenizer (BART model)
- blenderbot: BlenderbotTokenizer (Blenderbot model)
- reformer: ReformerTokenizer (Reformer model)
- longformer: LongformerTokenizer (Longformer model)
- roberta: RobertaTokenizer (RoBERTa model)
- deberta-v2: DebertaV2Tokenizer (DeBERTa-v2 model)
- deberta: DebertaTokenizer (DeBERTa model)
- flaubert: FlaubertTokenizer (FlauBERT model)
- fsmt: FSMTTokenizer (FairSeq Machine-Translation model)
- squeezebert: SqueezeBertTokenizer (SqueezeBERT model)
- bert: BertTokenizer (BERT model)
- openai-gpt: OpenAIGPTTokenizer (OpenAI GPT model)
- gpt2: GPT2Tokenizer (OpenAI GPT-2 model)
- transfo-xl: TransfoXLTokenizer (Transformer-XL model)
- xlnet: XLNetTokenizer (XLNet model)
- xlm-prophetnet: XLMProphetNetTokenizer (XLMProphetNet model)
- prophetnet: ProphetNetTokenizer (ProphetNet model)
- xlm: XLMTokenizer (XLM model)
- ctrl: CTRLTokenizer (CTRL model)
- electra: ElectraTokenizer (ELECTRA model)
- funnel: FunnelTokenizer (Funnel Transformer model)
- lxmert: LxmertTokenizer (LXMERT model)
- dpr: DPRQuestionEncoderTokenizer (DPR model)
- layoutlm: LayoutLMTokenizer (LayoutLM model)
- rag: RagTokenizer (RAG model)
- tapas: TapasTokenizer (TAPAS model)
Parameters
- pretrained_model_name_or_path (str or os.PathLike): Can be either:
    - A string, the model id of a predefined tokenizer hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    - A path to a directory containing vocabulary files required by the tokenizer, for instance saved using the save_pretrained() method, e.g., ./my_model_directory/.
    - A path or url to a single saved vocabulary file if and only if the tokenizer only requires a single vocabulary file (like Bert or XLNet), e.g., ./my_model_directory/vocab.txt. (Not applicable to all derived classes)
- inputs (additional positional arguments, optional): Will be passed along to the Tokenizer __init__() method.
- config (PretrainedConfig, optional): The configuration object used to determine the tokenizer class to instantiate.
- cache_dir (str or os.PathLike, optional): Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False): Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- resume_download (bool, optional, defaults to False): Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
- proxies (Dict[str, str], optional): A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- revision (str, optional, defaults to "main"): The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
- subfolder (str, optional): In case the relevant files are located inside a subfolder of the model repo on huggingface.co (e.g. for facebook/rag-token-base), specify it here.
- use_fast (bool, optional, defaults to True): Whether or not to try to load the fast version of the tokenizer.
- kwargs (additional keyword arguments, optional): Will be passed to the Tokenizer __init__() method. Can be used to set special tokens like bos_token, eos_token, unk_token, sep_token, pad_token, cls_token, mask_token, additional_special_tokens. See parameters in the __init__() for more details.
Examples:
>>> from transformers import AutoTokenizer

>>> # Download vocabulary from huggingface.co and cache.
>>> tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

>>> # Download vocabulary from huggingface.co (user-uploaded) and cache.
>>> tokenizer = AutoTokenizer.from_pretrained('dbmdz/bert-base-german-cased')

>>> # If vocabulary files are in a directory (e.g. tokenizer was saved using `save_pretrained('./test/bert_saved_model/')`)
>>> tokenizer = AutoTokenizer.from_pretrained('./test/bert_saved_model/')
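The use_fast parameter says the fast tokenizer is only tried, which suggests a fallback to the regular tokenizer when no fast version exists. The sketch below illustrates that reading with plain Python. It is an assumption-laden illustration, not the library's implementation: TokenizerPair, TOKENIZERS and select_tokenizer_name are hypothetical names, the slow class names come from the list above, and the fast entries are assumptions made for the example.

```python
from typing import NamedTuple, Optional


class TokenizerPair(NamedTuple):
    slow: str            # name of the regular (Python) tokenizer class
    fast: Optional[str]  # name of the fast tokenizer class, or None


# Hypothetical registry. Slow names are taken from the list above;
# the fast names (or their absence) are assumptions for this sketch.
TOKENIZERS = {
    "bert": TokenizerPair("BertTokenizer", "BertTokenizerFast"),
    "ctrl": TokenizerPair("CTRLTokenizer", None),
}


def select_tokenizer_name(model_type: str, use_fast: bool = True) -> str:
    """Pick the fast tokenizer when requested and available, else the slow one."""
    pair = TOKENIZERS[model_type]
    if use_fast and pair.fast is not None:
        return pair.fast
    return pair.slow


print(select_tokenizer_name("bert"))                  # fast variant is chosen
print(select_tokenizer_name("bert", use_fast=False))  # slow variant is forced
print(select_tokenizer_name("ctrl"))                  # no fast variant: falls back
```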
AutoModel

class transformers.AutoModel
This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config)
Instantiates one of the base model classes of the library from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Parameters
- config (PretrainedConfig): The model class to instantiate is selected based on the configuration class:
    - Speech2TextConfig configuration class: Speech2TextModel (Speech2Text model)
    - Wav2Vec2Config configuration class: Wav2Vec2Model (Wav2Vec2 model)
    - M2M100Config configuration class: M2M100Model (M2M100 model)
    - ConvBertConfig configuration class: ConvBertModel (ConvBERT model)
    - BlenderbotSmallConfig configuration class: BlenderbotSmallModel (BlenderbotSmall model)
    - RetriBertConfig configuration class: RetriBertModel (RetriBERT model)
    - PegasusConfig configuration class: PegasusModel (Pegasus model)
    - MarianConfig configuration class: MarianModel (Marian model)
    - MBartConfig configuration class: MBartModel (mBART model)
    - BlenderbotConfig configuration class: BlenderbotModel (Blenderbot model)
    - DistilBertConfig configuration class: DistilBertModel (DistilBERT model)
    - AlbertConfig configuration class: AlbertModel (ALBERT model)
    - CamembertConfig configuration class: CamembertModel (CamemBERT model)
    - XLMRobertaConfig configuration class: XLMRobertaModel (XLM-RoBERTa model)
    - BartConfig configuration class: BartModel (BART model)
    - LongformerConfig configuration class: LongformerModel (Longformer model)
    - RobertaConfig configuration class: RobertaModel (RoBERTa model)
    - LayoutLMConfig configuration class: LayoutLMModel (LayoutLM model)
    - SqueezeBertConfig configuration class: SqueezeBertModel (SqueezeBERT model)
    - BertConfig configuration class: BertModel (BERT model)
    - OpenAIGPTConfig configuration class: OpenAIGPTModel (OpenAI GPT model)
    - GPT2Config configuration class: GPT2Model (OpenAI GPT-2 model)
    - MobileBertConfig configuration class: MobileBertModel (MobileBERT model)
    - TransfoXLConfig configuration class: TransfoXLModel (Transformer-XL model)
    - XLNetConfig configuration class: XLNetModel (XLNet model)
    - FlaubertConfig configuration class: FlaubertModel (FlauBERT model)
    - FSMTConfig configuration class: FSMTModel (FairSeq Machine-Translation model)
    - CTRLConfig configuration class: CTRLModel (CTRL model)
    - ElectraConfig configuration class: ElectraModel (ELECTRA model)
    - ReformerConfig configuration class: ReformerModel (Reformer model)
    - FunnelConfig configuration class: FunnelModel (Funnel Transformer model)
    - LxmertConfig configuration class: LxmertModel (LXMERT model)
    - BertGenerationConfig configuration class: BertGenerationEncoder (Bert Generation model)
    - DebertaConfig configuration class: DebertaModel (DeBERTa model)
    - DebertaV2Config configuration class: DebertaV2Model (DeBERTa-v2 model)
    - DPRConfig configuration class: DPRQuestionEncoder (DPR model)
    - XLMProphetNetConfig configuration class: XLMProphetNetModel (XLMProphetNet model)
    - ProphetNetConfig configuration class: ProphetNetModel (ProphetNet model)
    - MPNetConfig configuration class: MPNetModel (MPNet model)
    - TapasConfig configuration class: TapasModel (TAPAS model)
    - IBertConfig configuration class: IBertModel (I-BERT model)
Examples:
>>> from transformers import AutoConfig, AutoModel

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModel.from_config(config)
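To make the from_config() behaviour concrete, here is a minimal standalone sketch of dispatching on the configuration class without touching any saved weights. Everything in it is a toy stand-in invented for illustration (ToyBertConfig, ToyGPT2Config, ToyModel, TOY_MODEL_MAPPING, model_from_config); none of these are the library's classes, and the real model construction is far more involved.

```python
class ToyBertConfig:
    """Toy stand-in for a configuration class (not the library's BertConfig)."""

    def __init__(self, hidden_size: int = 8):
        self.hidden_size = hidden_size


class ToyGPT2Config:
    """Second toy configuration class, used to show the dispatch."""

    def __init__(self, hidden_size: int = 8):
        self.hidden_size = hidden_size


class ToyModel:
    def __init__(self, config):
        self.config = config
        # Freshly initialized placeholder weights; no checkpoint is read.
        self.weights = [0.0] * config.hidden_size
        self.loaded_from_checkpoint = False


class ToyBertModel(ToyModel):
    pass


class ToyGPT2Model(ToyModel):
    pass


# Hypothetical mapping from configuration class to model class.
TOY_MODEL_MAPPING = {
    ToyBertConfig: ToyBertModel,
    ToyGPT2Config: ToyGPT2Model,
}


def model_from_config(config):
    """Build the model whose class is registered for this configuration class."""
    for config_class, model_class in TOY_MODEL_MAPPING.items():
        if isinstance(config, config_class):
            return model_class(config)
    raise ValueError(f"Unrecognized configuration class {type(config).__name__}")


model = model_from_config(ToyGPT2Config(hidden_size=4))
print(type(model).__name__, model.loaded_from_checkpoint)
```

The point of the sketch is the note above: the configuration decides the architecture and its sizes, while the weights stay at their initial values until a checkpoint is loaded separately.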

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the base model classes of the library from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- speech_to_text: Speech2TextModel (Speech2Text model)
- wav2vec2: Wav2Vec2Model (Wav2Vec2 model)
- m2m_100: M2M100Model (M2M100 model)
- convbert: ConvBertModel (ConvBERT model)
- led: LEDModel (LED model)
- blenderbot-small: BlenderbotSmallModel (BlenderbotSmall model)
- retribert: RetriBertModel (RetriBERT model)
- ibert: IBertModel (I-BERT model)
- mt5: MT5Model (mT5 model)
- t5: T5Model (T5 model)
- mobilebert: MobileBertModel (MobileBERT model)
- distilbert: DistilBertModel (DistilBERT model)
- albert: AlbertModel (ALBERT model)
- bert-generation: BertGenerationEncoder (Bert Generation model)
- camembert: CamembertModel (CamemBERT model)
- xlm-roberta: XLMRobertaModel (XLM-RoBERTa model)
- pegasus: PegasusModel (Pegasus model)
- marian: MarianModel (Marian model)
- mbart: MBartModel (mBART model)
- mpnet: MPNetModel (MPNet model)
- bart: BartModel (BART model)
- blenderbot: BlenderbotModel (Blenderbot model)
- reformer: ReformerModel (Reformer model)
- longformer: LongformerModel (Longformer model)
- roberta: RobertaModel (RoBERTa model)
- deberta-v2: DebertaV2Model (DeBERTa-v2 model)
- deberta: DebertaModel (DeBERTa model)
- flaubert: FlaubertModel (FlauBERT model)
- fsmt: FSMTModel (FairSeq Machine-Translation model)
- squeezebert: SqueezeBertModel (SqueezeBERT model)
- bert: BertModel (BERT model)
- openai-gpt: OpenAIGPTModel (OpenAI GPT model)
- gpt2: GPT2Model (OpenAI GPT-2 model)
- transfo-xl: TransfoXLModel (Transformer-XL model)
- xlnet: XLNetModel (XLNet model)
- xlm-prophetnet: XLMProphetNetModel (XLMProphetNet model)
- prophetnet: ProphetNetModel (ProphetNet model)
- xlm: XLMModel (XLM model)
- ctrl: CTRLModel (CTRL model)
- electra: ElectraModel (ELECTRA model)
- funnel: FunnelModel (Funnel Transformer model)
- lxmert: LxmertModel (LXMERT model)
- dpr: DPRQuestionEncoder (DPR model)
- layoutlm: LayoutLMModel (LayoutLM model)
- tapas: TapasModel (TAPAS model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters
- pretrained_model_name_or_path (str or os.PathLike): Can be either:
    - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
    - A path or url to a tensorflow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
- model_args (additional positional arguments, optional): Will be passed along to the underlying model __init__() method.
- config (PretrainedConfig, optional): Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
    - The model is a model provided by the library (loaded with the model id string of a pretrained model).
    - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
    - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (Dict[str, torch.Tensor], optional): A state dictionary to use instead of a state dictionary loaded from saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional): Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- from_tf (bool, optional, defaults to False): Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
- force_download (bool, optional, defaults to False): Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- resume_download (bool, optional, defaults to False): Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
- proxies (Dict[str, str], optional): A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False): Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False): Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main"): The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional): Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
    - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
    - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Examples:
>>> from transformers import AutoConfig, AutoModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModel.from_pretrained('bert-base-uncased')

>>> # Update configuration during loading
>>> model = AutoModel.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
>>> model = AutoModel.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)
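The two kwargs behaviours described in the parameter list can be hard to keep apart, so here is a small standalone sketch of the routing. It is a simplified illustration under stated assumptions, not the library's code: configurations are plain dictionaries, DEFAULT_CONFIG stands in for an automatically loaded configuration, and route_kwargs is a hypothetical helper name.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical stand-in for a configuration loaded automatically.
DEFAULT_CONFIG: Dict[str, Any] = {"output_attentions": False, "hidden_size": 768}


def route_kwargs(
    config: Optional[Dict[str, Any]] = None, **kwargs: Any
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (configuration, keyword arguments left for the model __init__)."""
    if config is not None:
        # A configuration was supplied: it is used unchanged and every
        # keyword argument goes straight to the model constructor.
        return config, dict(kwargs)

    # No configuration supplied: load one, then let matching keys override it.
    loaded = dict(DEFAULT_CONFIG)
    model_kwargs: Dict[str, Any] = {}
    for key, value in kwargs.items():
        if key in loaded:
            loaded[key] = value        # configuration attribute: override
        else:
            model_kwargs[key] = value  # anything else: passed to the model
    return loaded, model_kwargs


print(route_kwargs(output_attentions=True, foo=False))
print(route_kwargs({"output_attentions": False}, output_attentions=True))
```

In the second call the supplied configuration keeps output_attentions=False, which mirrors the documented assumption that all configuration updates were made before passing config.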
AutoModelForPreTraining

class transformers.AutoModelForPreTraining
This is a generic model class that will be instantiated as one of the model classes of the library (with the architecture used for pretraining this model) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config)
Instantiates one of the model classes of the library (with the architecture used for pretraining this model) from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Parameters
- config (PretrainedConfig): The model class to instantiate is selected based on the configuration class:
    - LayoutLMConfig configuration class: LayoutLMForMaskedLM (LayoutLM model)
    - RetriBertConfig configuration class: RetriBertModel (RetriBERT model)
    - T5Config configuration class: T5ForConditionalGeneration (T5 model)
    - DistilBertConfig configuration class: DistilBertForMaskedLM (DistilBERT model)
    - AlbertConfig configuration class: AlbertForPreTraining (ALBERT model)
    - CamembertConfig configuration class: CamembertForMaskedLM (CamemBERT model)
    - XLMRobertaConfig configuration class: XLMRobertaForMaskedLM (XLM-RoBERTa model)
    - BartConfig configuration class: BartForConditionalGeneration (BART model)
    - FSMTConfig configuration class: FSMTForConditionalGeneration (FairSeq Machine-Translation model)
    - LongformerConfig configuration class: LongformerForMaskedLM (Longformer model)
    - RobertaConfig configuration class: RobertaForMaskedLM (RoBERTa model)
    - SqueezeBertConfig configuration class: SqueezeBertForMaskedLM (SqueezeBERT model)
    - BertConfig configuration class: BertForPreTraining (BERT model)
    - OpenAIGPTConfig configuration class: OpenAIGPTLMHeadModel (OpenAI GPT model)
    - GPT2Config configuration class: GPT2LMHeadModel (OpenAI GPT-2 model)
    - MobileBertConfig configuration class: MobileBertForPreTraining (MobileBERT model)
    - TransfoXLConfig configuration class: TransfoXLLMHeadModel (Transformer-XL model)
    - XLNetConfig configuration class: XLNetLMHeadModel (XLNet model)
    - FlaubertConfig configuration class: FlaubertWithLMHeadModel (FlauBERT model)
    - XLMConfig configuration class: XLMWithLMHeadModel (XLM model)
    - CTRLConfig configuration class: CTRLLMHeadModel (CTRL model)
    - ElectraConfig configuration class: ElectraForPreTraining (ELECTRA model)
    - LxmertConfig configuration class: LxmertForPreTraining (LXMERT model)
    - FunnelConfig configuration class: FunnelForPreTraining (Funnel Transformer model)
    - MPNetConfig configuration class: MPNetForMaskedLM (MPNet model)
    - TapasConfig configuration class: TapasForMaskedLM (TAPAS model)
    - IBertConfig configuration class: IBertForMaskedLM (I-BERT model)
    - DebertaConfig configuration class: DebertaForMaskedLM (DeBERTa model)
    - DebertaV2Config configuration class: DebertaV2ForMaskedLM (DeBERTa-v2 model)
Examples:
>>> from transformers import AutoConfig, AutoModelForPreTraining

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForPreTraining.from_config(config)

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library (with the architecture used for pretraining this model) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- retribert: RetriBertModel (RetriBERT model)
- ibert: IBertForMaskedLM (I-BERT model)
- t5: T5ForConditionalGeneration (T5 model)
- mobilebert: MobileBertForPreTraining (MobileBERT model)
- distilbert: DistilBertForMaskedLM (DistilBERT model)
- albert: AlbertForPreTraining (ALBERT model)
- camembert: CamembertForMaskedLM (CamemBERT model)
- xlm-roberta: XLMRobertaForMaskedLM (XLM-RoBERTa model)
- mpnet: MPNetForMaskedLM (MPNet model)
- bart: BartForConditionalGeneration (BART model)
- longformer: LongformerForMaskedLM (Longformer model)
- roberta: RobertaForMaskedLM (RoBERTa model)
- deberta-v2: DebertaV2ForMaskedLM (DeBERTa-v2 model)
- deberta: DebertaForMaskedLM (DeBERTa model)
- flaubert: FlaubertWithLMHeadModel (FlauBERT model)
- fsmt: FSMTForConditionalGeneration (FairSeq Machine-Translation model)
- squeezebert: SqueezeBertForMaskedLM (SqueezeBERT model)
- bert: BertForPreTraining (BERT model)
- openai-gpt: OpenAIGPTLMHeadModel (OpenAI GPT model)
- gpt2: GPT2LMHeadModel (OpenAI GPT-2 model)
- transfo-xl: TransfoXLLMHeadModel (Transformer-XL model)
- xlnet: XLNetLMHeadModel (XLNet model)
- xlm: XLMWithLMHeadModel (XLM model)
- ctrl: CTRLLMHeadModel (CTRL model)
- electra: ElectraForPreTraining (ELECTRA model)
- funnel: FunnelForPreTraining (Funnel Transformer model)
- lxmert: LxmertForPreTraining (LXMERT model)
- layoutlm: LayoutLMForMaskedLM (LayoutLM model)
- tapas: TapasForMaskedLM (TAPAS model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters
- pretrained_model_name_or_path (str or os.PathLike): Can be either:
    - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
    - A path or url to a tensorflow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
- model_args (additional positional arguments, optional): Will be passed along to the underlying model __init__() method.
- config (PretrainedConfig, optional): Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
    - The model is a model provided by the library (loaded with the model id string of a pretrained model).
    - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
    - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (Dict[str, torch.Tensor], optional): A state dictionary to use instead of a state dictionary loaded from saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional): Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- from_tf (bool, optional, defaults to False): Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
- force_download (bool, optional, defaults to False): Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- resume_download (bool, optional, defaults to False): Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
- proxies (Dict[str, str], optional): A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False): Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False): Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main"): The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional): Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
    - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
    - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Examples:
>>> from transformers import AutoConfig, AutoModelForPreTraining
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForPreTraining.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = AutoModelForPreTraining.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForPreTraining.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)
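The kwargs rule described in the parameter list (keys that match a configuration attribute override it, and the remaining keys go to the model's __init__) can be pictured with a small stand-alone sketch. This is not the library's implementation: ToyConfig and split_kwargs are hypothetical names introduced only for illustration, and the matching test (hasattr) is an assumption.

```python
class ToyConfig:
    """Hypothetical stand-in for a configuration object (not a transformers class)."""

    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768


def split_kwargs(config, kwargs):
    """Apply kwargs that name existing config attributes and return the rest."""
    remaining = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            # Key matches a configuration attribute: override the loaded value.
            setattr(config, key, value)
        else:
            # Unknown to the configuration: would be forwarded to the model's __init__.
            remaining[key] = value
    return remaining


config = ToyConfig()
leftover = split_kwargs(config, {"output_attentions": True, "my_flag": 1})
print(config.output_attentions)  # True
print(leftover)  # {'my_flag': 1}
```

In this sketch my_flag is a made-up keyword standing in for any argument the configuration does not know about.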
AutoModelForCausalLM
class transformers.AutoModelForCausalLM
This is a generic model class that will be instantiated as one of the model classes of the library, with a causal language modeling head, when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_config(config)
Instantiates one of the model classes of the library, with a causal language modeling head, from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.
Parameters
config (PretrainedConfig) – The model class to instantiate is selected based on the configuration class:
  CamembertConfig configuration class: CamembertForCausalLM (CamemBERT model)
  XLMRobertaConfig configuration class: XLMRobertaForCausalLM (XLM-RoBERTa model)
  RobertaConfig configuration class: RobertaForCausalLM (RoBERTa model)
  BertConfig configuration class: BertLMHeadModel (BERT model)
  OpenAIGPTConfig configuration class: OpenAIGPTLMHeadModel (OpenAI GPT model)
  GPT2Config configuration class: GPT2LMHeadModel (OpenAI GPT-2 model)
  TransfoXLConfig configuration class: TransfoXLLMHeadModel (Transformer-XL model)
  XLNetConfig configuration class: XLNetLMHeadModel (XLNet model)
  XLMConfig configuration class: XLMWithLMHeadModel (XLM model)
  CTRLConfig configuration class: CTRLLMHeadModel (CTRL model)
  ReformerConfig configuration class: ReformerModelWithLMHead (Reformer model)
  BertGenerationConfig configuration class: BertGenerationDecoder (Bert Generation model)
  XLMProphetNetConfig configuration class: XLMProphetNetForCausalLM (XLMProphetNet model)
  ProphetNetConfig configuration class: ProphetNetForCausalLM (ProphetNet model)
  BartConfig configuration class: BartForCausalLM (BART model)
  MBartConfig configuration class: MBartForCausalLM (mBART model)
  PegasusConfig configuration class: PegasusForCausalLM (Pegasus model)
  MarianConfig configuration class: MarianForCausalLM (Marian model)
  BlenderbotConfig configuration class: BlenderbotForCausalLM (Blenderbot model)
  BlenderbotSmallConfig configuration class: BlenderbotSmallForCausalLM (BlenderbotSmall model)
Examples:
>>> from transformers import AutoConfig, AutoModelForCausalLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('gpt2')
>>> model = AutoModelForCausalLM.from_config(config)
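Selecting a model class from the configuration class amounts to a lookup table keyed by configuration type. The sketch below illustrates that idea with toy classes only: every Toy* name is hypothetical, and the exact-type lookup is an assumption about how such a dispatch could work, not a statement about the library's internals.

```python
class ToyGPT2Config:
    """Hypothetical configuration class used only in this sketch."""


class ToyBertConfig:
    """Hypothetical configuration class used only in this sketch."""


class ToyGPT2LMHeadModel:
    def __init__(self, config):
        # A real model would create randomly initialised weights here;
        # nothing is loaded from a checkpoint.
        self.config = config


class ToyBertLMHeadModel:
    def __init__(self, config):
        self.config = config


# Configuration class -> model class, mirroring the shape of the list above.
TOY_CAUSAL_LM_MAPPING = {
    ToyGPT2Config: ToyGPT2LMHeadModel,
    ToyBertConfig: ToyBertLMHeadModel,
}


def toy_from_config(config):
    """Instantiate the model class registered for the exact type of config."""
    model_class = TOY_CAUSAL_LM_MAPPING.get(type(config))
    if model_class is None:
        raise ValueError(f"Unrecognized configuration class {type(config).__name__}")
    return model_class(config)


model = toy_from_config(ToyGPT2Config())
print(type(model).__name__)  # ToyGPT2LMHeadModel
```

As in the Note above, building a model this way only fixes its architecture; weights still have to come from from_pretrained().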
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library, with a causal language modeling head, from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
  blenderbot-small – BlenderbotSmallForCausalLM (BlenderbotSmall model)
  bert-generation – BertGenerationDecoder (Bert Generation model)
  camembert – CamembertForCausalLM (CamemBERT model)
  xlm-roberta – XLMRobertaForCausalLM (XLM-RoBERTa model)
  pegasus – PegasusForCausalLM (Pegasus model)
  marian – MarianForCausalLM (Marian model)
  mbart – MBartForCausalLM (mBART model)
  bart – BartForCausalLM (BART model)
  blenderbot – BlenderbotForCausalLM (Blenderbot model)
  reformer – ReformerModelWithLMHead (Reformer model)
  roberta – RobertaForCausalLM (RoBERTa model)
  bert – BertLMHeadModel (BERT model)
  openai-gpt – OpenAIGPTLMHeadModel (OpenAI GPT model)
  gpt2 – GPT2LMHeadModel (OpenAI GPT-2 model)
  transfo-xl – TransfoXLLMHeadModel (Transformer-XL model)
  xlnet – XLNetLMHeadModel (XLNet model)
  xlm-prophetnet – XLMProphetNetForCausalLM (XLMProphetNet model)
  prophetnet – ProphetNetForCausalLM (ProphetNet model)
  xlm – XLMWithLMHeadModel (XLM model)
  ctrl – CTRLLMHeadModel (CTRL model)
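The order of the list above matters for the name-based fallback: a more specific key such as xlm-roberta has to be tried before roberta, and roberta before bert, because each shorter key is contained in the longer one. A minimal sketch of such an ordered lookup follows. It assumes plain substring matching, which may not be exactly what the library does, and TOY_NAME_PATTERNS and guess_class_name are hypothetical names; the table holds only a subset of the entries listed above.

```python
# Ordered (pattern, class name) pairs; more specific patterns come first.
TOY_NAME_PATTERNS = [
    ("blenderbot-small", "BlenderbotSmallForCausalLM"),
    ("xlm-roberta", "XLMRobertaForCausalLM"),
    ("blenderbot", "BlenderbotForCausalLM"),
    ("roberta", "RobertaForCausalLM"),
    ("bert", "BertLMHeadModel"),
    ("gpt2", "GPT2LMHeadModel"),
]


def guess_class_name(name_or_path):
    """Return the class name of the first pattern contained in name_or_path."""
    for pattern, class_name in TOY_NAME_PATTERNS:
        if pattern in name_or_path:
            return class_name
    raise ValueError(f"Could not guess a model class from {name_or_path!r}")


print(guess_class_name("xlm-roberta-base"))  # XLMRobertaForCausalLM
print(guess_class_name("roberta-large"))     # RobertaForCausalLM
print(guess_class_name("bert-base-cased"))   # BertLMHeadModel
```

If bert were placed first in the table, all three names would resolve to BertLMHeadModel, which is why the ordering is part of the mapping.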
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Parameters
pretrained_model_name_or_path (str or os.PathLike) – Can be either:
  A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
  A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
  A path or url to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) – Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  The model is a model provided by the library (loaded with the model id string of a pretrained model).
  The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) – A state dictionary to use instead of a state dictionary loaded from the saved weights file.
  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id: models and other artifacts on huggingface.co are stored in a git-based system, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) – Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Examples:
>>> from transformers import AutoConfig, AutoModelForCausalLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForCausalLM.from_pretrained('gpt2')
>>> # Update configuration during loading
>>> model = AutoModelForCausalLM.from_pretrained('gpt2', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/gpt2_tf_model_config.json')
>>> model = AutoModelForCausalLM.from_pretrained('./tf_model/gpt2_tf_checkpoint.ckpt.index', from_tf=True, config=config)
AutoModelForMaskedLM
class transformers.AutoModelForMaskedLM
This is a generic model class that will be instantiated as one of the model classes of the library, with a masked language modeling head, when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_config(config)
Instantiates one of the model classes of the library, with a masked language modeling head, from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.
Parameters
config (PretrainedConfig) – The model class to instantiate is selected based on the configuration class:
  Wav2Vec2Config configuration class: Wav2Vec2ForMaskedLM (Wav2Vec2 model)
  ConvBertConfig configuration class: ConvBertForMaskedLM (ConvBERT model)
  LayoutLMConfig configuration class: LayoutLMForMaskedLM (LayoutLM model)
  DistilBertConfig configuration class: DistilBertForMaskedLM (DistilBERT model)
  AlbertConfig configuration class: AlbertForMaskedLM (ALBERT model)
  BartConfig configuration class: BartForConditionalGeneration (BART model)
  MBartConfig configuration class: MBartForConditionalGeneration (mBART model)
  CamembertConfig configuration class: CamembertForMaskedLM (CamemBERT model)
  XLMRobertaConfig configuration class: XLMRobertaForMaskedLM (XLM-RoBERTa model)
  LongformerConfig configuration class: LongformerForMaskedLM (Longformer model)
  RobertaConfig configuration class: RobertaForMaskedLM (RoBERTa model)
  SqueezeBertConfig configuration class: SqueezeBertForMaskedLM (SqueezeBERT model)
  BertConfig configuration class: BertForMaskedLM (BERT model)
  MobileBertConfig configuration class: MobileBertForMaskedLM (MobileBERT model)
  FlaubertConfig configuration class: FlaubertWithLMHeadModel (FlauBERT model)
  XLMConfig configuration class: XLMWithLMHeadModel (XLM model)
  ElectraConfig configuration class: ElectraForMaskedLM (ELECTRA model)
  ReformerConfig configuration class: ReformerForMaskedLM (Reformer model)
  FunnelConfig configuration class: FunnelForMaskedLM (Funnel Transformer model)
  MPNetConfig configuration class: MPNetForMaskedLM (MPNet model)
  TapasConfig configuration class: TapasForMaskedLM (TAPAS model)
  DebertaConfig configuration class: DebertaForMaskedLM (DeBERTa model)
  DebertaV2Config configuration class: DebertaV2ForMaskedLM (DeBERTa-v2 model)
  IBertConfig configuration class: IBertForMaskedLM (I-BERT model)
Examples:
>>> from transformers import AutoConfig, AutoModelForMaskedLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForMaskedLM.from_config(config)
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library, with a masked language modeling head, from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
  wav2vec2 – Wav2Vec2ForMaskedLM (Wav2Vec2 model)
  convbert – ConvBertForMaskedLM (ConvBERT model)
  ibert – IBertForMaskedLM (I-BERT model)
  mobilebert – MobileBertForMaskedLM (MobileBERT model)
  distilbert – DistilBertForMaskedLM (DistilBERT model)
  albert – AlbertForMaskedLM (ALBERT model)
  camembert – CamembertForMaskedLM (CamemBERT model)
  xlm-roberta – XLMRobertaForMaskedLM (XLM-RoBERTa model)
  mbart – MBartForConditionalGeneration (mBART model)
  mpnet – MPNetForMaskedLM (MPNet model)
  bart – BartForConditionalGeneration (BART model)
  reformer – ReformerForMaskedLM (Reformer model)
  longformer – LongformerForMaskedLM (Longformer model)
  roberta – RobertaForMaskedLM (RoBERTa model)
  deberta-v2 – DebertaV2ForMaskedLM (DeBERTa-v2 model)
  deberta – DebertaForMaskedLM (DeBERTa model)
  flaubert – FlaubertWithLMHeadModel (FlauBERT model)
  squeezebert – SqueezeBertForMaskedLM (SqueezeBERT model)
  bert – BertForMaskedLM (BERT model)
  xlm – XLMWithLMHeadModel (XLM model)
  electra – ElectraForMaskedLM (ELECTRA model)
  funnel – FunnelForMaskedLM (Funnel Transformer model)
  layoutlm – LayoutLMForMaskedLM (LayoutLM model)
  tapas – TapasForMaskedLM (TAPAS model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Parameters
pretrained_model_name_or_path (str or os.PathLike) – Can be either:
  A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
  A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
  A path or url to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) – Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  The model is a model provided by the library (loaded with the model id string of a pretrained model).
  The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) – A state dictionary to use instead of a state dictionary loaded from the saved weights file.
  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id: models and other artifacts on huggingface.co are stored in a git-based system, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) – Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Examples:
>>> from transformers import AutoConfig, AutoModelForMaskedLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMaskedLM.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = AutoModelForMaskedLM.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForMaskedLM.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)
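The three accepted forms of pretrained_model_name_or_path (a model id, a local directory, or a TensorFlow index checkpoint) can be told apart with a few filesystem checks. The sketch below shows one plausible way to do that; describe_source is a hypothetical helper, and the order and nature of the checks are assumptions rather than the library's actual resolution logic.

```python
import os
import tempfile


def describe_source(name_or_path):
    """Classify a from_pretrained() source argument (illustrative only)."""
    path = os.fspath(name_or_path)  # accepts both str and os.PathLike
    if os.path.isdir(path):
        # e.g. a directory written by save_pretrained()
        return "directory"
    if path.endswith(".index"):
        # TensorFlow index checkpoint: needs from_tf=True and an explicit config
        return "tf_checkpoint"
    # Anything else is treated as a model id hosted on huggingface.co
    return "model_id"


with tempfile.TemporaryDirectory() as saved_dir:
    print(describe_source(saved_dir))  # directory
print(describe_source("./tf_model/model.ckpt.index"))  # tf_checkpoint
print(describe_source("dbmdz/bert-base-german-cased"))  # model_id
```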
AutoModelForSeq2SeqLM
class transformers.AutoModelForSeq2SeqLM
This is a generic model class that will be instantiated as one of the model classes of the library, with a sequence-to-sequence language modeling head, when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_config(config)
Instantiates one of the model classes of the library, with a sequence-to-sequence language modeling head, from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.
Parameters
config (PretrainedConfig) – The model class to instantiate is selected based on the configuration class:
  M2M100Config configuration class: M2M100ForConditionalGeneration (M2M100 model)
  LEDConfig configuration class: LEDForConditionalGeneration (LED model)
  BlenderbotSmallConfig configuration class: BlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
  MT5Config configuration class: MT5ForConditionalGeneration (mT5 model)
  T5Config configuration class: T5ForConditionalGeneration (T5 model)
  PegasusConfig configuration class: PegasusForConditionalGeneration (Pegasus model)
  MarianConfig configuration class: MarianMTModel (Marian model)
  MBartConfig configuration class: MBartForConditionalGeneration (mBART model)
  BlenderbotConfig configuration class: BlenderbotForConditionalGeneration (Blenderbot model)
  BartConfig configuration class: BartForConditionalGeneration (BART model)
  FSMTConfig configuration class: FSMTForConditionalGeneration (FairSeq Machine-Translation model)
  EncoderDecoderConfig configuration class: EncoderDecoderModel (Encoder decoder model)
  XLMProphetNetConfig configuration class: XLMProphetNetForConditionalGeneration (XLMProphetNet model)
  ProphetNetConfig configuration class: ProphetNetForConditionalGeneration (ProphetNet model)
Examples:
>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('t5-base')
>>> model = AutoModelForSeq2SeqLM.from_config(config)
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library, with a sequence-to-sequence language modeling head, from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
  m2m_100 – M2M100ForConditionalGeneration (M2M100 model)
  led – LEDForConditionalGeneration (LED model)
  blenderbot-small – BlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
  mt5 – MT5ForConditionalGeneration (mT5 model)
  t5 – T5ForConditionalGeneration (T5 model)
  pegasus – PegasusForConditionalGeneration (Pegasus model)
  marian – MarianMTModel (Marian model)
  mbart – MBartForConditionalGeneration (mBART model)
  bart – BartForConditionalGeneration (BART model)
  blenderbot – BlenderbotForConditionalGeneration (Blenderbot model)
  fsmt – FSMTForConditionalGeneration (FairSeq Machine-Translation model)
  xlm-prophetnet – XLMProphetNetForConditionalGeneration (XLMProphetNet model)
  prophetnet – ProphetNetForConditionalGeneration (ProphetNet model)
  encoder-decoder – EncoderDecoderModel (Encoder decoder model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Parameters
pretrained_model_name_or_path (str or os.PathLike) – Can be either:
  A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
  A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
  A path or url to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) – Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  The model is a model provided by the library (loaded with the model id string of a pretrained model).
  The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) – A state dictionary to use instead of a state dictionary loaded from the saved weights file.
  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id: models and other artifacts on huggingface.co are stored in a git-based system, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) – Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Examples:
>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSeq2SeqLM.from_pretrained('t5-base')
>>> # Update configuration during loading
>>> model = AutoModelForSeq2SeqLM.from_pretrained('t5-base', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/t5_tf_model_config.json')
>>> model = AutoModelForSeq2SeqLM.from_pretrained('./tf_model/t5_tf_checkpoint.ckpt.index', from_tf=True, config=config)
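To make the output_loading_info description more concrete: "missing keys" are parameters the model expects but the checkpoint lacks, and "unexpected keys" are parameters the checkpoint holds but the model does not use. The sketch below computes both with plain set differences. toy_loading_info and the parameter names are hypothetical, and the dictionary layout is an assumption made for illustration, not the library's exact return value.

```python
def toy_loading_info(model_keys, checkpoint_keys):
    """Compare parameter names expected by a model with those found in a checkpoint."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint
        # (such weights would stay at their initial values).
        "missing_keys": sorted(model_keys - checkpoint_keys),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(checkpoint_keys - model_keys),
    }


info = toy_loading_info(
    model_keys=["encoder.weight", "lm_head.weight"],
    checkpoint_keys=["encoder.weight", "pooler.weight"],
)
print(info["missing_keys"])     # ['lm_head.weight']
print(info["unexpected_keys"])  # ['pooler.weight']
```

A non-empty list of missing keys is typical when a task head is added on top of a checkpoint that was saved without one.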
AutoModelForSequenceClassification
class transformers.AutoModelForSequenceClassification
This is a generic model class that will be instantiated as one of the model classes of the library, with a sequence classification head, when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_config(config)
Instantiates one of the model classes of the library, with a sequence classification head, from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.
Parameters
config (PretrainedConfig) – The model class to instantiate is selected based on the configuration class:
  ConvBertConfig configuration class: ConvBertForSequenceClassification (ConvBERT model)
  LEDConfig configuration class: LEDForSequenceClassification (LED model)
  DistilBertConfig configuration class: DistilBertForSequenceClassification (DistilBERT model)
  AlbertConfig configuration class: AlbertForSequenceClassification (ALBERT model)
  CamembertConfig configuration class: CamembertForSequenceClassification (CamemBERT model)
  XLMRobertaConfig configuration class: XLMRobertaForSequenceClassification (XLM-RoBERTa model)
  MBartConfig configuration class: MBartForSequenceClassification (mBART model)
  BartConfig configuration class: BartForSequenceClassification (BART model)
  LongformerConfig configuration class: LongformerForSequenceClassification (Longformer model)
  RobertaConfig configuration class: RobertaForSequenceClassification (RoBERTa model)
  SqueezeBertConfig configuration class: SqueezeBertForSequenceClassification (SqueezeBERT model)
  LayoutLMConfig configuration class: LayoutLMForSequenceClassification (LayoutLM model)
  BertConfig configuration class: BertForSequenceClassification (BERT model)
  XLNetConfig configuration class: XLNetForSequenceClassification (XLNet model)
  MobileBertConfig configuration class: MobileBertForSequenceClassification (MobileBERT model)
  FlaubertConfig configuration class: FlaubertForSequenceClassification (FlauBERT model)
  XLMConfig configuration class: XLMForSequenceClassification (XLM model)
  ElectraConfig configuration class: ElectraForSequenceClassification (ELECTRA model)
  FunnelConfig configuration class: FunnelForSequenceClassification (Funnel Transformer model)
  DebertaConfig configuration class: DebertaForSequenceClassification (DeBERTa model)
  DebertaV2Config configuration class: DebertaV2ForSequenceClassification (DeBERTa-v2 model)
  GPT2Config configuration class: GPT2ForSequenceClassification (OpenAI GPT-2 model)
  OpenAIGPTConfig configuration class: OpenAIGPTForSequenceClassification (OpenAI GPT model)
  ReformerConfig configuration class: ReformerForSequenceClassification (Reformer model)
  CTRLConfig configuration class: CTRLForSequenceClassification (CTRL model)
  TransfoXLConfig configuration class: TransfoXLForSequenceClassification (Transformer-XL model)
  MPNetConfig configuration class: MPNetForSequenceClassification (MPNet model)
  TapasConfig configuration class: TapasForSequenceClassification (TAPAS model)
  IBertConfig configuration class: IBertForSequenceClassification (I-BERT model)
Examples:
>>> from transformers import AutoConfig, AutoModelForSequenceClassification
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForSequenceClassification.from_config(config)
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library, with a sequence classification head, from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
convbert: ConvBertForSequenceClassification (ConvBERT model)
led: LEDForSequenceClassification (LED model)
ibert: IBertForSequenceClassification (I-BERT model)
mobilebert: MobileBertForSequenceClassification (MobileBERT model)
distilbert: DistilBertForSequenceClassification (DistilBERT model)
albert: AlbertForSequenceClassification (ALBERT model)
camembert: CamembertForSequenceClassification (CamemBERT model)
xlm-roberta: XLMRobertaForSequenceClassification (XLM-RoBERTa model)
mbart: MBartForSequenceClassification (mBART model)
mpnet: MPNetForSequenceClassification (MPNet model)
bart: BartForSequenceClassification (BART model)
reformer: ReformerForSequenceClassification (Reformer model)
longformer: LongformerForSequenceClassification (Longformer model)
roberta: RobertaForSequenceClassification (RoBERTa model)
deberta-v2: DebertaV2ForSequenceClassification (DeBERTa-v2 model)
deberta: DebertaForSequenceClassification (DeBERTa model)
flaubert: FlaubertForSequenceClassification (FlauBERT model)
squeezebert: SqueezeBertForSequenceClassification (SqueezeBERT model)
bert: BertForSequenceClassification (BERT model)
openai-gpt: OpenAIGPTForSequenceClassification (OpenAI GPT model)
gpt2: GPT2ForSequenceClassification (OpenAI GPT-2 model)
transfo-xl: TransfoXLForSequenceClassification (Transformer-XL model)
xlnet: XLNetForSequenceClassification (XLNet model)
xlm: XLMForSequenceClassification (XLM model)
ctrl: CTRLForSequenceClassification (CTRL model)
electra: ElectraForSequenceClassification (ELECTRA model)
funnel: FunnelForSequenceClassification (Funnel Transformer model)
layoutlm: LayoutLMForSequenceClassification (LayoutLM model)
tapas: TapasForSequenceClassification (TAPAS model)
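As a rough illustration of the two-step selection described above, the sketch below first honours an explicit model_type and only then falls back to matching the name or path. The function select_class_name and the substring test are assumptions made for illustration; the library's actual matching may differ. Only a subset of the keys is used, kept in the order listed above and mapped to class names as plain strings.

```python
from collections import OrderedDict

# Subset of the listed keys, in listed order. More specific keys such as
# "distilbert" or "deberta-v2" come before the shorter keys they contain.
NAME_TO_CLASS = OrderedDict(
    [
        ("mobilebert", "MobileBertForSequenceClassification"),
        ("distilbert", "DistilBertForSequenceClassification"),
        ("xlm-roberta", "XLMRobertaForSequenceClassification"),
        ("roberta", "RobertaForSequenceClassification"),
        ("deberta-v2", "DebertaV2ForSequenceClassification"),
        ("deberta", "DebertaForSequenceClassification"),
        ("bert", "BertForSequenceClassification"),
    ]
)


def select_class_name(name_or_path, model_type=None):
    """Return the class name for a checkpoint (hypothetical helper)."""
    # Step 1: the config's model_type, when present, decides directly.
    if model_type is not None:
        return NAME_TO_CLASS[model_type]
    # Step 2 (assumed): first key found as a substring of the name or path.
    for key, class_name in NAME_TO_CLASS.items():
        if key in name_or_path:
            return class_name
    raise ValueError(f"Unrecognized model identifier: {name_or_path}")


print(select_class_name("my-checkpoint", model_type="roberta"))
print(select_class_name("distilbert-base-uncased"))
print(select_class_name("xlm-roberta-base"))
```

With a substring fallback the ordering matters: if "bert" were tested first, "distilbert-base-uncased" would resolve to the BERT class.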
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Parameters
pretrained_model_name_or_path (str or os.PathLike): Can be either:
    A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
    A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional): Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional): Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
    The model is a model provided by the library (loaded with the model id string of a pretrained model).
    The model was saved using save_pretrained() and is reloaded by supplying the save directory.
    The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional): A state dictionary to use instead of a state dictionary loaded from the saved weights file.
    This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional): Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False): Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False): Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False): Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional): A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False): Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False): Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main"): The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional): Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
    If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
    If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Examples:
>>> from transformers import AutoConfig, AutoModelForSequenceClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForSequenceClassification.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)
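The two behaviours of kwargs described in the parameter list can be sketched without the library. ToyConfig and route_kwargs are hypothetical names, and the attribute check via hasattr is an assumption about how a key is matched to a configuration attribute.

```python
class ToyConfig:
    """Hypothetical configuration with two attributes."""

    def __init__(self, output_attentions=False, num_labels=2):
        self.output_attentions = output_attentions
        self.num_labels = num_labels


def route_kwargs(config=None, **kwargs):
    """Return (config, kwargs left over for the model's __init__)."""
    if config is not None:
        # A configuration was supplied: it is left untouched and every
        # keyword argument goes straight to the model.
        return config, dict(kwargs)
    # No configuration supplied: load one (here, simply build a default),
    # let matching keys override its attributes, and forward the rest.
    config = ToyConfig()
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)
        else:
            model_kwargs[key] = value
    return config, model_kwargs


auto_config, auto_rest = route_kwargs(output_attentions=True, toy_flag=1)
given_config, given_rest = route_kwargs(config=ToyConfig(), output_attentions=True)

print(auto_config.output_attentions, auto_rest)
print(given_config.output_attentions, given_rest)
```

In the second call the supplied configuration keeps output_attentions=False, which is why any configuration updates should be made on the object before passing it in.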
AutoModelForMultipleChoice
class transformers.AutoModelForMultipleChoice
This is a generic model class that will be instantiated as one of the model classes of the library, with a multiple choice classification head, when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_config(config)
Instantiates one of the model classes of the library, with a multiple choice classification head, from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.
Parameters
config (PretrainedConfig): The model class to instantiate is selected based on the configuration class:
    ConvBertConfig configuration class: ConvBertForMultipleChoice (ConvBERT model)
    CamembertConfig configuration class: CamembertForMultipleChoice (CamemBERT model)
    ElectraConfig configuration class: ElectraForMultipleChoice (ELECTRA model)
    XLMRobertaConfig configuration class: XLMRobertaForMultipleChoice (XLM-RoBERTa model)
    LongformerConfig configuration class: LongformerForMultipleChoice (Longformer model)
    RobertaConfig configuration class: RobertaForMultipleChoice (RoBERTa model)
    SqueezeBertConfig configuration class: SqueezeBertForMultipleChoice (SqueezeBERT model)
    BertConfig configuration class: BertForMultipleChoice (BERT model)
    DistilBertConfig configuration class: DistilBertForMultipleChoice (DistilBERT model)
    MobileBertConfig configuration class: MobileBertForMultipleChoice (MobileBERT model)
    XLNetConfig configuration class: XLNetForMultipleChoice (XLNet model)
    AlbertConfig configuration class: AlbertForMultipleChoice (ALBERT model)
    XLMConfig configuration class: XLMForMultipleChoice (XLM model)
    FlaubertConfig configuration class: FlaubertForMultipleChoice (FlauBERT model)
    FunnelConfig configuration class: FunnelForMultipleChoice (Funnel Transformer model)
    MPNetConfig configuration class: MPNetForMultipleChoice (MPNet model)
    IBertConfig configuration class: IBertForMultipleChoice (I-BERT model)
Examples:
>>> from transformers import AutoConfig, AutoModelForMultipleChoice
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForMultipleChoice.from_config(config)
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library, with a multiple choice classification head, from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
convbert: ConvBertForMultipleChoice (ConvBERT model)
ibert: IBertForMultipleChoice (I-BERT model)
mobilebert: MobileBertForMultipleChoice (MobileBERT model)
distilbert: DistilBertForMultipleChoice (DistilBERT model)
albert: AlbertForMultipleChoice (ALBERT model)
camembert: CamembertForMultipleChoice (CamemBERT model)
xlm-roberta: XLMRobertaForMultipleChoice (XLM-RoBERTa model)
mpnet: MPNetForMultipleChoice (MPNet model)
longformer: LongformerForMultipleChoice (Longformer model)
roberta: RobertaForMultipleChoice (RoBERTa model)
flaubert: FlaubertForMultipleChoice (FlauBERT model)
squeezebert: SqueezeBertForMultipleChoice (SqueezeBERT model)
bert: BertForMultipleChoice (BERT model)
xlnet: XLNetForMultipleChoice (XLNet model)
xlm: XLMForMultipleChoice (XLM model)
electra: ElectraForMultipleChoice (ELECTRA model)
funnel: FunnelForMultipleChoice (Funnel Transformer model)
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Parameters
pretrained_model_name_or_path (str or os.PathLike): Can be either:
    A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
    A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional): Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional): Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
    The model is a model provided by the library (loaded with the model id string of a pretrained model).
    The model was saved using save_pretrained() and is reloaded by supplying the save directory.
    The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional): A state dictionary to use instead of a state dictionary loaded from the saved weights file.
    This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional): Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False): Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False): Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False): Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional): A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False): Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False): Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main"): The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional): Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
    If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
    If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Examples:
>>> from transformers import AutoConfig, AutoModelForMultipleChoice
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMultipleChoice.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = AutoModelForMultipleChoice.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForMultipleChoice.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)
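To make the output_loading_info parameter more concrete, the sketch below shows what "missing keys" and "unexpected keys" mean when a checkpoint is matched against a model's parameter names. The helper loading_info, the dictionary key names and the parameter names are illustrative assumptions, not output copied from the library.

```python
def loading_info(expected_keys, checkpoint_keys):
    """Compare parameter names the model expects with those in a checkpoint."""
    expected = set(expected_keys)
    found = set(checkpoint_keys)
    return {
        # Parameters the model needs but the checkpoint lacks; such weights
        # would typically be freshly initialized (for example, a new head).
        "missing_keys": sorted(expected - found),
        # Parameters present in the checkpoint that the model does not use.
        "unexpected_keys": sorted(found - expected),
        # Nothing can fail in this toy comparison, so no messages.
        "error_msgs": [],
    }


# Hypothetical parameter names for a multiple choice model and a checkpoint
# that was saved with a different head.
model_keys = ["bert.embeddings.weight", "classifier.weight", "classifier.bias"]
checkpoint_keys = ["bert.embeddings.weight", "cls.predictions.bias"]

info = loading_info(model_keys, checkpoint_keys)
print(info["missing_keys"])
print(info["unexpected_keys"])
```

Missing head parameters like these are the usual sign that a base checkpoint was loaded into a task-specific class and still needs fine-tuning.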
AutoModelForNextSentencePrediction
class transformers.AutoModelForNextSentencePrediction
This is a generic model class that will be instantiated as one of the model classes of the library, with a next sentence prediction head, when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_config(config)
Instantiates one of the model classes of the library, with a next sentence prediction head, from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.
Parameters
config (PretrainedConfig): The model class to instantiate is selected based on the configuration class:
    BertConfig configuration class: BertForNextSentencePrediction (BERT model)
    MobileBertConfig configuration class: MobileBertForNextSentencePrediction (MobileBERT model)
Examples:
>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForNextSentencePrediction.from_config(config)
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library, with a next sentence prediction head, from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
mobilebert: MobileBertForNextSentencePrediction (MobileBERT model)
bert: BertForNextSentencePrediction (BERT model)
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Parameters
pretrained_model_name_or_path (str or os.PathLike): Can be either:
    A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
    A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional): Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional): Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
    The model is a model provided by the library (loaded with the model id string of a pretrained model).
    The model was saved using save_pretrained() and is reloaded by supplying the save directory.
    The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional): A state dictionary to use instead of a state dictionary loaded from the saved weights file.
    This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional): Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False): Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False): Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False): Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional): A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False): Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False): Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main"): The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional): Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
    If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
    If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Examples:
>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForNextSentencePrediction.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = AutoModelForNextSentencePrediction.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForNextSentencePrediction.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)
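The third case under the config parameter, a local directory that contains config.json, can be illustrated with standard-library code only. The helper find_config is hypothetical and the JSON content is made up for the example; it is a sketch of the lookup, not the library's loader.

```python
import json
import os
import tempfile


def find_config(directory):
    """Return the parsed config.json from `directory`, or None if absent."""
    config_path = os.path.join(directory, "config.json")
    if not os.path.isfile(config_path):
        return None
    with open(config_path, encoding="utf-8") as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as model_dir:
    # An empty directory offers nothing to load automatically.
    before = find_config(model_dir)

    # After a config.json is written, the configuration can be discovered.
    with open(os.path.join(model_dir, "config.json"), "w", encoding="utf-8") as handle:
        json.dump({"model_type": "bert", "num_labels": 2}, handle)
    after = find_config(model_dir)

print(before)
print(after["model_type"])
```

When no such file can be found, passing an explicit config object is the remaining way to tell the loader which architecture to build.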
AutoModelForTokenClassification
class transformers.AutoModelForTokenClassification
This is a generic model class that will be instantiated as one of the model classes of the library, with a token classification head, when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_config(config)
Instantiates one of the model classes of the library, with a token classification head, from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.
Parameters
config (PretrainedConfig): The model class to instantiate is selected based on the configuration class:
    ConvBertConfig configuration class: ConvBertForTokenClassification (ConvBERT model)
    LayoutLMConfig configuration class: LayoutLMForTokenClassification (LayoutLM model)
    DistilBertConfig configuration class: DistilBertForTokenClassification (DistilBERT model)
    CamembertConfig configuration class: CamembertForTokenClassification (CamemBERT model)
    FlaubertConfig configuration class: FlaubertForTokenClassification (FlauBERT model)
    XLMConfig configuration class: XLMForTokenClassification (XLM model)
    XLMRobertaConfig configuration class: XLMRobertaForTokenClassification (XLM-RoBERTa model)
    LongformerConfig configuration class: LongformerForTokenClassification (Longformer model)
    RobertaConfig configuration class: RobertaForTokenClassification (RoBERTa model)
    SqueezeBertConfig configuration class: SqueezeBertForTokenClassification (SqueezeBERT model)
    BertConfig configuration class: BertForTokenClassification (BERT model)
    MobileBertConfig configuration class: MobileBertForTokenClassification (MobileBERT model)
    XLNetConfig configuration class: XLNetForTokenClassification (XLNet model)
    AlbertConfig configuration class: AlbertForTokenClassification (ALBERT model)
    ElectraConfig configuration class: ElectraForTokenClassification (ELECTRA model)
    FunnelConfig configuration class: FunnelForTokenClassification (Funnel Transformer model)
    MPNetConfig configuration class: MPNetForTokenClassification (MPNet model)
    DebertaConfig configuration class: DebertaForTokenClassification (DeBERTa model)
    DebertaV2Config configuration class: DebertaV2ForTokenClassification (DeBERTa-v2 model)
    IBertConfig configuration class: IBertForTokenClassification (I-BERT model)
Examples:
>>> from transformers import AutoConfig, AutoModelForTokenClassification
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForTokenClassification.from_config(config)
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library, with a token classification head, from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
convbert: ConvBertForTokenClassification (ConvBERT model)
ibert: IBertForTokenClassification (I-BERT model)
mobilebert: MobileBertForTokenClassification (MobileBERT model)
distilbert: DistilBertForTokenClassification (DistilBERT model)
albert: AlbertForTokenClassification (ALBERT model)
camembert: CamembertForTokenClassification (CamemBERT model)
xlm-roberta: XLMRobertaForTokenClassification (XLM-RoBERTa model)
mpnet: MPNetForTokenClassification (MPNet model)
longformer: LongformerForTokenClassification (Longformer model)
roberta: RobertaForTokenClassification (RoBERTa model)
deberta-v2: DebertaV2ForTokenClassification (DeBERTa-v2 model)
deberta: DebertaForTokenClassification (DeBERTa model)
flaubert: FlaubertForTokenClassification (FlauBERT model)
squeezebert: SqueezeBertForTokenClassification (SqueezeBERT model)
bert: BertForTokenClassification (BERT model)
xlnet: XLNetForTokenClassification (XLNet model)
xlm: XLMForTokenClassification (XLM model)
electra: ElectraForTokenClassification (ELECTRA model)
funnel: FunnelForTokenClassification (Funnel Transformer model)
layoutlm: LayoutLMForTokenClassification (LayoutLM model)
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Parameters
pretrained_model_name_or_path (str or os.PathLike): Can be either:
    A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
    A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional): Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional): Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
    The model is a model provided by the library (loaded with the model id string of a pretrained model).
    The model was saved using save_pretrained() and is reloaded by supplying the save directory.
    The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional): A state dictionary to use instead of a state dictionary loaded from the saved weights file.
    This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional): Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False): Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False): Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False): Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional): A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False): Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False): Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main"): The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional): Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
    If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
    If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Examples:
>>> from transformers import AutoConfig, AutoModelForTokenClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTokenClassification.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = AutoModelForTokenClassification.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForTokenClassification.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)
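The three accepted forms of pretrained_model_name_or_path can be told apart roughly as in the sketch below. The helper describe_source, its return strings and the ".index" suffix test are assumptions made for illustration; the library's own resolution logic is not reproduced here.

```python
import os
import tempfile


def describe_source(name_or_path, from_tf=False, config=None):
    """Classify an identifier into one of the three documented forms."""
    path = os.fspath(name_or_path)  # accepts str and os.PathLike
    if os.path.isdir(path):
        return "directory saved with save_pretrained()"
    if path.endswith(".index"):
        # Documented requirement for TensorFlow index checkpoints.
        if not from_tf or config is None:
            raise ValueError(
                "a TensorFlow index checkpoint needs from_tf=True and a config object"
            )
        return "TensorFlow index checkpoint"
    # Anything else is treated as a model id, root-level or namespaced.
    return "model id on huggingface.co"


with tempfile.TemporaryDirectory() as tmp:
    local_kind = describe_source(tmp)

hub_kind = describe_source("dbmdz/bert-base-german-cased")
# object() stands in for a real configuration object in this sketch.
tf_kind = describe_source("./tf_model/model.ckpt.index", from_tf=True, config=object())

print(local_kind)
print(hub_kind)
print(tf_kind)
```

Checking for a local directory first means a directory that happens to share a name with a hub model id is read from disk in this sketch.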
AutoModelForQuestionAnsweringΒΆ
-
class
transformers.AutoModelForQuestionAnswering[source]ΒΆ This is a generic model class that will be instantiated as one of the model classes of the libraryβwith a question answering headβwhen created with the
from_pretrained()class method or thefrom_config()class method.This class cannot be instantiated directly using
__init__()(throws an error).-
classmethod
from_config(config)[source]ΒΆ Instantiates one of the model classes of the libraryβwith a question answering headβfrom a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the modelβs configuration. Use
from_pretrained()to load the model weights.- Parameters
config (
PretrainedConfig) βThe model class to instantiate is selected based on the configuration class:
ConvBertConfig configuration class: ConvBertForQuestionAnswering (ConvBERT model)
LEDConfig configuration class: LEDForQuestionAnswering (LED model)
DistilBertConfig configuration class: DistilBertForQuestionAnswering (DistilBERT model)
AlbertConfig configuration class: AlbertForQuestionAnswering (ALBERT model)
CamembertConfig configuration class: CamembertForQuestionAnswering (CamemBERT model)
BartConfig configuration class: BartForQuestionAnswering (BART model)
MBartConfig configuration class: MBartForQuestionAnswering (mBART model)
LongformerConfig configuration class: LongformerForQuestionAnswering (Longformer model)
XLMRobertaConfig configuration class: XLMRobertaForQuestionAnswering (XLM-RoBERTa model)
RobertaConfig configuration class: RobertaForQuestionAnswering (RoBERTa model)
SqueezeBertConfig configuration class: SqueezeBertForQuestionAnswering (SqueezeBERT model)
BertConfig configuration class: BertForQuestionAnswering (BERT model)
XLNetConfig configuration class: XLNetForQuestionAnsweringSimple (XLNet model)
FlaubertConfig configuration class: FlaubertForQuestionAnsweringSimple (FlauBERT model)
MobileBertConfig configuration class: MobileBertForQuestionAnswering (MobileBERT model)
XLMConfig configuration class: XLMForQuestionAnsweringSimple (XLM model)
ElectraConfig configuration class: ElectraForQuestionAnswering (ELECTRA model)
ReformerConfig configuration class: ReformerForQuestionAnswering (Reformer model)
FunnelConfig configuration class: FunnelForQuestionAnswering (Funnel Transformer model)
LxmertConfig configuration class: LxmertForQuestionAnswering (LXMERT model)
MPNetConfig configuration class: MPNetForQuestionAnswering (MPNet model)
DebertaConfig configuration class: DebertaForQuestionAnswering (DeBERTa model)
DebertaV2Config configuration class: DebertaV2ForQuestionAnswering (DeBERTa-v2 model)
IBertConfig configuration class: IBertForQuestionAnswering (I-BERT model)
Examples:
>>> from transformers import AutoConfig, AutoModelForQuestionAnswering
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForQuestionAnswering.from_config(config)
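The selection made by from_config() can be pictured as a lookup from configuration class to model class. The following is a minimal, library-free sketch of that idea, not the library's actual implementation: the Toy* classes, TOY_QA_MAPPING and toy_from_config are hypothetical names introduced only for illustration. It also mirrors the note above, since the model is built from the configuration alone and no pretrained weights are involved.

```python
class ToyBertConfig:
    """Hypothetical stand-in for a configuration class."""

    def __init__(self, hidden_size=8):
        self.hidden_size = hidden_size


class ToyAlbertConfig:
    """A second hypothetical configuration class."""

    def __init__(self, hidden_size=8):
        self.hidden_size = hidden_size


class ToyBertForQuestionAnswering:
    def __init__(self, config):
        self.config = config
        # Built from the configuration only: nothing is loaded from a checkpoint.
        self.weights_loaded = False


class ToyAlbertForQuestionAnswering:
    def __init__(self, config):
        self.config = config
        self.weights_loaded = False


# Assumed dispatch table: configuration class -> model class.
TOY_QA_MAPPING = {
    ToyBertConfig: ToyBertForQuestionAnswering,
    ToyAlbertConfig: ToyAlbertForQuestionAnswering,
}


def toy_from_config(config):
    """Pick the model class that matches the exact type of `config`."""
    model_class = TOY_QA_MAPPING.get(type(config))
    if model_class is None:
        supported = ", ".join(cls.__name__ for cls in TOY_QA_MAPPING)
        raise ValueError(
            f"Unrecognized configuration class {type(config).__name__}; "
            f"expected one of: {supported}"
        )
    return model_class(config)


model = toy_from_config(ToyAlbertConfig(hidden_size=16))
print(type(model).__name__)
```

Passing an object whose class is not in the table raises a ValueError in this sketch, which is one plausible way to surface an unsupported configuration.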
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
convbert: ConvBertForQuestionAnswering (ConvBERT model)
led: LEDForQuestionAnswering (LED model)
ibert: IBertForQuestionAnswering (I-BERT model)
mobilebert: MobileBertForQuestionAnswering (MobileBERT model)
distilbert: DistilBertForQuestionAnswering (DistilBERT model)
albert: AlbertForQuestionAnswering (ALBERT model)
camembert: CamembertForQuestionAnswering (CamemBERT model)
xlm-roberta: XLMRobertaForQuestionAnswering (XLM-RoBERTa model)
mbart: MBartForQuestionAnswering (mBART model)
mpnet: MPNetForQuestionAnswering (MPNet model)
bart: BartForQuestionAnswering (BART model)
reformer: ReformerForQuestionAnswering (Reformer model)
longformer: LongformerForQuestionAnswering (Longformer model)
roberta: RobertaForQuestionAnswering (RoBERTa model)
deberta-v2: DebertaV2ForQuestionAnswering (DeBERTa-v2 model)
deberta: DebertaForQuestionAnswering (DeBERTa model)
flaubert: FlaubertForQuestionAnsweringSimple (FlauBERT model)
squeezebert: SqueezeBertForQuestionAnswering (SqueezeBERT model)
bert: BertForQuestionAnswering (BERT model)
xlnet: XLNetForQuestionAnsweringSimple (XLNet model)
xlm: XLMForQuestionAnsweringSimple (XLM model)
electra: ElectraForQuestionAnswering (ELECTRA model)
funnel: FunnelForQuestionAnswering (Funnel Transformer model)
lxmert: LxmertForQuestionAnswering (LXMERT model)
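The order of the patterns matters when the name is used as a fallback: more specific patterns such as xlm-roberta and deberta-v2 are listed before roberta, deberta and bert. The sketch below illustrates why, under the assumption that the fallback is a first-match substring search in the listed order. TOY_PATTERNS (a shortened copy of the list) and select_class_name are hypothetical names, and the function returns class names as strings rather than real classes.

```python
# Hypothetical, shortened pattern table; the relative order is copied from the list above.
TOY_PATTERNS = [
    ("distilbert", "DistilBertForQuestionAnswering"),
    ("xlm-roberta", "XLMRobertaForQuestionAnswering"),
    ("roberta", "RobertaForQuestionAnswering"),
    ("deberta-v2", "DebertaV2ForQuestionAnswering"),
    ("deberta", "DebertaForQuestionAnswering"),
    ("flaubert", "FlaubertForQuestionAnsweringSimple"),
    ("bert", "BertForQuestionAnswering"),
]


def select_class_name(name_or_path, model_type=None):
    """Return a class name: model_type wins, the name pattern is the fallback."""
    if model_type is not None:
        # The configuration's model_type takes precedence when it is available.
        return dict(TOY_PATTERNS)[model_type]
    for pattern, class_name in TOY_PATTERNS:
        # First substring match wins, so specific patterns must come first.
        if pattern in str(name_or_path):
            return class_name
    raise ValueError(f"Could not guess a model class from {name_or_path!r}")


for name in ["xlm-roberta-base", "roberta-base", "bert-base-uncased"]:
    print(name, "->", select_class_name(name))
```

If bert were checked first in this sketch, distilbert-base-uncased would be matched by it and resolved to the wrong class, which is why the generic patterns sit at the end of the table.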
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Parameters
pretrained_model_name_or_path (str or os.PathLike): Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
A path or url to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional): Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional): Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using save_pretrained() and is reloaded by supplying the save directory.
The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional): A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional): Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False): Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False): Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False): Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional): A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False): Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False): Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main"): The specific model version to use. It can be a branch name, a tag name, or a commit id: since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional): Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
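The second case, where no configuration is provided, amounts to splitting kwargs into configuration overrides and leftover model arguments. A minimal sketch of that split follows. ToyConfig and split_kwargs are hypothetical names, and the hasattr test is an assumption about how "corresponds to a configuration attribute" could be decided, not a description of the library's code.

```python
class ToyConfig:
    """Hypothetical configuration with two attributes."""

    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 8


def split_kwargs(config, **kwargs):
    """Apply matching keys to `config` and return the keys that did not match."""
    unused = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            # The key names a configuration attribute: override it.
            setattr(config, key, value)
        else:
            # Otherwise keep it for the model's __init__.
            unused[key] = value
    return config, unused


config, model_kwargs = split_kwargs(ToyConfig(), output_attentions=True, foo="bar")
print(config.output_attentions, model_kwargs)
```

Here output_attentions updates the configuration, while foo is left over and would be forwarded to the model's __init__ function.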
Examples:
>>> from transformers import AutoConfig, AutoModelForQuestionAnswering
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForQuestionAnswering.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = AutoModelForQuestionAnswering.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_json_file('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForQuestionAnswering.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)
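The download-related parameters (revision, cache_dir, local_files_only, proxies) are often set together. The sketch below collects them in one place. build_download_kwargs and load_question_answering_model are hypothetical helper names, and the cache path is only an example; the keyword names themselves are the ones documented above. The library is imported lazily so that the first helper can be tried without it.

```python
def build_download_kwargs(revision="main", cache_dir=None, offline=False, proxies=None):
    """Collect the download-related keyword arguments for from_pretrained()."""
    kwargs = {"revision": revision, "local_files_only": offline}
    if cache_dir is not None:
        kwargs["cache_dir"] = cache_dir
    if proxies is not None:
        kwargs["proxies"] = proxies
    return kwargs


def load_question_answering_model(model_id, **download_kwargs):
    """Load a question answering model with the given download options."""
    # Imported lazily: this line needs the transformers library (PyTorch backend).
    from transformers import AutoModelForQuestionAnswering

    return AutoModelForQuestionAnswering.from_pretrained(model_id, **download_kwargs)


kwargs = build_download_kwargs(revision="main", cache_dir="./hf_cache", offline=True)
print(kwargs)
```

A call such as load_question_answering_model('bert-base-uncased', **kwargs) would then look only at local files, so with offline=True it can succeed only if the model is already present in the chosen cache directory.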
AutoModelForTableQuestionAnswering
class transformers.AutoModelForTableQuestionAnswering
This is a generic model class that will be instantiated as one of the model classes of the library (with a table question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_config(config)
Instantiates one of the model classes of the library (with a table question answering head) from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.
Parameters
config (PretrainedConfig): The model class to instantiate is selected based on the configuration class:
TapasConfig configuration class: TapasForQuestionAnswering (TAPAS model)
Examples:
>>> from transformers import AutoConfig, AutoModelForTableQuestionAnswering
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('google/tapas-base-finetuned-wtq')
>>> model = AutoModelForTableQuestionAnswering.from_config(config)
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library (with a table question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
tapas: TapasForQuestionAnswering (TAPAS model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Parameters
pretrained_model_name_or_path (str or os.PathLike): Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
A path or url to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional): Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional): Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using save_pretrained() and is reloaded by supplying the save directory.
The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional): A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional): Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False): Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False): Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False): Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional): A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False): Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False): Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main"): The specific model version to use. It can be a branch name, a tag name, or a commit id: since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional): Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Examples:
>>> from transformers import AutoConfig, AutoModelForTableQuestionAnswering
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTableQuestionAnswering.from_pretrained('google/tapas-base-finetuned-wtq')
>>> # Update configuration during loading
>>> model = AutoModelForTableQuestionAnswering.from_pretrained('google/tapas-base-finetuned-wtq', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_json_file('./tf_model/tapas_tf_checkpoint.json')
>>> model = AutoModelForTableQuestionAnswering.from_pretrained('./tf_model/tapas_tf_checkpoint.ckpt.index', from_tf=True, config=config)
TFAutoModel
class transformers.TFAutoModel
This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_config(config, **kwargs)
Instantiates one of the base model classes of the library from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.
Parameters
config (PretrainedConfig): The model class to instantiate is selected based on the configuration class:
ConvBertConfig configuration class: TFConvBertModel (ConvBERT model)
LEDConfig configuration class: TFLEDModel (LED model)
LxmertConfig configuration class: TFLxmertModel (LXMERT model)
MT5Config configuration class: TFMT5Model (mT5 model)
DistilBertConfig configuration class: TFDistilBertModel (DistilBERT model)
AlbertConfig configuration class: TFAlbertModel (ALBERT model)
BartConfig configuration class: TFBartModel (BART model)
CamembertConfig configuration class: TFCamembertModel (CamemBERT model)
XLMRobertaConfig configuration class: TFXLMRobertaModel (XLM-RoBERTa model)
LongformerConfig configuration class: TFLongformerModel (Longformer model)
RobertaConfig configuration class: TFRobertaModel (RoBERTa model)
BertConfig configuration class: TFBertModel (BERT model)
OpenAIGPTConfig configuration class: TFOpenAIGPTModel (OpenAI GPT model)
GPT2Config configuration class: TFGPT2Model (OpenAI GPT-2 model)
MobileBertConfig configuration class: TFMobileBertModel (MobileBERT model)
TransfoXLConfig configuration class: TFTransfoXLModel (Transformer-XL model)
XLNetConfig configuration class: TFXLNetModel (XLNet model)
FlaubertConfig configuration class: TFFlaubertModel (FlauBERT model)
XLMConfig configuration class: TFXLMModel (XLM model)
CTRLConfig configuration class: TFCTRLModel (CTRL model)
ElectraConfig configuration class: TFElectraModel (ELECTRA model)
FunnelConfig configuration class: TFFunnelModel (Funnel Transformer model)
DPRConfig configuration class: TFDPRQuestionEncoder (DPR model)
MPNetConfig configuration class: TFMPNetModel (MPNet model)
MBartConfig configuration class: TFMBartModel (mBART model)
MarianConfig configuration class: TFMarianModel (Marian model)
PegasusConfig configuration class: TFPegasusModel (Pegasus model)
BlenderbotConfig configuration class: TFBlenderbotModel (Blenderbot model)
BlenderbotSmallConfig configuration class: TFBlenderbotSmallModel (BlenderbotSmall model)
Examples:
>>> from transformers import AutoConfig, TFAutoModel
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModel.from_config(config)
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the base model classes of the library from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
convbert: TFConvBertModel (ConvBERT model)
led: TFLEDModel (LED model)
blenderbot-small: TFBlenderbotSmallModel (BlenderbotSmall model)
mt5: TFMT5Model (mT5 model)
t5: TFT5Model (T5 model)
mobilebert: TFMobileBertModel (MobileBERT model)
distilbert: TFDistilBertModel (DistilBERT model)
albert: TFAlbertModel (ALBERT model)
camembert: TFCamembertModel (CamemBERT model)
xlm-roberta: TFXLMRobertaModel (XLM-RoBERTa model)
pegasus: TFPegasusModel (Pegasus model)
marian: TFMarianModel (Marian model)
mbart: TFMBartModel (mBART model)
mpnet: TFMPNetModel (MPNet model)
bart: TFBartModel (BART model)
blenderbot: TFBlenderbotModel (Blenderbot model)
longformer: TFLongformerModel (Longformer model)
roberta: TFRobertaModel (RoBERTa model)
flaubert: TFFlaubertModel (FlauBERT model)
bert: TFBertModel (BERT model)
openai-gpt: TFOpenAIGPTModel (OpenAI GPT model)
gpt2: TFGPT2Model (OpenAI GPT-2 model)
transfo-xl: TFTransfoXLModel (Transformer-XL model)
xlnet: TFXLNetModel (XLNet model)
xlm: TFXLMModel (XLM model)
ctrl: TFCTRLModel (CTRL model)
electra: TFElectraModel (ELECTRA model)
funnel: TFFunnelModel (Funnel Transformer model)
lxmert: TFLxmertModel (LXMERT model)
dpr: TFDPRQuestionEncoder (DPR model)
Dropout modules are deactivated by default. TensorFlow models do not have the model.eval() and model.train() methods of their PyTorch counterparts; to run the model in training mode, pass training=True when calling it.
Parameters
pretrained_model_name_or_path: Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional): Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional): Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using save_pretrained() and is reloaded by supplying the save directory.
The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional): A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional): Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False): Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False): Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False): Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional): A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False): Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False): Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main"): The specific model version to use. It can be a branch name, a tag name, or a commit id: since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional): Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Examples:
>>> from transformers import AutoConfig, TFAutoModel
>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModel.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = TFAutoModel.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_json_file('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModel.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)
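The cost of the slower PyTorch loading path only needs to be paid once if the loaded model is then saved in TensorFlow format. A hedged sketch of that workflow follows. convert_pt_directory_to_tf is a hypothetical helper name, pt_dir is assumed to contain config.json and pytorch_model.bin, and both TensorFlow and PyTorch are assumed to be installed. It relies on from_pretrained(..., from_pt=True) followed by save_pretrained() rather than on the conversion scripts mentioned in the text.

```python
def convert_pt_directory_to_tf(pt_dir, tf_dir):
    """Load PyTorch weights into a TensorFlow model once, then save them natively."""
    # Imported lazily: needs transformers with both TensorFlow and PyTorch installed.
    from transformers import TFAutoModel

    # Slow path: the PyTorch weights are converted while loading.
    model = TFAutoModel.from_pretrained(pt_dir, from_pt=True)

    # Writes the configuration and TensorFlow weights into tf_dir.
    model.save_pretrained(tf_dir)
    return tf_dir
```

After a call such as convert_pt_directory_to_tf('./pt_model/', './tf_model/'), later runs can use TFAutoModel.from_pretrained('./tf_model/') without the from_pt flag. Both directory names are placeholders.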
TFAutoModelForPreTraining
class transformers.TFAutoModelForPreTraining
This is a generic model class that will be instantiated as one of the model classes of the library (with the architecture used for pretraining this model) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_config(config)
Instantiates one of the model classes of the library (with the architecture used for pretraining this model) from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.
Parameters
config (PretrainedConfig): The model class to instantiate is selected based on the configuration class:
LxmertConfig configuration class: TFLxmertForPreTraining (LXMERT model)
T5Config configuration class: TFT5ForConditionalGeneration (T5 model)
DistilBertConfig configuration class: TFDistilBertForMaskedLM (DistilBERT model)
AlbertConfig configuration class: TFAlbertForPreTraining (ALBERT model)
BartConfig configuration class: TFBartForConditionalGeneration (BART model)
CamembertConfig configuration class: TFCamembertForMaskedLM (CamemBERT model)
XLMRobertaConfig configuration class: TFXLMRobertaForMaskedLM (XLM-RoBERTa model)
RobertaConfig configuration class: TFRobertaForMaskedLM (RoBERTa model)
BertConfig configuration class: TFBertForPreTraining (BERT model)
OpenAIGPTConfig configuration class: TFOpenAIGPTLMHeadModel (OpenAI GPT model)
GPT2Config configuration class: TFGPT2LMHeadModel (OpenAI GPT-2 model)
MobileBertConfig configuration class: TFMobileBertForPreTraining (MobileBERT model)
TransfoXLConfig configuration class: TFTransfoXLLMHeadModel (Transformer-XL model)
XLNetConfig configuration class: TFXLNetLMHeadModel (XLNet model)
FlaubertConfig configuration class: TFFlaubertWithLMHeadModel (FlauBERT model)
XLMConfig configuration class: TFXLMWithLMHeadModel (XLM model)
CTRLConfig configuration class: TFCTRLLMHeadModel (CTRL model)
ElectraConfig configuration class: TFElectraForPreTraining (ELECTRA model)
FunnelConfig configuration class: TFFunnelForPreTraining (Funnel Transformer model)
MPNetConfig configuration class: TFMPNetForMaskedLM (MPNet model)
Examples:
>>> from transformers import AutoConfig, TFAutoModelForPreTraining
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModelForPreTraining.from_config(config)
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library (with the architecture used for pretraining this model) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
t5: TFT5ForConditionalGeneration (T5 model)
mobilebert: TFMobileBertForPreTraining (MobileBERT model)
distilbert: TFDistilBertForMaskedLM (DistilBERT model)
albert: TFAlbertForPreTraining (ALBERT model)
camembert: TFCamembertForMaskedLM (CamemBERT model)
xlm-roberta: TFXLMRobertaForMaskedLM (XLM-RoBERTa model)
mpnet: TFMPNetForMaskedLM (MPNet model)
bart: TFBartForConditionalGeneration (BART model)
roberta: TFRobertaForMaskedLM (RoBERTa model)
flaubert: TFFlaubertWithLMHeadModel (FlauBERT model)
bert: TFBertForPreTraining (BERT model)
openai-gpt: TFOpenAIGPTLMHeadModel (OpenAI GPT model)
gpt2: TFGPT2LMHeadModel (OpenAI GPT-2 model)
transfo-xl: TFTransfoXLLMHeadModel (Transformer-XL model)
xlnet: TFXLNetLMHeadModel (XLNet model)
xlm: TFXLMWithLMHeadModel (XLM model)
ctrl: TFCTRLLMHeadModel (CTRL model)
electra: TFElectraForPreTraining (ELECTRA model)
funnel: TFFunnelForPreTraining (Funnel Transformer model)
lxmert: TFLxmertForPreTraining (LXMERT model)
Dropout modules are deactivated by default. TensorFlow models do not have the model.eval() and model.train() methods of their PyTorch counterparts; to run the model in training mode, pass training=True when calling it.
Parameters
pretrained_model_name_or_path: Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional): Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional): Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using save_pretrained() and is reloaded by supplying the save directory.
The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional): A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional): Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False): Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False): Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False): Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional): A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False): Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False): Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main"): The specific model version to use. It can be a branch name, a tag name, or a commit id: since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional): Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForPreTraining
>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForPreTraining.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = TFAutoModelForPreTraining.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> # AutoConfig only exposes from_pretrained(), which also accepts a path to a JSON file
>>> config = AutoConfig.from_pretrained('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModelForPreTraining.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)
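The two kwargs cases described above can be hard to picture, so here is a minimal, library-free sketch of the second case (no config supplied). ToyConfig and split_kwargs are hypothetical names invented for illustration; they are not part of transformers, and the real loading code may differ in detail.

```python
class ToyConfig:
    """Hypothetical stand-in for a configuration class (not a transformers class)."""

    def __init__(self, hidden_size=768, output_attentions=False):
        self.hidden_size = hidden_size
        self.output_attentions = output_attentions


def split_kwargs(config, kwargs):
    """Apply kwargs that name config attributes; collect the rest unchanged."""
    remaining = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # overrides the loaded configuration value
        else:
            remaining[key] = value       # would be forwarded to the model's __init__
    return config, remaining


config, model_kwargs = split_kwargs(
    ToyConfig(), {"output_attentions": True, "my_custom_flag": 1}
)
print(config.output_attentions)  # True
print(model_kwargs)              # {'my_custom_flag': 1}
```

In the first case (a config object is passed explicitly), no such split happens in this picture: every keyword argument would go straight to the model constructor.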
TFAutoModelForCausalLM
class transformers.TFAutoModelForCausalLM
This is a generic model class that will be instantiated as one of the model classes of the library, with a causal language modeling head, when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_config(config)
Instantiates one of the model classes of the library, with a causal language modeling head, from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.
Parameters
config (PretrainedConfig): The model class to instantiate is selected based on the configuration class:
    BertConfig configuration class: TFBertLMHeadModel (BERT model)
    OpenAIGPTConfig configuration class: TFOpenAIGPTLMHeadModel (OpenAI GPT model)
    GPT2Config configuration class: TFGPT2LMHeadModel (OpenAI GPT-2 model)
    TransfoXLConfig configuration class: TFTransfoXLLMHeadModel (Transformer-XL model)
    XLNetConfig configuration class: TFXLNetLMHeadModel (XLNet model)
    XLMConfig configuration class: TFXLMWithLMHeadModel (XLM model)
    CTRLConfig configuration class: TFCTRLLMHeadModel (CTRL model)
Examples:
>>> from transformers import AutoConfig, TFAutoModelForCausalLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('gpt2')
>>> model = TFAutoModelForCausalLM.from_config(config)
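The selection by configuration class described for from_config() can be sketched without the library as a lookup table keyed on the type of the config object. Every class and the CAUSAL_LM_MAPPING table below are hypothetical stand-ins written for this illustration; the sketch shows the idea only and is not the library's implementation.

```python
class ToyBertConfig:
    """Hypothetical configuration class (stands in for BertConfig)."""


class ToyGPT2Config:
    """Hypothetical configuration class (stands in for GPT2Config)."""


class ToyModel:
    """Hypothetical base model: keeps the config, holds no pretrained weights."""

    def __init__(self, config):
        self.config = config


class ToyBertLMHeadModel(ToyModel):
    pass


class ToyGPT2LMHeadModel(ToyModel):
    pass


# Illustrative table: configuration class -> model class with a causal LM head.
CAUSAL_LM_MAPPING = {
    ToyBertConfig: ToyBertLMHeadModel,
    ToyGPT2Config: ToyGPT2LMHeadModel,
}


def from_config(config):
    """Pick the model class registered for the exact type of `config`."""
    model_class = CAUSAL_LM_MAPPING.get(type(config))
    if model_class is None:
        raise ValueError(f"Unrecognized configuration class {type(config).__name__}")
    return model_class(config)


model = from_config(ToyGPT2Config())
print(type(model).__name__)  # ToyGPT2LMHeadModel
```

Note that nothing in this path touches a weights file, which matches the note above: building from a configuration yields an architecture with freshly initialized parameters.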
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library, with a causal language modeling head, from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
    bert: TFBertLMHeadModel (BERT model)
    openai-gpt: TFOpenAIGPTLMHeadModel (OpenAI GPT model)
    gpt2: TFGPT2LMHeadModel (OpenAI GPT-2 model)
    transfo-xl: TFTransfoXLLMHeadModel (Transformer-XL model)
    xlnet: TFXLNetLMHeadModel (XLNet model)
    xlm: TFXLMWithLMHeadModel (XLM model)
    ctrl: TFCTRLLMHeadModel (CTRL model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Parameters
pretrained_model_name_or_path: Can be either:
    A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
    A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional): Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional): Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
    The model is a model provided by the library (loaded with the model id string of a pretrained model).
    The model was saved using save_pretrained() and is reloaded by supplying the save directory.
    The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional): A state dictionary to use instead of a state dictionary loaded from the saved weights file.
    This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional): Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False): Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False): Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False): Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional): A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False): Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False): Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main"): The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional): Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
    If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
    If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForCausalLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForCausalLM.from_pretrained('gpt2')
>>> # Update configuration during loading
>>> model = TFAutoModelForCausalLM.from_pretrained('gpt2', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> # AutoConfig only exposes from_pretrained(), which also accepts a path to a JSON file
>>> config = AutoConfig.from_pretrained('./pt_model/gpt2_pt_model_config.json')
>>> model = TFAutoModelForCausalLM.from_pretrained('./pt_model/gpt2_pytorch_model.bin', from_pt=True, config=config)
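The three accepted forms of pretrained_model_name_or_path (a model id, a save directory, or a single weights file) can be told apart with ordinary filesystem checks. The helper below, describe_source, is a hypothetical function written for this illustration; it assumes that anything that is neither an existing directory, an existing file, nor an http(s) URL is treated as a model id, which may not match the library's exact rules.

```python
import os
import tempfile


def describe_source(name_or_path):
    """Return 'directory', 'weights_file', or 'model_id' for the given string."""
    if os.path.isdir(name_or_path):
        return "directory"      # e.g. a folder written by save_pretrained()
    if os.path.isfile(name_or_path) or name_or_path.startswith(("http://", "https://")):
        return "weights_file"   # e.g. a PyTorch state_dict file; needs from_pt=True and config
    return "model_id"           # assumed to be resolved on huggingface.co


with tempfile.TemporaryDirectory() as save_dir:
    weights = os.path.join(save_dir, "pytorch_model.bin")
    open(weights, "wb").close()  # empty placeholder file, for illustration only
    kinds = [
        describe_source(save_dir),
        describe_source(weights),
        describe_source("dbmdz/bert-base-german-cased"),
    ]

print(kinds)
```

Only the weights-file form requires passing a configuration explicitly, because a lone file carries no config.json next to it.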
TFAutoModelForMaskedLM
class transformers.TFAutoModelForMaskedLM
This is a generic model class that will be instantiated as one of the model classes of the library, with a masked language modeling head, when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_config(config)
Instantiates one of the model classes of the library, with a masked language modeling head, from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.
Parameters
config (PretrainedConfig): The model class to instantiate is selected based on the configuration class:
    ConvBertConfig configuration class: TFConvBertForMaskedLM (ConvBERT model)
    DistilBertConfig configuration class: TFDistilBertForMaskedLM (DistilBERT model)
    AlbertConfig configuration class: TFAlbertForMaskedLM (ALBERT model)
    CamembertConfig configuration class: TFCamembertForMaskedLM (CamemBERT model)
    XLMRobertaConfig configuration class: TFXLMRobertaForMaskedLM (XLM-RoBERTa model)
    LongformerConfig configuration class: TFLongformerForMaskedLM (Longformer model)
    RobertaConfig configuration class: TFRobertaForMaskedLM (RoBERTa model)
    BertConfig configuration class: TFBertForMaskedLM (BERT model)
    MobileBertConfig configuration class: TFMobileBertForMaskedLM (MobileBERT model)
    FlaubertConfig configuration class: TFFlaubertWithLMHeadModel (FlauBERT model)
    XLMConfig configuration class: TFXLMWithLMHeadModel (XLM model)
    ElectraConfig configuration class: TFElectraForMaskedLM (ELECTRA model)
    FunnelConfig configuration class: TFFunnelForMaskedLM (Funnel Transformer model)
    MPNetConfig configuration class: TFMPNetForMaskedLM (MPNet model)
Examples:
>>> from transformers import AutoConfig, TFAutoModelForMaskedLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModelForMaskedLM.from_config(config)
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library, with a masked language modeling head, from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
    convbert: TFConvBertForMaskedLM (ConvBERT model)
    mobilebert: TFMobileBertForMaskedLM (MobileBERT model)
    distilbert: TFDistilBertForMaskedLM (DistilBERT model)
    albert: TFAlbertForMaskedLM (ALBERT model)
    camembert: TFCamembertForMaskedLM (CamemBERT model)
    xlm-roberta: TFXLMRobertaForMaskedLM (XLM-RoBERTa model)
    mpnet: TFMPNetForMaskedLM (MPNet model)
    longformer: TFLongformerForMaskedLM (Longformer model)
    roberta: TFRobertaForMaskedLM (RoBERTa model)
    flaubert: TFFlaubertWithLMHeadModel (FlauBERT model)
    bert: TFBertForMaskedLM (BERT model)
    xlm: TFXLMWithLMHeadModel (XLM model)
    electra: TFElectraForMaskedLM (ELECTRA model)
    funnel: TFFunnelForMaskedLM (Funnel Transformer model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Parameters
pretrained_model_name_or_path: Can be either:
    A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
    A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional): Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional): Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
    The model is a model provided by the library (loaded with the model id string of a pretrained model).
    The model was saved using save_pretrained() and is reloaded by supplying the save directory.
    The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional): A state dictionary to use instead of a state dictionary loaded from the saved weights file.
    This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional): Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False): Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False): Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False): Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional): A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False): Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False): Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main"): The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional): Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
    If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
    If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForMaskedLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForMaskedLM.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = TFAutoModelForMaskedLM.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> # AutoConfig only exposes from_pretrained(), which also accepts a path to a JSON file
>>> config = AutoConfig.from_pretrained('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModelForMaskedLM.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)
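The fallback to pattern matching on pretrained_model_name_or_path explains why the order of the mapping matters: several names contain others as substrings (roberta inside xlm-roberta, bert inside distilbert). The sketch below assumes a first-match-wins substring search over an ordered table; MASKED_LM_NAMES and resolve_class_name are hypothetical names, the table is only a subset of the list above, and the library's own lookup may be implemented differently.

```python
# Subset of the masked-LM table above, kept in the documented order so that
# more specific patterns are tried before the substrings they contain.
MASKED_LM_NAMES = {
    "distilbert": "TFDistilBertForMaskedLM",
    "camembert": "TFCamembertForMaskedLM",
    "xlm-roberta": "TFXLMRobertaForMaskedLM",
    "roberta": "TFRobertaForMaskedLM",
    "flaubert": "TFFlaubertWithLMHeadModel",
    "bert": "TFBertForMaskedLM",
    "xlm": "TFXLMWithLMHeadModel",
}


def resolve_class_name(name_or_path, model_type=None):
    """Prefer an explicit model_type; otherwise take the first matching pattern."""
    if model_type is not None:
        if model_type not in MASKED_LM_NAMES:
            raise ValueError(f"Unknown model_type {model_type!r}")
        return MASKED_LM_NAMES[model_type]
    for pattern, class_name in MASKED_LM_NAMES.items():
        if pattern in name_or_path:
            return class_name
    raise ValueError(f"Cannot infer a model class from {name_or_path!r}")


print(resolve_class_name("xlm-roberta-base"))               # TFXLMRobertaForMaskedLM
print(resolve_class_name("distilbert-base-uncased"))        # TFDistilBertForMaskedLM
print(resolve_class_name("./my_model", model_type="bert"))  # TFBertForMaskedLM
```

If bert were listed first in this sketch, every one of the names above containing "bert" would resolve to the BERT class, which is why a config with an explicit model_type is the more reliable route.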
TFAutoModelForSeq2SeqLM
class transformers.TFAutoModelForSeq2SeqLM
This is a generic model class that will be instantiated as one of the model classes of the library, with a sequence-to-sequence language modeling head, when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_config(config, **kwargs)
Instantiates one of the model classes of the library, with a sequence-to-sequence language modeling head, from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.
Parameters
config (PretrainedConfig): The model class to instantiate is selected based on the configuration class:
    LEDConfig configuration class: TFLEDForConditionalGeneration (LED model)
    MT5Config configuration class: TFMT5ForConditionalGeneration (mT5 model)
    T5Config configuration class: TFT5ForConditionalGeneration (T5 model)
    MarianConfig configuration class: TFMarianMTModel (Marian model)
    MBartConfig configuration class: TFMBartForConditionalGeneration (mBART model)
    PegasusConfig configuration class: TFPegasusForConditionalGeneration (Pegasus model)
    BlenderbotConfig configuration class: TFBlenderbotForConditionalGeneration (Blenderbot model)
    BlenderbotSmallConfig configuration class: TFBlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
    BartConfig configuration class: TFBartForConditionalGeneration (BART model)
Examples:
>>> from transformers import AutoConfig, TFAutoModelForSeq2SeqLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('t5-base')
>>> model = TFAutoModelForSeq2SeqLM.from_config(config)
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library, with a sequence-to-sequence language modeling head, from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
    LEDConfig configuration class: TFLEDForConditionalGeneration (LED model)
    MT5Config configuration class: TFMT5ForConditionalGeneration (mT5 model)
    T5Config configuration class: TFT5ForConditionalGeneration (T5 model)
    MarianConfig configuration class: TFMarianMTModel (Marian model)
    MBartConfig configuration class: TFMBartForConditionalGeneration (mBART model)
    PegasusConfig configuration class: TFPegasusForConditionalGeneration (Pegasus model)
    BlenderbotConfig configuration class: TFBlenderbotForConditionalGeneration (Blenderbot model)
    BlenderbotSmallConfig configuration class: TFBlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
    BartConfig configuration class: TFBartForConditionalGeneration (BART model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Parameters
pretrained_model_name_or_path: Can be either:
    A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
    A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional): Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional): Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
    The model is a model provided by the library (loaded with the model id string of a pretrained model).
    The model was saved using save_pretrained() and is reloaded by supplying the save directory.
    The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional): A state dictionary to use instead of a state dictionary loaded from the saved weights file.
    This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional): Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False): Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False): Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False): Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional): A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False): Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False): Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main"): The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional): Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
    If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
    If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForSeq2SeqLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained('t5-base')
>>> # Update configuration during loading
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained('t5-base', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> # AutoConfig only exposes from_pretrained(), which also accepts a path to a JSON file
>>> config = AutoConfig.from_pretrained('./pt_model/t5_pt_model_config.json')
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained('./pt_model/t5_pytorch_model.bin', from_pt=True, config=config)
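How force_download and local_files_only interact with the cache can be summarized as a small decision function. This is a rough sketch derived only from the parameter descriptions above; plan_fetch is a hypothetical helper, and giving local_files_only precedence when both flags are set is an assumption of this sketch, not documented behaviour.

```python
def plan_fetch(cached, force_download=False, local_files_only=False):
    """Decide whether to reuse a cached file or download it again.

    `cached` says whether a copy already exists in the cache directory.
    """
    if local_files_only:
        # Never touch the network; a missing cache entry is an error.
        if not cached:
            raise FileNotFoundError("file not cached and local_files_only=True")
        return "use_cache"
    if force_download or not cached:
        return "download"  # the fresh copy overrides any cached version
    return "use_cache"


print(plan_fetch(cached=True))                       # use_cache
print(plan_fetch(cached=False))                      # download
print(plan_fetch(cached=True, force_download=True))  # download
```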
TFAutoModelForSequenceClassification
class transformers.TFAutoModelForSequenceClassification
This is a generic model class that will be instantiated as one of the model classes of the library, with a sequence classification head, when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_config(config)
Instantiates one of the model classes of the library, with a sequence classification head, from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.
Parameters
config (PretrainedConfig): The model class to instantiate is selected based on the configuration class:
    ConvBertConfig configuration class: TFConvBertForSequenceClassification (ConvBERT model)
    DistilBertConfig configuration class: TFDistilBertForSequenceClassification (DistilBERT model)
    AlbertConfig configuration class: TFAlbertForSequenceClassification (ALBERT model)
    CamembertConfig configuration class: TFCamembertForSequenceClassification (CamemBERT model)
    XLMRobertaConfig configuration class: TFXLMRobertaForSequenceClassification (XLM-RoBERTa model)
    LongformerConfig configuration class: TFLongformerForSequenceClassification (Longformer model)
    RobertaConfig configuration class: TFRobertaForSequenceClassification (RoBERTa model)
    BertConfig configuration class: TFBertForSequenceClassification (BERT model)
    XLNetConfig configuration class: TFXLNetForSequenceClassification (XLNet model)
    MobileBertConfig configuration class: TFMobileBertForSequenceClassification (MobileBERT model)
    FlaubertConfig configuration class: TFFlaubertForSequenceClassification (FlauBERT model)
    XLMConfig configuration class: TFXLMForSequenceClassification (XLM model)
    ElectraConfig configuration class: TFElectraForSequenceClassification (ELECTRA model)
    FunnelConfig configuration class: TFFunnelForSequenceClassification (Funnel Transformer model)
    GPT2Config configuration class: TFGPT2ForSequenceClassification (OpenAI GPT-2 model)
    MPNetConfig configuration class: TFMPNetForSequenceClassification (MPNet model)
    OpenAIGPTConfig configuration class: TFOpenAIGPTForSequenceClassification (OpenAI GPT model)
    TransfoXLConfig configuration class: TFTransfoXLForSequenceClassification (Transformer-XL model)
    CTRLConfig configuration class: TFCTRLForSequenceClassification (CTRL model)
Examples:
>>> from transformers import AutoConfig, TFAutoModelForSequenceClassification
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModelForSequenceClassification.from_config(config)
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library, with a sequence classification head, from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
    convbert: TFConvBertForSequenceClassification (ConvBERT model)
    mobilebert: TFMobileBertForSequenceClassification (MobileBERT model)
    distilbert: TFDistilBertForSequenceClassification (DistilBERT model)
    albert: TFAlbertForSequenceClassification (ALBERT model)
    camembert: TFCamembertForSequenceClassification (CamemBERT model)
    xlm-roberta: TFXLMRobertaForSequenceClassification (XLM-RoBERTa model)
    mpnet: TFMPNetForSequenceClassification (MPNet model)
    longformer: TFLongformerForSequenceClassification (Longformer model)
    roberta: TFRobertaForSequenceClassification (RoBERTa model)
    flaubert: TFFlaubertForSequenceClassification (FlauBERT model)
    bert: TFBertForSequenceClassification (BERT model)
    openai-gpt: TFOpenAIGPTForSequenceClassification (OpenAI GPT model)
    gpt2: TFGPT2ForSequenceClassification (OpenAI GPT-2 model)
    transfo-xl: TFTransfoXLForSequenceClassification (Transformer-XL model)
    xlnet: TFXLNetForSequenceClassification (XLNet model)
    xlm: TFXLMForSequenceClassification (XLM model)
    ctrl: TFCTRLForSequenceClassification (CTRL model)
    electra: TFElectraForSequenceClassification (ELECTRA model)
    funnel: TFFunnelForSequenceClassification (Funnel Transformer model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Parameters
pretrained_model_name_or_path: Can be either:
    A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
    A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) – Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using save_pretrained() and is reloaded by supplying the save directory.
The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) – A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) – Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
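The way kwargs are divided between the configuration and the model can be pictured with a small, library-free sketch. ToyConfig and split_kwargs are hypothetical names invented for this illustration; they are not part of transformers and only mimic the behavior described above for the case where no config is provided.

```python
from typing import Any, Dict, Tuple


class ToyConfig:
    """Hypothetical stand-in for a configuration class (not the library's)."""

    def __init__(self, output_attentions: bool = False, num_labels: int = 2):
        self.output_attentions = output_attentions
        self.num_labels = num_labels


def split_kwargs(config: ToyConfig, kwargs: Dict[str, Any]) -> Tuple[ToyConfig, Dict[str, Any]]:
    """Apply kwargs that name config attributes; return the rest for the model."""
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            # Key matches a configuration attribute: override it.
            setattr(config, key, value)
        else:
            # Unknown to the configuration: forward to the model's __init__.
            model_kwargs[key] = value
    return config, model_kwargs


config, model_kwargs = split_kwargs(
    ToyConfig(), {"output_attentions": True, "my_flag": 1}
)
print(config.output_attentions, model_kwargs)
```

Here output_attentions updates the configuration, while my_flag (an invented key) is left over for the model constructor.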
Examples:
>>> from transformers import AutoConfig, TFAutoModelForSequenceClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSequenceClassification.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = TFAutoModelForSequenceClassification.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_json_file('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModelForSequenceClassification.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)
TFAutoModelForMultipleChoice
class transformers.TFAutoModelForMultipleChoice
This is a generic model class that will be instantiated as one of the model classes of the library, with a multiple choice classification head, when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_config(config)
Instantiates one of the model classes of the library, with a multiple choice classification head, from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.
Parameters
config (PretrainedConfig) – The model class to instantiate is selected based on the configuration class:
ConvBertConfig configuration class: TFConvBertForMultipleChoice (ConvBERT model)
CamembertConfig configuration class: TFCamembertForMultipleChoice (CamemBERT model)
XLMConfig configuration class: TFXLMForMultipleChoice (XLM model)
XLMRobertaConfig configuration class: TFXLMRobertaForMultipleChoice (XLM-RoBERTa model)
LongformerConfig configuration class: TFLongformerForMultipleChoice (Longformer model)
RobertaConfig configuration class: TFRobertaForMultipleChoice (RoBERTa model)
BertConfig configuration class: TFBertForMultipleChoice (BERT model)
DistilBertConfig configuration class: TFDistilBertForMultipleChoice (DistilBERT model)
MobileBertConfig configuration class: TFMobileBertForMultipleChoice (MobileBERT model)
XLNetConfig configuration class: TFXLNetForMultipleChoice (XLNet model)
FlaubertConfig configuration class: TFFlaubertForMultipleChoice (FlauBERT model)
AlbertConfig configuration class: TFAlbertForMultipleChoice (ALBERT model)
ElectraConfig configuration class: TFElectraForMultipleChoice (ELECTRA model)
FunnelConfig configuration class: TFFunnelForMultipleChoice (Funnel Transformer model)
MPNetConfig configuration class: TFMPNetForMultipleChoice (MPNet model)
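Selecting a model class from a configuration class amounts to a lookup in a mapping from one to the other. The sketch below illustrates that idea with invented Toy* classes and an exact-type lookup; it is an assumption-laden illustration, not the library's actual mapping or lookup code.

```python
class ToyBertConfig:
    """Hypothetical configuration class."""


class ToyRobertaConfig(ToyBertConfig):
    """Hypothetical configuration class that subclasses the BERT-style one."""


class ToyBertForMultipleChoice:
    def __init__(self, config):
        self.config = config


class ToyRobertaForMultipleChoice:
    def __init__(self, config):
        self.config = config


# Configuration class -> model class (toy version of the list above).
TOY_MAPPING = {
    ToyRobertaConfig: ToyRobertaForMultipleChoice,
    ToyBertConfig: ToyBertForMultipleChoice,
}


def from_config(config):
    """Instantiate the model class registered for the exact type of config."""
    model_class = TOY_MAPPING.get(type(config))
    if model_class is None:
        raise ValueError(f"Unrecognized configuration class {type(config).__name__}")
    # Only the configuration is used here; no pretrained weights are loaded.
    return model_class(config)


model = from_config(ToyRobertaConfig())
print(type(model).__name__)
```

Looking up the exact type avoids a subclass such as ToyRobertaConfig being mistaken for its parent class.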
Examples:
>>> from transformers import AutoConfig, TFAutoModelForMultipleChoice
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModelForMultipleChoice.from_config(config)
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library, with a multiple choice classification head, from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
convbert – TFConvBertForMultipleChoice (ConvBERT model)
mobilebert – TFMobileBertForMultipleChoice (MobileBERT model)
distilbert – TFDistilBertForMultipleChoice (DistilBERT model)
albert – TFAlbertForMultipleChoice (ALBERT model)
camembert – TFCamembertForMultipleChoice (CamemBERT model)
xlm-roberta – TFXLMRobertaForMultipleChoice (XLM-RoBERTa model)
mpnet – TFMPNetForMultipleChoice (MPNet model)
longformer – TFLongformerForMultipleChoice (Longformer model)
roberta – TFRobertaForMultipleChoice (RoBERTa model)
flaubert – TFFlaubertForMultipleChoice (FlauBERT model)
bert – TFBertForMultipleChoice (BERT model)
xlnet – TFXLNetForMultipleChoice (XLNet model)
xlm – TFXLMForMultipleChoice (XLM model)
electra – TFElectraForMultipleChoice (ELECTRA model)
funnel – TFFunnelForMultipleChoice (Funnel Transformer model)
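The ordering of the patterns above matters when the fallback is used, because several names contain others (for example, distilbert and xlm-roberta contain bert and roberta). A minimal sketch of the resolution logic, assuming a plain substring match and a hypothetical helper named resolve_model_type, could look like this:

```python
from typing import Optional

# Patterns in the order listed above: specific names precede the names they contain.
NAME_PATTERNS = [
    "convbert", "mobilebert", "distilbert", "albert", "camembert",
    "xlm-roberta", "mpnet", "longformer", "roberta", "flaubert",
    "bert", "xlnet", "xlm", "electra", "funnel",
]


def resolve_model_type(name_or_path: str, model_type: Optional[str] = None) -> str:
    """Prefer the config's model_type; otherwise take the first matching pattern."""
    if model_type is not None:
        return model_type
    for pattern in NAME_PATTERNS:
        if pattern in name_or_path:
            return pattern
    raise ValueError(f"Could not infer a model type from {name_or_path!r}")


print(resolve_model_type("distilbert-base-uncased"))
print(resolve_model_type("my-checkpoint", model_type="electra"))
```

If bert were checked before distilbert, the first call would resolve to the wrong architecture, which is why the specific patterns come first.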
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Parameters
pretrained_model_name_or_path – Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) – Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using save_pretrained() and is reloaded by supplying the save directory.
The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) – A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) – Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForMultipleChoice
>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForMultipleChoice.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = TFAutoModelForMultipleChoice.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_json_file('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModelForMultipleChoice.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)
TFAutoModelForTokenClassification
class transformers.TFAutoModelForTokenClassification
This is a generic model class that will be instantiated as one of the model classes of the library, with a token classification head, when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_config(config)
Instantiates one of the model classes of the library, with a token classification head, from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.
Parameters
config (PretrainedConfig) – The model class to instantiate is selected based on the configuration class:
ConvBertConfig configuration class: TFConvBertForTokenClassification (ConvBERT model)
DistilBertConfig configuration class: TFDistilBertForTokenClassification (DistilBERT model)
AlbertConfig configuration class: TFAlbertForTokenClassification (ALBERT model)
CamembertConfig configuration class: TFCamembertForTokenClassification (CamemBERT model)
FlaubertConfig configuration class: TFFlaubertForTokenClassification (FlauBERT model)
XLMConfig configuration class: TFXLMForTokenClassification (XLM model)
XLMRobertaConfig configuration class: TFXLMRobertaForTokenClassification (XLM-RoBERTa model)
LongformerConfig configuration class: TFLongformerForTokenClassification (Longformer model)
RobertaConfig configuration class: TFRobertaForTokenClassification (RoBERTa model)
BertConfig configuration class: TFBertForTokenClassification (BERT model)
MobileBertConfig configuration class: TFMobileBertForTokenClassification (MobileBERT model)
XLNetConfig configuration class: TFXLNetForTokenClassification (XLNet model)
ElectraConfig configuration class: TFElectraForTokenClassification (ELECTRA model)
FunnelConfig configuration class: TFFunnelForTokenClassification (Funnel Transformer model)
MPNetConfig configuration class: TFMPNetForTokenClassification (MPNet model)
Examples:
>>> from transformers import AutoConfig, TFAutoModelForTokenClassification
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModelForTokenClassification.from_config(config)
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library, with a token classification head, from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
convbert – TFConvBertForTokenClassification (ConvBERT model)
mobilebert – TFMobileBertForTokenClassification (MobileBERT model)
distilbert – TFDistilBertForTokenClassification (DistilBERT model)
albert – TFAlbertForTokenClassification (ALBERT model)
camembert – TFCamembertForTokenClassification (CamemBERT model)
xlm-roberta – TFXLMRobertaForTokenClassification (XLM-RoBERTa model)
mpnet – TFMPNetForTokenClassification (MPNet model)
longformer – TFLongformerForTokenClassification (Longformer model)
roberta – TFRobertaForTokenClassification (RoBERTa model)
flaubert – TFFlaubertForTokenClassification (FlauBERT model)
bert – TFBertForTokenClassification (BERT model)
xlnet – TFXLNetForTokenClassification (XLNet model)
xlm – TFXLMForTokenClassification (XLM model)
electra – TFElectraForTokenClassification (ELECTRA model)
funnel – TFFunnelForTokenClassification (Funnel Transformer model)
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Parameters
pretrained_model_name_or_path – Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) – Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using save_pretrained() and is reloaded by supplying the save directory.
The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) – A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) – Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
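To make the output_loading_info description more concrete, the sketch below shows how missing and unexpected keys can be derived by comparing the weight names a model expects with the names found in a checkpoint. The function name, the dictionary keys and the weight names are all assumptions made for this illustration; the exact structure returned by the library may differ.

```python
from typing import Dict, Iterable, List


def toy_loading_info(expected_keys: Iterable[str], loaded_keys: Iterable[str]) -> Dict[str, List[str]]:
    """Compare expected weight names with those present in a checkpoint."""
    expected, loaded = set(expected_keys), set(loaded_keys)
    return {
        # Weights the model needs but the checkpoint does not provide
        # (typically a freshly initialized task head).
        "missing_keys": sorted(expected - loaded),
        # Weights in the checkpoint that the model has no use for
        # (typically the head of a different task).
        "unexpected_keys": sorted(loaded - expected),
    }


info = toy_loading_info(
    expected_keys=["bert.embeddings.weight", "classifier.weight", "classifier.bias"],
    loaded_keys=["bert.embeddings.weight", "cls.predictions.bias"],
)
print(info)
```

A non-empty list of missing keys is expected when a task head is added on top of a base checkpoint; those weights are newly initialized and need fine-tuning.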
Examples:
>>> from transformers import AutoConfig, TFAutoModelForTokenClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForTokenClassification.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = TFAutoModelForTokenClassification.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_json_file('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModelForTokenClassification.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)
TFAutoModelForQuestionAnswering
class transformers.TFAutoModelForQuestionAnswering
This is a generic model class that will be instantiated as one of the model classes of the library, with a question answering head, when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_config(config)
Instantiates one of the model classes of the library, with a question answering head, from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.
Parameters
config (PretrainedConfig) – The model class to instantiate is selected based on the configuration class:
ConvBertConfig configuration class: TFConvBertForQuestionAnswering (ConvBERT model)
DistilBertConfig configuration class: TFDistilBertForQuestionAnswering (DistilBERT model)
AlbertConfig configuration class: TFAlbertForQuestionAnswering (ALBERT model)
CamembertConfig configuration class: TFCamembertForQuestionAnswering (CamemBERT model)
XLMRobertaConfig configuration class: TFXLMRobertaForQuestionAnswering (XLM-RoBERTa model)
LongformerConfig configuration class: TFLongformerForQuestionAnswering (Longformer model)
RobertaConfig configuration class: TFRobertaForQuestionAnswering (RoBERTa model)
BertConfig configuration class: TFBertForQuestionAnswering (BERT model)
XLNetConfig configuration class: TFXLNetForQuestionAnsweringSimple (XLNet model)
MobileBertConfig configuration class: TFMobileBertForQuestionAnswering (MobileBERT model)
FlaubertConfig configuration class: TFFlaubertForQuestionAnsweringSimple (FlauBERT model)
XLMConfig configuration class: TFXLMForQuestionAnsweringSimple (XLM model)
ElectraConfig configuration class: TFElectraForQuestionAnswering (ELECTRA model)
FunnelConfig configuration class: TFFunnelForQuestionAnswering (Funnel Transformer model)
MPNetConfig configuration class: TFMPNetForQuestionAnswering (MPNet model)
Examples:
>>> from transformers import AutoConfig, TFAutoModelForQuestionAnswering
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModelForQuestionAnswering.from_config(config)
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library, with a question answering head, from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:
convbert – TFConvBertForQuestionAnswering (ConvBERT model)
mobilebert – TFMobileBertForQuestionAnswering (MobileBERT model)
distilbert – TFDistilBertForQuestionAnswering (DistilBERT model)
albert – TFAlbertForQuestionAnswering (ALBERT model)
camembert – TFCamembertForQuestionAnswering (CamemBERT model)
xlm-roberta – TFXLMRobertaForQuestionAnswering (XLM-RoBERTa model)
mpnet – TFMPNetForQuestionAnswering (MPNet model)
longformer – TFLongformerForQuestionAnswering (Longformer model)
roberta – TFRobertaForQuestionAnswering (RoBERTa model)
flaubert – TFFlaubertForQuestionAnsweringSimple (FlauBERT model)
bert – TFBertForQuestionAnswering (BERT model)
xlnet – TFXLNetForQuestionAnsweringSimple (XLNet model)
xlm – TFXLMForQuestionAnsweringSimple (XLM model)
electra – TFElectraForQuestionAnswering (ELECTRA model)
funnel – TFFunnelForQuestionAnswering (Funnel Transformer model)
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Parameters
pretrained_model_name_or_path – Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) – Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using save_pretrained() and is reloaded by supplying the save directory.
The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) – A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) – Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForQuestionAnswering
>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForQuestionAnswering.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = TFAutoModelForQuestionAnswering.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_json_file('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModelForQuestionAnswering.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)
FlaxAutoModel
class transformers.FlaxAutoModel
FlaxAutoModel is a generic model class that will be instantiated as one of the base model classes of the library when created with the FlaxAutoModel.from_pretrained(pretrained_model_name_or_path) or the FlaxAutoModel.from_config(config) class methods.
This class cannot be instantiated using __init__() (throws an error).
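The "factory only" design described here, where __init__() raises and class methods return instances of concrete classes, can be sketched in plain Python. ToyAutoModel, ToyBertModel and the dictionary-based config are hypothetical and exist only for this illustration; the error type is likewise an assumption.

```python
class ToyBertModel:
    """Hypothetical concrete model class."""

    def __init__(self, config: dict):
        self.config = config


class ToyAutoModel:
    """Hypothetical auto class: usable only through its class methods."""

    _registry = {"bert": ToyBertModel}

    def __init__(self):
        # Direct instantiation is refused; callers must use a factory method.
        raise EnvironmentError(
            "ToyAutoModel is designed to be instantiated using "
            "ToyAutoModel.from_config(config)."
        )

    @classmethod
    def from_config(cls, config: dict):
        # Returns an instance of the concrete class, never of ToyAutoModel itself.
        return cls._registry[config["model_type"]](config)


model = ToyAutoModel.from_config({"model_type": "bert"})
print(type(model).__name__)
```

Because the factory returns the concrete class, the resulting object exposes that class's full interface rather than a generic wrapper.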
classmethod from_config(config)
Instantiates one of the base model classes of the library from a configuration.
Parameters
config (PretrainedConfig) – The model class to instantiate is selected based on the configuration class:
Instance of the RoBERTa configuration class: FlaxRobertaModel (RoBERTa model)
Instance of the BERT configuration class: FlaxBertModel (BERT model)
Examples:
from transformers import BertConfig, FlaxAutoModel
# Download configuration from huggingface.co and cache.
config = BertConfig.from_pretrained('bert-base-uncased')
# Instantiate the model from the configuration (pretrained weights are not loaded).
model = FlaxAutoModel.from_config(config)
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiates one of the base model classes of the library from a pre-trained model configuration.
The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it's missing, falling back to using pattern matching on the pretrained_model_name_or_path string.
The base model class to instantiate is selected as the first pattern matching in the pretrained_model_name_or_path string (in the following order):
contains roberta: FlaxRobertaModel (RoBERTa model)
contains bert: FlaxBertModel (BERT model)
The model is set in evaluation mode by default using model.eval() (dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Parameters
pretrained_model_name_or_path – Either:
a string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
a path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
a path or URL to a PyTorch checkpoint file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument.
model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model's __init__ method.
config – (optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or
the model was saved using save_pretrained() and is reloaded by supplying the save directory, or
the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.
force_download – (optional) boolean, default False: Force to (re-)download the model weights and configuration files and override the cached versions if they exist.
resume_download – (optional) boolean, default False: Do not delete incompletely received files. Attempt to resume the download if such a file exists.
proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.
kwargs – (optional) Remaining dictionary of keyword arguments: These arguments will be passed to the configuration and the model.
Examples:
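from transformers import FlaxAutoModel
# Download model and configuration from huggingface.co and cache.
model = FlaxAutoModel.from_pretrained('bert-base-uncased')
# Load from a local directory, e.g. a model that was saved using `save_pretrained('./test/bert_model/')`.
model = FlaxAutoModel.from_pretrained('./test/bert_model/')
# Update configuration during loading.
model = FlaxAutoModel.from_pretrained('bert-base-uncased', output_attentions=True)
assert model.config.output_attentions == True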
-
classmethod