transformer_heads.model package
Submodules
transformer_heads.model.head module
This module defines a class MLPHead that represents a multi-layer perceptron (MLP) head for a transformer model. It also includes utility functions for saving and loading the state of the MLP head.
- Classes:
MLPHead: A class that represents a multi-layer perceptron (MLP) head for a transformer model.
- class transformer_heads.model.head.MLPHead(name: str, in_size, hidden_size, num_layers, output_activation: str, num_outputs: int = 1, output_bias: bool = False, trainable: bool = True)
Bases:
Module
A class that represents a multi-layer perceptron (MLP) head for a transformer model.
- Variables:
name (str) – The name of the MLP head.
trainable (bool) – Whether the MLP head is trainable.
lins (nn.ModuleList) – A list of linear layers in the MLP head.
hidden_activation (nn.ReLU) – The activation function for the hidden layers.
output_activation (nn.Module) – The activation function for the output layer.
requires_individual_saving (bool) – Whether the MLP head needs to be saved separately.
- forward(x) → FloatTensor
Performs a forward pass through the MLP head.
- Parameters:
x (torch.Tensor) – The input tensor.
- Returns:
The output tensor.
- Return type:
torch.FloatTensor
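The forward pass described above can be illustrated with a framework-free sketch. It is not the library's implementation: the helper names (`linear`, `relu`, `mlp_forward`) are hypothetical, the layers are bias-free for brevity, and the assumption is that ReLU is applied after every layer except the last, followed by the output activation.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Matrix = List[List[float]]  # one row of weights per output unit


def relu(v: Sequence[float]) -> Vector:
    """Element-wise ReLU, standing in for the hidden activation."""
    return [max(0.0, a) for a in v]


def linear(weights: Matrix, x: Sequence[float]) -> Vector:
    """Bias-free linear layer: one dot product per weight row."""
    return [sum(w * a for w, a in zip(row, x)) for row in weights]


def mlp_forward(
    layers: Sequence[Matrix],
    x: Sequence[float],
    output_activation: Callable[[Vector], Vector] = lambda v: v,
) -> Vector:
    """Apply each layer in order, with ReLU after every layer except the last."""
    h = list(x)
    for i, weights in enumerate(layers):
        h = linear(weights, h)
        if i < len(layers) - 1:  # hidden layers only (assumed behaviour)
            h = relu(h)
    return output_activation(h)


# Hypothetical sizes: in_size=2, hidden_size=2, num_outputs=1
layers = [
    [[1.0, -1.0], [0.5, 0.5]],  # input -> hidden
    [[2.0, 1.0]],               # hidden -> output
]
result = mlp_forward(layers, [3.0, 1.0])
```

In the real class the layers are held in `lins` (an `nn.ModuleList`) and operate on tensors; the plain lists here only make the order of operations visible.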
- classmethod from_head_config(head_config: HeadConfig) → MLPHead
Creates an MLP head from a head configuration.
- Parameters:
head_config (HeadConfig) – The head configuration.
- Returns:
The created MLP head.
- Return type:
MLPHead
- load_from_safetensors(folder)
Loads the state of the MLP head from a safetensors file.
- Parameters:
folder (str) – The folder where the file is located.
- save_to_safetensors(folder)
Saves the state of the MLP head to a safetensors file.
- Parameters:
folder (str) – The folder where the file will be saved.
- set_requires_grad(requires_grad)
Sets whether the parameters of the MLP head require gradients.
- Parameters:
requires_grad (bool) – Whether the parameters require gradients.
transformer_heads.model.model module
This module provides classes and functions for creating and manipulating transformer models with multiple heads.
- Classes:
HeadedModel: Abstract base class for models with multiple heads.
TransformerWithHeads: Transformer model with multiple heads.
- Functions:
get_headed_pretrained_model_class: Get a new model base class for a pretrained model with multiple heads.
get_multi_head_transformer: Patch a pretrained transformer model to add multiple heads.
- class transformer_heads.model.model.HeadedModel(config: PretrainedConfig, *inputs, **kwargs)
Bases:
ABC, PreTrainedModel
Abstract base class for models with multiple heads.
- Variables:
head_configs (List[HeadConfig]) – The configurations for the new heads.
vocab_size (int) – The size of the vocabulary.
heads (nn.ModuleDict) – The new heads of the model.
lm_head_config (Optional[HeadConfig]) – The configuration for the pretrained language model head.
lm_head (Optional[MLPHead]) – The pretrained language model head.
- head_configs: List[HeadConfig]
- heads: ModuleDict
- lm_head_config: HeadConfig | None
- vocab_size: int
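As a rough illustration of what a named collection of heads makes possible, the sketch below applies several heads to one shared hidden representation and returns one output per head name. This is an assumption about how such a mapping is typically used, not a description of `HeadedModel` itself; `run_heads` and the two toy heads are hypothetical.

```python
from typing import Callable, Dict, List

Vector = List[float]
Head = Callable[[Vector], Vector]


def run_heads(heads: Dict[str, Head], hidden: Vector) -> Dict[str, Vector]:
    """Apply every named head to the same hidden representation."""
    return {name: head(hidden) for name, head in heads.items()}


# Two toy heads keyed by name, mirroring the name -> module layout of a ModuleDict.
heads: Dict[str, Head] = {
    "mean": lambda h: [sum(h) / len(h)],
    "max": lambda h: [max(h)],
}
outputs = run_heads(heads, [1.0, 2.0, 6.0])
```

Keying outputs by head name keeps each head's result addressable independently of how many heads are attached.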
- transformer_heads.model.model.get_headed_pretrained_model_class(base_model_class: Type[PreTrainedModel])
Get a new model base class for a pretrained model with multiple heads.
- Parameters:
base_model_class (Type[PreTrainedModel]) – The base class of the pretrained model.
- Returns:
The new class supporting headed model configuration.
- Return type:
Type[PreTrainedModel]
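A function that takes a class and returns a new class is commonly written as a class factory. The sketch below shows that general pattern with plain Python stand-ins; `PretendPreTrainedModel` and `make_headed_class` are hypothetical names, and the library's actual function may build its class differently.

```python
from typing import List, Type


class PretendPreTrainedModel:
    """Hypothetical stand-in for a pretrained base class."""

    def __init__(self, config: dict):
        self.config = config


def make_headed_class(base_model_class: Type) -> Type:
    """Return a new subclass of base_model_class that also stores head configs."""

    class Headed(base_model_class):
        def __init__(self, config: dict, head_configs: List[dict]):
            super().__init__(config)          # keep the base class behaviour
            self.head_configs = head_configs  # extra state for the new heads

    # Give the generated class a readable name derived from its base.
    Headed.__name__ = f"Headed{base_model_class.__name__}"
    return Headed


HeadedPretend = make_headed_class(PretendPreTrainedModel)
model = HeadedPretend({"hidden_size": 8}, [{"name": "sentiment"}])
```

Because the result subclasses whatever base is passed in, the same factory works for any base class, which matches the `Type[PreTrainedModel]` to `Type[PreTrainedModel]` shape of the signature above.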
- transformer_heads.model.model.get_multi_head_transformer(base_model_class: Type[PreTrainedModel])
Patch a pretrained transformer model to add multiple heads.
- Parameters:
base_model_class (Type[PreTrainedModel]) – The base pretrained model class.
- Returns:
The new, patched class.
- Return type:
Type[PreTrainedModel]