transformer_heads.model package

Submodules

transformer_heads.model.head module

This module defines MLPHead, a multi-layer perceptron (MLP) head for a transformer model. The class also provides methods for saving its state to, and loading it from, a safetensors file.

Classes:

MLPHead – A class that represents a multi-layer perceptron (MLP) head for a transformer model.

class transformer_heads.model.head.MLPHead(name: str, in_size, hidden_size, num_layers, output_activation: str, num_outputs: int = 1, output_bias: bool = False, trainable: bool = True)

Bases: Module

A class that represents a multi-layer perceptron (MLP) head for a transformer model.

Variables:

name (str) – The name of the MLP head.
trainable (bool) – Whether the MLP head is trainable.
lins (nn.ModuleList) – A list of linear layers in the MLP head.
hidden_activation (nn.ReLU) – The activation function for the hidden layers.
output_activation (nn.Module) – The activation function for the output layer.
requires_individual_saving (bool) – Whether the MLP head needs to be saved separately.

forward(x) → FloatTensor

Performs a forward pass through the MLP head.

Parameters:: x (torch.Tensor) – The input tensor.
Returns:: The output tensor.
Return type:: torch.FloatTensor
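The forward pass described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the class name SketchMLPHead, the layer layout (num_layers - 1 hidden layers plus one output layer), the bias handling, and the use of nn.Identity as output activation are all assumptions made for the example.

```python
import torch
from torch import nn


class SketchMLPHead(nn.Module):
    """Illustrative stand-in for an MLP head (hypothetical, not the library class)."""

    def __init__(self, in_size: int, hidden_size: int, num_layers: int,
                 num_outputs: int = 1, output_bias: bool = False):
        super().__init__()
        # Assumed layout: (num_layers - 1) hidden layers, then one output layer.
        sizes = [in_size] + [hidden_size] * (num_layers - 1) + [num_outputs]
        self.lins = nn.ModuleList()
        for i in range(num_layers):
            is_last = i == num_layers - 1
            # Assumption: output_bias only controls the final layer's bias.
            bias = output_bias if is_last else True
            self.lins.append(nn.Linear(sizes[i], sizes[i + 1], bias=bias))
        self.hidden_activation = nn.ReLU()
        # Placeholder: the real head selects this from the output_activation string.
        self.output_activation = nn.Identity()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Hidden layers: linear map followed by ReLU.
        for lin in self.lins[:-1]:
            x = self.hidden_activation(lin(x))
        # Output layer: linear map followed by the output activation.
        return self.output_activation(self.lins[-1](x))


head = SketchMLPHead(in_size=16, hidden_size=32, num_layers=2, num_outputs=3)
out = head(torch.randn(4, 10, 16))  # input shaped (batch, sequence, in_size)
```

Because nn.Linear acts on the last dimension only, the head maps an input of shape (batch, sequence, in_size) to (batch, sequence, num_outputs).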

classmethod from_head_config(head_config: HeadConfig) → MLPHead

Creates an MLP head from a head configuration.

Parameters:: head_config (HeadConfig) – The head configuration.
Returns:: The created MLP head.
Return type:: MLPHead
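A classmethod constructor of this kind typically copies fields from the configuration object into the regular constructor. The sketch below shows the pattern with plain Python stand-ins. SketchHeadConfig and SketchHead are hypothetical names, and the config fields are assumed to mirror the MLPHead signature shown above; the real HeadConfig may carry other fields.

```python
from dataclasses import dataclass


@dataclass
class SketchHeadConfig:
    """Hypothetical config whose fields mirror the MLPHead constructor arguments."""
    name: str
    in_size: int
    hidden_size: int
    num_layers: int
    output_activation: str
    num_outputs: int = 1
    output_bias: bool = False
    trainable: bool = True


class SketchHead:
    """Stand-in for a head class; it only stores its constructor arguments."""

    def __init__(self, name, in_size, hidden_size, num_layers, output_activation,
                 num_outputs=1, output_bias=False, trainable=True):
        self.name = name
        self.in_size = in_size
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.output_activation = output_activation
        self.num_outputs = num_outputs
        self.output_bias = output_bias
        self.trainable = trainable

    @classmethod
    def from_head_config(cls, head_config: SketchHeadConfig) -> "SketchHead":
        # Pass each field explicitly so extra config fields would be ignored.
        return cls(
            name=head_config.name,
            in_size=head_config.in_size,
            hidden_size=head_config.hidden_size,
            num_layers=head_config.num_layers,
            output_activation=head_config.output_activation,
            num_outputs=head_config.num_outputs,
            output_bias=head_config.output_bias,
            trainable=head_config.trainable,
        )


config = SketchHeadConfig(name="sentiment", in_size=4096, hidden_size=1024,
                          num_layers=2, output_activation="linear", num_outputs=3)
sketch_head = SketchHead.from_head_config(config)
```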

load_from_safetensors(folder)

Loads the state of the MLP head from a safetensors file.

Parameters:: folder (str) – The folder where the file is located.

save_to_safetensors(folder)

Saves the state of the MLP head to a safetensors file.

Parameters:: folder (str) – The folder where the file will be saved.

set_requires_grad(requires_grad)

Sets whether the parameters of the MLP head require gradients.

Parameters:: requires_grad (bool) – Whether the parameters require gradients.
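In PyTorch, freezing or unfreezing a module usually means toggling requires_grad on each of its parameters. A minimal sketch of that idea, written as a free function on an arbitrary module rather than as the library's method:

```python
from torch import nn


def set_requires_grad(module: nn.Module, requires_grad: bool) -> None:
    """Enable or disable gradient tracking for every parameter of the module."""
    for param in module.parameters():
        param.requires_grad = requires_grad


# Stand-in for an MLP head with two linear layers.
head = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 1))

set_requires_grad(head, False)
frozen = all(not p.requires_grad for p in head.parameters())

set_requires_grad(head, True)
unfrozen = all(p.requires_grad for p in head.parameters())
```

Frozen parameters receive no gradients during the backward pass, so an optimizer leaves them unchanged.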

transformer_heads.model.model module

This module provides classes and functions for creating and manipulating transformer models with multiple heads.

Classes:

HeadedModel – Abstract base class for models with multiple heads.
TransformerWithHeads – Transformer model with multiple heads.

Functions:

get_headed_pretrained_model_class – Get a new model base class for a pretrained model with multiple heads.
get_multi_head_transformer – Patch a pretrained transformer model to add multiple heads.

class transformer_heads.model.model.HeadedModel(config: PretrainedConfig, *inputs, **kwargs)

Bases: ABC, PreTrainedModel

Abstract base class for models with multiple heads.

Variables:

head_configs (List[HeadConfig]) – The configurations for the new heads.
vocab_size (int) – The size of the vocabulary.
heads (nn.ModuleDict) – The new heads of the model.
lm_head_config (Optional[HeadConfig]) – The configuration for the pretrained language model head.
lm_head (Optional[MLPHead]) – The pretrained language model head.

head_configs: List[HeadConfig]

heads: ModuleDict

lm_head: MLPHead | None

lm_head_config: HeadConfig | None

vocab_size: int

transformer_heads.model.model.get_headed_pretrained_model_class(base_model_class: Type[PreTrainedModel])

Get a new model base class for a pretrained model with multiple heads.

Parameters:: base_model_class (Type[PreTrainedModel]) – The base class of the pretrained model.
Returns:: The new class supporting headed model configuration.
Return type:: Type[PreTrainedModel]

transformer_heads.model.model.get_multi_head_transformer(base_model_class: Type[PreTrainedModel])

Patch a pretrained transformer model to add multiple heads.

Parameters:: base_model_class (Type[PreTrainedModel]) – The base pretrained model class.
Returns:: The new, patched class.
Return type:: Type[PreTrainedModel]
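Both functions above take a model class and return a new class. One common way to do this in Python is to build a subclass dynamically with type(). The sketch below shows only that general pattern with dependency-free stand-ins; SketchPreTrainedModel, SketchHeadedMixin, and make_headed_class are hypothetical names, and the library's actual patching logic may differ.

```python
class SketchPreTrainedModel:
    """Stand-in for a pretrained model base class (keeps the sketch dependency-free)."""

    def __init__(self, config):
        self.config = config


class SketchHeadedMixin:
    """Hypothetical mixin that adds head bookkeeping on top of a base model."""

    def __init__(self, config, head_configs=()):
        # Cooperative init: the next class in the MRO is the wrapped base model.
        super().__init__(config)
        self.head_configs = list(head_configs)


def make_headed_class(base_model_class):
    """Return a new class combining the mixin with the given base class."""
    name = f"{base_model_class.__name__}WithHeads"
    return type(name, (SketchHeadedMixin, base_model_class), {})


HeadedCls = make_headed_class(SketchPreTrainedModel)
model = HeadedCls(config={"hidden_size": 16}, head_configs=["head_a", "head_b"])
```

The returned class remains a subclass of the original base class, so code that checks isinstance against the base continues to work.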
