transformer_heads.model package

Submodules

transformer_heads.model.head module

This module defines MLPHead, a multi-layer perceptron (MLP) head that can be attached to a transformer model. The class also provides methods for saving its state to, and loading it from, a safetensors file.

Classes:

MLPHead: A class that represents a multi-layer perceptron (MLP) head for a transformer model.

class transformer_heads.model.head.MLPHead(name: str, in_size, hidden_size, num_layers, output_activation: str, num_outputs: int = 1, output_bias: bool = False, trainable: bool = True)

Bases: Module

A class that represents a multi-layer perceptron (MLP) head for a transformer model.

Variables:
  • name (str) – The name of the MLP head.

  • trainable (bool) – Whether the MLP head is trainable.

  • lins (nn.ModuleList) – A list of linear layers in the MLP head.

  • hidden_activation (nn.ReLU) – The activation function for the hidden layers.

  • output_activation (nn.Module) – The activation function for the output layer.

  • requires_individual_saving (bool) – Whether the MLP head needs to be saved separately.

forward(x) → FloatTensor

Performs a forward pass through the MLP head.

Parameters:

x (torch.Tensor) – The input tensor.

Returns:

The output tensor.

Return type:

torch.FloatTensor
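The forward pass described above can be pictured with a dependency-free sketch. It assumes that num_layers counts linear layers, that ReLU (the documented hidden_activation) follows every layer except the last, and that output_activation is applied to the final layer's output. ToyMLPHead, linear, relu, and sigmoid are hypothetical names, and plain Python lists stand in for torch tensors; the real MLPHead may size or order its layers differently.

```python
import math
import random
from typing import Callable, List, Tuple

Vector = List[float]
Matrix = List[Vector]


def linear(x: Vector, weight: Matrix, bias: Vector) -> Vector:
    """Compute weight @ x + bias for a single input vector."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


def sigmoid(x: Vector) -> Vector:
    return [1.0 / (1.0 + math.exp(-v)) for v in x]


class ToyMLPHead:
    """Hypothetical stand-in for MLPHead, not the library's implementation."""

    def __init__(
        self,
        in_size: int,
        hidden_size: int,
        num_layers: int,
        num_outputs: int = 1,
        output_activation: Callable[[Vector], Vector] = sigmoid,
        seed: int = 0,
    ) -> None:
        rng = random.Random(seed)
        # Assumption: in_size -> hidden_size -> ... -> num_outputs,
        # with num_layers linear layers in total.
        sizes = [in_size] + [hidden_size] * (num_layers - 1) + [num_outputs]
        self.lins: List[Tuple[Matrix, Vector]] = [
            (
                [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
                [0.0] * n_out,
            )
            for n_in, n_out in zip(sizes[:-1], sizes[1:])
        ]
        self.output_activation = output_activation

    def forward(self, x: Vector) -> Vector:
        # Hidden layers: linear transform followed by ReLU.
        for weight, bias in self.lins[:-1]:
            x = relu(linear(x, weight, bias))
        # Output layer: linear transform followed by the output activation.
        weight, bias = self.lins[-1]
        return self.output_activation(linear(x, weight, bias))


head = ToyMLPHead(in_size=4, hidden_size=8, num_layers=3, num_outputs=2)
out = head.forward([0.1, -0.2, 0.3, 0.4])
```

Here out is a list with one value per output unit, each squashed into the open interval (0, 1) by the sigmoid chosen for this sketch.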

classmethod from_head_config(head_config: HeadConfig) → MLPHead

Creates an MLP head from a head configuration.

Parameters:

head_config (HeadConfig) – The head configuration.

Returns:

The created MLP head.

Return type:

MLPHead
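The pattern behind from_head_config, an alternative constructor that unpacks a configuration object, can be sketched as follows. ToyHeadConfig and ToyHead are hypothetical; the config fields are assumed to mirror the MLPHead constructor arguments shown above, which the real HeadConfig may not do one-to-one.

```python
from dataclasses import asdict, dataclass


@dataclass
class ToyHeadConfig:
    # Hypothetical fields, chosen to mirror the MLPHead constructor arguments.
    name: str
    in_size: int
    hidden_size: int
    num_layers: int
    output_activation: str = "linear"
    num_outputs: int = 1
    output_bias: bool = False
    trainable: bool = True


class ToyHead:
    """Hypothetical stand-in for MLPHead that only stores its settings."""

    def __init__(
        self,
        name: str,
        in_size: int,
        hidden_size: int,
        num_layers: int,
        output_activation: str,
        num_outputs: int = 1,
        output_bias: bool = False,
        trainable: bool = True,
    ) -> None:
        self.name = name
        self.in_size = in_size
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.output_activation = output_activation
        self.num_outputs = num_outputs
        self.output_bias = output_bias
        self.trainable = trainable

    @classmethod
    def from_head_config(cls, head_config: ToyHeadConfig) -> "ToyHead":
        # Forward every config field to the regular constructor.
        return cls(**asdict(head_config))


config = ToyHeadConfig(name="sentiment", in_size=16, hidden_size=32, num_layers=2)
head = ToyHead.from_head_config(config)
```

Keeping construction behind a classmethod means callers only need to pass the configuration object around, and the head decides which fields it reads.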

load_from_safetensors(folder)

Loads the state of the MLP head from a safetensors file.

Parameters:

folder (str) – The folder where the file is located.

save_to_safetensors(folder)

Saves the state of the MLP head to a safetensors file.

Parameters:

folder (str) – The folder where the file will be saved.

set_requires_grad(requires_grad)

Sets whether the parameters of the MLP head require gradients.

Parameters:

requires_grad (bool) – Whether the parameters require gradients.

transformer_heads.model.model module

This module provides classes and functions for creating and manipulating transformer models with multiple heads.

Classes:

HeadedModel: Abstract base class for models with multiple heads.

TransformerWithHeads: Transformer model with multiple heads.

Functions:

get_headed_pretrained_model_class: Get a new model base class for a pretrained model with multiple heads.

get_multi_head_transformer: Patch a pretrained transformer model to add multiple heads.

class transformer_heads.model.model.HeadedModel(config: PretrainedConfig, *inputs, **kwargs)

Bases: ABC, PreTrainedModel

Abstract base class for models with multiple heads.

Variables:
  • head_configs (List[HeadConfig]) – The configurations for the new heads.

  • vocab_size (int) – The size of the vocabulary.

  • heads (nn.ModuleDict) – The new heads of the model.

  • lm_head_config (Optional[HeadConfig]) – The configuration for the pretrained language model head.

  • lm_head (Optional[MLPHead]) – The pretrained language model head.

head_configs: List[HeadConfig]
heads: ModuleDict
lm_head: MLPHead | None
lm_head_config: HeadConfig | None
vocab_size: int

transformer_heads.model.model.get_headed_pretrained_model_class(base_model_class: Type[PreTrainedModel])

Get a new model base class for a pretrained model with multiple heads.

Parameters:

base_model_class (Type[PreTrainedModel]) – The base class of the pretrained model.

Returns:

The new class supporting headed model configuration.

Return type:

Type[PreTrainedModel]
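A function that takes a class and returns a new class is usually written as a class factory that subclasses its argument. The sketch below illustrates that general technique only. ToyPretrainedModel, get_headed_class, the dictionary-based head configs, and the generated class name are all hypothetical, and the library's actual function may build its class differently.

```python
from typing import Dict, List, Type


class ToyPretrainedModel:
    """Hypothetical stand-in for a PreTrainedModel subclass."""

    def __init__(self, config: dict) -> None:
        self.config = config


def get_headed_class(base_model_class: Type) -> Type:
    """Return a subclass of base_model_class that also tracks head configs."""

    class Headed(base_model_class):
        def __init__(self, config: dict, head_configs: List[Dict[str, str]]) -> None:
            super().__init__(config)
            self.head_configs = list(head_configs)
            # Assumption: heads are looked up by name, similar to an nn.ModuleDict.
            self.heads = {hc["name"]: hc for hc in head_configs}

    # Give the generated class a readable name for debugging output.
    Headed.__name__ = f"Headed{base_model_class.__name__}"
    return Headed


HeadedToy = get_headed_class(ToyPretrainedModel)
model = HeadedToy({"hidden_size": 16}, [{"name": "sentiment"}, {"name": "toxicity"}])
```

Because the returned class inherits from the class passed in, everything defined on the base model remains available, and the head bookkeeping is layered on top.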

transformer_heads.model.model.get_multi_head_transformer(base_model_class: Type[PreTrainedModel])

Patch a pretrained transformer model to add multiple heads.

Parameters:

base_model_class (Type[PreTrainedModel]) – The base pretrained model class.

Returns:

The new, patched class.

Return type:

Type[PreTrainedModel]

Module contents