graph LR
    labml_nn_transformers_gpt["labml_nn.transformers.gpt"]
    labml_nn_transformers_models["labml_nn.transformers.models"]
    labml_nn_transformers_mha["labml_nn.transformers.mha"]
    labml_nn_transformers_positional_encoding["labml_nn.transformers.positional_encoding"]
    labml_nn_transformers_rope["labml_nn.transformers.rope"]
    labml_nn_transformers_configs["labml_nn.transformers.configs"]
    labml_nn_transformers_label_smoothing_loss["labml_nn.transformers.label_smoothing_loss"]
    labml_nn_transformers_utils["labml_nn.transformers.utils"]
    labml_nn_transformers_gpt -- "uses" --> labml_nn_transformers_mha
    labml_nn_transformers_gpt -- "uses" --> labml_nn_transformers_positional_encoding
    labml_nn_transformers_gpt -- "uses" --> labml_nn_transformers_rope
    labml_nn_transformers_gpt -- "receives configurations from" --> labml_nn_transformers_configs
    labml_nn_transformers_gpt -- "utilizes" --> labml_nn_transformers_label_smoothing_loss
    labml_nn_transformers_gpt -- "relies on" --> labml_nn_transformers_utils
    labml_nn_transformers_models -- "uses" --> labml_nn_transformers_mha
    labml_nn_transformers_models -- "uses" --> labml_nn_transformers_positional_encoding
    labml_nn_transformers_models -- "uses" --> labml_nn_transformers_rope
    labml_nn_transformers_models -- "receives configurations from" --> labml_nn_transformers_configs
    labml_nn_transformers_models -- "relies on" --> labml_nn_transformers_utils
    labml_nn_transformers_mha -- "is a core building block for" --> labml_nn_transformers_gpt
    labml_nn_transformers_mha -- "is a core building block for" --> labml_nn_transformers_models
    labml_nn_transformers_mha -- "may use" --> labml_nn_transformers_utils
    labml_nn_transformers_positional_encoding -- "provides positional information to" --> labml_nn_transformers_gpt
    labml_nn_transformers_positional_encoding -- "provides positional information to" --> labml_nn_transformers_models
    labml_nn_transformers_rope -- "provides alternative positional embedding to" --> labml_nn_transformers_gpt
    labml_nn_transformers_rope -- "provides alternative positional embedding to" --> labml_nn_transformers_models
    labml_nn_transformers_configs -- "provides configuration settings to" --> labml_nn_transformers_gpt
    labml_nn_transformers_configs -- "provides configuration settings to" --> labml_nn_transformers_models
    labml_nn_transformers_utils -- "provides utility functions to" --> labml_nn_transformers_mha
    labml_nn_transformers_utils -- "provides utility functions to" --> labml_nn_transformers_gpt
    labml_nn_transformers_utils -- "provides utility functions to" --> labml_nn_transformers_models

Details

The Transformer Model Implementations subsystem is primarily encapsulated within the labml_nn.transformers package. This subsystem focuses on providing core building blocks and complete implementations of various transformer architectures.

labml_nn.transformers.gpt

Implements the Generative Pre-trained Transformer (GPT) architecture, an autoregressive, decoder-only model for sequence generation. It defines the overall model structure: the stacked layers, their attention mechanisms, and the forward pass.
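
To make "autoregressive" concrete, the following is a minimal sketch of greedy decoding in plain Python. It is not the code in `labml_nn.transformers.gpt`: the names `greedy_generate`, `next_token_logits`, and `toy_model` are hypothetical, and the toy model stands in for a trained network only so the loop can run.

```python
from typing import Callable, List, Sequence


def greedy_generate(
    next_token_logits: Callable[[Sequence[int]], Sequence[float]],
    prompt: Sequence[int],
    max_new_tokens: int,
) -> List[int]:
    """Autoregressive decoding: each new token is fed back in as input."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        # The model scores every vocabulary entry as the possible next token.
        logits = next_token_logits(tokens)
        # Greedy choice: take the highest-scoring token id.
        next_id = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(next_id)
    return tokens


def toy_model(tokens: Sequence[int], vocab_size: int = 5) -> List[float]:
    """Hypothetical stand-in for a GPT: favours (last token + 1) mod vocab_size."""
    target = (tokens[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


generated = greedy_generate(toy_model, prompt=[0], max_new_tokens=6)
```

A real model would replace `toy_model` with a forward pass over the whole prefix, and sampling strategies other than greedy selection are common.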

Related Classes/Methods:

labml_nn.transformers.models

Provides a generic, reusable framework for constructing transformer models, covering both encoder and decoder stacks. It serves as a flexible base for building different architectures by composing the core components.

Related Classes/Methods:

labml_nn.transformers.mha

Implements multi-head attention, the core mechanism that lets each position attend to the others through several representation subspaces in parallel. It computes attention scores per head and combines the heads' outputs into a single, richer representation.
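
As an illustration of the mechanism, here is a minimal sketch of multi-head scaled dot-product attention in plain Python. It is written for readability and is not the library's PyTorch implementation. The learned input and output projections are omitted, and all function names are assumptions made for this example.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Scaled dot-product attention for one head: softmax(q k^T / sqrt(d_k)) v."""
    d_k = len(q[0])
    out = []
    for q_i in q:
        scores = [sum(a * b for a, b in zip(q_i, k_j)) / math.sqrt(d_k) for k_j in k]
        weights = softmax(scores)
        out.append([sum(w * v_j[c] for w, v_j in zip(weights, v))
                    for c in range(len(v[0]))])
    return out


def split_heads(x: Matrix, n_heads: int) -> List[Matrix]:
    """Slice the feature dimension into n_heads equal chunks."""
    if len(x[0]) % n_heads:
        raise ValueError("feature size must be divisible by n_heads")
    d_head = len(x[0]) // n_heads
    return [[row[h * d_head:(h + 1) * d_head] for row in x] for h in range(n_heads)]


def multi_head_attention(q: Matrix, k: Matrix, v: Matrix, n_heads: int) -> Matrix:
    """Run each head independently, then concatenate the per-head outputs."""
    heads = [
        attention_head(qh, kh, vh)
        for qh, kh, vh in zip(split_heads(q, n_heads),
                              split_heads(k, n_heads),
                              split_heads(v, n_heads))
    ]
    return [[val for head in heads for val in head[i]] for i in range(len(q))]


x = [[1.0, 0.0, 0.0, 1.0],
     [0.0, 1.0, 1.0, 0.0],
     [1.0, 1.0, 0.0, 0.0]]
out = multi_head_attention(x, x, x, n_heads=2)  # self-attention: q = k = v
```

Each output row is a weighted average of value rows, computed separately within each head's slice of the features.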

Related Classes/Methods:

labml_nn.transformers.positional_encoding

Generates sinusoidal positional encodings and adds them to the token embeddings, injecting absolute position information into the input sequence. This is needed because self-attention on its own has no notion of token order: it is permutation-equivariant, so reordering the inputs merely reorders the outputs.
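
The sinusoidal scheme can be sketched as follows, using the standard formulation in which even feature indices hold a sine and odd indices hold a cosine of the same angle. This is a plain-Python illustration, and the function names are assumptions rather than the library's API.

```python
import math
from typing import List


def sinusoidal_encoding(seq_len: int, d_model: int) -> List[List[float]]:
    """Return a seq_len x d_model table of sinusoidal positional encodings."""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            # Wavelengths grow geometrically with the feature index.
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe


def add_positions(embeddings: List[List[float]], pe: List[List[float]]) -> List[List[float]]:
    """Add the encoding for each position to the matching token embedding."""
    return [[e + p for e, p in zip(e_row, p_row)]
            for e_row, p_row in zip(embeddings, pe)]


pe = sinusoidal_encoding(seq_len=4, d_model=6)
embedded = add_positions([[0.5] * 6 for _ in range(4)], pe)
```

Because the table depends only on `seq_len` and `d_model`, it can be computed once and reused for every batch.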

Related Classes/Methods:

labml_nn.transformers.rope

Implements Rotary Positional Embeddings (RoPE), an alternative to absolute positional encodings. Instead of adding a position vector to the embeddings, RoPE rotates query and key vectors by position-dependent angles, so attention scores depend on the relative offset between positions.
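
A minimal sketch of the rotation, assuming adjacent feature pairs are rotated together (implementations differ in how they pair features, and this is not necessarily the convention used in `labml_nn.transformers.rope`). The example checks the key property: the query-key dot product depends only on the offset between the two positions.

```python
import math
from typing import List


def rope(x: List[float], pos: int, base: float = 10000.0) -> List[float]:
    """Rotate each adjacent feature pair of x by an angle proportional to pos."""
    d = len(x)
    if d % 2:
        raise ValueError("feature size must be even")
    out = [0.0] * d
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)  # per-pair rotation frequency
        c, s = math.cos(theta), math.sin(theta)
        out[i] = x[i] * c - x[i + 1] * s
        out[i + 1] = x[i] * s + x[i + 1] * c
    return out


def dot(a: List[float], b: List[float]) -> float:
    return sum(p * q for p, q in zip(a, b))


q = [0.3, -1.2, 0.8, 0.5]
k = [1.0, 0.4, -0.7, 0.2]

# Both pairs of positions are 3 apart, so the attention scores should match.
score_a = dot(rope(q, 5), rope(k, 2))
score_b = dot(rope(q, 13), rope(k, 10))
```

The rotation is applied to queries and keys only; values are left unchanged.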

Related Classes/Methods:

labml_nn.transformers.configs

Defines and manages configuration settings for various transformer sub-components and models. It enables flexible and standardized parameterization of models and their building blocks.

Related Classes/Methods:

labml_nn.transformers.label_smoothing_loss

Provides label smoothing, a regularization technique commonly used to improve the generalization and calibration of deep learning models, particularly in sequence-to-sequence tasks. It computes a cross-entropy loss against a softened target distribution rather than a one-hot target.
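
One common formulation mixes the one-hot target with a uniform distribution over all classes. The sketch below assumes that variant; the library's own loss may distribute the smoothing mass differently (for example, by excluding padding tokens), and the function names here are illustrative.

```python
import math
from typing import List


def log_softmax(logits: List[float]) -> List[float]:
    """Numerically stable log-softmax."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def label_smoothing_loss(logits: List[float], target: int, smoothing: float = 0.1) -> float:
    """Cross-entropy against (1 - smoothing) * one_hot + smoothing * uniform."""
    n_classes = len(logits)
    log_probs = log_softmax(logits)
    loss = 0.0
    for i, lp in enumerate(log_probs):
        # Every class gets smoothing / n_classes; the true class keeps the rest.
        q = smoothing / n_classes + (1.0 - smoothing if i == target else 0.0)
        loss -= q * lp
    return loss


logits = [2.0, 0.5, -1.0]
plain = label_smoothing_loss(logits, target=0, smoothing=0.0)
smoothed = label_smoothing_loss(logits, target=0, smoothing=0.1)
```

With `smoothing=0.0` this reduces to ordinary cross-entropy. With smoothing enabled, a confident correct prediction is penalised slightly, which discourages over-confident logits.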

Related Classes/Methods:

labml_nn.transformers.utils

Offers general utility functions shared by the transformer implementations, such as mask generation and tensor manipulation helpers.
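
Mask generation is the easiest of these helpers to illustrate. The sketch below builds a causal (lower-triangular) mask so that a position can only attend to itself and earlier positions. The names `causal_mask` and `apply_mask` are hypothetical and not taken from `labml_nn.transformers.utils`.

```python
from typing import List


def causal_mask(seq_len: int) -> List[List[bool]]:
    """mask[i][j] is True when query position i may attend to key position j."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def apply_mask(scores: List[List[float]], mask: List[List[bool]]) -> List[List[float]]:
    """Set disallowed scores to -inf so softmax assigns them zero weight."""
    return [[s if keep else float("-inf") for s, keep in zip(s_row, m_row)]
            for s_row, m_row in zip(scores, mask)]


mask = causal_mask(3)
masked = apply_mask([[1.0, 2.0, 3.0] for _ in range(3)], mask)
```

Applying the mask before the softmax is what keeps a decoder from seeing future tokens during training.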

Related Classes/Methods: