SubNorm

SubNorm (Attention)

Apply an additional layer normalization before the output projection.

PreNorm(                        # Pre-LN wrapper around the whole attention block
    Attention(
        dim=768,
        plugins=[
            SubNorm(dim=768)    # extra LayerNorm before the output projection
        ],
    ),
    dim=768,
)

This plugin implements Sub-LN from Foundation Transformers. Note that Sub-LN presumes Pre-LN rather than Post-LN, which is why the example above wraps the attention block in PreNorm.
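
For intuition, here is a minimal PyTorch sketch of the Sub-LN idea: on top of the usual Pre-LN on the block input, an extra LayerNorm is applied to the attention output immediately before the output projection. The class name SubLNAttention and its layout are illustrative assumptions, not torchmix's actual implementation.

import torch
import torch.nn.functional as F
from torch import nn

class SubLNAttention(nn.Module):
    """Self-attention with an extra LayerNorm (Sub-LN) before the output projection.

    Illustrative sketch only, not the torchmix implementation.
    """

    def __init__(self, dim: int = 768, heads: int = 12):
        super().__init__()
        self.heads = heads
        self.pre_norm = nn.LayerNorm(dim)   # standard Pre-LN on the block input
        self.qkv = nn.Linear(dim, dim * 3)
        self.sub_norm = nn.LayerNorm(dim)   # Sub-LN: normalize right before the projection
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, d = x.shape
        h = self.heads
        q, k, v = self.qkv(self.pre_norm(x)).chunk(3, dim=-1)
        # split heads: (batch, heads, tokens, head_dim)
        q, k, v = (t.view(b, n, h, d // h).transpose(1, 2) for t in (q, k, v))
        out = F.scaled_dot_product_attention(q, k, v)
        out = out.transpose(1, 2).reshape(b, n, d)
        # Sub-LN before the output projection, then the residual connection
        return x + self.proj(self.sub_norm(out))

x = torch.randn(2, 16, 768)
print(SubLNAttention()(x).shape)  # torch.Size([2, 16, 768])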
