use biased estimate of std in layernorm as in the original paper #119

Arunprakash-A · 2024-01-25T17:23:08Z

The original paper computes a biased estimate of the standard deviation, dividing by N rather than N - 1. By default, however, torch.Tensor.std() returns the unbiased estimate, so the biased one has to be requested explicitly with torch.Tensor.std(-1, unbiased=False). The class nn.LayerNorm() also uses the biased estimate. The difference is small for large dim, but following the definition given in the cited paper is more appropriate.
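To give a feel for how small the difference is at large dim, here is a minimal standard-library sketch. The helper name std_ratio and the sample dimensions are illustrative assumptions, not part of this PR. For any non-constant sample, the biased estimate equals the unbiased one scaled by sqrt((dim - 1) / dim), so the two converge as dim grows.

```python
import math
import random
import statistics


def std_ratio(dim, seed=0):
    """Return biased_std / unbiased_std for a random sample of length `dim`.

    `std_ratio` is a hypothetical helper used only for illustration.
    """
    rng = random.Random(seed)
    values = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    biased = statistics.pstdev(values)   # population std: divides by dim
    unbiased = statistics.stdev(values)  # sample std: divides by dim - 1
    return biased / unbiased


if __name__ == "__main__":
    for dim in (4, 64, 512):
        # The measured ratio should match the closed form sqrt((dim - 1) / dim).
        print(dim, std_ratio(dim), math.sqrt((dim - 1) / dim))
```

The ratio depends only on dim, not on the data, which is why the choice of estimator matters mostly for small feature dimensions.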

For PyTorch >= 2.0, the equivalent call is torch.Tensor.std(-1, correction=0).
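The following sketch, assuming PyTorch >= 2.0 is installed, checks that the two spellings agree and that a normalization built on the biased variance matches torch.nn.functional.layer_norm. The tensor shape, the eps value, and all variable names are illustrative assumptions, not code taken from this PR.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(2, 5, 8, dtype=torch.float64)  # illustrative shape, last dim = features
eps = 1e-5                                     # assumed epsilon for the comparison

# Default: unbiased estimate (divides by N - 1).
std_unbiased = x.std(-1, keepdim=True)

# Biased estimate (divides by N), in both spellings.
std_biased = x.std(-1, unbiased=False, keepdim=True)
std_biased_kw = x.std(-1, correction=0, keepdim=True)

# Manual normalization with the biased variance, without learnable scale or shift.
mean = x.mean(-1, keepdim=True)
var_biased = x.var(-1, unbiased=False, keepdim=True)
manual = (x - mean) / torch.sqrt(var_biased + eps)

# Built-in layer normalization over the last dimension.
reference = F.layer_norm(x, (x.shape[-1],), eps=eps)

if __name__ == "__main__":
    print(torch.allclose(std_biased, std_biased_kw))
    print(torch.allclose(manual, reference))
```

Note that the built-in adds eps to the variance inside the square root, so a hand-written layer that adds eps to the standard deviation instead will differ slightly from it even with the biased estimate.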

Arunprakash-A added 2 commits January 25, 2024 22:16

compute std in layernorm as in original paper

439bae7

computed biased std in layernorm as in the original paper

515560c
