Same sentence with different padding lengths produces different embeddings because of softmax precision #1326

@JJplane

Description

I use nn.Softmax(dim=-1) and find that the same logits produce different outputs depending on padding length:


a = [-3.6180e-01,  6.6926e-01,  1.2248e+01, -9.5795e-01]
b = [-3.6180e-01,  6.6926e-01,  1.2248e+01, -9.5795e-01, -9.5795e-01]

softmax(a) = [3.3403e-06, 9.3662e-06, 9.9999e-01, 1.8402e-06]
softmax(b) = [3.3403e-06, 9.3661e-06, 9.9998e-01, 1.8402e-06, 1.8402e-06]

The differing softmax results produce different sentence embeddings, and sometimes the embeddings differ by a lot. I tested with upstream transformers and could not reproduce the issue; it only appears in the version of transformers modified by our company. Any help is appreciated!
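For context, a discrepancy like the one above is expected from how softmax normalizes: each appended padding logit adds one more exp term to the shared denominator, so every shared position shifts slightly. A minimal pure-Python sketch (an illustration of the effect, not the reporter's actual model code) reproduces it with the numbers from the issue:

```python
import math

def softmax(xs):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

a = [-0.36180, 0.66926, 12.248, -0.95795]
b = a + [-0.95795]  # same logits plus one extra padding position

sa = softmax(a)
sb = softmax(b)

# The shared positions differ slightly: softmax(b)'s denominator contains
# the extra exp(-0.95795) term contributed by the padding logit.
diff = max(abs(x - y) for x, y in zip(sa, sb))
print(diff)  # small but nonzero
```

This shows the difference is inherent to including padding positions in the softmax, not a bug in the softmax kernel itself; the usual mitigation is to mask padding logits to a large negative value (e.g. -inf or -1e9) before the softmax so they contribute (approximately) nothing to the denominator.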
