
LINEARITY OF RELATION DECODING IN TRANSFORMER LMS

Evan Hernandez1*, Arnab Sen Sharma2*, Tal Haklay3, Kevin Meng1, Martin
Wattenberg4, Jacob Andreas1, Yonatan Belinkov3, David Bau2
1MIT CSAIL, 2Northeastern University, 3Technion - IIT; *Equal contribution


ArXiv
Preprint
Source Code

Dataset



HOW DO TRANSFORMER LMS DECODE RELATIONS?

Much of the knowledge contained in neural language models may be expressed in
terms of relations. For example, the fact that Miles Davis is a trumpet player
can be written as a relation (plays the instrument) connecting a subject (Miles
Davis) to an object (trumpet).

One might expect relation decoding in a language model to be a sequence of
complex, non-linear computations spanning multiple layers. However, in this
paper we show that for a subset of relations this (highly non-linear) decoding
procedure can be well-approximated by a single linear transformation (LRE) on
the subject representation s after some intermediate layer.

In an LM, a relation such as plays the instrument can be well-approximated by a
linear function R that maps the subject representation s to the object
representation o, which is then directly decoded.
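The idea can be sketched in a few lines of numpy. This is a toy illustration, not the paper's implementation: the dimensions, the parameters W and b, and the decoder matrix are all random stand-ins for quantities that would come from a real LM.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy hidden dimension (real LMs use thousands)

# Hypothetical LRE parameters for one relation, e.g. "plays the instrument".
W = rng.normal(size=(d, d))  # linear weight
b = rng.normal(size=d)       # bias

def lre(s):
    """Map a subject representation s to a predicted object representation."""
    return W @ s + b

# Toy decoder head: one row per vocabulary item; decoding picks the row
# most similar to the predicted object representation.
vocab = ["trumpet", "piano", "guitar"]
decoder = rng.normal(size=(len(vocab), d))

s = rng.normal(size=d)  # stand-in for the subject representation of "Miles Davis"
o_hat = lre(s)          # predicted object representation
predicted = vocab[int(np.argmax(decoder @ o_hat))]
```

The point of the sketch is the shape of the computation: a single matrix-vector product and bias replace the many transformer layers that would otherwise sit between the subject representation and the decoded object.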


HOW DO WE OBTAIN AN LRE APPROXIMATING RELATION DECODING?

A linear approximation of the form LRE(s) = Ws + b can be obtained by taking a
first-order Taylor series approximation of the LM computation, where W is the
local derivative (Jacobian) of the LM computation at some subject representation
s0. For a range of relations, we find that averaging LRE parameter estimates
from just 5 samples is enough to obtain a faithful approximation of LM
decoding.

Here F represents how the LM obtains the object representation o from the
subject representation s introduced within a textual context c. Please refer to
our paper for further details.
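The estimation procedure can be sketched numerically. Here F is a toy non-linear function standing in for the LM computation (the real method differentiates through a transformer with autograd rather than using finite differences), and the 5 sample subjects are random vectors. W is the averaged Jacobian and b the averaged residual F(s_i) - W_i s_i, as in the Taylor expansion above.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6

# Toy stand-in for F: the (non-linear) LM computation mapping a subject
# representation s to an object representation o. Not a real transformer.
A = rng.normal(size=(d, d))
def F(s):
    return np.tanh(A @ s)

def jacobian(f, s, eps=1e-5):
    """Numerical Jacobian of f at s (the paper uses autograd instead)."""
    J = np.zeros((d, d))
    for i in range(d):
        e = np.zeros(d)
        e[i] = eps
        J[:, i] = (f(s + e) - f(s - e)) / (2 * eps)
    return J

# Average first-order Taylor estimates over a handful of subjects:
#   W = mean_i dF/ds |_{s_i},   b = mean_i [ F(s_i) - W_i s_i ]
subjects = [rng.normal(size=d) for _ in range(5)]
Js = [jacobian(F, s) for s in subjects]
W = np.mean(Js, axis=0)
b = np.mean([F(s) - J @ s for s, J in zip(subjects, Js)], axis=0)

def lre(s):
    return W @ s + b

# How far the linear approximation drifts from F near a sample point.
s_new = subjects[0] + 0.01 * rng.normal(size=d)
err = np.linalg.norm(lre(s_new) - F(s_new))
```

The design choice worth noting is the averaging: a Jacobian taken at a single subject is only locally accurate, so pooling estimates across a few subjects yields a single W, b pair that is meant to serve the whole relation.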


HOW FAITHFUL IS THE LRE APPROXIMATION?

We evaluate the LRE approximations on a set of 47 relations spanning 4
categories: factual associations, commonsense knowledge, implicit biases, and
linguistic knowledge. We find that for almost half of the relations, the LRE
faithfully recovers subject-object mappings for a majority of the subjects in
the test set.
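A faithfulness check of this kind can be sketched as follows. Everything here is synthetic: lm_object stands in for what the LM itself would predict, and is hypothetically modeled as the LRE plus a small perturbation; the metric is simply the fraction of test subjects on which the LRE's decoded object agrees with the LM's.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_vocab, n_test = 8, 50, 20

decoder = rng.normal(size=(n_vocab, d))  # toy decoder head

def decode(h):
    """Decode a representation to its best-matching vocabulary index."""
    return int(np.argmax(decoder @ h))

# Toy stand-ins: lm_object(s) is what the LM actually predicts for subject s;
# lre(s) is the linear approximation. Here both are synthetic.
W = rng.normal(size=(d, d))
b = rng.normal(size=d)
def lre(s):
    return W @ s + b
def lm_object(s):
    return W @ s + b + 0.01 * rng.normal(size=d)  # LM = LRE plus small noise

# Faithfulness: how often the LRE's decoded object matches the LM's.
subjects = [rng.normal(size=d) for _ in range(n_test)]
faithfulness = np.mean(
    [decode(lre(s)) == decode(lm_object(s)) for s in subjects]
)
```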

We also identify a set of relations for which we could not find a good LRE
approximation. For most of these relations, the range consists of names of
people and companies. We suspect the range of these relations is so large that
the LM cannot encode it in a single state, and instead relies on a more
complex, non-linear decoding procedure.


ATTRIBUTE LENS

The Attribute Lens is motivated by the idea that a hidden state h may contain
pieces of information beyond the prediction of the immediate next token. An LRE
can be used to extract a particular attribute from h even without relevant
textual context. The figure shows an LRE approximating the relation country
capital applied to the hidden state h after different layers and at different
token positions.
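The lens amounts to applying one relation's LRE at every (layer, token) position and decoding the result. In this sketch the hidden states, LRE parameters, and decoder are all random placeholders for quantities that would come from an actual LM forward pass.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n_layers, n_tokens = 8, 4, 5

# Hypothetical LRE for the relation "country capital".
W = rng.normal(size=(d, d))
b = rng.normal(size=d)
vocab = ["Paris", "Berlin", "Rome"]
decoder = rng.normal(size=(len(vocab), d))

# Toy hidden states hidden[layer, token]; in practice these come from an
# LM forward pass over some input text.
hidden = rng.normal(size=(n_layers, n_tokens, d))

# Attribute lens: read the relation's object out of every hidden state,
# producing a layers-by-tokens grid of decoded attributes.
lens = [
    [vocab[int(np.argmax(decoder @ (W @ hidden[l, t] + b)))]
     for t in range(n_tokens)]
    for l in range(n_layers)
]
```

The resulting grid makes it visible where in the network, and at which token positions, the attribute becomes linearly readable.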


HOW TO CITE

This work is not yet peer-reviewed. The preprint can be cited as follows.


BIBLIOGRAPHY

Evan Hernandez, Arnab Sen Sharma, Tal Haklay, Kevin Meng, Martin Wattenberg,
Jacob Andreas, Yonatan Belinkov, and David Bau. "Linearity of Relation Decoding
in Transformer Language Models." arXiv preprint arXiv:2308.09124 (2023).


BIBTEX

@article{hernandez2023linearity,
    title={Linearity of Relation Decoding in Transformer Language Models}, 
    author={Evan Hernandez and Arnab Sen Sharma and Tal Haklay and Kevin Meng and Martin Wattenberg and Jacob Andreas and Yonatan Belinkov and David Bau},
    year={2023},
    eprint={2308.09124},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}



