Theoretical Analysis of Positional Encodings in Transformer Models
(arxiv.org)
36 points | by
PaulHoule
1 day ago
1 comment
semiinfinitely
1 day ago
Kinda disappointing that RoPE, the most common PE, is given about one sentence in this work and omitted from the analysis.
gsf_emergency_2
22 hours ago
Maybe it's because RoPE by itself does nothing for model capacity?
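(RoPE adds no learned parameters: it's a fixed, position-dependent rotation of query/key pairs, so relative offsets show up in the dot products. A rough NumPy sketch, assuming the half-pairing layout; details vary across implementations:

  import numpy as np

  def rope(x, base=10000.0):
      # x: (seq_len, dim), dim even. Rotate each (x1, x2) pair by a
      # position-dependent angle -- no trainable weights anywhere.
      seq_len, dim = x.shape
      half = dim // 2
      freqs = base ** (-np.arange(half) / half)     # per-pair frequencies
      angles = np.outer(np.arange(seq_len), freqs)  # (seq_len, half)
      cos, sin = np.cos(angles), np.sin(angles)
      x1, x2 = x[:, :half], x[:, half:]
      return np.concatenate([x1 * cos - x2 * sin,
                             x1 * sin + x2 * cos], axis=-1)

Applied to queries and keys before attention, it changes what positions the scores can express, not how many parameters the model has.)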
semiinfinitely
6 hours ago
Neither does ALiBi.
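(ALiBi likewise adds zero parameters: it just biases the attention logits with a fixed linear penalty on distance. A minimal sketch, assuming the usual geometric per-head slope schedule, which is exact for power-of-two head counts:

  import numpy as np

  def alibi_bias(seq_len, num_heads):
      # Per-head slopes: 2^(-8i/n) for heads i = 1..n, as in the paper.
      slopes = 2.0 ** (-8.0 * np.arange(1, num_heads + 1) / num_heads)
      pos = np.arange(seq_len)
      dist = np.minimum(pos[None, :] - pos[:, None], 0)  # j - i, clipped
      return slopes[:, None, None] * dist                # (heads, L, L)

The bias is added to the logits before softmax; causal masking is handled separately.)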