2019 ifKakao Visit Report
- 2019 if Kakao was held over two days, from the 29th to the 30th.
- The 29th focused mainly on 'FE', and the 30th on 'data'.
- Earlier attention needs a source vector to infer relevance to the task. In an encoder-decoder architecture, the encoder output is the source vector, while the decoder output is the target task vector.
- Self-Attention embeds contextual information (each token's significance for the given task) into a matrix rather than a vector, on the assumption that one sentence can have more than one contextual weight vector. The earlier attention, by contrast, embeds contextual information into a single vector.
- Self-Attention helps the model attend to the parts of a sentence that matter for the target task, which relieves the LSTM of some of its long-term memorization burden, and it provides an attention matrix that can be visualized.
- This paper provided a clue toward solving the long-term dependency problem and toward developing self-attention, the Transformer, and BERT, the most popular model in 2019.
- It is undeniable that attention, which helps the decoder search for the relatively significant parts of the source, is a novel approach in itself compared with the previous one, which embeds the source sentence into a single fixed-length vector according to the distributional hypothesis.
- Ultimately, attention helped broaden the range of model architectures in NLP, expanding options that had been limited to the recurrent network family.
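
The vector-versus-matrix contrast in the notes above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the notes do not give formulas, so the additive scoring function, the form `softmax(W2 · tanh(W1 · Hᵀ))`, and every name and dimension here (`additive_attention`, `self_attentive_embedding`, `W_q`, `W_s`, `v`, `W1`, `W2`, `r`) are hypothetical, and the weights are random rather than trained.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def additive_attention(query, states, W_q, W_s, v):
    """Earlier attention (assumed additive form): a decoder query scores
    each encoder state, giving ONE weight vector and ONE context vector."""
    # states: (n, d), query: (d_q,), W_q: (d_a, d_q), W_s: (d_a, d), v: (d_a,)
    scores = np.tanh(states @ W_s.T + W_q @ query) @ v  # (n,)
    weights = softmax(scores)                            # (n,), sums to 1
    return weights @ states, weights                     # (d,), (n,)

def self_attentive_embedding(states, W1, W2):
    """Self-attention over one sentence (assumed form): r weight vectors are
    computed from the sentence alone, giving a MATRIX embedding."""
    # states: (n, d), W1: (d_a, d), W2: (r, d_a)
    A = softmax(W2 @ np.tanh(W1 @ states.T), axis=1)     # (r, n), rows sum to 1
    return A @ states, A                                 # (r, d), (r, n)

# Toy dimensions: 5 tokens, hidden size 8, 3 attention rows (all hypothetical).
rng = np.random.default_rng(0)
n, d, d_q, d_a, r = 5, 8, 6, 4, 3
H = rng.normal(size=(n, d))      # stand-in for LSTM hidden states
query = rng.normal(size=d_q)     # stand-in for a decoder state

context, weights = additive_attention(
    query, H,
    rng.normal(size=(d_a, d_q)), rng.normal(size=(d_a, d)), rng.normal(size=d_a),
)
M, A = self_attentive_embedding(
    H, rng.normal(size=(d_a, d)), rng.normal(size=(r, d_a))
)

print(context.shape, M.shape)  # (8,) (3, 8)
```

Each row of `A` is one distribution over the tokens, so plotting `A` as a heatmap is one way to obtain the visualization mentioned above. Note that `additive_attention` needs an external `query`, whereas `self_attentive_embedding` depends only on the sentence's own hidden states.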