[Deep Learning] Neural Networks Using Attention

Data Analyst KIM

김두연 2023. 10. 26. 15:29

Attention is a method that lets a model assign a different importance weight to each input element at every step.

The following figure shows how Attention is implemented.

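First, a new layer is inserted between the encoder and the decoder. This layer collects the scores computed from each encoder cell.

A softmax function is applied to these scores to produce the Attention weights.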
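The score-to-weight step can be sketched in a few lines of plain Python. This is only an illustration: the `softmax` helper and the four score values are made up for the example and do not come from the model in this post.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    # Subtracting the max keeps exp() numerically stable without changing the result.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for four input positions (made-up numbers).
scores = [2.0, 0.5, 0.1, -1.0]
weights = softmax(scores)
print([round(w, 3) for w in weights])
```

The input with the highest score receives the largest weight, and because the weights sum to 1 they can be read as how much each input matters for the current output.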

For example, the model learns that the most appropriate word for the position of '당신께' ("to you") is 'you'.

In this way, attention lets every output step make use of all the input values.
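To make "every output uses all inputs" concrete, here is a minimal sketch that assumes the attention weights have already been computed. The `context_vector` helper, the encoder states, and the weight values are all hypothetical and chosen only for illustration.

```python
def context_vector(weights, encoder_states):
    """Weighted sum of all encoder states for one output step."""
    dim = len(encoder_states[0])
    return [
        sum(w * state[d] for w, state in zip(weights, encoder_states))
        for d in range(dim)
    ]

# Three hypothetical encoder states (one per input word), 2 dimensions each.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# One row of attention weights per output step; each row sums to 1.
attention_weights = [
    [0.8, 0.1, 0.1],  # output step 1 looks mostly at input 1
    [0.1, 0.1, 0.8],  # output step 2 looks mostly at input 3
]

contexts = [context_vector(row, encoder_states) for row in attention_weights]
for step, ctx in enumerate(contexts, start=1):
    print(step, [round(c, 2) for c in ctx])
```

Each output step gets its own mixture of all encoder states, so no single state has to carry the whole input.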

This overcomes a weakness of the RNN, where all of the input is concentrated in the last cell.

Let's build a model that uses Attention.

# Install the attention package (notebook cell)
!pip install attention

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Embedding, LSTM
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing import sequence
from tensorflow.keras.callbacks import EarlyStopping
from attention import Attention

import numpy as np
import matplotlib.pyplot as plt

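# Load the data, already split into a training set and a test set
# (only the 5,000 most frequent words are kept)
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=5000)

# Pad or truncate every review to the same length (500 tokens)
X_train = sequence.pad_sequences(X_train, maxlen=500)
X_test = sequence.pad_sequences(X_test, maxlen=500)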

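# Define the model architecture
model = Sequential()
model.add(Embedding(5000, 500))  # vocabulary of 5,000 words, 500-dimensional embeddings
model.add(Dropout(0.5))
# return_sequences=True keeps the output of every time step,
# so the attention layer can weigh all of them
model.add(LSTM(64, return_sequences=True))
model.add(Attention())
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))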

# Set the compile options
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Configure early stopping
early_stopping_callback = EarlyStopping(monitor='val_loss', patience=3)

# Train the model (the test set doubles as the validation set here)
history = model.fit(X_train, y_train, batch_size=40, epochs=100,
                    validation_data=(X_test, y_test),
                    callbacks=[early_stopping_callback])

# Print the test accuracy
print("\n Test Accuracy: %.4f" % (model.evaluate(X_test, y_test)[1]))

# Store the losses on the training set and the test set
y_vloss = history.history['val_loss']
y_loss = history.history['loss']

# Plot both curves
x_len = np.arange(len(y_loss))
plt.plot(x_len, y_vloss, marker='.', c="red", label='Testset_loss')
plt.plot(x_len, y_loss, marker='.', c="blue", label='Trainset_loss')

# Add a grid and axis labels
plt.legend(loc='upper right')
plt.grid()
plt.xlabel('epoch')
plt.ylabel('loss')
plt.show()
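The `EarlyStopping(monitor='val_loss', patience=3)` callback is why training usually ends well before the 100 requested epochs. The idea can be sketched in plain Python as below. This is a simplified illustration, not the Keras implementation; the `stopping_epoch` function and the loss values are hypothetical.

```python
def stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Made-up validation losses: they improve for three epochs, then get worse.
val_losses = [0.40, 0.32, 0.30, 0.31, 0.33, 0.35, 0.36]
print(stopping_epoch(val_losses))
```

With these numbers the best loss occurs at epoch 3, and training stops at epoch 6 after three consecutive epochs without improvement.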

Accuracy = 88.08%
This is higher than the 84.54% of the model run earlier without attention.
