Easily Implementing Data Batching in TensorFlow

건조젤리 2020. 1. 6. 15:00

Code

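import tensorflow as tf  # TensorFlow 1.x API; on 2.x these names live under tensorflow.compat.v1

# (1, 2) Placeholders, so the input data can be supplied at run time.
x_input = tf.placeholder(tf.float32, [None, 28*28])
y_input = tf.placeholder(tf.float32, [None, 10])

# (5) Slice the inputs into examples, repeat them indefinitely, and group them into batches of 100.
input_dataset = tf.data.Dataset.from_tensor_slices((x_input, y_input)).repeat().batch(100)
# (6) Iterator whose initializer op receives the placeholder values.
train_iterator = input_dataset.make_initializable_iterator()
# (7) Op that returns the next batch each time it is run.
next_batch_train = train_iterator.get_next()

# (10, 11) Open a session and feed the actual data once, when the iterator is initialized.
sess = tf.Session()
sess.run(train_iterator.initializer, feed_dict={x_input: ...(set input data),
                                                y_input: ...(set input data)})

# (14, 15) Each run returns a tuple (x batch, y batch) holding 100 examples.
for step in range(100):
    batch = sess.run(next_batch_train)
    ...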
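
To make the effect of chaining repeat() and batch() concrete, here is a minimal pure-Python sketch of the same ordering. It is not TensorFlow code: the helper name repeat_then_batch and the toy data are hypothetical, chosen only for illustration. It shows that, because repetition happens before batching, a batch can span the boundary between two passes over the data.

```python
from itertools import cycle, islice


def repeat_then_batch(examples, batch_size):
    """Yield fixed-size batches from an endlessly repeated sequence of examples."""
    if not examples:
        return  # nothing to repeat
    stream = cycle(examples)  # plays the role of .repeat()
    while True:
        # plays the role of .batch(batch_size)
        yield list(islice(stream, batch_size))


# Hypothetical toy data: 5 examples, batches of 2, first 4 batches only.
batches = list(islice(repeat_then_batch([0, 1, 2, 3, 4], 2), 4))
print(batches)  # [[0, 1], [2, 3], [4, 0], [1, 2]]
```

The third batch, [4, 0], mixes the end of one pass with the start of the next, which is also why every batch comes out at the full size.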

Explanation

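Define the x_input and y_input placeholders so that the input data can be assigned dynamically. (1, 2)
Convert the input data into a Dataset, then set repetition and the batch size. (5)
Create an initializable iterator, which provides the operation used to initialize the input data. (6)
Define the operation that fetches the next batch. (7)
Open a session, then assign the input data by running the iterator's initializer. (10, 11)
Each time next_batch_train is run, batch receives a pair of x_input and y_input slices of the configured batch size. (14, 15)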
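
The steps above can be tried end to end with stand-in data. The following is a minimal sketch, not the original author's code: the random arrays x_data and y_data are assumptions made up for illustration, and it assumes a TensorFlow installation that provides the tensorflow.compat.v1 module (so that the placeholder and Session style also runs on 2.x). It uses tf.data.make_initializable_iterator from that module in place of the Dataset method.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # placeholders and sessions need graph mode

# Hypothetical stand-in data (an assumption): 250 random "images" with one-hot labels.
num_examples = 250
rng = np.random.RandomState(0)
x_data = rng.rand(num_examples, 28 * 28).astype(np.float32)
y_data = np.eye(10, dtype=np.float32)[rng.randint(0, 10, size=num_examples)]

x_input = tf.placeholder(tf.float32, [None, 28 * 28])
y_input = tf.placeholder(tf.float32, [None, 10])

# Repeat first, then batch, so every batch holds exactly 100 examples.
input_dataset = tf.data.Dataset.from_tensor_slices((x_input, y_input)).repeat().batch(100)
train_iterator = tf.data.make_initializable_iterator(input_dataset)
next_batch_train = train_iterator.get_next()

batch_shapes = []
with tf.Session() as sess:
    # The real arrays are fed only once, when the iterator is initialized.
    sess.run(train_iterator.initializer, feed_dict={x_input: x_data, y_input: y_data})
    for step in range(5):
        x_batch, y_batch = sess.run(next_batch_train)
        batch_shapes.append((x_batch.shape, y_batch.shape))

print(batch_shapes[0])  # ((100, 784), (100, 10))
```

Although only 250 examples are fed, all five batches have 100 rows, because the repeated stream never runs out.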

A more detailed explanation, along with descriptions of the many other features of TensorFlow Dataset, can be found in the blog post below.

Source: "How to Use Dataset in TensorFlow" (cyc1am3n.github.io), https://cyc1am3n.github.io/2018/09/13/how-to-use-dataset-in-tensorflow.html