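What does TensorFlow's MNIST data actually look like?

What does the MNIST data used in the TensorFlow tutorial actually look like? The source code is here.

By default the pixel values come out not as [0,0,0,128,255…] but as something like [0.,0.,0.9..]; apparently the loader does this for you when dtype is tf.float32. The other option is tf.uint8, so let's check the output for each.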

import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data
mnist32 = input_data.read_data_sets('.\mnist', one_hot=True, dtype=tf.float32)
mnist8 = input_data.read_data_sets('.\mnist', one_hot=True, dtype=tf.uint8)

x32, t32 = mnist32.train.next_batch(1)
x8, t8 = mnist8.train.next_batch(1)

print(x32)
print(t32)
print(x8)
print(t8)

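As expected, x32 holds values between 0 and 1, in an array of 1 row by 784 columns. x8 holds values between 0 and 255, with the same 1 x 784 shape.

t32 looks like 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. Since next_batch is not normally used to pull out a single sample, the result is an array nested inside an array. Counting from 0, the 1 sits at index 7, so the label says the correct answer is 7. Now let's display the image.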
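
Reading a one-hot row by eye is easy to get wrong by one, so here is a minimal sketch of the decoding step in plain Python. The helper name decode_one_hot is made up for this illustration; on the real NumPy arrays, np.argmax(t32, axis=1) does the same job.

```python
def decode_one_hot(labels):
    """Return, for each one-hot row, the index that holds the 1."""
    return [row.index(max(row)) for row in labels]

# The label printed above, as a batch that holds a single row.
t = [[0., 0., 0., 0., 0., 0., 0., 1., 0., 0.]]
print(decode_one_hot(t))  # [7]
```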

from PIL import Image
Image.fromarray(x8.reshape(28, 28)).show()

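The image that came up made me go "is that really a 7?". The first samples were 7, 3, 4, 6, 1, in that order. The images for each are below.

Let's check the labels too, just to be sure.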

[[ 0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]
 [ 0.  0.  0.  1.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  1.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  1.  0.  0.  0.]
 [ 0.  1.  0.  0.  0.  0.  0.  0.  0.  0.]]

They match: 7, 3, 4, 6, 1.

Next question: is x32 just x8 divided by 255? A pixel value of 255 should map to 1, so I suspect that's all there is to it. The code looks like this.

if dtype == tf.float32:
    # Convert from [0, 255] -> [0.0, 1.0].
    images = images.astype(numpy.float32)
    images = numpy.multiply(images, 1.0 / 255.0)

Looks like it. It casts to float32 and then multiplies by 1/255. Let's check the numbers to be sure.

tmp = []
for i in range(784):
    if x8[0][i] != 0:
        tmp.append([i, x8[0][i], '{0:.2f}'.format(x32[0][i]), '{0:.2f}'.format(x8[0][i] / 255)])
print(tmp)

The result is below, and the two do indeed agree.

[[207, 97, '0.38', '0.38'], [208, 96, '0.38', '0.38'], [209, 77, '0.30', '0.30'],
 [210, 118, '0.46', '0.46'], [211, 61, '0.24', '0.24'], [227, 90, '0.35', '0.35'],
 [228, 138, '0.54', '0.54'], [229, 235, '0.92', '0.92'], [230, 235, '0.92', '0.92'],
 [231, 235, '0.92', '0.92'], [232, 235, '0.92', '0.92'], [233, 235, '0.92', '0.92'],
 [234, 235, '0.92', '0.92'], [235, 251, '0.98', '0.98'], [236, 251, '0.98', '0.98'],
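
Comparing two-decimal strings only checks the first two digits. A slightly stricter check is sketched below in plain Python, using a few of the pixel values from the output above. It approximates the loader's float32 arithmetic by rounding through struct; the helper names to_float32 and normalize are made up for this illustration.

```python
import math
import struct

def to_float32(v):
    # Round a Python float (float64) to the nearest float32 value.
    return struct.unpack("f", struct.pack("f", v))[0]

def normalize(pixel):
    # Approximates the loader: cast to float32, then multiply by 1/255.
    return to_float32(to_float32(pixel) * to_float32(1.0 / 255.0))

pixels = [97, 96, 77, 118, 61, 235, 251]  # values taken from the output above
for p in pixels:
    # float32 rounding means the values agree closely, not bit for bit.
    assert math.isclose(normalize(p), p / 255, rel_tol=1e-6)

print(["{0:.2f}".format(normalize(p)) for p in pixels])
# ['0.38', '0.38', '0.30', '0.46', '0.24', '0.92', '0.98']
```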

What about the test data?

tx32 = mnist32.test.images
tx8 = mnist8.test.images
print(tx32.shape)
print(tx8.shape)

Result

(10000, 784)
(10000, 784)

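tx8 held values from 0 to 255 and tx32 values from 0 to 1, so the dtype setting is reflected properly. So you get 10,000 test images by default.

Well, everything above is just what I expected, so why doesn't the following work?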

import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('.\mnist', one_hot=True, dtype=tf.uint8)

x = tf.placeholder(tf.float32, [None, 784])
w = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
f = tf.matmul(x, w) + b
y = tf.nn.softmax(f)
t = tf.placeholder(tf.float32, [None, 10])
loss = -tf.reduce_sum(t * tf.log(y))
train_step = tf.train.AdamOptimizer().minimize(loss)
correct = tf.equal(tf.argmax(y, 1), tf.argmax(t, 1))
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    i = 0
    for _ in range(1500):
        i += 1
        batch_x, batch_t = mnist.train.next_batch(100)
        sess.run(train_step, feed_dict={x: batch_x, t: batch_t})
        if i % 100 == 0:
            loss_val, acc_val = sess.run([loss, accuracy], feed_dict={x: mnist.test.images, t: mnist.test.labels})
            print ('Step: %d, Loss: %f, Accuracy: %f' % (i, loss_val, acc_val))

The result is as follows.

Step: 100, Loss: nan, Accuracy: 0.098000
Step: 200, Loss: nan, Accuracy: 0.098000
Step: 300, Loss: nan, Accuracy: 0.098000
Step: 400, Loss: nan, Accuracy: 0.098000
Step: 500, Loss: nan, Accuracy: 0.098000
Step: 600, Loss: nan, Accuracy: 0.098000
Step: 700, Loss: nan, Accuracy: 0.098000
Step: 800, Loss: nan, Accuracy: 0.098000
Step: 900, Loss: nan, Accuracy: 0.098000
Step: 1000, Loss: nan, Accuracy: 0.098000
Step: 1100, Loss: nan, Accuracy: 0.098000
Step: 1200, Loss: nan, Accuracy: 0.098000
Step: 1300, Loss: nan, Accuracy: 0.098000
Step: 1400, Loss: nan, Accuracy: 0.098000
Step: 1500, Loss: nan, Accuracy: 0.098000

Changing mnist = input_data.read_data_sets('.\mnist', one_hot=True, dtype=tf.uint8) to mnist = input_data.read_data_sets('.\mnist', one_hot=True, dtype=tf.float32) gives this:

Step: 100, Loss: 7747.070312, Accuracy: 0.848400
Step: 200, Loss: 5439.358887, Accuracy: 0.879900
Step: 300, Loss: 4556.464355, Accuracy: 0.890900
Step: 400, Loss: 4132.033203, Accuracy: 0.896100
Step: 500, Loss: 3836.137207, Accuracy: 0.902600
Step: 600, Loss: 3662.451416, Accuracy: 0.903300
Step: 700, Loss: 3505.859619, Accuracy: 0.908400
Step: 800, Loss: 3415.246338, Accuracy: 0.909000
Step: 900, Loss: 3291.936035, Accuracy: 0.913200
Step: 1000, Loss: 3229.531738, Accuracy: 0.912900
Step: 1100, Loss: 3156.712646, Accuracy: 0.913200
Step: 1200, Loss: 3110.640625, Accuracy: 0.915800
Step: 1300, Loss: 3053.396973, Accuracy: 0.917000
Step: 1400, Loss: 3032.461914, Accuracy: 0.915400
Step: 1500, Loss: 2984.136719, Accuracy: 0.917200

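With uint8, x for the test images is properly 0-255, but w and b end up as nan. As a result f, y, and loss are all nan too. Why can't w and friends be obtained when MNIST is loaded as uint8?

I tried running just 5 iterations, like this.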

mnist = input_data.read_data_sets('.\mnist', one_hot=True, dtype=tf.uint8)

x = tf.placeholder(tf.float32, [None, 784])
w = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
f = tf.matmul(x, w) + b
y = tf.nn.softmax(f)
t = tf.placeholder(tf.float32, [None, 10])
loss = -tf.reduce_sum(t * tf.log(y))
train_step = tf.train.AdamOptimizer().minimize(loss)
correct = tf.equal(tf.argmax(y, 1), tf.argmax(t, 1))
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(5):
        bx, bt = mnist.train.next_batch(100)
        fd = {x:bx, t:bt}
        now_w, now_b, _ = sess.run([w, b, train_step], feed_dict=fd)
        loss_val, acc_val = sess.run([loss, accuracy], feed_dict=fd)
        print(now_w)
        print(now_b)
        print(loss_val)
        print(acc_val)

Here is the result.

[[ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 ...,
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
343.416
0.65
[[ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 ...,
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]]
[ 0.001       0.001      -0.001       0.001       0.001      -0.001       0.001
 -0.00036121  0.001      -0.001     ]
245.757
0.66
[[ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 ...,
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]]
[ 0.0003892   0.00182066 -0.0005059   0.0005982   0.00195108 -0.0005164
  0.00037463  0.00038293  0.00030813 -0.00045584]
476.496
0.54
[[ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 ...,
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]]
[  2.22086237e-04   2.57444754e-03   1.69790001e-04  -8.80775624e-05
   1.40353921e-03   1.97276066e-04   1.13845745e-04   8.85801390e-04
   1.59423871e-04   2.26144330e-05]
nan
0.13
[[ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]
 ...,
 [ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan  nan]
nan
0.1

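So it is not that initializing Variables such as w failed; the values turned into nan partway through the computation. The tutorial mentioned that you get nan when training diverges, so I figured that was it. Even so, why does it diverge?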
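
One plausible reason, sketched below in plain Python: with the same weights, feeding 0-255 instead of 0-1 makes every logit roughly 255 times larger, so the gaps between logits become big enough for softmax to return exactly 0 for some classes in float32. The logits are invented for illustration, and float32 is emulated with struct because plain Python floats are float64.

```python
import math
import struct

def to_float32(v):
    # Round a float64 to the nearest float32; tiny values underflow to 0.0.
    return struct.unpack("f", struct.pack("f", v))[0]

def softmax32(logits):
    # Softmax with the usual max-subtraction, rounded to float32 at each step.
    m = max(logits)
    exps = [to_float32(math.exp(v - m)) for v in logits]
    total = to_float32(sum(exps))
    return [to_float32(e / total) for e in exps]

logits_01 = [0.6, 0.1, -0.4]                # hypothetical logits for 0-1 inputs
logits_255 = [v * 255 for v in logits_01]   # same weights, 0-255 inputs

print(softmax32(logits_01))    # every class keeps a non-zero probability
print(softmax32(logits_255))   # the two smaller classes underflow to exactly 0.0
```

Once a probability is exactly 0, taking its log is where the trouble starts.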

Reference: http://stackoverflow.com/questions/33712178/tensorflow-nan-bug

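What was needed was a guard against the log(0) in the cross-entropy returning -inf, just as np.log(0) does. That looks like exactly what is happening here: presumably, with inputs in the 0-255 range, softmax outputs exactly 0 for some classes, and 0 * -inf inside t * tf.log(y) is nan. clip_by_value clamps each value into the range between its second and third arguments.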
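
The mechanism can be reproduced without TensorFlow. Below is a minimal sketch in plain Python; ieee_log and cross_entropy are hypothetical helpers written for this illustration, and ieee_log exists only because math.log(0.0) raises an exception instead of returning -inf the way np.log and tf.log do.

```python
import math

def ieee_log(p):
    # Mimic np.log / tf.log, which return -inf for 0 instead of raising.
    return math.log(p) if p > 0.0 else float("-inf")

def cross_entropy(t, y, eps=None):
    # -sum(t * log(y)). If eps is given, clamp y into [eps, 1.0] first,
    # which is what tf.clip_by_value(y, eps, 1.0) does element-wise.
    total = 0.0
    for t_k, y_k in zip(t, y):
        if eps is not None:
            y_k = min(max(y_k, eps), 1.0)
        total += t_k * ieee_log(y_k)
    return -total

t = [1.0, 0.0]  # one-hot label: class 0 is correct
y = [1.0, 0.0]  # saturated softmax output: class 1 underflowed to 0

print(cross_entropy(t, y))             # nan, because 0.0 * -inf is nan
print(cross_entropy(t, y, eps=1e-10))  # finite once y is clamped
```

Note that the prediction in this example is correct and the loss is still nan: the zero probability of a wrong class is enough to poison the sum.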

Fixed code

import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('.\mnist', one_hot=True, dtype=tf.uint8)

x = tf.placeholder(tf.float32, [None, 784])
w = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
f = tf.matmul(x, w) + b
y = tf.nn.softmax(f)
t = tf.placeholder(tf.float32, [None, 10])
loss = -tf.reduce_sum(t * tf.log(tf.clip_by_value(y, 1e-10, 1.0)))
train_step = tf.train.AdamOptimizer().minimize(loss)
correct = tf.equal(tf.argmax(y, 1), tf.argmax(t, 1))
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    i = 0
    for _ in range(10000):
        i += 1
        bx, bt = mnist.train.next_batch(300)
        now_w, now_b, _ = sess.run([w, b, train_step], feed_dict={x:bx, t:bt})
        if not i % 500:
            loss_val, acc_val = sess.run([loss, accuracy], feed_dict={x:mnist.test.images, t:mnist.test.labels})
            print ('Step: %d, Loss: %f, Accuracy: %f' % (i, loss_val, acc_val))

Result

Step: 500, Loss: 12224.713867, Accuracy: 0.909800
Step: 1000, Loss: 13273.717773, Accuracy: 0.917400
Step: 1500, Loss: 14892.584961, Accuracy: 0.909100
Step: 2000, Loss: 14022.375000, Accuracy: 0.918800
Step: 2500, Loss: 14104.679688, Accuracy: 0.921500
Step: 3000, Loss: 13933.588867, Accuracy: 0.926500
Step: 3500, Loss: 14054.534180, Accuracy: 0.926300
Step: 4000, Loss: 14930.049805, Accuracy: 0.924100
Step: 4500, Loss: 14557.189453, Accuracy: 0.924800
Step: 5000, Loss: 15464.050781, Accuracy: 0.921000
Step: 5500, Loss: 14617.333008, Accuracy: 0.927400
Step: 6000, Loss: 14869.592773, Accuracy: 0.924900
Step: 6500, Loss: 15666.247070, Accuracy: 0.923500
Step: 7000, Loss: 14788.012695, Accuracy: 0.927800
Step: 7500, Loss: 15928.329102, Accuracy: 0.921200
Step: 8000, Loss: 15767.657227, Accuracy: 0.923200
Step: 8500, Loss: 14828.413086, Accuracy: 0.928400
Step: 9000, Loss: 15601.589844, Accuracy: 0.924700
Step: 9500, Loss: 16263.544922, Accuracy: 0.921700
Step: 10000, Loss: 15657.148438, Accuracy: 0.926300
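
Clipping gets rid of the nan, but it also caps what any single sample can contribute to the loss at -log(1e-10), about 23.03. An alternative that avoids log(0) altogether is to compute the cross-entropy straight from the logits f with the log-sum-exp trick. The sketch below shows the idea in plain Python with made-up logits and illustrative function names; it is not what the code above does. TensorFlow's tf.nn.softmax_cross_entropy_with_logits also takes logits rather than softmax outputs.

```python
import math

def log_sum_exp(logits):
    # Subtract the max first so exp() never overflows; the largest term
    # becomes exp(0) = 1, so the sum handed to log() is at least 1.
    m = max(logits)
    return m + math.log(sum(math.exp(v - m) for v in logits))

def cross_entropy_from_logits(logits, target):
    # -log(softmax(logits)[target]) rewritten as logsumexp(logits) - logits[target].
    return log_sum_exp(logits) - logits[target]

logits = [153.0, 25.5, -102.0]  # hypothetical large logits, as with 0-255 inputs

print(cross_entropy_from_logits(logits, 0))  # close to 0: confident and correct
print(cross_entropy_from_logits(logits, 2))  # large but finite: confident and wrong
```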