Commit 6544e31 (1 parent 922703f)
Showing 1 changed file with 262 additions and 0 deletions.
@@ -0,0 +1,262 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"3.9 多层感知机的从零开始实现\n", | ||
"我们已经从上一节里了解了多层感知机的原理。下面,我们一起来动手实现一个多层感知机。首先导入实现所需的包或模块。" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 41, | ||
"metadata": { | ||
"pycharm": { | ||
"is_executing": false | ||
} | ||
}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"2.0.0\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"import tensorflow as tf\n", | ||
"import numpy as np\n", | ||
"import sys\n", | ||
"print(tf.__version__)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"3.9.1 获取和读取数据\n", | ||
"这里继续使用Fashion-MNIST数据集。我们将使用多层感知机对图像进行分类" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 42, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"from tensorflow.keras.datasets import fashion_mnist\n", | ||
"(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()\n", | ||
"batch_size = 256\n", | ||
"x_train = tf.cast(x_train, tf.float32)\n", | ||
"x_test = tf.cast(x_test, tf.float32)\n", | ||
"x_train = x_train/255.0\n", | ||
"x_test = x_test/255.0\n", | ||
"train_iter = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(batch_size)\n", | ||
"test_iter = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(batch_size)" | ||
] | ||
}, | ||
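{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"As a quick sanity check (added here for illustration, not part of the original notebook), we can take one batch from train_iter and confirm the shapes and dtypes the pipeline produces." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Peek at a single batch: images are (batch_size, 28, 28) float32 in [0, 1],\n", | ||
"# labels are a (batch_size,) integer vector of class indices.\n", | ||
"for x, y in train_iter.take(1):\n", | ||
"    print(x.shape, x.dtype)\n", | ||
"    print(y.shape, y.dtype)" | ||
] | ||
}, | ||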
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"3.9.2 定义模型参数\n", | ||
"我们在3.6节(softmax回归的从零开始实现)里已经介绍了,Fashion-MNIST数据集中图像形状为 28×28,类别数为10。本节中我们依然使用长度为 28×28=784 的向量表示每一张图像。因此,输入个数为784,输出个数为10。实验中,我们设超参数隐藏单元个数为256。" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 43, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"num_inputs, num_outputs, num_hiddens = 784, 10, 256\n", | ||
"\n", | ||
"w1 = tf.Variable(tf.random.truncated_normal([num_inputs, num_hiddens], stddev=0.1))\n", | ||
"b1 = tf.Variable(tf.random.truncated_normal([num_hiddens], stddev=0.1))\n", | ||
"w2 = tf.Variable(tf.random.truncated_normal([num_hiddens, num_outputs], stddev=0.1))\n", | ||
"b2=tf.Variable(tf.random.truncated_normal([num_outputs], stddev=0.1))\n" | ||
] | ||
}, | ||
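{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"A one-line check (added for illustration) makes the layer dimensions explicit: the first layer maps 784 inputs to 256 hidden units, and the second maps 256 hidden units to 10 outputs." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Expected: (784, 256) (256,) (256, 10) (10,)\n", | ||
"print(w1.shape, b1.shape, w2.shape, b2.shape)" | ||
] | ||
}, | ||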
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"3.9.3 定义激活函数\n", | ||
"这里我们使用基础的max函数来实现ReLU,而非直接调用relu函数。" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 44, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"def relu(x):\n", | ||
" return tf.math.maximum(x,0)" | ||
] | ||
}, | ||
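{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"A quick check of the activation (added for illustration): positive entries pass through unchanged and negative entries are clipped to zero." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Expected output: [[0.  0.  2.5]]\n", | ||
"x_demo = tf.constant([[-1.0, 0.0, 2.5]])\n", | ||
"print(relu(x_demo).numpy())" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"3.9.4 Defining the Model\n", | ||
"As in softmax regression, each image is first reshaped into a vector of length num_inputs; the hidden layer then applies ReLU to an affine transformation of the input, and the output layer applies softmax to an affine transformation of the hidden representation." | ||
] | ||
}, | ||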
{ | ||
"cell_type": "code", | ||
"execution_count": 45, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"def net(x,w1,b1,w2,b2):\n", | ||
" x = tf.reshape(x,shape=[-1,num_inputs])\n", | ||
" h = relu(tf.matmul(x,w1) + b1 )\n", | ||
" y = tf.math.softmax( tf.matmul(h,w2) + b2 )\n", | ||
" return y" | ||
] | ||
}, | ||
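{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"To see what the network produces before training (added for illustration), run one batch through net: the output has one row per example and 10 columns, and because of the softmax each row sums to 1." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"for x, y in train_iter.take(1):\n", | ||
"    y_hat_demo = net(x, w1, b1, w2, b2)\n", | ||
"    print(y_hat_demo.shape)                      # (256, 10)\n", | ||
"    print(tf.reduce_sum(y_hat_demo[0]).numpy())  # ~1.0, rows are probabilities" | ||
] | ||
}, | ||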
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"3.9.5. 定义损失函数¶\n", | ||
"为了得到更好的数值稳定性,我们直接使用Tensorflow提供的包括softmax运算和交叉熵损失计算的函数。" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 46, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"def loss(y_hat,y_true):\n", | ||
" return tf.losses.sparse_categorical_crossentropy(y_true,y_hat)" | ||
] | ||
}, | ||
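{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"A toy example (added for illustration): for a single prediction with probability 0.7 on the true class, the cross-entropy is -log(0.7) ≈ 0.357." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Expected: approximately 0.3567 (= -log(0.7))\n", | ||
"probs = tf.constant([[0.1, 0.7, 0.2]])\n", | ||
"labels = tf.constant([1])\n", | ||
"print(loss(probs, labels).numpy())" | ||
] | ||
}, | ||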
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"3.9.6. 训练模型" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 47, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"def acc(y_hat,y):\n", | ||
" return np.mean((tf.argmax(y_hat,axis=1) == y))" | ||
] | ||
}, | ||
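{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"A small check of the accuracy helper (added for illustration): with three predictions of which two match the labels, acc returns 2/3." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Predicted classes are [0, 1, 0]; labels are [0, 1, 1], so accuracy is 2/3\n", | ||
"y_hat_demo = tf.constant([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])\n", | ||
"y_demo = tf.constant([0, 1, 1], dtype=tf.int64)\n", | ||
"print(acc(y_hat_demo, y_demo))" | ||
] | ||
}, | ||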
{ | ||
"cell_type": "code", | ||
"execution_count": 48, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"num_epochs, lr = 5, 0.5" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 49, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"0 loss: 0.7799275\n", | ||
"0 test_acc: 0.875\n", | ||
"1 loss: 0.72887945\n", | ||
"1 test_acc: 0.9375\n", | ||
"2 loss: 0.72454\n", | ||
"2 test_acc: 0.8125\n", | ||
"3 loss: 0.5607478\n", | ||
"3 test_acc: 0.875\n", | ||
"4 loss: 0.5008962\n", | ||
"4 test_acc: 0.9375\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"for epoch in range(num_epochs):\n", | ||
" loss_all = 0\n", | ||
" for x,y in train_iter:\n", | ||
" with tf.GradientTape() as tape:\n", | ||
" y_hat = net(x,w1,b1,w2,b2)\n", | ||
" l = tf.reduce_mean(loss(y_hat,y))\n", | ||
" loss_all += l.numpy()\n", | ||
" grads = tape.gradient(l, [w1, b1, w2, b2])\n", | ||
" w1.assign_sub(grads[0])\n", | ||
" b1.assign_sub(grads[1])\n", | ||
" w2.assign_sub(grads[2])\n", | ||
" b2.assign_sub(grads[3])\n", | ||
" print(epoch, 'loss:', l.numpy())\n", | ||
" total_correct, total_number = 0, 0\n", | ||
"\n", | ||
" for x,y in test_iter:\n", | ||
" with tf.GradientTape() as tape:\n", | ||
" y_hat = net(x,w1,b1,w2,b2)\n", | ||
" y=tf.cast(y,'int64')\n", | ||
" correct=acc(y_hat,y)\n", | ||
" print(epoch,\"test_acc:\", correct)" | ||
] | ||
}, | ||
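{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"The loop above only reports the accuracy of the final test batch in each epoch. As a follow-up (added here, not part of the original notebook), accuracy over the entire test set can be computed by counting correct predictions across all batches." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Accuracy over the full test set: count correct predictions per batch,\n", | ||
"# then divide by the total number of test examples.\n", | ||
"total_correct, total_number = 0, 0\n", | ||
"for x, y in test_iter:\n", | ||
"    y_hat = net(x, w1, b1, w2, b2)\n", | ||
"    y = tf.cast(y, 'int64')\n", | ||
"    total_correct += int(np.sum((tf.argmax(y_hat, axis=1) == y).numpy()))\n", | ||
"    total_number += y.shape[0]\n", | ||
"print('test accuracy over all batches:', total_correct / total_number)" | ||
] | ||
}, | ||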
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.6.1" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 2 | ||
} |