Skip to content

Commit

Permalink
updated_10.30
Browse files Browse the repository at this point in the history
  • Loading branch information
archersama authored Oct 30, 2019
1 parent 922703f commit 6544e31
Showing 1 changed file with 262 additions and 0 deletions.
262 changes: 262 additions & 0 deletions ch3_DL_basics/3.9_mlp-scratch.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,262 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"3.9 多层感知机的从零开始实现\n",
"我们已经从上一节里了解了多层感知机的原理。下面,我们一起来动手实现一个多层感知机。首先导入实现所需的包或模块。"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {
"pycharm": {
"is_executing": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2.0.0\n"
]
}
],
"source": [
"import tensorflow as tf\n",
"import numpy as np\n",
"import sys\n",
"print(tf.__version__)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"3.9.1 获取和读取数据\n",
"这里继续使用Fashion-MNIST数据集。我们将使用多层感知机对图像进行分类"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"from tensorflow.keras.datasets import fashion_mnist\n",
"(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()\n",
"batch_size = 256\n",
"x_train = tf.cast(x_train, tf.float32)\n",
"x_test = tf.cast(x_test, tf.float32)\n",
"x_train = x_train/255.0\n",
"x_test = x_test/255.0\n",
"train_iter = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(batch_size)\n",
"test_iter = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(batch_size)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"3.9.2 定义模型参数\n",
"我们在3.6节(softmax回归的从零开始实现)里已经介绍了,Fashion-MNIST数据集中图像形状为 28×28,类别数为10。本节中我们依然使用长度为 28×28=784 的向量表示每一张图像。因此,输入个数为784,输出个数为10。实验中,我们设超参数隐藏单元个数为256。"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [],
"source": [
"num_inputs, num_outputs, num_hiddens = 784, 10, 256\n",
"\n",
"w1 = tf.Variable(tf.random.truncated_normal([num_inputs, num_hiddens], stddev=0.1))\n",
"b1 = tf.Variable(tf.random.truncated_normal([num_hiddens], stddev=0.1))\n",
"w2 = tf.Variable(tf.random.truncated_normal([num_hiddens, num_outputs], stddev=0.1))\n",
"b2=tf.Variable(tf.random.truncated_normal([num_outputs], stddev=0.1))\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"3.9.3 定义激活函数\n",
"这里我们使用基础的max函数来实现ReLU,而非直接调用relu函数。"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def relu(x):\n",
" return tf.math.maximum(x,0)"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def net(x,w1,b1,w2,b2):\n",
" x = tf.reshape(x,shape=[-1,num_inputs])\n",
" h = relu(tf.matmul(x,w1) + b1 )\n",
" y = tf.math.softmax( tf.matmul(h,w2) + b2 )\n",
" return y"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"3.9.5. 定义损失函数¶\n",
"为了得到更好的数值稳定性,我们直接使用Tensorflow提供的包括softmax运算和交叉熵损失计算的函数。"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def loss(y_hat,y_true):\n",
" return tf.losses.sparse_categorical_crossentropy(y_true,y_hat)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"3.9.6. 训练模型"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def acc(y_hat,y):\n",
" return np.mean((tf.argmax(y_hat,axis=1) == y))"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"num_epochs, lr = 5, 0.5"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 loss: 0.7799275\n",
"0 test_acc: 0.875\n",
"1 loss: 0.72887945\n",
"1 test_acc: 0.9375\n",
"2 loss: 0.72454\n",
"2 test_acc: 0.8125\n",
"3 loss: 0.5607478\n",
"3 test_acc: 0.875\n",
"4 loss: 0.5008962\n",
"4 test_acc: 0.9375\n"
]
}
],
"source": [
"for epoch in range(num_epochs):\n",
" loss_all = 0\n",
" for x,y in train_iter:\n",
" with tf.GradientTape() as tape:\n",
" y_hat = net(x,w1,b1,w2,b2)\n",
" l = tf.reduce_mean(loss(y_hat,y))\n",
" loss_all += l.numpy()\n",
" grads = tape.gradient(l, [w1, b1, w2, b2])\n",
" w1.assign_sub(grads[0])\n",
" b1.assign_sub(grads[1])\n",
" w2.assign_sub(grads[2])\n",
" b2.assign_sub(grads[3])\n",
" print(epoch, 'loss:', l.numpy())\n",
" total_correct, total_number = 0, 0\n",
"\n",
" for x,y in test_iter:\n",
" with tf.GradientTape() as tape:\n",
" y_hat = net(x,w1,b1,w2,b2)\n",
" y=tf.cast(y,'int64')\n",
" correct=acc(y_hat,y)\n",
" print(epoch,\"test_acc:\", correct)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

0 comments on commit 6544e31

Please sign in to comment.