Related articles
Machine Learning, Supervised Learning (1): Linear Regression, Polynomial Regression, and Algorithm Optimization [Detailed Notes]
Machine Learning, Supervised Learning (2): Binary Logistic Regression
Machine Learning, Supervised Learning (3): Neural Network Fundamentals
Machine Learning in Practice: Predicting Second-Hand Housing Prices (Linear Regression)
Machine Learning in Practice: Benign/Malignant Tumor Classifier (Binary Logistic Regression)
Introduction to the MNIST dataset:
MNIST is one of the best-known and most widely used datasets in machine learning and computer vision. It is a large database of handwritten digits containing 70,000 grayscale images: 60,000 training images and 10,000 test images. Each image is 28x28 pixels, with pixel values ranging from 0 (background) to 255 (foreground), and each image comes with a digit label from 0 to 9.
Before starting the experiment, to get familiar with this classic dataset, you can first try the small quiz below and test your own handwritten-digit recognition skills. Although recognizing handwritten digits is trivial for humans, the images have low resolution and some digits are written rather ambiguously, so reaching 100% accuracy is still hard; reported average human accuracy is roughly 97.5% to 98.5%. The quiz code is as follows:
import numpy as np
from tensorflow.keras.datasets import mnist
import matplotlib.pyplot as plt
from random import sample

# Load the MNIST dataset
(_, _), (x_test, y_test) = mnist.load_data()

# Randomly select 100 samples
indices = sample(range(len(x_test)), 100)
correct = 0
total = 100

for i, idx in enumerate(indices, 1):
    # Show the image
    plt.imshow(x_test[idx], cmap='gray')
    plt.axis('off')
    plt.show()
    # Get the user's answer
    user_answer = input(f"Question {i}/100: which digit is this? ")
    # Check the answer
    if int(user_answer) == y_test[idx]:
        correct += 1
        print("Correct!")
    else:
        print(f"Wrong. The correct answer is {y_test[idx]}")
    print(f"Running accuracy: {correct}/{i} ({correct/i*100:.2f}%)")

print(f"\nFinal accuracy: {correct}/{total} ({correct/total*100:.2f}%)")
Experiment walkthrough
Environment
PyCharm + Jupyter Notebook
Import modules
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from sklearn.metrics import accuracy_score
from tensorflow.keras.layers import Input, Dense, Dropout
from tensorflow.keras.regularizers import l2
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Sequential
from tensorflow.keras.losses import SparseCategoricalCrossentropy
from tensorflow.keras.callbacks import EarlyStopping
import matplotlib
matplotlib.rcParams['font.family'] = 'SimHei'  # or 'Microsoft YaHei'; only needed for Chinese characters in plots
matplotlib.rcParams['axes.unicode_minus'] = False  # render the minus sign '-' correctly with a CJK font
Load the MNIST dataset
Load the MNIST handwritten digit data (both the training and test sets):
from tensorflow.keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
Check the sizes of the training and test sets:
print(f'x_train.shape:{x_train.shape}')
print(f'y_train.shape:{y_train.shape}')
print(f'x_test.shape:{x_test.shape}')
print(f'y_test.shape:{y_test.shape}')

x_train.shape:(60000, 28, 28)
y_train.shape:(60000,)
x_test.shape:(10000, 28, 28)
y_test.shape:(10000,)
View 64 handwritten digit images
# Display 64 handwritten digits from the training set
# Get the size of the training set
m = x_train.shape[0]
# Create an 8x8 grid of subplots
fig, axes = plt.subplots(8, 8, figsize=(8, 8))
# Show a randomly chosen digit in each subplot
for i, ax in enumerate(axes.flat):
    idx = np.random.randint(m)
    # imshow(): pass the pixel matrix; cmap='gray' renders a grayscale image
    ax.imshow(x_train[idx], cmap='gray')
    # Put the label above the image as the subplot title
    ax.set_title(y_train[idx])
    # Remove the axes
    ax.axis('off')
# Adjust spacing between subplots
plt.tight_layout()
(For space reasons, not all 64 images are shown here.)
Flatten each grayscale pixel matrix into a vector and normalize it (divide by 255, mapping 0-255 to 0-1):
x_train_flat = x_train.reshape(60000, 28*28).astype('float32') / 255
x_test_flat = x_test.reshape(10000, 28*28).astype('float32') / 255
Check the shapes after flattening:
print(f'x_train.shape:{x_train_flat.shape}')
print(f'x_test.shape:{x_test_flat.shape}')

x_train.shape:(60000, 784)
x_test.shape:(10000, 784)
Build, train, test, and evaluate neural network models
To start, build a first three-layer fully connected network: the hidden layers use the 'relu' activation, the loss is sparse categorical cross-entropy (with from_logits=True so the loss is computed from the raw logits, reducing numerical error during training), and the optimizer is Adam with an adaptive learning rate (initial learning rate 0.001).
# Build the network
model1 = Sequential(
    [
        Input(shape=(784,)),
        Dense(128, activation='relu', name='L1'),
        Dense(32, activation='relu', name='L2'),
        Dense(10, activation='linear', name='L3'),
    ], name='model1',
)
# Compile the model
model1.compile(loss=SparseCategoricalCrossentropy(from_logits=True), optimizer=Adam(learning_rate=0.001))
# Print the model summary
model1.summary()
Model: "model1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 L1 (Dense)                  (None, 128)               100480

 L2 (Dense)                  (None, 32)                4128

 L3 (Dense)                  (None, 10)                330

=================================================================
Total params: 104,938
Trainable params: 104,938
Non-trainable params: 0
Fit model1 on the training set, with the number of epochs initially set to 20:
model1.fit(x_train_flat, y_train, epochs=20)
Epoch 1/20
1875/1875 [==============================] - 12s 5ms/step - loss: 0.2502
Epoch 2/20
1875/1875 [==============================] - 9s 5ms/step - loss: 0.1057
Epoch 3/20
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0748
Epoch 4/20
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0547
Epoch 5/20
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0438
Epoch 6/20
1875/1875 [==============================] - 8s 5ms/step - loss: 0.0360
Epoch 7/20
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0300
Epoch 8/20
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0237
Epoch 9/20
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0223
Epoch 10/20
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0201
Epoch 11/20
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0166
Epoch 12/20
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0172
Epoch 13/20
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0131
Epoch 14/20
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0124
Epoch 15/20
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0133
Epoch 16/20
1875/1875 [==============================] - 10s 5ms/step - loss: 0.0108
Epoch 17/20
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0095
Epoch 18/20
1875/1875 [==============================] - 10s 5ms/step - loss: 0.0116
Epoch 19/20
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0090
Epoch 20/20
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0084
Inspect model1's results on the training set. Since the model outputs raw logits, they must be passed through softmax to obtain probability vectors; the index of the largest probability is then the recognized digit:
# Inspect the training predictions
z_train_hat = model1.predict(x_train_flat)
# Apply softmax to turn the logits into a matrix of probability vectors
p_train_hat = tf.nn.softmax(z_train_hat).numpy()
# The index of the largest probability in each vector is the predicted digit
y_train_hat = np.argmax(p_train_hat, axis=1)
print(y_train_hat)
The post-processing above can be wrapped into a function:
# Network outputs (logits) -> final predicted digits
def get_result(z):
    p = tf.nn.softmax(z)
    y = np.argmax(p, axis=1)
    return y
To make this output processing concrete, look at the first training sample's logits, probability vector, and predicted digit:
print(f'Logits:{z_train_hat[0]}')
print(f'Probabilities:{p_train_hat[0]}')
print(f'Target:{y_train_hat[0]}')

Logits:[-21.427883 -11.558845 -15.150495 15.6205845 -58.351833 29.704205
 -23.925339 -30.009314 -11.389831 -14.521982 ]
Probabilities:[6.2175050e-23 1.2013921e-18 3.3101813e-20 7.6482343e-07 0.0000000e+00
 9.9999928e-01 5.1166414e-24 1.1661356e-26 1.4226123e-18 6.2059749e-20]
Target:5
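As a small aside (my own illustration, not part of the original post), the following sketch shows why from_logits=True was recommended when compiling the model: with extreme logits, applying softmax first and then computing cross-entropy on the probabilities underflows and gets clipped, while the loss computed directly from the logits remains the true value.

# Hypothetical check of the from_logits claim; the values are deliberately extreme
logits = tf.constant([[1000.0, 0.0, -1000.0]])
labels = tf.constant([1])
loss_from_logits = SparseCategoricalCrossentropy(from_logits=True)
loss_from_probs = SparseCategoricalCrossentropy(from_logits=False)
probs = tf.nn.softmax(logits)                     # underflows to [[1., 0., 0.]]
print(loss_from_logits(labels, logits).numpy())   # ~1000: the true cross-entropy for class 1
print(loss_from_probs(labels, probs).numpy())     # ~16: clipped at log(epsilon), no longer the true loss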
Print model1's training accuracy, which reaches 99.8%:
print(f'model1 training accuracy: {accuracy_score(y_train, y_train_hat)}')
Evaluate model1 on the test set; the accuracy reaches 97.9%, which is quite respectable.
z_test_hat = model1.predict(x_test_flat)
y_test_hat = get_result(z_test_hat)
print(f'model1 test accuracy: {accuracy_score(y_test, y_test_hat)}')

313/313 [==============================] - 1s 3ms/step
model1 test accuracy: 0.9789
To make the later experiments easier to run, wrap the whole training-and-evaluation procedure in a run_model function, with early stopping added: if the training loss does not improve for 10 epochs, training stops.
early_stopping = EarlyStopping(
    monitor='loss',
    patience=10,                # stop if the training loss has not improved for 10 epochs
    restore_best_weights=True   # restore the best weights seen during training
)

def run_model(model, epochs):
    model.fit(x_train_flat, y_train, epochs=epochs, callbacks=[early_stopping])
    z_train_hat = model.predict(x_train_flat)
    y_train_hat = get_result(z_train_hat)
    print(f'{model.name} training accuracy: {accuracy_score(y_train, y_train_hat)}')
    z_test_hat = model.predict(x_test_flat)
    y_test_hat = get_result(z_test_hat)
    print(f'{model.name} test accuracy: {accuracy_score(y_test, y_test_hat)}')
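A variant worth considering (my suggestion, not something the original experiments use): have early stopping watch a held-out validation loss instead of the training loss, so it reacts to overfitting rather than to a plateau in training loss. A minimal sketch under that assumption:

# Hedged alternative: hold out 10% of the training data for validation
early_stopping_val = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)

def run_model_val(model, epochs):
    model.fit(x_train_flat, y_train, epochs=epochs,
              validation_split=0.1,              # the last 10% of the training data is held out
              callbacks=[early_stopping_val])
    z_test_hat = model.predict(x_test_flat)
    y_test_hat = get_result(z_test_hat)
    print(f'{model.name} test accuracy: {accuracy_score(y_test, y_test_hat)}')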
Let's see which images the model stumbles on:
# Function to display n misclassified images
def show_error_pic(x, y, y_pred, n=64):
    wrong_idx = (y != y_pred)

    # Gather the misclassified images and their labels
    x_wrong = x[wrong_idx]
    y_wrong = y[wrong_idx]
    y_pred_wrong = y_pred[wrong_idx]

    # Keep only the first n misclassified images
    n = min(n, len(x_wrong))
    x_wrong = x_wrong[:n]
    y_wrong = y_wrong[:n]
    y_pred_wrong = y_pred_wrong[:n]

    # Set up the image grid
    rows = int(np.ceil(n / 8))
    fig, axes = plt.subplots(rows, 8, figsize=(20, 2.5 * rows))
    axes = axes.flatten()

    for i in range(n):
        ax = axes[i]
        ax.imshow(x_wrong[i].reshape(28, 28), cmap='gray')
        ax.set_title(f'True: {y_wrong[i]}, Pred: {y_pred_wrong[i]}')
        ax.axis('off')

    # Hide unused subplots
    for i in range(n, len(axes)):
        axes[i].axis('off')

    plt.tight_layout()
    plt.show()

show_error_pic(x_test, y_test, y_test_hat)
(For space reasons, only some of the images are shown.)
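To complement the per-image view, a confusion matrix summarizes which digit pairs the model mixes up most often. This is a small addition of mine, using sklearn's confusion_matrix (not part of the original post), applied to model1's test predictions from above:

from sklearn.metrics import confusion_matrix

# Rows are true digits, columns are predicted digits; off-diagonal entries
# count the confusions (e.g. row 4, column 9 counts 4s that were read as 9s)
cm = confusion_matrix(y_test, y_test_hat)
print(cm)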
Model optimization
So far our first, fairly simple network performs very well: 99.8% training accuracy and 97.9% test accuracy, roughly on par with the 97.5%-98.5% average human accuracy. The gap between training and test accuracy, however, suggests the model has some degree of high variance, so we can consider adding regularization or more data; alternatively, we can try a larger network and see whether it reaches even higher accuracy.
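Note that the imports at the top already include the l2 regularizer, even though the experiments below only try more epochs, a larger network, and Dropout. As a hedged sketch of the L2 route (the 0.001 weight-decay factor is an assumption, not a tuned value), model1 with weight decay on its hidden layers would look like this:

# Sketch only: model1's architecture with L2 weight decay; not trained in the original post
model_l2 = Sequential([
    Input(shape=(784,)),
    Dense(128, activation='relu', kernel_regularizer=l2(0.001), name='L1'),
    Dense(32, activation='relu', kernel_regularizer=l2(0.001), name='L2'),
    Dense(10, activation='linear', name='L3'),
], name='model_l2')
model_l2.compile(loss=SparseCategoricalCrossentropy(from_logits=True), optimizer=Adam(learning_rate=0.001))
# run_model(model_l2, 20)   # would train and evaluate it the same way as the other models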
model2: the same as model1, but with the number of epochs increased to 40
# Build the network
model2 = Sequential(
    [
        Input(shape=(784,)),
        Dense(128, activation='relu', name='L1'),
        Dense(32, activation='relu', name='L2'),
        Dense(10, activation='linear', name='L3'),
    ], name='model2',
)
# Compile the model
model2.compile(loss=SparseCategoricalCrossentropy(from_logits=True), optimizer=Adam(learning_rate=0.001))
# Print the model summary
model2.summary()
run_model(model2, 40)
The test accuracy reaches 98%, a slight improvement, but since the training time doubles, the gain is not very compelling.
model3: a wider and deeper network, trained for 20 epochs
# Increase the model's width and depth
model3 = Sequential([
    Input(shape=(784,)),
    Dense(256, activation='relu', name='L1'),
    Dense(128, activation='relu', name='L2'),
    Dense(64, activation='relu', name='L3'),
    Dense(10, activation='linear', name='L4'),
], name='model3')
# Compile the model
model3.compile(loss=SparseCategoricalCrossentropy(from_logits=True), optimizer=Adam(learning_rate=0.001))
# Print the model summary
model3.summary()
run_model(model3, 20)
Model: "model3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 L1 (Dense)                  (None, 256)               200960

 L2 (Dense)                  (None, 128)               32896

 L3 (Dense)                  (None, 64)                8256

 L4 (Dense)                  (None, 10)                650

=================================================================
Total params: 242,762
Trainable params: 242,762
Non-trainable params: 0
_________________________________________________________________
Epoch 1/20
1875/1875 [==============================] - 12s 6ms/step - loss: 0.2152
Epoch 2/20
1875/1875 [==============================] - 12s 6ms/step - loss: 0.0908
Epoch 3/20
1875/1875 [==============================] - 12s 7ms/step - loss: 0.0623
Epoch 4/20
1875/1875 [==============================] - 12s 7ms/step - loss: 0.0496
Epoch 5/20
1875/1875 [==============================] - 12s 7ms/step - loss: 0.0390
Epoch 6/20
1875/1875 [==============================] - 12s 6ms/step - loss: 0.0341
Epoch 7/20
1875/1875 [==============================] - 12s 6ms/step - loss: 0.0291
Epoch 8/20
1875/1875 [==============================] - 12s 6ms/step - loss: 0.0244
Epoch 9/20
1875/1875 [==============================] - 12s 7ms/step - loss: 0.0223
Epoch 10/20
1875/1875 [==============================] - 12s 7ms/step - loss: 0.0187
Epoch 11/20
1875/1875 [==============================] - 12s 7ms/step - loss: 0.0206
Epoch 12/20
1875/1875 [==============================] - 12s 6ms/step - loss: 0.0145
Epoch 13/20
1875/1875 [==============================] - 12s 7ms/step - loss: 0.0176
Epoch 14/20
1875/1875 [==============================] - 12s 7ms/step - loss: 0.0153
Epoch 15/20
1875/1875 [==============================] - 12s 6ms/step - loss: 0.0120
Epoch 16/20
1875/1875 [==============================] - 12s 6ms/step - loss: 0.0148
Epoch 17/20
1875/1875 [==============================] - 12s 6ms/step - loss: 0.0125
Epoch 18/20
1875/1875 [==============================] - 12s 6ms/step - loss: 0.0123
Epoch 19/20
1875/1875 [==============================] - 13s 7ms/step - loss: 0.0120
Epoch 20/20
1875/1875 [==============================] - 13s 7ms/step - loss: 0.0094
1875/1875 [==============================] - 6s 3ms/step
model3 training accuracy: 0.9989333333333333
313/313 [==============================] - 1s 4ms/step
model3 test accuracy: 0.9816
model3 reaches 99.9% training accuracy, and its test accuracy of 98.2% is the highest so far.
model4: the same as model1, with Dropout layers added for regularization
# Dropout regularization
model4 = Sequential([
    Input(shape=(784,)),
    Dense(128, activation='relu', name='L1'),
    Dropout(0.3),
    Dense(64, activation='relu', name='L2'),
    Dropout(0.2),
    Dense(10, activation='linear', name='L3'),
], name='model4')
# Compile the model
model4.compile(loss=SparseCategoricalCrossentropy(from_logits=True), optimizer=Adam(learning_rate=0.001))
# Print the model summary
model4.summary()
run_model(model4, 20)
Model: "model4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 L1 (Dense)                  (None, 128)               100480

 dropout_2 (Dropout)         (None, 128)               0

 L2 (Dense)                  (None, 64)                8256

 dropout_3 (Dropout)         (None, 64)                0

 L3 (Dense)                  (None, 10)                650

=================================================================
Total params: 109,386
Trainable params: 109,386
Non-trainable params: 0
_________________________________________________________________
Epoch 1/20
1875/1875 [==============================] - 15s 7ms/step - loss: 0.3686
Epoch 2/20
1875/1875 [==============================] - 12s 6ms/step - loss: 0.1855
Epoch 3/20
1875/1875 [==============================] - 17s 9ms/step - loss: 0.1475
Epoch 4/20
1875/1875 [==============================] - 17s 9ms/step - loss: 0.1289
Epoch 5/20
1875/1875 [==============================] - 20s 11ms/step - loss: 0.1124
Epoch 6/20
1875/1875 [==============================] - 19s 10ms/step - loss: 0.1053
Epoch 7/20
1875/1875 [==============================] - 22s 12ms/step - loss: 0.0976
Epoch 8/20
1875/1875 [==============================] - 15s 8ms/step - loss: 0.0907
Epoch 9/20
1875/1875 [==============================] - 12s 6ms/step - loss: 0.0861
Epoch 10/20
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0807
Epoch 11/20
1875/1875 [==============================] - 10s 5ms/step - loss: 0.0794
Epoch 12/20
1875/1875 [==============================] - 11s 6ms/step - loss: 0.0744
Epoch 13/20
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0733
Epoch 14/20
1875/1875 [==============================] - 8s 4ms/step - loss: 0.0734
Epoch 15/20
1875/1875 [==============================] - 8s 4ms/step - loss: 0.0691
Epoch 16/20
1875/1875 [==============================] - 10s 5ms/step - loss: 0.0656
Epoch 17/20
1875/1875 [==============================] - 11s 6ms/step - loss: 0.0674
Epoch 18/20
1875/1875 [==============================] - 12s 7ms/step - loss: 0.0614
Epoch 19/20
1875/1875 [==============================] - 11s 6ms/step - loss: 0.0601
Epoch 20/20
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0614
1875/1875 [==============================] - 5s 3ms/step
model4 training accuracy: 0.9951833333333333
313/313 [==============================] - 1s 2ms/step
model4 test accuracy: 0.98
model4's training accuracy drops to 99.5%, but its test accuracy of 98% is slightly better than model1's. Dropout regularization does reduce the model's variance and strengthens its generalization.
Putting all of this together, model7 uses model3's architecture with Dropout regularization added, trained for 40 epochs:
# Final fully connected network
model7 = Sequential([
    Input(shape=(784,)),
    Dense(256, activation='relu', name='L1'),
    Dropout(0.3),
    Dense(128, activation='relu', name='L2'),
    Dropout(0.2),
    Dense(64, activation='relu', name='L3'),
    Dropout(0.1),
    Dense(10, activation='linear', name='L4'),
], name='model7')
# Compile the model
model7.compile(loss=SparseCategoricalCrossentropy(from_logits=True), optimizer=Adam(learning_rate=0.001))
# Print the model summary
model7.summary()
run_model(model7, 40)
model7 reaches 99.8% training accuracy and 98.3% test accuracy, an improvement of nearly 0.4 percentage points over model1's 97.9%.
This experiment is a hands-on exercise following the neural network fundamentals article, so it only uses fully connected networks. CNNs are known to be stronger at image recognition, so to finish we build and test a CNN (architecture generated with GPT).
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten

model8 = Sequential([
    Input(shape=(28, 28, 1)),
    Conv2D(32, kernel_size=(3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(64, kernel_size=(3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='linear')
], name='cnn_model')
# Compile the model
model8.compile(loss=SparseCategoricalCrossentropy(from_logits=True), optimizer=Adam(learning_rate=0.001))
# Print the model summary
model8.summary()
# The CNN consumes the images directly rather than the flattened vectors
model8.fit(x_train, y_train, epochs=20, callbacks=[early_stopping])
z_train_hat = model8.predict(x_train)
y_train_hat = get_result(z_train_hat)
print(f'{model8.name} training accuracy: {accuracy_score(y_train, y_train_hat)}')
z_test_hat = model8.predict(x_test)
y_test_hat = get_result(z_test_hat)
print(f'{model8.name} test accuracy: {accuracy_score(y_test, y_test_hat)}')
CNN results:
cnn_model training accuracy: 0.9982333333333333
cnn_model test accuracy: 0.9878
The test accuracy reaches 98.8%, outperforming all of the fully connected networks above.
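A further avenue, mentioned in the optimization section but not explored above, is adding more data through augmentation. The sketch below is my own addition (ImageDataGenerator and the parameter values are assumptions, not settings from this experiment); it would feed randomly rotated and shifted digits to the CNN:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Small random rotations, shifts and zooms keep the digits recognizable
datagen = ImageDataGenerator(rotation_range=10,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             zoom_range=0.1)

# The generator expects a channel axis, so reshape to (N, 28, 28, 1);
# scaling to 0-1 is also common practice for CNN inputs
x_train_img = x_train.reshape(-1, 28, 28, 1).astype('float32') / 255

# model8.fit(datagen.flow(x_train_img, y_train, batch_size=128),
#            epochs=20, callbacks=[early_stopping])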