Training a Pokémon Classifier: Using TensorFlow to Distinguish Bulbasaur, Charmander, Squirtle, Pikachu, and Mewtwo
You are a researcher at the Pokémon Research Institute in Pallet Town. Professor Oak has handed you a Pokémon dataset and asked you to train a model that distinguishes Bulbasaur, Charmander, Squirtle, Pikachu, and Mewtwo.
First, following pokmon.py, read the image files in batch, generate a path and a label for each image, and use NumPy's random functions to shuffle the data.
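As a minimal sketch of the shuffling step (the file names below are made up; the real lists come from the dataset on disk), a single NumPy permutation of the indices keeps each path paired with its label:

```python
import numpy as np

# Hypothetical stand-ins for the real path/label lists
image_paths = np.array(['a.jpg', 'b.jpg', 'c.jpg', 'd.jpg'])
labels = np.array([0, 0, 1, 1])

np.random.seed(42)
idx = np.random.permutation(len(image_paths))  # one permutation shared by both arrays
image_paths, labels = image_paths[idx], labels[idx]
```

Indexing both arrays with the same `idx` is what preserves the path-to-label pairing; shuffling each array independently would scramble the labels.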
Build a simple convolutional neural network with convolutional, pooling, and fully connected layers. For the convolution stage, you can stack several convolutional layers with different filter counts and kernel sizes to further improve the model's performance.
Use TensorBoard to visualize accuracy and loss so you can better understand how training evolves. You can call the tf.summary.scalar() function in your code to record these metrics and then inspect the results in TensorBoard.
For splitting the dataset, you can use the train_test_split function to divide the data into training and test sets, and then carve a further 20% out of the training set as a validation set for evaluating model performance. Note that the splits should keep the classes balanced, with an equal number of samples from each class.
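A small sketch (toy data, assuming scikit-learn is installed) of how the stratify argument keeps the class ratio identical in both splits:

```python
import numpy as np
from sklearn.model_selection import train_test_split

data = np.arange(100)
labels = np.array([0] * 50 + [1] * 50)  # perfectly balanced toy labels

# stratify=labels preserves the 50/50 ratio: 10 of each class in the test set
train_x, test_x, train_y, test_y = train_test_split(
    data, labels, test_size=0.2, stratify=labels, random_state=0)
```

Without stratify, a random 20% draw could easily pick, say, 13 samples of one class and 7 of the other.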
Use cross-entropy as the loss function, combined with regularization (L1 or L2) to prevent overfitting, and tune the learning rate with an optimizer such as Adam or SGD. You can try different hyperparameters, pick the best-performing model, and save it.
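The loss being minimized can be written out by hand. This toy NumPy function (the names are illustrative, not taken from the code below) computes the cross-entropy for one sample plus an L2 weight penalty:

```python
import numpy as np

def l2_penalized_ce(probs, true_class, weights, lam=0.001):
    """Cross-entropy of one softmax output plus an L2 penalty on the weights."""
    cross_entropy = -np.log(probs[true_class])   # -log p(true class)
    l2_penalty = lam * np.sum(weights ** 2)      # lam * ||w||^2
    return cross_entropy + l2_penalty

probs = np.array([0.7, 0.2, 0.1])   # softmax output for one sample
weights = np.array([1.0, -2.0])     # toy weight vector
loss = l2_penalized_ce(probs, 0, weights)
```

The penalty term grows with the squared magnitude of the weights, which is what pushes the optimizer toward smaller weights and reduces overfitting.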
Here is one possible implementation:
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split
from tensorflow.keras import layers, models
# Read the image paths and build a label for each image.
# Labels are encoded as integer class indices so that
# SparseCategoricalCrossentropy can consume them directly.
class_names = ['Bulbasaur', 'Charmander', 'Squirtle', 'Pikachu', 'Mewtwo']
image_paths = []
labels = []
for class_index, pokemon in enumerate(class_names):
    for i in range(1, 101):
        image_paths.append(f'./{pokemon}/{pokemon} ({i}).jpg')
        labels.append(class_index)
labels = np.array(labels)
# Shuffle the data with NumPy's random functions
np.random.seed(42)
shuffle_indices = np.random.permutation(len(image_paths))
image_paths = np.array(image_paths)[shuffle_indices]
labels = labels[shuffle_indices]
# Split into training, validation, and test sets (stratified to keep classes balanced)
train_paths, test_paths, train_labels, test_labels = train_test_split(image_paths, labels, test_size=0.2, stratify=labels)
train_paths, val_paths, train_labels, val_labels = train_test_split(train_paths, train_labels, test_size=0.2, stratify=train_labels)
# Image preprocessing: decode, resize, and normalize
def preprocess_image(image_path):
    image = tf.io.read_file(image_path)
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.resize(image, [224, 224])
    image = tf.keras.applications.resnet50.preprocess_input(image)
    return image
# Build the dataset pipelines
BATCH_SIZE = 32
train_dataset = tf.data.Dataset.from_tensor_slices((train_paths, train_labels))
train_dataset = train_dataset.shuffle(buffer_size=len(train_paths))
train_dataset = train_dataset.map(lambda x, y: (preprocess_image(x), y))
train_dataset = train_dataset.batch(BATCH_SIZE)
val_dataset = tf.data.Dataset.from_tensor_slices((val_paths, val_labels))
val_dataset = val_dataset.map(lambda x, y: (preprocess_image(x), y))
val_dataset = val_dataset.batch(BATCH_SIZE)
test_dataset = tf.data.Dataset.from_tensor_slices((test_paths, test_labels))
test_dataset = test_dataset.map(lambda x, y: (preprocess_image(x), y))
test_dataset = test_dataset.batch(BATCH_SIZE)
# Define the model (L2 regularization on every kernel helps control overfitting)
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3),
                  kernel_regularizer=tf.keras.regularizers.l2(0.001)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu',
                  kernel_regularizer=tf.keras.regularizers.l2(0.001)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu',
                  kernel_regularizer=tf.keras.regularizers.l2(0.001)),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation='relu',
                 kernel_regularizer=tf.keras.regularizers.l2(0.001)),
    layers.Dense(5, activation='softmax')
])
# Loss function and optimizer
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
# Metrics and TensorBoard writer
train_loss = tf.keras.metrics.Mean(name='train_loss')
train_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='train_accuracy')
val_loss = tf.keras.metrics.Mean(name='val_loss')
val_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='val_accuracy')
summary_writer = tf.summary.create_file_writer('./logs')
# Training step: forward pass, loss (including L2 penalties), backprop
@tf.function
def train_step(images, labels):
    with tf.GradientTape() as tape:
        predictions = model(images, training=True)
        loss = loss_fn(labels, predictions)
        # model.losses collects the kernel_regularizer penalties, if any
        if model.losses:
            loss += tf.math.add_n(model.losses)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    train_loss(loss)
    train_accuracy(labels, predictions)
# Validation step (no gradient updates)
@tf.function
def val_step(images, labels):
    predictions = model(images, training=False)
    loss = loss_fn(labels, predictions)
    val_loss(loss)
    val_accuracy(labels, predictions)
# Train the model
EPOCHS = 10
for epoch in range(EPOCHS):
    for images, labels in train_dataset:
        train_step(images, labels)
    with summary_writer.as_default():
        tf.summary.scalar('train_loss', train_loss.result(), step=epoch)
        tf.summary.scalar('train_accuracy', train_accuracy.result() * 100, step=epoch)
    for images, labels in val_dataset:
        val_step(images, labels)
    with summary_writer.as_default():
        tf.summary.scalar('val_loss', val_loss.result(), step=epoch)
        tf.summary.scalar('val_accuracy', val_accuracy.result() * 100, step=epoch)
    template = 'Epoch {}, Loss: {:.4f}, Accuracy: {:.2f}%, Val Loss: {:.4f}, Val Accuracy: {:.2f}%'
    print(template.format(epoch + 1,
                          train_loss.result(),
                          train_accuracy.result() * 100,
                          val_loss.result(),
                          val_accuracy.result() * 100))
    train_loss.reset_states()
    train_accuracy.reset_states()
    val_loss.reset_states()
    val_accuracy.reset_states()
# Evaluate on the test set
test_loss = tf.keras.metrics.Mean(name='test_loss')
test_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='test_accuracy')
for images, labels in test_dataset:
    predictions = model(images, training=False)
    loss = loss_fn(labels, predictions)
    test_loss(loss)
    test_accuracy(labels, predictions)
print('Test Loss: {:.4f}, Test Accuracy: {:.2f}%'.format(test_loss.result(), test_accuracy.result() * 100))
# Save the model
model.save('./pokemon_model')
In this code, we use ResNet50's image preprocessing function to resize each image to 224x224 and normalize it. The model stacks several convolutional and pooling layers, then flattens the features and passes them through two fully connected layers to produce the final classification. We use cross-entropy as the loss function, combined with L2 regularization to control overfitting. The optimizer is Adam with a learning rate of 0.001. The training metrics are logged to TensorBoard. Finally, we evaluate the model on the test set and save it to disk.
Original source: https://www.cveoy.top/t/topic/nTIU. All rights reserved by the author. Do not reproduce or scrape!