Iris Dataset Classification: Forward Propagation, Backpropagation, and Loss Curve Visualization

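This experiment uses TensorFlow to implement a simple neural network that classifies the Iris dataset, covering forward propagation, backpropagation, and Loss curve visualization.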

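First, we load the Iris dataset and split it into a training set and a test set. Next, tf.data.Dataset splits the data into batches, and tf.Variable marks the parameters as trainable.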
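As a rough illustration of how batching determines the number of steps per epoch, the sketch below assumes the 150-sample Iris dataset, a 20% test split, and a batch size of 32, as in the code further down. The helper name `batch_sizes` is hypothetical and not part of TensorFlow. It mirrors the default behavior of `Dataset.batch`, which keeps the final partial batch.

```python
def batch_sizes(num_samples, batch_size):
    """Sizes of consecutive batches when the last partial batch is kept."""
    full, rest = divmod(num_samples, batch_size)
    return [batch_size] * full + ([rest] if rest else [])

total_samples = 150                        # size of the Iris dataset
test_samples = total_samples * 20 // 100   # 20% held out for testing
train_samples = total_samples - test_samples

train_batches = batch_sizes(train_samples, 32)
test_batches = batch_sizes(test_samples, 32)
print(train_batches, len(train_batches))
print(test_batches, len(test_batches))
```

Under these assumptions the training set yields four batches per epoch, which is why the training loop averages the accumulated Loss over 4 steps.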

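In the training part, nested loops iterate over every epoch and every step. tf.GradientTape records the forward computation, and a mean squared error loss function computes the Loss. After the Loss of a batch is computed, tape.gradient computes the gradient of the Loss with respect to each parameter, and the parameters are updated by gradient descent. The Loss from each step is also accumulated so that it can be averaged over the epoch, because the per-epoch average is less noisy than the Loss of any single batch. Finally, the Loss of each epoch is recorded, and printed every 10 epochs.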
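To make the arithmetic of one training step concrete, here is a minimal pure-Python sketch for a single sample. It is not the TensorFlow implementation: the logits and the label are made up, the three logits are treated directly as the parameters, and a central-difference gradient stands in for tape.gradient, which uses automatic differentiation instead.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def one_hot(label, depth):
    """One-hot encode an integer class label."""
    return [1.0 if i == label else 0.0 for i in range(depth)]

def mse(target, pred):
    """Mean squared error between two equal-length lists."""
    return sum((t - p) ** 2 for t, p in zip(target, pred)) / len(target)

def loss_fn(logits, label):
    """Softmax followed by MSE against the one-hot label."""
    return mse(one_hot(label, len(logits)), softmax(logits))

def numerical_grad(logits, label, eps=1e-6):
    """Central-difference gradient (illustrative stand-in for tape.gradient)."""
    grads = []
    for i in range(len(logits)):
        up = logits[:i] + [logits[i] + eps] + logits[i + 1:]
        down = logits[:i] + [logits[i] - eps] + logits[i + 1:]
        grads.append((loss_fn(up, label) - loss_fn(down, label)) / (2 * eps))
    return grads

logits = [2.0, 1.0, 0.1]  # made-up scores for one sample
label = 0                 # made-up true class
lr = 0.1

loss_before = loss_fn(logits, label)
grads = numerical_grad(logits, label)
# Same update rule as assign_sub: parameter = parameter - lr * gradient
logits_after = [z - lr * g for z, g in zip(logits, grads)]
loss_after = loss_fn(logits_after, label)
print(loss_before, loss_after)
```

A single update with a small learning rate should lower the Loss slightly. The real training loop repeats this step for every batch and every epoch.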

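In the testing part, the updated parameters are used to make predictions and compute the accuracy. As with the Loss, the accuracy of each epoch is recorded.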
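The accuracy computation can be sketched without TensorFlow as follows. The probability rows and labels are made up for illustration, and the helper names `argmax` and `accuracy` are hypothetical. The idea is the same as in the test loop: take the index of the largest probability as the predicted class, count the matches, and divide by the number of samples.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(prob_rows, labels):
    """Fraction of rows whose most probable class equals the label."""
    correct = sum(1 for probs, label in zip(prob_rows, labels)
                  if argmax(probs) == label)
    return correct / len(labels)

# Made-up softmax outputs for four samples and their true classes
probs = [
    [0.70, 0.20, 0.10],
    [0.10, 0.30, 0.60],
    [0.20, 0.50, 0.30],
    [0.40, 0.35, 0.25],
]
labels = [0, 2, 1, 1]

print(accuracy(probs, labels))  # 3 of 4 predictions match -> 0.75
```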

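Finally, we plot the Loss curve with matplotlib. The curve shows that Loss drops quickly early in training and more slowly later on, and may even trend upward. The fast early drop shows the model fitting the data quickly. A late rise in the training Loss, however, is not by itself evidence of overfitting: overfitting shows up as training Loss that keeps falling while test accuracy stalls or drops, whereas a rising training Loss more often points to a learning rate that is too large. When training a neural network, it is therefore worth tuning hyperparameters such as the learning rate and regularization strength, and comparing the training Loss against the test accuracy.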
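One rough way to check a recorded Loss list for a late upward trend is to compare the final value with the minimum. The sketch below is only illustrative: `summarize_loss` is a hypothetical helper and the Loss values are made up, not results from this experiment.

```python
def summarize_loss(losses):
    """Return (epoch index of the minimum loss, final loss minus minimum loss)."""
    best_epoch = min(range(len(losses)), key=lambda i: losses[i])
    return best_epoch, losses[-1] - losses[best_epoch]

# Made-up per-epoch losses: fast drop, slow drop, then a slight rise
made_up_losses = [0.25, 0.18, 0.14, 0.12, 0.11, 0.115, 0.12]

best_epoch, rise = summarize_loss(made_up_losses)
print("lowest loss at epoch", best_epoch)
print("rise since the minimum:", rise)
```

A rise of zero means the Loss was still at its lowest point when training ended. A positive rise suggests checking the learning rate and the test accuracy over the same epochs.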

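In summary, this experiment covers the basic workflow of implementing a neural network in TensorFlow, including forward propagation, backpropagation, and parameter updates, as well as visualizing the Loss curve with matplotlib.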

The experiment code follows:

import tensorflow as tf
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from matplotlib import pyplot as plt
import numpy as np

# Load the data: input features and labels
iris = load_iris()
X = iris.data
y = iris.target
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2)


# Cast the features to float32 so they match the dtype of w1 in tf.matmul
x_train = tf.cast(x_train, tf.float32)
x_test = tf.cast(x_test, tf.float32)

# Split the data into batches of 32 samples
train_db = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(32)
test_db = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)

# tf.Variable() marks the parameters as trainable
# A fixed seed makes the random initial values the same on every run
w1 = tf.Variable(tf.random.truncated_normal([4, 3], stddev=0.1, seed=1))
b1 = tf.Variable(tf.random.truncated_normal([3], stddev=0.1, seed=1))
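# Learning rate
lr = 0.1
train_loss = []
test_acc = []
epochs = 500
# With 120 training samples and a batch size of 32, each epoch has 4 steps;
# loss_all accumulates the 4 per-step loss values
loss_all = 0

# Training
# Each epoch makes one pass over the training set
for epoch in range(epochs):
    # Each step processes one batch
    for step, (x_batch, y_batch) in enumerate(train_db):
        # The with block records operations for gradient computation
        with tf.GradientTape() as tape:
            # Linear layer: multiply-add
            y = tf.matmul(x_batch, w1) + b1
            # Softmax turns the outputs into a probability distribution
            y = tf.nn.softmax(y)
            # Convert the labels to one-hot form so they can be compared with y
            y_true = tf.one_hot(y_batch, depth=3)
            # Mean squared error loss
            loss = tf.reduce_mean(tf.square(y_true - y))
        # Accumulate the per-step loss so it can be averaged over the epoch
        loss_all += loss.numpy()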

        # Compute the gradient of the loss with respect to each parameter
        grads = tape.gradient(loss, [w1, b1])
        # Gradient descent update: w1 = w1 - lr * w1_grad, b1 = b1 - lr * b1_grad
        w1.assign_sub(lr * grads[0])  # update w1 in place
        b1.assign_sub(lr * grads[1])  # update b1 in place

    # Print the average loss every 10 epochs
    if epoch % 10 == 0:
        print("Epoch {}, loss: {}".format(epoch, loss_all / 4))
    # Record the mean loss over the 4 steps of this epoch
    train_loss.append(loss_all / 4)
    loss_all = 0

    # Testing
    total_correct, total_number = 0, 0
    for x_batch, y_batch in test_db:
        # Predict with the updated parameters
        y = tf.matmul(x_batch, w1) + b1
        y = tf.nn.softmax(y)
        # Index of the largest value in each row of y, i.e. the predicted class
        pred = tf.argmax(y, axis=1)
        # Cast pred to the dtype of the labels
        pred = tf.cast(pred, dtype=y_batch.dtype)
        # 1 where the prediction is correct, 0 otherwise (bool cast to int)
        correct = tf.cast(tf.equal(pred, y_batch), dtype=tf.int32)
        # Count the correct predictions in this batch
        correct = tf.reduce_sum(correct)
        total_correct += int(correct)
        # total_number is the number of test samples seen so far; shape[0] is the row count of the batch
        total_number += x_batch.shape[0]

    # Overall accuracy
    acc = total_correct / total_number
    test_acc.append(acc)
    if epoch % 10 == 0:
        print("Test_acc:", acc)
        print("--------------------------")

# Plot the loss curve
plt.title('Loss Function Curve')
plt.xlabel('Epoch')
plt.ylabel('Loss')
# Plot the train_loss values as a connected line; the legend label is Loss
plt.plot(train_loss, label="$Loss$")
plt.legend()
plt.show()