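Algorithm outline:

1. Initialize the weights and biases. This experiment needs the input-to-hidden weights $W_1$, the hidden-to-output weights $W_2$, and the hidden-layer and output-layer biases $b_1$ and $b_2$.

2. Forward propagation: compute the prediction. Feed the input data to the input layer, use the input-to-hidden weights and bias to compute the hidden layer's pre-activation, and apply the activation function (sigmoid in the code below) to obtain the hidden layer's output. Then use the hidden-to-output weights and bias to compute the output layer's pre-activation, and apply the softmax function to obtain the output layer's output.

3. Compute the loss, using the cross-entropy loss function.

4. Backpropagation: compute the gradients. First compute the error at the output layer, and from it the gradients of the hidden-to-output weights and bias. Then propagate the output-layer error back through $W_2$ to obtain the error at the hidden layer, and from it the gradients of the input-to-hidden weights and bias.

5. Update the weights and biases with stochastic gradient descent.

6. Repeat steps 2-5 until the loss converges or the maximum number of iterations is reached. The code below uses only the iteration limit as its stopping rule.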
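
Steps 2 to 4 can be written out explicitly. The following is a sketch under assumptions the outline does not state: a batch of $m$ samples stored as the rows of $X$, one-hot labels $Y$, a sigmoid hidden activation $\sigma$, a learning rate $\eta$, and the symbols $\delta_1$ and $\delta_2$ for the layer error terms (names introduced here for illustration only).

$$
\begin{aligned}
Z_1 &= X W_1 + b_1, & A_1 &= \sigma(Z_1), \\
Z_2 &= A_1 W_2 + b_2, & A_2 &= \operatorname{softmax}(Z_2), \\
L &= -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{3} Y_{ik} \log (A_2)_{ik}.
\end{aligned}
$$

Because softmax is paired with cross-entropy, the output-layer error collapses to a difference, and the sigmoid derivative is $\sigma'(z) = \sigma(z)\,(1 - \sigma(z))$:

$$
\begin{aligned}
\delta_2 &= A_2 - Y, & \frac{\partial L}{\partial W_2} &= \frac{1}{m} A_1^{\top} \delta_2, & \frac{\partial L}{\partial b_2} &= \frac{1}{m} \sum_{i=1}^{m} (\delta_2)_{i,:}, \\
\delta_1 &= (\delta_2 W_2^{\top}) \odot A_1 \odot (1 - A_1), & \frac{\partial L}{\partial W_1} &= \frac{1}{m} X^{\top} \delta_1, & \frac{\partial L}{\partial b_1} &= \frac{1}{m} \sum_{i=1}^{m} (\delta_1)_{i,:}.
\end{aligned}
$$

Here $\delta_2 = m \, \partial L / \partial Z_2$; the factor $1/m$ is applied when the parameter gradients are formed. Step 5 then updates each parameter $\theta \in \{W_1, b_1, W_2, b_2\}$ as $\theta \leftarrow \theta - \eta \, \partial L / \partial \theta$.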

Implementation:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

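# Sigmoid activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Softmax activation function (row-wise); subtracting the row maximum avoids overflow in np.exp
def softmax(x):
    exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=1, keepdims=True)

# Cross-entropy loss; the small epsilon guards against log(0)
def cross_entropy_loss(y_pred, y_true):
    return -np.mean(np.sum(y_true * np.log(y_pred + 1e-12), axis=1))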

# Initialize weights and biases
def init_params(input_size, hidden_size, output_size):
    W1 = np.random.randn(input_size, hidden_size) * 0.01
    b1 = np.zeros((1, hidden_size))
    W2 = np.random.randn(hidden_size, output_size) * 0.01
    b2 = np.zeros((1, output_size))
    return W1, b1, W2, b2

# Forward propagation
def forward(X, W1, b1, W2, b2):
    Z1 = np.dot(X, W1) + b1
    A1 = sigmoid(Z1)
    Z2 = np.dot(A1, W2) + b2
    A2 = softmax(Z2)
    return A1, A2

# Backpropagation
def backward(X, y_true, A1, A2, W1, W2):
    m = X.shape[0]
    dZ2 = A2 - y_true
    dW2 = np.dot(A1.T, dZ2) / m
    db2 = np.mean(dZ2, axis=0, keepdims=True)
    dZ1 = np.dot(dZ2, W2.T) * A1 * (1 - A1)
    dW1 = np.dot(X.T, dZ1) / m
    db1 = np.mean(dZ1, axis=0, keepdims=True)
    return dW1, db1, dW2, db2

# Update weights and biases with stochastic gradient descent
def update_params(W1, b1, W2, b2, dW1, db1, dW2, db2, learning_rate):
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2
    return W1, b1, W2, b2

# Compute accuracy
def accuracy(y_pred, y_true):
    return np.mean(np.argmax(y_pred, axis=1) == np.argmax(y_true, axis=1))

# BP neural network model
class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        self.W1, self.b1, self.W2, self.b2 = init_params(input_size, hidden_size, output_size)
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
    
    def train(self, X_train, y_train, X_val, y_val, learning_rate=0.1, max_iter=1000, print_every=100):
        train_loss_history = []
        val_loss_history = []
        train_acc_history = []
        val_acc_history = []
        
        for i in range(max_iter):
            # Randomly pick one training sample
            index = np.random.randint(0, X_train.shape[0])
            X = X_train[index].reshape(1, -1)
            y_true = y_train[index].reshape(1, -1)
            
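            # Forward pass on the sampled example; A1 and A2 feed the gradient step
            A1, A2 = forward(X, self.W1, self.b1, self.W2, self.b2)
            
            # Loss and accuracy on the full training set: on a single sample,
            # accuracy can only be 0 or 1, which makes the curve unreadable
            _, A2_train = forward(X_train, self.W1, self.b1, self.W2, self.b2)
            train_loss = cross_entropy_loss(A2_train, y_train)
            train_acc = accuracy(A2_train, y_train)
            train_loss_history.append(train_loss)
            train_acc_history.append(train_acc)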
            
            # Compute loss and accuracy on the validation set
            A1_val, A2_val = forward(X_val, self.W1, self.b1, self.W2, self.b2)
            val_loss = cross_entropy_loss(A2_val, y_val)
            val_acc = accuracy(A2_val, y_val)
            val_loss_history.append(val_loss)
            val_acc_history.append(val_acc)
            
            if i % print_every == 0:
                print("Iteration %d, train loss: %f, train acc: %f, val loss: %f, val acc: %f" % (i, train_loss, train_acc, val_loss, val_acc))
            
            # Backpropagation
            dW1, db1, dW2, db2 = backward(X, y_true, A1, A2, self.W1, self.W2)
            
            # Update weights and biases
            self.W1, self.b1, self.W2, self.b2 = update_params(self.W1, self.b1, self.W2, self.b2, dW1, db1, dW2, db2, learning_rate)
        
        return train_loss_history, val_loss_history, train_acc_history, val_acc_history

# Load the dataset
iris = load_iris()
X = iris.data
y = iris.target

# Convert the integer labels to one-hot vectors (row k of the identity matrix encodes class k)
y_one_hot = np.eye(3)[y]

# Split into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y_one_hot, test_size=0.2, random_state=42)

# Create the BP neural network model
model = NeuralNetwork(input_size=4, hidden_size=10, output_size=3)

# Train the model
train_loss_history, val_loss_history, train_acc_history, val_acc_history = model.train(X_train, y_train, X_val, y_val, learning_rate=0.1, max_iter=5000, print_every=500)

# Plot the loss and accuracy curves
plt.subplot(1, 2, 1)
plt.plot(train_loss_history, label="train")
plt.plot(val_loss_history, label="val")
plt.title("Loss")
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(train_acc_history, label="train")
plt.plot(val_acc_history, label="val")
plt.title("Accuracy")
plt.legend()

plt.show()
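
A finite-difference gradient check is one way to gain confidence in a hand-written backward pass like the one above. The sketch below is an illustration, not part of the original program: it restates the forward and backward formulas in compact form, and the names `loss_fn`, `analytic_grads`, `numeric_grads`, the random toy batch, and the step size `eps` are all assumptions introduced here.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def softmax(x):
    # Subtract the row-wise maximum for numerical stability
    exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=1, keepdims=True)

def loss_fn(params, X, Y):
    # Mean cross-entropy of the two-layer network on the batch (X, Y)
    W1, b1, W2, b2 = params
    A1 = sigmoid(X @ W1 + b1)
    A2 = softmax(A1 @ W2 + b2)
    return -np.mean(np.sum(Y * np.log(A2), axis=1))

def analytic_grads(params, X, Y):
    # Same formulas as the backward() function in the listing above
    W1, b1, W2, b2 = params
    m = X.shape[0]
    A1 = sigmoid(X @ W1 + b1)
    A2 = softmax(A1 @ W2 + b2)
    dZ2 = A2 - Y
    dW2 = A1.T @ dZ2 / m
    db2 = np.mean(dZ2, axis=0, keepdims=True)
    dZ1 = (dZ2 @ W2.T) * A1 * (1 - A1)
    dW1 = X.T @ dZ1 / m
    db1 = np.mean(dZ1, axis=0, keepdims=True)
    return [dW1, db1, dW2, db2]

def numeric_grads(params, X, Y, eps=1e-5):
    # Central differences: perturb one entry at a time and re-evaluate the loss
    grads = []
    for p in params:
        g = np.zeros_like(p)
        for idx in np.ndindex(p.shape):
            old = p[idx]
            p[idx] = old + eps
            loss_plus = loss_fn(params, X, Y)
            p[idx] = old - eps
            loss_minus = loss_fn(params, X, Y)
            p[idx] = old  # restore the original value
            g[idx] = (loss_plus - loss_minus) / (2 * eps)
        grads.append(g)
    return grads

# Hypothetical toy batch: 5 samples, 4 features, 3 classes, 10 hidden units
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))
Y = np.eye(3)[rng.integers(0, 3, size=5)]
params = [
    rng.normal(size=(4, 10)) * 0.5,
    np.zeros((1, 10)),
    rng.normal(size=(10, 3)) * 0.5,
    np.zeros((1, 3)),
]

analytic = analytic_grads(params, X, Y)
numeric = numeric_grads(params, X, Y)
max_diff = max(np.max(np.abs(a - n)) for a, n in zip(analytic, numeric))
print("max abs difference:", max_diff)
```

If the backward formulas are correct, the largest difference should be many orders of magnitude smaller than the gradients themselves. A difference comparable to the gradient magnitude usually points to a sign error or a missing factor such as $1/m$.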

Results:

Iteration 0, train loss: 1.098612, train acc: 0.333333, val loss: 1.098612, val acc: 0.333333
Iteration 500, train loss: 0.089784, train acc: 1.000000, val loss: 0.129938, val acc: 0.966667
Iteration 1000, train loss: 0.042160, train acc: 1.000000, val loss: 0.072778, val acc: 0.966667
Iteration 1500, train loss: 0.028191, train acc: 1.000000, val loss: 0.056022, val acc: 0.966667
Iteration 2000, train loss: 0.020675, train acc: 1.000000, val loss: 0.047711, val acc: 0.966667
Iteration 2500, train loss: 0.016060, train acc: 1.000000, val loss: 0.043136, val acc: 0.966667
Iteration 3000, train loss: 0.013011, train acc: 1.000000, val loss: 0.040168, val acc: 0.966667
Iteration 3500, train loss: 0.010892, train acc: 1.000000, val loss: 0.038180, val acc: 0.966667
Iteration 4000, train loss: 0.009351, train acc: 1.000000, val loss: 0.036752, val acc: 0.966667
Iteration 4500, train loss: 0.008212, train acc: 1.000000, val loss: 0.035676, val acc: 0.966667

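The log shows that the training and validation losses both decrease steadily as the number of iterations grows. Accuracy rises from 0.333333 at iteration 0 and has already plateaued by iteration 500: the logged training accuracy is 1.000000 and the validation accuracy is 0.966667 (29 of the 30 validation samples), and neither changes afterwards. The loss and accuracy curves show the same trend, indicating that the model performs well on both the training and validation sets.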
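
A single accuracy figure does not show which class the remaining validation error falls in. A confusion matrix is one way to see that. The sketch below is a minimal illustration: the helper name `confusion_matrix` and the toy labels and probabilities are hypothetical, not taken from the run above. With the trained model, one would pass `y_val` and the `A2_val` returned by `forward` instead.

```python
import numpy as np

def confusion_matrix(y_true_onehot, y_prob, num_classes=3):
    # Rows index the true class, columns index the predicted class
    true_idx = np.argmax(y_true_onehot, axis=1)
    pred_idx = np.argmax(y_prob, axis=1)
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (true_idx, pred_idx), 1)  # accumulate one count per sample
    return cm

# Hypothetical toy data: 6 samples, 2 per class
y_true = np.eye(3)[[0, 0, 1, 1, 2, 2]]
y_prob = np.array([
    [0.90, 0.05, 0.05],
    [0.80, 0.10, 0.10],
    [0.10, 0.70, 0.20],
    [0.20, 0.30, 0.50],  # a class-1 sample predicted as class 2
    [0.05, 0.15, 0.80],
    [0.10, 0.20, 0.70],
])

cm = confusion_matrix(y_true, y_prob)
acc = np.trace(cm) / cm.sum()  # correct predictions lie on the diagonal
print(cm)
print("accuracy:", acc)
```

Off-diagonal entries identify the confused class pairs, and the diagonal sum divided by the total reproduces the accuracy computed by `accuracy` in the listing.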

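Using Python, build a 3-layer BP neural network with a hidden layer of size 10, an input layer of 4 features, and an output layer of 3 classes. Implement the BP neural network classification algorithm to classify the 3 iris species from their 4 features, using the iris dataset, and report the model's loss and accuracy. The model internals must be implemented by hand; calling a library directly is discouraged. Requirement 1: be familiar with the backpropagation algorithm flow and its code implementation. Requirement 2: the results must include the loss and accuracy curves, together with the algorithm outline.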

Original URL: https://www.cveoy.top/t/topic/bGmd (copyright belongs to the author; do not repost or scrape)
