Keras generative text model training code example
Below is example code that trains a character-level generative text model with the Keras framework:
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense
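# Load the text data (UTF-8 is assumed here; adjust to match your file)
with open('text_data.txt', 'r', encoding='utf-8') as file:
    text_data = file.read()
# Build the character <-> index lookup tables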
chars = sorted(list(set(text_data)))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))
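# Prepare the training data: overlapping windows of max_len characters,
# each paired with the character that follows it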
max_len = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text_data) - max_len, step):
    sentences.append(text_data[i:i + max_len])
    next_chars.append(text_data[i + max_len])
# One-hot encode the training data
# (the np.bool alias was removed in NumPy 1.24, so the built-in bool is used)
X = np.zeros((len(sentences), max_len, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for j, char in enumerate(sentence):
        X[i, j, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1
# Build the model: one LSTM layer followed by a softmax over the vocabulary
model = Sequential()
model.add(LSTM(128, input_shape=(max_len, len(chars))))
model.add(Dense(len(chars), activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
# Train the model
model.fit(X, y, batch_size=128, epochs=20)
# Generate text, starting from a random max_len-character seed
start_index = np.random.randint(0, len(text_data) - max_len - 1)
seed_text = text_data[start_index:start_index + max_len]
generated_text = seed_text
for _ in range(400):
    # One-hot encode the current seed window
    x_pred = np.zeros((1, max_len, len(chars)))
    for t, char in enumerate(seed_text):
        x_pred[0, t, char_indices[char]] = 1.
    preds = model.predict(x_pred, verbose=0)[0]
    # Greedy decoding: always pick the most probable next character
    next_index = np.argmax(preds)
    next_char = indices_char[next_index]
    generated_text += next_char
    # Slide the window forward by one character
    seed_text = seed_text[1:] + next_char
print(generated_text)
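This example trains an LSTM model to generate text one character at a time. It first loads the text data and builds the character-to-index lookup tables. It then slices the text into overlapping 40-character training samples, each paired with the character that follows it, and one-hot encodes them. Next, it builds and compiles the model (a single 128-unit LSTM layer followed by a softmax layer over the vocabulary) and trains it for 20 epochs. Finally, the trained model generates 400 characters from a random seed, picking the most probable next character at each step.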
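To make the slicing and vectorization steps concrete, here is a minimal sketch that runs the same windowing and one-hot encoding on a tiny string, using NumPy only. The toy corpus and the `toy_`-prefixed names are assumptions chosen for illustration; they are not part of the original code.

```python
import numpy as np

# Toy corpus and settings (hypothetical, for illustration only).
toy_text = "hello world"
toy_max_len = 4   # window length, much smaller than the 40 used above
toy_step = 3      # stride between consecutive windows

toy_chars = sorted(set(toy_text))
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}

# Slide a fixed-length window over the text; the character right after
# each window is the prediction target.
windows = []
targets = []
for i in range(0, len(toy_text) - toy_max_len, toy_step):
    windows.append(toy_text[i:i + toy_max_len])
    targets.append(toy_text[i + toy_max_len])

# One-hot encode: X is (samples, window length, vocabulary size),
# y is (samples, vocabulary size).
X_toy = np.zeros((len(windows), toy_max_len, len(toy_chars)), dtype=bool)
y_toy = np.zeros((len(windows), len(toy_chars)), dtype=bool)
for i, window in enumerate(windows):
    for t, ch in enumerate(window):
        X_toy[i, t, toy_char_indices[ch]] = True
    y_toy[i, toy_char_indices[targets[i]]] = True

print(windows)                   # ['hell', 'lo w', 'worl']
print(targets)                   # ['o', 'o', 'd']
print(X_toy.shape, y_toy.shape)  # (3, 4, 8) (3, 8)
```

Each row of `X_toy` holds exactly one `True` per time step, and each row of `y_toy` holds exactly one `True` in total. This is the layout that `categorical_crossentropy` with a softmax output expects.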