FLORIZEL:
Should she kneel be?
In shall not weep received; unleased me
And unrespective greeting than dwell in, thee,
look’d on me, son in heavenly properly.
Who wrote this, Shakespeare or a machine learning model?
The answer is the latter! The passage above was produced by a recurrent neural network trained with TensorFlow for 30 epochs, given the seed string "FLORIZEL:". In this article, I'll explain and share the code for training a neural network to write Shakespearean plays, or anything else you'd like it to write!
First, import some basic libraries:
import tensorflow as tf
import numpy as np
import os
import time
TensorFlow makes the works of Shakespeare available as a downloadable dataset. If you're working in an online environment such as Kaggle, make sure you're connected to the internet.
path_to_file = tf.keras.utils.get_file('shakespeare.txt', 'https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt')
The data needs to be decoded as UTF-8.
text = open(path_to_file, 'rb').read().decode(encoding='utf-8')

# length of text is the number of characters in it
print ('Length of text: {} characters'.format(len(text)))
[Output]:
Length of text: 1115394 characters
That's a lot of data to work with!
Let's see what the first 250 characters look like:
print(text[:250])
First, let's see how many distinct characters the file contains:
vocab = sorted(set(text))
print ('{} unique characters'.format(len(vocab)))
[Output]:
65 unique characters
Before training, the strings need to be mapped to a numerical representation.
Below we create two lookup tables: one that maps characters to numbers, and one that maps numbers back to characters.
char2idx = {u:i for i, u in enumerate(vocab)}
idx2char = np.array(vocab)

text_as_int = np.array([char2idx[c] for c in text])
Let's take a look at the mapping:
print('{')
for char,_ in zip(char2idx, range(20)):
    print('  {:4s}: {:3d},'.format(repr(char), char2idx[char]))
print('  ...\n}')
[Output]:
{
  '\n':   0,
  ' ' :   1,
  '!' :   2,
  '$' :   3,
  '&' :   4,
  "'" :   5,
  ',' :   6,
  '-' :   7,
  '.' :   8,
  '3' :   9,
  ':' :  10,
  ...
}
Every distinct character now has its own number.
Let's see how the mapping handles the first two words of the text, 'First Citizen':
print ('{} ---- characters mapped to int ---- > {}'.format(repr(text[:13]), text_as_int[:13]))
These words are converted into a vector of numbers, which can easily be converted back to text with the integer-to-character lookup.
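For example, something like this round-trips the first word back into text using the idx2char array defined above:

# Map the first 13 integers back to characters and join them into a string
print(''.join(idx2char[text_as_int[:13]]))  # -> First Citizen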
Given a sequence of characters, the model should ideally predict the most likely next character.
The text will be split into sequences, with each input sequence containing seq_length characters from the text.
The target for any input sequence is the same sequence shifted one character to the right.
For example, given the input "Hell", the target would be "ello", together forming the word "Hello".
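To make the shift concrete, here is a tiny standalone illustration in plain Python (not part of the training pipeline):

# The input drops the last character; the target drops the first.
sample = "Hello"
print(sample[:-1], "->", sample[1:])  # prints: Hell -> ello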
First, we can use TensorFlow's tf.data.Dataset.from_tensor_slices function to turn the text vector into a stream of character indices.
# The maximum length sentence we want for a single input in characters
seq_length = 100
examples_per_epoch = len(text)//(seq_length+1)

# Create training examples / targets
char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)

for i in char_dataset.take(5):
    print(idx2char[i.numpy()])
[Output]:
F
i
r
s
t
The batch method groups these individual characters into sequences of a fixed size, forming chunks of the text.
sequences = char_dataset.batch(seq_length+1, drop_remainder=True)

for item in sequences.take(5):
    print(repr(''.join(idx2char[item.numpy()])))
[Output]:
'First Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou '
'are all resolved rather to die than to famish?\n\nAll:\nResolved. resolved.\n\nFirst Citizen:\nFirst, you k'
"now Caius Marcius is chief enemy to the people.\n\nAll:\nWe know't, we know't.\n\nFirst Citizen:\nLet us ki"
"ll him, and we'll have corn at our own price.\nIs't a verdict?\n\nAll:\nNo more talking on't; let it be d"
'one: away, away!\n\nSecond Citizen:\nOne word, good citizens.\n\nFirst Citizen:\nWe are accounted poor citi'
For each sequence, we duplicate and shift it using the map method to form an input and a target.
def split_input_target(chunk):
    input_text = chunk[:-1]
    target_text = chunk[1:]
    return input_text, target_text

dataset = sequences.map(split_input_target)
The dataset now holds the input and target pairs we want.
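The printout below can be produced with a short loop like this (a sketch along the lines of the TensorFlow text-generation tutorial; it also defines input_example and target_example, which the step-by-step printout further down relies on):

# Take one (input, target) example from the dataset and show it as text
for input_example, target_example in dataset.take(1):
    print('Input data: ', repr(''.join(idx2char[input_example.numpy()])))
    print('Target data:', repr(''.join(idx2char[target_example.numpy()])))

[Output]: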
Input data:  'First Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou'
Target data: 'irst Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou '
Each index of these vectors is processed as one time step: for the input at step 0, the model receives the numeric index for 'F' and tries to predict 'i' as the next character. At the next time step it does the same thing, but the RNN considers not only the previous steps, but also the character it has just predicted.
for i, (input_idx, target_idx) in enumerate(zip(input_example[:5], target_example[:5])):
    print("Step {:4d}".format(i))
    print("  input: {} ({:s})".format(input_idx, repr(idx2char[input_idx])))
    print("  expected output: {} ({:s})".format(target_idx, repr(idx2char[target_idx])))
[Output]:
Step 0
input: 18 ('F')
expected output: 47 ('i')
Step 1
input: 47 ('i')
expected output: 56 ('r')
Step 2
input: 56 ('r')
expected output: 57 ('s')
Step 3
input: 57 ('s')
expected output: 58 ('t')
Step 4
input: 58 ('t')
expected output: 1 (' ')
TensorFlow's tf.data can be used to split the text into more manageable sequences, but first the data needs to be shuffled and packed into batches.
# Batch size
BATCH_SIZE = 64

# Buffer size to shuffle the dataset
BUFFER_SIZE = 10000

dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)

dataset
[Output]:
<BatchDataset shapes: ((64, 100), (64, 100)), types: (tf.int64, tf.int64)>
Finally, we can build the model. Let's first set a few important variables:
# Length of the vocabulary in chars
vocab_size = len(vocab)

# The embedding dimension
embedding_dim = 256

# Number of RNN units
rnn_units = 1024
The model will have an embedding (input) layer that maps each character's index to a vector of embedding_dim dimensions. It will have a GRU layer (which could be replaced with an LSTM layer) with units = rnn_units. Finally, the output layer will be a standard fully connected layer with vocab_size outputs.
The function below lets us build the model quickly and cleanly.
def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim,
                                  batch_input_shape=[batch_size, None]),
        tf.keras.layers.GRU(rnn_units,
                            return_sequences=True,
                            stateful=True,
                            recurrent_initializer='glorot_uniform'),
        tf.keras.layers.Dense(vocab_size)
    ])
    return model
Assemble the model architecture by calling the function.
model = build_model(
    vocab_size = len(vocab),
    embedding_dim=embedding_dim,
    rnn_units=rnn_units,
    batch_size=BATCH_SIZE)
Let's summarize the model to see how many parameters it has.
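The table below comes from Keras's built-in model.summary():

model.summary()

[Output]: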
Model: "sequential" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= embedding (Embedding) (64, None, 256) 16640 _________________________________________________________________ gru (GRU) (64, None, 1024) 3938304 _________________________________________________________________ dense (Dense) (64, None, 65) 66625 ================================================================= Total params: 4,021,569 Trainable params: 4,021,569 Non-trainable params: 0 _________________________________________________________________
Four million parameters! We'll want to train this for a while.
The problem can now be treated as a classification problem.
Given the previous RNN state and the input at the current time step, predict the class of the next character.
Accordingly, we attach a sparse categorical cross-entropy loss function and the Adam optimizer.
def loss(labels, logits):
    return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)

# Take one batch to sanity-check the untrained model's loss
# (this example batch is assumed here; it was implied but not shown in the original post)
for input_example_batch, target_example_batch in dataset.take(1):
    example_batch_predictions = model(input_example_batch)

example_batch_loss = loss(target_example_batch, example_batch_predictions)
print("Prediction shape: ", example_batch_predictions.shape, " # (batch_size, sequence_length, vocab_size)")
print("scalar_loss: ", example_batch_loss.numpy().mean())

model.compile(optimizer='adam', loss=loss)
[Output]:
Prediction shape: (64, 100, 65) # (batch_size, sequence_length, vocab_size)
scalar_loss: 4.1746616
Training the model, especially on a large dataset like the plays of Shakespeare, takes a long time. Ideally, we don't want to retrain it every time we need a prediction. tf.keras.callbacks.ModelCheckpoint saves the weights at certain checkpoints during training to files that can later be loaded into a fresh model. This also comes in handy if training is interrupted for any reason.
# Directory where the checkpoints will be saved
checkpoint_dir = './training_checkpoints'

# Name of the checkpoint files
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True)
EPOCHS = 30

history = model.fit(dataset, epochs=EPOCHS, callbacks=[checkpoint_callback])
Training should take about six hours. For faster but less impressive results, the number of epochs can be reduced to 10 (anything below 5 will produce pure garbage).
Restore the weights from the latest checkpoint:
tf.train.latest_checkpoint(checkpoint_dir)
With these weights, we can rebuild the model:
model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)

model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))

model.build(tf.TensorShape([1, None]))
The text generation step:
def generate_text(model, start_string):
    # Evaluation step (generating text using the learned model)

    # Number of characters to generate
    num_generate = 1000

    # Converting our start string to numbers (vectorizing)
    input_eval = [char2idx[s] for s in start_string]
    input_eval = tf.expand_dims(input_eval, 0)

    # Empty string to store our results
    text_generated = []

    # Low temperatures results in more predictable text.
    # Higher temperatures results in more surprising text.
    # Experiment to find the best setting.
    temperature = 1.0

    # Here batch size == 1
    model.reset_states()
    for i in range(num_generate):
        predictions = model(input_eval)

        # remove the batch dimension
        predictions = tf.squeeze(predictions, 0)

        # using a categorical distribution to predict the character returned by the model
        predictions = predictions / temperature
        predicted_id = tf.random.categorical(predictions, num_samples=1)[-1,0].numpy()

        # We pass the predicted character as the next input to the model
        # along with the previous hidden state
        input_eval = tf.expand_dims([predicted_id], 0)

        text_generated.append(idx2char[predicted_id])

    return (start_string + ''.join(text_generated))
Finally, given a start string, we can generate some interesting text.
Now, enjoy two plays written by RNNs, one trained for 10 epochs and the other for 30.
print(generate_text(model, start_string=u"ROMEO: "))
ROMEO: how I, away too put That you shall have thieffort, are but love.
JULIET: Go, fight, sir: we say ‘Ay,’ and alack to stand and not to go to; And washt us him to-domm. Ay, my ows young; a man hear from his monsher to thee.
KING RICHARD III: Come, cease. O broteld the costime’s deforment! Thou wilt was quite.
PAULINA: I would you say the hour! Ah, hole for your company: But, good my lord; we have a king, of peace?
BALTHASAR: Cadul and washee could he ha! To curit her I may wench.
GLOUCESTER: Had you here shall such a pierce to temper; Or might his noble offery owe and speed Which seemest thy trims in a weaky amidude By this to the dother, dods citizens.
Third Citizen:
Madam sweet give reward, rebeire them With news gone! Pluck yielding: ’tis sign out things Within risess in strifes all ten times, To dish his finmers for briefily.
JULIET:
Gentlemen, God eveI come approbouting his wife as it, — triumphrous night change you gods, thou goest:
To which will dispersed and France.
Wow! After only 10 epochs, the model already shows an impressive grasp of the form. The spelling is questionable, but there is clear dramatic conflict. The writing can certainly be improved; hopefully the 30-epoch model will do better.
Now enjoy a piece composed entirely by an RNN, one character at a time!
BRUTUS:
Could you be atherveshed him, our two,
But much a tale lendly fear;
For which we in thy shade of Naples.
Here’s no increase False to’t, offorit is the war of white give again.
This is the queen, whose vanoar’s head is worthly.
But cere it be a witch, some comfort.
What, nurse, I say!
Go Hamell.
FLORIZEL:
Should she kneel be?
In shall not weep received; unleased me
And unrespective greeting than dwell in, thee,
look’d on me, son in heavenly properly,
That ever you are my father is but straing;
Unless you would repossess him, hath always louded up,
You provokest. Good faith, o’erlar I can repart the heavens like deeds dills
For temper as soon as another maiden here, and he is bann’d upon which springs;
O’er most upon your voysus, I have no thunder; and my good villain!
Alest each other’s sleepings.
A fool; if this business prating duty
Does these traitors other sorrow.
LUCENTIO:
Tell me, they’s honourably.
Shepherd:
I know, my lord, to London, and you my moved join under him,
Great Apollo’s stan to make a book,
Both yet my father away towards Covent. Tut, And thou still’d by the earthmen lord r sensible your mother?
Servant:
Go, vill! We muster yet, for you’ll not: you are took good mad within your company in rage, I would you fight it so, his eye for every days,
To swear the beam of such a detects,
To Clarence dead to call upon you all I thank your grace, my father and my father, and yourself prevails
My father, hath a sword for hither;
Nor when thy heart is grown grave done.
QUEEN MARGARET:
Thou art a lodging very good and give thanks
With him.
But There is now in hand:
Therefore it be possish’d with Romeo dead.
MENENIUS:
Ha! little very welcome to my daughter’s sword,
Which haply my prayer’s legs, such as he does.
I am banks, sir, I’ll make you say ‘nough; for hither so better now to be so, sent it: it is stranger.
Wow! Interestingly, the model even learned to rhyme in places (Florizel's lines in particular). Imagine what the RNN could write after 50 or even 100 epochs!
Well, not quite yet, but I can imagine a future in which AI publishes plenty of articles designed to go viral. Here's a challenge: collect top articles on a topic, say from Human Parts or a similar publication, then train an AI to write popular pieces. Publish the RNN's output, word for word, and see how it does! Note that I wouldn't recommend training the RNN on more technical publications such as Towards Data Science or Better Programming, since that requires technical knowledge an RNN can't learn in a reasonable amount of time. More philosophical and non-technical writing, however, is well within the RNN's current abilities.
As text generation becomes more and more advanced, it will have the potential to write better than humans, because it has an eye for what content will go viral, what phrasing makes readers feel good, and so on. It's astonishing to think that one day machines could beat humans at the thing we do best: writing. Admittedly, a model can't truly understand what it's writing, but it can master the way humans communicate.
I suppose if you can't beat them, join them!
Original article: https://imba.deephub.ai/p/051053806a5211ea90cd05de3860c663