Abstract: How to use high-level frameworks such as TFLearn and Keras to learn to automatically generate Shakespearean plays or Nietzschean philosophical prose.
In the previous section we looked at TensorFlow's high-level API wrappers, which let us build a DNN classifier for the MNIST handwritten-digit recognition problem in just a few simple steps.
Although TensorFlow keeps pushing its Estimator API forward, that is not the whole toolbox. Beyond TensorFlow's official APIs we also have powerful tools such as TFLearn and Keras.
In this section we take a tour of the arsenal and see what TFLearn, a high-level framework built specifically for TensorFlow, and Keras, which runs on several backends including TensorFlow and CNTK, wrap up for us.
Earlier we briefly introduced the RNN, a powerful architecture for handling sequence data. An important advantage of RNNs over other networks is that, after learning from sequence data, they can generate new sequences on their own.
For example, after learning the Three Hundred Tang Poems an RNN can write poetry, and after learning the Linux kernel source it can write C code (even if the result basically never compiles).
Let's start with an example that automatically writes Shakespearean plays.
Before looking at the code, a few words of caution. Deep learning is fairly demanding about data volume; for generative training like this, you generally need training data on the order of millions to tens of millions of samples to get good results. Training on just a handful of novels will surely produce incoherent fiction. Even a human can't learn to write poetry after reading only a few poems.
Another point: as the amount of training data grows, the demands on time and compute rise steeply as well.
For example, training on Shakespeare's plays: the dataset is not particularly large, only a bit over 160,000 lines, but training on a CPU is not something you finish in an hour or two; think in terms of days.
Later, when we get to image or video examples, CPU training times measured in months would not be surprising.
So how complex is the code for this example that takes about a day to train? The answer: the core code is barely a dozen lines, and with data processing and test code included the whole thing is only about 50 lines.
from __future__ import absolute_import, division, print_function

import os
import pickle
from six.moves import urllib

import tflearn
from tflearn.data_utils import *

path = "shakespeare_input.txt"
char_idx_file = 'char_idx.pickle'

# Download the Shakespeare corpus on first run
if not os.path.isfile(path):
    urllib.request.urlretrieve("https://raw.githubusercontent.com/tflearn/tflearn.github.io/master/resources/shakespeare_input.txt", path)

maxlen = 25

# Reuse the character-to-index dictionary if a previous run saved one
char_idx = None
if os.path.isfile(char_idx_file):
    print('Loading previous char_idx')
    char_idx = pickle.load(open(char_idx_file, 'rb'))

# Cut the text into semi-redundant sequences of maxlen characters
X, Y, char_idx = \
    textfile_to_semi_redundant_sequences(path, seq_maxlen=maxlen, redun_step=3,
                                         pre_defined_char_idx=char_idx)

pickle.dump(char_idx, open(char_idx_file, 'wb'))

# Three stacked LSTM layers with dropout, ending in a softmax over characters
g = tflearn.input_data([None, maxlen, len(char_idx)])
g = tflearn.lstm(g, 512, return_seq=True)
g = tflearn.dropout(g, 0.5)
g = tflearn.lstm(g, 512, return_seq=True)
g = tflearn.dropout(g, 0.5)
g = tflearn.lstm(g, 512)
g = tflearn.dropout(g, 0.5)
g = tflearn.fully_connected(g, len(char_idx), activation='softmax')
g = tflearn.regression(g, optimizer='adam', loss='categorical_crossentropy',
                       learning_rate=0.001)

m = tflearn.SequenceGenerator(g, dictionary=char_idx,
                              seq_maxlen=maxlen,
                              clip_gradients=5.0,
                              checkpoint_path='model_shakespeare')

# Train for 50 epochs, sampling some generated text after each one
for i in range(50):
    seed = random_sequence_from_textfile(path, maxlen)
    m.fit(X, Y, validation_set=0.1, batch_size=128,
          n_epoch=1, run_id='shakespeare')
    print("-- TESTING...")
    print("-- Test with temperature of 1.0 --")
    print(m.generate(600, temperature=1.0, seq_seed=seed))
    print("-- Test with temperature of 0.5 --")
    print(m.generate(600, temperature=0.5, seq_seed=seed))
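Once the script has written a few checkpoints, you don't have to retrain just to experiment with generation. Below is a minimal sketch, assuming TFLearn's SequenceGenerator exposes the same save/load interface as tflearn.DNN and that the graph g, char_idx and maxlen from the script above have been rebuilt; the checkpoint filename is hypothetical and depends on how many training steps you actually ran:

m = tflearn.SequenceGenerator(g, dictionary=char_idx,
                              seq_maxlen=maxlen,
                              clip_gradients=5.0,
                              checkpoint_path='model_shakespeare')
m.load('model_shakespeare-10000')  # hypothetical checkpoint name
# The seed must be exactly maxlen (25) characters long
print(m.generate(600, temperature=0.5,
                 seq_seed="shall i compare thee to a"))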
These scripts use the TFLearn framework, which can be installed with
pip install tflearn
TFLearn is a high-level API framework developed specifically for TensorFlow.
The main benefit of the TFLearn API is readability. Take the core code we just saw:
g = tflearn.input_data([None, maxlen, len(char_idx)])
g = tflearn.lstm(g, 512, return_seq=True)
g = tflearn.dropout(g, 0.5)
g = tflearn.lstm(g, 512, return_seq=True)
g = tflearn.dropout(g, 0.5)
g = tflearn.lstm(g, 512)
g = tflearn.dropout(g, 0.5)
g = tflearn.fully_connected(g, len(char_idx), activation='softmax')
g = tflearn.regression(g, optimizer='adam', loss='categorical_crossentropy',
                       learning_rate=0.001)

m = tflearn.SequenceGenerator(g, dictionary=char_idx,
                              seq_maxlen=maxlen,
                              clip_gradients=5.0,
                              checkpoint_path='model_shakespeare')
From the input data, through three LSTM layers interleaved with three dropout layers, to a final fully connected layer with softmax: the structure reads straight off the code.
Now let's look at the structure of a network that predicts the probability of surviving the Titanic:
# Build neural network
net = tflearn.input_data(shape=[None, 6])
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 2, activation='softmax')
net = tflearn.regression(net)

# Define model
model = tflearn.DNN(net)
# Start training (apply gradient descent algorithm)
model.fit(data, labels, n_epoch=10, batch_size=16, show_metric=True)
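The data and labels fed into model.fit above come from TFLearn's bundled Titanic dataset. If you want to run the snippet end to end, a loading-and-preprocessing sketch along the lines of TFLearn's quickstart tutorial (the ignored columns and the sex-to-float conversion are taken from that tutorial, not from this article) looks roughly like this:

import numpy as np
from tflearn.datasets import titanic
from tflearn.data_utils import load_csv

# Download the Titanic passenger CSV and load it; column 0 is the label (survived)
titanic.download_dataset('titanic_dataset.csv')
data, labels = load_csv('titanic_dataset.csv', target_column=0,
                        categorical_labels=True, n_classes=2)

def preprocess(passengers, columns_to_delete):
    # Drop unused columns (delete from the right so earlier indices stay valid)
    for column in sorted(columns_to_delete, reverse=True):
        [passenger.pop(column) for passenger in passengers]
    for i in range(len(passengers)):
        # Convert the 'sex' field to a float (column 1 once the label is removed)
        passengers[i][1] = 1. if passengers[i][1] == 'female' else 0.
    return np.array(passengers, dtype=np.float32)

# Ignore the 'name' and 'ticket' columns, leaving the 6 features the network expects
data = preprocess(data, [1, 6])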
Your Shakespeare model is probably still busy training at this point. Since we're waiting anyway, let's look at the generation process through a simpler example.
Again we take an official TFLearn example: it reads a list of major US city names and generates some new ones.
Here are the cities starting with Z, as a sample of the data:
Zachary Zafra Zag Zahl Zaleski Zalma Zama Zanesfield Zanesville Zap Zapata Zarah Zavalla Zearing Zebina Zebulon Zeeland Zeigler Zela Zelienople Zell Zellwood Zemple Zena Zenda Zenith Zephyr Zephyr Cove Zephyrhills Zia Pueblo Zillah Zilwaukee Zim Zimmerman Zinc Zion Zionsville Zita Zoar Zolfo Springs Zona Zumbro Falls Zumbrota Zuni Zurich Zwingle Zwolle
There are 20,580 cities in total. This training is much faster: on a pure CPU, one epoch takes roughly 5 to 6 minutes.
The code is below; apart from using two LSTM layers instead of three and generating shorter sequences, it is essentially the same as the Shakespeare script above:
from __future__ import absolute_import, division, print_function

import os
import ssl

from six import moves
import tflearn
from tflearn.data_utils import *

path = "US_Cities.txt"
if not os.path.isfile(path):
    # Skip certificate verification in case the HTTPS download fails on it
    ssl._create_default_https_context = ssl._create_unverified_context
    moves.urllib.request.urlretrieve("https://raw.githubusercontent.com/tflearn/tflearn.github.io/master/resources/US_Cities.txt", path)

maxlen = 20

# Read the whole city-name list as one string
string_utf8 = open(path, "r", encoding="utf-8").read()

X, Y, char_idx = \
    string_to_semi_redundant_sequences(string_utf8, seq_maxlen=maxlen, redun_step=3)

# Two stacked LSTM layers with dropout, ending in a softmax over characters
g = tflearn.input_data(shape=[None, maxlen, len(char_idx)])
g = tflearn.lstm(g, 512, return_seq=True)
g = tflearn.dropout(g, 0.5)
g = tflearn.lstm(g, 512)
g = tflearn.dropout(g, 0.5)
g = tflearn.fully_connected(g, len(char_idx), activation='softmax')
g = tflearn.regression(g, optimizer='adam', loss='categorical_crossentropy',
                       learning_rate=0.001)

m = tflearn.SequenceGenerator(g, dictionary=char_idx,
                              seq_maxlen=maxlen,
                              clip_gradients=5.0,
                              checkpoint_path='model_us_cities')

# Train for 40 epochs, sampling city names at several temperatures after each one
for i in range(40):
    seed = random_sequence_from_string(string_utf8, maxlen)
    m.fit(X, Y, validation_set=0.1, batch_size=128,
          n_epoch=1, run_id='us_cities')
    print("-- TESTING...")
    print("-- Test with temperature of 1.2 --")
    print(m.generate(30, temperature=1.2, seq_seed=seed))
    print("-- Test with temperature of 1.0 --")
    print(m.generate(30, temperature=1.0, seq_seed=seed))
    print("-- Test with temperature of 0.5 --")
    print(m.generate(30, temperature=0.5, seq_seed=seed))
Here are the city names generated after the first training epoch:
t and Shoot Cuthbertd Lettfrecv El Ceoneel Sutd Sa
After the second epoch:
stle Finchford Finch Dasthond madloogd Wlaycoyarfw
After the third:
averal Cape Carteret Acbiropa Heowar Sor Dittoy Do
And after the tenth:
hoenchen Schofield Stcojos Schabell StcaKnerum Cri
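Both examples pass a temperature argument to generate. Temperature rescales the predicted character distribution before sampling: values below 1 concentrate probability on the most likely characters (safer, more repetitive output), while values above 1 flatten the distribution (more adventurous output, more misspellings). Here is a minimal sketch of the idea, using a small hand-written helper rather than anything from TFLearn; the sample() function in the Keras example later in this article does essentially the same thing:

import numpy as np

def sample_with_temperature(preds, temperature=1.0):
    # Rescale the distribution in log space, then renormalize
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds + 1e-12) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    # Draw a single index from the rescaled distribution
    return np.argmax(np.random.multinomial(1, preds, 1))

probs = [0.6, 0.3, 0.1]
print(sample_with_temperature(probs, temperature=0.5))  # usually picks index 0
print(sample_with_temperature(probs, temperature=1.2))  # picks 1 or 2 more often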
Keras is an API that can run on top of several backends, including TensorFlow and Microsoft's CNTK.
It can be installed with
pip install keras
With TensorFlow already installed, Keras will pick TensorFlow as its backend.
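To confirm (or change) which backend is in use, multi-backend Keras reads the "backend" field of ~/.keras/keras.json, which can be overridden with the KERAS_BACKEND environment variable. A quick check:

# Print the backend Keras has selected, e.g. "tensorflow" or "cntk"
from keras import backend as K
print(K.backend())

# To switch for a single run, set the environment variable before launching:
#   KERAS_BACKEND=cntk python my_script.py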
Let's also look at a text-generation example in Keras. The official example generates sentences in the style of Nietzsche.
The core is just six statements:
model = Sequential()
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))

optimizer = RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)
The complete code is below; give it a run and play with it. If Nietzsche doesn't interest you, swap in some other text. But note, as the comment at the top of the script says: you can use whatever text you like, as long as the corpus stays above roughly 100k characters, and ideally above 1M.
'''Example script to generate text from Nietzsche's writings.

At least 20 epochs are required before the generated text
starts sounding coherent. It is recommended to run this script on GPU,
as recurrent networks are quite computationally intensive.

If you try this script on new data, make sure your corpus
has at least ~100k characters. ~1M is better.
'''
from __future__ import print_function

from keras.callbacks import LambdaCallback
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers import LSTM
from keras.optimizers import RMSprop
from keras.utils.data_utils import get_file
import numpy as np
import random
import sys
import io

path = get_file('nietzsche.txt', origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
with io.open(path, encoding='utf-8') as f:
    text = f.read().lower()
print('corpus length:', len(text))

chars = sorted(list(set(text)))
print('total chars:', len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

# cut the text in semi-redundant sequences of maxlen characters
maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('nb sequences:', len(sentences))

print('Vectorization...')
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=np.bool)
y = np.zeros((len(sentences), len(chars)), dtype=np.bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1

# build the model: a single LSTM
print('Build model...')
model = Sequential()
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))

optimizer = RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)


def sample(preds, temperature=1.0):
    # helper function to sample an index from a probability array
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)


def on_epoch_end(epoch, logs):
    # Function invoked at end of each epoch. Prints generated text.
    print()
    print('----- Generating text after Epoch: %d' % epoch)

    start_index = random.randint(0, len(text) - maxlen - 1)
    for diversity in [0.2, 0.5, 1.0, 1.2]:
        print('----- diversity:', diversity)

        generated = ''
        sentence = text[start_index: start_index + maxlen]
        generated += sentence
        print('----- Generating with seed: "' + sentence + '"')
        sys.stdout.write(generated)

        for i in range(400):
            x_pred = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(sentence):
                x_pred[0, t, char_indices[char]] = 1.

            preds = model.predict(x_pred, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]

            generated += next_char
            sentence = sentence[1:] + next_char

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()

print_callback = LambdaCallback(on_epoch_end=on_epoch_end)

model.fit(x, y,
          batch_size=128,
          epochs=60,
          callbacks=[print_callback])
Author: lusing
This article is original content from the Yunqi Community (云栖社区) and may not be reproduced without permission.