Preparation
An RNN-based network is chosen here; it operates mainly on words.
```python
Xdata = []
```
The code above converts words into vectors; a few points to note (a sketch follows the list):
- words are limited to max_length_word
- unlike the earlier LSTM example, the one-hot encoding is not built by recording word frequencies
- feature_list records every token, while feature_dict records each token's position in feature_list
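
A minimal sketch of this data-reading step under those constraints; the names max_length_word, feature_list, feature_dict and Xdata follow the notes above, the toy word list and everything else are illustrative assumptions:

```python
max_length_word = 10

feature_list = []   # every character seen so far, in order of first appearance
feature_dict = {}   # character -> its position in feature_list

Xdata = []
words = ["hello", "world", "sequence"]   # placeholder corpus

for word in words:
    if len(word) > max_length_word:
        continue                          # over-long words are discarded
    vector = []
    for ch in word:
        if ch not in feature_dict:        # no frequency-based one-hot here,
            feature_dict[ch] = len(feature_list)   # just first-seen order
            feature_list.append(ch)
        vector.append(feature_dict[ch])
    Xdata.append(vector)

print(Xdata[0])   # "hello" -> [0, 1, 2, 2, 3]
```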
Data preprocessing
As mentioned above I would like to use a sequence-to-sequence approach. Important for this approach is that all words have the same length. Words longer than that length were discarded in the data-reading step above. Now we will add padding to the words that are not long enough.
Another important step is creating a train set and a test set. We only show the network examples from the train set. At the end I will manually evaluate some examples from the test set and discuss what the network has learned. During training we train in batches with a small amount of data; with a random data splitter we get a different train set every run.
```python
before_padding = Xdata[0]
```
A few points here:
- sequence.pad_sequences is used to add the padding, as sketched below
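
A minimal sketch of the padding and the random train/test split, assuming a Keras-style sequence.pad_sequences; the padding direction ('post') and the 90/10 split ratio are assumptions:

```python
import numpy as np
from tensorflow.keras.preprocessing import sequence

max_length_word = 10
Xdata = [[0, 1, 2, 2, 3], [4, 3, 5, 2, 6], [7, 1, 8, 8, 3, 4]]   # toy index sequences

before_padding = Xdata[0]
Xdata = sequence.pad_sequences(Xdata, maxlen=max_length_word, padding='post')
after_padding = Xdata[0]
print(before_padding, after_padding)

# random splitter: a different train set every run
indices = np.random.permutation(len(Xdata))
split = int(0.9 * len(Xdata))
X_train, X_test = Xdata[indices[:split]], Xdata[indices[split:]]
```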
The network
- embeds our characters
- has an encoder which returns a sequence of outputs
- has an attention model which uses this sequence to generate output characters
```python
batch_size = 64
```
Code notes (the sketch below touches on these points):
1. enc_input and dec_output are each arrays of size 10 (one entry per character position)
2. the init weights part
3. what does memory_dim mean?
4. a GRUCell is used as the gated recurrent unit
5. what does feed_previous mean?
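
A minimal sketch of such a network, assuming the TF 1.x legacy seq2seq API (tf.contrib.legacy_seq2seq.embedding_attention_seq2seq, formerly under tf.nn.seq2seq); the placeholder names, vocab_size and the exact hyper-parameters are assumptions, not the original values:

```python
import tensorflow as tf

batch_size = 64
seq_length = 10        # max_length_word: one placeholder per character position
vocab_size = 60        # roughly len(feature_list) plus padding/GO symbols
embedding_dim = 10     # size of the character embeddings
memory_dim = 256       # hidden state size of the GRU (this is what memory_dim controls)

# enc_inp and labels are each lists of 10 int placeholders, one per time step
enc_inp = [tf.placeholder(tf.int32, shape=(None,), name="inp%i" % t)
           for t in range(seq_length)]
labels = [tf.placeholder(tf.int32, shape=(None,), name="labels%i" % t)
          for t in range(seq_length)]
weights = [tf.ones_like(l, dtype=tf.float32) for l in labels]

# decoder inputs: a GO symbol followed by the shifted target characters
dec_inp = [tf.zeros_like(enc_inp[0], dtype=tf.int32, name="GO")] + labels[:-1]

def seq2seq_graph(feed_previous):
    # feed_previous=False: teacher forcing, the decoder sees the true previous
    # character; feed_previous=True: it feeds back its own previous prediction,
    # which is what we want at test time.
    cell = tf.contrib.rnn.GRUCell(memory_dim)
    return tf.contrib.legacy_seq2seq.embedding_attention_seq2seq(
        enc_inp, dec_inp, cell,
        num_encoder_symbols=vocab_size,
        num_decoder_symbols=vocab_size,
        embedding_size=embedding_dim,
        feed_previous=feed_previous)

with tf.variable_scope("seq2seq"):
    dec_outputs, _ = seq2seq_graph(feed_previous=False)       # training graph
with tf.variable_scope("seq2seq", reuse=True):
    dec_outputs_test, _ = seq2seq_graph(feed_previous=True)   # shares the weights

loss = tf.contrib.legacy_seq2seq.sequence_loss(dec_outputs, labels, weights)
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
```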
Training
```python
for index_now in range(1002):
```
Each iteration a different random set of data is selected for training, to add randomness?
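
A minimal sketch of such a training loop, continuing the graph sketch above; X_train / Y_train (the padded source and target index sequences) and the loss logging interval are assumptions:

```python
import numpy as np

sess = tf.Session()
sess.run(tf.global_variables_initializer())

for index_now in range(1002):
    # draw a different random batch every step
    batch_idx = np.random.randint(0, len(X_train), batch_size)
    X_batch = X_train[batch_idx]   # encoder inputs
    Y_batch = Y_train[batch_idx]   # target characters (assumed padded like X_train)

    feed_dict = {enc_inp[t]: X_batch[:, t] for t in range(seq_length)}
    feed_dict.update({labels[t]: Y_batch[:, t] for t in range(seq_length)})

    _, loss_now = sess.run([train_op, loss], feed_dict)
    if index_now % 100 == 0:
        print(index_now, loss_now)
```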
Train analysis
```python
def get_reversed_max_string_logits(logits):
```
Code notes (sketched below):
- first pick the relevant test_case and produce the final result, using the parameters that reached the lowest loss
- then obtain the translated result for single_word via a session run
- the outputs are reversed, and the vectors are mapped back to the corresponding word
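
A minimal sketch of this evaluation step, continuing the sketches above; it assumes the targets were stored reversed during training (hence the un-reversing here), glosses over padding/GO symbols when mapping indices back to characters, and omits restoring the lowest-loss checkpoint with tf.train.Saver:

```python
import numpy as np

def get_reversed_max_string_logits(logits):
    # logits: a list of seq_length arrays with shape (1, vocab_size)
    best_indices = [int(np.argmax(step, axis=1)[0]) for step in logits]
    return best_indices[::-1]          # undo the reversal used for the targets

test_case = X_test[0]                  # one padded test word
feed_dict = {enc_inp[t]: test_case[t:t + 1] for t in range(seq_length)}

# dec_outputs_test was built with feed_previous=True, so only the encoder
# inputs have to be fed; the decoder feeds back its own predictions.
output_logits = sess.run(dec_outputs_test, feed_dict)

indices = get_reversed_max_string_logits(output_logits)
single_word = "".join(feature_list[i] for i in indices if i < len(feature_list))
print(single_word)
```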