TensorFlow - reshape problem with the labels layer, asking the experts for help!

2018-04-08 13:05:27 +08:00
 scoronepion

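I've been working on my graduation project recently, and one step is training an RNN to determine whether the system is currently under attack. I'm using TensorLayer, a library built on TensorFlow, with the ADFA-LD dataset. The network I built is shown in the code below: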

# Network structure
network = tl.layers.EmbeddingInputlayer(x, vocabulary_size=max_sys_call+1, embedding_size=128, name='embedding')
network = tl.layers.RNNLayer(network, cell_fn=tf.contrib.rnn.BasicLSTMCell, cell_init_args={'forget_bias':0.0, 'state_is_tuple':True}, n_hidden=128, name='lstm')
network = tl.layers.DropoutLayer(network, keep=0.8, name='dropout')
# Reshape into a vector
network = tl.layers.FlattenLayer(network, name='flatten')
network = tl.layers.DenseLayer(network, n_units=2, act=tf.identity, name='output')

The placeholders for data and labels are defined as follows:

x = tf.placeholder(tf.int64, shape=[None, max_sequences_len], name='x')
y_ = tf.placeholder(tf.int64, shape=[None, ], name='y_')

The loss function is defined as follows:

# Define the loss function
y = network.outputs
cost = tl.cost.cross_entropy(y, y_, name='entropy')
correct_prediction = tf.equal(tf.argmax(y, 1), y_)
acc = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

But when actually running it, the following error is reported:

[TL] Finished! use $tensorboard --logdir=logs/ to start server
[TL] Start training the network ...

Traceback (most recent call last):
File "D:/PycharmProjects/GraduationProject/DynamicDetection.py", line 123, in <module>
	rnn_lstm(x, y)
File "D:/PycharmProjects/GraduationProject/DynamicDetection.py", line 110, in rnn_lstm
	print_freq=5, X_val=np.array(x_val), y_val=np.array(y_val), eval_train=True, tensorboard=True)
File "C:\Users\hasee\Anaconda2\envs\tfenv\lib\site-packages\tensorlayer\utils.py", line 147, in fit
	loss, _ = sess.run([cost, train_op], feed_dict=feed_dict)
File "C:\Users\hasee\Anaconda2\envs\tfenv\lib\site-packages\tensorflow\python\client\session.py", line 905, in run
	run_metadata_ptr)
File "C:\Users\hasee\Anaconda2\envs\tfenv\lib\site-packages\tensorflow\python\client\session.py", line 1113, in _run
	str(subfeed_t.get_shape())))
    
ValueError: Cannot feed value of shape (100, 2) for Tensor 'y_:0', which has shape '(?,)'
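
The fed shape (100, 2) suggests that the labels are one-hot encoded (the full code below passes them through `tf.keras.utils.to_categorical`), while the placeholder `y_` of shape `[None]` expects one integer class index per sample. A minimal plain-Python sketch of the two label layouts and the conversion between them follows; the helper names `to_one_hot` and `to_class_indices` are made up for illustration and are not TensorFlow or TensorLayer APIs:

```python
def to_one_hot(class_indices, num_classes):
    """Turn integer labels, shape (N,), into one-hot rows, shape (N, num_classes)."""
    return [[1.0 if c == k else 0.0 for k in range(num_classes)]
            for c in class_indices]


def to_class_indices(one_hot_rows):
    """Turn one-hot rows, shape (N, num_classes), back into integer labels, shape (N,)."""
    return [row.index(max(row)) for row in one_hot_rows]


labels = [0, 1, 1, 0]                  # layout a placeholder of shape [None] expects
one_hot = to_one_hot(labels, 2)        # layout a to_categorical-style encoding produces
recovered = to_class_indices(one_hot)  # back to one class index per sample

print(one_hot)
print(recovered)
```

With NumPy arrays the same conversion is `np.argmax(one_hot, axis=1)`. Alternatively, the one-hot step can simply be skipped so that the integer labels are fed as they are.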

I tried changing the labels placeholder to this:

y_ = tf.placeholder(tf.int64, shape=[100, 2], name='y_')

But after this change, a new error is reported, and it occurs while the network is being built:

[TL] FlattenLayer flatten: 640
[TL] DenseLayer  output: 2 identity

Traceback (most recent call last):
File "D:/PycharmProjects/GraduationProject/DynamicDetection.py", line 123, in <module>
	rnn_lstm(x, y)
File "D:/PycharmProjects/GraduationProject/DynamicDetection.py", line 96, in rnn_lstm
	cost = tl.cost.cross_entropy(y, y_, name='entropy')
File "C:\Users\hasee\Anaconda2\envs\tfenv\lib\site-packages\tensorlayer\cost.py", line 36, in cross_entropy
	return tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=target, logits=output, name=name))
File "C:\Users\hasee\Anaconda2\envs\tfenv\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 2038, in sparse_softmax_cross_entropy_with_logits
(labels_static_shape.ndims, logits.get_shape().ndims))

ValueError: Rank mismatch: Rank of labels (received 2) should equal rank of logits minus 1 (received 2).
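
The traceback shows that `tl.cost.cross_entropy` calls `tf.nn.sparse_softmax_cross_entropy_with_logits`, which takes logits of shape (batch, classes) and labels of shape (batch,) holding integer class indices; that is what "rank of labels should equal rank of logits minus 1" refers to. The plain-Python sketch below shows what such a loss computes. It is written only to illustrate the expected label layout, and `sparse_softmax_cross_entropy` here is a stand-in name, not the TensorFlow implementation:

```python
import math


def sparse_softmax_cross_entropy(logits, labels):
    """Mean cross-entropy for logits of shape (batch, classes)
    and integer labels of shape (batch,)."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same batch size")
    losses = []
    for row, label in zip(logits, labels):
        if not isinstance(label, int):
            raise ValueError("each label must be a single class index, not a vector")
        # Numerically stable log-sum-exp: subtract the row maximum first.
        peak = max(row)
        log_sum = peak + math.log(sum(math.exp(v - peak) for v in row))
        # Negative log-probability of the true class.
        losses.append(log_sum - row[label])
    return sum(losses) / len(losses)


logits = [[2.0, 0.0], [0.0, 2.0]]
loss = sparse_softmax_cross_entropy(logits, [0, 1])
print(loss)
```

Passing one-hot rows such as `[[1, 0], [0, 1]]` as `labels` makes this sketch raise a `ValueError`, which mirrors the rank check in the traceback above.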

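I think the problem is with my labels, and maybe they need a reshape? But I've only just started with TensorFlow and don't understand this part well, so I hope someone can point me in the right direction. Many thanks!!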

The complete code is attached:

import os
import numpy as np
import tensorflow as tf
import tensorlayer as tl
from sklearn.cross_validation import train_test_split


max_sys_call = 0
max_sequences_len = 300
learning_rate = 0.0001
ADFA_NormalData_Path = r"./data/ADFA-LD/Normal_Data_Master"
ADFA_WebshellData_Path = r"./data/ADFA-LD/Attack_Data_Master"


def load_one_file(filename):
    # Read the system calls from a single file and track the largest system call number
    global max_sys_call
    x = []
    with open(filename) as f:
        for line in f:
            line = line.strip('\n')
            line = line.split(' ')
            for num in line:
                if len(num) > 0:
                    x.append(int(num))
                    if int(num) > max_sys_call:
                        max_sys_call = int(num)
    return x

def load_ADFA_Data(dir):
    # Load the ADFA dataset
    data = []
    label = []
    g = os.walk(dir)
    i = 0
    for path, d, filelist in g:
        for filename in filelist:
            if filename.endswith('.txt'):
                filepath = os.path.join(path, filename)
                i += 1
                print("[%d] Load %s" % (i, filepath))
                nums = load_one_file(filepath)
                data.append(nums)
                if dir == ADFA_NormalData_Path:
                    label.append(0)
                else:
                    label.append(1)
    return data, label

def rnn_lstm(x, y):
    # Build the RNN, using LSTM
    x_train_and_val, x_test, y_train_and_val, y_test = train_test_split(x, y, test_size=0.4, random_state=0)
    x_train, x_val, y_train, y_val = train_test_split(x_train_and_val, y_train_and_val, test_size=0.3, random_state=0)

    x_train = tl.prepro.pad_sequences(x_train, maxlen=max_sequences_len, value=0.)
    x_val = tl.prepro.pad_sequences(x_val, maxlen=max_sequences_len, value=0.)
    x_test = tl.prepro.pad_sequences(x_test, maxlen=max_sequences_len, value=0.)

    y_train = tf.keras.utils.to_categorical(y_train, num_classes=2)
    y_val = tf.keras.utils.to_categorical(y_val, num_classes=2)
    y_test = tf.keras.utils.to_categorical(y_test, num_classes=2)

    sess = tf.InteractiveSession()

    x = tf.placeholder(tf.int64, shape=[None, max_sequences_len], name='x')
    y_ = tf.placeholder(tf.int64, shape=[None, ], name='y_')
    # y_ = tf.placeholder(tf.int64, shape=[100, 2], name='y_')

    # Network structure
    network = tl.layers.EmbeddingInputlayer(x, vocabulary_size=max_sys_call+1, embedding_size=128, name='embedding')
    network = tl.layers.RNNLayer(network, cell_fn=tf.contrib.rnn.BasicLSTMCell, cell_init_args={'forget_bias':0.0, 'state_is_tuple':True}, n_hidden=128, name='lstm')
    network = tl.layers.DropoutLayer(network, keep=0.8, name='dropout')
    # Reshape into a vector
    network = tl.layers.FlattenLayer(network, name='flatten')
    network = tl.layers.DenseLayer(network, n_units=2, act=tf.identity, name='output')

    # Define the loss function
    y = network.outputs
    cost = tl.cost.cross_entropy(y, y_, name='entropy')
    correct_prediction = tf.equal(tf.argmax(y, 1), y_)
    acc = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    # Define the optimizer
    train_params = network.all_params
    train_op = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost, var_list=train_params)

    # Initialize the model parameters
    tl.layers.initialize_global_variables(sess)

    # Train the network
    tl.utils.fit(sess, network, train_op, cost, np.array(x_train), np.array(y_train), x, y_, acc=acc, n_epoch=1500,
                 print_freq=5, X_val=np.array(x_val), y_val=np.array(y_val), eval_train=True, tensorboard=True)

    # Evaluate the model
    tl.utils.test(sess, network, acc, x_test, y_test, x, y_, batch_size=None, cost=cost)

    sess.close()


x1, y1 = load_ADFA_Data(ADFA_NormalData_Path)
x2, y2 = load_ADFA_Data(ADFA_WebshellData_Path)
x = x1 + x2
y = y1 + y2

rnn_lstm(x, y)
6 replies
epleone
2018-04-08 13:32:07 +08:00
Try changing it to this:
y_ = tf.placeholder(tf.int64, shape=[None, 2], name='y_')
scoronepion
2018-04-08 13:34:02 +08:00
@epleone I tried that, and it still gives the second error...
epleone
2018-04-08 13:38:11 +08:00
@scoronepion
cost = tl.cost.cross_entropy(y, y_, name='entropy')
also needs to be changed to
cost = tl.cost.cross_entropy(y_, y, name='entropy')
scoronepion
2018-04-08 13:43:36 +08:00
@epleone I just tried it and it still doesn't work. It still reports: ValueError: Rank mismatch: Rank of labels (received 2) should equal rank of logits minus 1 (received 2).
epleone
2018-04-08 13:49:11 +08:00
@scoronepion
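That doesn't make sense. The placeholder y_ = tf.placeholder(tf.int64, shape=[None, 2]) should be fine like this. Is the final dtype int, or tf.float32?
When calling the loss function, make sure the y produced by the forward pass comes first and the ground truth comes second. Check your code carefully.
Also, you're using tl, which I'm not familiar with. Can't you just use tf directly?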
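
If the one-hot labels of shape [None, 2] are to be kept, the loss has to be one that takes a full distribution per sample. In TensorFlow 1.x that is `tf.nn.softmax_cross_entropy_with_logits`, whose labels have the same shape as the logits and are normally floats, rather than the sparse variant that `tl.cost.cross_entropy` wraps according to the traceback. The accuracy comparison would then also need the class index of each label row, for example `tf.argmax(y_, 1)`. A plain-Python sketch of that dense form, for illustration only (`dense_softmax_cross_entropy` is a hypothetical name, not a library function):

```python
import math


def dense_softmax_cross_entropy(logits, one_hot_labels):
    """Mean cross-entropy where each label row has the same length as its logits row."""
    losses = []
    for row, target in zip(logits, one_hot_labels):
        if len(row) != len(target):
            raise ValueError("labels must have the same shape as logits")
        # Numerically stable log-sum-exp: subtract the row maximum first.
        peak = max(row)
        log_sum = peak + math.log(sum(math.exp(v - peak) for v in row))
        # -sum_k target[k] * log_softmax(row)[k]
        losses.append(-sum(t * (v - log_sum) for v, t in zip(row, target)))
    return sum(losses) / len(losses)


logits = [[2.0, 0.0], [0.0, 2.0]]
one_hot = [[1.0, 0.0], [0.0, 1.0]]
dense_loss = dense_softmax_cross_entropy(logits, one_hot)

# Accuracy compares the predicted class with the class encoded by each one-hot row.
predicted = [row.index(max(row)) for row in logits]
actual = [row.index(max(row)) for row in one_hot]
accuracy = sum(p == a for p, a in zip(predicted, actual)) / len(actual)

print(dense_loss, accuracy)
```

For one-hot targets this gives the same value as the sparse loss on the corresponding class indices, so the choice between the two is mainly about which label layout the rest of the pipeline produces.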
scoronepion
2018-04-08 13:52:50 +08:00
@epleone OK, thanks a lot, I'll take another look. I'm using tl because I happened to be learning it recently, so I wanted to try it out.

https://www.v2ex.com/t/445036