
TensorFlow - reshape problem with the labels layer; any help appreciated!

    scoronepion · 2018-04-08 13:05:27 +08:00 · 2917 views

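    I've recently been working on my graduation project, and one part of it is training an RNN to decide whether the system is currently under attack. I'm using TensorLayer, a library built on TensorFlow, with the ADFA-LD dataset. The network is built as shown in the code below: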

    # Network structure
    network = tl.layers.EmbeddingInputlayer(x, vocabulary_size=max_sys_call+1, embedding_size=128, name='embedding')
    network = tl.layers.RNNLayer(network, cell_fn=tf.contrib.rnn.BasicLSTMCell, cell_init_args={'forget_bias':0.0, 'state_is_tuple':True}, n_hidden=128, name='lstm')
    network = tl.layers.DropoutLayer(network, keep=0.8, name='dropout')
    # Flatten into a vector
    network = tl.layers.FlattenLayer(network, name='flatten')
    network = tl.layers.DenseLayer(network, n_units=2, act=tf.identity, name='output')
    

    The placeholders for data and labels are defined as follows:

    x = tf.placeholder(tf.int64, shape=[None, max_sequences_len], name='x')
    y_ = tf.placeholder(tf.int64, shape=[None, ], name='y_')
    

    The loss function is defined as follows:

    # Define the loss function
    y = network.outputs
    cost = tl.cost.cross_entropy(y, y_, name='entropy')
    correct_prediction = tf.equal(tf.argmax(y, 1), y_)
    acc = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    

    But at run time I get the following error:

    [TL] Finished! use $tensorboard --logdir=logs/ to start server
    [TL] Start training the network ...
    
    Traceback (most recent call last):
    File "D:/PycharmProjects/GraduationProject/DynamicDetection.py", line 123, in <module>
    	rnn_lstm(x, y)
    File "D:/PycharmProjects/GraduationProject/DynamicDetection.py", line 110, in rnn_lstm
    	print_freq=5, X_val=np.array(x_val), y_val=np.array(y_val), eval_train=True, tensorboard=True)
    File "C:\Users\hasee\Anaconda2\envs\tfenv\lib\site-packages\tensorlayer\utils.py", line 147, in fit
    	loss, _ = sess.run([cost, train_op], feed_dict=feed_dict)
    File "C:\Users\hasee\Anaconda2\envs\tfenv\lib\site-packages\tensorflow\python\client\session.py", line 905, in run
    	run_metadata_ptr)
    File "C:\Users\hasee\Anaconda2\envs\tfenv\lib\site-packages\tensorflow\python\client\session.py", line 1113, in _run
    	str(subfeed_t.get_shape())))
        
    ValueError: Cannot feed value of shape (100, 2) for Tensor 'y_:0', which has shape '(?,)'
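
The (100, 2) in this message is consistent with the labels having been one-hot encoded before being fed: the full listing further down passes them through `tf.keras.utils.to_categorical`, and 100 is presumably the batch size used by `tl.utils.fit`. A minimal NumPy sketch of that shape change, where `one_hot` is a hypothetical stand-in for the Keras helper and not the library's own code:

```python
import numpy as np

def one_hot(labels, num_classes):
    """Hypothetical stand-in for tf.keras.utils.to_categorical."""
    labels = np.asarray(labels, dtype=np.int64)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[labels]

batch_labels = np.array([0, 1] * 50, dtype=np.int64)  # 100 integer class ids
encoded = one_hot(batch_labels, num_classes=2)

print(batch_labels.shape)  # (100,)   fits a placeholder of shape [None]
print(encoded.shape)       # (100, 2) does not fit a placeholder of shape [None]
```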
    

    I tried changing the labels placeholder to this:

    y_ = tf.placeholder(tf.int64, shape=[100, 2], name='y_')
    

    After that change I get a new error, and this one is raised while the network is being built:

    [TL] FlattenLayer flatten: 640
    [TL] DenseLayer  output: 2 identity
    
    Traceback (most recent call last):
    File "D:/PycharmProjects/GraduationProject/DynamicDetection.py", line 123, in <module>
    	rnn_lstm(x, y)
    File "D:/PycharmProjects/GraduationProject/DynamicDetection.py", line 96, in rnn_lstm
    	cost = tl.cost.cross_entropy(y, y_, name='entropy')
    File "C:\Users\hasee\Anaconda2\envs\tfenv\lib\site-packages\tensorlayer\cost.py", line 36, in cross_entropy
    	return tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=target, logits=output, name=name))
    File "C:\Users\hasee\Anaconda2\envs\tfenv\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 2038, in sparse_softmax_cross_entropy_with_logits
    (labels_static_shape.ndims, logits.get_shape().ndims))
    
    ValueError: Rank mismatch: Rank of labels (received 2) should equal rank of logits minus 1 (received 2).
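
The traceback shows `tl.cost.cross_entropy` delegating to `tf.nn.sparse_softmax_cross_entropy_with_logits`, which takes one integer class id per example, so the labels must have one dimension fewer than the logits. The NumPy sketch below imitates that contract for 2-D logits to make the rank rule concrete. It is an illustration only, not TensorFlow's implementation, and `sparse_softmax_xent` is a made-up name.

```python
import numpy as np

def sparse_softmax_xent(logits, labels):
    """Per-example cross entropy for logits [batch, classes] and int labels [batch]."""
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels)
    if labels.ndim != logits.ndim - 1:
        raise ValueError(
            "Rank mismatch: labels have rank %d, logits have rank %d"
            % (labels.ndim, logits.ndim))
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the true class for each row.
    return -log_probs[np.arange(logits.shape[0]), labels]

logits = np.array([[2.0, 0.0], [0.0, 2.0]])

# Integer labels of shape (2,) are accepted.
losses = sparse_softmax_xent(logits, np.array([0, 1]))

# One-hot labels of shape (2, 2) have the same rank as the logits and are rejected.
try:
    sparse_softmax_xent(logits, np.array([[1, 0], [0, 1]]))
    rejected = False
except ValueError:
    rejected = True
```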
    

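    I think the problem is with my labels, and perhaps they need a reshape? But I've only just started with TensorFlow and don't understand this part well, so I'd be grateful if someone could point me in the right direction. Many thanks!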

    Full code attached:

    import os
    import numpy as np
    import tensorflow as tf
    import tensorlayer as tl
    from sklearn.cross_validation import train_test_split
    
    
    max_sys_call = 0
    max_sequences_len = 300
    learning_rate = 0.0001
    ADFA_NormalData_Path = r"./data/ADFA-LD/Normal_Data_Master"
    ADFA_WebshellData_Path = r"./data/ADFA-LD/Attack_Data_Master"
    
    
    def load_one_file(filename):
        # Read the system calls from a single file and track the largest system call number
        global max_sys_call
        x = []
        with open(filename) as f:
            for line in f:
                line = line.strip('\n')
                line = line.split(' ')
                for num in line:
                    if len(num) > 0:
                        x.append(int(num))
                        if int(num) > max_sys_call:
                            max_sys_call = int(num)
        return x
    
    def load_ADFA_Data(dir):
        # Load the ADFA dataset
        data = []
        label = []
        g = os.walk(dir)
        i = 0
        for path, d, filelist in g:
            for filename in filelist:
                if filename.endswith('.txt'):
                    filepath = os.path.join(path, filename)
                    i += 1
                    print("[%d] Load %s" % (i, filepath))
                    nums = load_one_file(filepath)
                    data.append(nums)
                    if dir == ADFA_NormalData_Path:
                        label.append(0)
                    else:
                        label.append(1)
        return data, label
    
    def rnn_lstm(x, y):
        # Build the RNN, using LSTM
        x_train_and_val, x_test, y_train_and_val, y_test = train_test_split(x, y, test_size=0.4, random_state=0)
        x_train, x_val, y_train, y_val = train_test_split(x_train_and_val, y_train_and_val, test_size=0.3, random_state=0)

        x_train = tl.prepro.pad_sequences(x_train, maxlen=max_sequences_len, value=0.)
        x_val = tl.prepro.pad_sequences(x_val, maxlen=max_sequences_len, value=0.)
        x_test = tl.prepro.pad_sequences(x_test, maxlen=max_sequences_len, value=0.)

        y_train = tf.keras.utils.to_categorical(y_train, num_classes=2)
        y_val = tf.keras.utils.to_categorical(y_val, num_classes=2)
        y_test = tf.keras.utils.to_categorical(y_test, num_classes=2)

        sess = tf.InteractiveSession()

        x = tf.placeholder(tf.int64, shape=[None, max_sequences_len], name='x')
        y_ = tf.placeholder(tf.int64, shape=[None, ], name='y_')
        # y_ = tf.placeholder(tf.int64, shape=[100, 2], name='y_')

        # Network structure
        network = tl.layers.EmbeddingInputlayer(x, vocabulary_size=max_sys_call+1, embedding_size=128, name='embedding')
        network = tl.layers.RNNLayer(network, cell_fn=tf.contrib.rnn.BasicLSTMCell, cell_init_args={'forget_bias':0.0, 'state_is_tuple':True}, n_hidden=128, name='lstm')
        network = tl.layers.DropoutLayer(network, keep=0.8, name='dropout')
        # Flatten into a vector
        network = tl.layers.FlattenLayer(network, name='flatten')
        network = tl.layers.DenseLayer(network, n_units=2, act=tf.identity, name='output')

        # Define the loss function
        y = network.outputs
        cost = tl.cost.cross_entropy(y, y_, name='entropy')
        correct_prediction = tf.equal(tf.argmax(y, 1), y_)
        acc = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

        # Define the optimizer
        train_params = network.all_params
        train_op = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost, var_list=train_params)

        # Initialize the model parameters
        tl.layers.initialize_global_variables(sess)

        # Train the network
        tl.utils.fit(sess, network, train_op, cost, np.array(x_train), np.array(y_train), x, y_, acc=acc, n_epoch=1500,
                     print_freq=5, X_val=np.array(x_val), y_val=np.array(y_val), eval_train=True, tensorboard=True)

        # Evaluate the model
        tl.utils.test(sess, network, acc, x_test, y_test, x, y_, batch_size=None, cost=cost)

        sess.close()
    
    
    x1, y1 = load_ADFA_Data(ADFA_NormalData_Path)
    x2, y2 = load_ADFA_Data(ADFA_WebshellData_Path)
    x = x1 + x2
    y = y1 + y2
    
    rnn_lstm(x, y)
    
    6 replies    2018-04-08 13:52:50 +08:00
    epleone · #1 · 2018-04-08 13:32:07 +08:00
    Try changing it to this:
    y_ = tf.placeholder(tf.int64, shape=[None, 2], name='y_')
    scoronepion (OP) · #2 · 2018-04-08 13:34:02 +08:00
    @epleone I tried that, and it still raises the second error...
    epleone · #3 · 2018-04-08 13:38:11 +08:00
    @scoronepion
    cost = tl.cost.cross_entropy(y, y_, name='entropy')
    also needs to be changed to
    cost = tl.cost.cross_entropy(y_, y, name='entropy')
    scoronepion (OP) · #4 · 2018-04-08 13:43:36 +08:00
    @epleone I just tried it and it still doesn't work. It keeps raising: ValueError: Rank mismatch: Rank of labels (received 2) should equal rank of logits minus 1 (received 2).
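    epleone · #5 · 2018-04-08 13:49:11 +08:00
    @scoronepion
    That doesn't make sense. A placeholder like y_ = tf.placeholder(tf.int64, shape=[None, 2]) should be fine. Is the final dtype int, or tf.float32?
    When you call the loss function, make sure the y produced by the forward pass comes first and the ground truth (gt) comes second. Check your code carefully.
    Also, you're using tl, which I'm not familiar with. Can't you just use tf directly?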
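
On the dtype question: if the labels stay one-hot with a `[None, 2]` placeholder, the loss that fits is the dense form, which in TensorFlow is `tf.nn.softmax_cross_entropy_with_logits` and expects labels with the same shape as the logits, normally as floats. Whether TensorLayer wraps that variant is not confirmed here. A NumPy sketch of the dense computation, with `dense_softmax_xent` as a hypothetical name:

```python
import numpy as np

def dense_softmax_xent(logits, one_hot_labels):
    """Per-example cross entropy where labels have the same shape as logits."""
    logits = np.asarray(logits, dtype=np.float64)
    targets = np.asarray(one_hot_labels, dtype=np.float64)
    if targets.shape != logits.shape:
        raise ValueError("labels and logits must have the same shape")
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Weight each class log-probability by its target and sum over classes.
    return -(targets * log_probs).sum(axis=1)

logits = np.array([[2.0, 0.0], [0.5, 0.5]])
one_hot_labels = np.array([[1.0, 0.0], [0.0, 1.0]], dtype=np.float32)
losses = dense_softmax_xent(logits, one_hot_labels)
```

With one-hot labels, the accuracy line would presumably also need an argmax on the label side, since `tf.argmax(y, 1)` yields one class id per example.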
    scoronepion (OP) · #6 · 2018-04-08 13:52:50 +08:00
    @epleone OK, thanks a lot, I'll take another look. I'm using tl because I happened to be learning it recently and wanted to try it out.
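
One way to reconcile the original `[None]` int64 placeholder with the data is to feed integer class ids rather than one-hot rows, either by skipping the `tf.keras.utils.to_categorical` calls in the full listing or by undoing them with an argmax. This is a suggestion inferred from the two error messages, not something confirmed in the thread. A small sketch of the conversion, using made-up label values:

```python
import numpy as np

# Made-up one-hot labels for four examples (classes 0, 1, 1, 0).
y_one_hot = np.array([[1, 0], [0, 1], [0, 1], [1, 0]], dtype=np.float32)

# argmax over the class axis recovers one integer class id per example.
y_sparse = np.argmax(y_one_hot, axis=1).astype(np.int64)

print(y_sparse)        # [0 1 1 0]
print(y_sparse.shape)  # (4,)
```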