Beginner question: how do I combine two column vectors into an n*2 matrix?

2018-06-08 06:12:33 +08:00
 acone2003
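I have recently been learning the random forest classification algorithm and ran into a problem. I tried various methods found online, but none of them gave the correct result, so I am asking for help here.
The situation: test_lables holds the true labels of the test samples for a binary classification task (692 samples), and test_hat holds the predicted values. I want to combine the two into a 692*2 matrix in which each prediction is paired with its true label. The source code is as follows: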

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import SVC
#from sklearn import datasets

dataframe = pd.read_csv( "D:/Research/TuPo_sel0.Train.csv", header = None )
train_features = dataframe.iloc[ :, 0:24]
train_lables = dataframe.iloc[:, 24]

test_data = pd.read_csv( "D:/Research/TuPo_sel0.Valid.csv", header = None )
test_features = test_data.iloc[ :, 0:24 ]
test_lables = test_data.iloc[ :, 24 ]

dummy = DummyClassifier( strategy = 'uniform', random_state = 1 )
dummy.fit( train_features, train_lables )
print( "dummy_score =", dummy.score( test_features, test_lables ) )

style = 1

if style == 1:
    max_features = 19
    n_estimators = 400
    randomforest = RandomForestClassifier( max_features = max_features, n_estimators = n_estimators, random_state=1, n_jobs=-1 )
    model = randomforest.fit( train_features, train_lables )
    test_hat = model.predict( test_features )
    test_hat1 = np.hstack( ( test_hat, test_lables ) )
    test_hat1.reshape( -1, 2 )
    print( test_hat1.shape )
    print( test_hat1 )
    print( "max_features =", max_features, "; n_estimators =", n_estimators,
           "; randomforest_score =", randomforest.score( test_features, test_lables ) )

The output is:
runfile('D:/Python Programs/TryLoadData.py', wdir='D:/Python Programs')
dummy_score = 0.5447976878612717
(1384,)
[0 0 1 ... 0 0 0]
max_features = 19 ; n_estimators = 400 ; randomforest_score = 0.6416184971098265

How should I change the code to get the correct result?
Node: Python
4 replies
acone2003
2018-06-08 06:35:55 +08:00
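One more question while I am here: how do I compute the precision on the test set, i.e. the fraction of samples predicted as 1 that are actually 1?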
enenaaa
2018-06-08 09:02:17 +08:00
test_hat1 = np.hstack((test_hat.reshape(-1, 1), test_lables.reshape(-1, 1)))

To review the results you can look at the summary report, metrics.classification_report
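
For the follow-up question about precision, here is a minimal sketch, assuming scikit-learn is installed. The arrays y_true and y_pred are made-up stand-ins for test_lables and test_hat, not data from the thread. It computes the precision for class 1 with precision_score, checks it against a manual calculation, and prints the classification_report mentioned above.

```python
import numpy as np
from sklearn.metrics import classification_report, precision_score

# Hypothetical stand-ins for test_lables (truth) and test_hat (predictions).
y_true = np.array([0, 1, 0, 0, 1, 1])
y_pred = np.array([0, 1, 1, 0, 1, 0])

# Precision for class 1: of all samples predicted as 1, the share that truly are 1.
precision_1 = precision_score(y_true, y_pred, pos_label=1)

# The same number by hand: select the predicted positives, then average the hits.
predicted_positive = y_pred == 1
manual = (y_true[predicted_positive] == 1).mean()

# Per-class precision, recall and F1 in one text table.
report = classification_report(y_true, y_pred)

print("precision for class 1 =", precision_1)
print(report)
```

In the thread's script the equivalent call would presumably be precision_score( test_lables, test_hat ), with the true labels as the first argument.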
acone2003
2018-06-08 09:18:39 +08:00
Thanks enenaaa, solved!
necomancer
2018-06-08 09:51:42 +08:00
np.vstack([a, b]).T
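
A short sketch of why the original code printed (1384,) and how the suggestions in the replies pair the values up. The small arrays are made-up stand-ins for test_hat (a NumPy array returned by predict) and test_lables (a pandas Series). The Series is converted with to_numpy() first, on the assumption that a recent pandas is in use, where, as far as I know, Series.reshape is no longer available.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the thread's variables.
test_hat = np.array([0, 1, 1, 0])
test_lables = pd.Series([0, 1, 0, 0])

# hstack on two 1-D inputs joins them end to end, giving one longer 1-D array.
flat = np.hstack((test_hat, test_lables))
print(flat.shape)

# reshape returns a new array; without assignment, flat itself stays 1-D.
# Even when assigned, it pairs neighbouring elements of flat,
# not prediction i with label i.
wrong = flat.reshape(-1, 2)

# column_stack treats each 1-D input as one column of the result.
labels = test_lables.to_numpy()
pairs = np.column_stack((test_hat, labels))

# Stacking as rows and transposing gives the same n*2 layout.
pairs_t = np.vstack([test_hat, labels]).T

print(pairs.shape)
print(pairs)
```

Each row of pairs holds one prediction followed by its true label, which is the layout asked for in the question.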

https://www.v2ex.com/t/461370