If I use a single machine as the Hadoop master and list only localhost in the slaves file, does that count as deploying the NameNode and the DataNode on the same machine?

wander2008

2016-06-30 21:02:14 +08:00

No.

AntonChen

2016-06-30 21:07:53 +08:00

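It works with CDH. With Apache Hadoop it should work in theory; I haven't tested it yet, but I've been meaning to verify it. Put simply, you start the corresponding service on each node by hand, master first and then the slaves, instead of using Apache Hadoop's own batch remote-start scripts.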

AntonChen

2016-06-30 21:09:49 +08:00

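To add: a single machine can run both the master and the slave daemons. There is probably a configuration error somewhere; you will need to trace through the logs to find it.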

wdg8106

2016-07-01 00:01:22 +08:00

Thanks a lot. I only started setting this up today; I'll go through the logs more carefully.

Jackliu91

2016-07-01 00:33:01 +08:00

Search for "pseudo-distributed installation".

wdg8106

2016-07-01 09:41:51 +08:00

@Jackliu91 The pseudo-distributed installation already works. I wanted to try a real cluster, but I only have one machine.

wdg8106

2016-07-01 09:44:25 +08:00

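@AntonChen
@Jackliu91
Speaking of pseudo-distributed mode, all I really want is to get familiar with Hive. Can I run Hive on a pseudo-distributed setup? When the cluster doesn't come up, a Hive operation usually just hangs.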

Jackliu91

2016-07-01 10:07:09 +08:00

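@wdg8106 First, Hive works fine in pseudo-distributed mode; the hang you describe sounds like something was not installed correctly. Second, if you want a fully distributed setup on one machine, you can run three virtual machines, but that is still only a toy, and for learning purposes it gives you about the same as pseudo-distributed mode.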

wdg8106

2016-07-01 12:06:14 +08:00

@Jackliu91 Right, I just want to get a feel for how a cluster is configured.

wdg8106

2016-07-01 12:18:21 +08:00

Why does Hive write NULL values into the table when it loads a file?
File stu.txt:
1 xiaopi
2 xiaoxue
3 qingqing

Statement executed:
load data local inpath '/usr/local/hadoop/examples/stu.txt' overwrite into table stu;

Querying table stu:
hive> select id ,name from stu;
OK
NULL NULL
NULL NULL
NULL NULL
Time taken: 0.087 seconds, Fetched: 3 row(s)

Hadoop is running in pseudo-distributed mode. How can I fix this?
@AntonChen @Jackliu91 @wander2008

Jackliu91

2016-07-01 13:10:40 +08:00

@wdg8106 What is your CREATE TABLE statement?

wdg8106

2016-07-01 13:55:41 +08:00

@Jackliu91
create table if not exists hive.stu(id int,name string) row format delimited fields terminated by '\t';

Jackliu91

2016-07-01 15:01:39 +08:00

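@wdg8106 This is probably what is happening: in your file, "1 xiaopi" is separated by a space, not a tab, so the whole string "1 xiaopi" becomes the first field and the second field is NULL. Your first column is an int, and converting "1 xiaopi" to int fails, so the first column ends up NULL as well.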
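The explanation above can be illustrated with a rough Python sketch. This is not Hive's actual parsing code; `parse_row` is a hypothetical helper that only imitates the behaviour described here for a table declared as (id int, name string) with a single-character field delimiter.

```python
def parse_row(line, delimiter="\t"):
    """Imitate loading one text line into the columns (id int, name string).

    Hypothetical helper for illustration only; it is not part of Hive.
    None stands in for NULL.
    """
    fields = line.rstrip("\n").split(delimiter)

    # A column with no corresponding field in the line becomes NULL.
    raw_id = fields[0] if len(fields) > 0 else None
    name = fields[1] if len(fields) > 1 else None

    # A value that cannot be converted to int also becomes NULL.
    try:
        row_id = int(raw_id)
    except (TypeError, ValueError):
        row_id = None

    return row_id, name


if __name__ == "__main__":
    # Space-separated line read with a tab delimiter: both columns are NULL.
    print(parse_row("1 xiaopi"))        # (None, None)
    # Tab-separated line read with a tab delimiter: both columns are filled.
    print(parse_row("1\txiaopi"))       # (1, 'xiaopi')
```

Under this model there are two ways out: rewrite stu.txt so the fields are separated by tabs, or declare the table with a space as the field terminator.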

wdg8106

2016-07-01 15:45:54 +08:00

@Jackliu91
Thank you so much, that was exactly the problem. O(∩_∩)O haha~

wdg8106

2016-07-05 21:12:19 +08:00

@Jackliu91
I've run into another problem, this time connecting to Hive over Thrift, and would like to ask about it.

First I run hive --service hiveserver2 & to start HiveServer,

then I run a script that connects to Hive and executes a statement:
#!/usr/bin/env python
# test.py

import sys
from hive_service import ThriftHive
from hive_service.ttypes import HiveServerException
from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol

def hiveExe(sql):
    try:
        transport = TSocket.TSocket('127.0.0.1',10000)
        transport = TTransport.TBufferedTransport(transport)
        protocol = TBinaryProtocol.TBinaryProtocol(transport)
        client = ThriftHive.Client(protocol)
        transport.open()
        client.execute(sql)

        print "The return value is:"
        print client.fetchAll()
        print "................."
        transport.close()
    except Thrift.TException,tx:
        print "%s" % (tx.message)

if __name__ == '__main__':
    print "hello"
    hiveExe("show tables")

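It only runs a simple statement that lists all tables, but the program gets stuck at the client.execute(sql) line and never moves on.

How can I solve this?

netstat shows that HiveServer started successfully and is listening on port 10000.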
$ netstat -nl |grep 10000
tcp 0 0 0.0.0.0:10000 0.0.0.0:* LISTEN

Hive queries also run normally from the CLI.
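For reference, the netstat check above can be approximated from Python with a plain TCP connection attempt. This is only a sketch: `port_accepts_connections` is a hypothetical helper name, and the host and port are the ones used in the script above.

```python
import socket


def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration; it checks reachability only.
    """
    try:
        # create_connection raises OSError on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Same address the test script connects to.
    print(port_accepts_connections("127.0.0.1", 10000))
```

A successful connection only shows that something is listening on the port. It does not show that the client and the server speak the same Thrift protocol or use the same authentication settings.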