求助, sqoop 导出 hive 表到 mysql 出错!

2018-03-16 15:48:54 +08:00
 gregocean
为了调试,mysql 表结构已经简化了,没有任何约束:
mysql> desc tb_stu_answer;
+-----------+----------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+----------+------+-----+---------+-------+
| sa_id | tinytext | YES | | NULL | |
| sa_stu_id | tinytext | YES | | NULL | |
+-----------+----------+------+-----+---------+-------+
sqoop 脚本如下:
sqoop export --connect jdbc:mysql://xxx:3306/db_name --username xxx --P --table tb_stu_answer --export-dir 'hdfs://aly-bigdata-hadoop-xxx/user/hive/warehouse/xx.db/tb_stu_answer_test' --input-fields-terminated-by '\001' --columns "sa_id, sa_stu_id" --input-null-string 'null' --input-null-non-string 'null'

连接到 mysql 权限也没问题,账号可写,如果--table 参数 表名打错会提示没有这个表。

这个 tb_stu_answer_test 是我用 hql create table tb_stu_answer_test as select xx from another_table limit 2; 生成的子集。

错误信息如下(没啥信息量的 log ):

18/03/16 15:23:23 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
18/03/16 15:23:23 INFO tool.CodeGenTool: Beginning code generation
Fri Mar 16 15:23:24 CST 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
18/03/16 15:23:24 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tb_stu_answer` AS t LIMIT 1
18/03/16 15:23:24 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tb_stu_answer` AS t LIMIT 1
18/03/16 15:23:24 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
Note: /tmp/sqoop-/compile/8e6c405c3c24bd055881ab1a4ce27300/tb_stu_answer.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/03/16 15:23:25 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-/compile/8e6c405c3c24bd055881ab1a4ce27300/tb_stu_answer.jar
18/03/16 15:23:25 INFO mapreduce.ExportJobBase: Beginning export of tb_stu_answer
18/03/16 15:23:26 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
18/03/16 15:23:26 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
18/03/16 15:23:26 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
18/03/16 15:23:26 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
18/03/16 15:23:27 INFO client.RMProxy: Connecting to ResourceManager at aly-bigdata-hadoop-yarnmanager...com/10.255.25.38:8032
18/03/16 15:23:27 INFO hdfs.DFSClient: Created token for hive: HDFS_DELEGATION_TOKEN owner=hive@.COM, renewer=yarn, realUser=, issueDate=1521185007407, maxDate=1521789807407, sequenceNumber=64697, masterKeyId=75 on 10.255.25.36:8020
18/03/16 15:23:27 INFO security.TokenCache: Got dt for hdfs://aly-bigdata-hadoop-namenodemaster...com:8020; Kind: HDFS_DELEGATION_TOKEN, Service: 10.255.25.36:8020, Ident: (token for hive: HDFS_DELEGATION_TOKEN owner=hive@.COM, renewer=yarn, realUser=, issueDate=1521185007407, maxDate=1521789807407, sequenceNumber=64697, masterKeyId=75)
18/03/16 15:23:28 INFO input.FileInputFormat: Total input paths to process : 1
18/03/16 15:23:28 INFO input.FileInputFormat: Total input paths to process : 1
18/03/16 15:23:28 INFO mapreduce.JobSubmitter: number of splits:4
18/03/16 15:23:28 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1520934182986_5322
18/03/16 15:23:28 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: 10.255.25.36:8020, Ident: (token for hive: HDFS_DELEGATION_TOKEN owner=hive@.COM, renewer=yarn, realUser=, issueDate=1521185007407, maxDate=1521789807407, sequenceNumber=64697, masterKeyId=75)
18/03/16 15:23:28 INFO impl.YarnClientImpl: Submitted application application_1520934182986_5322
18/03/16 15:23:28 INFO mapreduce.Job: The url to track the job: http://aly-bigdata-hadoop-yarnmanager...com:8088/proxy/application_1520934182986_5322/
18/03/16 15:23:28 INFO mapreduce.Job: Running job: job_1520934182986_5322
18/03/16 15:23:35 INFO mapreduce.Job: Job job_1520934182986_5322 running in uber mode : false
18/03/16 15:23:35 INFO mapreduce.Job: map 0% reduce 0%
18/03/16 15:23:40 INFO mapreduce.Job: map 100% reduce 0%
18/03/16 15:23:40 INFO mapreduce.Job: Job job_1520934182986_5322 failed with state FAILED due to: Task failed task_1520934182986_5322_m_000001
Job failed as tasks failed. failedMaps:1 failedReduces:0

18/03/16 15:23:40 INFO mapreduce.Job: Counters: 8
Job Counters
Failed map tasks=4
Launched map tasks=4
Rack-local map tasks=4
Total time spent by all maps in occupied slots (ms)=9805
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=9805
Total vcore-milliseconds taken by all map tasks=9805
Total megabyte-milliseconds taken by all map tasks=10040320
18/03/16 15:23:40 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
18/03/16 15:23:40 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 14.0035 seconds (0 bytes/sec)
18/03/16 15:23:40 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
18/03/16 15:23:40 INFO mapreduce.ExportJobBase: Exported 0 records.
18/03/16 15:23:40 ERROR tool.ExportTool: Error during export:
Export job failed!
at org.apache.sqoop.mapreduce.ExportJobBase.runExport(ExportJobBase.java:439)
at org.apache.sqoop.manager.SqlManager.exportTable(SqlManager.java:931)
at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:80)
at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:99)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
at org.apache.sqoop.Sqoop.main(Sqoop.java:252)

有人遇到过类似问题有经验的么,文档看了没啥帮助,google,Stack Overflow 似乎遇到类似问题的人不多。
9244 次点击
所在节点    Hadoop
3 条回复
anthow
2018-03-16 15:58:51 +08:00
语句没毛病。。看下 tb_stu_answer_test 的表信息,是不是分隔符的问题
liprais
2018-03-16 16:31:37 +08:00
你知道 sqoop 到底干了啥了么
去 yarn resourcemanager 上看 Failed map tasks 的日志啊
Task failed task_1520934182986_5322_m_000001 这不是写着这个 task 挂掉了么
gregocean
2018-03-16 17:02:28 +08:00
@liprais 嗯,看了详细的日志,还是数据库访问出问题了,No route to host。谢谢~

这是一个专为移动设备优化的页面(即为了让你能够在 Google 搜索结果里秒开这个页面),如果你希望参与 V2EX 社区的讨论,你可以继续到 V2EX 上打开本讨论主题的完整版本。

https://www.v2ex.com/t/438675

V2EX 是创意工作者们的社区,是一个分享自己正在做的有趣事物、交流想法,可以遇见新朋友甚至新机会的地方。

V2EX is a community of developers, designers and creative people.

© 2021 V2EX