Unable to run distcp from s3 to hdfs using a shell action in Oozie
I am trying to copy data from s3 to hdfs using distcp. Below is my shell script, which runs the distcp.
mkdir.sh
hadoop distcp s3n://bucket-name/foldername hdfs://localhost:8020/user/hdfs/data/
The shell script above works fine when I run it manually.
But when I run the same script from an Oozie workflow, the distcp fails.
I am running the workflow with a shell action.
Here is my job.properties file:
nameNode=hdfs://ip-172-31-34-170.us-west-2.compute.internal:8020
jobTracker=ip-172-31-34-195.us-west-2.compute.internal:8032
queueName=default
oozie.libpath=${nameNode}/user/oozie/share/lib
user.name=hdfs
oozie.wf.application.path=${nameNode}/user/${user.name}/oozie/
mkdirshellscript=${oozie.wf.application.path}/mkdir.sh
And my workflow.xml looks like this:
<workflow-app name="WorkFlowForShellAction" xmlns="uri:oozie:workflow:0.1">
    <start to="shellAction"/>
    <action name="shellAction">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="/user/hdfs/hari123"/>
                <mkdir path="/user/hdfs/hari123"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>${mkdirshellscript}</exec>
            <file>${mkdirshellscript}</file>
        </shell>
        <ok to="end"/>
        <error to="killAction"/>
    </action>
    <kill name="killAction">
        <message>"Killed job due to error"</message>
    </kill>
    <end name="end"/>
</workflow-app>
The log looks like this:
2014-09-30 10:31:51,102 INFO org.apache.oozie.servlet.CallbackServlet: SERVER[ec2-54-69-26-119.us-west-2.compute.amazonaws.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000018-140930055823135-oozie-oozi-W] ACTION[0000018-140930055823135-oozie-oozi-W@shellAction] callback for action [0000018-140930055823135-oozie-oozi-W@shellAction]
2014-09-30 10:31:51,337 INFO org.apache.oozie.command.wf.ActionEndXCommand: SERVER[ec2-54-69-26-119.us-west-2.compute.amazonaws.com] USER[hdfs] GROUP[-] TOKEN[] APP[WorkFlowForShellActionWithCaptureOutput] JOB[0000018-140930055823135-oozie-oozi-W] ACTION[0000018-140930055823135-oozie-oozi-W@shellAction] ERROR is considered as FAILED for SLA
I want to run the distcp with a shell action, not with the distcp action in Oozie.
1 Answer
Try this:
<workflow-app name="WorkFlowForShellAction" xmlns="uri:oozie:workflow:0.1">
    ...
    <start to="shellAction"/>
    <action name="shellAction">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="/user/hdfs/hari123"/>
                <mkdir path="/user/hdfs/hari123"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <!-- <file> takes path#symlink: the script is linked into the action's working directory as mkdir.sh, and <exec> refers to that link name -->
            <exec>mkdir.sh</exec>
            <file>${mkdirshellscript}#mkdir.sh</file>
        </shell>
        ...
</workflow-app>
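If the action still fails, note that the Oozie server log in the question only records that the action failed; the distcp error itself should be in the stdout/stderr of the launcher job that ran the script. Below is a minimal sketch of a more verbose mkdir.sh, assuming bash is available on the worker nodes. The function name `run_distcp` and the variables `SRC` and `DST` are hypothetical names introduced here, with defaults taken from the paths in the question.

```shell
#!/usr/bin/env bash
# Sketch only: a more diagnosable variant of mkdir.sh for an Oozie shell action.
# SRC and DST are hypothetical variables; the defaults are the paths from the question.
SRC="${SRC:-s3n://bucket-name/foldername}"
DST="${DST:-hdfs://localhost:8020/user/hdfs/data/}"

run_distcp() {
  local rc=0
  # Print who and where the script runs; under Oozie this may differ
  # from the user and host of a manual run.
  echo "running as: $(id -un 2>/dev/null || echo unknown) on $(uname -n)"
  echo "copying ${SRC} -> ${DST}"

  # Fail early with a clear message if the hadoop command is not on PATH.
  if ! command -v hadoop >/dev/null 2>&1; then
    echo "hadoop not found on PATH: ${PATH}"
    return 127
  fi

  # Merge stderr into stdout so distcp errors end up in the same task log.
  hadoop distcp "${SRC}" "${DST}" 2>&1 || rc=$?
  echo "distcp exit code: ${rc}"
  return "${rc}"
}
```

To use it, end the script with `run_distcp || exit 1` so that a failed copy also fails the action. The output should then appear in the launcher job's logs, which can usually be fetched with `yarn logs -applicationId <application-id>`.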