Hadoop wordcount example fails because of the AM container
I have been trying to run the Hadoop wordcount example for a while, but I keep running into problems. I am running Hadoop 2.7.1 on Windows. The error details are below:
Command:
yarn jar C:\hadoop-2.7.1\share\hadoop\mapreduce\hadoop-mapreduce-examples-2.7.1.jar wordcount input output
Output:
INFO input.FileInputFormat: Total input paths to process : 1
INFO mapreduce.JobSubmitter: number of splits:1
INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1490853163147_0009
INFO impl.YarnClientImpl: Submitted application application_1490853163147_0009
INFO mapreduce.Job: The url to track the job: http://**********/proxy/application_1490853163147_0009/
INFO mapreduce.Job: Running job: job_1490853163147_0009
INFO mapreduce.Job: Job job_1490853163147_0009 running in uber mode : false
INFO mapreduce.Job: map 0% reduce 0%
INFO mapreduce.Job: Job job_1490853163147_0009 failed with state FAILED due to: Application application_1490853163147_0009 failed 2 times due to AM Container for appattempt_1490853163147_0009_000002 exited with exitCode: 1639
For more detailed output, check application tracking page:http://********:****/cluster/app/application_1490853163147_0009Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1490853163147_0009_02_000001
Exit code: 1639
Exception message: Incorrect command line arguments.
Stack trace: ExitCodeException exitCode=1639: Incorrect command line arguments.
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
    at org.apache.hadoop.util.Shell.run(Shell.java:456)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Shell output: Usage: task create [TASKNAME] [COMMAND_LINE] |
task isAlive [TASKNAME] |
task kill [TASKNAME]
task processList [TASKNAME]
Creates a new task jobobject with taskname
Checks if task jobobject is alive
Kills task jobobject
Prints to stdout a list of processes in the task
along with their resource usage. One process per line
and comma separated info per process
ProcessId,VirtualMemoryCommitted(bytes),
WorkingSetSize(bytes),CpuTime(Millisec,Kernel+User)
Container exited with a non-zero exit code 1639
Failing this attempt. Failing the application.
INFO mapreduce.Job: Counters: 0
yarn-site.xml:
<configuration>
<property>
<name>yarn.application.classpath</name>
<value>
C:\hadoop-2.7.1\etc\hadoop,
C:\hadoop-2.7.1\share\hadoop\common\*,
C:\hadoop-2.7.1\share\hadoop\common\lib\*,
C:\hadoop-2.7.1\share\hadoop\hdfs\*,
C:\hadoop-2.7.1\share\hadoop\hdfs\lib\*,
C:\hadoop-2.7.1\share\hadoop\mapreduce\*,
C:\hadoop-2.7.1\share\hadoop\mapreduce\lib\*,
C:\hadoop-2.7.1\share\hadoop\yarn\*,
C:\hadoop-2.7.1\share\hadoop\yarn\lib\*
</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
<value>98.5</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2200</value>
<description>Amount of physical memory, in MB, that can be allocated for containers.</description>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>500</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<description>Where to aggregate logs to.</description>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/tmp/logs</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>259200</value>
</property>
<property>
<name>yarn.log-aggregation.retain-check-interval-seconds</name>
<value>3600</value>
</property>
</configuration>
mapred.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Any idea what is going wrong?
2 Answers
exitCode: 1639 It looks like you are running Hadoop on Windows.
I ran into exactly the same problem. I followed an installation guide written for Hadoop 2.6.0 (http://www.ics.uci.edu/~shantas/Install_Hadoop-2.6.0_on_Windows10.pdf) while actually installing Hadoop 2.8.0. Once I had finished, I ran:
hadoop jar D:\hadoop-2.8.0\share\hadoop\mapreduce\hadoop-mapreduce-examples-2.8.0.jar wordcount /foo/bar/LICENSE.txt /out1
And got (from the YARN NodeManager logs):
17/06/19 13:15:30 INFO monitor.ContainersMonitorImpl: Starting resource-monitoring for container_1497902417767_0004_01_000001
17/06/19 13:15:30 INFO nodemanager.DefaultContainerExecutor: launchContainer: [D:\hadoop-2.8.0\bin\winutils.exe, task, create, -m, -1, -c, -1, container_1497902417767_0004_01_000001, cmd /c D:/hadoop/temp/nm-localdir/usercache/******/appcache/application_1497902417767_0004/container_1497902417767_0004_01_000001/default_container_executor.cmd]
17/06/19 13:15:30 WARN nodemanager.DefaultContainerExecutor: Exit code from container container_1497902417767_0004_01_000001 is : 1639
17/06/19 13:15:30 WARN nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_1497902417767_0004_01_000001 and exit code: 1639
ExitCodeException exitCode=1639: Incorrect command line arguments.
TaskExit: error (1639): Invalid command line argument. Consult the Windows Installer SDK for detailed command line help.
Another symptom (from the YARN manager logs):
17/06/19 13:25:49 WARN util.SysInfoWindows: Expected split length of sysInfo to be 11. Got 7
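The field-count warning above can be reproduced outside Hadoop with a short script. This is only a rough sketch: it assumes the warning comes from Hadoop splitting the first line of the `winutils.exe systeminfo` output on commas, and the function name `count_sysinfo_fields` and the command-line handling are made up for illustration.

```python
import subprocess
import sys

# Field count quoted in the SysInfoWindows warning above.
EXPECTED_FIELDS = 11


def count_sysinfo_fields(output: str) -> int:
    """Count the comma-separated fields on the first line of the output."""
    lines = output.splitlines()
    if not lines or not lines[0]:
        return 0
    return len(lines[0].split(","))


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: python check_sysinfo.py <path to winutils.exe>")
    # Assumption: this winutils.exe supports the "systeminfo" subcommand.
    result = subprocess.run(
        [sys.argv[1], "systeminfo"], capture_output=True, text=True
    )
    found = count_sysinfo_fields(result.stdout)
    status = "OK" if found == EXPECTED_FIELDS else "MISMATCH"
    print(f"{status}: expected {EXPECTED_FIELDS} fields, got {found}")
```

If the script reports a mismatch, the binary was probably built for a different Hadoop release than the one that is installed.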
The solution was to get binaries compatible with Hadoop 2.8.0: https://github.com/steveloughran/winutils/tree/master/hadoop-2.8.0-RC3/bin
Once I had the correct winutils.exe, my problem went away.
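After swapping the binaries, it may be worth confirming which winutils.exe is actually being used. The sketch below assumes that Hadoop takes it from the `bin` directory under `HADOOP_HOME`, which matches the `D:\hadoop-2.8.0\bin\winutils.exe` path in the launchContainer log line above. The helper name `winutils_path` is hypothetical.

```python
import os
import sys
from pathlib import Path, PureWindowsPath


def winutils_path(hadoop_home: str) -> PureWindowsPath:
    """Build the assumed location of winutils.exe under a Hadoop install."""
    return PureWindowsPath(hadoop_home) / "bin" / "winutils.exe"


if __name__ == "__main__":
    # Assumption: HADOOP_HOME points at the installation that YARN runs from.
    home = os.environ.get("HADOOP_HOME")
    if not home:
        sys.exit("HADOOP_HOME is not set")
    candidate = winutils_path(home)
    state = "exists" if Path(str(candidate)).is_file() else "is missing"
    print(f"{candidate} {state}")
```

If the file exists but the job still fails with exit code 1639, the binary most likely comes from a different Hadoop release, as it did in my case.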