Hadoop MapReduce job container throws java.io.FileNotFoundException for path /tmp/crunch-1324412807/p2/MAP

When I submit an Oozie job to an EMR Hadoop cluster, I see the error below. It looks like a particular container cannot find a temporary output file that was created by another job (?).

I checked the name node address and also made sure the node has enough storage space (an additional EBS volume is attached). The master instance type is - and the core instance type is - .

This is the error I see in the Application Master logs.

      Application application_1632399471753_0051 failed 2 times due to AM Container for appattempt_1632399471753_0051_000002 exited with exitCode: -1000
Failing this attempt.Diagnostics: java.io.FileNotFoundException: File does not exist: hdfs://ip-10-23-31-119.us-west-2.compute.internal:8020/tmp/crunch-1324412807/p2/MAP
For more detailed output, check the application tracking page: http://ip-10-23-31-119.us-west-2.compute.internal:8088/cluster/app/application_1632399471753_0051 Then click on links to logs of each attempt.
. Failing the application.
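
To compare the NameNode address in these diagnostics against the one the cluster is configured with, the URI can be split into host, port, and path. This is only a minimal sketch using Python's standard library; the helper name `split_hdfs_uri` is hypothetical, and the URI is copied from the diagnostics above.

```python
from urllib.parse import urlparse


def split_hdfs_uri(uri):
    """Return (host, port, path) for an hdfs:// URI."""
    parsed = urlparse(uri)
    if parsed.scheme != "hdfs":
        raise ValueError(f"not an hdfs URI: {uri}")
    return parsed.hostname, parsed.port, parsed.path


# URI taken verbatim from the AM diagnostics message.
missing = (
    "hdfs://ip-10-23-31-119.us-west-2.compute.internal:8020"
    "/tmp/crunch-1324412807/p2/MAP"
)
host, port, path = split_hdfs_uri(missing)
print(host, port, path)
```

The host and port can then be checked by eye against the cluster's `fs.defaultFS` setting; the path is what actually has to exist on HDFS.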

This is the error I see when I open one of the task logs.

      2021-09-23 13:19:43,280 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1632399471753_0051Job Transitioned from INITED to SETUP
2021-09-23 13:19:43,282 INFO [CommitterEvent Processor #0] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_SETUP
2021-09-23 13:19:43,294 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1632399471753_0051Job Transitioned from SETUP to RUNNING
2021-09-23 13:19:43,309 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1632399471753_0051_m_000000 Task Transitioned from NEW to SCHEDULED
2021-09-23 13:19:43,311 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1632399471753_0051_m_000000_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2021-09-23 13:19:43,311 INFO [Thread-85] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: mapResourceRequest:<memory:7168, vCores:1>
2021-09-23 13:19:43,380 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer setup for JobId: job_1632399471753_0051, File: hdfs://ip-10-23-31-119.us-west-2.compute.internal:8020/tmp/hadoop-yarn/staging/hadoop/.staging/job_1632399471753_0051/job_1632399471753_0051_1.jhist
2021-09-23 13:19:44,270 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:1 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 HostLocal:0 RackLocal:0
2021-09-23 13:19:44,298 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1632399471753_0051: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:14336, vCores:10> knownNMs=2
2021-09-23 13:19:45,306 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated containers 1
2021-09-23 13:19:45,308 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1632399471753_0051_01_000002 to attempt_1632399471753_0051_m_000000_0
2021-09-23 13:19:45,309 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:0 RackLocal:0
2021-09-23 13:19:45,362 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-jar file on the remote FS is hdfs://ip-10-23-31-119.us-west-2.compute.internal:8020/tmp/hadoop-yarn/staging/hadoop/.staging/job_1632399471753_0051/job.jar
2021-09-23 13:19:45,364 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-conf file on the remote FS is /tmp/hadoop-yarn/staging/hadoop/.staging/job_1632399471753_0051/job.xml
2021-09-23 13:19:45,367 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.FileNotFoundException: File does not exist: hdfs://ip-10-23-31-119.us-west-2.compute.internal:8020/tmp/crunch-1324412807/p2/MAP
        at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:902)
        at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createContainerLaunchContext(TaskAttemptImpl.java:947)
        at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1714)
        at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1691)
        at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
        at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
        at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
        at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
        at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1210)
        at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:147)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1459)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1451)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://ip-10-23-31-119.us-west-2.compute.internal:8020/tmp/crunch-1324412807/p2/MAP
        at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1444)
        at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1437)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1452)
        at org.apache.hadoop.fs.FileSystem.resolvePath(FileSystem.java:774)
        at org.apache.hadoop.mapreduce.v2.util.MRApps.parseDistributedCacheArtifacts(MRApps.java:601)
        at org.apache.hadoop.mapreduce.v2.util.MRApps.setupDistributedCache(MRApps.java:491)
        at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:821)
        ... 14 more
2021-09-23 13:19:45,369 INFO [AsyncDispatcher ShutDown handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..
End of LogType:syslog

I don't know what else to check to troubleshoot this problem. I don't see any logs printed from my application code.
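
One thing the trace itself shows: the `Caused by` frames pass through `MRApps.setupDistributedCache` and `MRApps.parseDistributedCacheArtifacts`, so the Application Master seems to fail while resolving a distributed-cache entry, before any task code starts. That would also explain why nothing from the application is logged. Below is a rough sketch for scanning a saved syslog for this pattern. It is plain Python, the function name `summarize_failure` is hypothetical, and the sample text is just two lines copied from the log above.

```python
import re

# Matches the HDFS path reported as missing in the Hadoop exception message.
MISSING_RE = re.compile(r"FileNotFoundException: File does not exist: (\S+)")


def summarize_failure(log_text):
    """Return (missing_paths, in_cache_setup) for a syslog dump.

    missing_paths  - sorted unique paths reported as "File does not exist"
    in_cache_setup - True if MRApps.setupDistributedCache is in the log,
                     i.e. the failure happened during distributed-cache setup
    """
    missing = sorted(set(MISSING_RE.findall(log_text)))
    in_cache_setup = "MRApps.setupDistributedCache" in log_text
    return missing, in_cache_setup


# Two lines copied from the task log in the question.
sample = """\
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.FileNotFoundException: File does not exist: hdfs://ip-10-23-31-119.us-west-2.compute.internal:8020/tmp/crunch-1324412807/p2/MAP
        at org.apache.hadoop.mapreduce.v2.util.MRApps.setupDistributedCache(MRApps.java:491)
"""
missing, in_cache_setup = summarize_failure(sample)
print(missing, in_cache_setup)
```

If the flag comes back true, a reasonable next check (an assumption on my part, not something I have confirmed) is to run `hdfs dfs -ls /tmp/crunch-1324412807/p2` on the cluster to see whether that directory was ever written or was removed before this job launched.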
