Ошибка OutOfMemory при чтении байтов с ребер в пряже
Я делаю алгоритм BFS в пряже, и я делаю пользовательское значение для данных в моей вершине (Vertex Data). Но после того, как я это сделал, что-то пошло не так для процесса чтения краев.
Я прослеживаю ошибку до следующих строк кода:
В ByteArrayEdges переменная
serializedEdgesBytesUsed
получить значение1987015248
и выдает ошибку OutOfMemory при выделении нового массива (насколько я знаю, ограничение Java составляет 64 КБ)@Override public void readFields(DataInput in) throws IOException { serializedEdgesBytesUsed = in.readInt(); if (serializedEdgesBytesUsed > 0) { // Only create a new buffer if the old one isn't big enough if (serializedEdges == null || serializedEdgesBytesUsed > serializedEdges.length) { serializedEdges = new byte[serializedEdgesBytesUsed]; } in.readFully(serializedEdges, 0, serializedEdgesBytesUsed); } edgeCount = in.readInt();
}
Я не уверен, почему это начало происходить, но до использования пользовательских данных вершин, эта проблема не существует.
Полный журнал находится здесь (я тестирую прямо из Eclipse, потому что в псевдораспределенном кластере было гораздо сложнее):
2015-08-20 01:52:21,103 INFO [LocalJobRunner Map Task Executor #0] utils.ProgressableUtils (ProgressableUtils.java:waitFor(315)) - waitFor: Future result not ready yet java.util.concurrent.FutureTask@b2dd686
2015-08-20 01:52:21,103 INFO [LocalJobRunner Map Task Executor #0] utils.ProgressableUtils (ProgressableUtils.java:waitFor(197)) - waitFor: Waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
2015-08-20 01:53:12,527 ERROR [LocalJobRunner Map Task Executor #0] graph.GraphMapper (GraphMapper.java:run(101)) - Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316)
at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409)
at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629)
at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:202)
at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312)
at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185)
... 17 more
Caused by: java.lang.OutOfMemoryError: Java heap space
at org.apache.giraph.edge.ByteArrayEdges.readFields(ByteArrayEdges.java:193)
at org.apache.giraph.utils.WritableUtils.reinitializeVertexFromDataInput(WritableUtils.java:541)
at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:98)
at org.apache.giraph.partition.BasicPartition.addPartitionVertices(BasicPartition.java:99)
at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:115)
at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:466)
at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.flush(NettyWorkerClientRequestProcessor.java:412)
at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:241)
at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
... 4 more
2015-08-20 01:53:12,532 ERROR [LocalJobRunner Map Task Executor #0] worker.BspServiceWorker (BspServiceWorker.java:unregisterHealth(777)) - unregisterHealth: Got failure, unregistering health on /_hadoopBsp/job_local1113753160_0001/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/localhost_0 on superstep -1
2015-08-20 01:53:12,558 INFO [Thread-13] mapred.LocalJobRunner (LocalJobRunner.java:runTasks(456)) - map task executor complete.
2015-08-20 01:53:12,562 WARN [Thread-13] mapred.LocalJobRunner (LocalJobRunner.java:run(560)) - job_local1113753160_0001
java.lang.Exception: java.lang.IllegalStateException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.IllegalStateException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:104)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316)
at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409)
at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629)
at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
... 8 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:202)
at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312)
at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185)
... 17 more
Caused by: java.lang.OutOfMemoryError: Java heap space
at org.apache.giraph.edge.ByteArrayEdges.readFields(ByteArrayEdges.java:193)
at org.apache.giraph.utils.WritableUtils.reinitializeVertexFromDataInput(WritableUtils.java:541)
at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:98)
at org.apache.giraph.partition.BasicPartition.addPartitionVertices(BasicPartition.java:99)
at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:115)
at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:466)
at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.flush(NettyWorkerClientRequestProcessor.java:412)
at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:241)
at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
... 4 more
Строка от терминала, используемая для выполнения этого:
$HADOOP_HOME/bin/yarn jar $GIRAPH_HOME/gaph-examples/target/giraph-examples-1.1.0-for-hadoop-2.4.0-jar-with-dependencies.jar algoritmos.masivos.BusquedaDeCaminosNavegacionalesWikiquotesMasivo lectura_de_grafo.BusquedaDeCaminosNavegacionalesWikiquote -vif pruebas.IdTextWithValueDoubleInputFormat -vip /user/hduser/input/wiki-graph-chiquito.txt -vof pruebas.IdTextWithValueTextOutputFormat -op /user/hduser/output/caminosNavegacionales -w 2 -yh 250
Может быть, я должен использовать EdgeInputFormat
?
Спасибо за прочтение.
1 ответ
Я вижу реальную проблему как недостаток памяти, выделенной для контейнера Maptask, что вызывает ошибку пространства кучи Java.
Чтобы быстро это исправить, вы можете предпочесть расширить контейнер памяти карты пряжи / уменьшить количество узлов, выделив больше памяти в конфигурациях.
Пожалуйста, выделите больше памяти для следующего набора свойств в файле yarn-site.xml.
mapreduce.map.memory.mb
mapreduce.reduce.memory.mb
mapreduce.map.java.opts
mapreduce.reduce.java.opts
[Примечание: свойства *.memory.mb должны быть выше, чем свойства *.java.opts]