Batch ingestion into Druid

I am facing a problem with batch ingestion into Druid. Things start to break right after the log line "org.apache.hadoop.mapred.LocalJobRunner - map task executor complete." The job is able to find the input file.

My ingestion spec (JSON file):

{
    "hadoopCoordinates": "org.apache.hadoop:hadoop-client:2.6.0", 
    "spec": {
        "dataSchema": {
            "dataSource": "apps_searchprivacy", 
            "granularitySpec": {
                "intervals": [
                    "2017-01-23T00:00:00.000Z/2017-01-23T01:00:00.000Z"
                ], 
                "queryGranularity": "HOUR", 
                "segmentGranularity": "HOUR", 
                "type": "uniform"
            }, 
            "metricsSpec": [
                {
                    "name": "count", 
                    "type": "count"
                }, 
                {
                    "fieldName": "event_value", 
                    "name": "event_value", 
                    "type": "longSum"
                }, 
                {
                    "fieldName": "landing_impression", 
                    "name": "landing_impression", 
                    "type": "longSum"
                }, 
                {
                    "fieldName": "user", 
                    "name": "DistinctUsers", 
                    "type": "hyperUnique"
                },
                {
                    "fieldName": "cost", 
                    "name": "cost", 
                    "type": "doubleSum"
                } 
            ], 
            "parser": {
                "parseSpec": {
                    "dimensionsSpec": {
                        "dimensionExclusions": [
                            "landing_page",
                            "skip_url",
                            "ua",
                            "user_id"
                            ], 
                        "dimensions": [
                            "t3",
                            "t2",
                            "t1",
                            "aff_id",
                            "customer",
                            "evt_id",
                            "install_date",
                            "install_week",
                            "install_month",
                            "install_year",
                            "days_since_install",
                            "months_since_install",
                            "weeks_since_install",
                            "success_url",
                            "event",
                            "chrome_version",
                            "value",
                            "event_label",
                            "rand",
                            "type_tag_id",
                            "channel_name",
                            "cid",
                            "log_id",
                            "extension",
                            "os",
                            "device",
                            "browser",
                            "cli_ip",
                            "t4",
                            "t5",
                            "referal_url",
                            "week",
                            "month",
                            "year",
                            "browser_version",
                            "browser_name",
                            "landing_template",
                            "strvalue",
                            "customer_group",
                            "extname",
                            "countrycode",
                            "issp",
                            "spdes",
                            "spsc"
                            ],
                        "spatialDimensions": []
                    }, 
                    "format": "json", 
                    "timestampSpec": {
                        "column": "time_stamp", 
                        "format": "yyyy-MM-dd HH:mm:ss"
                    }
                }, 
                "type": "hadoopyString"
            }
        }, 
        "ioConfig": {
            "inputSpec": {
                "dataGranularity": "hour", 
                "filePattern": ".*\\..*",
                "inputPath": "hdfs://c8-auto-hadoop-service-1.srv.media.net:8020/data/apps_test_output", 
                "pathFormat": "'ts'=yyyyMMddHH", 
                "type": "granularity"
            }, 
            "type": "hadoop"
        }, 
        "tuningConfig": {
            "ignoreInvalidRows": "true",  
            "type": "hadoop", 
            "useCombiner": "false"
        }
    }, 
    "type": "index_hadoop"
}
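As a rough sketch of which directories the "granularity" inputSpec above should resolve to: this assumes Druid appends the start of each hourly bucket in "intervals", formatted with "pathFormat", to "inputPath". The function name hourly_input_dirs is hypothetical, and the strftime pattern is a hand-translated equivalent of the Joda pattern 'ts'=yyyyMMddHH, not something Druid itself uses.

```python
from datetime import datetime, timedelta


def hourly_input_dirs(input_path, interval, strftime_pattern="ts=%Y%m%d%H"):
    """List the per-hour directories an interval is expected to map to.

    Assumption: one directory per hour, named by formatting the bucket
    start time and joining it to input_path.
    """
    iso_format = "%Y-%m-%dT%H:%M:%S.%fZ"
    start_text, end_text = interval.split("/")
    start = datetime.strptime(start_text, iso_format)
    end = datetime.strptime(end_text, iso_format)

    dirs = []
    current = start
    while current < end:
        dirs.append(input_path.rstrip("/") + "/" + current.strftime(strftime_pattern))
        current += timedelta(hours=1)
    return dirs


if __name__ == "__main__":
    for path in hourly_input_dirs(
        "hdfs://c8-auto-hadoop-service-1.srv.media.net:8020/data/apps_test_output",
        "2017-01-23T00:00:00.000Z/2017-01-23T01:00:00.000Z",
    ):
        print(path)
```

Under that assumption, the single one-hour interval in the spec maps to one directory ending in ts=2017012300, which is where the input files would need to be.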

Error during ingestion:

2017-02-03T14:39:50,738 INFO [LocalJobRunner Map Task Executor #0] org.apache.hadoop.mapred.MapTask - (EQUATOR) 0 kvi 26214396(104857584)
2017-02-03T14:39:50,738 INFO [LocalJobRunner Map Task Executor #0] org.apache.hadoop.mapred.MapTask - mapreduce.task.io.sort.mb: 100
2017-02-03T14:39:50,738 INFO [LocalJobRunner Map Task Executor #0] org.apache.hadoop.mapred.MapTask - soft limit at 83886080
2017-02-03T14:39:50,738 INFO [LocalJobRunner Map Task Executor #0] org.apache.hadoop.mapred.MapTask - bufstart = 0; bufvoid = 104857600
2017-02-03T14:39:50,738 INFO [LocalJobRunner Map Task Executor #0] org.apache.hadoop.mapred.MapTask - kvstart = 26214396; length = 6553600
2017-02-03T14:39:50,738 INFO [LocalJobRunner Map Task Executor #0] org.apache.hadoop.mapred.MapTask - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2017-02-03T14:39:50,847 INFO [LocalJobRunner Map Task Executor #0] org.apache.hadoop.mapred.MapTask - Starting flush of map output
2017-02-03T14:39:50,849 INFO [Thread-22] org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
2017-02-03T14:39:50,850 WARN [Thread-22] org.apache.hadoop.mapred.LocalJobRunner - job_local233667772_0001
java.lang.Exception: java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) ~[hadoop-mapreduce-client-common-2.6.0.jar:?]
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) [hadoop-mapreduce-client-common-2.6.0.jar:?]
Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
    at org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy(Native Method) ~[hadoop-common-2.6.0.jar:?]
    at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:63) ~[hadoop-common-2.6.0.jar:?]
    at org.apache.hadoop.io.compress.SnappyCodec.getDecompressorType(SnappyCodec.java:192) ~[hadoop-common-2.6.0.jar:?]
    at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:176) ~[hadoop-common-2.6.0.jar:?]
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:90) ~[hadoop-mapreduce-client-core-2.6.0.jar:?]
    at org.apache.hadoop.mapreduce.lib.input.DelegatingRecordReader.initialize(DelegatingRecordReader.java:84) ~[hadoop-mapreduce-client-core-2.6.0.jar:?]
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:545) ~[hadoop-mapreduce-client-core-2.6.0.jar:?]
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:783) ~[hadoop-mapreduce-client-core-2.6.0.jar:?]
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) ~[hadoop-mapreduce-client-core-2.6.0.jar:?]
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243) ~[hadoop-mapreduce-client-common-2.6.0.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_121]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_121]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_121]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_121]
    at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_121]
2017-02-03T14:39:51,130 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_local233667772_0001 failed with state FAILED due to: NA
2017-02-03T14:39:51,139 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Counters: 0
2017-02-03T14:39:51,143 INFO [task-runner-0-priority-0] io.druid.indexer.JobHelper - Deleting path[var/druid/hadoop-tmp/apps_searchprivacy/2017-02-03T143903.262Z_bb7a812bc0754d4aabcd4bc103ed648a]
2017-02-03T14:39:51,158 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_apps_searchprivacy_2017-02-03T14:39:03.257Z, type=index_hadoop, dataSource=apps_searchprivacy}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
    at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
    at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:204) ~[druid-indexing-service-0.9.2.jar:0.9.2]
    at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:208) ~[druid-indexing-service-0.9.2.jar:0.9.2]
    at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.9.2.jar:0.9.2]
    at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.9.2.jar:0.9.2]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_121]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_121]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_121]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_121]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_121]
    at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:201) ~[druid-indexing-service-0.9.2.jar:0.9.2]
    ... 7 more
Caused by: com.metamx.common.ISE: Job[class io.druid.indexer.IndexGeneratorJob] failed!
    at io.druid.indexer.JobHelper.runJobs(JobHelper.java:369) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
    at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:94) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
    at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:261) ~[druid-indexing-service-0.9.2.jar:0.9.2]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_121]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_121]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_121]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_121]
    at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:201) ~[druid-indexing-service-0.9.2.jar:0.9.2]
    ... 7 more
2017-02-03T14:39:51,165 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_hadoop_apps_searchprivacy_2017-02-03T14:39:03.257Z] status changed to [FAILED].
2017-02-03T14:39:51,168 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_hadoop_apps_searchprivacy_2017-02-03T14:39:03.257Z",
  "status" : "FAILED",
  "duration" : 43693
}

1 Answer

It looks like the JVM cannot load a native shared library (for example a .dll or .so file). Check whether the library is available on the machines where the task runs and, if it is, check that its directory is on the path the JVM searches for native libraries (java.library.path, rather than the classpath).
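A minimal sketch of the first check, assuming Linux hosts and assuming the libraries involved are libhadoop and libsnappy (a guess based on NativeCodeLoader and SnappyCodec appearing in the stack trace). The function name find_native_libs is hypothetical. It only reports whether matching .so files exist in the given directories. It does not prove that the JVM can link against them.

```python
import os


def find_native_libs(search_path, prefixes=("libhadoop", "libsnappy")):
    """Return {prefix: [matching .so file paths]} for a path-separated directory list."""
    found = {prefix: [] for prefix in prefixes}
    for directory in search_path.split(os.pathsep):
        # Skip empty entries and directories that do not exist on this host.
        if not directory or not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            for prefix in prefixes:
                if name.startswith(prefix) and ".so" in name:
                    found[prefix].append(os.path.join(directory, name))
    return found


if __name__ == "__main__":
    # LD_LIBRARY_PATH is used here only as a convenient default; pass the
    # directories you actually give the JVM as its native library path.
    path = os.environ.get("LD_LIBRARY_PATH", "")
    for prefix, hits in find_native_libs(path).items():
        print(prefix, "->", hits or "not found")
```

Running this on each machine that executes the indexing task shows quickly whether a library is missing on some hosts or present in a directory the JVM is not told about.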
