Error in the word count example from the tutorial

Question

I am currently learning Cascading and working through the second tutorial on its official site, which covers the Word Count example. I copied the code from it and tried to run it, but it always fails with the following error:

Exception in thread "main" cascading.flow.planner.PlannerException: could not build flow from assembly: [[token][com.starscriber.cascadingtest.Main.main(Main.java:44)] 
unable to resolve argument selector: [{1}:'text'], with incoming: [{1}:'doc01        A rain shadow is a dry area on the lee back side of a mountainous area.']] at cascading.flow.planner.FlowPlanner.handleExceptionDuringPlanning(FlowPlanner.java:576)
at cascading.flow.hadoop.planner.HadoopPlanner.buildFlow(HadoopPlanner.java:263)
at cascading.flow.hadoop.planner.HadoopPlanner.buildFlow(HadoopPlanner.java:80)
at cascading.flow.FlowConnector.connect(FlowConnector.java:459)
at com.starscriber.cascadingtest.Main.main(Main.java:58)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

Caused by: cascading.pipe.OperatorException: [token][com.starscriber.cascadingtest.Main.main(Main.java:44)] 
unable to resolve argument selector: [{1}:'text'], with incoming: [{1}:'doc01        A rain shadow is a dry area on the lee back side of a mountainous area.']
at cascading.pipe.Operator.resolveArgumentSelector(Operator.java:345)
at cascading.pipe.Each.outgoingScopeFor(Each.java:368)
at cascading.flow.planner.ElementGraph.resolveFields(ElementGraph.java:628)
at cascading.flow.planner.ElementGraph.resolveFields(ElementGraph.java:610)
at cascading.flow.hadoop.planner.HadoopPlanner.buildFlow(HadoopPlanner.java:248)
... 8 more

Caused by: cascading.tuple.FieldsResolverException: 
could not select fields: [{1}:'text'], from: [{1}:'doc01        A rain shadow is a dry area on the lee back side of a mountainous area.']
at cascading.tuple.Fields.indexOf(Fields.java:1008)
at cascading.tuple.Fields.select(Fields.java:1064)
at cascading.pipe.Operator.resolveArgumentSelector(Operator.java:341)
... 12 more

How can that be? I copied exactly the same code from the official GitHub repository and changed nothing:

String docPath = args[0];
String wcPath = args[1];

Properties properties = new Properties();          
AppProps.setApplicationJarClass(properties, Main.class);
HadoopFlowConnector flowConnector = new HadoopFlowConnector(properties);

// create source and sink taps
Tap docTap = new Hfs(new TextDelimited(true, "\t"), docPath);
Tap wcTap = new Hfs(new TextDelimited(true, "\t"), wcPath);

// specify a regex operation to split the "document" text lines into a token stream
Fields token = new Fields("token");
Fields text = new Fields("text");
RegexSplitGenerator splitter = new RegexSplitGenerator(token, "[ \\[\\]\\(\\),.]");
// only returns "token"
Pipe docPipe = new Each("token", text, splitter, Fields.RESULTS);

// determine the word counts
Pipe wcPipe = new Pipe("wc", docPipe);
wcPipe = new GroupBy(wcPipe, token);
wcPipe = new Every(wcPipe, Fields.ALL, new Count(), Fields.ALL);

// connect the taps, pipes, etc., into a flow
FlowDef flowDef = FlowDef.flowDef()
            .setName("wc")
            .addSource(docPipe, docTap)
            .addTailSink(wcPipe, wcTap);

// write a DOT file and run the flow
Flow wcFlow = flowConnector.connect(flowDef);
wcFlow.writeDOT("dot/wc.dot");
wcFlow.complete();

What is the problem?

And this is the input file:

doc01        A rain shadow is a dry area on the lee back side of a mountainous area.
doc02        This sinking, dry air produces a rain shadow, or area in the lee of a mountain with less rain and cloudcover.
doc03        A rain shadow is an area of dry land that lies on the leeward (or downwind) side of a mountain.
doc04        This is known as the rain shadow effect and is the primary cause of leeward deserts of mountain ranges, such as California's Death Valley.
doc05        Two Women. Secrets. A Broken Land. [DVD Australia]

java hadoop cascading

user2597504 Nov 19 '13 at 18:45

2 answers

user2772176 Nov 29 '13 at 09:49 · Answer 1

Check whether there is a tab character between the two fields, docId and text, in the input file. The program expects two tab-separated fields, but in your case it reads the whole line as a single field.
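One way to run that check is a small standalone program that reports every line which does not split into exactly two tab-separated fields. This is only a sketch: the class name `TabCheck` and the method `badLines` are made up for illustration and are not part of Cascading or the tutorial.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Cascading or the tutorial.
public class TabCheck {

    // Returns the 1-based numbers of the lines that do not split into
    // exactly `expectedFields` tab-separated fields.
    static List<Integer> badLines(List<String> lines, int expectedFields) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            // A limit of -1 keeps trailing empty fields in the count.
            int fields = lines.get(i).split("\t", -1).length;
            if (fields != expectedFields) {
                bad.add(i + 1);
            }
        }
        return bad;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the local input file.
        List<String> lines =
            Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        System.out.println(
            "Lines without exactly 2 tab-separated fields: " + badLines(lines, 2));
    }
}
```

If the file was pasted from a web page, the tabs have most likely been converted to spaces, and every line will be reported.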

user636696 Feb 02 '14 at 12:31 · Answer 2

As others have already mentioned, your input file must have the same headers as in the example. Instead of copying and pasting, try cloning the repository, so that you avoid errors caused by file formatting.
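The headers matter because the taps in the question are created with `new TextDelimited(true, "\t")`. As far as I can tell, the `true` makes Cascading take the field names from the first line of the file, which fits the error message, where the first data line shows up as the only field name. If cloning is not an option, a pasted copy can be repaired along these lines. This is a rough sketch: `FixInput` and `repair` are hypothetical names, and the header name `doc_id` is an assumption. Only `text` is confirmed by the code in the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Cascading or the tutorial.
public class FixInput {

    // Assumed header line. The question's code only requires that one
    // of the field names is "text"; "doc_id" is a guess.
    static final String HEADER = "doc_id\ttext";

    // Returns a copy of the lines that starts with the header and uses a
    // single tab between the document id and the text.
    static List<String> repair(List<String> lines) {
        List<String> out = new ArrayList<>();
        out.add(HEADER);
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // drop blank lines
            }
            // Replace the first run of whitespace (spaces or tabs) with one tab.
            String fixed = trimmed.replaceFirst("\\s+", "\t");
            if (fixed.equals(HEADER)) {
                continue; // the header was already present, do not repeat it
            }
            out.add(fixed);
        }
        return out;
    }
}
```

The repaired lines can then be written back to the file that is passed to the job as `args[0]`.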
