ERROR 1128: Cannot find field dryTemp
I ran my Pig script for temperature data and got an error. I have posted the code and the error below to make my problem easier to understand.
The error is at line 38, column 15. I tried removing dryTemp, but that produced a different error.
Code:
--Load files into relations
month1 = LOAD 'hdfs:/data/big/data/weather/weather/201201hourly.txt' USING PigStorage(',');
month2 = LOAD 'hdfs:/data/big/data/weather/weather/201202hourly.txt' USING PigStorage(',');
month3 = LOAD 'hdfs:/data/big/data/weather/weather/201203hourly.txt' USING PigStorage(',');
month4 = LOAD 'hdfs:/data/big/data/weather/weather/201204hourly.txt' USING PigStorage(',');
month5 = LOAD 'hdfs:/data/big/data/weather/weather/201205hourly.txt' USING PigStorage(',');
month6 = LOAD 'hdfs:/data/big/data/weather/weather/201206hourly.txt' USING PigStorage(',');
--Combine relations
months = UNION month1, month2, month3, month4, month5, month6;
/* Splitting relations
SPLIT months INTO
splitMonth1 IF SUBSTRING(date, 4, 6) == '01',
splitMonth2 IF SUBSTRING(date, 4, 6) == '02',
splitMonth3 IF SUBSTRING(date, 4, 6) == '03',
splitRest IF (SUBSTRING(date, 4, 6) == '04' OR SUBSTRING(date, 4, 6) == '04');
*/
/* Joining relations
stations = LOAD 'hdfs:/data/big/data/QCLCD201211/stations.txt' USING PigStorage() AS (id:int, name:chararray)
JOIN months BY wban, stations by id;
*/
--filter out unwanted data
clearWeather = FILTER months BY skyCondition == 'CLR';
--Transform and shape relation
shapedWeather = FOREACH clearWeather GENERATE date, SUBSTRING(date, 0, 4) as year, SUBSTRING(date, 4, 6) as month, SUBSTRING(date, 6, 8) as day, skyCondition, dryTemp;
--Group relation specifying number of reducers
groupedByMonthDay = GROUP shapedWeather BY (month, day) PARALLEL 10;
--Aggregate relation
aggedResults = FOREACH groupedByMonthDay GENERATE group as MonthDay, AVG(shapedWeather.dryTemp), MIN(shapedWeather.dryTemp), MAX(shapedWeather.dryTemp), COUNT(shapedWeather.dryTemp) PARALLEL 10;
--Sort relation
sortedResults = ORDER aggedResults BY $1 DESC;
--Store results in HDFS
STORE sortedResults INTO 'hdfs:/data/big/data/weather/pigresults' USING PigStorage(':');
Here is the error; it is fairly long. I don't know much about Pig yet and am still learning. I believe the error is related to a variable type that is not recognized, but I don't know how to fix it. I hope you can help.
Error:
ERROR 1128: Cannot find field dryTemp in :bytearray,year:chararray,month:chararray,day:chararray,:bytearray,:bytearray
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Cannot find field dryTemp in :bytearray,year:chararray,month:chararray,day:chararray,:bytearray,:bytearray
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1691)
at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411)
at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344)
at org.apache.pig.PigServer.executeBatch(PigServer.java:369)
at org.apache.pig.PigServer.executeBatch(PigServer.java:355)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:607)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: Failed to parse: Pig script failed to parse:
<file Documentos/pig/weather.pig, line 38, column 15> pig script failed to validate: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1128: Cannot find field dryTemp in :bytearray,year:chararray,month:chararray,day:chararray,:bytearray,:bytearray
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:196)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678)
... 15 more
Caused by:
<file Documentos/pig/weather.pig, line 38, column 15> pig script failed to validate: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1128: Cannot find field dryTemp in :bytearray,year:chararray,month:chararray,day:chararray,:bytearray,:bytearray
at org.apache.pig.parser.LogicalPlanBuilder.buildForeachOp(LogicalPlanBuilder.java:1017)
at org.apache.pig.parser.LogicalPlanGenerator.foreach_clause(LogicalPlanGenerator.java:15870)
at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1933)
at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
... 16 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1128: Cannot find field dryTemp in :bytearray,year:chararray,month:chararray,day:chararray,:bytearray,:bytearray
at org.apache.pig.newplan.logical.expression.DereferenceExpression.translateAliasToPos(DereferenceExpression.java:215)
at org.apache.pig.newplan.logical.expression.DereferenceExpression.getFieldSchema(DereferenceExpression.java:149)
at org.apache.pig.newplan.logical.optimizer.FieldSchemaResetter.execute(SchemaResetter.java:264)
at org.apache.pig.newplan.logical.expression.AllSameExpressionVisitor.visit(AllSameExpressionVisitor.java:148)
at org.apache.pig.newplan.logical.expression.DereferenceExpression.accept(DereferenceExpression.java:84)
at org.apache.pig.newplan.ReverseDependencyOrderWalker.walk(ReverseDependencyOrderWalker.java:70)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visitAll(SchemaResetter.java:67)
at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:122)
at org.apache.pig.newplan.logical.relational.LOGenerate.accept(LOGenerate.java:245)
at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:114)
at org.apache.pig.parser.LogicalPlanBuilder.buildForeachOp(LogicalPlanBuilder.java:1015)
... 22 more
Here are a few lines from the file 201211hourly.txt:
WBAN,Date,Time,StationType,SkyCondition,SkyConditionFlag,Visibility,VisibilityFlag,WeatherType,WeatherTypeFlag,DryBulbFarenheit,DryBulbFarenheitFlag,DryBulbCelsius,DryBulbCelsiusFlag,WetBulbFarenheit,WetBulbFarenheitFlag,WetBulbCelsius,WetBulbCelsiusFlag,DewPointFarenheit,DewPointFarenheitFlag,DewPointCelsius,DewPointCelsiusFlag,RelativeHumidity,RelativeHumidityFlag,WindSpeed,WindSpeedFlag,WindDirection,WindDirectionFlag,ValueForWindCharacter,ValueForWindCharacterFlag,StationPressure,StationPressureFlag,PressureTendency,PressureTendencyFlag,PressureChange,PressureChangeFlag,SeaLevelPressure,SeaLevelPressureFlag,RecordType,RecordTypeFlag,HourlyPrecip,HourlyPrecipFlag,Altimeter,AltimeterFlag
03011,20120101,0015,0,CLR,,10,00,,,, 23, -5,0, 15,-9,5,-9, -23,0, 24, 5, 120,,, 21,70,,,,, M,,AA,,, 30,43,
03011,20120101,0035,0,CLR, 10,00,,, 21,-6,0, 14, -10,2,-9, -23,0, 26, 6, 130,,, 21,70,,,,,,,M,,AA,,,,30,43,
03011,20120101,0055,0,CLR,,10,0 0,,,, 21,-6,0, 13, -10,5, -13, -25,0, 21,, 0, 000,,,,,,,,,,,M,,AA,,, 30,44,
03011,20120101,0115,0,CLR,,10.00,,,,21,-6.0, 14, -10.1,-8, -22.0, 27,, 0, 000,,,21.71,,,,, M,,AA,,,30.44,
03011,20120101,0135,0,CLR,,10.00,,, 21,-6.0, 13, -10.4,, -11,, -24.0,, 23,, 0,, 000,,,,21.72,,,,,,M,,AA,,,30.45,
03011,20120101,0155,0,CLR,,10.00,,,,21,,-6.0,,13,,-10.5,, -13, -25.0,, 21,, 6,,130,,,,21.72,,,,, M,,AA,,, 30,45,
03011,20120101,0215,0,CLR, 10,00,,, 21,-6,0, 14, -10,2,-9, -23,0, 26, 5,090,,, 21.73,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, 9, -23,0, 26, 6, 120,,, 21,74,,,,, M,,AA,,, 30,47,
03011,20120101,0255,0, CLR,, 10,00,,, 21,-6,0, 13, -10,4, -11, -24,0, 23, 7,, 130,,, 21,74,,,,, M,,AA,,, 30,48,
03011,20120101,0315,0,CLR,,10.00,,,,23,, -5.0,,15,,-9.4,,-8,, -22.0,, 2 5,, 9,, 120,,,,21.74,,,,,,M,,AA,,,,30.47,
03011,20120101,0335,0,CLR,,10.00,,,,23, -5.0,,15,,-9.4,-8, -22.0, 25,, 8, 120,,, 21.74,,,,, M,,AA,,, 30.47,
03011,20120101,0355, 0, CLR,, 10.00,,,, 21,-6.0, 14, -10.2,-9, -23.0, 26, 7, 120,,, 21.73,,,,,,M,,AA,,,,30,46,
03011,20120101,0415,0,CLR,,10.00,,,,23, -5,0, 14,-9,7, -13, -25,0, 19,, 7,, 130,,,, 21.73,,,,,, M,, AA,,,30.46,
2 Answers
I made a few changes to your script:
1. Load the data with a proper schema (you can change the data type of each field to suit your needs).
2. Combined all 6 LOAD statements into a single LOAD.
3. Removed the commented-out code.
I tested the Pig script below against your input and it works fine; I have also pasted the output.
PigScript:
--Load all the files into relations
months = LOAD 'hdfs:/data/big/data/weather/weather/20120[1-6]hourly.txt' USING PigStorage(',') AS (WBAN:int,Date:chararray,Time:chararray,StationType:int,SkyCondition:chararray,SkyConditionFlag,Visibility,VisibilityFlag,WeatherType,WeatherTypeFlag,DryBulbFarenheit:int,DryBulbFarenheitFlag,DryBulbCelsius:double,DryBulbCelsiusFlag,WetBulbFarenheit:int,WetBulbFarenheitFlag,WetBulbCelsius:double,WetBulbCelsiusFlag,DewPointFarenheit,DewPointFarenheitFlag,DewPointCelsius,DewPointCelsiusFlag,RelativeHumidity,RelativeHumidityFlag,WindSpeed,WindSpeedFlag,WindDirection,WindDirectionFlag,ValueForWindCharacter,ValueForWindCharacterFlag,StationPressure,StationPressureFlag,PressureTendency,PressureTendencyFlag,PressureChange,PressureChangeFlag,SeaLevelPressure,SeaLevelPressureFlag,RecordType,RecordTypeFlag,HourlyPrecip,HourlyPrecipFlag,Altimeter,AltimeterFlag);
--filter out unwanted data
clearWeather = FILTER months BY SkyCondition == 'CLR';
--Transform and shape relation
shapedWeather = FOREACH clearWeather GENERATE Date,
SUBSTRING(Date,0,4) AS year,
SUBSTRING(Date,4,6) AS month,
SUBSTRING(Date,6,8) AS day,
SkyCondition,
DryBulbFarenheit AS dryTemp;
--Group relation specifying number of reducers
groupedByMonthDay = GROUP shapedWeather BY (month, day) PARALLEL 10;
--Aggregate relation
aggedResults = FOREACH groupedByMonthDay GENERATE group as MonthDay, AVG(shapedWeather.dryTemp), MIN(shapedWeather.dryTemp), MAX(shapedWeather.dryTemp), COUNT(shapedWeather.dryTemp) PARALLEL 10;
--Sort relation
sortedResults = ORDER aggedResults BY $1 DESC;
--Store results in HDFS
STORE sortedResults INTO 'hdfs:/data/big/data/weather/pigresults' USING PigStorage(':');
Output (based on your sample input above):
(01,01):21.615384615384617:21:23:13
MonthDay:(01,01)
Avg:21.615384615384617
Min:21
Max:23
Count:13
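The output columns above are unlabeled in the stored file, and the ORDER statement refers to the average by position ($1). A minimal sketch of naming the aggregates instead, assuming the groupedByMonthDay relation from the script above; the aliases avgTemp, minTemp, maxTemp and obsCount are illustrative names, not part of the original script.

```pig
-- Sketch only: same aggregation as above, with each result column given an alias.
-- avgTemp, minTemp, maxTemp and obsCount are hypothetical names chosen for illustration.
aggedResults = FOREACH groupedByMonthDay GENERATE
    group AS MonthDay,
    AVG(shapedWeather.dryTemp) AS avgTemp,
    MIN(shapedWeather.dryTemp) AS minTemp,
    MAX(shapedWeather.dryTemp) AS maxTemp,
    COUNT(shapedWeather.dryTemp) AS obsCount;

-- Sort by the named average instead of the positional reference $1.
sortedResults = ORDER aggedResults BY avgTemp DESC;

-- Print the schema to confirm the field names and types Pig inferred.
DESCRIBE sortedResults;
```

Running DESCRIBE on any relation is also a quick way to check whether a field such as dryTemp is actually defined before it is referenced.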
It looks like you are loading "month1", "month2", etc. without specifying a schema (which is where "dryTemp" should be defined). You could try something like:
month1 = LOAD 'hdfs:/data/big/data/weather/201201hourly.txt' USING PigStorage(',')
AS (wban,year_month_day,time,station_type,maint_indic,
sky_cond,visibility,weather_type,dryTemp);
Do the same for all the other months.
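As a hedged sketch of the same idea, the schema can name just the leading columns the script uses, in the order given by the header row in the question. The field names date, skyCondition and dryTemp match the question's script; mapping dryTemp to the DryBulbFarenheit column (the 11th) and the declared types are assumptions, and the path is the one from the question.

```pig
-- Sketch only: name the first 11 columns so that date, skyCondition and dryTemp resolve.
-- Assumption: dryTemp corresponds to DryBulbFarenheit, the 11th column in the header row.
-- Fields declared without a type default to bytearray.
month1 = LOAD 'hdfs:/data/big/data/weather/weather/201201hourly.txt' USING PigStorage(',')
    AS (wban:int, date:chararray, time:chararray, stationType:int,
        skyCondition:chararray, skyConditionFlag, visibility, visibilityFlag,
        weatherType, weatherTypeFlag, dryTemp:int);

-- Confirm that the schema now contains the fields the later statements reference.
DESCRIBE month1;
```

With the same AS clause on each LOAD, the UNION of the six relations keeps the field names, so the FILTER on skyCondition and the FOREACH that projects dryTemp can resolve them.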
Thanks