Unable to read data from an HTTP source using Flume
I installed HDP on Ubuntu 14, and Flume is up and running, as I can verify through Ambari. I configured Flume to read from a local file and write to HDFS, which worked. Then I tried to read data from an HTTP source, but it does not seem to work.
The strange thing is that I also get no errors in the log file.
My configuration file looks like this:
agent3.sources=reader3
agent3.sources.reader3.type=org.apache.flume.source.http.HTTPSource
agent3.sources.reader3.port=4200
agent3.sources.reader3.handler=org.apache.flume.source.http.JSONHandler
agent3.sources.reader3.channels=memoryChannel3
agent3.sources.reader3.bind=localhost
agent3.sources.reader3.url=https://www.alphavantage.co/query?function=TIME_SERIES_DAILY&symbol=MSFT&apikey=ST4Y9ND1ZEL092VV
agent3.sources.reader3.enableSSL=false
agent3.sources.reader3.logStdErr=true
agent3.sources.reader3.restart=true
agent3.channels=memoryChannel3
agent3.channels.memoryChannel3.capacity=10000
agent3.channels.memoryChannel3.transactionCapacity=100
agent3.channels.memoryChannel3.type=memory
agent3.sinks=hdfs-sink2
agent3.sinks.hdfs-sink2.channel=memoryChannel3
agent3.sinks.hdfs-sink2.hdfs.filePrefix=json-
agent3.sinks.hdfs-sink2.hdfs.path=/hadoop/hdfs/data/current/flume/events/%y-%m-%d/%H%M/%s
agent3.sinks.hdfs-sink2.hdfs.round=true
agent3.sinks.hdfs-sink2.hdfs.roundUnit=minute
agent3.sinks.hdfs-sink2.hdfs.roundValue=10
agent3.sinks.hdfs-sink2.hdfs.useLocalTimeStamp=true
agent3.sinks.hdfs-sink2.type=hdfs
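Since the config uses JSONHandler, the source only accepts events POSTed to it in JSONHandler's format (a JSON array of events); HTTPSource acts as a listener and does not pull from a remote URL on its own. A quick way to check the source-to-HDFS path is to push a test event manually. A minimal sketch, assuming the agent is running on localhost:4200 as configured (the header and body values here are made up for illustration):

```python
import json
import urllib.request

# JSONHandler expects the POST body to be a JSON array of events,
# each with a "headers" map and a string "body".
events = [
    {
        "headers": {"symbol": "MSFT"},                 # arbitrary test header
        "body": '{"symbol": "MSFT", "close": 84.56}',  # made-up sample payload
    }
]
payload = json.dumps(events).encode("utf-8")

# POST to the bind address and port from the config above.
req = urllib.request.Request(
    "http://localhost:4200",
    data=payload,
    headers={"Content-Type": "application/json"},
)

# Uncomment once the agent is running; a 200 response means the events
# were accepted into memoryChannel3:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```

If events pushed this way show up under the HDFS path, the agent's source/channel/sink pipeline is fine, and the problem is simply that nothing is posting events to the source.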
Below is a snippet of the flume-agent3.log file (despite there being no errors in flume-agent3.log, no data is being streamed):
14 Nov 2017 14:50:46,700 INFO [lifecycleSupervisor-1-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start:61) - Configuration provider starting
14 Nov 2017 14:50:46,724 INFO [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:133) - Reloading configuration file:/usr/hdp/current/flume-server/conf/agent3/flume.conf
14 Nov 2017 14:50:46,757 INFO [conf-file-poller-0]
(duplicate lines removed...)
(org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1019) - Processing:hdfs-sink2
14 Nov 2017 14:50:46,763 INFO [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:933) - Added sinks: hdfs-sink2 Agent: agent3
14 Nov 2017 14:50:46,802 INFO [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration.validateConfiguration:140) - Post-validation flume configuration contains configuration for agents: [agent3]
14 Nov 2017 14:50:46,802 INFO [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.loadChannels:150) - Creating channels
14 Nov 2017 14:50:46,823 INFO [conf-file-poller-0] (org.apache.flume.channel.DefaultChannelFactory.create:40) - Creating instance of channel memoryChannel3 type memory
14 Nov 2017 14:50:46,831 INFO [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.loadChannels:205) - Created channel memoryChannel3
14 Nov 2017 14:50:46,831 INFO [conf-file-poller-0] (org.apache.flume.source.DefaultSourceFactory.create:39) - Creating instance of source reader3, type org.apache.flume.source.http.HTTPSource
14 Nov 2017 14:50:46,957 INFO [conf-file-poller-0] (org.apache.flume.sink.DefaultSinkFactory.create:40) - Creating instance of sink: hdfs-sink2, type: hdfs
14 Nov 2017 14:50:47,959 INFO [conf-file-poller-0] (org.apache.flume.sink.hdfs.HDFSEventSink.authenticate:560) - Hadoop Security enabled: false
14 Nov 2017 14:50:47,965 INFO [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.getConfiguration:119) - Channel memoryChannel3 connected to [reader3, hdfs-sink2]
14 Nov 2017 14:50:47,978 INFO [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:138) - Starting new configuration:{ sourceRunners:{reader3=EventDrivenSourceRunner: { source:org.apache.flume.source.http.HTTPSource{name:reader3,state:IDLE}}} sinkRunners:{hdfs-sink2=SinkRunner: {policy:org.apache.flume.sink.DefaultSinkProcessor@4cfccf09 counterGroup:{ name:null counters:{} } }} channels:{memoryChannel3=org.apache.flume.channel.MemoryChannel{name:memoryChannel3}} }
14 Nov 2017 14:50:47,989 INFO [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:145) - Starting Channel memoryChannel3
14 Nov 2017 14:50:48,176 INFO [lifecycleSupervisor-1-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.register:119) - Monitored counter group for type: CHANNEL, name: memoryChannel3: Successfully registered new MBean.
14 Nov 2017 14:50:48,177 INFO [lifecycleSupervisor-1-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.start:95) - Component type: CHANNEL, name: memoryChannel3 started
14 Nov 2017 14:50:48,178 INFO [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:173) - Starting Sink hdfs-sink2
14 Nov 2017 14:50:48,180 INFO [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:184) - Starting Source reader3
14 Nov 2017 14:50:48,184 INFO [lifecycleSupervisor-1-1] (org.apache.flume.instrumentation.MonitoredCounterGroup.register:119) - Monitored counter group for type: SINK, name: hdfs-sink2: Successfully registered new MBean.
14 Nov 2017 14:50:48,184 INFO [lifecycleSupervisor-1-1] (org.apache.flume.instrumentation.MonitoredCounterGroup.start:95) - Component type: SINK, name: hdfs-sink2 started
14 Nov 2017 14:50:48,321 INFO [lifecycleSupervisor-1-0] (org.mortbay.log.Slf4jLog.info:67) - Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
14 Nov 2017 14:50:48,494 INFO [lifecycleSupervisor-1-0] (org.mortbay.log.Slf4jLog.info:67) - jetty-6.1.26.hwx
14 Nov 2017 14:50:48,657 INFO [lifecycleSupervisor-1-0] (org.mortbay.log.Slf4jLog.info:67) - Started SelectChannelConnector@localhost:4200
14 Nov 2017 14:50:48,658 INFO [lifecycleSupervisor-1-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.register:119) - Monitored counter group for type: SOURCE, name: reader3: Successfully registered new MBean.
14 Nov 2017 14:50:48,658 INFO [lifecycleSupervisor-1-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.start:95) - Component type: SOURCE, name: reader3 started
14 Nov 2017 14:50:48,801 INFO [conf-file-poller-0] (org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink.configure:86) - Context parameters { parameters:{node=node.hadoop.com:6188, type=org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink} }
14 Nov 2017 14:50:48,913 INFO [conf-file-poller-0] (org.apache.hadoop.metrics2.sink.timeline.availability.MetricSinkWriteShardHostnameHashingStrategy.findCollectorShard:42) - Calculated collector shard node.hadoop.com based on hostname: node.hadoop.com
14 Nov 2017 14:50:48,913 INFO [conf-file-poller-0] (org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink.start:69) - Starting Flume Metrics Sink
14 Nov 2017 14:50:48,943 INFO [pool-5-thread-1] (org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink$TimelineMetricsCollector.processComponentAttributes:207) - ConnectionCreatedCount = 0
14 Nov 2017 14:50:48,944 INFO [pool-5-thread-1]
(duplicate lines removed...)
(org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink$TimelineMetricsCollector.processComponentAttributes:207) - StopTime = 0
14 Nov 2017 14:50:48,948 INFO [pool-5-thread-1] (org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink$TimelineMetricsCollector.processComponentAttributes:207) - ChannelCapacity = 10000
14 Nov 2017 14:50:48,948 INFO [pool-5-thread-1] (org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink$TimelineMetricsCollector.processComponentAttributes:207) - ChannelFillPercentage = 0.0
14 Nov 2017 14:50:48,948 INFO [pool-5-thread-1] (org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink$TimelineMetricsCollector.processComponentAttributes:207) - ChannelSize = 0
14 Nov 2017 14:50:48,949 INFO [pool-5-thread-1] (org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink$TimelineMetricsCollector.processComponentAttributes:207) - EventTakeSuccessCount = 0
14 Nov 2017 14:50:48,949 INFO [pool-5-thread-1] (org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink$TimelineMetricsCollector.processComponentAttributes:207) - EventTakeAttemptCount = 1
Can someone please help me figure out what is wrong here?