Reading from / writing to Kafka with NiFi in CDP Public Cloud (Cloudera Data Platform)
NiFi and Kafka are now both available in the Cloudera Data Platform, CDP Public Cloud. NiFi is great at talking to everything and Kafka is a mainstream message bus, so I was wondering:
What are the minimal steps needed to produce / consume data to and from Kafka with Apache NiFi within CDP Public Cloud?
Ideally I would look for steps that work in any cloud, for instance Amazon AWS and Microsoft Azure.
I am satisfied with answers that follow best practices and work with the default configuration of the platform, but if there are common alternatives, these are welcome as well.
1 Answer
There will be multiple form factors available in the future. For now I will assume you have an environment that contains one Data Hub with NiFi and one Data Hub with Kafka. (The answer still works if both are on the same Data Hub.)
Prerequisites:
- Data Hub(s) with NiFi and Kafka
- Permission to access these (e.g. add processor, create Kafka topic)
- Know your Workload User Name (CDP Management Console > click your name (bottom left) > click Profile)
- You should have set your Workload Password in the same location
These steps allow you to Produce data from NiFi to Kafka in CDP Public Cloud
Unless mentioned otherwise, I have kept everything to its default settings.
In Kafka Data Hub Cluster:
- Gather the FQDNs of the brokers and the ports they use.
- If you have Streams Messaging Manager: go to the Brokers tab, which shows each FQDN together with its port.
- If you cannot use Streams Messaging Manager: go to the Hardware tab of your Data Hub with Kafka and get the FQDN of the relevant nodes (currently these are named broker). Then append :portnumber to each one. The default port is 9093.
- Combine them into a single comma-separated list in this format: FQDN:port,FQDN:port,FQDN:port
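As a minimal sketch of the combining step, assuming three hypothetical broker hostnames (the `example.internal` names are placeholders, not real CDP hosts) and the default port 9093 mentioned above:

```python
def broker_list(fqdns, port=9093):
    """Join broker FQDNs into the FQDN:port,FQDN:port format."""
    # Strip stray whitespace and skip empty entries copied from the UI.
    cleaned = [host.strip() for host in fqdns if host.strip()]
    return ",".join(f"{host}:{port}" for host in cleaned)


# Hypothetical hostnames; replace them with the FQDNs of your own Data Hub.
brokers = broker_list([
    "broker0.example.internal",
    "broker1.example.internal",
    "broker2.example.internal",
])
print(brokers)
```

The resulting string is what goes into the Kafka Brokers property of the NiFi processor.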
In NiFi GUI:
- Make sure you have some data in NiFi to produce, for example by using a processor that generates test data (such as GenerateFlowFile).
- Select the relevant processor for writing to Kafka, for example PublishKafka_2_0, and configure it as follows:
- Settings
- Automatically terminate relationships: Tick both success and failure
- Properties
- Kafka Brokers: The combined list we created earlier
- Security Protocol: SASL_SSL
- SASL Mechanism: PLAIN
- SSL Context Service: Default NiFi SSL Context Service
- Username: your Workload User Name (see prerequisites above)
- Password: your Workload Password
- Topic Name: dennis
- Use Transactions: false
- Max Metadata Wait Time: 30 sec
- Connect your source processor to your PublishKafka_2_0 processor and start the flow
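If you want to test the same connection outside NiFi, the processor settings above correspond roughly to standard Kafka client properties. The sketch below builds them as a dictionary. The function name, hostname, and credentials are placeholders, and mapping Max Metadata Wait Time to `max.block.ms` is an assumption based on the NiFi processor documentation, not something configured explicitly in the steps above.

```python
def kafka_client_properties(brokers, username, password, max_wait_ms=30000):
    """Build Kafka client settings that mirror the processor configuration."""
    # Quotes and backslashes need escaping inside a JAAS string; this sketch
    # rejects them instead of guessing at the right escaping.
    for value in (username, password):
        if '"' in value or "\\" in value:
            raise ValueError("credentials with quotes or backslashes are not handled")
    jaas = (
        "org.apache.kafka.common.security.plain.PlainLoginModule required "
        f'username="{username}" password="{password}";'
    )
    return {
        "bootstrap.servers": brokers,
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        "sasl.jaas.config": jaas,
        # Assumed equivalent of "Max Metadata Wait Time: 30 sec".
        "max.block.ms": str(max_wait_ms),
    }


# Placeholder values; use your Workload User Name and Workload Password.
props = kafka_client_properties(
    "broker0.example.internal:9093", "my_workload_user", "my_workload_password"
)
for key in props:
    print(key)  # print keys only, so the password is not echoed
```

Outside NiFi there is no SSL Context Service, so a client may also need truststore settings if the broker certificate is not trusted by default; those are left out here.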
These are the minimal steps; a more extensive explanation can be found in the Cloudera Documentation. Note that it is best practice to create topics explicitly (this example relies on Kafka's ability to create a topic automatically when it is first produced to).
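One way to create the topic explicitly is the `kafka-topics` command-line tool that ships with Kafka (named `kafka-topics.sh` in upstream Apache Kafka distributions). The sketch below only assembles the command line and does not run it. The partition count, replication factor, broker address, and `client.properties` path are assumptions for illustration, and the properties file is assumed to contain the SASL_SSL settings for your workload user.

```python
import shlex


def create_topic_command(brokers, topic, partitions=3, replication=3,
                         command_config="client.properties"):
    """Assemble a kafka-topics invocation that creates one topic."""
    args = [
        "kafka-topics",
        "--bootstrap-server", brokers,
        "--command-config", command_config,  # client security settings
        "--create",
        "--topic", topic,
        "--partitions", str(partitions),
        "--replication-factor", str(replication),
    ]
    # shlex.join quotes any argument that the shell would otherwise split.
    return shlex.join(args)


# Hypothetical broker address; the topic name matches the example above.
print(create_topic_command("broker0.example.internal:9093", "dennis"))
```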
These steps allow you to Consume data with NiFi from Kafka in CDP Public Cloud
A good way to check whether the data was written to Kafka is to consume it again.
In NiFi GUI:
- Create a Kafka consumer processor, for instance ConsumeKafka_2_0 (the counterpart of the publisher used above), and configure its Properties as follows:
- Kafka Brokers, Security Protocol, SASL Mechanism, SSL Context Service, Username, Password, Topic Name: All the same as in our producer example above
- Consumer Group: 1
- Offset Reset: earliest
- Create another processor, or a funnel to send the messages to, and start the consumption processor.
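Because the consumer reuses most of the producer settings, it can help to derive one configuration from the other so the two cannot drift apart. This is a minimal sketch that uses the property names as written in this answer; the function name and the sample values are hypothetical.

```python
# Properties that must be identical for the producer and the consumer.
SHARED = (
    "Kafka Brokers", "Security Protocol", "SASL Mechanism",
    "SSL Context Service", "Username", "Password", "Topic Name",
)


def consumer_properties(producer_props, group="1", offset_reset="earliest"):
    """Copy the shared settings and add the consumer-only ones."""
    missing = [name for name in SHARED if name not in producer_props]
    if missing:
        raise KeyError(f"producer configuration lacks: {', '.join(missing)}")
    props = {name: producer_props[name] for name in SHARED}
    props["Consumer Group"] = group
    props["Offset Reset"] = offset_reset
    return props


# Sample producer settings with placeholder brokers and credentials.
producer = {
    "Kafka Brokers": "broker0.example.internal:9093",
    "Security Protocol": "SASL_SSL",
    "SASL Mechanism": "PLAIN",
    "SSL Context Service": "Default NiFi SSL Context Service",
    "Username": "my_workload_user",
    "Password": "my_workload_password",
    "Topic Name": "dennis",
    "Use Transactions": "false",  # producer-only, not copied
}
for name, value in consumer_properties(producer).items():
    shown = "********" if name == "Password" else value
    print(f"{name}: {shown}")
```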
And that's it: within 30 seconds you should see the data that you published to Kafka flowing into NiFi again.
Full disclosure: I am an employee of Cloudera, the driving force behind NiFi.