Application configuration
The application is configured via the properties.xml file in the resources/META-INF directory, which includes the following elements:
- The Kafka input topic
- The Kafka broker address and port
- A schema of input records and its name
- A schema of output records and its name
- The SQL query used to filter and project
- The output filename
- The output directory
The first two are straightforward:
<property>
  <name>apex.operator.KafkaInput.prop.topics</name>
  <value>ETLTopic</value>
</property>
<property>
  <name>apex.operator.KafkaInput.prop.clusters</name>
  <value>localhost:9092</value> <!-- broker (NOT zookeeper) address -->
</property>
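As a quick sanity check, the two entries above can be read back with the JDK's standard XML parser. The following is a minimal sketch, not part of the application: the class name PropertiesXmlSketch, the SAMPLE constant, and the enclosing <configuration> root element are assumptions made for illustration (the root is assumed to follow the usual Hadoop-style layout).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reads <property><name>/<value> pairs into a map.
class PropertiesXmlSketch {

    // The two properties from the text, wrapped in an assumed <configuration> root.
    static final String SAMPLE =
        "<configuration>"
        + "<property>"
        + "<name>apex.operator.KafkaInput.prop.topics</name>"
        + "<value>ETLTopic</value>"
        + "</property>"
        + "<property>"
        + "<name>apex.operator.KafkaInput.prop.clusters</name>"
        + "<value>localhost:9092</value> <!-- broker (NOT zookeeper) address -->"
        + "</property>"
        + "</configuration>";

    // Parse every <property> element and collect its name and value, in document order.
    static Map<String, String> parse(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> props = new LinkedHashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element property = (Element) nodes.item(i);
            String name = property.getElementsByTagName("name").item(0).getTextContent().trim();
            String value = property.getElementsByTagName("value").item(0).getTextContent().trim();
            props.put(name, value);
        }
        return props;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> props = parse(SAMPLE);
        // Print the topic and broker address that the KafkaInput operator would receive.
        System.out.println(props.get("apex.operator.KafkaInput.prop.topics"));
        System.out.println(props.get("apex.operator.KafkaInput.prop.clusters"));
    }
}
```

The comment next to the broker value is a sibling of the <value> element, so it does not end up in the parsed value.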
The topics property of the KafkaInput operator defines the topic for input records, and the clusters property defines the address and port of the Kafka broker (it is important to ensure that this is the address and port of the actual broker and not of the ZooKeeper...