Apache Kafka – Java Producer & Consumer


Apache Kakfa is an opensource distributed event streaming platform which works based on publish/subscribe messaging system. That means, there would be a producer who publishes messages to Kafka and a consumer who reads messages from Kafka. In between, Kafka acts like a filesystem or database commit log.

In this article we will discuss about writing a Kafka producer and consumer using Java with customized serialization and deserializations.

Kakfa Producer Application:

Producer is the one who publish the messages to Kafka topic. Topic is a partitioner in Kafka environment, it is very similar to a folder in a file system. In the below example program, messages are getting published to Kafka topic ‘kafka-message-count-topic‘.

package com.malliktalksjava.kafka.producer;

import java.util.Properties;
import java.util.concurrent.ExecutionException;

import com.malliktalksjava.kafka.constants.KafkaConstants;
import com.malliktalksjava.kafka.util.CustomPartitioner;
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.serialization.LongSerializer;
import org.apache.kafka.common.serialization.StringSerializer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class KafkaSampleProducer {

    static Logger log = LoggerFactory.getLogger(KafkaSampleProducer.class);

    public static void main(String[] args) {
        runProducer();
    }

    static void runProducer() {
        Producer<Long, String> producer = createProducer();

        for (int index = 0; index < KafkaConstants.MESSAGE_COUNT; index++) {
            ProducerRecord<Long, String> record = new ProducerRecord<Long, String>(KafkaConstants.TOPIC_NAME,
                    "This is record " + index);
            try {
                RecordMetadata metadata = producer.send(record).get();
                //log.info("Record sent with key " + index + " to partition " + metadata.partition() +
                 //       " with offset " + metadata.offset());
                System.out.println("Record sent with key " + index + " to partition " + metadata.partition() +
                        " with offset " + metadata.offset());
            } catch (ExecutionException e) {
                log.error("Error in sending record", e);
                e.printStackTrace();
            } catch (InterruptedException e) {
                log.error("Error in sending record", e);
                e.printStackTrace();
            }
        }
    }

    public static Producer<Long, String> createProducer() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, KafkaConstants.KAFKA_BROKERS);
        props.put(ProducerConfig.CLIENT_ID_CONFIG, KafkaConstants.CLIENT_ID);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, LongSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, CustomPartitioner.class.getName());
        return new KafkaProducer<>(props);
    }
}

Kakfa Consumer Program:

Consumer is the one who subscribe to Kafka topic to read the messages. There are different ways to read the messages from Kafka, below example polls the topic for every thousend milli seconds to fetch the messages from Kafka.

package com.malliktalksjava.kafka.consumer;

import java.util.Collections;
import java.util.Properties;

import com.malliktalksjava.kafka.constants.KafkaConstants;
import com.malliktalksjava.kafka.producer.KafkaSampleProducer;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.LongDeserializer;
import org.apache.kafka.common.serialization.StringDeserializer;

import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class KafkaSampleConsumer {
    static Logger log = LoggerFactory.getLogger(KafkaSampleProducer.class);

    public static void main(String[] args) {
        runConsumer();
    }

    static void runConsumer() {
        Consumer<Long, String> consumer = createConsumer();
        int noMessageFound = 0;
        while (true) {
            ConsumerRecords<Long, String> consumerRecords = consumer.poll(1000);
            // 1000 is the time in milliseconds consumer will wait if no record is found at broker.
            if (consumerRecords.count() == 0) {
                noMessageFound++;
                if (noMessageFound > KafkaConstants.MAX_NO_MESSAGE_FOUND_COUNT)
                    // If no message found count is reached to threshold exit loop.
                    break;
                else
                    continue;
            }

            //print each record.
            consumerRecords.forEach(record -> {
                System.out.println("Record Key " + record.key());
                System.out.println("Record value " + record.value());
                System.out.println("Record partition " + record.partition());
                System.out.println("Record offset " + record.offset());
            });

            // commits the offset of record to broker.
            consumer.commitAsync();
        }
        consumer.close();
    }

        public static Consumer<Long, String> createConsumer() {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, KafkaConstants.KAFKA_BROKERS);
            props.put(ConsumerConfig.GROUP_ID_CONFIG, KafkaConstants.GROUP_ID_CONFIG);
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, LongDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, KafkaConstants.MAX_POLL_RECORDS);
            props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
            props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, KafkaConstants.OFFSET_RESET_EARLIER);

            Consumer<Long, String> consumer = new KafkaConsumer<>(props);
            consumer.subscribe(Collections.singletonList(KafkaConstants.TOPIC_NAME));
            return consumer;
        }
}

Messages will be published to a Kafka partition called Topic. A Kafka topic is sub-divided into units called partitions for fault tolerance and scalability.

Every Record in Kafka has key value pairs, while publishing messages key is optional. If you don’t pass the key, Kafka will assign its own key for each message. In Above example, ProducerRecord<Integer, String> is the message that published to Kafka has Integer type as key and String as value.

Message Model Class: Below model class is used to publish the object. Refer to below descriptions on how this class being used in the application.

package com.malliktalksjava.kafka.model;

import java.io.Serializable;

public class Message implements Serializable{

    private static final long serialVersionUID = 1L;

    private String id;
    private String name;

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}

Constants class: All the constants related to this application have been placed into below class.

package com.malliktalksjava.kafka.constants;

public class KafkaConstants {

    public static String KAFKA_BROKERS = "localhost:9092";

    public static Integer MESSAGE_COUNT=100;

    public static String CLIENT_ID="client1";

    public static String TOPIC_NAME="kafka-message-count-topic";

    public static String GROUP_ID_CONFIG="consumerGroup1";

    public static String GROUP_ID_CONFIG_2 ="consumerGroup2";

    public static Integer MAX_NO_MESSAGE_FOUND_COUNT=100;

    public static String OFFSET_RESET_LATEST="latest";

    public static String OFFSET_RESET_EARLIER="earliest";

    public static Integer MAX_POLL_RECORDS=1;
}

Custom Serializer: Serializer is the class which converts java objects to write into disk. Below custom serializer is converting the Message object to JSON String. Serialized message will be placed into Kafka Topic, this message can’t be read until it is deserialized by the consumer.

package com.malliktalksjava.kafka.util;

import java.util.Map;
import com.malliktalksjava.kafka.model.Message;
import org.apache.kafka.common.serialization.Serializer;
import com.fasterxml.jackson.databind.ObjectMapper;

public class CustomSerializer implements Serializer<Message> {

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {

    }

    @Override
    public byte[] serialize(String topic, Message data) {
        byte[] retVal = null;
        ObjectMapper objectMapper = new ObjectMapper();
        try {
            retVal = objectMapper.writeValueAsString(data).getBytes();
        } catch (Exception exception) {
            System.out.println("Error in serializing object"+ data);
        }
        return retVal;
    }
    @Override
    public void close() {

    }
}

Custom Deserializer: Below custom deserializer, converts the serealized object coming from Kafka into Java object.

package com.malliktalksjava.kafka.util;

import java.util.Map;

import com.malliktalksjava.kafka.model.Message;
import org.apache.kafka.common.serialization.Deserializer;

import com.fasterxml.jackson.databind.ObjectMapper;

public class CustomObjectDeserializer implements Deserializer<Message> {

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
    }

    @Override
    public Message deserialize(String topic, byte[] data) {
        ObjectMapper mapper = new ObjectMapper();
        Message object = null;
        try {
            object = mapper.readValue(data, Message.class);
        } catch (Exception exception) {
            System.out.println("Error in deserializing bytes "+ exception);
        }
        return object;
    }

    @Override
    public void close() {
    }
}

Custom Partitioner: If you would like to do any custom settings for Kafka, you can do that using the java code. Below is the sample custom partitioner created as part of this applicaiton.

package com.malliktalksjava.kafka.util;

import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import java.util.Map;

public class CustomPartitioner implements Partitioner{

    private static final int PARTITION_COUNT=50;

    @Override
    public void configure(Map<String, ?> configs) {

    }

    @Override
    public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
        Integer keyInt=Integer.parseInt(key.toString());
        return keyInt % PARTITION_COUNT;
    }

    @Override
    public void close() {
    }
}

Here is the GIT Hub link for the program: https://github.com/mallikarjungunda/kafka-producer-consumer

Hope you liked the details, please share your feedback in comments.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.