⚠️ The Kafka version used for the experiments in this article is 2.11.
A message in Kafka is a ProducerRecord. Besides the value being sent, it also carries: topic, partition, headers, key, and timestamp.
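For illustration, here is a record populated with every field; the topic name, header, and values are made up:

import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.header.internals.RecordHeaders;

ProducerRecord<String, String> record = new ProducerRecord<>(
        "demo-topic",                   // topic
        0,                              // partition (may be null)
        System.currentTimeMillis(),     // timestamp (may be null)
        "key",                          // key
        "value",                        // value
        new RecordHeaders().add("trace-id", "42".getBytes())); // headers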
When the producer sends a message, not all of this information counts toward the message size. See the code below.
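The relevant lines sit in KafkaProducer#doSend; they are abridged here from the 2.x sources, so the exact code may differ in your version:

// KafkaProducer#doSend (abridged). The serializers receive the topic and
// headers as context; the size estimate then counts key, value and headers.
byte[] serializedKey = keySerializer.serialize(record.topic(), record.headers(), record.key());
byte[] serializedValue = valueSerializer.serialize(record.topic(), record.headers(), record.value());
Header[] headers = record.headers().toArray();
int serializedSize = AbstractRecords.estimateSizeInBytesUpperBound(apiVersions.maxUsableProduceMagic(),
        compressionType, serializedKey, serializedValue, headers);
ensureValidRecordSize(serializedSize);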
The code above serializes the value into a byte array, with the topic, headers, and key also taking part in the serialization. The check that validates whether the serialized message is too large is the ensureValidRecordSize(serializedSize) method.
ensureValidRecordSize validates against two limits: maxRequestSize (max.request.size) and totalMemorySize (buffer.memory). Only when the serialized size is below both can the message be sent.
private void ensureValidRecordSize(int size) {
    if (size > this.maxRequestSize)
        throw new RecordTooLargeException("The message is " + size +
                " bytes when serialized which is larger than the maximum request size you have configured with the " +
                ProducerConfig.MAX_REQUEST_SIZE_CONFIG + " configuration.");
    if (size > this.totalMemorySize)
        throw new RecordTooLargeException("The message is " + size +
                " bytes when serialized which is larger than the total memory buffer you have configured with the " +
                ProducerConfig.BUFFER_MEMORY_CONFIG + " configuration.");
}
An oversized single message triggers the RecordTooLargeException shown above.
One caveat: if you simply fire-and-forget the send, neither monitoring it with a Callback nor retrieving the result via the Future, an oversized message produces no visible error at all.
Future<RecordMetadata> send = kafkaProducer.send(new ProducerRecord<>("topic", "key", "value"));
RecordMetadata recordMetadata = send.get();
System.out.println(recordMetadata);
In the Future interface, the get() method declares @throws ExecutionException: if the computation threw an exception, get() rethrows it wrapped in an ExecutionException.
/**
 * Waits if necessary for the computation to complete, and then
 * retrieves its result.
 *
 * @return the computed result
 * @throws CancellationException if the computation was cancelled
 * @throws ExecutionException if the computation threw an exception
 * @throws InterruptedException if the current thread was interrupted while waiting
 */
V get() throws InterruptedException, ExecutionException;
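With the Future approach, the error therefore surfaces only if get() is called and the ExecutionException is handled; a minimal sketch:

try {
    RecordMetadata metadata = kafkaProducer.send(new ProducerRecord<>("topic", "key", "value")).get();
    System.out.println(metadata);
} catch (ExecutionException e) {
    // getCause() carries the producer-side error, e.g. RecordTooLargeException.
    e.getCause().printStackTrace();
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}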
Now look at the interface Kafka provides specifically for callbacks.
// English javadoc omitted. In short: this is the asynchronous callback,
// invoked once the message sent to the server has been acknowledged.
// Exactly one of the two parameters is non-null: metadata is set when
// no exception occurred, and exception is set otherwise.
public void onCompletion(RecordMetadata metadata, Exception exception);
kafkaProducer.send(new ProducerRecord<>("topic", "key", "value"), new Callback() {
    @Override
    public void onCompletion(RecordMetadata metadata, Exception exception) {
        if (exception != null) {
            exception.printStackTrace();
        }
    }
});
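Since Callback declares a single method, a Java 8 lambda works just as well: kafkaProducer.send(record, (metadata, exception) -> { if (exception != null) exception.printStackTrace(); });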
With the log level set to DEBUG, the warning is also written to standard output.
Comparing the two approaches: Future belongs to the Java concurrency standard library, was not designed for Kafka, and requires explicitly catching the exception, whereas Callback is the standard callback mechanism Kafka provides. The latter should therefore be preferred.
The producer has a parameter that limits message size, and the broker has one as well: message.max.bytes, which defaults to 1000012 B (roughly 1MB), so the broker accepts just under 1MB of data. (With the newer producer client, messages are always grouped into batches; see the RecordBatch interface.)
/**
 * A record batch is a container for records. In old versions of the record format (versions 0 and 1),
 * a batch consisted always of a single record if no compression was enabled, but could contain
 * many records otherwise. Newer versions (magic versions 2 and above) will generally contain many records
 * regardless of compression.
 *
 * In short: in old versions a batch holds a single record unless compression is enabled;
 * in new versions a batch generally holds many records whether or not compression is used.
 */
public interface RecordBatch extends Iterable<Record> {
    ...
}
To change the maximum message size the broker accepts, add message.max.bytes=100000 to the broker's server.properties file. The value can be whatever you need; the unit is bytes.
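For example:

# server.properties
# Maximum message (batch) size the broker will accept, in bytes.
message.max.bytes=100000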
What happens if the producer's maximum send size is set to 1MB while the broker's limit is 512KB?
The answer: the broker rejects the message, the producer gets back a RecordTooLargeException, and the message is never consumed. The error reads: org.apache.kafka.common.errors.RecordTooLargeException: The request included a message larger than the max message size the server will accept.
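A sketch of that mismatched setup (the concrete numbers are assumptions):

// Broker side, in server.properties:
//   message.max.bytes=524288                                    // 512KB limit
// Producer side:
Properties props = new Properties();
props.put(ProducerConfig.MAX_REQUEST_SIZE_CONFIG, 1024 * 1024);  // 1MB allowed locally
// Any batch between 512KB and 1MB now passes the client-side check
// but is rejected by the broker with RecordTooLargeException.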
The consumer also limits how much it fetches. Three parameters matter here: fetch.min.bytes, fetch.max.bytes, and fetch.max.wait.ms.
If the time set by fetch.max.wait.ms is reached, the fetch returns even when the total size of available messages has not met fetch.min.bytes.
What happens if fetch.max.bytes is set too small? Does the consumer then return no data at all when nothing fits? The documentation answers this.
The maximum amount of data the server should return for a fetch request. Records are fetched in batches by the consumer, and if the first record batch in the first non-empty partition of the fetch is larger than this value, the record batch will still be returned to ensure that the consumer can make progress.
The gist: fetch.max.bytes is the maximum total amount of data the server returns for one fetch. Records are returned to the consumer in batches, and if the first record batch in the first non-empty partition is larger than this value, that batch is still returned so the consumer can make progress.
The conclusion: this consumer-side parameter only affects how much data a fetch reads; it never blocks consumption outright.
properties.put(ConsumerConfig.FETCH_MAX_BYTES_CONFIG, 1024);
properties.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 1024);
properties.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 1);
...
while (true) {
    ConsumerRecords<String, String> records = kafkaConsumer.poll(Duration.ofSeconds(Integer.MAX_VALUE));
    System.out.println(records.count());
}
Start the consumer with the three parameters above, which set the minimum and maximum fetch size plus the maximum wait time, and print the number of records returned by each poll to standard output.
Experiment result: since every message sent is larger than 1024 B, each fetch returns only one record, so the program prints an endless stream of 1s.
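For completeness, a producer loop that reproduces the experiment; the topic name and payload size are assumptions:

import java.util.Arrays;

// Each value is 2048 B, larger than fetch.max.bytes = 1024, so the first
// batch of every fetch already exceeds the limit and is returned alone.
byte[] payload = new byte[2048];
Arrays.fill(payload, (byte) 'a');
for (int i = 0; i < 100; i++) {
    kafkaProducer.send(new ProducerRecord<>("demo-topic", Integer.toString(i), new String(payload)));
}
kafkaProducer.flush();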