It appears that if a Kafka topic is repartitioned, for instance from 3 partitions to 4, the streaming job ignores the new partition.
This is because passing the fromOffsets parameter to createDirectStream tells the job to read only from the first 3 partitions and to ignore the 4th one (both records that may already have been produced to it and records that will be produced during job execution).
I would expect createDirectStream to handle partitions not listed in fromOffsets in a standard way (read new records), but unfortunately it does not behave like that.
Our application is written in Python, and we are using streaming-kafka-0-8-integration (as version 0.10 does not support Python).
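
To illustrate the situation, here is a minimal sketch of a possible workaround applied when the job is (re)started: any partition missing from the saved offsets is added with a default starting offset before the map is passed as fromOffsets. The helper name `fill_missing_partitions`, the topic name `events`, the sample offsets, and the default offset of 0 are all hypothetical, and how the current partition list is obtained is left out. This does not make a running job pick up partitions added during execution.

```python
def fill_missing_partitions(saved_offsets, current_partitions, topic, default_offset=0):
    """Return a copy of saved_offsets that also covers every current partition.

    saved_offsets maps (topic, partition) tuples to the next offset to read.
    Partitions absent from it get default_offset (an assumption: 0 is only
    valid if the partition still retains its earliest records).
    """
    merged = dict(saved_offsets)
    for partition in current_partitions:
        merged.setdefault((topic, partition), default_offset)
    return merged


# Hypothetical offsets saved while the topic still had 3 partitions.
saved = {("events", 0): 120, ("events", 1): 98, ("events", 2): 143}

# The topic now has 4 partitions (ids 0..3); partition 3 is filled in.
merged = fill_missing_partitions(saved, range(4), "events")

# With the 0-8 integration, the merged map could then be converted and passed on
# (assumes an existing StreamingContext `ssc` and a `kafka_params` dict):
#
#   from pyspark.streaming.kafka import KafkaUtils, TopicAndPartition
#   from_offsets = {TopicAndPartition(t, p): o for (t, p), o in merged.items()}
#   stream = KafkaUtils.createDirectStream(
#       ssc, ["events"], kafka_params, fromOffsets=from_offsets)
```

The merge keeps the saved offsets untouched and only adds entries for partitions that were not known when the offsets were stored, so existing partitions resume where they left off.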