Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

upgraded_kafka_client_version_to_3.9.0 #279

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

adkharat
Copy link

@adkharat adkharat commented Nov 27, 2024

Existing kafka version = 0.11.0.2
Upgraded Kafka version = 3.9.0

With reference to prestodb/presto PR Fix CVE-2022-34917

Kafka 3.9.0 depricates direct Zookeeper interactions:

  • ZkClient,
  • ZkUtils

Upgrading depricates below AdminUtils methods:

  • AdminUtils.createTopic()
  • AdminUtils.topicExists()
  • AdminUtils.deleteTopic()
  1. Deprecated AdminUtils functions are replaced with AdminClient APIs
  2. Deprecated Zookeeper dependency is replaced with Kafka's AdminClient API.
  3. Encapsulated AdminClient logic in a method KafkaAdminClient to manage lifecycle and reduce boilerplate.
  4. Operations such as topic creation (createTopic) and deletion (deleteTopic) now use AdminClient's createTopics and deleteTopics.

FYI:

kafka.utils.ZkUtils was deprecated since 2.0.0.

2.0.0- > ZKUtils present
2.3.1 -> ZKUtils present
2.4.0 -> ZKUtils removed
2.8.2 -> ZKUtils not present (Version 2.8.2 has some vulnerability)
3.7.1 -> ZKUtils not present (No vulnerability)
3.9.0 -> ZKUtils not present (Latest version + No vulnerability)

@adkharat adkharat force-pushed the upgrade_kafka_client_version_cve_2022_34917 branch from 2576948 to a15a170 Compare November 27, 2024 05:58
@adkharat
Copy link
Author

adkharat commented Nov 27, 2024

Build on

gradle -v                                                     
------------------------------------------------------------
Gradle 6.6
------------------------------------------------------------

Build time:   2020-08-10 22:06:19 UTC
Revision:     d119144684a0c301aea027b79857815659e431b9

Kotlin:       1.3.72
Groovy:       2.5.12
Ant:          Apache Ant(TM) version 1.10.8 compiled on May 10 2020
JVM:          1.8.0_422 (Azul Systems, Inc. 25.422-b05)
OS:           Mac OS X 14.4 x86_64

@adkharat adkharat marked this pull request as ready for review November 27, 2024 06:00
@adkharat
Copy link
Author

@imjalpreet can you please review the changes.

@adkharat adkharat changed the title upgraded_kafka_client_version_to_2.8.2 upgraded_kafka_client_version_to_3.7.1 Dec 2, 2024
@imjalpreet
Copy link
Member

@ZacBlanco FYI

@@ -41,6 +41,7 @@ jar {

shadowJar {
version = ''
zip64 = true

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this required?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ZacBlanco on gradle build, I am getting error

Execution failed for task ':tempto-examples:shadowJar'.
> shadow.org.apache.tools.zip.Zip64RequiredException: archive contains more than 65535 entries.
image

Here, ShadowJar plugin is creating a JAR file that exceeds the limit of 65,535 entries. This might due to the ZIP format limitation that ShadowJar uses by default. Link

Copy link

@ZacBlanco ZacBlanco Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thanks for the explanation. This is fine then I guess. We were probably close to the limit and the newer version pushed us over? It would be good to know the number of entries before/after this change, but I don't think it's worth trying to fix

Copy link

@ZacBlanco ZacBlanco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only other concern is the same with Jalpreet's on the presto PR. I think we should try to move to the latest available Kafka version

@adkharat
Copy link
Author

adkharat commented Dec 6, 2024

@ZacBlanco Latest Kafka version is 3.9.0.
Pushed changes for 3.9.0

@adkharat adkharat changed the title upgraded_kafka_client_version_to_3.7.1 upgraded_kafka_client_version_to_3.9.0 Dec 6, 2024
@adkharat adkharat force-pushed the upgrade_kafka_client_version_cve_2022_34917 branch from d2d3660 to 6bf9291 Compare December 6, 2024 20:00
Copy link

@ZacBlanco ZacBlanco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one comment, otherwise LGTM.

@@ -113,7 +111,7 @@ ext.tempto_runner = project(':tempto-runner')
ext.tempto_ldap = project(':tempto-ldap')
ext.tempto_kafka = project(':tempto-kafka')
ext.expected_result_generator = project(':expected-result-generator')
ext.tempto_version = '1.53'
ext.tempto_version = '1.54'
Copy link

@ZacBlanco ZacBlanco Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should only be upgraded and set to a non-SNAPSHOT version during the release process. It looks like after the last release it wasn't properly updated. Since we already release 1.53, this should be 1.54-SNAPSHOT. During release we will update to 1.54, and then move to 1.55-SNAPSHOT afterwards.

So could you set this to 1.54-SNAPSHOT for now?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ZacBlanco Done. Updated to 1.54-SNAPSHOT

@adkharat adkharat force-pushed the upgrade_kafka_client_version_cve_2022_34917 branch 2 times, most recently from c6a547a to 411394e Compare December 11, 2024 10:37
@adkharat
Copy link
Author

adkharat commented Dec 12, 2024

@ZacBlanco I don't see an option to merge the PR. If no additional approvals are required, could you please merge these PR?
I believe the Tempto jar should get auto-published to https://oss.sonatype.org/#nexus-search;quick~io.prestodb.tempto based on the configuration in build.gradle

@ZacBlanco
Copy link

ZacBlanco commented Dec 12, 2024

I can't merge it. I don't have write access to this repository. @imjalpreet if you have time could you also review?

The jar is not automatically published. Someone with write access has to publish it to maven central. We also need to verify that the latest jar update doesn't break the presto-product-tests CI on the presto repository.

@tdcmeehan can merge after we get Jalpreet's review and verify that the CI does not break.

build.gradle Outdated
@@ -113,7 +111,7 @@ ext.tempto_runner = project(':tempto-runner')
ext.tempto_ldap = project(':tempto-ldap')
ext.tempto_kafka = project(':tempto-kafka')
ext.expected_result_generator = project(':expected-result-generator')
ext.tempto_version = '1.53'
ext.tempto_version = '1.54-SNAPSHOT'
Copy link
Member

@agrawalreetika agrawalreetika Dec 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, before there should be a release prep commit to update ext.tempto_version from 1.53 to 1.54-SNAPSHOT? something like -
c1eec18

Then in your PR, you should change version from 1.54-SNAPSHOT to 1.54 for next release.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, before there should be a release prep commit to update ext.tempto_version from 1.53 to 1.54-SNAPSHOT? something like - c1eec18

Then in your PR, you should change version from 1.54-SNAPSHOT to 1.54 for next release.

@agrawalreetika

I checked the previous commits and can confirmed that no commits were pushed earlier for 1.54-SNAPSHOT.
From an example I found, the process involves first merging 1.54-SNAPSHOT and then creating another PR on top of it for 1.54.

Here are the previous PRs for reference:

Let me know, Shall I create a new PR for 1.54-SNAPSHOT ? and once it's merged, 1.54 will be updated with this current PR.

build.gradle Outdated
@@ -102,8 +101,7 @@ ext.libraries = [
commons_cli : "commons-cli:commons-cli:${versions.commons_cli}",
thrift : "org.apache.thrift:libthrift:${versions.thrift}",
kafka_clients : "org.apache.kafka:kafka-clients:${versions.kafka}",
kafka : "org.apache.kafka:kafka_2.12:${versions.kafka}",
zkclient : "com.101tec:zkclient:${versions.zkclient}"
kafka : "org.apache.kafka:kafka_2.12:${versions.kafka}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious, which class is getting used from this library?

Copy link
Author

@adkharat adkharat Dec 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Classes with package names starting with org.apache.kafka.clients will originate from the kafka_clients library.

For example:-

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.admin.AdminClient;

Classes with package names starting with org.apache.kafka.common will originate from the kafka library.

For example:-

import org.apache.kafka.common.errors.TopicExistsException;
import org.apache.kafka.common.serialization.ByteArraySerializer

Both are being used in KafkaTableManager.java

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @adkharat

I just checked I see below classes that you mentioned are coming from kafka-clients-3.9.0.jar nothing from these are from org.apache.kafka:kafka_2.12:${versions.kafka}

org/apache/kafka/clients/producer/KafkaProducer.class
org/apache/kafka/clients/admin/AdminClient.class
org/apache/kafka/common/errors/TopicExistsException.class
org/apache/kafka/common/serialization/ByteArraySerializer.class

So I am curious do we even need "org.apache.kafka:kafka_2.12:${versions.kafka}" in this upgrade?

Copy link
Author

@adkharat adkharat Dec 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@agrawalreetika
That is the case then not required.
But kafka: "org.apache.kafka:kafka_2.12:${versions.kafka} is referenced in the tempto-kafka/build.gradle file. Build will fails if it's removed.
I think, in tempto-kafka/build.gradle file, we need to replace the reference to libraries.kafka with libraries.kafka_clients. Then we are good to remove.
I have pushed changes.

Properties properties = new Properties();
properties.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, brokerHost + ":" + brokerPort);
AdminClient adminClient = AdminClient.create(properties);
try {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use try-with-resources instead for handling close of adminClient

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

properties.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, brokerHost + ":" + brokerPort);
AdminClient adminClient = AdminClient.create(properties);
try {
routine.accept(adminClient);
Copy link
Member

@agrawalreetika agrawalreetika Dec 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would happen if there is an issue/exception while doing above? Should we add a catch and logger for it?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@agrawalreetika added catch to handle exception and logged error.

Copy link
Contributor

@tdcmeehan tdcmeehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rebase to pick up the GitHub actions job.

@adkharat adkharat force-pushed the upgrade_kafka_client_version_cve_2022_34917 branch 3 times, most recently from 10ff64e to ba6b9dc Compare December 17, 2024 07:37
@adkharat
Copy link
Author

Please rebase to pick up the GitHub actions job.

@tdcmeehan Done. Rebased

@adkharat adkharat force-pushed the upgrade_kafka_client_version_cve_2022_34917 branch 3 times, most recently from 0603fa5 to f364578 Compare December 18, 2024 04:36
upgraded_kafka_client_version_to_3.9.0

upgraded_kafka_client_version_to_2.8.2

undo commented code

removed zkclient

Upgraded kafka version to 3.7.1

upgraded to 3.9.0

updated version to 1.54-SNAPSHOT

Prepare 1.54 development release iteration

upgraded_kafka_client_version_to_3.9.0

added topic check before deletion

added retry in topic deletion
@adkharat adkharat force-pushed the upgrade_kafka_client_version_cve_2022_34917 branch from 203cbe8 to 132bb5a Compare December 18, 2024 05:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants