AMBARI-26131: Fix ambari-metrics-collector service check failure and resolve jar conflict #131

Closed
wants to merge 3 commits into from

@xijunmu xijunmu commented Sep 9, 2024

What changes were proposed in this pull request?

After installing ambari-metrics-collector, I found that the service check failed. I checked service port 6188 on the host: the port was listening, but requests to http://x.x.x.x:6188/ws/v1/timeline/metrics/livenodes returned HTTP 500.
I then found a warning in the ambari-metrics-collector log. The failure appears to be caused by a jar conflict.
[Screenshot: WeChat image 20240909143358]
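
For anyone reproducing this, the check described above can be sketched roughly as follows. This is a minimal sketch, not the actual Ambari service check: `livenodes_url` and `probe` are hypothetical helper names, and only the port (6188) and the URL path come from the description.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def livenodes_url(host, port=6188):
    """Build the livenodes URL quoted in the description (host is a placeholder)."""
    return f"http://{host}:{port}/ws/v1/timeline/metrics/livenodes"


def probe(url, timeout=5.0, opener=urlopen):
    """Return the HTTP status code, or None if the endpoint is unreachable.

    `opener` is injectable so the function can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        # The server answered, but with an error status such as 500.
        return err.code
    except URLError:
        # Nothing is listening, or the host cannot be resolved.
        return None


# Example use (assumption: run on a host that can reach the collector):
#   status = probe(livenodes_url("x.x.x.x"))
#   a listening port that still returns 500 matches the symptom described above.
```

A status of 500 with the port open points at the application rather than the network, which is consistent with the jar conflict reported in the collector log.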

I also upgraded several dependency versions.
To address CVEs:
    1. Upgrade commons-io to 2.8.0
    2. Upgrade guava to 32.1.1-jre
To align with Bigtop 3 (including updated tar.gz download URLs):
    1. Upgrade kafka to 2.8.2
    2. Upgrade zookeeper to 3.7.2
    3. Upgrade hbase to 2.4.17
    4. Upgrade hadoop to 3.3.6
    5. Upgrade phoenix to 2.4-5.1.3
Added a missing jar:
    1. sqlline

How was this patch tested?

1. Build ambari-metrics:
mvn clean package -Dbuild-rpm -Drat.skip=true -DskipTests -Dmaven.test.skip=true
[Screenshot: WeChat image 20240909194248]
2. Install ambari-metrics.
[Screenshot: WeChat image 20240909194351]

2.upgrade commons-io to 2.8.0
3.upgrade guava to 32.1.1-jre
4.upgrade kafka to 2.8.2
5.upgrade zookeeper to 3.7.2
6.upgrade hbase to 2.4.17
7.upgrade hadoop to 3.3.6
8.add missing jar sqlline
2.modify download url
@xijunmu
Contributor Author

xijunmu commented Sep 9, 2024

@brahmareddybattula @kevinw66 @virajjasani Hi guys, could somebody help review this PR?

@virajjasani

Nice one, this looks good. For those exclusions that you commented, we can remove them.

@virajjasani

+1, @JiaLiangC any review from your side?

@JiaLiangC
Contributor

@virajjasani Sorry for the late reply. It's strange that I didn't receive any email notifications for Ambari Metrics PRs or mentions. This PR corrects the download URLs for Hadoop binaries, etc. Excluding the Hadoop and HBase dependencies from Phoenix and explicitly declaring them is indeed more elegant. LGTM +1

@virajjasani

I am not sure what release version we need to use while closing Jira related to metrics repo.

@JiaLiangC
Contributor

> I am not sure what release version we need to use while closing Jira related to metrics repo.

Indeed, it's difficult to determine the version in which the issue will be resolved. Ideally, Ambari Metrics should maintain version consistency with Ambari for better management, but Metrics has already released version 3.0... What do you think about 3.1?

@virajjasani

Yes I was thinking about the same. It's time to release 3.1. These are some of the good fixes that we should release.

@@ -103,7 +103,6 @@
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>28.0-jre</version>

It would be good to provide a version here.

Contributor Author

I think the parent pom's guava version should be used, so I removed the explicit version here.

@@ -141,7 +141,6 @@ limitations under the License.
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>18.0</version>

Can you add a version here as well?

Contributor Author

Same as the previous comment
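
To make the point in these two threads concrete: when a module omits `<version>` on a dependency, Maven takes the version from the parent pom's managed dependencies, provided the parent manages that artifact (the reply above implies it does for guava). Below is a rough sketch for auditing which module poms still pin their own version. `pinned_versions` and the sample pom are hypothetical and only for illustration; they are not part of this PR.

```python
import xml.etree.ElementTree as ET

# Standard Maven POM namespace.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical child pom that still pins guava, like the line removed in this diff.
CHILD_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
      <version>18.0</version>
    </dependency>
  </dependencies>
</project>"""


def pinned_versions(pom_xml, artifact_id):
    """Return explicit <version> values declared for artifact_id.

    Only <project>/<dependencies> is inspected; <dependencyManagement>
    entries are deliberately ignored because that is where a parent pom
    is expected to declare the managed version.
    """
    root = ET.fromstring(pom_xml)
    found = []
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        if dep.findtext("m:artifactId", namespaces=POM_NS) == artifact_id:
            version = dep.findtext("m:version", namespaces=POM_NS)
            if version is not None:
                found.append(version.strip())
    return found


# A pom with the explicit version removed reports nothing to clean up.
CLEANED_POM = CHILD_POM.replace("      <version>18.0</version>\n", "")
```

An empty result for a module means it inherits whatever the parent declares, which is the behavior the author is relying on here.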

<hbase.folder>hbase-2.4.17</hbase.folder>
<hadoop.tar>https://archive.apache.org/dist/hadoop/common/hadoop-3.3.6/hadoop-3.3.6.tar.gz</hadoop.tar>
<hadoop.folder>hadoop-3.3.6</hadoop.folder>
<hadoop.version>3.3.6</hadoop.version>

Should we keep the Hadoop version in sync with Ambari, where it is currently 3.3.4?
Or should we bump the version in Ambari as well?

Nice one, I think we should bump in ambari also.

@sandeep318kumar
Contributor

@xijunmu CI has failed for this. Can you check and update?

@xijunmu
Contributor Author

xijunmu commented Sep 26, 2024

@sandeep318kumar I checked the code and found that the CI failure is not caused by this PR. It comes from the Python tests in ambari-metrics-host-monitoring (missing modules and a mock-related error):

......

[INFO] --- exec:1.2.1:exec (python-test) @ ambari-metrics-host-monitoring ---
Running tests
testApplicationMetricMap (TestApplicationMetricMap.TestApplicationMetricMap) ... ok
testEmptyMapReturnNone (TestApplicationMetricMap.TestApplicationMetricMap) ... ok
testFlattenAndClear (TestApplicationMetricMap.TestApplicationMetricMap) ... ok
test_flatten_and_align_values_by_minute_mark (TestApplicationMetricMap.TestApplicationMetricMap) ... ok
TestEmitter (unittest.loader._FailedTest) ... ERROR
TestMetricCollector (unittest.loader._FailedTest) ... ERROR
testCombinedDiskUsage (TestHostInfo.TestHostInfo) ... ok
testCpuTimes (TestHostInfo.TestHostInfo) ... ok
testDiskIOCounters (TestHostInfo.TestHostInfo) ... ok
testMemInfo (TestHostInfo.TestHostInfo) ... ERROR
testProcessInfo (TestHostInfo.TestHostInfo) ... ok
test_get_disk_io_counters_per_disk (TestHostInfo.TestHostInfo) ... ok
test_get_network_info_skip_by_pattern (TestHostInfo.TestHostInfo) ... ok
test_get_network_info_skip_by_pattern_and_virtual (TestHostInfo.TestHostInfo) ... ok
test_get_network_info_virtual_devices (TestHostInfo.TestHostInfo) ... ok
test_get_virtual_network_interfaces (TestHostInfo.TestHostInfo) ... ok

======================================================================
ERROR: TestEmitter (unittest.loader._FailedTest)

ImportError: Failed to import test module: TestEmitter
Traceback (most recent call last):
  File "/opt/ambari/ambari-metrics/ambari-metrics-host-monitoring/src/main/python/core/spnego_kerberos_auth.py", line 27, in <module>
    import kerberos
ModuleNotFoundError: No module named 'kerberos'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib64/python3.6/unittest/loader.py", line 153, in loadTestsFromName
    module = __import__(module_name)
  File "/opt/ambari/ambari-metrics/ambari-metrics-host-monitoring/src/test/python/core/TestEmitter.py", line 29, in <module>
    from spnego_kerberos_auth import SPNEGOKerberosAuth
  File "/opt/ambari/ambari-metrics/ambari-metrics-host-monitoring/src/main/python/core/spnego_kerberos_auth.py", line 29, in <module>
    from resource_monitoring.core import krberr as kerberos
ModuleNotFoundError: No module named 'resource_monitoring'

======================================================================
ERROR: TestMetricCollector (unittest.loader._FailedTest)

ImportError: Failed to import test module: TestMetricCollector
Traceback (most recent call last):
  File "/usr/lib64/python3.6/unittest/loader.py", line 153, in loadTestsFromName
    module = __import__(module_name)
  File "/opt/ambari/ambari-metrics/ambari-metrics-host-monitoring/src/test/python/core/TestMetricCollector.py", line 25, in <module>
    from core.metric_collector import MetricsCollector
  File "/opt/ambari/ambari-metrics/ambari-metrics-host-monitoring/src/main/python/core/metric_collector.py", line 23, in <module>
    from resource_monitoring.core.event_definition import HostMetricCollectEvent, ProcessMetricCollectEvent
ModuleNotFoundError: No module named 'resource_monitoring'

======================================================================
ERROR: testMemInfo (TestHostInfo.TestHostInfo)

Traceback (most recent call last):
  File "/opt/ambari/ambari-metrics/ambari-metrics-host-monitoring/src/test/python/mock/mock.py", line 1199, in patched
    return func(*args, **keywargs)
  File "/opt/ambari/ambari-metrics/ambari-metrics-host-monitoring/src/test/python/core/TestHostInfo.py", line 82, in testMemInfo
    mem = hostinfo.get_mem_info()
  File "/opt/ambari/ambari-metrics/ambari-metrics-host-monitoring/src/main/python/core/host_info.py", line 125, in get_mem_info
    'mem_total': bytes2kilobytes(mem_total) if mem_total else 0,
  File "/opt/ambari/ambari-metrics/ambari-metrics-host-monitoring/src/main/python/core/host_info.py", line 122, in <lambda>
    bytes2kilobytes = lambda x: x / 1024
TypeError: unsupported operand type(s) for /: 'MagicMock' and 'int'


Ran 16 tests in 0.063s

FAILED (errors=3)

Failed tests:
ERROR: TestEmitter (unittest.loader._FailedTest)

ImportError: Failed to import test module: TestEmitter
Traceback (most recent call last):
  File "/opt/ambari/ambari-metrics/ambari-metrics-host-monitoring/src/main/python/core/spnego_kerberos_auth.py", line 27, in <module>
    import kerberos
ModuleNotFoundError: No module named 'kerberos'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib64/python3.6/unittest/loader.py", line 153, in loadTestsFromName
    module = __import__(module_name)
  File "/opt/ambari/ambari-metrics/ambari-metrics-host-monitoring/src/test/python/core/TestEmitter.py", line 29, in <module>
    from spnego_kerberos_auth import SPNEGOKerberosAuth
  File "/opt/ambari/ambari-metrics/ambari-metrics-host-monitoring/src/main/python/core/spnego_kerberos_auth.py", line 29, in <module>
    from resource_monitoring.core import krberr as kerberos
ModuleNotFoundError: No module named 'resource_monitoring'
sdiskio(read_count=0, write_count=1, read_bytes=2, write_bytes=3, read_time=4, write_time=5)

ERROR: TestMetricCollector (unittest.loader._FailedTest)

ImportError: Failed to import test module: TestMetricCollector
Traceback (most recent call last):
  File "/usr/lib64/python3.6/unittest/loader.py", line 153, in loadTestsFromName
    module = __import__(module_name)
  File "/opt/ambari/ambari-metrics/ambari-metrics-host-monitoring/src/test/python/core/TestMetricCollector.py", line 25, in <module>
    from core.metric_collector import MetricsCollector
  File "/opt/ambari/ambari-metrics/ambari-metrics-host-monitoring/src/main/python/core/metric_collector.py", line 23, in <module>
    from resource_monitoring.core.event_definition import HostMetricCollectEvent, ProcessMetricCollectEvent
ModuleNotFoundError: No module named 'resource_monitoring'

ERROR: testMemInfo (TestHostInfo.TestHostInfo)

Traceback (most recent call last):
  File "/opt/ambari/ambari-metrics/ambari-metrics-host-monitoring/src/test/python/mock/mock.py", line 1199, in patched
    return func(*args, **keywargs)
  File "/opt/ambari/ambari-metrics/ambari-metrics-host-monitoring/src/test/python/core/TestHostInfo.py", line 82, in testMemInfo
    mem = hostinfo.get_mem_info()
  File "/opt/ambari/ambari-metrics/ambari-metrics-host-monitoring/src/main/python/core/host_info.py", line 125, in get_mem_info
    'mem_total': bytes2kilobytes(mem_total) if mem_total else 0,
  File "/opt/ambari/ambari-metrics/ambari-metrics-host-monitoring/src/main/python/core/host_info.py", line 122, in <lambda>
    bytes2kilobytes = lambda x: x / 1024
TypeError: unsupported operand type(s) for /: 'MagicMock' and 'int'


Total run:16
Total errors:3
Total failures:0
ERROR
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for ambari-metrics 3.1.0-SNAPSHOT:
[INFO]
[INFO] ambari-metrics ..................................... SUCCESS [ 5.623 s]
[INFO] Ambari Metrics Common .............................. SUCCESS [01:09 min]
[INFO] Ambari Metrics Hadoop Sink ......................... SUCCESS [ 22.543 s]
[INFO] Ambari Metrics Flume Sink .......................... SUCCESS [ 14.651 s]
[INFO] Ambari Metrics Kafka Sink .......................... SUCCESS [ 15.066 s]
[INFO] Ambari Metrics Storm Sink .......................... SUCCESS [ 11.955 s]
[INFO] Ambari Metrics Collector ........................... SUCCESS [06:26 min]
[INFO] Ambari Metrics Monitor ............................. FAILURE [ 4.433 s]
[INFO] Ambari Metrics Grafana ............................. SKIPPED
[INFO] Ambari Metrics Host Aggregator ..................... SKIPPED
[INFO] Ambari Metrics Assembly ............................ SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
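
The `testMemInfo` error in the log above is a mock object reaching arithmetic that expects a number. Below is a small sketch of that failure class and one possible way to avoid it. It is not the fix from any particular PR. `get_mem_info` here is a hypothetical stand-in for the code under test. Plain `Mock` from the standard library is used because, as far as I know, the standard library's `MagicMock` in current Python versions supports `/`, unlike the bundled `mock.py` shown in the traceback.

```python
from unittest.mock import Mock


def get_mem_info(mem):
    """Hypothetical stand-in for the code under test: bytes to kilobytes."""
    bytes2kilobytes = lambda x: x / 1024
    return {"mem_total": bytes2kilobytes(mem.total) if mem.total else 0}


# An unconfigured mock attribute is truthy but is not a number,
# so the division raises TypeError, as in the log.
unset = Mock()
try:
    get_mem_info(unset)
    failed = False
except TypeError:
    failed = True

# Giving the mock a concrete numeric value keeps the arithmetic on real numbers.
configured = Mock(total=2048)
result = get_mem_info(configured)
```

Since the failing line lives in the existing host-monitoring tests and not in the pom changes, this kind of error is independent of the dependency upgrades in this PR.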

@sandeep318kumar
Contributor

@xijunmu I have resolved these build failures in this PR: #133

@xijunmu
Contributor Author

xijunmu commented Sep 27, 2024

@sandeep318kumar nice work!

@sandeep318kumar
Contributor

@xijunmu My PR has been merged. Can you rebase your PR?
