Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ArtemisJmsRABuildItem #584

Merged
merged 4 commits into from
Aug 2, 2024
Merged

Add ArtemisJmsRABuildItem #584

merged 4 commits into from
Aug 2, 2024

Conversation

zhfeng
Copy link
Contributor

@zhfeng zhfeng commented Aug 1, 2024

It could fix #580 partly, and we still need to consider how to do the health check in quarkus-artemis-ra.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@turing85 turing85 added bug Something isn't working backport-3.2.x labels Aug 1, 2024
Copy link

github-actions bot commented Aug 2, 2024

🚦Reports for run #1418🚦

Reports will be posted here as they get available.

🥳 JUnit JVM Test passed

Passed Failed Skipped
✅ 181 ❌ 0 ⚠️ 0

You can see the report here.

🥳 JUnit Native Test passed

Passed Failed Skipped
✅ 113 ❌ 0 ⚠️ 0

You can see the report here.

@zhfeng
Copy link
Contributor Author

zhfeng commented Aug 2, 2024

@vsevel can you check this fix with your project?

@vsevel
Copy link

vsevel commented Aug 2, 2024

I now get:

$ curl localhost:9000/q/health/ready
        {
            "name": "Artemis JMS Resource Adaptor health check",
            "status": "UP",
            "data": {
                "<default>": "UP"
            }
        }

deps:

$ ls -al target/quarkus-app/lib/main | grep -E "artemis|iron"
-rw-r--r-- 1 Sevel 1049089   22029 Aug  2 11:51 io.quarkiverse.artemis.quarkus-artemis-core-999-SNAPSHOT.jar
-rw-r--r-- 1 Sevel 1049089   11146 Aug  2 11:51 io.quarkiverse.artemis.quarkus-artemis-jms-ra-999-SNAPSHOT.jar
-rw-r--r-- 1 Sevel 1049089   53855 Aug  2 11:51 io.quarkiverse.ironjacamar.quarkus-ironjacamar-1.3.2.jar
-rw-r--r-- 1 Sevel 1049089  676595 Aug  2 08:44 org.apache.activemq.artemis-commons-2.36.0.jar
-rw-r--r-- 1 Sevel 1049089  829150 Aug  2 08:44 org.apache.activemq.artemis-core-client-2.36.0.jar
-rw-r--r-- 1 Sevel 1049089   21552 Aug  2 08:44 org.apache.activemq.artemis-hqclient-protocol-2.36.0.jar
-rw-r--r-- 1 Sevel 1049089  187781 Aug  2 08:44 org.apache.activemq.artemis-jakarta-client-2.36.0.jar
-rw-r--r-- 1 Sevel 1049089  160114 Aug  2 08:44 org.apache.activemq.artemis-jakarta-ra-2.36.0.jar
-rw-r--r-- 1 Sevel 1049089   33311 Aug  2 08:44 org.apache.activemq.artemis-jakarta-service-extensions-2.36.0.jar
-rw-r--r-- 1 Sevel 1049089  112139 Aug  2 08:44 org.apache.activemq.artemis-selector-2.36.0.jar
-rw-r--r-- 1 Sevel 1049089  103566 Apr 30 10:56 org.jboss.ironjacamar.ironjacamar-common-api-3.0.9.Final.jar
-rw-r--r-- 1 Sevel 1049089   57849 Apr 30 10:56 org.jboss.ironjacamar.ironjacamar-core-api-3.0.9.Final.jar
-rw-r--r-- 1 Sevel 1049089  502175 Apr 30 10:56 org.jboss.ironjacamar.ironjacamar-core-impl-3.0.9.Final.jar

seems everything is working nicely. what would you say that it only partially fixes the issue @zhfeng ? what would be still missing? some additional checks we could do in the artemis RA?
I am surprised with the impl of:

    public HealthCheckResponse call() {
        HealthCheckResponseBuilder builder = HealthCheckResponse.named("Artemis JMS Resource Adaptor health check").up();
        for (String name : connectionFactoryNames) {
            Annotation identifier = ArtemisUtil.toIdentifier(name);
            try (Connection ignored = connectionFactories.select(identifier).get().createConnection()) {
                builder.withData(name, "UP");
            } catch (Exception e) {
                builder.withData(name, "DOWN").down();
            }
        }
        return builder.build();
    }

do you want the status to be always UP?
or should the status be DOWN as soon as at least one connection factory is down?

@zhfeng
Copy link
Contributor Author

zhfeng commented Aug 2, 2024

Thanks @vsevel - that's great to hear it is working!

I mean partially fix when only introducing the ArtemisJmsRABuildItem. Now I think we almost get to fix the issue :)

@turing85 what do you think about the status? I just bowrrow your codes from the quarkus-artemis-jms.

If all of the connection factories are DOWN, the status should be DOWN?

@vsevel
Copy link

vsevel commented Aug 2, 2024

If all of the connection factories are DOWN, the status should be DOWN?

it is a vast subject. in the company, we decided to disable most of the health checks, and extended them to map them onto @Wellness, then tie those to our alerting system.
and by default we even disabled the readiness probes for all of our apps. we have never seen a good reason to pull a single pod out of the pool.
so I actually understand that you would keep a status UP, even if a connection factory was DOWN, specifically if tied to the readiness probe.

to me the proper way would be to have a check per connection factory, all tied to wellness.

this should not be a startup check because it does not prevent the app from starting, and work partially. it should not be readiness because it does not prevent the app from working partially, and it should not be liveness because a restart is not going to magically solve the loss of connectivity.
this is particularly true with the ra that has some resiliency built it (the app can start without connectivity, and it will later connect if available. and if loss of connectivity happens while the app is running, it will be able to reconnect as soon as the broker is available again).
my 2 cents.

@turing85
Copy link
Contributor

turing85 commented Aug 2, 2024

...

    public HealthCheckResponse call() {
        HealthCheckResponseBuilder builder = HealthCheckResponse.named("Artemis JMS Resource Adaptor health check").up();
        for (String name : connectionFactoryNames) {
            Annotation identifier = ArtemisUtil.toIdentifier(name);
            try (Connection ignored = connectionFactories.select(identifier).get().createConnection()) {
                builder.withData(name, "UP");
            } catch (Exception e) {
                builder.withData(name, "DOWN").down();
            }
        }
        return builder.build();
    }

do you want the status to be always UP? or should the status be DOWN as soon as at least one connection factory is down?

The check is not always up. The call to down() "downs" the complete check, not the data-item.
Calling

builder.withData(name, "DOWN").down();

is the same as calling

builder.withData(name, "DOWN");
builder.down();

Basically, the check starts as being UP. While we iterate over the connection factories, we DOWN the check in every exception. In other words: if one connection is down, the whole check goes down.

@vsevel
Copy link

vsevel commented Aug 2, 2024

yes you are right @turing85 I missed it.
so if one connection is down, the whole check is down.
that is one possible behavior, consistent with others I have seen, even if I don't think this is the proper way of doing this. but this is a more general discussion. so I am fine with it.

@turing85
Copy link
Contributor

turing85 commented Aug 2, 2024

@vsevel Believe me, I am on your side on this one. I triggered a discussion about this a while back. Don't remember where though. Either zulip or the mailing list.

@turing85
Copy link
Contributor

turing85 commented Aug 2, 2024

So... is this MR ready4merge?

@vsevel
Copy link

vsevel commented Aug 2, 2024

I triggered a discussion about this a while back

me too ;-)
https://groups.google.com/g/quarkus-dev/c/pj6L7kzbIno/m/B4RXJcupBwAJ

So... is this MR ready4merge?

I think so. still not clear whether or not the initial issue is completely fixed.
@zhfeng what do you mean with Now I think we almost get to fix the issue :)?

@zhfeng
Copy link
Contributor Author

zhfeng commented Aug 2, 2024

Yeah, I think it is good to merge. @turing85 feel free to go ahead!

Thanks all for reviwing and @gastaldi for the quick release of quarkus-ironjacamar.

@turing85 turing85 merged commit 0b9b29f into main Aug 2, 2024
43 checks passed
@turing85 turing85 deleted the ra_healthcheck branch August 2, 2024 14:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-3.2.x bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

health check fails with java.lang.IllegalArgumentException: port out of range:-1
3 participants