Specify 2GB max heap for tests #1474

Closed

Conversation

iloveeclipse
Member

-Xmx2g is needed for Jenkins where RAM is low.

The SDK doesn't specify -Xmx anymore, see #1463.

@iloveeclipse
Member Author

@laeubi : could you please review? I'm not a maven/tycho expert.

@laeubi
Contributor

laeubi commented Oct 25, 2023

Build verification works, so it can't be that bad ;-)

@iloveeclipse
Member Author

However, it looks like maven-war-plugin is also affected (or is the one causing the trouble)?

See https://ci.eclipse.org/platform/job/eclipse.platform/job/PR-767/1/console

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-war-plugin:3.3.2:war (default-war) on project infocenter-app: Error assembling WAR: Problem creating war: Execution exception: Java heap space -> [Help 2]

But that one shouldn't run with SDK / shouldn't be affected by SDK eclipse.ini changes?

@iloveeclipse
Member Author

iloveeclipse commented Oct 25, 2023

Build verification works, so it can't be that bad ;-)

Not sure if any tests are running that use the option? At least the log doesn't say so: searching for -Xmx2g finds nothing in the console... https://ci.eclipse.org/platform/job/eclipse.platform.releng.aggregator/job/PR-1474/1/consoleText
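One way to check whether a forked test JVM actually received the option is to let the process report its own settings. The sketch below uses only standard JDK APIs (`RuntimeMXBean.getInputArguments()` and `Runtime.maxMemory()`). The class name `HeapSettingsProbe` is made up for illustration and is not part of the aggregator build.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of the aggregator build.
final class HeapSettingsProbe {

    /** Returns the last -Xmx argument found in the given JVM arguments, if any. */
    static Optional<String> findMaxHeapFlag(List<String> jvmArgs) {
        Optional<String> found = Optional.empty();
        for (String arg : jvmArgs) {
            if (arg.startsWith("-Xmx")) {
                found = Optional.of(arg); // keep scanning so a later flag replaces an earlier one
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Arguments passed to this JVM (not the program arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("-Xmx flag: " + findMaxHeapFlag(jvmArgs).orElse("(not set)"));
        // Effective maximum heap as seen by the running JVM.
        System.out.println("max heap (MiB): " + Runtime.getRuntime().maxMemory() / (1024 * 1024));
    }
}
```

Printing this from inside a test would show the effective limit even when the launch command is not echoed in the console log.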

@laeubi
Contributor

laeubi commented Oct 25, 2023

However, it looks like maven-war-plugin is also affected (or is the one causing the trouble)?

See https://ci.eclipse.org/platform/job/eclipse.platform/job/PR-767/1/console

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-war-plugin:3.3.2:war (default-war) on project infocenter-app: Error assembling WAR: Problem creating war: Execution exception: Java heap space -> [Help 2]

But that one shouldn't run with SDK / shouldn't be affected by SDK eclipse.ini changes?

That's what really makes me wonder, especially as the GitHub Actions builds don't seem to be affected in the same way. To me it looks like an infra change may have left Maven itself with insufficient resources.

Not sure if any tests are running that use the option?

Yes, tests are disabled here (because they run for a very long time) ... my bad. So to verify, one needs to run it locally, or we need to temporarily remove -DskipTests to see what is happening.

@laeubi
Contributor

laeubi commented Oct 25, 2023

But it could also be some kind of memory leak... I wonder if we can get a heap dump, or if that will blow up Jenkins, @fredg02?

iloveeclipse added a commit to iloveeclipse/eclipse.platform.releng.aggregator that referenced this pull request Oct 25, 2023
Enabled tests to see effect of
eclipse-platform#1474
@iloveeclipse
Member Author

or we need to temporarily remove -DskipTests to see what is happening.

I've pushed a bad commit on top (I actually wanted to use a different PR). Let's see.

@iloveeclipse iloveeclipse marked this pull request as draft October 25, 2023 14:29
@iloveeclipse
Member Author

To me it looks like an infra change may have left Maven itself with insufficient resources.

The only way to know would be to revert #1471; if the OOMs are still there, we have a new infra/maven/tycho issue.

I will do that if the experiment above (removing -DskipTests) does not show anything useful.

@iloveeclipse
Member Author

And yes, we seem to have yet another infra issue: [ERROR] Caused by: repo.eclipse.org:443 failed to respond... OMFG

@akurtakov
Member

This seems like another argument for redoing releng, as mentioned in #1472. If projects are really split, the resources needed for each will be sufficient more often, and a lot less extra configuration will be needed.

@iloveeclipse
Member Author

This seems like another argument for redoing releng, as mentioned in #1472. If projects are really split, the resources needed for each will be sufficient more often, and a lot less extra configuration will be needed.

Not sure it's related. We have some build tasks and some tests running (sometimes in parallel) on a very limited Jenkins VM with only 4GB RAM. Some tasks just don't get enough memory or have memory leaks. How can that be related to the releng/project structure? Even if we had a single project to build and test, it could still get an OOM if maven/tycho/whatever has a memory leak.
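The memory pressure described here can be made concrete with simple arithmetic. The following is a rough sketch only: the 4GB VM size, `-Xmx2g`, and `-Xmx1G` appear in this conversation, but the combination of three JVMs running at once and the class `MemoryBudget` are assumptions for illustration. Heap limits also understate real usage, because each JVM needs additional native memory.

```java
// Hypothetical back-of-the-envelope calculation, not a measurement.
final class MemoryBudget {

    /** Parses heap sizes such as "2g", "1G" or "512m" into MiB. */
    static long parseMiB(String size) {
        char unit = Character.toLowerCase(size.charAt(size.length() - 1));
        long value = Long.parseLong(size.substring(0, size.length() - 1));
        switch (unit) {
            case 'g': return value * 1024;
            case 'm': return value;
            default: throw new IllegalArgumentException("unsupported unit: " + size);
        }
    }

    /** Sums the maximum heap sizes of JVMs assumed to run at the same time. */
    static long totalHeapMiB(String... maxHeaps) {
        long total = 0;
        for (String heap : maxHeaps) {
            total += parseMiB(heap);
        }
        return total;
    }

    public static void main(String[] args) {
        long vmMiB = parseMiB("4g");                    // Jenkins VM size mentioned above
        long heaps = totalHeapMiB("2g", "1G", "1g");    // assumed set of concurrent JVMs
        System.out.println("heap limits: " + heaps + " MiB of " + vmMiB + " MiB");
        System.out.println("headroom for native memory and OS: " + (vmMiB - heaps) + " MiB");
    }
}
```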

@akurtakov
Member

This seems like another argument for redoing releng, as mentioned in #1472. If projects are really split, the resources needed for each will be sufficient more often, and a lot less extra configuration will be needed.

Not sure it's related. We have some build tasks and some tests running (sometimes in parallel) on a very limited Jenkins VM with only 4GB RAM. Some tasks just don't get enough memory or have memory leaks. How can that be related to the releng/project structure? Even if we had a single project to build and test, it could still get an OOM if maven/tycho/whatever has a memory leak.

If JDT and PDE build and generate their own repos and run their own tests in their own builds, that would free the releng.aggregator build from 100+ bundles (some of them heavy ones) and thus reduce the hardware requirements to execute it.

@iloveeclipse
Member Author

If JDT and PDE build and generate their own repos and run their own tests in their own builds, that would free the releng.aggregator build from 100+ bundles (some of them heavy ones) and thus reduce the hardware requirements to execute it.

This is not the point here, we are probably talking about different issues.
The OOM's were not reported by PDE or JDT tests.

@akurtakov
Member

If JDT and PDE build and generate their own repos and run their own tests in their own builds, that would free the releng.aggregator build from 100+ bundles (some of them heavy ones) and thus reduce the hardware requirements to execute it.

This is not the point here, we are probably talking about different issues. The OOMs were not reported by PDE or JDT tests.

Maven doesn't always build modules in a strict order; rather, it takes the first dependency order it figures out, which can differ from build to build. Thus, if you have an OOM, it can happen in one bundle in one run and in another bundle in the next run, and so on. The usual solution is to reduce the number of modules to make things more predictable and manageable.

@iloveeclipse
Member Author

Looks like the JDT model tests are unhappy due to the ${tycho.surefire.argLine} argument from their pom file. Interestingly, that doesn't seem to be a problem for them when running in Jenkins.

[ERROR] Failed to execute goal org.eclipse.tycho:tycho-surefire-plugin:4.0.4-SNAPSHOT:test (default-test) on project org.eclipse.jdt.core.tests.model: An unexpected error occurred while launching the test runtime (process returned error code 1). Command-line used to launch the sub-process was /opt/tools/java/openjdk/jdk-17/latest/bin/java -Dosgi.noShutdown=false -Dosgi.os=linux -Dosgi.ws=gtk -Dosgi.arch=x86_64 -Xmx1G -Djdt.default.test.compliance=1.8 -DDetectVMInstallationsJob.disabled=true ${tycho.surefire.argLine} -Dosgi.clean=true -ea -jar /home/jenkins/agent/workspace/atform.releng.aggregator_PR-1474/equinox/bundles/org.eclipse.equinox.launcher/target/org.eclipse.equinox.launcher-1.6.600-SNAPSHOT.jar -data /home/jenkins/agent/workspace/atform.releng.aggregator_PR-1474/eclipse.jdt.core/org.eclipse.jdt.core.tests.model/target/work/data -install /home/jenkins/agent/workspace/atform.releng.aggregator_PR-1474/eclipse.jdt.core/org.eclipse.jdt.core.tests.model/target/work -configuration /home/jenkins/agent/workspace/atform.releng.aggregator_PR-1474/eclipse.jdt.core/org.eclipse.jdt.core.tests.model/target/work/configuration -application org.eclipse.tycho.surefire.osgibooter.headlesstest -testproperties /home/jenkins/agent/workspace/atform.releng.aggregator_PR-1474/eclipse.jdt.core/org.eclipse.jdt.core.tests.model/target/surefire.properties in working directory /home/jenkins/agent/workspace/atform.releng.aggregator_PR-1474/eclipse.jdt.core/org.eclipse.jdt.core.tests.model
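In the command line above, the token `${tycho.surefire.argLine}` appears literally, which suggests the property was never substituted in this build. A small check like the one below could flag such leftovers before a process is launched. This is a sketch only: the class `UnresolvedPlaceholders` is hypothetical, and the sample string is a shortened copy of the logged command line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Tycho or Maven.
final class UnresolvedPlaceholders {

    // Matches ${name} style tokens that survived property substitution.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Returns the names of all ${...} placeholders still present in the text. */
    static List<String> find(String commandLine) {
        List<String> names = new ArrayList<>();
        Matcher matcher = PLACEHOLDER.matcher(commandLine);
        while (matcher.find()) {
            names.add(matcher.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        // Shortened excerpt of the command line from the log above.
        String logged = "java -Xmx1G -DDetectVMInstallationsJob.disabled=true "
                + "${tycho.surefire.argLine} -Dosgi.clean=true -ea -jar launcher.jar";
        System.out.println("unresolved: " + find(logged));
    }
}
```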

iloveeclipse added a commit to iloveeclipse/eclipse.platform.releng.aggregator that referenced this pull request Oct 25, 2023
This reverts commit b0b7e79.

It is possible that Jenkins test execution is affected by that change
(we see some strange OOM errors), see comments on
eclipse-platform#1474.

To make sure the change doesn't cause any side effects, let's revert it.
@iloveeclipse
Member Author

I will do that if the experiment above (removing -DskipTests) does not show anything useful.

See #1475

@Bananeweizen
Contributor

Maven doesn't always build modules in a strict order; rather, it takes the first dependency order it figures out, which can differ from build to build.

That's worded wrongly, even though the end result will be the same: Maven always enforces a partial ordering that reflects all explicit dependencies, no matter which builder is used later. For modules that don't have an explicit dependency relation, Maven does not change the order (so those remain in the order of declaration in the POM). The build graph (and dependency order) that Maven core dictates is therefore the same each time. However, in multithreaded execution the builder can choose to iterate the build graph in different ways, which is what leads to a different build order when modules finish later or earlier than in other runs.
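The idea of a dependency order that is identical on every run can be illustrated with a depth-first sort that visits modules in declaration order. This is a minimal sketch of the concept and not Maven's actual implementation; the class `DeterministicModuleOrder` and the module names are invented for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustration only: a deterministic, declaration-order-driven topological sort.
final class DeterministicModuleOrder {

    /** Keys are modules in declaration order; values are their explicit dependencies. */
    static List<String> sort(Map<String, List<String>> declared) {
        Set<String> done = new LinkedHashSet<>();       // insertion order is the build order
        Set<String> inProgress = new LinkedHashSet<>(); // used for cycle detection
        for (String module : declared.keySet()) {
            visit(module, declared, done, inProgress);
        }
        return new ArrayList<>(done);
    }

    private static void visit(String module, Map<String, List<String>> declared,
                              Set<String> done, Set<String> inProgress) {
        if (done.contains(module)) {
            return;
        }
        if (!inProgress.add(module)) {
            throw new IllegalStateException("dependency cycle at " + module);
        }
        // Dependencies are emitted before the module that needs them.
        for (String dependency : declared.getOrDefault(module, List.of())) {
            visit(dependency, declared, done, inProgress);
        }
        inProgress.remove(module);
        done.add(module);
    }

    public static void main(String[] args) {
        Map<String, List<String>> modules = new LinkedHashMap<>();
        modules.put("app", List.of("core"));
        modules.put("docs", List.of());
        modules.put("core", List.of());
        modules.put("tests", List.of("app"));
        System.out.println(sort(modules)); // prints [core, app, docs, tests]
    }
}
```

The same input always yields the same list. Variation between runs then has to come from how a parallel builder walks that fixed graph, not from the sort itself.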

My experience with OOMs in non-Eclipse Maven builds is that they happen especially when you have forked JVM processes and your modules are of similar size, causing parallel builds to trigger identical goals at roughly the same time. E.g. I have another Maven build with 8 threads in parallel, and the worst memory usage happens when all modules run Spotbugs or Asciidoc generation (both with a forked JVM) at the same time. I'm not aware of anything that can limit the number of parallel executions of a single Maven plugin (e.g. use 8 threads, but at most 4 asciidoc calls in parallel). Could this be similar? Any good ideas on how to have a "semaphore" on a specific Maven goal?
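To show what such a "semaphore" would mean, here is the concept in plain Java: 8 worker threads, but at most 4 of them inside the memory-heavy step at once. This is not a Maven or Tycho feature and does not answer how to configure it in a build. The class `GoalSemaphoreDemo`, the sleep that stands in for the heavy goal, and the numbers are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Concept demo only; plain java.util.concurrent, no Maven API involved.
final class GoalSemaphoreDemo {

    /** Runs one task per module and returns the peak number of concurrent heavy steps. */
    static int run(int threads, int modules, int maxHeavy) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Semaphore heavyGoal = new Semaphore(maxHeavy); // permits for the heavy step
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < modules; i++) {
                futures.add(pool.submit(() -> {
                    heavyGoal.acquire(); // blocks while maxHeavy steps are already running
                    try {
                        peak.accumulateAndGet(running.incrementAndGet(), Math::max);
                        Thread.sleep(20); // stand-in for a forked, memory-hungry goal
                    } finally {
                        running.decrementAndGet();
                        heavyGoal.release();
                    }
                    return null;
                }));
            }
            for (Future<?> future : futures) {
                future.get(); // wait for completion and surface failures
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrent heavy steps: " + run(8, 16, 4));
    }
}
```

The thread pool still uses all 8 workers for everything else; only the guarded section is throttled, which caps peak memory without serializing the whole build.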

iloveeclipse added a commit that referenced this pull request Oct 25, 2023
This reverts commit b0b7e79.

It is possible that Jenkins test execution is affected by that change
(we see some strange OOM errors), see comments on
#1474.

To make sure the change doesn't cause any side effects, let's revert it.
@iloveeclipse
Member Author

Re-opening this PR, as we see OOM's also without this change.

I plan to push it once https://gitlab.eclipse.org/eclipsefdn/helpdesk/-/issues/3896 is fixed, to not mix possible regressions coming from this PR with "other" OOM issues we see right now.

@iloveeclipse
Member Author

Re-opening this PR, as we see OOM's also without this change.

I plan to push it once https://gitlab.eclipse.org/eclipsefdn/helpdesk/-/issues/3896 is fixed, to not mix possible regressions coming from this PR with "other" OOM issues we see right now

Sorry, GitHub mixed everything up: wrong PR, wrong diff, wrong comment.

4 participants