-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extremely slow for large vintage test suites #3050
Comments
Profiling shows that there are some significant performance issues that might be easy to address. This would significantly reduce the cpu time and memory usage.
Attached is the JFR recording which can be opened by Java Mission Control, junit.jfr.zip. This provides some helpful suggestions and gives a good view of the execution. I did the following which you can repeat for future investigations,
|
5.10.0-SNAPSHOT increases the execution time by a full two minutes in the sample project, from 7m 24s to 9m 27s. I had hoped to offer some patches to discuss possible optimizations and make this usable, but I'm a bit dejected by this large degradation and the lack of concern towards performance. If this is the trend then I am afraid a few spot fixes won't be enough for me to feel comfortable using this framework except in small test scenarios. |
Please don't interpret the lack of response on this issue so far as a lack of interest. I think it's fair to say that your use case is a bit outside the norm. But I'd be happy to work with you on this. We should also put some regression tests in place to avoid such degradations in the future. |
Thanks @marcphilipp. I did a basic optimization pass of UniqueIdFormat and that reduced a local snapshot to 5m 31s. I don't mean to imply this is better code, just ideas for consideration. This removes regex, decoding, allocations, wasted space, and map lookups. diff --git a/junit-platform-engine/src/main/java/org/junit/platform/engine/UniqueIdFormat.java b/junit-platform-engine/src/main/java/org/junit/platform/engine/UniqueIdFormat.java
index cf931f79af..9f3b2dd509 100644
--- a/junit-platform-engine/src/main/java/org/junit/platform/engine/UniqueIdFormat.java
+++ b/junit-platform-engine/src/main/java/org/junit/platform/engine/UniqueIdFormat.java
@@ -10,22 +10,16 @@
package org.junit.platform.engine;
-import static java.util.stream.Collectors.joining;
-import static java.util.stream.Collectors.toList;
-
import java.io.Serializable;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
-import java.util.HashMap;
-import java.util.List;
-import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.junit.platform.commons.JUnitException;
-import org.junit.platform.commons.util.Preconditions;
+import org.junit.platform.commons.PreconditionViolationException;
import org.junit.platform.engine.UniqueId.Segment;
/**
@@ -61,27 +55,21 @@ class UniqueIdFormat implements Serializable {
private final char closeSegment;
private final char segmentDelimiter;
private final char typeValueSeparator;
- private final Pattern segmentPattern;
- private final HashMap<Character, String> encodedCharacterMap = new HashMap<>();
+ private final char[] characters;
+ private final String[] encodedCharacters;
UniqueIdFormat(char openSegment, char typeValueSeparator, char closeSegment, char segmentDelimiter) {
this.openSegment = openSegment;
this.typeValueSeparator = typeValueSeparator;
this.closeSegment = closeSegment;
this.segmentDelimiter = segmentDelimiter;
- this.segmentPattern = Pattern.compile(
- String.format("%s(.+)%s(.+)%s", quote(openSegment), quote(typeValueSeparator), quote(closeSegment)),
- Pattern.DOTALL);
// Compute "forbidden" character encoding map.
- // Note that the map is always empty at this point. Thus the use of
- // computeIfAbsent() is purely syntactic sugar.
- encodedCharacterMap.computeIfAbsent('%', UniqueIdFormat::encode);
- encodedCharacterMap.computeIfAbsent('+', UniqueIdFormat::encode);
- encodedCharacterMap.computeIfAbsent(openSegment, UniqueIdFormat::encode);
- encodedCharacterMap.computeIfAbsent(typeValueSeparator, UniqueIdFormat::encode);
- encodedCharacterMap.computeIfAbsent(closeSegment, UniqueIdFormat::encode);
- encodedCharacterMap.computeIfAbsent(segmentDelimiter, UniqueIdFormat::encode);
+ characters = new char[] { '%', '+', openSegment, typeValueSeparator, closeSegment, segmentDelimiter };
+ encodedCharacters = new String[characters.length];
+ for (int i = 0; i < characters.length; i++) {
+ encodedCharacters[i] = UniqueIdFormat.encode(characters[i]);
+ }
}
/**
@@ -91,71 +79,117 @@ class UniqueIdFormat implements Serializable {
* @throws JUnitException if the string cannot be parsed
*/
UniqueId parse(String source) throws JUnitException {
- String[] parts = source.split(String.valueOf(this.segmentDelimiter));
- List<Segment> segments = Arrays.stream(parts).map(this::createSegment).collect(toList());
- return new UniqueId(this, segments);
+ int length = 1;
+ for (int i = 0; i < source.length(); i++) {
+ if (source.charAt(i) == segmentDelimiter) {
+ length++;
+ }
+ }
+
+ int index = 0;
+ int start = 0;
+ Segment[] segments = new Segment[length];
+ do {
+ int end = source.indexOf(segmentDelimiter, start);
+ if (end == -1) {
+ end = source.length();
+ }
+ if (start != end) {
+ segments[index] = createSegment(source.substring(start, end));
+ index++;
+ }
+ start = end + 1;
+ } while (start < source.length());
+ return new UniqueId(this, Arrays.asList(segments));
}
private Segment createSegment(String segmentString) throws JUnitException {
- Matcher segmentMatcher = this.segmentPattern.matcher(segmentString);
- if (!segmentMatcher.matches()) {
+ if ((segmentString.charAt(0) != openSegment)
+ && (segmentString.charAt(segmentString.length() - 1) != closeSegment)) {
throw new JUnitException(String.format("'%s' is not a well-formed UniqueId segment", segmentString));
}
- String type = decode(checkAllowed(segmentMatcher.group(1)));
- String value = decode(checkAllowed(segmentMatcher.group(2)));
+ int splitAt = segmentString.indexOf(typeValueSeparator);
+ String type = decode(checkAllowed(segmentString.substring(1, splitAt)));
+ String value = decode(checkAllowed(segmentString.substring(splitAt + 1, segmentString.length() - 1)));
return new Segment(type, value);
}
private String checkAllowed(String typeOrValue) {
- checkDoesNotContain(typeOrValue, this.segmentDelimiter);
- checkDoesNotContain(typeOrValue, this.typeValueSeparator);
- checkDoesNotContain(typeOrValue, this.openSegment);
- checkDoesNotContain(typeOrValue, this.closeSegment);
+ for (int i = 0; i < typeOrValue.length(); i++) {
+ char c = typeOrValue.charAt(i);
+ checkDoesNotContain(typeOrValue, c, this.segmentDelimiter);
+ checkDoesNotContain(typeOrValue, c, this.typeValueSeparator);
+ checkDoesNotContain(typeOrValue, c, this.openSegment);
+ checkDoesNotContain(typeOrValue, c, this.closeSegment);
+ }
return typeOrValue;
}
- private void checkDoesNotContain(String typeOrValue, char forbiddenCharacter) {
- Preconditions.condition(typeOrValue.indexOf(forbiddenCharacter) < 0,
- () -> String.format("type or value '%s' must not contain '%s'", typeOrValue, forbiddenCharacter));
+ private void checkDoesNotContain(String typeOrValue, char c, char forbiddenCharacter) {
+ if (c == forbiddenCharacter) {
+ throw new PreconditionViolationException(
+ String.format("type or value '%s' must not contain '%s'", typeOrValue, forbiddenCharacter));
+ }
}
/**
* Format and return the string representation of the supplied {@code UniqueId}.
*/
String format(UniqueId uniqueId) {
- // @formatter:off
- return uniqueId.getSegments().stream()
- .map(this::describe)
- .collect(joining(String.valueOf(this.segmentDelimiter)));
- // @formatter:on
+ int estimatedLength = 0;
+ for (Segment segment : uniqueId.getSegments()) {
+ estimatedLength += segment.getType().length() + 4 + segment.getValue().length();
+ }
+ StringBuilder builder = new StringBuilder(estimatedLength + /* slack */ 16);
+ for (int i = 0; i < uniqueId.getSegments().size(); i++) {
+ describe(builder, uniqueId.getSegments().get(i));
+ if (i != (uniqueId.getSegments().size() - 1)) {
+ builder.append(this.segmentDelimiter);
+ }
+ }
+ return builder.toString();
}
- private String describe(Segment segment) {
- String body = encode(segment.getType()) + typeValueSeparator + encode(segment.getValue());
- return openSegment + body + closeSegment;
+ private void describe(StringBuilder builder, Segment segment) {
+ builder.append(openSegment);
+ encode(builder, segment.getType());
+ builder.append(typeValueSeparator);
+ encode(builder, segment.getValue());
+ builder.append(closeSegment);
}
- private String encode(String s) {
- StringBuilder builder = new StringBuilder(s.length());
+ private void encode(StringBuilder builder, String s) {
for (int i = 0; i < s.length(); i++) {
+ int index = -1;
char c = s.charAt(i);
- String value = encodedCharacterMap.get(c);
- if (value == null) {
+ for (int j = 0; j < characters.length; j++) {
+ if (c == characters[j]) {
+ index = j;
+ break;
+ }
+ }
+
+ if (index == -1) {
builder.append(c);
- continue;
}
- builder.append(value);
+ else {
+ builder.append(encodedCharacters[index]);
+ }
}
- return builder.toString();
}
private static String decode(String s) {
- try {
- return URLDecoder.decode(s, StandardCharsets.UTF_8.name());
- }
- catch (UnsupportedEncodingException e) {
- throw new JUnitException("UTF-8 should be supported", e);
+ for (int i = 0; i < s.length(); i++) {
+ char c = s.charAt(i);
+ if ((c == '+') || (c == '%')) {
+ try {
+ return URLDecoder.decode(s, StandardCharsets.UTF_8.name());
+ }
+ catch (UnsupportedEncodingException e) {
+ throw new JUnitException("UTF-8 should be supported", e);
+ }
+ }
}
+ return s;
}
-
} I next tried using an lru cache when creating the segments as immutable, common prefixes (engine, class), and interning could reduce memory pressure. In a cold run it was a bit faster, though a profiled run somehow was a minute faster so I'm surprised (4m 28s). It may just be the profile slowed things down to let the GC not pause so much, as I don't think the cache was helpful enough to be used. The attached jfr snapshot shows that 1m7s is spent on old gc which is excessive, and the heap shows a steady trend towards exhaustion (2gb). It's easier to dig into the allocation hotspots by opening this up in JProfiler, which shows that the |
Thanks for the patch! While we can still consider doing these optimizations, I found it curious that I've prepared a small PR (#3073) that avoids that from JUnit's own listeners (e.g. I ran an experiment with using JUnit's Before
After
Please note that the "Test run finished" timings do not include the discovery phase. |
Thanks for the fix. I'll wait until the changes are in Gradle to retest. While I think this will help, I suspect that we'll have to make further changes here to get my build to run on JUnitPlatform. I haven't looked into the TestNG behavior but that appeared to not be correctly making test selections. |
Steps to reproduce
This Gradle project (junit-example.zip) demonstrates this performance regression using Guava's testlib to generate many test cases against
HashMap
. In total this runs 982,500 tests and that can be adjusted in theMapTests
iterator size.Context
I am trying to migrate Caffeine onto JUnitPlatform for running the suites. The external suites are JUnit 3 or 4, and the internal is TestNG. In total this amounts to ~10M tests due to brute force testing of different cache configurations.
The
caffeine:junitTest
suite runs 354,295 tests, which typically takes ~3 minutes. This fails after 17 minutes with an out-of-memory error. The TestNG scanner is also broken, but in a more severe manner. It tries to instantiate non-test classes unless those are explicitly excluded. It then appears to try to to run tests not in the specified group, perhaps just to ignore them, and exhausts memory without making any progress.For the smaller modules the integration works wonderfully and simplifies the build files. It is only for these larger test suites does it seem to struggle. That might be related to #2460 but it seems worse than simply being inefficient with memory.
The text was updated successfully, but these errors were encountered: