Some Background
We search an IMAP server for specific emails by subject. In one of our tests we noticed that searching for subjects containing umlauts (ü or ä) is not as performant as it should be: it takes 30 minutes to find 1 email in an inbox with more than 6000 emails.
We investigated and found that in this case the IMAP server throws an exception, and the library falls back to the default implementation, which loads all emails.
Details
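We debugged the code and found the root cause of the error. The search method in the IMAPProtocol class (https://github.com/eclipse-ee4j/angus-mail/blob/master/providers/imap/src/main/java/org/eclipse/angus/mail/imap/protocol/IMAPProtocol.java#L2494) has the following code: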
    // Check if the search "text" terms contain only ASCII chars,
    // or if utf8 support has been enabled (in which case CHARSET
    // is not allowed; see RFC 6855, section 3, last paragraph)
    if (supportsUtf8() || SearchSequence.isAscii(term)) {
        try {
            return issueSearch(msgSequence, term, null);
        } catch (IOException ioex) { /* will not happen */ }
    }
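Our IMAP server supports UTF-8, and the code correctly calls issueSearch with no charset. So far so good. The problem occurs in issueSearch itself, on line 2552 (https://github.com/eclipse-ee4j/angus-mail/blob/master/providers/imap/src/main/java/org/eclipse/angus/mail/imap/protocol/IMAPProtocol.java#L2552). Here all SearchTerms are converted to an Argument: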
    // Generate a search-sequence with the given charset
    Argument args = getSearchSequence().generateSequence(term,
            charset == null ? null :
                    MimeUtility.javaCharset(charset)
    );
In our case the charset is null, so the subject from the SearchTerm is converted as follows:
in the end, ASCIIUtility.getBytes(s) is called, which uses the default OS charset (not UTF-8 on Windows). At that point every umlaut has the wrong representation in the byte array that is sent to the IMAP server.
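To illustrate why the byte representation matters, here is a minimal sketch using only the Java standard library. It assumes ISO-8859-1 as a stand-in for a non-UTF-8 single-byte encoding; the class name `UmlautBytesDemo` and the `hex` helper are hypothetical and not part of Angus Mail.

```java
import java.nio.charset.StandardCharsets;

public class UmlautBytesDemo {

    // Render a byte array as space-separated uppercase hex pairs.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so bytes above 0x7F are not printed as negative values.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "ü", written as an escape so the source file encoding does not matter.
        String subject = "\u00FC";

        // A server in UTF-8 mode expects the two-byte sequence.
        System.out.println("UTF-8:      " + hex(subject.getBytes(StandardCharsets.UTF_8)));
        // A single-byte charset (assumed here: ISO-8859-1) produces a different byte.
        System.out.println("ISO-8859-1: " + hex(subject.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

The two lines print `C3 BC` and `FC` respectively, so a server that expects UTF-8 cannot interpret the single-byte form as the same character.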
We strongly believe there should be a way to specify the encoding used to convert SearchTerms independently. Or maybe you can find a more elegant solution.
Thanks in advance.
    // Generate a search-sequence with the given charset
    Argument args = getSearchSequence().generateSequence(term,
            charset == null ? null :
                    MimeUtility.javaCharset(charset)
    );
When the charset is null, instead of unconditionally passing null, it should pass "UTF-8" when supportsUtf8() is true, and null otherwise.
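The selection rule suggested above could be sketched roughly as follows. This is a standalone illustration, not the actual patch: `SearchCharsetSketch` and `resolveSearchCharset` are hypothetical names, the boolean parameter stands in for the result of supportsUtf8(), and the mapping through MimeUtility.javaCharset is left out so the example runs without the mail library.

```java
public class SearchCharsetSketch {

    // Hypothetical helper: pick the charset used to encode search terms.
    // utf8Supported stands in for the result of supportsUtf8().
    public static String resolveSearchCharset(String charset, boolean utf8Supported) {
        if (charset != null) {
            // An explicit charset wins; the real code would additionally map it
            // to a Java charset name (omitted in this sketch).
            return charset;
        }
        // No explicit charset: encode as UTF-8 if the server is in UTF-8 mode,
        // otherwise keep the existing null behavior.
        return utf8Supported ? "UTF-8" : null;
    }

    public static void main(String[] args) {
        System.out.println(resolveSearchCharset(null, true));
        System.out.println(resolveSearchCharset(null, false));
        System.out.println(resolveSearchCharset("ISO-8859-1", true));
    }
}
```

With this rule, only the case "no charset given and UTF-8 enabled" changes; the other two cases behave as before.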