-
Notifications
You must be signed in to change notification settings - Fork 460
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Text format (e.g. IridasCube) parsing optimizations #2074
Changes from all commits
769e887
74b465a
8f5be6a
8428381
3353ac4
3f65bb3
53fbca9
27f8ab7
3a3e7e5
8d90274
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -44,6 +44,13 @@ inline unsigned char Upper(unsigned char c) | |
} | ||
} | ||
|
||
// Checks if character is whitespace (space, tab, newline or other | ||
// non-printable char). Does not take locale into account. | ||
inline bool IsSpace(unsigned char c) | ||
{ | ||
return c <= ' '; | ||
} | ||
|
||
// Return the lower case string. | ||
inline std::string Lower(std::string str) | ||
{ | ||
|
@@ -95,6 +102,13 @@ inline bool StartsWith(const std::string & str, const std::string & prefix) | |
return str.size() >= prefix.size() && 0 == str.compare(0, prefix.size(), prefix); | ||
} | ||
|
||
// Return true if the string starts with the prefix character. | ||
// Note: The comparison is case sensitive. | ||
inline bool StartsWith(const std::string& str, char prefix) | ||
{ | ||
return !str.empty() && str.front() == prefix; | ||
} | ||
|
||
// Starting from the left, trim the character. | ||
inline std::string LeftTrim(std::string str, char c) | ||
{ | ||
|
@@ -106,7 +120,7 @@ inline std::string LeftTrim(std::string str, char c) | |
// Starting from the left, trim all the space characters i.e. space, tabulation, etc. | ||
inline std::string LeftTrim(std::string str) | ||
{ | ||
const auto it = std::find_if(str.begin(), str.end(), [](char ch) { return !std::isspace(static_cast<unsigned char>(ch)); }); | ||
const auto it = std::find_if(str.begin(), str.end(), [](char ch) { return !IsSpace(ch); }); | ||
str.erase(str.begin(), it); | ||
return str; | ||
} | ||
|
@@ -123,7 +137,7 @@ inline std::string RightTrim(std::string str, char c) | |
inline std::string RightTrim(std::string str) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry for delayed response. Are we sure that this "unsigned char" based IsSpace() function can operate reliably with UTF-8 encoded, potentially multi-byte characters? Especially the RightTrim() feels dangerous which is used in places like GetAbsoluteSearchPaths() etc. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm with a closer look, that doesn't seem to be using this overload. So the current use-case wise we're safe I guess. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes this would work correctly with UTF-8 too, as in it would detect ASCII spaces just fine. UTF-8 is a full superset of 7-bit ASCII; any "longer than one byte" character bytes in there all have the highest bit set. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yup, you are right. MSB can be 0 only for the first byte of the multi-byte character. |
||
{ | ||
const auto it = | ||
std::find_if(str.rbegin(), str.rend(), [](char ch) { return !std::isspace(static_cast<unsigned char>(ch)); }); | ||
std::find_if(str.rbegin(), str.rend(), [](char ch) { return !IsSpace(ch); }); | ||
str.erase(it.base(), str.end()); | ||
return str; | ||
} | ||
|
@@ -148,6 +162,12 @@ inline void Trim(StringVec & list) | |
} | ||
} | ||
|
||
// Checks if string is empty or white-space only. | ||
inline bool IsEmptyOrWhiteSpace(const std::string& str) | ||
{ | ||
return std::find_if(str.begin(), str.end(), [](char ch) { return !IsSpace(ch); }) == str.end(); | ||
} | ||
|
||
// Split a string content using an arbitrary separator. | ||
inline StringVec Split(const std::string & str, char separator) | ||
{ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor but to make sure:
Seems like __cpp_lib_to_chars "feature-test macros" are standardized in C++20, If I understand correctly -in theory- a C++17 compiler may not have this macro defined even if from_chars are implemented, thus we may have a false-negative here. Is that correct?
The worst case it'll fall back to slow path, so that's totally fine but I'm just curious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory, yes. In practice, this was the "least bad" way I found that would work on all of MSVC/clang/gcc, while also working on Apple Clang (in a sense that Apple Clang does not pass the test; from_chars on floats is still not there). In theory there could be some exotic compiler that might have from_chars for floats but not expose
__cpp_lib_to_chars
.