Sped up parsing by ~0–10%-ish #149

Open
wants to merge 1 commit into main
Conversation

toughengineer

Disclaimer:

This is a quick and dirty experiment; you will not find rigorous measurements, statistics and the like here, but the figures suggest that there is something one may want to investigate further.

tl;dr:

The first few push_backs into a vector without calling reserve() are relatively expensive; these changes offset that cost by temporarily storing the first few parsed JSON array elements in fixed-size arrays.

The gist

The type ArrayElements crudely models what some people call a "static vector", i.e. a vector-like container with capacity fixed at compile time and storage allocated inside the instance itself.

A proper "static vector" would make it possible to skip construction and destruction of "unused" elements, so I believe it would allow gaining a little more speed by accommodating more elements.

With the current code, 4 elements seems to be the sweet spot where the cost of allocation, construction/destruction of the elements in the array, and moving them balances out.
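
To make the idea concrete, here is a minimal sketch of the technique; it is not the actual code from this PR, and the names (small_buffer, move_into, the stand-in value type) are illustrative only. A fixed-capacity buffer stored inside the object collects the first few elements, and the std::vector is only touched, with a single reserve(), once the buffer overflows. The capacity of 4 follows the sweet spot mentioned above.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the parser's real value type (tao::json uses its own value class).
using value_t = std::string;

// Crude "static vector": capacity fixed at compile time, storage inside the
// instance itself. Note that all 4 elements are default-constructed up front,
// which is exactly the overhead a proper "static vector" would avoid.
struct small_buffer
{
   static constexpr std::size_t capacity = 4;  // the sweet spot mentioned above

   std::array< value_t, capacity > elements;
   std::size_t size = 0;

   // Returns false when the buffer is full; the argument is left untouched in that case.
   bool push_back( value_t&& v )
   {
      if( size == capacity ) {
         return false;
      }
      elements[ size++ ] = std::move( v );
      return true;
   }

   // Spill the buffered elements into the vector with a single reserve().
   void move_into( std::vector< value_t >& target )
   {
      target.reserve( target.size() + size + 1 );  // +1 for the element that overflowed
      for( std::size_t i = 0; i < size; ++i ) {
         target.push_back( std::move( elements[ i ] ) );
      }
      size = 0;
   }
};

// Usage sketch: append to the buffer first, fall back to the vector on overflow.
inline void append_element( small_buffer& buffer, std::vector< value_t >& out, value_t&& v )
{
   if( !buffer.push_back( std::move( v ) ) ) {
      buffer.move_into( out );
      out.push_back( std::move( v ) );
   }
}
```

At the end of an array any remaining buffered elements would of course still have to be moved into the final vector; that step is omitted here for brevity.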

Crude measurements

I crudely measured the performance difference against the current main branch.
I used the latest MSVC and clang-cl (the one that ships with Visual Studio) because that's what I had immediately available.
I built the tao-json-perf-parse_file.exe executable (in RelWithDebInfo mode, but that shouldn't make a difference) and ran it with JSON files from the tests directory.

The figures are the "per iteration" timings from the benchmark output, in milliseconds.

| File              | MSVC base | MSVC changed | diff | clang-cl base | clang-cl changed | diff |
|-------------------|-----------|--------------|------|---------------|------------------|------|
| canada.json       | 43        | 39           | -9%  | 35            | 31               | -11% |
| citm_catalog.json | 23        | 21           | -9%  | 21            | 18               | -14% |
| twitter.json      | 8.5       | 8            | -6%  | 8             | 7                | -13% |
| blns.json         | 0.2       | 0.2          | 0%   | 0.18          | 0.16             | -11% |

The measurements were quite consistent across a few runs.
The parsed JSON files were on a relatively fast SSD, so file reading speed should not have had a large influence.
I couldn't use the plain clang driver (without "-cl") because of warnings that were treated as errors (see the build log below).

Build log:
>------ Build started: Project: CMakeLists, Configuration: RelWithDebInfo ------
  [1/2] Building CXX object src/perf/json/CMakeFiles/tao-json-perf-parse_file.dir/parse_file.cpp.obj
  FAILED: src/perf/json/CMakeFiles/tao-json-perf-parse_file.dir/parse_file.cpp.obj 
  C:\PROGRA~1\MIB055~1\2022\COMMUN~1\VC\Tools\Llvm\x64\bin\CLANG_~1.EXE  -ID:/dev/taojson/include -ID:/dev/taojson/external/PEGTL/include --target=amd64-pc-windows-msvc -fdiagnostics-absolute-paths -O2 -DNDEBUG -g -Xclang -gcodeview -std=c++17 -D_DLL -D_MT -Xclang --dependent-lib=msvcrt -pedantic -Wall -Wextra -Wshadow -Werror -MD -MT src/perf/json/CMakeFiles/tao-json-perf-parse_file.dir/parse_file.cpp.obj -MF src\perf\json\CMakeFiles\tao-json-perf-parse_file.dir\parse_file.cpp.obj.d -o src/perf/json/CMakeFiles/tao-json-perf-parse_file.dir/parse_file.cpp.obj -c D:/dev/taojson/src/perf/json/parse_file.cpp
  In file included from D:/dev/taojson/src/perf/json/parse_file.cpp:4:
  In file included from D:/dev/taojson/include\tao/json.hpp:11:
  In file included from D:/dev/taojson/include\tao/json/from_file.hpp:10:
  In file included from D:/dev/taojson/include\tao/json/events/from_file.hpp:9:
  In file included from D:/dev/taojson/include\tao/json/events/../internal/action.hpp:16:
  In file included from D:/dev/taojson/include\tao/json/internal/number_state.hpp:13:
D:\dev\taojson\include\tao\json\external\double.hpp(89,9): error G748BFC68: extension used [-Werror,-Wlanguage-extension-token]
  typedef __int64 int64_t;
          ^
D:\dev\taojson\include\tao\json\external\double.hpp(90,18): error G748BFC68: extension used [-Werror,-Wlanguage-extension-token]
  typedef unsigned __int64 uint64_t;
                   ^
  2 errors generated.
  
  ninja: build stopped: subcommand failed.

Build failed.
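
For what it's worth, and purely as an assumption on my part rather than something tried here: the plain clang driver could presumably be made to work by not treating that particular diagnostic as an error, i.e. by disabling the warning group named in the log above, e.g.:

```
clang++ ... -Wno-language-extension-token ...
```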

To be at least a little bit conservative I rounded the base measurements down-ish and the changed measurements up-ish, so if anything the reported improvements are underestimated.

Conclusion

All in all, I would say this speedup idea is worth investigating further. It does not seem to hurt where it is not used, e.g. when the JSON does not contain (a lot of) arrays, and on the other hand even this crude implementation gives a noticeable speedup.

I do not plan to develop a fully fledged implementation of this idea in this library.
Please feel free to use this code as is or as a starting point.

ColinH self-assigned this on Dec 1, 2023