Sped up parsing by ~0–10%-ish #149

Open
wants to merge 1 commit into main
Conversation

toughengineer

Disclaimer:

This is a quick and dirty experiment; you will not find rigorous measurements, statistics and the like here, but the figures suggest that there is something one may want to investigate further.

tl;dr:

The first few push_backs into a vector without calling reserve() are relatively expensive; these changes offset that cost by temporarily storing the first few parsed JSON array elements in fixed-size arrays.

The gist

The type ArrayElements crudely models what some people call a "static vector", i.e. a vector-like container with capacity fixed at compile time and storage allocated inside the instance itself.

A proper "static vector" would make it possible to skip construction and destruction of "unused" elements, so I believe it would allow gaining a little more speed by accommodating more elements.

With the current code, 4 elements seems to be the sweet spot where the cost of allocation, construction/destruction of the elements in the array, and moving them balances out.
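
To make the idea concrete, here is a minimal sketch of the technique; it is not the actual code from this PR, and the names (small_buffer, move_into, the stand-in value type) are illustrative only. A fixed-capacity buffer stored inside the object collects the first few elements, and the std::vector is only touched, with a single reserve(), once the buffer overflows. The capacity of 4 follows the sweet spot mentioned above.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the parser's real value type (tao::json uses its own value class).
using value_t = std::string;

// Crude "static vector": capacity fixed at compile time, storage inside the
// instance itself. Note that all 4 elements are default-constructed up front,
// which is exactly the overhead a proper "static vector" would avoid.
struct small_buffer
{
   static constexpr std::size_t capacity = 4;  // the sweet spot mentioned above

   std::array< value_t, capacity > elements;
   std::size_t size = 0;

   // Returns false when the buffer is full; the argument is left untouched in that case.
   bool push_back( value_t&& v )
   {
      if( size == capacity ) {
         return false;
      }
      elements[ size++ ] = std::move( v );
      return true;
   }

   // Spill the buffered elements into the vector with a single reserve().
   void move_into( std::vector< value_t >& target )
   {
      target.reserve( target.size() + size + 1 );  // +1 for the element that overflowed
      for( std::size_t i = 0; i < size; ++i ) {
         target.push_back( std::move( elements[ i ] ) );
      }
      size = 0;
   }
};

// Usage sketch: append to the buffer first, fall back to the vector on overflow.
inline void append_element( small_buffer& buffer, std::vector< value_t >& out, value_t&& v )
{
   if( !buffer.push_back( std::move( v ) ) ) {
      buffer.move_into( out );
      out.push_back( std::move( v ) );
   }
}
```

At the end of an array any remaining buffered elements would of course still have to be moved into the final vector; that step is omitted here for brevity.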

Crude measurements

I crudely measured the performance difference against the current main branch.
I used the latest MSVC and clang-cl (the one that ships with Visual Studio) because that's what I had immediately available.
I built the tao-json-perf-parse_file.exe executable (in RelWithDebInfo mode, but that shouldn't make a difference) and ran it with JSON files from the tests directory.

The figures are the "per iteration" timings from the benchmark output, in milliseconds.

| File              | MSVC base | MSVC changed | diff | clang-cl base | clang-cl changed | diff |
|-------------------|-----------|--------------|------|---------------|------------------|------|
| canada.json       | 43        | 39           | -9%  | 35            | 31               | -11% |
| citm_catalog.json | 23        | 21           | -9%  | 21            | 18               | -14% |
| twitter.json      | 8.5       | 8            | -6%  | 8             | 7                | -13% |
| blns.json         | 0.2       | 0.2          | 0%   | 0.18          | 0.16             | -11% |

The measurements were quite consistent across a few runs.
The parsed JSON files were on a relatively fast SSD, so file reading speed should not have had a large influence.
I couldn't use the plain clang driver (without "-cl") because of warnings that were treated as errors (see the build log below).

Build log:
>------ Build started: Project: CMakeLists, Configuration: RelWithDebInfo ------
  [1/2] Building CXX object src/perf/json/CMakeFiles/tao-json-perf-parse_file.dir/parse_file.cpp.obj
  FAILED: src/perf/json/CMakeFiles/tao-json-perf-parse_file.dir/parse_file.cpp.obj 
  C:\PROGRA~1\MIB055~1\2022\COMMUN~1\VC\Tools\Llvm\x64\bin\CLANG_~1.EXE  -ID:/dev/taojson/include -ID:/dev/taojson/external/PEGTL/include --target=amd64-pc-windows-msvc -fdiagnostics-absolute-paths -O2 -DNDEBUG -g -Xclang -gcodeview -std=c++17 -D_DLL -D_MT -Xclang --dependent-lib=msvcrt -pedantic -Wall -Wextra -Wshadow -Werror -MD -MT src/perf/json/CMakeFiles/tao-json-perf-parse_file.dir/parse_file.cpp.obj -MF src\perf\json\CMakeFiles\tao-json-perf-parse_file.dir\parse_file.cpp.obj.d -o src/perf/json/CMakeFiles/tao-json-perf-parse_file.dir/parse_file.cpp.obj -c D:/dev/taojson/src/perf/json/parse_file.cpp
  In file included from D:/dev/taojson/src/perf/json/parse_file.cpp:4:
  In file included from D:/dev/taojson/include\tao/json.hpp:11:
  In file included from D:/dev/taojson/include\tao/json/from_file.hpp:10:
  In file included from D:/dev/taojson/include\tao/json/events/from_file.hpp:9:
  In file included from D:/dev/taojson/include\tao/json/events/../internal/action.hpp:16:
  In file included from D:/dev/taojson/include\tao/json/internal/number_state.hpp:13:
D:\dev\taojson\include\tao\json\external\double.hpp(89,9): error G748BFC68: extension used [-Werror,-Wlanguage-extension-token]
  typedef __int64 int64_t;
          ^
D:\dev\taojson\include\tao\json\external\double.hpp(90,18): error G748BFC68: extension used [-Werror,-Wlanguage-extension-token]
  typedef unsigned __int64 uint64_t;
                   ^
  2 errors generated.
  
  ninja: build stopped: subcommand failed.

Build failed.
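
For what it's worth, and purely as an assumption on my part rather than something tried here: the plain clang driver could presumably be made to work by not treating that particular diagnostic as an error, i.e. by disabling the warning group named in the log above, e.g.:

```
clang++ ... -Wno-language-extension-token ...
```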

To be at least a little bit conservative I rounded the base measurements down-ish and the changed measurements up-ish, so if anything the reported improvements are underestimated.

Conclusion

All in all, I would say this speedup idea is worth investigating further. It does not seem to hurt where it is not used, e.g. when the JSON does not contain (a lot of) arrays, and on the other hand even this crude implementation gives a noticeable speedup.

I do not plan to develop a fully fledged implementation of this idea in this library.
Please feel free to use this code as is or as a starting point.

ColinH self-assigned this on Dec 1, 2023