This corpus contains several datasets of behaviorally equivalent C/C++ programs, intended for evaluating semantic similarity between programs.
The datasets:
- 6 Type-4 clone scenarios extracted from BigCloneBench
- 10 programs for sorting, aggregation, and search algorithms
- 566 programs extracted from CodeForces that solve 5 different problems
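
To illustrate what "behaviorally equivalent" means here, the sketch below shows three syntactically different C++ functions that compute the same aggregation (a sum). It is not taken from any of the datasets: the function names `sum_loop`, `sum_recursive`, `sum_accumulate`, `agree_on`, and `self_check` are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Variant A (hypothetical): explicit index loop with an accumulator.
long long sum_loop(const std::vector<int>& values) {
    long long total = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        total += values[i];
    }
    return total;
}

// Variant B (hypothetical): recursion over the remaining suffix.
long long sum_recursive(const std::vector<int>& values, std::size_t from = 0) {
    if (from == values.size()) {
        return 0;  // empty suffix contributes nothing
    }
    return values[from] + sum_recursive(values, from + 1);
}

// Variant C (hypothetical): delegate to the standard library.
long long sum_accumulate(const std::vector<int>& values) {
    return std::accumulate(values.begin(), values.end(), 0LL);
}

// Returns true when all three variants produce the same result on `input`.
// Agreement on sample inputs is evidence of equivalence, not a proof.
bool agree_on(const std::vector<int>& input) {
    const long long expected = sum_loop(input);
    return sum_recursive(input) == expected &&
           sum_accumulate(input) == expected;
}

// Small self-check that can be called from main() or a test harness.
void self_check() {
    assert(agree_on({}));
    assert(agree_on({1, 2, 3}));
    assert(agree_on({-5, 5, 10}));
}
```

The three variants share almost no syntax, yet return the same value for every input, which is the kind of relationship a semantic similarity measure is expected to detect.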