CLRS 1.0.0
Main changes
- Extended the benchmark from 21 to 30 tasks by adding the following:
  - Activity selection (Gavril, 1972)
  - Longest common subsequence
  - Articulation points
  - Bridges
  - Kosaraju's strongly connected components algorithm (Aho et al., 1974)
  - Kruskal's minimum spanning tree algorithm (Kruskal, 1956)
  - Segment intersection
  - Graham scan convex hull algorithm (Graham, 1972)
  - Jarvis' march convex hull algorithm (Jarvis, 1973)
- Added new baseline processors:
  - Deep Sets (Zaheer et al., NIPS 2017) and Pointer Graph Networks (Veličković et al., NeurIPS 2020) as particularisations of the existing Message-Passing Neural Network processor.
  - End-to-End Memory Networks (Sukhbaatar et al., NIPS 2015)
  - Graph Attention Networks v2 (Brody et al., ICLR 2022)
Detailed changes
- Add PyPI installation instructions. by @copybara-service in #6
- Fix README typo. by @copybara-service in #7
- Expose Sampler base class in public API. by @copybara-service in #8
- Add dataset reader. by @copybara-service in #12
- Patch imbalanced samplers for DFS-based algorithms. by @copybara-service in #15
- Disk-based samplers for convex hull algorithms. by @copybara-service in #16
- Avoid dividing by zero in F_1 score computation. by @copybara-service in #18
- Sparsify the graphs generated for Kruskal. by @copybara-service in #20
- Option to add an LSTM after the processor. by @copybara-service in #19
- Include dataset class and creation using tensorflow_datasets format. by @copybara-service in #23
- Change types of DataPoint and DataPoint members. by @copybara-service in #22
- Remove unnecessary data loading procedures. by @copybara-service in #24
- Modify example to run with the tf.data.Datasets dataset. by @copybara-service in #25
- Expose processors in CLRS by @copybara-service in #21
- Update CLRS-21 to CLRS-30. by @copybara-service in #26
- Update README with new algorithms. by @copybara-service in #27
- Add dropout to example. by @copybara-service in #28
- Make example download dataset. by @copybara-service in #30
- Force full dataset pipeline to be on the CPU. by @copybara-service in #31
- Set default dropout to 0.0 for now. by @copybara-service in #32
- Added support for GATv2 and masked GATs. by @copybara-service in #33
- Pad memory in MemNets and disable embeddings. by @copybara-service in #34
- baselines.py refactoring (2/N) by @copybara-service in #36
- baselines.py refactoring (3/N). by @copybara-service in #38
- Update readme. by @copybara-service in #37
- Generate more samples in tasks where the number of signals is small. by @copybara-service in #40
- Fix MemNet embeddings by @copybara-service in #41
- Supporting multiple attention heads in GAT and GATv2. by @copybara-service in #42
- Use GATv2 + add option to use different number of heads. by @copybara-service in #43
- Fix GAT processors. by @copybara-service in #44
- Fix samplers_test by @copybara-service in #47
- Update requirements.txt by @copybara-service in #45
- Fix bug in hint loss for CATEGORICAL type: the number of unmasked datapoints (jnp.sum(unmasked_data)) was computed over the whole time sequence instead of the pertinent time slice. by @copybara-service in #53
- Use internal rng for batch selection. Makes batch sampling deterministic given seed. by @copybara-service in #49
- baselines.py refactoring (6/N) by @copybara-service in #52
- Time-chunked datasets. by @copybara-service in #48
- Fix potential bug in edge diff decoding. by @copybara-service in #54
- Losses for chunked data. by @copybara-service in #55
- Changes to hint losses, mostly for decode_diffs=True. Before, only one of the terms of the MASK type loss was masked by gt_diff. Also, the loss was averaged over all time steps, including steps without diffs, which contribute 0 to the loss. Now we average only over the non-zero-diff steps. by @copybara-service in #57
- Adapt baseline model to process multiple algorithms with a single processor. by @copybara-service in #59
- Explicitly denote a hint learning mode, to delimit the tasks of interest to CLRS. by @copybara-service in #60
- Give names to encoder and decoder params. This facilitates analysis, especially in multi-algorithm training. by @copybara-service in #63
- Symmetrise the weights of sampled weighted undirected Erdos-Renyi graphs. by @copybara-service in #62
- Fix dataset size for augmented validation + test sets. by @copybara-service in #65
- Fix bug when hint mode is 'none': the multi-algorithm version needs something in the list of diff decoders. by @copybara-service in #66
- Change requirements to a fixed tensorflow datasets nightly build. by @copybara-service in #68
- Patch KMP algorithm to incorporate the "reset" node. by @copybara-service in #69
- Allow for multiple-batch evaluation in example run script. by @copybara-service in #70
- Fix bug in SearchSampler: arrays should be sorted. by @copybara-service in #71
- Record separate hint eval scores for analysis. by @copybara-service in #72
- Symmetrised edges for PGN. by @copybara-service in #73
- Option for noise in teacher forcing by @copybara-service in #74
- Regularize PGN_MASK losses by predicting min_value-1 at missing edges instead of -10^5 by @copybara-service in #75
- Make encoded_decoded_nodiff default mode, and add flag to control teacher forcing noise. by @copybara-service in #76
- Detailed evaluation of hints in verbose mode. by @copybara-service in #79
- Pass processor factory instead of processor string when creating model. This makes it easier to add new processors as processor parameters don't need to be passed down to model and net. by @copybara-service in #81
- Update README. by @copybara-service in #82
- Use large negative number instead of 0 to discard non-connected edges for max aggregation in PGN processor. by @copybara-service in #83
- Add tensorflow requirement. by @copybara-service in #84
- Change deprecated tree_multimap to tree_map by @copybara-service in #85
- Increase version number for PyPI release. by @copybara-service in #87
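The zero-division guard from #18 can be sketched as follows; this is a minimal illustration, not the library's actual F_1 code, and the count-based interface is hypothetical:

```python
def f1_score(tp, fp, fn, eps=1e-8):
    """F_1 from true-positive/false-positive/false-negative counts.

    eps guards every denominator, so an all-zero input yields 0.0
    instead of raising a division-by-zero error or producing NaN.
    """
    precision = tp / max(tp + fp, eps)
    recall = tp / max(tp + fn, eps)
    return 2 * precision * recall / max(precision + recall, eps)
```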
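The hint-loss normaliser bug fixed in #53 amounts to summing a mask over the wrong axis extent. A minimal numpy sketch (shapes and names are illustrative, not the library's):

```python
import numpy as np

# Hypothetical (time, batch) mask of datapoints that count toward the loss.
unmasked = np.array([[1, 0, 1],
                     [1, 1, 0],
                     [0, 0, 0]], dtype=float)
t = 1  # time slice currently being scored

# Buggy normaliser: counts unmasked datapoints over the whole sequence.
wrong_count = unmasked.sum()
# Fixed normaliser: counts only the pertinent time slice.
right_count = unmasked[t].sum()
```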
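Symmetrising the weights of sampled undirected Erdos-Renyi graphs (#62) can be done by mirroring one triangle of the weight matrix, so each undirected edge carries a single consistent weight. A sketch under assumed shapes, not the benchmark's sampler code:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 0.5
# Sample a weighted Erdos-Renyi graph: Bernoulli(p) edges with uniform weights.
weights = (rng.random((n, n)) < p) * rng.random((n, n))
# Mirror the strict upper triangle onto the lower one (no self-loops),
# making the weight matrix symmetric.
sym = np.triu(weights, 1) + np.triu(weights, 1).T
```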
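The max-aggregation fix in #83 can be illustrated in a few lines: filling non-connected edges with 0 lets a spurious zero win the max whenever all real messages are negative, whereas a large negative fill value never beats a real message. A minimal numpy sketch, not the PGN processor's actual JAX code:

```python
import numpy as np

BIG_NEG = -1e9  # stand-in for "no edge", ignored by max-aggregation

msgs = np.array([[-1.0, -7.0, -3.0],
                 [ 2.0,  0.5,  5.0]])
adj = np.array([[1, 0, 1],
                [0, 0, 1]], dtype=bool)

# Buggy: a fill value of 0 dominates rows of all-negative messages.
buggy = np.where(adj, msgs, 0.0).max(axis=-1)
# Fixed: a large negative fill value is never selected over a real edge.
fixed = np.where(adj, msgs, BIG_NEG).max(axis=-1)
```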
Full Changelog: v0.0.2...v1.0.0