diff --git a/README.md b/README.md
index 8c8fab7..7ee27e7 100644
--- a/README.md
+++ b/README.md
@@ -26,6 +26,7 @@ For all the tasks, the user can specify the Clifford gate set and qubit connecti
 overview
+The implementation of reinforcement learning with a non-cumulative reward based on [2] is also possible by setting `use_max_reward = True` in the environments.
 ## Installation
@@ -135,4 +136,5 @@ The code in this repository is released under the MIT License.
 ## References
 [1] Chamberland, Christopher, and Michael E. Beverland. "Flag fault-tolerant error correction with arbitrary distance codes." Quantum 2 (2018): 53.
+[2] Nägele, M., Olle, J., Fösel, T., Zen, R., & Marquardt, F. (2024). Tackling Decision Processes with Non-Cumulative Objectives using Reinforcement Learning. arXiv:2405.13609.
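
Note on the added README line: the sketch below illustrates what a non-cumulative objective could look like next to the usual cumulative one. It assumes, from the flag name `use_max_reward` alone, that the non-cumulative objective is the maximum per-step reward over an episode. The helper `episode_objective` is hypothetical and is not taken from this repository; how the environments actually consume the flag is not shown in this diff.

```python
from typing import Sequence


def episode_objective(rewards: Sequence[float], use_max_reward: bool = False) -> float:
    """Score one episode from its per-step rewards.

    Hypothetical helper for illustration only; not part of the repository.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    if use_max_reward:
        # Assumed non-cumulative objective: only the best step counts.
        return max(rewards)
    # Standard cumulative objective: undiscounted sum of all rewards.
    return sum(rewards)


rewards = [1.0, 3.0, 2.0]
cumulative = episode_objective(rewards)                      # 6.0
best_step = episode_objective(rewards, use_max_reward=True)  # 3.0
```

Under the cumulative objective every step contributes to the score, while under the max objective an episode is judged only by its single best step, so the two can rank the same episodes differently.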