diff --git a/docs/index.html b/docs/index.html
index 40e386b..da2de58 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -88,7 +88,7 @@

Goodhart’s law

Goodhart’s law states that when a measure becomes a target, it ceases to be a good measure. This principle highlights a fundamental risk in setting objectives, particularly in AI-driven endeavors. By making a specific measure the objective function of an AI agent’s actions, we inadvertently shift its goals. The agent, in striving to maximize that objective function, may exploit loopholes or take shortcuts that align with the letter of the goal but deviate from its intended spirit. This often leads to unintended consequences and side effects, where the pursuit of a narrowly defined objective overshadows broader, not explicitly specified considerations, such as ethical implications, societal impact, or long-term sustainability.

- In the context of our research, Goodhart’s law underscores the importance of designing AI agents whose goals are the full or partial maximization of some objective function. Instead, by embracing aspiration-based designs, we aim to create systems that are inherently more aligned with holistic and adaptable criteria.
+ In the context of our research, Goodhart’s law underscores the importance of designing AI agents whose goals are not the full or partial maximization of some objective function. Instead, by embracing aspiration-based designs, we aim to create systems that are inherently more aligned with holistic and adaptable criteria.
This approach seeks to mitigate the risks associated with Goodhart’s law by ensuring that the metrics used to guide AI behavior are not fixed targets but rather flexible aspirations that encourage the agent to consider a wider range of outcomes and behaviors. Thus, our project recognizes and addresses the challenge posed by Goodhart’s law, advocating for a more nuanced and safety-conscious strategy in the development of AI systems.
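The contrast between maximization and flexible aspirations can be sketched in a toy example. The snippet below is purely illustrative and is not the project's actual algorithm: it compares a maximizer, which always picks the action scoring highest on a proxy metric (and so pushes that metric to its extreme, Goodhart-style), with an aspiration-based agent that accepts any action whose value lands inside a target interval. All names (`maximizer`, `aspiration_based`, the toy `value` metric, the interval bounds) are hypothetical.

```python
import random

def maximizer(actions, value):
    # Goodhart-prone: optimizes the proxy metric as hard as possible.
    return max(actions, key=value)

def aspiration_based(actions, value, low, high):
    # Satisfice instead: any action whose value lands in [low, high] is
    # acceptable; choose among the acceptable ones (here: uniformly at
    # random) rather than pushing the metric to its maximum.
    acceptable = [a for a in actions if low <= value(a) <= high]
    if acceptable:
        return random.choice(acceptable)
    # Fall back to the best available action if nothing meets the aspiration.
    return maximizer(actions, value)

actions = [0, 1, 2, 3, 4]
value = lambda a: a * 10  # hypothetical proxy metric

print(maximizer(actions, value))                 # 4 — the extreme of the proxy
print(aspiration_based(actions, value, 15, 25))  # 2 — "good enough", not maximal
```

The point of the sketch is that the aspiration-based agent has no incentive to drive the metric beyond the aspiration interval, which removes the pressure to exploit loopholes that only pay off at the metric's extremes.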

Challenges