responsible ai chapter draft #91
Conversation
- updated references.bib
- added keys to responsible AI chapter
Maintaining individuals' privacy is an ethical obligation and legal requirement for organizations deploying AI systems. Regulations like the EU's GDPR mandate data privacy protections and grant rights such as the ability to access and delete one's data.
However, maximizing the utility and accuracy of data for training models can conflict with preserving privacy: modeling disease progression, for example, could benefit from access to patients' full genomes, but sharing such data widely violates their privacy.
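To make the tension concrete, here is a minimal, illustrative sketch (not text from the chapter draft) of the Laplace mechanism from differential privacy: the privacy parameter epsilon directly trades the accuracy of a released statistic against the protection afforded to any single record.

```python
# Illustrative sketch of the privacy-utility trade-off using the Laplace
# mechanism from differential privacy. Smaller epsilon = stronger privacy,
# but noisier (less useful) statistics. Data is a synthetic stand-in.
import numpy as np

rng = np.random.default_rng(0)
patient_ages = rng.integers(20, 90, size=1_000)  # stand-in for sensitive records

def dp_mean(values, epsilon, lower=20, upper=90):
    """Differentially private mean via the Laplace mechanism."""
    clipped = np.clip(values, lower, upper)
    sensitivity = (upper - lower) / len(clipped)  # max influence of one record
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return clipped.mean() + noise

print("true mean:", patient_ages.mean())
for eps in [0.01, 0.1, 1.0]:
    print(f"epsilon={eps}: dp mean = {dp_mean(patient_ages, eps):.2f}")
```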
I really like the way the definitions are written; they're clear and easy to follow! It might be helpful to also include some context on why these particular pillars are considered important (even if it seems a little obvious), perhaps through specific examples of failures or successes?
When AI systems eventually fail or produce harmful outcomes, there must be mechanisms to address the resulting issues, compensate affected parties, and assign responsibility. Both corporate accountability policies and government regulations are indispensable for responsible AI governance. For instance, [Illinois' Artificial Intelligence Video Interview Act](https://www.ilga.gov/legislation/ilcs/ilcs3.asp?ActID=4015&ChapterID=68) requires companies to disclose and obtain consent for AI video analysis, promoting accountability.
Without clear accountability, even harms caused unintentionally could go unresolved, fueling public outrage and distrust. Oversight boards, impact assessments, grievance redress processes, and independent audits promote responsible development and deployment.
Are there examples of this having been done successfully, i.e., finding a way to assign responsibility to either the model or the model creator? If yes, it would be helpful to list them here or to be more specific, since responsibility/ownership is one of the biggest discussions I've heard around GenAI in particular. If not, it would be interesting to touch on why this has been a difficult problem to solve.
Understanding the social, ethical, and historical background of a system is critical to prevent harm and should inform decisions throughout the model development lifecycle. After understanding the context, there is a wide array of technical decisions one can make to remove bias. First, one must decide which fairness metric is the most appropriate criterion to optimize for. Next, there are generally three main areas where one can intervene to debias an ML system.
First, pre-processing balances the dataset to ensure fair representation, for example by increasing the weight on certain underrepresented groups so that the model performs well on them. Second, in-processing modifies the training process of an ML system so that it prioritizes fairness. This can range from adding a fairness regularizer [@lowy2021fermi] to training an ensemble of models and sampling from them in a specific manner [@agarwal2018reductions].
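As a rough sketch of both steps, the example below (synthetic data; hyperparameters and the penalty itself are placeholders, not the FERMI regularizer from [@lowy2021fermi]) first computes a demographic-parity gap as the chosen fairness metric and then adds it to the training loss as an in-processing penalty.

```python
# Sketch of in-processing debiasing: a logistic model trained with a
# simplified demographic-parity penalty (gap in mean predicted scores
# between groups) added to the task loss. All data here is synthetic.
import torch

torch.manual_seed(0)
n, d = 512, 8
X = torch.randn(n, d)
group = (torch.rand(n) < 0.3).float()           # protected attribute
y = ((X[:, 0] + 0.5 * group + 0.1 * torch.randn(n)) > 0).float()

model = torch.nn.Linear(d, 1)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
lam = 1.0                                       # fairness/accuracy trade-off weight

for step in range(500):
    scores = torch.sigmoid(model(X).squeeze(1))
    task_loss = torch.nn.functional.binary_cross_entropy(scores, y)
    # demographic-parity style penalty: match mean scores across groups
    gap = scores[group == 1].mean() - scores[group == 0].mean()
    loss = task_loss + lam * gap.pow(2)
    opt.zero_grad()
    loss.backward()
    opt.step()

with torch.no_grad():
    final = torch.sigmoid(model(X).squeeze(1))
print("score gap after training:",
      (final[group == 1].mean() - final[group == 0].mean()).item())
```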
Could also be interesting to mention reading data cards / tie back to the chapter on building out datasets?
When ML models are personalized to individual users and then deployed to remote edge devices without connectivity, a challenge arises: how can models responsively "forget" data points after deployment? If a user requests that their personal data be removed from a personalized model, the lack of connectivity makes retraining infeasible. Thus, efficient on-device data forgetting is necessary but poses hurdles.
Initial unlearning approaches faced limitations in this context. Retraining models from scratch on the device to forget data points proves inefficient or even impossible, given the resource constraints. Fully retraining also requires retaining all the original training data on the device, which brings its own security and privacy risks. Common machine unlearning techniques [@bourtoule2021machine] thus fail to enable responsive, secure data removal on remote embedded ML systems.
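For intuition, the sketch below illustrates, in heavily simplified form, the sharding idea behind SISA-style unlearning [@bourtoule2021machine]: constituent models are trained on disjoint data shards, so forgetting a point only requires retraining the shard that contained it. The real method also checkpoints slices within shards and aggregates predictions more carefully; the dataset and shard count here are arbitrary.

```python
# Simplified sketch of shard-based ("SISA-style") unlearning: deleting a
# training point triggers retraining of only the shard that held it.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=600, n_features=10, random_state=0)
n_shards = 3
shards = list(np.array_split(np.arange(len(X)), n_shards))

def train_shard(idx):
    return LogisticRegression(max_iter=1000).fit(X[idx], y[idx])

models = [train_shard(idx) for idx in shards]

def predict(x):
    # majority vote over the per-shard constituent models
    votes = [int(m.predict(x.reshape(1, -1))[0]) for m in models]
    return max(set(votes), key=votes.count)

def forget(sample_id):
    # only the shard containing the sample is retrained
    for s, idx in enumerate(shards):
        if sample_id in idx:
            shards[s] = idx[idx != sample_id]
            models[s] = train_shard(shards[s])
            return s

retrained = forget(sample_id=42)
print("retrained shard:", retrained, "| example prediction:", predict(X[0]))
```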
This is cool, I've never thought to look into this! Are there models/companies that are building this in? What are some of the more concrete tradeoffs in terms of efficiency/model size when integrating this, and are there examples of it in use?
For instance, past work shows successful attacks that trick models for tasks like NSFW detection [@bhagoji2018practical], ad-blocking [@tramer2019adversarial], and speech recognition [@carlini2016hidden]. While errors in these domains already pose security risks, the problem extends beyond IT security: adversarial robustness has recently been proposed as an additional performance metric, estimated by approximating worst-case behavior.
The surprising model fragility highlighted above casts doubt on real-world reliability and opens the door to adversarial manipulation. This growing vulnerability underscores several needs. First, principled robustness evaluations are essential for quantifying model vulnerabilities before deployment. Approximating worst-case behavior surfaces blind spots.
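One widely used worst-case probe is the Fast Gradient Sign Method (FGSM); the sketch below uses a placeholder (untrained) model and a random input purely to show the shape of such an evaluation, not a result from the chapter.

```python
# Minimal FGSM sketch: a one-step worst-case perturbation within an
# L-infinity budget. In practice this is run against the trained model
# and accuracy under attack is reported alongside clean accuracy.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Flatten(),
                            torch.nn.Linear(28 * 28, 10))  # placeholder model
x = torch.rand(1, 1, 28, 28)           # placeholder "image" in [0, 1]
label = torch.tensor([3])
epsilon = 0.1                          # perturbation budget

x.requires_grad_(True)
loss = F.cross_entropy(model(x), label)
loss.backward()

# step in the direction that maximally increases the loss, then clip to valid range
x_adv = (x + epsilon * x.grad.sign()).clamp(0.0, 1.0).detach()

print("clean prediction:      ", model(x).argmax(dim=1).item())
print("adversarial prediction:", model(x_adv).argmax(dim=1).item())
```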
This section makes me wonder if putting this chapter at the beginning of the book makes any sense? It's a nice summary of why many of the concepts that are talked about in future chapters are important (data fairness/transparency, security & privacy, etc.). Each of those chapters provides a more in-depth look at metrics for assessing them, but maybe including some of those metrics here could be helpful (i.e., how a developer would go about looking at security or transparency when building their model).
By providing examples of changes to input features that would alter a prediction (counterfactuals) or indicating the most influential features for a given prediction (attribution), these post-hoc explanation techniques shed light on model behavior for individual inputs. This granular transparency helps users determine whether they can trust and act upon specific model outputs.
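As a minimal illustration of these two ideas (placeholder model and input, not an example from the chapter), the sketch below computes a gradient-times-input attribution and a crude counterfactual probe for a single prediction.

```python
# Sketch of simple post-hoc explanations for one input: which features most
# influence the output locally (attribution), and a rough counterfactual
# probe that nudges the input against the gradient.
import torch

torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(5, 8), torch.nn.ReLU(),
                            torch.nn.Linear(8, 1))
x = torch.tensor([[0.5, -1.2, 3.0, 0.0, 0.7]], requires_grad=True)

score = model(x).squeeze()              # scalar output for this single input
score.backward()

attribution = (x.grad * x).detach().squeeze(0)   # gradient-times-input
top = attribution.abs().argsort(descending=True)
print("most influential features (by |grad x input|):", top.tolist())

# counterfactual probe: would a small feature change move the score?
x_cf = (x - 0.5 * x.grad.sign()).detach()
print("original score:", score.item(), "| counterfactual score:", model(x_cf).item())
```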
**Concept-based explanations** aim to explain model behavior and outputs using a pre-defined set of semantic concepts (e.g., the model recognizes the scene class "bedroom" based on the presence of the concepts "bed" and "pillow"). Recent work shows that users often prefer these explanations to attribution- and example-based explanations because they "resemble human reasoning and explanations" [@ramaswamy2023ufo]. Popular concept-based explanation methods include TCAV [@kim2018interpretability], Network Dissection [@bau2017network], and interpretable basis decomposition [@zhou2018interpretable].
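A highly simplified sketch of the concept-activation-vector step behind TCAV [@kim2018interpretability] is shown below; the layer activations and gradients are synthetic stand-ins, so only the structure of the computation is meaningful.

```python
# Simplified CAV/TCAV sketch: fit a linear classifier separating activations
# of "concept" examples from random examples; its (normalized) weight vector
# is the concept activation vector. The final score counts how often class
# gradients point in the concept direction (stubbed with random gradients).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
act_concept = rng.normal(loc=1.0, size=(100, 64))   # activations, concept set
act_random  = rng.normal(loc=0.0, size=(100, 64))   # activations, random set

X = np.vstack([act_concept, act_random])
y = np.array([1] * 100 + [0] * 100)
clf = LogisticRegression(max_iter=1000).fit(X, y)
cav = clf.coef_[0] / np.linalg.norm(clf.coef_[0])    # unit concept direction

grads = rng.normal(size=(50, 64))                    # placeholder class gradients
tcav_score = float((grads @ cav > 0).mean())
print("TCAV score (fraction of inputs positively sensitive to the concept):", tcav_score)
```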
This section is super clear and the examples are helpful!
#### Inherent Interpretability
Inherently interpretable models are constructed such that their explanations are part of the model architecture and are thus naturally faithful, which sometimes makes them preferable to post-hoc explanations applied to black-box models, especially in high-stakes domains where transparency is imperative [@rudin2019stop]. Often, these models are constrained so that the relationships between input features and predictions are easy for humans to follow (linear models, decision trees, decision sets, k-NN models), or they obey structural knowledge of the domain, such as monotonicity [@gupta2016monotonic], causality, or additivity [@lou2013accurate; @beck1998beyond].
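For instance, a depth-limited decision tree is inherently interpretable because its entire decision logic can be printed and audited; the sketch below uses an arbitrary scikit-learn dataset purely for illustration.

```python
# Sketch of an inherently interpretable model: a shallow decision tree whose
# full rule set can be printed, so the explanation *is* the model.
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(data.data, data.target)

# every prediction can be traced to a short, human-readable rule path
print(export_text(tree, feature_names=list(data.feature_names)))
```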
A section on the tradeoffs of different factors could be interesting. While a black box is less explainable, I'd imagine it's more private/ secure?
Before submitting your Pull Request, please ensure that you have carefully reviewed and completed all items on this checklist.
- Content
- References & Citations
- Quarto Website Rendering
- Grammar & Style
- Collaboration
- Miscellaneous
- Final Steps