responsible ai chapter draft #91

skmur · 2023-12-05T15:35:11Z

Before submitting your Pull Request, please ensure that you have carefully reviewed and completed all items on this checklist.

Content
- The chapter content is complete and covers the topic in detail.
- All technical terms are well-defined and explained.
- Any code snippets or algorithms are well-documented and tested.
- The chapter follows a logical flow and structure.
References & Citations
- All references are correctly listed at the end of the chapter.
- In-text citations are used appropriately and match the references.
- All figures, tables, and images have proper sources and are cited correctly.
Quarto Website Rendering
- The chapter has been locally built and tested using Quarto.
- All images, figures, and tables render properly without any glitches.
- All images have a source or they are properly linked to external sites.
- Any interactive elements or widgets work as intended.
- The chapter's formatting is consistent with the rest of the book.
Grammar & Style
- The chapter has been proofread for grammar and spelling errors.
- The writing style is consistent with the rest of the book.
- Any jargon is clearly explained or avoided where possible.
Collaboration
- All group members have reviewed and approved the chapter.
- Any feedback from previous reviews or discussions has been addressed.
Miscellaneous
- All external links (if any) are working and lead to the intended destinations.
- If datasets or external resources are used, they are properly credited and linked.
- Any necessary permissions for reused content have been obtained.
Final Steps
- The chapter is pushed to the correct branch on the repository.
- The Pull Request is made with a clear title and description.
- The Pull Request includes any necessary labels or tags.
- The Pull Request mentions any stakeholders or reviewers who should take a look.

…o new_branch

- updated references.bib - added keys to responsible AI chapter

…anch

DivyaAmirtharaj · 2023-12-10T02:35:35Z

responsible_ai.qmd

+Maintaining individuals' privacy is an ethical obligation and legal requirement for organizations deploying AI systems. Regulations like the EU's GDPR mandate data privacy protections and rights like the ability to access and delete one's data.
+
+However, maximizing the utility and accuracy of data for training models can conflict with preserving privacy - modeling disease progression could benefit from access to patients' full genomes but sharing such data widely violates privacy.
+


I really like the way the definitions are written, they're clear and easy to follow! It might be helpful to also include some context on why these particular pillars are considered important (which seems a little obvious) through specific examples of failures, or specific successes?

DivyaAmirtharaj · 2023-12-10T02:38:16Z

responsible_ai.qmd

+
+When AI systems eventually fail or produce harmful outcomes, there must be mechanisms to address resultant issues, compensate affected parties, and assign responsibility. Both corporate accountability policies and government regulations are indispensable for responsible AI governance. For instance, [Illinois' Artificial Intelligence Video Interview Act](https://www.ilga.gov/legislation/ilcs/ilcs3.asp?ActID@15&ChapterIDh) requires companies to disclose and obtain consent for AI video analysis, promoting accountability.
+
+Without clear accountability, even harms caused unintentionally could go unresolved, furthering public outrage and distrust. Oversight boards, impact assessments, grievance redress processes, and independent audits promote responsible development and deployment.


Are there examples of this having been done successfully/ finding a way to assign responsibility to either the model or model creator? If yes it would be helpful to list here or to be more specific since responsibility/ ownership is one of the biggest discussions I've heard around particularly GenAI. If not, it would be interesting to touch on why this has been a difficult problem to solve.

DivyaAmirtharaj · 2023-12-10T02:41:40Z

responsible_ai.qmd

+
+Understanding the social, ethical and historical background of a system is critical to prevent harm and should inform decisions throughout the model development lifecycle. After understanding the context, there are a wide array of technical decisions one can make to remove bias. First, one must decide what fairness metric is the most appropriate criterion to optimize for. Next, there are generally three main areas where one can intervene to debias an ML system.
+
+First, preprocessing is when one balances a dataset to ensure fair representation, or even increases the weight on certain underrepresented groups to ensure the model performs well on them. Second, in processing attempts to modify the training process of an ML system to ensure it prioritizes fairness. This can be as simple as adding a fairness regularizer [@lowy2021fermi], to training an ensemble of models and sampling from them in a specific manner [@agarwal2018reductions]. 


Could also be interesting to mention reading datacards/ draw back to the chapter on building out datasets?

DivyaAmirtharaj · 2023-12-10T02:54:47Z

responsible_ai.qmd

+
+With ML devices personalized to individual users and then deployed to remote edges without connectivity, a challenge arises---how can models responsively "forget" data points after deployment? If a user requests their personal data be removed from a personalized model, the lack of connectivity makes retraining infeasible. Thus, efficient on-device data forgetting is necessary but poses hurdles.
+
+Initial unlearning approaches faced limitations in this context. Retraining models from scratch on the device to forget data points proves inefficient or even impossible, given the resource constraints. Fully retraining also requires retaining all the original training data on the device, which brings its own security and privacy risks. Common machine unlearning techniques [@bourtoule2021machine] for remote embedded ML systems fail to enable responsive, secure data removal.


This is cool, I've never thought to look into this! Are there models/ companies that are building this in? What are some of the more concrete tradeoffs in terms of efficiency/ model size when integrating this in and are there examples of it in use?

DivyaAmirtharaj · 2023-12-10T02:58:32Z

responsible_ai.qmd

+For instance, past work shows successful attacks that trick models for tasks like NSFW detection [@bhagoji2018practical], ad-blocking [@tramer2019adversarial], and speech recognition [@carlini2016hidden]. While errors in these domains already pose security risks, the problem extends beyond IT security: recently adversarial robustness has been proposed as an additional performance metric by approximating worst-case behavior.
+
+The surprising model fragility highlighted above casts doubt on real-world reliability and opens the door to adversarial manipulation. This growing vulnerability underscores several needs. First, principled robustness evaluations are essential for quantifying model vulnerabilities before deployment. Approximating worst-case behavior surfaces blindspots.
+


This section makes me wonder if putting this chapter at the beginning of the book makes any sense? It's a nice summary of why many of the concepts that are talked about in future chapters are important (data fairness/ transparency, security & privacy, etc.). Each of those chapters provides a more in depth look at metrics for assessing them, but maybe including some of those metrics here could be helpful (ie, how a developer would go about looking at security or transparency when building their model)

DivyaAmirtharaj · 2023-12-10T02:59:48Z

responsible_ai.qmd

+
+By providing examples of changes to input features that would alter a prediction (counterfactuals) or indicating the most influential features for a given prediction (attribution), these post hoc explanation techniques shed light on model behavior for individual inputs. This granular transparency helps users determine whether they can trust and act upon specific model outputs.  
+
+**Concept based explanations** aim to explain model behavior and outputs using a pre-defined set of semantic concepts (e.g. the model recognizes scene class "bedroom" based on the presence of concepts "bed" and "pillow"). Recent work shows that users often prefer these explanations to attribution and example based explanations because they "resemble human reasoning and explanations" [@ramaswamy2023ufo]. Popular concept based explanation methods include TCAV [@kim2018interpretability], Network Dissection [@bau2017network], and interpretable basis decomposition [@zhou2018interpretable]. 


This section is super clear and the examples are helpful!

DivyaAmirtharaj · 2023-12-10T03:09:16Z

responsible_ai.qmd

+
+#### Inherent Interpretability
+
+Inherently interpretable models are constructed such that their explanations are part of the model architecture and are thus naturally faithful, which sometimes makes them preferable to post-hoc explanations applied to black-box models, especially in high-stakes domains where transparency is imperative [@rudin2019stop]. Often, these models are constrained so that the relationships between input features and predictions are easy for humans to follow (linear models, decision trees, decision sets, k-NN models), or they obey structural knowledge of the domain, such as monotonicity [@gupta2016monotonic], causality, or additivity [@lou2013accurate; @beck1998beyond].


A section on the tradeoffs of different factors could be interesting. While a black box is less explainable, I'd imagine it's more private/ secure?

skmur and others added 2 commits December 5, 2023 10:23

responsible ai chapter draft

4692f94

Removed the new filename and moved content to original filename

1837e95

profvjreddi added the cs249r label Dec 5, 2023

euranofshin and others added 27 commits December 5, 2023 12:31

getting latest version before adding refs

4e8aa1d

Merge branch 'new_branch' of https://github.com/skmur/cs249r_book int…

63d190e

…o new_branch

added references

bb9141f

- updated references.bib - added keys to responsible AI chapter

Auto export to md cleanup

3d7ec01

Fix broken table

deabaaa

Fix references

32bd413

Header fixes

caef35d

Shorten intro material

a86afb4

Added learning objectives

79db4fa

MD formatting fixes

6669c6d

remove export image issues

c8ef6e1

Remove strikethrough content from google doc

b757a9a

draft of conclusion

a6582f5

fix to section header

2192838

Added a cover image

0ced42e

MD fixes

f37b906

Working on bias and privacy section

0fe31f1

minor

ff057ef

section header fix

13cc613

Updating interpretable models section

85e7ba0

Updating AI safety and Value alignment sections

037072a

Updated Autonomous systems and control section

17784bf

Updaed the economic section

77af545

Updated the scientific communication section

0cfae45

added figures to responsible_ai.

93587e5

Sorting the references

2c6323c

Merge branch 'new_branch' of github.com:skmur/cs249r_book into new_br…

4a10053

…anch

Fixing broken references

c237035

mpstewart1 self-assigned this Dec 8, 2023

profvjreddi merged commit cf9dce2 into harvard-edge:main Dec 8, 2023

DivyaAmirtharaj reviewed Dec 10, 2023

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

responsible ai chapter draft #91

responsible ai chapter draft #91

skmur commented Dec 5, 2023

DivyaAmirtharaj Dec 10, 2023

DivyaAmirtharaj Dec 10, 2023

DivyaAmirtharaj Dec 10, 2023

DivyaAmirtharaj Dec 10, 2023

DivyaAmirtharaj Dec 10, 2023

DivyaAmirtharaj Dec 10, 2023

DivyaAmirtharaj Dec 10, 2023

		Maintaining individuals' privacy is an ethical obligation and legal requirement for organizations deploying AI systems. Regulations like the EU's GDPR mandate data privacy protections and rights like the ability to access and delete one's data.

		However, maximizing the utility and accuracy of data for training models can conflict with preserving privacy - modeling disease progression could benefit from access to patients' full genomes but sharing such data widely violates privacy.


		When AI systems eventually fail or produce harmful outcomes, there must be mechanisms to address resultant issues, compensate affected parties, and assign responsibility. Both corporate accountability policies and government regulations are indispensable for responsible AI governance. For instance, [Illinois' Artificial Intelligence Video Interview Act](https://www.ilga.gov/legislation/ilcs/ilcs3.asp?ActID@15&ChapterIDh) requires companies to disclose and obtain consent for AI video analysis, promoting accountability.

		Without clear accountability, even harms caused unintentionally could go unresolved, furthering public outrage and distrust. Oversight boards, impact assessments, grievance redress processes, and independent audits promote responsible development and deployment.


		Understanding the social, ethical and historical background of a system is critical to prevent harm and should inform decisions throughout the model development lifecycle. After understanding the context, there are a wide array of technical decisions one can make to remove bias. First, one must decide what fairness metric is the most appropriate criterion to optimize for. Next, there are generally three main areas where one can intervene to debias an ML system.

		First, preprocessing is when one balances a dataset to ensure fair representation, or even increases the weight on certain underrepresented groups to ensure the model performs well on them. Second, in processing attempts to modify the training process of an ML system to ensure it prioritizes fairness. This can be as simple as adding a fairness regularizer [@lowy2021fermi], to training an ensemble of models and sampling from them in a specific manner [@agarwal2018reductions].


		With ML devices personalized to individual users and then deployed to remote edges without connectivity, a challenge arises---how can models responsively "forget" data points after deployment? If a user requests their personal data be removed from a personalized model, the lack of connectivity makes retraining infeasible. Thus, efficient on-device data forgetting is necessary but poses hurdles.

		Initial unlearning approaches faced limitations in this context. Retraining models from scratch on the device to forget data points proves inefficient or even impossible, given the resource constraints. Fully retraining also requires retaining all the original training data on the device, which brings its own security and privacy risks. Common machine unlearning techniques [@bourtoule2021machine] for remote embedded ML systems fail to enable responsive, secure data removal.

		For instance, past work shows successful attacks that trick models for tasks like NSFW detection [@bhagoji2018practical], ad-blocking [@tramer2019adversarial], and speech recognition [@carlini2016hidden]. While errors in these domains already pose security risks, the problem extends beyond IT security: recently adversarial robustness has been proposed as an additional performance metric by approximating worst-case behavior.

		The surprising model fragility highlighted above casts doubt on real-world reliability and opens the door to adversarial manipulation. This growing vulnerability underscores several needs. First, principled robustness evaluations are essential for quantifying model vulnerabilities before deployment. Approximating worst-case behavior surfaces blindspots.


		By providing examples of changes to input features that would alter a prediction (counterfactuals) or indicating the most influential features for a given prediction (attribution), these post hoc explanation techniques shed light on model behavior for individual inputs. This granular transparency helps users determine whether they can trust and act upon specific model outputs.

		Concept based explanations aim to explain model behavior and outputs using a pre-defined set of semantic concepts (e.g. the model recognizes scene class "bedroom" based on the presence of concepts "bed" and "pillow"). Recent work shows that users often prefer these explanations to attribution and example based explanations because they "resemble human reasoning and explanations" [@ramaswamy2023ufo]. Popular concept based explanation methods include TCAV [@kim2018interpretability], Network Dissection [@bau2017network], and interpretable basis decomposition [@zhou2018interpretable].


		#### Inherent Interpretability

		Inherently interpretable models are constructed such that their explanations are part of the model architecture and are thus naturally faithful, which sometimes makes them preferable to post-hoc explanations applied to black-box models, especially in high-stakes domains where transparency is imperative [@rudin2019stop]. Often, these models are constrained so that the relationships between input features and predictions are easy for humans to follow (linear models, decision trees, decision sets, k-NN models), or they obey structural knowledge of the domain, such as monotonicity [@gupta2016monotonic], causality, or additivity [@lou2013accurate; @beck1998beyond].

responsible ai chapter draft #91

responsible ai chapter draft #91

Conversation

skmur commented Dec 5, 2023

DivyaAmirtharaj Dec 10, 2023

Choose a reason for hiding this comment

DivyaAmirtharaj Dec 10, 2023

Choose a reason for hiding this comment

DivyaAmirtharaj Dec 10, 2023

Choose a reason for hiding this comment

DivyaAmirtharaj Dec 10, 2023

Choose a reason for hiding this comment

DivyaAmirtharaj Dec 10, 2023

Choose a reason for hiding this comment

DivyaAmirtharaj Dec 10, 2023

Choose a reason for hiding this comment

DivyaAmirtharaj Dec 10, 2023

Choose a reason for hiding this comment