<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>Risks (and Benefits) of Generative AI and Large Language Models</title>
<link>https://llmrisks.github.io/</link>
<description>Recent content on Risks (and Benefits) of Generative AI and Large Language Models</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<managingEditor>[email protected] (David Evans)</managingEditor>
<webMaster>[email protected] (David Evans)</webMaster>
<lastBuildDate>Tue, 12 Dec 2023 00:00:00 +0000</lastBuildDate>
<atom:link href="https://llmrisks.github.io/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>Summary of Semester</title>
<link>https://llmrisks.github.io/summary/</link>
<pubDate>Tue, 12 Dec 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/summary/</guid>
<description>Here&rsquo;s a summary of the topics for the semester:
Week 1: Introduction
Attention, Transformers, and BERT
Training LLMs, Risks and Rewards
Week 2: Alignment
Introduction to AI Alignment and Failure Cases
Redteaming
Jail-breaking LLMs
Week 3: Prompting and Bias
Prompt Engineering
Marked Personas
Week 4: Capabilities of LLMs
LLM Capabilities
Medical Applications of LLMs
Week 5: Hallucination
Hallucination Risks
Potential Solutions
Week 6: Visit from Anton Korinek
Week 7: Generative Adversarial Networks and DeepFakes</description>
</item>
<item>
<title>Week 14b: Ethical AI</title>
<link>https://llmrisks.github.io/week14b/</link>
<pubDate>Mon, 04 Dec 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/week14b/</guid>
<description>Presenting Team: Aparna Kishore, Elena Long, Erzhen Hu, Jingping Wan
Blogging Team: Haolin Liu, Haochen Liu, Ji Hyun Kim, Stephanie Schoch, Xueren Ge
Note: since the topics were unrelated, Week 14 is split into two posts:
Monday, November 27: Multimodal Models
Wednesday, November 29: Ethical AI
Wednesday, November 29: Ethical AI
Ben Shneiderman. Bridging the Gap Between Ethics and Practice: Guidelines for Reliable, Safe, and Trustworthy Human-centered AI Systems. ACM Transactions on Interactive Intelligent Systems, October 2020.</description>
</item>
<item>
<title>Week 14a: Multimodal Models</title>
<link>https://llmrisks.github.io/week14a/</link>
<pubDate>Sun, 03 Dec 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/week14a/</guid>
<description>Presenting Team: Aparna Kishore, Elena Long, Erzhen Hu, Jingping Wan
Blogging Team: Haolin Liu, Haochen Liu, Ji Hyun Kim, Stephanie Schoch, Xueren Ge
Note: since the topics were unrelated, Week 14 is split into two posts:
Monday, November 27: Multimodal Models
Wednesday, November 29: Ethical AI
Monday, November 27: Multimodal Models
Today&rsquo;s topic is how to improve model performance by combining multiple modes.
We will first introduce the multimodal foundations and then center around CLIP, which is the most famous vision-language model.</description>
</item>
<item>
<title>Week 13: Regulating Dangerous Technologies</title>
<link>https://llmrisks.github.io/week13/</link>
<pubDate>Mon, 20 Nov 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/week13/</guid>
<description>The slides are here: Regulating Dangerous Technologies (I&rsquo;ve included some slides in the posted slides that I didn&rsquo;t present in class but you might find interesting, including some excerpts from a talk I gave in 2018 on Mutually Assured Destruction and the Impending AI Apocalypse.)
Since one of the groups made the analogy to tobacco products, I also will take the liberty of pointing to a talk I gave at Google making a similar analogy: The Dragon in the Room.</description>
</item>
<item>
<title>Week 12: LLM Agents</title>
<link>https://llmrisks.github.io/week12/</link>
<pubDate>Thu, 16 Nov 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/week12/</guid>
<description>Presenting Team: Liu Zhe, Peng Wang, Sikun Guo, Yinhan He, Zhepei Wei
Blogging Team: Anshuman Suri, Jacob Christopher, Kasra Lekan, Kaylee Liu, My Dinh
Monday, November 13: LLM Agents
LLM agents are the &ldquo;next big thing&rdquo;, with the potential to directly impact important fields like healthcare and education. Essentially, they are LLM-based systems that have the ability to use external tools, such as Internet browsing access and calculators, to augment their abilities.</description>
</item>
<item>
<title>Week 11: Watermarking on Generative Models</title>
<link>https://llmrisks.github.io/week11/</link>
<pubDate>Mon, 13 Nov 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/week11/</guid>
<description>Presenting Team: Tseganesh Beyene Kebede, Zihan Guan, Xindi Guo, Mengxuan Hu
Blogging Team: Ajwa Shahid, Caroline Gihlstorf, Changhong Yang, Hyeongjin Kim, Sarah Boyce
Monday, November 6: Watermarking LLM Outputs
Recent instances of AI-generated text passing for human text and the writing of students being misattributed to AI suggest the need for a tool to distinguish between human-written and AI-generated text. The presenters also noted that the increase in the amount of AI-generated text online is a risk for training future LLMs on this data.</description>
</item>
<item>
<title>Week 10: Data Selection for LLMs</title>
<link>https://llmrisks.github.io/week10/</link>
<pubDate>Fri, 03 Nov 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/week10/</guid>
<description>(see bottom for assigned readings and questions)
Presenting Team: Haolin Liu, Xueren Ge, Ji Hyun Kim, Stephanie Schoch
Blogging Team: Aparna Kishore, Elena Long, Erzhen Hu, Jingping Wan
Monday, 30 October: Data Selection for Fine-tuning LLMs
Question: Would more models help? We&rsquo;ve discussed many risks and issues of GenAI so far, and it can be difficult to come up with possible solutions to these problems.</description>
</item>
<item>
<title>Week 9: Interpretability</title>
<link>https://llmrisks.github.io/week9/</link>
<pubDate>Mon, 30 Oct 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/week9/</guid>
<description>(see bottom for assigned readings and questions)
Presenting Team: Anshuman Suri, Jacob Christopher, Kasra Lekan, Kaylee Liu, My Dinh
Blogging Team: Hamza Khalid, Liu Zhe, Peng Wang, Sikun Guo, Yinhan He, Zhepei Wei
Monday, 23 October: Interpretability: Overview, Limitations, &amp; Challenges
Definition of Interpretability
Interpretability in the context of artificial intelligence (AI) and machine learning refers to the extent to which a model&rsquo;s decisions, predictions, or internal workings can be understood and explained by humans.</description>
</item>
<item>
<title>Week 8: Machine Translation</title>
<link>https://llmrisks.github.io/week8/</link>
<pubDate>Sun, 22 Oct 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/week8/</guid>
<description>(see bottom for assigned readings and questions)
Machine Translation (Week 8)
Presenting Team: Ajwa Shahid, Caroline Gihlstorf, Changhong Yang, Hyeongjin Kim, Sarah Boyce
Blogging Team: Xindi Guo, Mengxuan Hu, Tseganesh Beyene Kebede, Zihan Guan
Monday, 16 Oct: Diving into the History of Machine Translation
Let&rsquo;s kick off this topic with an activity that involves translating an English sentence into a language of your choice and subsequently composing pseudocode to describe the process.</description>
</item>
<item>
<title>Week 7: GANs and DeepFakes</title>
<link>https://llmrisks.github.io/week7/</link>
<pubDate>Mon, 16 Oct 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/week7/</guid>
<description>(see bottom for assigned readings and questions)
Presenting Team: Aparna Kishore, Elena Long, Erzhen Hu, Jingping Wan
Blogging Team: Haochen Liu, Haolin Liu, Ji Hyun Kim, Stephanie Schoch, Xueren Ge
Monday, 9 October: Generative Adversarial Networks and DeepFakes
Today's topic is how to utilize generative adversarial networks to create fake images and how to identify the images generated by these models.
Generative Adversarial Network (GAN) is a revolutionary deep learning framework that pits two neural networks against each other in a creative showdown.</description>
</item>
<item>
<title>Week 5: Hallucination</title>
<link>https://llmrisks.github.io/week5/</link>
<pubDate>Wed, 04 Oct 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/week5/</guid>
<description>(see bottom for assigned readings and questions)
Hallucination (Week 5)
Presenting Team: Liu Zhe, Peng Wang, Sikun Guo, Yinhan He, Zhepei Wei
Blogging Team: Anshuman Suri, Jacob Christopher, Kasra Lekan, Kaylee Liu, My Dinh
Wednesday, September 27th: Intro to Hallucination
People Hallucinate Too
Hallucination Definition
There are three types of hallucinations according to the “Siren's Song in the AI Ocean” paper:
Input-conflict: This subcategory of hallucinations deviates from user input.
Context-conflict: Context-conflict hallucinations occur when a model generates contradicting information within a response.</description>
</item>
<item>
<title>Week 4: Capabilities of LLMs</title>
<link>https://llmrisks.github.io/week4/</link>
<pubDate>Mon, 25 Sep 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/week4/</guid>
<description>(see bottom for assigned readings and questions)
Capabilities of LLMs (Week 4)
Presenting Team: Xindi Guo, Mengxuan Hu, Tseganesh Beyene Kebede, Zihan Guan
Blogging Team: Ajwa Shahid, Caroline Gihlstorf, Changhong Yang, Hyeongjin Kim, Sarah Boyce
Monday, September 18
Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, Xia Hu. Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond. April 2023. https://arxiv.org/abs/2304.13712</description>
</item>
<item>
<title>Week 3: Prompting and Bias</title>
<link>https://llmrisks.github.io/week3/</link>
<pubDate>Mon, 18 Sep 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/week3/</guid>
<description>(see bottom for assigned readings and questions)
Prompt Engineering (Week 3)
Presenting Team: Haolin Liu, Xueren Ge, Ji Hyun Kim, Stephanie Schoch
Blogging Team: Aparna Kishore, Erzhen Hu, Elena Long, Jingping Wan
(Monday, 09/11/2023) Prompt Engineering
Warm-up questions
What is Prompt Engineering?
How is prompt-based learning different from traditional supervised learning?
In-context learning and different types of prompts
What is the difference between prompts and fine-tuning?
When is it best to use prompts vs. fine-tuning?</description>
</item>
<item>
<title>Week 2: Alignment</title>
<link>https://llmrisks.github.io/week2/</link>
<pubDate>Mon, 11 Sep 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/week2/</guid>
<description>(see bottom for assigned readings and questions)
Table of Contents
(Monday, 09/04/2023) Introduction to Alignment
Introduction to AI Alignment and Failure Cases
Discussion Questions
The Alignment Problem from a Deep Learning Perspective
Group of RL-based methods
Group of LLM-based methods
Group of Other ML methods
(Wednesday, 09/06/2023) Alignment Challenges and Solutions
Opening Discussion
Introduction to Red-Teaming
In-class Activity (5 groups)
How to use Red-Teaming?
Alignment Solutions
LLM Jailbreaking - Introduction
LLM Jailbreaking - Demo
Observations
Potential Improvement Ideas
Closing Remarks (by Prof.</description>
</item>
<item>
<title>Week 1: Introduction</title>
<link>https://llmrisks.github.io/week1/</link>
<pubDate>Sun, 03 Sep 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/week1/</guid>
<description>(see bottom for assigned readings and questions)
Attention, Transformers, and BERT Monday, 28 August
Transformers are a class of deep learning models that have revolutionized the field of natural language processing (NLP) and various other domains. The concept of transformers originated as an attempt to address the limitations of traditional recurrent neural networks (RNNs) in sequential data processing. Here&rsquo;s an overview of transformers&rsquo; evolution and significance.
Background and Origin
RNNs were one of the earliest models used for sequence-based tasks in machine learning.</description>
</item>
<item>
<title>Github Discussions</title>
<link>https://llmrisks.github.io/discussions/</link>
<pubDate>Fri, 25 Aug 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/discussions/</guid>
<description>Everyone should have received an invitation to the github discussions site, and be able to see the posts there and submit your own posts and comments. If you didn&rsquo;t get this invitation, it was probably blocked by the email system. Try visiting:
https://github.com/orgs/llmrisks/invitation
(while logged into the github account you listed on your form).
Once you&rsquo;ve accepted the invitation, you should be able to visit https://github.com/llmrisks/discussions/discussions/2 (the now-finalized discussion post for Week 1), and contribute to the discussions there.</description>
</item>
<item>
<title>Class 0: Getting Organized</title>
<link>https://llmrisks.github.io/class0/</link>
<pubDate>Wed, 23 Aug 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/class0/</guid>
<description>I&rsquo;ve updated the Schedule and Bi-Weekly Schedule based on the discussions today.
The plan is below:
Week | Lead Team | Blogging Team | Everyone Else
Two Weeks Before | Come up with idea for the week and planned readings, send to me by 5:29pm on Tuesday (2 weeks - 1 day before) | - | -
Week Before | Post plan and questions in github discussions by no later than 9am Wednesday; prepare for leading meetings | Prepare plan for blogging (how you will divide workload, collaborative tools for taking notes and writing) | Read/do materials and respond to preparation questions in github discussions (by 5:29pm Sunday)
Week of Leading Meetings | Lead interesting, engaging, and illuminating meetings!</description>
</item>
<item>
<title>Weekly Schedule</title>
<link>https://llmrisks.github.io/weeklyschedule/</link>
<pubDate>Wed, 23 Aug 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/weeklyschedule/</guid>
<description>This is the regular bi-weekly schedule:
Week | Lead Team | Blogging Team | Everyone Else
Two Weeks Before | Come up with idea for the week and planned readings, send to me by 5:29pm on Tuesday (2 weeks - 1 day before) | - | -
Week Before | Post plan and questions in github discussions by no later than 9am Wednesday; prepare for leading meetings | Prepare plan for blogging (how you will divide workload, collaborative tools for taking notes and writing) | Read/do materials and respond to preparation questions in github discussions (by 5:29pm Sunday)
Week of Leading Meetings | Lead interesting, engaging, and illuminating meetings!</description>
</item>
<item>
<title>Readings and Topics</title>
<link>https://llmrisks.github.io/readings/</link>
<pubDate>Mon, 21 Aug 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/readings/</guid>
<description>This page collects some potential topics and readings for the seminar.
Introduction (Week 1) Introduction to Large Language Models (from Stanford course)
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin. Attention Is All You Need. https://arxiv.org/abs/1706.03762. NeurIPS 2017.
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. ACL 2019.
(optional) Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever.</description>
</item>
<item>
<title>Schedule</title>
<link>https://llmrisks.github.io/schedule/</link>
<pubDate>Mon, 21 Aug 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/schedule/</guid>
<description>The schedule details will be filled in as the semester progresses (and future weeks are subject to change, but as much as is known is documented here).
See Weekly Schedule for the bi-weekly expectations for each team.
Week | Lead Team | Blog Team | Topic
0: 23 Aug | Dave | | Starting the Seminar
1: 28/30 Aug | 1 | 4 |
2: 4/6 Sep | 2 | 5 |
3: 11/13 Sep | 3 | 6 |
4: 18/20 Sep | 4 | 1 |
5: 25/27 Sep | 5 | 2 |
6: 4 Oct | TBD | | (2 Oct is Fall Classes Break)
7: 9/11 Oct | 6 | 3 |
8: 16/18 Oct | 1 | 4 |
9: 23/25 Oct | 2 | 5 |
10: 30 Oct/1 Nov | 3 | 6 |
11: 6/8 Nov | 4 | 1 |
14: 13/15 Nov | 5 | 2 |
15: 20 Nov | TBD | | (22 Nov is Thanksgiving Break)
16: 27/29 Nov | 6 | 3 |
17: 4 Dec | TBD | | (Last meeting is 4 December)
Leading Team Schedule
As the leading team, your job is to select a worthwhile topic, decide on a reading assignment (which can include things other than reading and is not limited to typical research papers) for the class, write questions that the class should write responses to in preparation for the discussion, and lead an interesting, engaging, and illuminating class!</description>
</item>
<item>
<title>Updates</title>
<link>https://llmrisks.github.io/updates/</link>
<pubDate>Mon, 21 Aug 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/updates/</guid>
<description>Some materials have been posted on the course site:
Syllabus
Schedule (you will find out which team you are on at the first class Wednesday)
Readings and Topics (a start on a list of some potential readings and topics that we might want to cover)
Dall-E Prompt: "comic style drawing of a phd seminar on AI"</description>
</item>
<item>
<title>Welcome Survey</title>
<link>https://llmrisks.github.io/survey/</link>
<pubDate>Thu, 17 Aug 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/survey/</guid>
<description>Please submit this welcome survey before 8:59pm on Monday, August 21:
https://forms.gle/dxhFmJH7WRs32s1ZA
Your answers won&rsquo;t be shared publicly, but I will use the responses to the survey to plan the seminar, including forming teams, and may share some aggregate and anonymized results and anonymized quotes from the surveys.</description>
</item>
<item>
<title>Welcome to the LLM Risks Seminar</title>
<link>https://llmrisks.github.io/welcome/</link>
<pubDate>Fri, 26 May 2023 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/welcome/</guid>
<description>Full Transcript
Seminar Plan The actual seminar won&rsquo;t be fully planned by GPT-4, but more information on it won&rsquo;t be available until later.
I&rsquo;m expecting a structure and format that combines aspects of this seminar on adversarial machine learning and this course on computing ethics, but with a topic focused on learning as much as we can about the potential for both good and harm from generative AI (including large language models) and things we can do (mostly technically, but including policy) to mitigate the harms.</description>
</item>
<item>
<title>Syllabus</title>
<link>https://llmrisks.github.io/syllabus/</link>
<pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/syllabus/</guid>
<description>Syllabus
cs6501: Risks and Benefits of Generative AI and LLMs
University of Virginia, Fall 2023
Meetings: Mondays and Wednesdays, 9:30-10:45am in Rice 340
Course Objective. This seminar will focus on understanding the potential risks and benefits of advances in Generative Artificial Intelligence and Large Language Models. This is a research-focused seminar that will expect students to read papers and lead discussions.
Expected Background: Students are not required to have prior background in machine learning or security, but will be expected to learn whatever background they need on these topics mostly on their own.</description>
</item>
<item>
<title>Blogging Mechanics</title>
<link>https://llmrisks.github.io/blogging/</link>
<pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><author>[email protected] (David Evans)</author>
<guid>https://llmrisks.github.io/blogging/</guid>
<description>Here are some suggestions for how to create the class blog posts for your assigned classes.
I believe each team has at least a few members with enough experience using git and web construction tools that following these instructions won&rsquo;t be a big burden, but if you have other ways you want to build your blog page for a topic let me know and we can discuss alternative options.
Install Hugo.</description>
</item>
</channel>
</rss>