Section 1: Introduction
Intelligent agents: an introduction to the perception-action cycle.
Terminology: (reinforced)
Fully Observable: immediately measurable information is sufficient to make optimal choice
Partially Observable: immediately measurable information is not sufficient to make the optimal choice, but adding history (memory of past observations) improves or enables it
Deterministic: actions uniquely determine outcome
Stochastic: actions cause probabilistic outcomes
Discrete: finitely many distinct states, actions, and percepts
Continuous: states, actions, or percepts range over a continuum
Benign: actions are neutral
Adversarial: actions cause counter reactions
Examples:
Checkers: fully observable, deterministic, discrete, and adversarial
Poker: partially observable, stochastic, discrete, and adversarial
Robotic Car: partially observable, stochastic, continuous, benign
AI deals with and manages uncertainty rationally.
Section 2: Problem Solving
Definition of a problem:
1) initial state
2) action (state) -> { a_1, a_2, ..., a_n } -- can be dependent on the state!
3) result (state, action) -> state_1
4) goalTest (state_1) -> boolean
5) pathCost (path) -> costOfPath where path is a sequence of states and actions
stepCost (state, action, state_1) -> costOfStep, where the path cost is the sum of the step costs
Complete even on infinite state spaces: breadth-first search and uniform-cost search (depth-first search is not)
A* search finds lowest cost path IF:
h(s) <= true cost which is to say
h(s) never overestimates cost which is to say
h(s) is optimistic which is to say
h(s) is admissible
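A minimal A* sketch in Python; the successors and h callables are hypothetical stand-ins for the action/result/stepCost functions defined above:

import heapq, itertools

def a_star(start, goal_test, successors, h):
    # A* search: expand the frontier in order of f = g + h. Finds the
    # lowest-cost path when h is admissible (never overestimates).
    # successors(state) is assumed to yield (action, next_state, step_cost).
    counter = itertools.count()  # tie-breaker so states are never compared
    frontier = [(h(start), 0, next(counter), start, [])]
    explored = set()
    while frontier:
        _, g, _, state, path = heapq.heappop(frontier)
        if goal_test(state):
            return path, g
        if state in explored:
            continue
        explored.add(state)
        for action, nxt, cost in successors(state):
            if nxt not in explored:
                heapq.heappush(frontier,
                               (g + cost + h(nxt), g + cost, next(counter),
                                nxt, path + [action]))
    return None, float("inf")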
Problem solving works when the domain is:
1) Fully observable
2) Known (know set of available actions)
3) Discrete (finite number of actions)
4) Deterministic (know result of taking an action)
5) Static (only our actions change the world)
Section 3: Probability
Product Rule:
P(A|B) = P(A, B) / P(B)
Total Probability:
P(A) = sum (P(A|B_k)*P(B_k))
where B_1, ..., B_K form a complete, mutually exclusive set of events; for instance, true and false in the case of B being boolean.
Bayes Rule:
           P(B|A) * P(A)
P(A|B) = ---------------
              P(B)
where B is the evidence
A is the outcome
P(A|B) is the diagnostic or posterior
P(B|A) is the causal or likelihood
P(A) is the prior
P(B) is the marginal likelihood and often the difficult part
We can defer computing P(B) by also computing P(~A|B):

            P(B|~A) * P(~A)
P(~A|B) = -----------------
                 P(B)

Working with the unnormalized values P'(A|B) = P(B|A) * P(A) and
P'(~A|B) = P(B|~A) * P(~A), the shared P(B) cancels into a normalizer:

                1
eta = -------------------
      P'(A|B) + P'(~A|B)

P(A|B) = eta * P(B|A) * P(A)
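A sketch of the deferred-normalizer computation in Python (the example numbers are illustrative, not from the notes):

def bayes_posterior(prior_A, p_B_given_A, p_B_given_notA):
    # Unnormalized values: the shared P(B) is never computed directly.
    up_A = p_B_given_A * prior_A              # P'(A|B)
    up_notA = p_B_given_notA * (1 - prior_A)  # P'(~A|B)
    eta = 1.0 / (up_A + up_notA)              # normalizer
    return eta * up_A                         # P(A|B)

# Illustrative numbers: 1% prior, 90% hit rate, 8% false-positive rate.
print(bayes_posterior(0.01, 0.90, 0.08))      # ~0.102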
Chain rule:
P(A_1,A_2,...,A_k) = P(A_k|A_k-1,...,A_1)...P(A_2|A_1)P(A_1)
Bayes networks: edges are influences in the direction of the arrow.
d-separation (reachability):
Convention: capital letters are unknown (unobserved) variables; lowercase letters are known (observed).
Active Triplets          Inactive Triplets (conditionally independent)
A -> B -> C              A -> b -> C
A <- B -> C              A <- b -> C
A -> b <- C              A -> B <- C
In the v-structure case (A -> b <- C), b also counts as known when one of its descendants is known.
Section 4:
-- Intentionally left blank
Section 5:
Maximum Likelihood: P(x) = count(x) / N
                          count(x) + k
Laplace Smoothing: P(x) = ------------
                           N + k|x|
    where |x| is the number of distinct classes
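A quick sketch of Laplace smoothing in Python, reading |x| as the number of distinct classes (my gloss; the notes do not expand |x|):

def laplace_smoothed(counts, k=1):
    # P(x) = (count(x) + k) / (N + k * number_of_classes)
    N = sum(counts.values())
    return {x: (c + k) / (N + k * len(counts)) for x, c in counts.items()}

print(laplace_smoothed({"spam": 3, "ham": 7}))  # {'spam': 0.333..., 'ham': 0.666...}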
Linear Regression: f(x) = w_1 * x + w_0
                                1
Logistic Regression: z = ---------------
                                 -f(x)
                            1 + e
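Both hypotheses in a few lines of Python (weights are illustrative):

import math

def f(x, w1, w0):
    # Linear regression hypothesis: f(x) = w1 * x + w0
    return w1 * x + w0

def z(x, w1, w0):
    # Logistic regression squashes f(x) into (0, 1)
    return 1.0 / (1.0 + math.exp(-f(x, w1, w0)))

print(z(0.0, 1.0, 0.0))   # 0.5: on the decision boundary where f(x) = 0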
Section 6:
k-means: Fit k cluster centers.
k-means generalizes to expectation maximization (EM) when the clusters are modeled as Gaussian distributions
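A plain k-means sketch (points are coordinate tuples; random initialization and a fixed iteration count are simplifications):

import random

def k_means(points, k, iters=20):
    centers = random.sample(points, k)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            clusters[i].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return centers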
Section 7:
First Order Logic (FOL)
Section 8:
Situation Calculus
1) actions (a) are represented by objects in FOL
2) situations (s) are objects in FOL and are paths not states
s' = Result (s,a)
possible predicate: Poss (a,s)
some precond(s) => Poss (a,s)
Conventions:
functions that change for the situation (fluents) always list the situation last
successor-state axioms used to describe situation following an action
Section 9:
Markov Decision Process (MDP)
states s_1, ..., s_N --> fully observable
actions a_1, ..., a_K
state transition matrix T(s, a, s') = P(s'|a,s)
reward function R(s)
The states and actions form a traditional FSM; the stochastic transition matrix is what turns it into an MDP. The reward acts mostly as a heuristic.
Policy => pi(s) -> a
Value function: a potential function with the goal location as the peak of a gradient across the state space. Hill climbing can then be used to choose the best path.
Value iteration: iterate on the value function until it converges to the optimal one.
Each iteration sums over the possible successor states s' of s given an action a.
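A value iteration sketch; T(s, a) returning (s', probability) pairs is a hypothetical interface for the transition matrix above:

def value_iteration(states, actions, T, R, gamma=0.9, eps=1e-6):
    # Bellman update: V(s) = R(s) + gamma * max_a sum_s' T(s,a,s') * V(s')
    # (terminal-state handling omitted for brevity)
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            v = R(s) + gamma * max(sum(p * V[s2] for s2, p in T(s, a))
                                   for a in actions)
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < eps:
            return V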
Section 10:
Supervised learning: given set of tuples, learn f such that y = f(x)
Unsupervised learning: given set of data, learn P such that P(X=x)
Reinforcement learning: given sequence of states and actions, learn the optimal policy given a state ==> pi(s)
Utility based agent: know P, learn R which gives U and then use U
Q-learning agent: do not know P, learn a Q(s,a) as a sub for R and then use Q
Reflex agent: do not know P or R, just choose a policy pi(s)
The reflex agent is a perfect example of handling a situation locally: it knows what to do simply because of the state it is in.
Passive reinforcement learning: the agent follows a fixed policy but learns about P and R while applying it.
Active reinforcement learning: changes the policy from what is learned.
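The notes only name Q-learning; the standard temporal-difference update it refers to looks like this (alpha and gamma are assumed hyperparameters):

def q_update(Q, s, a, r, s2, actions, alpha=0.1, gamma=0.9):
    # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    best_next = max(Q.get((s2, a2), 0.0) for a2 in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)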
Section 11:
Hidden Markov Model
- hidden because the state is observed only through an emitted measurement,
  not directly. (Yes, this implies the environment is partially observable.)
For instance, the robot that wanders the museum. It does not know its
location, but, rather, uses range finding sensors to measure objects around
it and then infers its location (state) from the range data and a map.
Stationary Distribution
P(A_t) = P(A_t|A_t-1) * P(A_t-1) + P(A_t|B_t-1) * P(B_t-1)
in the stationary distribution, P(A_t) = P(A_t-1) = x
P(B_t-1) = 1 - x
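Substituting P(A_t) = P(A_t-1) = x and P(B_t-1) = 1 - x gives x = P(A|A)*x + P(A|B)*(1 - x), which solves to x = P(A|B) / (1 - P(A|A) + P(A|B)). A one-line check with illustrative numbers:

def stationary_x(p_A_given_A, p_A_given_B):
    # Solve x = P(A|A)*x + P(A|B)*(1 - x) for the stationary x = P(A).
    return p_A_given_B / (1.0 - p_A_given_A + p_A_given_B)

print(stationary_x(0.8, 0.4))   # 0.4 / 0.6 = 2/3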
Estimation: P(x_1 | z_1) where x_1 is the state and z_1 is the measurement
Use Bayes Rule to restructure
P(x_1|z_1) = eta P(z_1|x_1) P(x_1) where eta is the normalizer as before
Prediction: P(x_2) by total probability: P(x_2) = sum_x1 P(x_2|x_1) * P(x_1)
Hidden Markov Model parameters: P(x_0), P(z|x), and P(x_n|x_n-1)
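A single predict-then-correct filtering step combining the three parameters (the dict-based interface is a hypothetical encoding, not from the notes):

def hmm_filter_step(belief, z, trans, emit):
    # belief: state -> P(x_t | z_1:t); trans[s][s2] = P(x_t+1=s2 | x_t=s);
    # emit[s][z] = P(z | x=s).
    states = list(belief)
    # Prediction: total probability over the previous state.
    predicted = {s2: sum(belief[s] * trans[s][s2] for s in states)
                 for s2 in states}
    # Measurement update: Bayes rule with normalizer eta.
    unnorm = {s: emit[s][z] * p for s, p in predicted.items()}
    eta = 1.0 / sum(unnorm.values())
    return {s: eta * p for s, p in unnorm.items()}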
Section 12: review
Section 13:
Time complexity is b**m where b is the branching factor and m is the depth
Space complexity is b*m (the depth-first traversal only stores the current path and its siblings)
Use alpha-beta pruning to reduce time complexity
Use depth limits to reduce space complexity
Use a graph to reduce both, but graphs require simplified conditions like end games or openings.
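An alpha-beta sketch; moves, result, and evaluate are hypothetical game-specific helpers:

def alphabeta(state, depth, alpha, beta, maximizing, moves, result, evaluate):
    ms = moves(state)
    if depth == 0 or not ms:
        return evaluate(state)
    if maximizing:
        value = float("-inf")
        for m in ms:
            value = max(value, alphabeta(result(state, m), depth - 1,
                                         alpha, beta, False,
                                         moves, result, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:   # prune: MIN will never allow this branch
                break
        return value
    value = float("inf")
    for m in ms:
        value = min(value, alphabeta(result(state, m), depth - 1,
                                     alpha, beta, True,
                                     moves, result, evaluate))
        beta = min(beta, value)
        if alpha >= beta:       # prune: MAX will never allow this branch
            break
    return value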
Section 14:
Dominant Strategy:
A strategy for which a player does better no matter what the other player does.
Pareto Optimal Outcome:
No other outcome exists that all players would prefer.
Equilibrium:
No player can benefit from switching to another strategy if all other players remain the same.
Section 15:
Earliest start time: ES
Latest start time: LS
ES(s) = 0 where s is the start
LS(f) = ES(f) where f is the finish
ES(B) = max_{over all A -> B} ES(A) + duration (A)
LS(A) = min_{over all B <- A} LS(B) - duration (A)
where A -> B means A precedes B, and B <- A means B follows A
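A sketch of the two passes; it assumes tasks are given in topological order and edges (A, B) mean A precedes B:

def schedule(tasks, edges):
    # tasks: name -> duration, in topological order; edges: (A, B) pairs.
    preds = {t: [a for a, b in edges if b == t] for t in tasks}
    succs = {t: [b for a, b in edges if a == t] for t in tasks}
    ES, LS = {}, {}
    for t in tasks:                       # forward pass: earliest starts
        ES[t] = max((ES[a] + tasks[a] for a in preds[t]), default=0)
    finish = max(ES[t] + tasks[t] for t in tasks)
    for t in reversed(list(tasks)):       # backward pass: latest starts
        LS[t] = min((LS[b] - tasks[t] for b in succs[t]),
                    default=finish - tasks[t])
    return ES, LS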
Hierarchical planning, also called refinement planning, is used to close the abstraction gap.
Section 16:
Pinhole camera:
x/f = X/Z   where x is the projection height
                  f is the focal length
                  X is the object height
                  Z is the distance from the pinhole
Also known as Perspective Projection
1/f = 1/Z + 1/z   where f is the focal length
                        Z is the extrinsic distance (to the object)
                        z is the intrinsic distance (lens to focal plane)
Computer vision
- classify objects
- 3D reconstruction
- motion analysis
Linear Filter: I x g -> I'   where I is the image
                                   x is the convolution operator
                                   g is the kernel
                                   I' is the resultant
translates to
I'(x,y) = sum_u,v I(x-u, y-v) * g(u,v)
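The sum written out directly in Python (border handling omitted for brevity; images and kernels are lists of lists):

def convolve2d(I, g):
    # I'(x, y) = sum_{u,v} I(x - u, y - v) * g(u, v)
    H, W, kh, kw = len(I), len(I[0]), len(g), len(g[0])
    out = [[0.0] * W for _ in range(H)]
    for x in range(kh - 1, H):
        for y in range(kw - 1, W):
            out[x][y] = sum(I[x - u][y - v] * g[u][v]
                            for u in range(kh) for v in range(kw))
    return out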
Edge detector (Gradient Image): E = sqrt (I_x**2 + I_y**2)
Canny Edge Detector: more sophisticated and better edge detector
Fancy kernels: Prewitt, Sobel, Kirsch, etc.
Reasons to blur an image (apply a Gaussian kernel):
- down sampling (blur first to reduce aliasing)
- pixel noise reduction
Harris Corner Detector: uses eigenvalue decomposition of the structure matrix
    [ sum (I_x**2)     sum (I_x * I_y) ]
    [ sum (I_x * I_y)  sum (I_y**2)    ]
    when both eigenvalues are large, we have a corner
Modern Feature Detectors:
- localizable
- unique signatures
HOG = Histogram of Oriented Gradients
SIFT = Scale Invariant Feature Transform
Section 17:
Stereoscopic vision (two pinhole cameras):
x_2 - x_1   B
--------- = -   where x_2 is the distance from pinhole 2 to its image point
    f       Z         x_1 is the distance from pinhole 1 to its image point
                      B is the baseline, the distance between the pinholes
                      f is the focal length
                      Z is the range, the distance to the point
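Rearranged for range: Z = f * B / (x_2 - x_1). A quick check with illustrative numbers:

def stereo_range(x1, x2, f, B):
    # (x2 - x1)/f = B/Z  =>  Z = f * B / (x2 - x1)
    return f * B / (x2 - x1)

print(stereo_range(0.0, 0.002, 0.05, 0.12))   # 3.0 m for a 2 mm disparity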
Dynamic Programming:
Use a cost model trading off mismatches against occlusions to correlate pixels across the two images.
Section 18:
Structure from Motion: "structure" means the 3D world;
                       "motion" means the location of the camera
m camera poses
n points in the scene
2mn constraints
6m + 3n unknowns (6 per camera pose, 3 per scene point)
6m + 3n - 7 <= 2mn for a well-posed problem
where the 7 represents the non-recoverable absolute position, orientation, and scale (the 7 degrees of freedom of a similarity transform)
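For example (numbers chosen for illustration): with m = 3 camera poses and n = 10 scene points there are 2mn = 60 constraints against 6m + 3n - 7 = 41 unknowns, so the problem is well posed.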
Sections 19 and 20 are purposely omitted.
Section 21:
Language Models (2 camps):
- sequence of letters or words
* probabilistic
* word-based
* learned
- logical
* tree
* category
* hand coded
Probability Models:
P(w_1, w_2, ..., w_n) = P(w_1:n)
Markov Assumption
P(w_i|w_1:i-1) = P(w_i|w_i-k:i-1)
where k = 1 gives a first-order Markov model
Stationary Assumption
P(w_i|w_i-1) = P(w_j|w_j-1)
Smoothing, as in Laplace smoothing (Section 5).
The best segmentation maximizes the joint probability:
S* = argmax P(w_1:n)
Approximate it with the Markov assumption.
Still leaves 2**(n-1) segmentations where there are n characters.
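A dynamic-programming sketch that avoids enumerating all 2**(n-1) splits by memoizing suffixes; Pword is a hypothetical unigram-probability callable:

from functools import lru_cache

def segment(text, Pword):
    @lru_cache(maxsize=None)
    def best(s):
        # Returns (probability, words) for the best split of suffix s.
        if not s:
            return 1.0, ()
        candidates = [(Pword(s[:i]) * best(s[i:])[0], i)
                      for i in range(1, len(s) + 1)]
        p, i = max(candidates)
        return p, (s[:i],) + best(s[i:])[1]
    return list(best(text)[1])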