Merge pull request #2 from JoeOsborn/gh-pages
Merge #2
mchappidi authored Jul 15, 2016
2 parents 138680c + 350f2d5 commit fbdf6f7
Showing 12 changed files with 324 additions and 152 deletions.
6 changes: 6 additions & 0 deletions index.html
@@ -82,6 +82,7 @@ <h1 id="project-ideas">Project ideas</h1>
<ul>
<li>Video game character AI/social simulation</li>
<li>Automated “Heads Up” player</li>
<li>Realtime game AI</li>
<li>Guess game from level, generate level given game</li>
<li>GVGAI competition bot (or Arcade Learning Environment)</li>
<li>Learn game rules (GDL/VGDL) from forward model</li>
@@ -99,6 +100,11 @@ <h1 id="project-ideas">Project ideas</h1>
</ul></li>
<li>Survey or deep-dive papers from recent AAAI/IJCAI/NIPS/etc conferences or arxiv publications in machine learning/AI/etc</li>
<li>Deep analysis/historical study/etc of some AI system(s) or survey of techniques</li>
<li>Use a probabilistic programming language to encode one or more interesting models
<ul>
<li>E.g. http://webppl.org (book at http://dippl.org), or PyMC3 for Python</li>
</ul></li>
<li>Program synthesis — given a specification or a set of examples or a partially implemented program, come up with a function implementing it</li>
</ul></li>
</ul>
108 changes: 101 additions & 7 deletions instruction-notes.html
@@ -184,6 +184,19 @@ <h2 class="author">Joseph C. Osborn</h2>
<ul>
<li>“MCTS estimates temporary state values in order to decide the next move, whereas TDL learns the long-term value of each state that then guides future behaviour”—i.e., MCTS guesses more (see the sketch below)</li>
</ul></li>
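<li>A minimal sketch of that distinction in plain Python (the tabular setup, state representation, and constants are illustrative assumptions, not course code):
<pre><code># TD(0): learn a persistent, long-term value estimate V across episodes.
V = {}                   # state -> estimated long-term value
alpha, gamma = 0.1, 0.9  # learning rate, discount factor

def td_update(s, reward, s_next):
    v, v_next = V.get(s, 0.0), V.get(s_next, 0.0)
    V[s] = v + alpha * (reward + gamma * v_next - v)

# MCTS instead keeps per-search statistics (a node's value_sum / visits)
# that estimate a state's value only for the current decision and are
# discarded once the move is chosen.</code></pre></li>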
<li>What worked:
<ul>
<li>Graph search, A* material, MCTS explanation, small-group activity</li>
</ul></li>
<li>What didn’t work:
<ul>
<li>RL material wasn’t prepared thoroughly enough</li>
<li>Maze should have been an immutable value object!</li>
</ul></li>
<li>What to change:
<ul>
<li>Needed more active-learning opportunities and more specific RL material; there were too many different topics in one day (nobody had time to do enough reading).</li>
</ul></li>
<li><dl>
<dt>Assignment</dt>
<dd>Individual or pair (long) assignment. (1) should be done by tomorrow, (2) and (3) by Monday.
@@ -198,24 +211,102 @@ <h2 class="author">Joseph C. Osborn</h2>
</ol>
<em>(TAs could help with writing the code or understanding the algorithms. Eager students could implement multiple algorithms, select one on the fly, generate mazes, visualize the path-finding algorithms.)</em></li>
</ul></li>
<li>Day 4: Intelligent agents (also, intro to probability)
<li>Day 4: Intro to probability and probabilistic programming
<ul>
<li><strong>Note: also need to do intermediate evaluations at the end of the day</strong></li>
<li>Topic 1: Basic probability/Bayes rule</li>
<li>Topic 2: Agent architectures</li>
<li>Topic 3: Let’s talk about projects</li>
<li>Topic 0: Feedback/notes on yesterday’s assignment
<ul>
<li>Ask the TAs for code to let you put Maze objects into sets or use them as keys in dicts. I meant to include that from the start. This lets you avoid tricks like turning the Maze into a string for this purpose.</li>
<li>NOTE: If you put a Maze into a set or use it as a dict key, don’t modify it at all afterwards (e.g. via <code>move_player</code>, <code>toggle</code>, or setting its variables)! Only clone it and change the clones—otherwise the set/dict will break in super confusing ways.</li>
</ul>
<ol style="list-style-type: decimal">
<li>Should use <code>m.move_player()</code> and <code>m.toggle()</code> to interact with game state, rather than e.g. checking <code>m.grid[y][x] == &quot;#&quot;</code> directly</li>
<li>Since this is destructive, they should copy the maze using <code>m.clone()</code> before making a move</li>
<li>They can’t (easily) use a distance grid, because toggling switches changes which doors are open and closed; player position alone doesn’t suffice to represent world state, so:</li>
<li>They need to track whether a maze has been seen before. One way is to use a set to track seen mazes.</li>
<li>They need to track the cost and predecessor state along with Maze states. Some ways of doing that include:
<ul>
<li>Add <code>cost</code> and <code>predecessor</code> properties to maze objects.</li>
<li>Include <code>cost</code> and <code>predecessor</code> with the maze in the frontier in a tuple. Of course, when using a priority queue, the tuple’s first element should be <code>g(n) + h(n)</code>, where <code>g(n)</code> is the real cost to reach node <code>n</code> and <code>h(n)</code> is the heuristic value (e.g. Manhattan distance from the goal) at <code>n</code>.</li>
<li>Keep a <code>costs</code> dict and a <code>predecessors</code> dict or a <code>best_paths</code> dict keyed by Maze objects (see the sketch below).</li>
</ul></li>
</ol></li>
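<li>For reference, a minimal A* sketch over clonable Maze states. It uses the real <code>clone()</code> and <code>move_player()</code> but assumes hypothetical <code>moves()</code>, <code>is_goal()</code>, and <code>manhattan()</code> helpers, and that Maze is hashable; adapt the names to the actual Maze API:
<pre><code>import heapq, itertools

def astar(start):
    tie = itertools.count()      # tie-breaker so the heap never compares Mazes
    frontier = [(manhattan(start), next(tie), 0, start)]  # (g+h, tie, g, state)
    costs = {start: 0}           # best g found per state
    predecessors = {start: None} # for path reconstruction
    while frontier:
        _, _, g, m = heapq.heappop(frontier)
        if m.is_goal():
            return g, predecessors
        for mv in m.moves():
            m2 = m.clone()       # never mutate a state already in a dict/set!
            m2.move_player(mv)
            if costs.get(m2, float('inf')) > g + 1:
                costs[m2] = g + 1
                predecessors[m2] = m
                heapq.heappush(frontier,
                               (g + 1 + manhattan(m2), next(tie), g + 1, m2))
    return None, predecessors</code></pre></li>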
<li>Topic 1: Basic probability/Bayes rule
<ul>
<li>what probability of an event means</li>
<li>P(X) – cases where X happens / all possible cases</li>
<li>P(X, Y) – cases where both X and Y happen / all possible cases</li>
<li>If X,Y independent, then P(X,Y) = P(X) P(Y)</li>
<li>P(X | Y) = P(Y|X) P(X) / P(Y) – or equivalently, P(X,Y) = P(X|Y) P(Y) = P(Y|X) P(X)</li>
<li>Chaining: P(G,S,R)=P(G | S,R) P(S | R) P(R)</li>
<li>I like the <a href="https://en.wikipedia.org/wiki/Conditional_probability">wikipedia page</a> on conditional probability too</li>
<li>Bayesian statistics
<ul>
<li>Prior (background belief) and posterior (belief after taking the evidence into account) probability</li>
<li>Not, “What is the chance of X happening”, but “Given my background knowledge/superstition/experience, what is the chance of X happening?”</li>
<li>Example priors: Uniform/flat prior; or:
<ul>
<li>“An example is a prior distribution for the temperature at noon tomorrow. A reasonable approach is to make the prior a normal distribution with expected value equal to today’s noontime temperature, with variance equal to the day-to-day variance of atmospheric temperature, or a distribution of the temperature for that day of the year.”</li>
</ul></li>
<li>Priors are really important but we don’t have time to get too deeply into them</li>
</ul></li>
<li>Given P(Y | X), P(X), and P(Y), we can find P(X | Y) with Bayes rule</li>
<li>P(X)=“chance of rain”, P(Y)=“chance of clouds”: “chance of rain when it’s cloudy = chance of clouds given rain × chance of rain / chance of clouds” (see the worked numbers after this list)</li>
<li>Bayes nets</li>
<li>Random variables and expected value
<ul>
<li>P(X=x)</li>
<li>“probability-weighted average of all possible values”</li>
</ul></li>
<li>Distributions (normal, Poisson, negative binomial), mean/variance, and PDFs</li>
</ul></li>
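<li>Worked numbers for the rain/clouds example (a quick sanity check; the probabilities are made up for illustration):
<pre><code>p_rain = 0.1                # P(rain)
p_clouds = 0.3              # P(clouds)
p_clouds_given_rain = 0.9   # P(clouds | rain)

# Bayes rule: P(rain | clouds) = P(clouds | rain) * P(rain) / P(clouds)
p_rain_given_clouds = p_clouds_given_rain * p_rain / p_clouds
print(p_rain_given_clouds)  # 0.3: seeing clouds triples our belief in rain</code></pre></li>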
<li>Exercise: Make a Bayes net for some situation. It’s okay to use e.g. “low/medium/high” for the probabilities in the tables.</li>
<li>Topic 2: Probabilistic programming
<ul>
<li>Programming with distribution variables</li>
<li>“probabilistic programming languages extend a well-specified deterministic programming language with primitive constructs for random choice” <span class="citation">(Goodman and Stuhlmüller 2014)</span></li>
<li>“If we view the semantics of the underlying deterministic language as a map from programs to executions of the program, the semantics of a PPL built on it will be a map from programs to distributions over executions. When the program halts with probability one, this induces a proper distribution over return values.” <span class="citation">(Goodman and Stuhlmüller 2014)</span></li>
<li>WebPPL examples (see the toy sampler after this list)</li>
</ul></li>
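<li>A toy illustration of that semantics in plain Python rather than WebPPL (a rejection sampler; the model and names are invented): run the program many times, reject executions that violate the condition, and histogram the return values.
<pre><code>import random
from collections import Counter

def flip(p=0.5):
    return p > random.random()

def program():
    a, b = flip(), flip()
    if not (a or b):        # condition: observe at least one head
        return None         # reject this execution
    return a + b            # return value: number of heads

samples = [r for r in (program() for _ in range(10000)) if r is not None]
print(Counter(samples))     # roughly 2:1 ratio of one head to two heads</code></pre></li>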
<li>Topic 3: Let’s talk about projects
<ul>
<li>Project format</li>
<li>Project suggestions</li>
</ul></li>
<li>What went well:
<ul>
<li>Basic probability, Bayes nets and exercise, connection to ML, WebPPL</li>
</ul></li>
<li>What went poorly:
<ul>
<li>Bayes rule: a typo in my notes made me stumble, and I ended up with a result I couldn’t interpret well.</li>
</ul></li>
<li>What to do next time:
<ul>
<li>Should have practiced the Bayes Rule part of the lecture specifically!</li>
<li>Should have had more examples of belief nets in the bag.</li>
</ul></li>
<li><dl>
<dt>Assignment 1</dt>
<dd>Individual (small) assignment.
</dd>
</dl>
Give me a list of three or more project ideas you might be interested in doing, either from the suggestions or your own idea. If you have a partner or partners in mind, let me know as well.</li>
Give me a list of three or more project ideas you might be interested in doing, either from the suggestions or your own idea. If you have a partner or partners in mind, let me know as well. Submit them (as Markdown <code>.md</code> files) in the folder <code>projects/4-project-ideas</code>.</li>
<li><dl>
<dt>Assignment 2</dt>
<dd>Individual or pair (medium-length) assignment
<dd>Individual (small) assignment.
</dd>
</dl>
Write some knowledge-based agents for the maze assignments. Maybe an adversary and a player character? Make the MCTS/RL compete against the adversary? Make adversaries for each other’s things?</li>
Fill out the preliminary evaluation form and send it to a TA. They’ll anonymize them and send them on to me so I can adjust the course based on your feedback. You can find the form in the <code>projects/4-evaluations</code> folder.</li>
<li>Assignment 3: MCTS and Reinforcement Learning agents for maze solving.</li>
<li><dl>
<dt>Assignment 4 (Optional)</dt>
<dd>Individual or pair (small) assignment.
</dd>
</dl>
<p>Make a probabilistic program expressing your Bayesian model from earlier today. Try to give it a prior based on your intuition, or by loading up a dataset. Either PyMC3 or WebPPL is fine.</p>
Submit it as an <code>.ipynb</code> or a <code>.wppl</code> file in <code>projects/4-probabilistic</code>.</li>
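<li>A minimal PyMC3 starting point (a sketch only; the model, variable names, and numbers are placeholder assumptions, not a solution):
<pre><code>import pymc3 as pm

with pm.Model():
    # Prior on tomorrow's noon temperature, centered on today's
    # (cf. the Wikipedia prior example quoted above); numbers are invented.
    temp = pm.Normal('temp', mu=20.0, sd=5.0)
    # Condition on (fake) observed readings with known sensor noise.
    obs = pm.Normal('obs', mu=temp, sd=2.0, observed=[19.0, 21.0, 18.5])
    trace = pm.sample(1000)

print(trace['temp'].mean())   # posterior mean temperature</code></pre></li>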
</ul></li>
<li>Day 5: Machine learning as function approximation
<ul>
@@ -338,6 +429,9 @@ <h1 id="references" class="unnumbered">References</h1>
<div id="ref-compton2013generative">
<p>Compton, Kate, Joseph C Osborn, and Michael Mateas. 2013. “Generative Methods.” In <em>The Fourth Procedural Content Generation in Games Workshop, PCG</em>. Vol. 1.</p>
</div>
<div id="ref-dippl">
<p>Goodman, Noah D, and Andreas Stuhlmüller. 2014. “The Design and Implementation of Probabilistic Programming Languages.” <a href="http://dippl.org" class="uri">http://dippl.org</a>.</p>
</div>
<div id="ref-sutton1998reinforcement">
<p>Sutton, Richard S, and Andrew G Barto. 1998. <em>Reinforcement Learning: An Introduction</em>. Vol. 1. 1. MIT press Cambridge.</p>
</div>
41 changes: 41 additions & 0 deletions projects/2-state-machines/answers.py
@@ -61,6 +61,47 @@ def test2():
assert not check(test2(), "x@")
assert not check(test2(), "[email protected]")


# Extension to Assignment 1-2 (full legal e-mail addresses)
def test2_extension():
# (x+(\.x+)*)@x+\.x+(\.x+)*
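# States: s1 = start; s2/s3 read the local part x+(\.x+)*;
# s4-s8 read the domain x+\.x+(\.x+)* (s7 accepts); s9 is the dead state.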
return StateMachine([("s1", "x", "s2"), ("s1", ".", "s9"), ("s1", "@", "s9"),
("s2", "x", "s2"), ("s2", ".", "s3"), ("s2", "@", "s4"),
("s3", "x", "s2"), ("s3", ".", "s9"), ("s3", "@", "s9"),
("s4", "x", "s5"), ("s4", ".", "s9"), ("s4", "@", "s9"),
("s5", "x", "s5"), ("s5", ".", "s6"), ("s5", "@", "s9"),
("s6", "x", "s7"), ("s6", ".", "s9"), ("s6", "@", "s9"),
("s7", "x", "s7"), ("s7", ".", "s8"), ("s7", "@", "s9"),
("s8", "x", "s7"), ("s8", ".", "s9"), ("s8", "@", "s9"),
("s9", "x", "s9"), ("s9", ".", "s9"), ("s9", "@", "s9")],
"s1",
["s7"]
)

assert check(test2_extension(), "[email protected]")
assert check(test2_extension(), "[email protected]")
assert check(test2_extension(), "[email protected]")
assert check(test2_extension(), "[email protected]")
assert check(test2_extension(), "[email protected]")
assert check(test2_extension(), "[email protected]")
assert check(test2_extension(), "[email protected]")
assert check(test2_extension(), "[email protected]")
assert check(test2_extension(), "[email protected]")
assert check(test2_extension(), "[email protected]")
assert check(test2_extension(), "[email protected]")
assert check(test2_extension(), "[email protected]")
assert check(test2_extension(), "[email protected]")
assert not check(test2_extension(), "@x.x")
assert not check(test2_extension(), "[email protected]")
assert not check(test2_extension(), "x@x.")
assert not check(test2_extension(), "x@x")
assert not check(test2_extension(), "x.x")
assert not check(test2_extension(), "x@")
assert not check(test2_extension(), "[email protected]")
assert not check(test2_extension(), "[email protected]")
assert not check(test2_extension(), "[email protected]")


# Assignment 2:
def sample2(sm, length, sofar):
# Each path of length 0 is either accepting or non-accepting;