What is the Bayesian formula?
 [image]
 What does p(A|B) mean?
The probability of A given that B has occurred: p(A|B) = p(A and B) / p(B)
 What does p(A and B) mean?
The probability that both A and B occur: p(A and B) = p(A|B) * p(B)
 What is Bayes' rule?
p(B|A) = p(B and A) / p(A) = p(A|B) * p(B) / p(A)
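A minimal sketch of Bayes' rule as a calculation. The function name `bayes_posterior` and all the numbers are hypothetical, chosen only for illustration; p(A) is obtained from the law of total probability.

```python
def bayes_posterior(p_a_given_b, p_b, p_a):
    """Return p(B|A) = p(A|B) * p(B) / p(A)."""
    if p_a <= 0:
        raise ValueError("p(A) must be positive")
    return p_a_given_b * p_b / p_a


# Hypothetical numbers: B = "has the condition", A = "test is positive".
p_b = 0.01              # prior p(B)
p_a_given_b = 0.9       # p(A|B)
p_a_given_not_b = 0.05  # p(A|not B)

# Law of total probability: p(A) = p(A|B)p(B) + p(A|not B)p(not B)
p_a = p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)

posterior = bayes_posterior(p_a_given_b, p_b, p_a)
print(round(posterior, 4))
```

With these made-up inputs the posterior stays well below p(A|B), because the prior p(B) is small.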
 Show a joint distribution of 3 variables
 [image]
 How can you reduce (the variables of) a joint distribution table?
If a variable is independent of the others, there is no need to tabulate it against each of them: if A is independent of B, then p(A|B) is just equal to p(A).
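To illustrate the saving, here is a sketch that rebuilds a full joint table for binary variables from one number per variable. It assumes all variables are mutually independent; the helper name `joint_from_independent` and the marginals are hypothetical.

```python
from itertools import product


def joint_from_independent(marginals):
    """Build the joint table from each variable's p(X=True).

    Assumes every variable is independent of the others, so each
    joint entry is just the product of the individual probabilities.
    """
    table = {}
    for values in product([True, False], repeat=len(marginals)):
        p = 1.0
        for value, p_true in zip(values, marginals):
            p *= p_true if value else 1.0 - p_true
        table[values] = p
    return table


marginals = [0.2, 0.5, 0.9]  # hypothetical p(A), p(B), p(C)
joint = joint_from_independent(marginals)
print(len(joint), "entries derived from", len(marginals), "numbers")
```

Three stored numbers are enough to recover all eight entries here; without independence each entry would have to be stored separately.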
What is a Bayesian network used for?
Describing which variables influence which other variables. No connection between two variables implies conditional independence.
 What is P(A or B)?
 P(A) + P(B) - P(A and B)
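A small check of this formula on a fair six-sided die. The events A ("even") and B ("greater than 3") are hypothetical examples; exact fractions avoid rounding noise.

```python
from fractions import Fraction

outcomes = set(range(1, 7))               # faces of a fair die
A = {x for x in outcomes if x % 2 == 0}   # even faces
B = {x for x in outcomes if x > 3}        # faces greater than 3


def prob(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(outcomes))


p_or_direct = prob(A | B)                        # count the union directly
p_or_formula = prob(A) + prob(B) - prob(A & B)   # inclusion-exclusion
print(p_or_direct, p_or_formula)
```

Subtracting P(A and B) removes the outcomes that would otherwise be counted twice.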
 What are decision trees?
A decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility.
 What is entropy?
A measure of the amount of disorder or surprise in a system. High entropy means we have no idea what is going to happen. Low entropy means we've pinned things down to some extent: we've got some information about what is likely to happen.
 What is the entropy formula?
H(X) = E_x[I(x)] = -sum over all x of p(x) * log2(p(x)), where I(x) = -log2(p(x)) is the information content of outcome x
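A minimal sketch of the entropy calculation in bits. The function name `entropy` is an illustrative choice, and outcomes with zero probability are skipped by the usual convention that 0 * log2(0) counts as 0.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


print(entropy([0.5, 0.5]))   # fair coin: maximal surprise for two outcomes
print(entropy([0.9, 0.1]))   # biased coin: less surprise
```

A certain outcome has zero entropy, and a uniform distribution over the same outcomes has the highest.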
 How to build a decision tree?
Split on the attribute with the highest information gain, then recurse on each resulting branch.
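A sketch of the attribute-selection step only, assuming information gain is the parent's entropy minus the size-weighted entropy of the subsets after the split. The toy dataset and the helper names are hypothetical, and the recursion and stopping rules are left out.

```python
import math
from collections import Counter


def entropy_of_labels(labels):
    """Entropy in bits of the class labels in a list."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy before the split minus weighted entropy after it."""
    base = entropy_of_labels([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy_of_labels(subset)
    return base - remainder


def best_attribute(rows, attributes, target):
    """Pick the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical toy data: "outlook" fully determines "play", "windy" does not.
rows = [
    {"outlook": "sunny", "windy": False, "play": "no"},
    {"outlook": "sunny", "windy": True, "play": "no"},
    {"outlook": "rainy", "windy": False, "play": "yes"},
    {"outlook": "rainy", "windy": True, "play": "yes"},
]
print(best_attribute(rows, ["outlook", "windy"], "play"))
```

A full tree builder would call `best_attribute` on each subset in turn, stopping when a subset is pure or no attributes remain.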
What is the k-nearest-neighbour algorithm?
It takes a specific point and classifies it according to the majority vote of the k nearest points. Video: http://www.youtube.com/watch?v=4ObVzTuFivY
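A minimal sketch of the majority vote, assuming Euclidean distance and points given as coordinate tuples. The function name `knn_classify` and the sample data are hypothetical; `math.dist` needs Python 3.8 or later.

```python
import math
from collections import Counter


def knn_classify(query, points, labels, k=3):
    """Label a query point by majority vote of its k nearest points."""
    # Indices of the training points, nearest first.
    order = sorted(range(len(points)),
                   key=lambda i: math.dist(query, points[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training data: two well-separated clusters.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_classify((0.5, 0.5), points, labels))
```

An odd k is a common choice for two classes because it avoids tied votes.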
 What is supervised learning?
The learner must learn to classify cases, and a labelled training set spells out what the right answer should be in each case. K-nearest neighbour, decision trees, and neural nets are all examples of supervised learning.
 What is reinforcement learning?
An agent lives in an environment. It must choose actions within that world, and periodically it gets either positive or negative reinforcement.
 What is the temporal difference learning equation?
Vi = Vi + a [r + Vj - Vi], where Vi = value estimate for the current state i (the opinion being updated), Vj = existing value estimate for the next state j, a = learning rate, r = actual reward
 What is discounting future rewards and how can it be used?
Rewards that can be obtained now are usually better than ones obtained later. We discount the future rewards implicit in Vj with a new term d (e.g. d = 0.9): Vi = Vi + a [r + d*Vj - Vi]
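A sketch of one discounted temporal difference update, assuming state values are held in a dictionary. The function name `td_update`, the state names, and the numbers are hypothetical; setting d = 1.0 gives the undiscounted form.

```python
def td_update(values, i, j, reward, a=0.1, d=0.9):
    """Apply Vi = Vi + a * (r + d * Vj - Vi) in place and return the new Vi."""
    values[i] = values[i] + a * (reward + d * values[j] - values[i])
    return values[i]


# Hypothetical two-state example: moving from s0 to s1 earns reward 0.5.
values = {"s0": 0.0, "s1": 1.0}
new_v = td_update(values, "s0", "s1", reward=0.5)
print(new_v)
```

The bracketed term is the prediction error: how much better or worse the step turned out than the current estimate for state i implied.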
 What is Q learning?
Instead of keeping one value per state (Vstate), we keep track of a Q value for each possible state-action pair. This is also called model-free learning.
 What is the update formula for the Q-learning?
Qi,k = Qi,k + a [r + d * max_x(Qj,x) - Qi,k], where we're in state i and choose action k, which takes us to state j and gives us reward r. Learning rate a = 0.1, discount factor d = 0.9. The expected value of getting to state j is the maximum Q value we could get for any action x done at j.
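A sketch of a single Q-learning update, taking the error term as r + d * max over x of Q(j, x) - Q(i, k). The nested-dictionary Q table, the function name `q_update`, and the states and actions are hypothetical.

```python
def q_update(q, i, k, reward, j, a=0.1, d=0.9):
    """Update Q[i][k] after taking action k in state i and landing in state j."""
    # Value of state j = best Q value over any action x available there.
    best_next = max(q[j].values()) if q[j] else 0.0
    q[i][k] += a * (reward + d * best_next - q[i][k])
    return q[i][k]


# Hypothetical Q table with two states and two actions each.
q = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 1.0, "right": 2.0},
}
updated = q_update(q, "s0", "right", reward=1.0, j="s1")
print(updated)
```

Only the entry for the state-action pair actually taken changes; the other entries are left alone until they are visited.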
 What are the characteristics of local search?
Local search methods are highly general. They start somewhere in a space of possibilities and iteratively try the neighbours of the current location to see if they are better, which makes them myopic. They are used when we are ignorant of the global structure of the possibility space. They are computationally inefficient, and they mirror natural adaptation.
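Hill climbing is one simple instance of local search, sketched here under the assumption that the caller supplies a scoring function and a neighbour function. The names `hill_climb`, `score`, and `neighbours`, and the one-dimensional example, are hypothetical.

```python
def hill_climb(start, score, neighbours, max_steps=1000):
    """Move to the best neighbour until no neighbour improves the score."""
    current = start
    for _ in range(max_steps):
        best = max(neighbours(current), key=score)
        if score(best) <= score(current):
            return current  # local optimum: nothing nearby is better
        current = best
    return current


def score(x):
    """Hypothetical objective with a single peak at x = 7."""
    return -(x - 7) ** 2


def neighbours(x):
    """Integer points one step to either side."""
    return [x - 1, x + 1]


result = hill_climb(0, score, neighbours)
print(result)
```

Because the search only looks one step ahead, an objective with several peaks could leave it stuck on whichever peak is nearest the start.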
