Changes between two versions
This shows the changes to "Keywords to remember in the course" between 2014-10-29 18:43 by Anton Osika and 2014-10-30 11:06 by Anders Grandell.
Summary of topics in the course
K-nearest neighbour: predict by majority vote (classification) or average (regression) over the K nearest training samples.
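A minimal sketch of the voting rule, assuming Euclidean distance and a small made-up training set (the names `knn_predict` and `train` are illustrative only):

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (feature_tuple, label); query: feature tuple."""
    # Sort training samples by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda s: math.dist(s[0], query))
    # Majority vote among the labels of the k closest samples.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Made-up data: two clusters labelled "a" and "b".
train = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
    ((1.0, 1.0), "b"), ((0.9, 1.1), "b"),
]
print(knn_predict(train, (0.15, 0.15), k=3))  # prints "a"
```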
Decision trees: Entropy = unpredictability = - sum p log p. Maximize the information gain (= reduction in entropy) at every node. (Gini impurity = sum p(1-p) is an alternative measure.) Cross-validate and prune the last nodes.
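The impurity measures and the gain of a split can be sketched as follows. The base-2 logarithm and the helper names are assumptions made for illustration; the notes do not fix a base.

```python
import math
from collections import Counter

def class_probs(labels):
    # Relative frequency of each class among the labels.
    n = len(labels)
    return [count / n for count in Counter(labels).values()]

def entropy(labels):
    # - sum p log2 p over the classes present.
    return -sum(p * math.log2(p) for p in class_probs(labels))

def gini(labels):
    # sum p (1 - p) over the classes present.
    return sum(p * (1 - p) for p in class_probs(labels))

def information_gain(parent, children):
    # Parent entropy minus the size-weighted entropy of the child nodes.
    n = len(parent)
    weighted = sum(len(ch) / n * entropy(ch) for ch in children)
    return entropy(parent) - weighted

parent = ["y", "y", "n", "n"]
print(entropy(parent), gini(parent))
print(information_gain(parent, [["y", "y"], ["n", "n"]]))  # a perfect split
```

A split that leaves the class mix unchanged in every child has zero gain, which is why such splits are never chosen.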
Bayesian Inference: The process of calculating the posterior probability distribution P(y | x) given observed data x.
Bayesian Learning: The process of learning the likelihood distribution P(x | y) and prior probability distribution P(y) from a set of training points.
Likelihood: P(x | y)
A posteriori: P(y | x), proportional to P(x | y) P(y)
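A small worked example of the proportionality, with made-up numbers for the prior and likelihood (the class names and values are purely illustrative):

```python
def posterior(likelihood, prior):
    """likelihood[y] = P(x | y) for the observed x; prior[y] = P(y)."""
    # Unnormalized posterior: P(x | y) * P(y) for every class y.
    unnormalized = {y: likelihood[y] * prior[y] for y in prior}
    # The normalizer is the evidence P(x) = sum over y of P(x | y) P(y).
    evidence = sum(unnormalized.values())
    return {y: value / evidence for y, value in unnormalized.items()}

prior = {"spam": 0.2, "ham": 0.8}       # assumed P(y)
likelihood = {"spam": 0.6, "ham": 0.1}  # assumed P(x | y) for one observed x
post = posterior(likelihood, prior)
print(post)
```

Here the unnormalized values are 0.12 and 0.08, so the posterior is roughly 0.6 for "spam" and 0.4 for "ham" even though the prior favoured "ham".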
Boosting: Aggregating many weak classifiers. AdaBoost: increase the weights of misclassified samples, train the next classifier, iterate, and then use a weighted vote of the classifiers.
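A compact sketch of AdaBoost on one-dimensional data, assuming threshold "stumps" as the weak classifiers and labels in {+1, -1}. The data and function names are made up for illustration; no single stump fits this data, but three boosted stumps do.

```python
import math

def stump_predict(stump, x):
    # A stump predicts `sign` above the threshold and `-sign` otherwise.
    threshold, sign = stump
    return sign if x > threshold else -sign

def best_stump(xs, ys, weights):
    # Exhaustively pick the stump with the lowest weighted error.
    best, best_err = None, float("inf")
    for threshold in sorted(set(xs)):
        for sign in (1, -1):
            stump = (threshold, sign)
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(stump, x) != y)
            if err < best_err:
                best, best_err = stump, err
    return best, best_err

def adaboost(xs, ys, rounds=3):
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump, err = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)    # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # weight of this classifier
        ensemble.append((alpha, stump))
        # Misclassified samples are scaled up, correct ones scaled down.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    # Weighted vote of all weak classifiers.
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(xs, ys, rounds=3)
print([predict(ensemble, x) for x in xs])
```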
Bagging: Bootstrap new samples (draw with replacement), train a classifier on each and average them.
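A sketch of bagging under stated assumptions: the base learner is a hypothetical nearest-class-mean classifier on one-dimensional data, and "averaging" is done by majority vote. Any other base learner could be plugged in through `train_fn`.

```python
import random
from collections import Counter

def bootstrap(data, rng):
    # Draw len(data) samples with replacement.
    return [rng.choice(data) for _ in data]

def train_nearest_mean(sample):
    # Hypothetical base learner: predict the class whose mean is closest.
    sums = {}
    for x, y in sample:
        total, count = sums.get(y, (0.0, 0))
        sums[y] = (total + x, count + 1)
    means = {y: total / count for y, (total, count) in sums.items()}
    return lambda x: min(means, key=lambda y: abs(x - means[y]))

def bagging_fit(data, train_fn, n_models=25, seed=0):
    # Train one model per bootstrap sample.
    rng = random.Random(seed)
    return [train_fn(bootstrap(data, rng)) for _ in range(n_models)]

def bagging_predict(models, x):
    # Aggregate the individual predictions by majority vote.
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

data = [(1.0, "lo"), (1.5, "lo"), (2.0, "lo"),
        (8.0, "hi"), (8.5, "hi"), (9.0, "hi")]
models = bagging_fit(data, train_nearest_mean)
print(bagging_predict(models, 1.2), bagging_predict(models, 8.8))
```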
Concept: c, a true/false labeling of every x in X.
Hypothesis space: H, the set of all candidate true/false concepts h considered before any data arrives.
True error of a hypothesis: the probability that hypothesis h gives the wrong classification for one data point drawn at random from the data distribution.
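Probably Approximately Correct: How many training samples m are needed so that, with probability at least 1 - delta, every hypothesis consistent with the training data has true error at most eps. The probability that some consistent hypothesis has true error above eps is at most |H|*(1-eps)^m, so we require |H|*(1-eps)^m <= delta, which holds when m >= (ln|H| + ln(1/delta)) / eps.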
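As a rough numeric illustration for a finite hypothesis space: requiring |H|*(1-eps)^m <= delta and using (1-eps)^m <= exp(-eps*m) gives m >= (ln|H| + ln(1/delta)) / eps. The function name and the example numbers below are assumptions chosen for illustration.

```python
import math

def pac_sample_bound(hypothesis_count, eps, delta):
    # Smallest integer m with m >= (ln|H| + ln(1/delta)) / eps.
    return math.ceil((math.log(hypothesis_count) + math.log(1 / delta)) / eps)

# Example: |H| = 2**10 hypotheses, error at most 0.1, failure probability 0.05.
m = pac_sample_bound(2 ** 10, eps=0.1, delta=0.05)
print(m)                           # prints 100
print(2 ** 10 * (1 - 0.1) ** m)    # the bound on the failure probability, below 0.05
```

Note that the bound grows only logarithmically in |H| and 1/delta but linearly in 1/eps.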
VC dimension: The size of the largest set of data points that H can shatter, i.e. for which every possible true/false labeling (every subset) is produced by some hypothesis h in H.
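Shattering can be checked by brute force on small examples. The sketch below assumes two toy hypothesis classes on the real line, thresholds (x > t) and closed intervals (a <= x <= b), with parameters restricted to a finite grid standing in for the full infinite classes.

```python
def shatters(hypotheses, points):
    # H shatters the points if all 2^n true/false labelings are realized.
    realized = {tuple(h(x) for x in points) for h in hypotheses}
    return len(realized) == 2 ** len(points)

grid = [0.5, 1.5, 2.5, 3.5]  # assumed parameter grid

# Threshold classifiers: h_t(x) = (x > t).
thresholds = [lambda x, t=t: x > t for t in grid]

# Interval classifiers: h_ab(x) = (a <= x <= b) with a <= b.
intervals = [lambda x, a=a, b=b: a <= x <= b
             for a in grid for b in grid if a <= b]

print(shatters(thresholds, [1]), shatters(thresholds, [1, 2]))
print(shatters(intervals, [1, 2]), shatters(intervals, [1, 2, 3]))
```

Thresholds cannot label the left point true and the right point false, and a single interval cannot label the outer two of three points true while excluding the middle one, consistent with VC dimensions of 1 and 2 for these classes.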
Naive Bayes: Assume the features are conditionally independent given the class; choose the class that maximises the a posteriori probability (not the likelihood).
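A minimal sketch for categorical features. Laplace (add-one) smoothing is an addition not mentioned in the notes, included only to avoid zero probabilities; the data and names are made up.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(samples):
    """samples: list of (feature_tuple, label)."""
    label_counts = Counter(label for _, label in samples)
    # value_counts[(label, i)][v]: samples of `label` whose feature i equals v.
    value_counts = defaultdict(Counter)
    for features, label in samples:
        for i, v in enumerate(features):
            value_counts[(label, i)][v] += 1
    return label_counts, value_counts, len(samples)

def predict_naive_bayes(model, features, smoothing=1.0, n_values=2):
    label_counts, value_counts, n = model
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        # log P(y) + sum_i log P(x_i | y): the independence assumption
        # lets the per-feature likelihoods simply multiply.
        score = math.log(count / n)
        for i, v in enumerate(features):
            num = value_counts[(label, i)][v] + smoothing
            den = count + smoothing * n_values
            score += math.log(num / den)
        if score > best_score:
            best_label, best_score = label, score
    return best_label  # maximum a posteriori class

samples = [((1, 1), "spam"), ((1, 0), "spam"), ((1, 1), "spam"),
           ((0, 0), "ham"), ((0, 1), "ham"), ((0, 0), "ham")]
model = train_naive_bayes(samples)
print(predict_naive_bayes(model, (1, 1)), predict_naive_bayes(model, (0, 0)))
```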
Logistic regression: Regression to a probability: models P(y=1 | x) as a sigmoid of a linear function of x.
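A sketch of one-dimensional logistic regression fitted by plain gradient descent on the average log-loss. The learning rate, step count and data are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, steps=5000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the average log-loss: (prediction - label) * input.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Made-up, not perfectly separable data so the weights stay finite.
xs = [0, 1, 2, 3, 4, 5]
ys = [0, 0, 1, 0, 1, 1]
w, b = fit_logistic(xs, ys)
print(sigmoid(w * 0 + b), sigmoid(w * 5 + b))  # estimated P(y=1 | x) at x=0 and x=5
```

The output is a probability between 0 and 1; thresholding it at 0.5 turns the regression into a classifier.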
Inference and decision:
Discriminant function
Discriminative vs Generative model
Credit Assignment - The problem of reinforcing the right parts of a compound behaviour