Model Representation

Notation:

x^(i): input variables/features

y^(i): output/target variables

(x^(i), y^(i)): a training example (pair)

training set - a list of training examples

superscript (i) - an index into the training set (not an exponentiation); see the indexing sketch below
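
A minimal sketch of this notation in Python, using a toy training set with made-up living areas and prices:

    # Toy training set (hypothetical values): living area in square feet -> price in $1000s.
    # Each pair (x[i], y[i]) is one training example.
    x = [2104, 1416, 1534, 852]   # inputs x^(1), ..., x^(4)
    y = [460, 232, 315, 178]      # targets y^(1), ..., y^(4)

    i = 1                          # Python lists are 0-indexed, so x[1] is x^(2)
    print(f"Training example 2: (x^(2), y^(2)) = ({x[i]}, {y[i]})")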

Given a training set, we want to learn a function hθ(x) that is a good predictor of the corresponding value of y for each x. This function is called the hypothesis function.

e.g., given the square feet and prices of a set of houses, you could create a scatter plot of the data
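
For instance, a quick matplotlib sketch of such a scatter plot (the data values below are made up):

    import matplotlib.pyplot as plt

    # Hypothetical housing data: living area (square feet) vs. price (in $1000s).
    sqft = [2104, 1416, 1534, 852, 1940]
    price = [460, 232, 315, 178, 400]

    plt.scatter(sqft, price)
    plt.xlabel("Living area (square feet)")
    plt.ylabel("Price ($1000s)")
    plt.title("Housing prices vs. living area")
    plt.show()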

supervised learning - we are given the "right answer" (the correct output y) for each example in the training set

regression problem - given the square feet of a living area, we want to predict a real-valued (continuous) output: the price

classification problem - given the square feet of a living area, we want to predict a discrete value (e.g., whether it's a house or an apartment)

x: input, square feet of a living area

y: output, estimated price

hθ(x): hypothesis function that maps from x to y

We feed the training set to the learning algorithm, which outputs the hypothesis function hθ(x).

We represent our hypothesis function with: hθ(x) = θ0 + θ1x

This is univariate linear regression (linear regression with a single input variable x).

θ0 and θ1 are the parameters of the model.

We choose these parameters by minimizing the cost function.
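
A minimal sketch of the hypothesis as Python code; the parameter values are arbitrary placeholders, not ones learned by minimizing the cost function:

    # Hypothesis for univariate linear regression: h_theta(x) = theta0 + theta1 * x
    def h(x, theta0, theta1):
        """Predict y (price) for a given x (square feet) with the linear hypothesis."""
        return theta0 + theta1 * x

    theta0, theta1 = 50.0, 0.2      # hypothetical parameter values
    print(h(1500, theta0, theta1))  # predicted price for 1500 sq ft: 350.0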
