Notation:
x^(i): input variables/features
y^(i): output/target variables
(x^(i), y^(i)): a training example (pair)
training set - a list of training examples
superscript (i) - an index into the training set (not exponentiation)
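A minimal sketch of this notation in Python (the sizes and prices below are made-up values for illustration, not real data):

training_set = [
    (2104, 400_000),   # (x^(1), y^(1)): square feet, price
    (1416, 232_000),   # (x^(2), y^(2))
    (1534, 315_000),   # (x^(3), y^(3))
]

# The superscript (i) indexes into this list; the notation is 1-based,
# while Python lists are 0-based.
x_1, y_1 = training_set[0]   # x^(1) = 2104, y^(1) = 400_000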
Given a training set, we want to learn a function hθ(x) so that hθ(x) is a good predictor of the corresponding value of y. This function is called the hypothesis function.
e.g., given the square footage and price of a set of houses, you could create a scatter plot of the data
supervised learning - the "right answer" is given for each example in the training set
regression problem - given the square footage of a living area, we want to predict a real-valued output (the price)
classification problem - given the square footage of a living area, we want to predict a discrete value (whether it's a house or an apartment)
x: input, square feet of a living area
y: output, the price we want to predict
hθ(x): hypothesis function that maps from x to y
We feed the training set into the learning algorithm, which outputs the hypothesis function h.
We represent the hypothesis function as: hθ(x) = θ0 + θ1x
This is univariate linear regression (linear regression with a single input variable x).
θ0 and θ1 are the parameters of the model.
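A minimal Python sketch of this hypothesis; the parameter values used in the example call are arbitrary, not fitted to any data:

def h(theta0, theta1, x):
    # h_theta(x) = theta0 + theta1 * x
    return theta0 + theta1 * x

# Predicted price for a 2104 sq ft living area, with arbitrary parameters:
print(h(50_000, 150, 2104))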
We choose these parameters by minimizing the cost function.
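Assuming the standard squared-error cost used for linear regression, a minimal sketch of what "the cost function" refers to (the training_set argument is a list of (x, y) pairs as above):

def cost(theta0, theta1, training_set):
    # J(theta0, theta1) = (1 / (2m)) * sum over i of (h_theta(x^(i)) - y^(i))^2
    m = len(training_set)
    total = sum((theta0 + theta1 * x - y) ** 2 for x, y in training_set)
    return total / (2 * m)

"Choosing the parameters" means finding the θ0, θ1 that make this cost as small as possible.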