Machine Learning by Andrew Ng_2 Cost Function

Systematic notes on Professor Ng's machine learning course on YouTube.

First, a recap of the major topics he covered in this 112-video series:

Throughout the whole machine learning course, we need to grasp its essential concepts, and the cost function is one of them. Linear regression is an easy way to understand this concept. For example, below is the squared error function, one type of cost function, for a single-feature linear regression problem:
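For a training set of m examples, where h_theta is the hypothesis, the squared error cost is:

$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$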

The parameters theta0 and theta1 are adjusted to find the best-fit regression line.
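For a single feature, that line is the hypothesis:

$$h_\theta(x) = \theta_0 + \theta_1 x$$

so fitting the line means choosing the pair (theta0, theta1) that makes J(theta0, theta1) as small as possible.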

So the key is to come up with an algorithm that automatically finds the (theta0, theta1) pair sitting at the bottom, the point of convergence, of the bowl-shaped surface of J(theta0, theta1).

This algorithm is gradient descent. It is built directly on partial derivatives from calculus; note that theta0 and theta1 must be updated simultaneously, which is consistent with how partial derivatives are taken.
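Concretely, each iteration updates both parameters from the same old values (repeat until convergence):

$$\begin{aligned}
\theta_0 &:= \theta_0 - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \\
\theta_1 &:= \theta_1 - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}
\end{aligned}$$

The two sums on the right are exactly the partial derivatives of J with respect to theta0 and theta1.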

Alpha is the learning rate: the size of the stride you take on each step while descending toward convergence. Intuitively, although a large alpha can accelerate the descent, it can also overshoot a narrow minimum and prevent you from ever settling at the right bottom.
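A minimal sketch of batch gradient descent for this one-feature case in Python with NumPy; the function names, the fixed iteration count used as a stopping rule, and the sample data are illustrative assumptions, not from the lecture:

```python
import numpy as np

def compute_cost(theta0, theta1, x, y):
    """Squared error cost J(theta0, theta1) over m training examples."""
    m = len(x)
    predictions = theta0 + theta1 * x          # h_theta(x) for every example
    return np.sum((predictions - y) ** 2) / (2 * m)

def gradient_descent(x, y, alpha=0.01, iterations=1000):
    """Batch gradient descent for single-feature linear regression."""
    m = len(x)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = (theta0 + theta1 * x) - y     # h_theta(x) - y
        # Compute both gradients first so the update is simultaneous.
        grad0 = np.sum(errors) / m
        grad1 = np.sum(errors * x) / m
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

# Example: noisy points around the line y = 2 + 3x (hypothetical data).
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 100)
y = 2 + 3 * x + rng.normal(0, 0.1, 100)
theta0, theta1 = gradient_descent(x, y, alpha=0.5, iterations=2000)
print(theta0, theta1, compute_cost(theta0, theta1, x, y))
```

With a learning rate this small relative to the data scale, the recovered theta0 and theta1 land close to 2 and 3; making alpha much larger demonstrates the overshooting behavior described above.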

To sum up, we apply the gradient descent algorithm to minimize the cost function J(theta0, theta1) of the hypothesis function h_theta(x).
