jeffreyxiao
Hi! I'm a student at the University of Waterloo studying software engineering. I have an interest in large-scale distributed algorithms and infrastructure for data analytics. I also enjoy working with low-level systems.

Machine Learning Notes

Tuesday, February 16, 2016

Programming, Machine Learning

Introduction

My ongoing notes for machine learning.

Definitions

\begin{aligned}
\theta &= \text{weights/parameters} \\
x &= \text{features} \\
x^{(i)}, y^{(i)} &= \text{features and target value of the } i\text{-th training example} \\
J(\theta) &= \text{cost function} \\
h_\theta(x) &= \text{predicted value for features } x \\
n &= \text{number of weights} \\
m &= \text{number of training examples} \\
\alpha &= \text{learning rate} \\
\lambda &= \text{regularization constant}
\end{aligned}

Linear Regression

\begin{aligned}
h_\theta(x) &= \theta^Tx \\\\
J(\theta) &= \frac{1}{2m}\sum\limits_{i=1}^m(h_\theta(x^{(i)}) - y^{(i)})^2 \\\\
\theta &= \theta - \alpha\frac{1}{m}\sum\limits_{i=1}^m(h_\theta(x^{(i)}) - y^{(i)})x^{(i)} \\\\
\theta &= (X^TX)^{-1}X^Ty \\\\
J(\theta)_{reg} &= \frac{\lambda}{2m}\sum\limits_{j=1}^n\theta_j^2 \\\\
\theta_{reg} &= \frac{\lambda}{m}\theta
\end{aligned}
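The cost and update rule above can be sketched in plain Python. This is only a minimal illustration, not a reference implementation: the function names (`predict`, `cost`, `gradient_step`, `fit`) and the toy data are made up for this example, each feature vector is assumed to start with a constant 1 so that the first weight acts as the intercept, and the intercept is assumed to be left out of the regularization term.

```python
def predict(theta, x):
    """h_theta(x) = theta^T x for a single feature vector x."""
    return sum(t * xi for t, xi in zip(theta, x))


def cost(theta, X, y, lam=0.0):
    """J(theta) plus the regularization term (lambda / 2m) * sum(theta_j^2)."""
    m = len(X)
    squared_error = sum((predict(theta, xi) - yi) ** 2 for xi, yi in zip(X, y))
    # Assumption: theta[0] is the intercept and is not regularized.
    penalty = lam * sum(t ** 2 for t in theta[1:])
    return (squared_error + penalty) / (2 * m)


def gradient_step(theta, X, y, alpha, lam=0.0):
    """One simultaneous gradient descent update of every weight."""
    m = len(X)
    errors = [predict(theta, xi) - yi for xi, yi in zip(X, y)]
    new_theta = []
    for j, t in enumerate(theta):
        grad = sum(e * xi[j] for e, xi in zip(errors, X)) / m
        if j > 0:
            grad += (lam / m) * t  # gradient of the regularization term
        new_theta.append(t - alpha * grad)
    return new_theta


def fit(X, y, alpha=0.1, iterations=2000, lam=0.0):
    """Run gradient descent starting from all-zero weights."""
    theta = [0.0] * len(X[0])
    for _ in range(iterations):
        theta = gradient_step(theta, X, y, alpha, lam)
    return theta


# Toy data generated from y = 1 + 2x; the leading 1.0 is the intercept feature.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
theta = fit(X, y)
```

On this toy data the fitted weights should end up close to `[1, 2]`. The learning rate and iteration count were picked by hand for this data set and would need tuning elsewhere.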

Logistic Regression

\begin{aligned}
h_\theta(x) &= g(\theta^Tx) \text{ where } g(z) = \frac{1}{1 + e^{-z}} \\\\
J(\theta) &= -\frac{1}{m}\sum\limits_{i=1}^m\left[y^{(i)}\log(h_\theta(x^{(i)}))+(1-y^{(i)})\log(1-h_\theta(x^{(i)}))\right] \\\\
\theta &= \theta - \alpha\frac{1}{m}\sum\limits_{i=1}^m(h_\theta(x^{(i)}) - y^{(i)})x^{(i)} \\\\
J(\theta)_{reg} &= \frac{\lambda}{2m}\sum\limits_{j=1}^n\theta_j^2 \\\\
\theta_{reg} &= \frac{\lambda}{m}\theta
\end{aligned}
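A similar sketch works for logistic regression: the update rule has the same shape as for linear regression, and only the hypothesis (now passed through the sigmoid) and the cost change. As before, this is an illustration under assumptions rather than a definitive implementation. The names `sigmoid`, `predict_proba`, `cost`, and `fit` are hypothetical, each feature vector is assumed to start with a constant 1, the intercept is assumed not to be regularized, and the cost uses the standard cross-entropy scaling of 1/m.

```python
import math


def sigmoid(z):
    """g(z) = 1 / (1 + e^(-z)), split by sign to avoid overflow in exp."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(theta, x):
    """h_theta(x) = g(theta^T x)."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def cost(theta, X, y, lam=0.0):
    """Cross-entropy cost plus (lambda / 2m) * sum(theta_j^2)."""
    m = len(X)
    total = 0.0
    for xi, yi in zip(X, y):
        h = predict_proba(theta, xi)
        total += yi * math.log(h) + (1 - yi) * math.log(1 - h)
    # Assumption: theta[0] is the intercept and is not regularized.
    penalty = lam / (2 * m) * sum(t ** 2 for t in theta[1:])
    return -total / m + penalty


def fit(X, y, alpha=0.5, iterations=1000, lam=0.0):
    """Gradient descent using the same update shape as linear regression."""
    m = len(X)
    theta = [0.0] * len(X[0])
    for _ in range(iterations):
        errors = [predict_proba(theta, xi) - yi for xi, yi in zip(X, y)]
        new_theta = []
        for j, t in enumerate(theta):
            grad = sum(e * xi[j] for e, xi in zip(errors, X)) / m
            if j > 0:
                grad += (lam / m) * t  # gradient of the regularization term
            new_theta.append(t - alpha * grad)
        theta = new_theta
    return theta


# Toy data: negative x values are labelled 0, positive x values are labelled 1.
X = [[1.0, -2.0], [1.0, -1.5], [1.0, -1.0], [1.0, 1.0], [1.0, 1.5], [1.0, 2.0]]
y = [0, 0, 0, 1, 1, 1]
theta = fit(X, y)
```

With all-zero weights every prediction is 0.5, so the starting cost is log 2. After fitting, the cost on this toy data should be lower, and `predict_proba` should land above 0.5 for the positive examples and below 0.5 for the negative ones.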

Neural Networks

Support Vector Machines

Kernels

K-means Clustering

Recommender Systems
