
In our daily life, we can see many classification examples, such as:
- When we go to the market to buy food, we use classification: we distinguish the fruits, the vegetables, and the meats.
- When we take an examination and get a score, we also use classification. We may ask other people, "Did you get an A?" or "Have you passed the exam?" — these are classifications too.
However, how can a machine make classification decisions by itself? By training on a large amount of so-called "training data", the machine builds up a connection between the information it receives and the prediction it should make, and so it learns to classify.
In this article, I will describe "binary classification" problems first, and then move on to "multi-classification" problems.
Binary classification
It is easier to understand if we focus on "yes/no" rather than "what is it". For example, instead of asking "What is your score on the math exam?", we focus on "Have you passed the math exam?". The latter is exactly a "binary classification" problem.
General description
y ∈ {0, 1}, for an event A:
y = 1 if A happens, also called the positive class
y = 0 if A does not happen, also called the negative class
Hypothesis Representation - Sigmoid Function
When we talk about machine learning, we know there should be a hypothesis function, hθ(x).
In a classification problem, the linear function hθ(x) = θᵀx does not make sense, since it cannot separate the examples well and its value range is hθ(x) ∈ ℝ rather than [0, 1]. Therefore, a new hypothesis function is needed, called the "Sigmoid Function" or "Logistic Function".
hθ(x) = g(θᵀx), where z = θᵀx
g(z) = 1/(1 + e^(-z))
hθ(x) = 1/(1 + e^(-θᵀx))
It looks like the following figure, which can easily be drawn with the following Python code.

import numpy as np
import matplotlib.pyplot as plt

# Plot the sigmoid over [-10, 10]
x = np.linspace(-10, 10, 100)
y = 1 / (1 + np.exp(-x))
plt.plot(x, y)
plt.title("Sigmoid Function")
plt.show()
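Beyond the plot, the properties we rely on can be checked numerically. A minimal sketch (the helper name `sigmoid` is my own, not from the article):

```python
import numpy as np

def sigmoid(z):
    # g(z) = 1 / (1 + e^(-z))
    return 1 / (1 + np.exp(-z))

# g(0) is exactly 0.5, the decision threshold used below
print(sigmoid(0))

# Large positive/negative inputs saturate toward 1 and 0
print(sigmoid(10), sigmoid(-10))

# Symmetry: g(-z) = 1 - g(z)
print(sigmoid(-2.5), 1 - sigmoid(2.5))
```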
So we can see that the sigmoid function is bounded, with values in the open interval (0, 1) — exactly what we want for a binary classification problem. However, we still need a small trick to turn this function into a {0, 1} classification.
hθ(x) >= 0.5 → y = 1
hθ(x) < 0.5 → y = 0
This threshold of 0.5 defines what is also called the "Decision Boundary", which separates the data into the "1" and "0" classes.
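Putting the hypothesis and the threshold together gives a complete prediction rule. A minimal sketch, with hypothetical parameters chosen by me for illustration (the helper name `predict` is my own):

```python
import numpy as np

def predict(theta, X):
    # hθ(x) = g(θᵀx); predict 1 when hθ(x) >= 0.5, else 0
    h = 1 / (1 + np.exp(-X @ theta))
    return (h >= 0.5).astype(int)

# Hypothetical parameters; the first column of X is the bias term
theta = np.array([-3.0, 1.0])
X = np.array([[1.0, 2.0],   # θᵀx = -1 → hθ(x) < 0.5 → y = 0
              [1.0, 5.0]])  # θᵀx =  2 → hθ(x) >= 0.5 → y = 1
print(predict(theta, X))    # [0 1]
```

Note that hθ(x) >= 0.5 is equivalent to θᵀx >= 0, so the decision boundary is the set of points where θᵀx = 0.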
Cost function
If we still use the cost function form from linear regression, there is a problem. Here, we can plot the squared difference between y and hθ(x), i.e. (hθ(x) - y)², using Python.

from matplotlib import pyplot as plt
import numpy as np

x = np.linspace(-10, 10, 100)
y = 1 / (1 + np.exp(-x))

# Threshold the sigmoid output at 0.5 to get 0/1 labels
sigmoid_list = []
for sig_y in y:
    sigmoid_list.append(1 if sig_y >= 0.5 else 0)

# Squared difference between the sigmoid output and the 0/1 label
diff_list = []
for i in range(len(y)):
    diff_list.append((y[i] - sigmoid_list[i]) ** 2)

plt.figure(figsize=(10, 6))
ax1 = plt.subplot(131)
plt.title("Sigmoid Function")
plt.plot(x, y, color="r")
ax2 = plt.subplot(132)
plt.title("Classified Sigmoid Function")
plt.plot(x, sigmoid_list, color="g")
ax3 = plt.subplot(133)
plt.plot(x, diff_list, color="b")
plt.title("Square of difference")
plt.show()
The curve we get is non-convex. In order to get a convex curve, we use a new cost function form, called the Logistic Regression Cost Function.
cost(hθ(x), y) = -log(hθ(x)), if y = 1
cost(hθ(x), y) = -log(1 - hθ(x)), if y = 0
By doing this, we get a convex curve.
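The two branches are commonly combined into one formula, cost(hθ(x), y) = -y·log(hθ(x)) - (1-y)·log(1-hθ(x)), since one of the two terms always vanishes. A minimal sketch of the averaged cost over all examples (the helper name `logistic_cost` is my own):

```python
import numpy as np

def logistic_cost(h, y):
    # Combined form of the two branches above,
    # averaged over all training examples
    return np.mean(-y * np.log(h) - (1 - y) * np.log(1 - h))

# Predictions close to the true labels give a small cost;
# confident wrong predictions are penalized heavily
print(logistic_cost(np.array([0.9, 0.1]), np.array([1, 0])))  # small
print(logistic_cost(np.array([0.1, 0.9]), np.array([1, 0])))  # large
```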

We can get this figure by slightly changing the Python code:
from matplotlib import pyplot as plt
import numpy as np

x = np.linspace(-5, 5, 100)
y = 1 / (1 + np.exp(-x))

# Threshold the sigmoid output at 0.5 to get 0/1 labels
sigmoid_list = []
for sig_y in y:
    sigmoid_list.append(1 if sig_y >= 0.5 else 0)

# -log(hθ(x)) where the label is 1, -log(1 - hθ(x)) where it is 0
cost1 = []
cost2 = []
for i in range(len(sigmoid_list)):
    if sigmoid_list[i] == 1:
        cost1.append(-np.log(y[i]))
    else:
        cost2.append(-np.log(1 - y[i]))

plt.figure(figsize=(10, 6))
ax1 = plt.subplot(121)
plt.plot(x[50:], cost1)   # label 1 corresponds to x >= 0
plt.title("y = 1")
ax2 = plt.subplot(122)
plt.plot(x[:50], cost2)   # label 0 corresponds to x < 0
plt.title("y = 0")
plt.show()