machine learning basis: lagrange duality and kkt condition

Introduction

Duality and KKT condition are very important for machine learning, especially in SVM models. I’ll focus more on the high-level idea and the derivation of Lagrange Duality and how to introduce to KKT condition. There are some connceptions should be covered first.

Optimization Problems without Restrictions

The basic form of optimization problem without restrictions is just like finding the $x in Re^d$ makes that

A simple solution is to calculate the derivatives of $f(x)$, solve the equation and test if is the minimal value.

Lagrange Multiplier

Consider an Optimization Problem with equality restrictions.

Lagrange Multiplier is a method to solve this kind of problem. We can rewrite the objective function as . Then we can prove that the solution of is equal to the solution of previous problem. Here are called the Lagrange Multiplier. The new optimization problem is

And here the new function is called Lagrange function.

If there is any , the minimum will become $-infty$ due to the unrestricted , so we should add a restriction that which makes the solution finite and converge into .

Dual Problem

Let ,then there is a trivial theorem that

Here is the dual problem of .

Derivation

Transformation of Primal Problem

Assume are continuous functions on , then consider the restricted optimization problem.

We have already known that the primal problem without restrictions can be solved easily by calculating derivatives and testing. So our first step is to translate the primal problem into a problem without restrictions.

We have a enhanced Lagrange Function in form of . Here because the direction of has been restricted.

Define a new function , we can conclude that under all primal constraints.

Obviously, , so is an upper bound of . Then under all constraints, we have , then .

In conclusion, the primal problem has an equivalent form

KKT condition

We have already known the equivalent form of primal problem, but in this form we should still consider the constraints which makes the calculation too complicated. The next step is to find a simpler way of finding the best solution.

Consider the dual problem, a well property should be when is the best solution of primal problem.

Think back the transformation of primal problem, if dual problem is equal to the primal problem on $x = x^*$, the formula should be

Then consider the Lagrange condition of both inner optimizations which are and . This leads to and .

Then consider the parameter . There are two situations about . First is that the minimized point is of and the other is that the minimized point is of .

For the first case, the inequality constraint becomes a equality constraint. That is

For the second case, the inequality constraint disappears, that is

So combine two situations together we have . Then under this constraint, the becomes a regular Lagrange Function, which leads to a Lagrange Multiplier constraint that

So the final constraint becomes

This is the KKT condition.