It may be worth reading about Lagrangian duality and the broader (sometimes equivalent) relationship between:
- optimization subject to hard (i.e. inviolable) constraints, and
- optimization with penalties for violating those constraints.
Quick introduction to weak and strong duality
Suppose we have a function $f(x, y)$ of two variables. For any $\hat{x}$ and $\hat{y}$, we have:
$$\min_x f(x, \hat{y}) \le f(\hat{x}, \hat{y}) \le \max_y f(\hat{x}, y)$$
Since this holds for any $\hat{x}$ and $\hat{y}$, it also holds that:
$$\max_y \min_x f(x, y) \le \min_x \max_y f(x, y)$$
This is known as weak duality. In certain circumstances, you also have strong duality (also known as the saddle point property):
$$\max_y \min_x f(x, y) = \min_x \max_y f(x, y)$$
When strong duality holds, solving the dual problem also solves the primal problem. They're in a sense the same problem!
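If it helps to see the inequality concretely, here is a minimal numerical sketch (my own illustration, not from the original post), using the hypothetical function $f(x, y) = (x - y)^2$, which has a strict duality gap:

```python
# Numerical check of weak duality on a grid.
# Hypothetical example: f(x, y) = (x - y)^2, which has a duality gap.
import numpy as np

xs = np.linspace(-2.0, 2.0, 201)
ys = np.linspace(-2.0, 2.0, 201)
Xg, Yg = np.meshgrid(xs, ys, indexing="ij")  # F[i, j] = f(xs[i], ys[j])
F = (Xg - Yg) ** 2

max_min = F.min(axis=0).max()  # max over y of (min over x of f)
min_max = F.max(axis=1).min()  # min over x of (max over y of f)

print(max_min, min_max)  # 0.0 and 4.0: max-min <= min-max, strictly here
```

Here $\min_x f(x, y) = 0$ for every $y$ (pick $x = y$), while $\min_x \max_y f(x, y) = 4$ on this grid, so weak duality holds but strong duality fails for this particular $f$.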
Lagrangian for constrained Ridge Regression
Let me define the function L as:
$$L(b, \lambda) = \sum_{i=1}^n (y_i - x_i \cdot b)^2 + \lambda \left( \sum_{j=1}^p b_j^2 - t \right)$$
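Note that for any fixed $\lambda \ge 0$, minimizing $L$ over $b$ has a closed form, since the $-\lambda t$ term is constant in $b$. A minimal sketch (my own; `X`, `y`, and `lam` are hypothetical inputs):

```python
# Sketch of the inner problem min_b L(b, lam) for a fixed lam >= 0.
# The -lam * t term is constant in b, so the minimizer is the usual
# penalized Ridge solution b = (X'X + lam*I)^{-1} X'y.
import numpy as np

def ridge_minimizer(X, y, lam):
    """argmin_b  sum_i (y_i - x_i . b)^2 + lam * sum_j b_j^2"""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
```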
The min-max interpretation of the Lagrangian
The Ridge regression problem subject to hard constraints is:
$$\min_b \max_{\lambda \ge 0} L(b, \lambda)$$
You pick $b$ to minimize the objective, cognizant that after $b$ is picked, your opponent will send $\lambda$ off to infinity if you chose $b$ such that $\sum_{j=1}^p b_j^2 > t$. (If instead $\sum_{j=1}^p b_j^2 \le t$, their best choice is $\lambda = 0$, so the inner maximum recovers exactly the hard-constrained problem.)
If strong duality holds (which it does here because Slater's condition is satisfied for t>0), you then achieve the same result by reversing the order:
$$\max_{\lambda \ge 0} \min_b L(b, \lambda)$$
Here, your opponent chooses $\lambda$ first! You then choose $b$ to minimize the objective, already knowing their choice of $\lambda$. The $\min_b L(b, \lambda)$ part (taking $\lambda$ as given) is equivalent to the second, penalized form of your Ridge regression problem, since the constant $-\lambda t$ does not affect which $b$ minimizes.
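To make that correspondence concrete, here is a short sketch (my own, with simulated data): for any fixed $\lambda > 0$, the penalized minimizer $b(\lambda)$ also solves the hard-constrained problem with budget $t = \|b(\lambda)\|^2$, i.e. each $\lambda$ picks out the $t$ at which the constraint binds.

```python
# Sketch of the lambda <-> t correspondence (simulated data, my own example).
# For a fixed lam > 0, compute the penalized Ridge solution b(lam); it solves
# the hard-constrained problem whose budget is t = ||b(lam)||^2.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.standard_normal(50)

lam = 3.0
b_lam = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
t = float(b_lam @ b_lam)  # the constraint level at which this lambda binds
print(f"lambda = {lam}, corresponding t = {t:.4f}")
```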
As you can see, this isn't a result particular to Ridge regression. It is a broader concept.
References
(I started this post following an exposition I read from Rockafellar.)
Rockafellar, R. T. (1970). Convex Analysis. Princeton University Press.
You might also examine lectures 7 and 8 from Prof. Stephen Boyd's course on convex optimization.