
Leave-one-out bounds for support vector ordinal regression machine

Original Article, Neural Computing and Applications

Abstract

The success of a support vector machine depends upon its parameters. The leave-one-out (LOO) method provides a quantitative criterion for selecting those parameters. However, one shortcoming of the LOO method is that it is highly time-consuming. An effective remedy is to approximate the LOO error by an upper bound. This paper is concerned with the support vector ordinal regression machine (SVORM). Two bounds on the LOO error for SVORM are presented. The first bound is based on the geometrical concept of a span. The second is based on the concept of support vectors. Preliminary numerical experiments show the validity of the bounds.



Acknowledgments

We would like to thank the anonymous reviewers for their very concrete and helpful comments, which greatly improved this paper.

Author information


Correspondence to Naiyang Deng.

Additional information

This work is supported by the Key Project of National Natural Science Foundation of China (no. 10631070), the National Natural Science Foundation of China (no. 10801112, 70601033, 10601064) and the China Postdoctoral Science Foundation funded project (no. 20080430573).

Appendix


1.1 Proof of Lemma 1

Proof

We only prove that the set \(\Uplambda_p^q\) is non-empty since the corresponding result for the set \(\Uplambda_p^{*q}\) can be shown similarly.

Let us define \(\Uplambda_p^{q+}\) as the subset of \(\Uplambda_p^q\) with the additional constraint \(\lambda_i^j\geq 0\):

$$ \Uplambda_p^{q+}=\left\{\sum_{i\in M_p^q(\alpha,q)}\lambda_i^q\hbox{x}_i^q+\sum_{i\in M_p^q(\alpha^*,q+1)}\lambda_i^{q+1}\hbox{x}_i^{q+1}\in \Uplambda_p^q, \lambda_i^j\geq 0,j=q,q+1\right\}. $$
(42)

Next we shall show that the set \(\Uplambda_p^q\) is non-empty by proving that the subset \(\Uplambda_p^{q+}\neq \varnothing\). In order to prove that \(\Uplambda_p^{q+}\neq \varnothing\), it is sufficient to prove that there exists a vector λ such that

$$ \lambda_i^q=\mu{\frac{C-\alpha_i^q}{\alpha_p^q}},\quad i\in M_p^q(\alpha,q), $$
(43)
$$ \lambda_i^{q+1}=\mu{\frac{\alpha_i^{q+1}}{\alpha_p^q}},\quad i\in M_p^q(\alpha^*,q+1), $$
(44)
$$ 0\leq \mu\leq 1, $$
(45)
$$ \sum_{i\in M_p^q (\alpha,q)}\lambda_i^q+\sum_{i\in M_p^q(\alpha^*,q+1)}\lambda_i^{q+1}=1, $$
(46)

because it is straightforward to show that when a vector λ satisfies (43)–(46), we have

$$ \sum\limits_{i\in M_p^q(\alpha,q)}\lambda_i^q\hbox{x}_i^q+\sum\limits_{i\in M_p^q(\alpha^*,q+1)}\lambda_i^{q+1}\hbox{x}_i^{q+1}\in \Uplambda_p^{q+}, $$
(47)

and therefore \(\Uplambda_p^{q+}\neq \varnothing\). Now we prove that a vector λ satisfying (43)–(46) exists. Taking into account equations (43) and (44), we rewrite constraint (46) as

$$ 1={\frac{\mu}{\alpha_p^q}}\left[\sum_{i\in M_p^q(\alpha,q)}(C-\alpha_i^q)+\sum_{i\in M_p^q(\alpha^*,q+1)}\alpha_i^{q+1}\right]. $$
(48)

Thus, it is sufficient to show that the value of μ given by equation (48) satisfies constraint (45). For this purpose, define

$$ \Updelta=\sum_{i\in M(\alpha,q)}(C-\alpha_i^q)+\sum_{i\in M(\alpha^*,q+1)}\alpha_i^{q+1}, $$
(49)
$$ =\sum_{i\in M(\alpha,q)}C-\sum_{i\in M(\alpha,q)}\alpha_i^q+\sum_{i\in M(\alpha^*,q+1)}\alpha_i^{q+1}. $$
(50)

Noting that

$$ \begin{aligned} 0&=\sum_{i\in M(\alpha^*,q+1)}\alpha_i^{q+1}+\sum_{i\in N(\alpha^*,q+1)}\alpha_i^{q+1}-\sum_{i\in M(\alpha,q)}\alpha_i^{q}-\sum_{i\in N(\alpha,q)}\alpha_i^{q}\\ &=\sum_{i\in M(\alpha^*,q+1)}\alpha_i^{q+1}-\sum_{i\in M(\alpha,q)}\alpha_i^{q}+\sum_{i\in N(\alpha^*,q+1)}C-\sum_{i\in N(\alpha,q)}C, \end{aligned} $$
(51)

and combining equation (50) and (51), we get

$$ \Updelta=\sum_{i\in M(\alpha,q)}C-\sum_{i\in N(\alpha^*,q+1)}C+\sum_{i\in N(\alpha,q)}C=C\tau, $$
(52)

where τ is an integer.

Since equation (49) gives Δ > 0, the integer τ is at least 1, and we have

$$ \Updelta\geq C. $$
(53)

Rewrite equation (48) as

$$ 1={\frac{\mu}{\alpha_p^q}}(\Updelta-(C-\alpha_p^q)), $$

or

$$ \mu={\frac{\alpha_p^q}{\Updelta-(C-\alpha_p^q)}}. $$

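Taking into account inequality (53), the denominator satisfies \(\Updelta-(C-\alpha_p^q)\geq \alpha_p^q>0\), so we finally get 0 ≤ μ ≤ 1. Thus the set \(\Uplambda_p^{q+}\) is not empty, and hence the set \(\Uplambda_p^q\) is not empty. \(\square\)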
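The construction in (43)–(48) can be checked numerically. The following is a minimal sketch, assuming toy multiplier values chosen only for illustration (they are not taken from a solved SVORM dual); the function name `lemma1_weights` and its arguments are hypothetical.

```python
def lemma1_weights(C, alpha_p, alpha_q, alpha_q1):
    """Return (mu, lam_q, lam_q1) following (43), (44) and (48).

    alpha_q  : multipliers alpha_i^q for i in M_p^q(alpha, q)
    alpha_q1 : multipliers alpha_i^{q+1} for i in M_p^q(alpha*, q+1)
    """
    if alpha_p <= 0:
        raise ValueError("alpha_p must be positive for a support vector")
    # Bracket of (48): sum of (C - alpha_i^q) plus sum of alpha_i^{q+1}.
    bracket = sum(C - a for a in alpha_q) + sum(alpha_q1)
    if bracket <= 0:
        raise ValueError("the bracket in (48) must be positive")
    mu = alpha_p / bracket                                  # solves (48) for mu
    lam_q = [mu * (C - a) / alpha_p for a in alpha_q]       # (43)
    lam_q1 = [mu * a / alpha_p for a in alpha_q1]           # (44)
    return mu, lam_q, lam_q1


# Toy values (assumed for illustration only).
mu, lam_q, lam_q1 = lemma1_weights(C=1.0, alpha_p=0.5,
                                   alpha_q=[0.2, 0.7], alpha_q1=[0.4])
total = sum(lam_q) + sum(lam_q1)   # should equal 1 by constraint (46)
```

Whether μ ≤ 1 holds depends on the equality constraint of the dual problem, which the toy values do not enforce; the sketch only illustrates that (43), (44) and (48) together imply (46) with non-negative weights.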

1.2 Proofs of Lemma 2 and Lemma 3

We only give the proof of Lemma 2 since the proof of Lemma 3 is similar.

Proof

Recalling (10), the vector w can be expressed as

$$ \hbox{w}=\sum_{j\neq q}\sum_{i=1}^{l^j}(\alpha_i^{*j}-\alpha_i^j)\hbox{x}_i^j+\sum_{i\neq p}(\alpha_i^{*q}-\alpha_i^q)\hbox{x}_i^q+(\alpha^{*q}_p-\alpha_p^q)\hbox{x}_p^q. $$
(54)

Now we replace the support vector \(\hbox{x}_p^q\) with a linear combination of the remaining margin support vectors about α among the points of class q and the margin support vectors about α* among the points of class q + 1; this gives:

$$ \hbox{x}_p^q\approx \sum\limits_{i\in M_p^q(\alpha,q)}\lambda_i^q\hbox{x}_i^q+\sum\limits_{i\in M_p^{q}(\alpha^*,q+1)}\lambda_i^{q+1}\hbox{x}_i^{q+1}=\tilde{\hbox{x}}_p^q. $$

Making this replacement yields an approximate expression for w:

$$ \begin{aligned} \tilde{\hbox{w}}&=\sum_{j\neq q}\sum_{i=1}^{l^j}(\alpha_i^{*j}-\alpha_i^j)\hbox{x}_i^j+\sum_{i\neq p}(\alpha_i^{*q}-\alpha_i^q)\hbox{x}_i^q+(\alpha^{*q}_p-\alpha_p^q)\left[\sum\limits_{i\in M_p^q(\alpha,q)}\lambda_i^q\hbox{x}_i^q+\sum\limits_{i\in M_p^{q}(\alpha^*,q+1)}\lambda_i^{q+1}\hbox{x}_i^{q+1}\right], \\ &=\sum_{j\neq q,q+1}\sum_{i=1}^{l^j}(\alpha_i^{*j}-\alpha_i^j)\hbox{x}_i^j+\sum_{i\notin M_p^q(\alpha,q),i\neq p}(\alpha_i^{*q}-\alpha_i^{q})\hbox{x}_i^{q}+\sum_{i\notin M_p^q(\alpha^*,q+1)}(\alpha_i^{*q+1}-\alpha_i^{q+1})\hbox{x}_i^{q+1}\\ &\quad+\sum_{i\in M _p^q(\alpha,q)}\left[\underbrace{(\alpha_i^{*q}-\alpha_i^q)+(\alpha_p^{*q}-\alpha_p^q)\lambda_i^q}_{\tilde{\alpha}_i^{*q}-\tilde{\alpha}_i^q}\right]\hbox{x}_i^q +\sum_{i\in M _p^q(\alpha^*,q+1)}\left[\underbrace{(\alpha_i^{*q+1}-\alpha_i^{q+1})+(\alpha_p^{*q}-\alpha_p^{q})\lambda_i^{q+1}}_{\tilde{\alpha}_i^{*q+1}-\tilde{\alpha}_i^{q+1}}\right]\hbox{x}_i^{q+1}. \end{aligned} $$

Setting

$$ \begin{aligned} \,&\tilde{\alpha}_i^{q}=\alpha_i^{q}+\lambda_i^{q}\alpha_p^{q},\ \tilde{\alpha}_i^{*q}=\alpha_i^{*q}+\lambda_i^{q}\alpha_p^{*q},\quad i\in M_p^q(\alpha,q), \\ \,&\tilde{\alpha}_i^{q+1}=\alpha_i^{q+1}-\lambda_i^{q+1}\alpha_p^{*q},\quad \tilde{\alpha}_i^{*q+1}=\alpha_i^{*q+1}-\lambda_i^{q+1}\alpha_p^{q},\quad i\in M_p^q(\alpha^*,q+1), \\ \,&\tilde{\alpha}_i^q=\alpha_i^q,\tilde{\alpha}_i^{*q}=\alpha_i^{*q},\quad i\notin M_p^q(\alpha,q),\\ \,&\tilde{\alpha}_i^{q+1}=\alpha_i^{q+1},\tilde{\alpha}_i^{*q+1}=\alpha_i^{*q+1},\quad i\notin M_p^q(\alpha^*,q+1), \\ \,&\tilde{\alpha}_i^j=\alpha_i^j,\tilde{\alpha}_i^{*j}=\alpha_i^{*j},\quad j=1,\ldots,q-1,q+2,\ldots,k,\quad i=1,\ldots,l^j, \end{aligned} $$

we have

$$ \begin{aligned} &0\leq \alpha_i^{q}+\lambda_i^q\alpha_p^{q}\leq C,\quad 0\leq \alpha_i^{*q}+\lambda_i^{q}\alpha_p^{*q}\leq C, \quad i\in M _p^q(\alpha,q), \\ &0\leq \alpha_i^{q+1}-\lambda_i^{q+1}\alpha_p^{*q}\leq C,\quad 0\leq\alpha_i^{*q+1}-\lambda_i^{q+1}\alpha_p^{q}\leq C ,\quad i\in M _p^q(\alpha^*,q+1), \\ &\sum_{i\in M _p^q(\alpha,q)}\tilde{\alpha}_i^q+\sum_{i\in N _p^q(\alpha,q)}\alpha_i^q=\sum_{i\in M _p^q(\alpha^*,q+1)}\tilde{\alpha}_i^{*q+1}+\sum_{i\in N _p^q(\alpha^*,q+1)}\alpha_i^{*q+1}. \end{aligned} $$

The above equalities imply that

$$ \sum_{i\in V_p^q(\alpha,q)}\alpha_i^{q}+\alpha_p^q\sum_{i\in M_p^q(\alpha,q)}\lambda_i^{q}=\sum_{i\in V _p^q(\alpha^*,q+1)}\alpha_i^{*q+1}-\alpha_p^{q}\sum_{i\in M _p^q(\alpha^*,q+1)}\lambda_i^{q+1}. $$
(55)

According to the constraint (8) of the dual problem (7)–(9) again, we get

$$ \sum\limits_{i\in M_p^q(\alpha,q)}\lambda_i^{q}+\sum\limits_{i\in M_p^q(\alpha^*,q+1)}\lambda_i^{q+1}=1, \lambda_p^q=-1. $$
(56)

So, by (55) and (56), \(\tilde{\alpha}^{(*)}\) is a feasible solution of dual problem (7)–(9) for the training set \({T_p^q=T\setminus \{(\hbox{x}_p^q,y_p^q)\}.}\) \(\square\)

1.3 Proof of Lemma 4

Proof

1. First, let the left-out point \(\hbox{x}_p^q\) be a margin support vector about α, and consider the following optimization problem

$$ W(\alpha^{(*)}):=\max_{\alpha^{(*)}}\quad\sum_{j,i}(\alpha_i^j+\alpha_i^{*j})-\frac{1}{2}\sum_{j,i}\sum_{j',i'}(\alpha_i^{*j}-\alpha_i^j)(\alpha_{i'}^{*j'}-\alpha_{i'}^{j'})(\hbox{x}_i^j\cdot \hbox{x}_{i'}^{j'}), $$
(57)
$$ \hbox{s.t.}\quad\sum_{i=1}^{l^j}\alpha_i^j=\sum_{i=1}^{l^{j+1}}\alpha_i^{*j+1},\quad j=1,2,\ldots,k-1, $$
(58)
$$0\leq \alpha_i^j,\alpha_i^{*j}\leq C,\quad j=1,\ldots,k,\quad i=1,\ldots,l^j, $$
(59)
$$\alpha_p^q=\alpha_p^{*q}=0. $$
(60)

Assuming that \(\alpha_p^{(*)q}\) is the optimal solution of the optimization problem (57)–(60) and \(\alpha^{(*)}\) is the optimal solution of the dual problem (7)–(9), the following inequalities hold:

$$ W(\alpha_p^{(*)q})\geq W(\alpha^{(*)}-\delta^{(*)}), $$
(61)
$$ W(\alpha^{(*)})\geq W(\alpha_p^{(*)q}+\gamma^{(*)}), $$
(62)

where δ(*) satisfies the following conditions:

$$ \begin{aligned} \,&0\leq \alpha_i^j-\delta_i^j\leq C, \quad 0\leq \alpha_i^{*j}-\delta_i^{*j}\leq C, \quad j=1,\ldots,k,\quad i=1,\ldots,l^j, \\ \,&\sum_{i=1}^{l^j}\delta_i^j=\sum_{i=1}^{l^{j+1}}\delta_i^{*j+1},\quad j=1,2,\ldots,k-1, \\ &\delta_i^{*1}=0,\quad i=1,2,\ldots,l^1,\delta_i^k=0,\quad i=1,2,\ldots,l^k, \\ \,&\delta_p^q=\alpha_p^q,\quad \delta_p^{*q}=\alpha_p^{*q}, \end{aligned} $$

and γ(*) satisfies the following conditions:

$$ 0\leq \alpha_{pi}^{qj}+\gamma_i^j\leq C,\quad 0\leq \alpha_{pi}^{*qj}+\gamma_i^{*j}\leq C,\quad j=1,\ldots,k,\quad i=1,\ldots,l^j, $$
(63)
$$ \sum_{i=1}^{l^q}\gamma_i^q=\sum_{i=1}^{l^{q+1}}\gamma_i^{*q+1}, $$
(64)
$$ \alpha_i^q=0\Rightarrow \gamma_i^q=0,\quad i\neq p,\quad \alpha_i^{*q+1}=0\Rightarrow \gamma_i^{*q+1}=0, $$
(65)
$$ \gamma_i^j=0,j\neq q,i=1,\ldots,l^j, \quad \gamma_i^{*j}=0,\quad j\neq q+1,\quad i=1,\ldots,l^j, $$
(66)
$$ \gamma_i^{*1}=0,\quad i=1,2,\ldots,l^1,\quad \gamma_i^k=0,\quad i=1,2,\ldots,l^k. $$
(67)

From inequality (61) we obtain

$$ W(\alpha^{(*)})-W(\alpha_p^{(*)q})\leq W(\alpha^{(*)})-W(\alpha^{(*)}-\delta^{(*)}). $$
(68)

Combining inequality (68) with (62), we get

$$ I_1=W(\alpha_p^{(*)q}+\gamma^{(*)})-W(\alpha_p^{(*)q})\leq W(\alpha^{(*)})-W(\alpha^{(*)}-\delta^{(*)})=I_2. $$
(69)

Next we calculate both the left-hand side \(I_1\) and the right-hand side \(I_2\) of inequality (69). First, for \(I_1\), we have

$$ \begin{aligned} I_1&=W(\alpha_p^{(*)q}+\gamma^{(*)})-W(\alpha_p^{(*)q})\\ &=\sum_{j,i}(\alpha_{pi}^{qj}+\gamma_i^j+\alpha_{pi}^{*qj}+\gamma_i^{*j})-\frac{1}{2}\sum_{j,i}\sum_{j',i'}(\alpha_{pi}^{*qj}+\gamma_i^{*j}-\alpha_{pi}^{qj}-\gamma_i^j)(\alpha_{pi'}^{*qj'}+\gamma_{i'}^{*j'}\\ &-\alpha_{pi'}^{qj'}-\gamma_{i'}^{j'})(\hbox{x}_i^j\cdot \hbox{x}_{i'}^{j'})-\sum_{j,i}(\alpha_{pi}^{qj}+\alpha_{pi}^{*qj})+\frac{1}{2}\sum_{j,i}\sum_{j',i'}(\alpha_{pi}^{*qj}-\alpha_{pi}^{qj})(\alpha_{pi'}^{*qj'}-\alpha_{pi'}^{qj'})(\hbox{x}_i^j\cdot \hbox{x}_{i'}^{j'}),\\ &=\sum_{j,i}(\gamma_i^j+\gamma_i^{*j})-\sum_{j,i}\sum_{j',i'}(\alpha_{pi}^{*qj}-\alpha_{pi}^{qj})(\gamma_{i'}^{*j'}-\gamma_{i'}^{j'})(\hbox{x}_i^j\cdot \hbox{x}_{i'}^{j'})-\frac{1}{2}\sum_{j,i}\sum_{j',i'}(\gamma_i^{*j}-\gamma_i^j)(\gamma_{i'}^{*j'}-\gamma_{i'}^{j'})(\hbox{x}_i^j\cdot \hbox{x}_{i'}^{j'}),\\ &=\sum_{j,i}(\gamma_i^j+\gamma_i^{*j})-\sum_{j,i}(\gamma_{i}^{*j}-\gamma_{i}^{j})(w_p^q\cdot \hbox{x}_i^j)-\frac{1}{2}\sum_{j,i}\sum_{j',i'}(\gamma_i^{*j}-\gamma_i^j)(\gamma_{i'}^{*j'}-\gamma_{i'}^{j'})(\hbox{x}_i^j\cdot \hbox{x}_{i'}^{j'}). \end{aligned} $$

According to equalities (63)–(67), we rewrite the above expression as

$$ \begin{aligned} I_1&=\sum_{i=1}^{l^{q+1}}\gamma_i^{*q+1}[1-(w_p^q\cdot \hbox{x}_i^{q+1})]+\sum_{i=1}^{l^{q}}\gamma_i^{q}[1+(w_p^q\cdot \hbox{x}_i^q)] -\frac{1}{2}\sum_{j,i}\sum_{j',i'}(\gamma_i^{*j}-\gamma_i^j)(\gamma_{i'}^{*j'}-\gamma_{i'}^{j'})(\hbox{x}_i^j\cdot \hbox{x}_{i'}^{j'}),\\ &=\sum_{i=1}^{l^{q+1}}\gamma_i^{*q+1}[1-(w_p^q\cdot \hbox{x}_i^{q+1})+b'_q]+\sum_{i=1}^{l^{q}}\gamma_i^{q}[1+(w_p^q\cdot \hbox{x}_i^q)-b'_q] \\ &\quad -\frac{1}{2}\sum_{j,i}\sum_{j',i'}(\gamma_i^{*j}-\gamma_i^j)(\gamma_{i'}^{*j'}-\gamma_{i'}^{j'})(\hbox{x}_i^j\cdot \hbox{x}_{i'}^{j'}). \end{aligned} $$

Taking into account that γ(*) satisfies the conditions (63)–(67), the following two equalities hold:

$$ \begin{aligned} \,&\gamma_i^{*q+1}[1-(w_p^q\cdot \hbox{x}_i^{q+1})+b'_q]=0, \\ \,&\gamma_i^{q}[1+(w_p^q\cdot \hbox{x}_i^q)-b'_q]=0,\quad i\neq p. \end{aligned} $$

So we obtain

$$ I_1=\gamma_p^{q}[1+(w_p^q\cdot \hbox{x}_p^q)-b'_q]-\frac{1}{2}\sum_{j,i}\sum_{j',i'}(\gamma_i^{*j}-\gamma_i^j)(\gamma_{i'}^{*j'}-\gamma_{i'}^{j'})(\hbox{x}_i^j\cdot \hbox{x}_{i'}^{j'}). $$
(70)

Now let us define vector γ(*) as follows:

$$ \gamma_p^q=\gamma_s^{*q+1}=a, $$
(71)
$$ \gamma_i^j=0,\quad j= q,\quad i\neq p, \gamma_i^{*j}=0,\quad j= q+1,\quad i\neq s, $$
(72)
$$ \gamma_i^j=0,\quad j\neq q,\quad i=1,\ldots,l^j,\quad \gamma_i^{*j}=0,\quad j\neq q+1,\quad i=1,\ldots,l^j, $$
(73)

where a is some constant and \(\alpha_s^{*q+1}\in (0, C)\). Substituting (71)–(73) into (70), we get

$$ \begin{aligned} I_1&=a\left[(w_p^q\cdot \hbox{x}_p^q)-b'_q+1\right]-{\frac{a^2} 2}\|\hbox{x}_p^{q}-\hbox{x}_s^{q+1}\|^2\\ &\geq a\left[(w_p^q\cdot \hbox{x}_p^q)-b'_q+1\right]-{\frac{a^2} 2}D_{q,q+1}^2, \end{aligned} $$
(74)

where \(D_{q,q+1}\) is the diameter of the minimum sphere containing the points of class q and the points of class q + 1 in the training set T. Now choose the value a* that maximizes the right-hand side of (74):

$$ a^*=\frac{\left[(w_p^q\cdot \hbox{x}_p^q)-b'_q+1\right]}{D_{q,q+1}^2}. $$

Putting this expression back into (74), we get

$$ I_1\geq \frac{{\left[(w_p^q\cdot \hbox{x}_p^q)-b'_q+1\right]^2}}{2D_{q,q+1}^2}. $$

Since, according to our assumption, the LOO procedure commits an error at the point \(\hbox{x}_p^q\), the following inequality holds

$$ (w_p^q\cdot \hbox{x}_p^q)-b'_q > 0, $$

and we obtain

$$ I_1\geq \frac{1}{2D_{q,q+1}^2}. $$

But we need to fulfill the condition a ≤ C. Thus, if a* > C, we replace a by C in inequality (74) and we get

$$ \begin{aligned} I_1&\geq C\left[(w_p^q\cdot \hbox{x}_p^q)-b'_q+1\right]-\frac{{C^2}}{2}D_{q,q+1}^2\\ &=CD_{q,q+1}^2\left(a^*-\frac{C}{2}\right)\\ &\geq CD_{q,q+1}^2\frac{{a^*}}{2}\\ &=\frac{C}{2}\left[(w_p^q\cdot \hbox{x}_p^q)-b'_q+1\right]\\ &\geq \frac{{C}}{2}. \end{aligned} $$

Finally, we have

$$ I_1\geq \frac{1}{ 2}\min\left(C,\frac{1}{D_{q,q+1}^2}\right). $$
(75)
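The two cases behind (75) amount to maximizing a concave quadratic in a over the interval [0, C]. A minimal sketch, assuming the bracketed quantity in (74) is passed in as a single number `t` (with t ≥ 1 when an LOO error occurs); the function name and argument names are illustrative only.

```python
def step_lower_bound(t, d2, C):
    """Maximise a*t - (a**2 / 2) * d2 over 0 <= a <= C.

    t  : the bracketed quantity in (74), assumed positive
    d2 : squared diameter D_{q,q+1}^2, assumed positive
    C  : upper bound on the multipliers
    Returns the chosen step a and the value of the lower bound on I_1.
    """
    a_star = t / d2            # unconstrained maximiser
    a = min(a_star, C)         # clip to the feasible range a <= C
    value = a * t - 0.5 * a * a * d2
    return a, value


# Unclipped case: a* = 0.25 <= C.
a1, v1 = step_lower_bound(t=1.0, d2=4.0, C=10.0)
# Clipped case: a* = 0.25 > C = 0.1.
a2, v2 = step_lower_bound(t=1.0, d2=4.0, C=0.1)
```

In both toy cases the returned value is at least ½·min(C, 1/d2), which is the content of (75).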

Now we estimate the right-hand side \(I_2\) of inequality (69). According to Lemma 2, we choose

$$ \begin{aligned} \,&\delta_i^{q}=-\lambda_i^{q}\alpha_p^{q},\quad \delta_i^{*q}=-\lambda_i^{q}\alpha_p^{*q},\quad i\in M_p^q(\alpha,q), \\ \,&\delta_i^{q+1}=\lambda_i^{q+1}\alpha_p^{*q},\quad \delta_i^{*q+1}=\lambda_i^{q+1}\alpha_p^{q},\quad i\in M_p^q(\alpha^*,q+1), \\ \,&\delta_i^{q}=\delta_i^{*q}=0,\quad i\notin M_p^q(\alpha,q), \\ \,&\delta_i^{q+1}=\delta_i^{*q+1}=0,\quad i\notin M_p^q(\alpha^*,q+1), \\ \,&\delta_i^{j}=\delta_i^{*j}=0, \quad j\neq q,q+1,\quad i=1,\ldots,l^j, \\ \,& \delta_p^q=\alpha_p^q,\quad \delta_p^{*q}=\alpha_p^{*q}, \end{aligned} $$

where

$$ \sum\limits_{i\in M_p^q(\alpha,q)}\lambda_i^q\hbox{x}_i^q+\sum\limits_{i\in M_p^{q}(\alpha^*,q+1)}\lambda_i^{q+1}\hbox{x}_i^{q+1}\in \Uplambda_p^q. $$

Then the right-hand side \(I_2\) of inequality (69) is expressed as

$$ \begin{aligned} I_2&=W(\alpha^{(*)})-W(\alpha^{(*)}-\delta^{(*)}) \\ &=\sum_{j,i}(\alpha_i^j+\alpha_i^{*j})-\frac{1}{2}\sum_{j,i}\sum_{j',i'}(\alpha_i^{*j}-\alpha_i^j)(\alpha_{i'}^{*j'}-\alpha_{i'}^{j'})(\hbox{x}_i^j\cdot \hbox{x}_{i'}^{j'})-\sum_{j,i}(\alpha_i^j-\delta_i^j+\alpha_i^{*j}-\delta_i^{*j}) \\ &\quad+\frac{1}{2}\sum_{j,i}\sum_{j',i'}(\alpha_i^{*j}-\delta_i^{*j}-\alpha_i^j+\delta_i^j)(\alpha_{i'}^{*j'}-\delta_{i'}^{*j'}-\alpha_{i'}^{j'}+\delta_{i'}^{j'})(\hbox{x}_i^j\cdot \hbox{x}_{i'}^{j'}) \\ &=\sum_{j,i}(\delta_i^j+\delta_i^{*j})-\sum_{j,i}\sum_{j',i'}(\alpha_i^{*j}-\alpha_i^j)(\delta_{i'}^{*j'}-\delta_{i'}^{j'})(\hbox{x}_i^j\cdot \hbox{x}_{i'}^{j'})+\frac{1}{2}\sum_{j,i}\sum_{j',i'}(\delta_i^{*j}-\delta_i^j)(\delta_{i'}^{*j'}-\delta_{i'}^{j'})(\hbox{x}_i^j\cdot \hbox{x}_{i'}^{j'}) \\ &=-(\alpha_p^q+\alpha_p^{*q})\left[\sum_{i\in M(\alpha,q)\cup \{p\}}\lambda_i^{q}-\sum_{i\in M(\alpha^*,q+1)}\lambda_i^{q+1}\right]\\ &\quad+(\alpha_p^q+\alpha_p^{*q})\left[\sum_{i\in M(\alpha,q)\cup \{p\}}\lambda_i^{q}(w\cdot \hbox{x}_i^{q})+\sum_{i\in M(\alpha^*,q+1)}\lambda_i^{q+1}(w\cdot \hbox{x}_i^{q+1})\right]\\ &\quad+\frac{{(\alpha_p^{*q}-\alpha_p^q)^2}}{2}\left\|\hbox{x}_p^q-\sum\limits_{i\in M_p^q(\alpha,q)}\lambda_i^{q}\hbox{x}_i^{q}-\sum\limits_{i\in M_p^{q}(\alpha^*,q+1)}\lambda_i^{q+1}\hbox{x}_i^{q+1}\right\|^2. \end{aligned} $$

Since the definition of \(\Uplambda_p^q\) implies that

$$ \sum_{i\in M_p^q(\alpha,q)}\lambda_i^{q}+ \sum_{i\in M_p^q(\alpha^*,q+1)}\lambda_i^{q+1}=1, \lambda_p^q=-1, $$
(76)

we have

$$ \begin{aligned} I_2&=-(\alpha_p^q+\alpha_p^{*q})\sum_{i\in M_p^q(\alpha,q)\cup \{p\}}\lambda_i^{q}[1-(w\cdot \hbox{x}_i^{q})]\\ &\quad+(\alpha_p^q+\alpha_p^{*q})\sum_{i\in M(\alpha^*,q+1)}\lambda_i^{q+1}[1+(w\cdot \hbox{x}_i^{q+1})]+\frac{{(\alpha_p^{*q}-\alpha_p^q)^2}}{2}S^2(p,q),\\ &=-(\alpha_p^q+\alpha_p^{*q})\sum_{i\in M_p^q(\alpha,q)\cup \{p\}}\lambda_i^{q}[1-(w\cdot \hbox{x}_i^{q})+b_q]\\ &\quad+(\alpha_p^q+\alpha_p^{*q})\sum_{i\in M(\alpha^*,q+1)}\lambda_i^{q+1}[1+(w\cdot \hbox{x}_i^{q+1})-b_q]+\frac{{(\alpha_p^{*q}-\alpha_p^q)^2}}{2}S^2(p,q),\\ &=\frac{{(\alpha_p^{*q}-\alpha_p^q)^2}}{2}S^2(p,q). \end{aligned} $$
(77)

Combining inequality (69) with (75) and (77), we obtain

$$ (\alpha_p^{*q}-\alpha_p^q)^2S^2(p,q)\geq\min \left(C,\frac{1}{D_{q,q+1}^2}\right), $$
(78)

where \(D_{q,q+1}\) is the diameter of the minimum sphere containing both the points of class q and the points of class q + 1 in the training set T.

2. Similarly to the derivation of (78), when the left-out point \(\hbox{x}_p^q\) is a margin support vector about α*, we get the following inequality:

$$ (\alpha_p^{*q}-\alpha_p^q)^2S^{*2}(p,q)\geq\min \left(C,\frac{1}{D_{q-1,q}^2}\right), $$
(79)

where \(D_{q-1,q}\) is the diameter of the minimum sphere containing the points of class q − 1 and the points of class q in the training set T. \(\square\)
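Since inequalities (78) and (79) are derived under the assumption that the LOO procedure errs at the left-out margin support vector, counting the points that satisfy them gives an upper estimate of the number of LOO errors among those points. A minimal sketch, assuming the spans and squared diameters have already been computed; the function name and the tuple layout are illustrative and not taken from the paper.

```python
def count_span_candidates(points, C):
    """Count margin support vectors satisfying (78) or (79).

    points : iterable of (alpha, alpha_star, s2, d2) tuples, where
             s2 is S^2(p,q) (or S^{*2}(p,q)) and d2 is the matching
             squared diameter D_{q,q+1}^2 (or D_{q-1,q}^2).
    C      : upper bound on the multipliers.
    """
    count = 0
    for alpha, alpha_star, s2, d2 in points:
        lhs = (alpha_star - alpha) ** 2 * s2
        rhs = min(C, 1.0 / d2)
        if lhs >= rhs:
            count += 1   # an LOO error at this point is not excluded
    return count


# Toy values (assumed for illustration only).
toy = [
    (0.5, 0.0, 2.0, 4.0),   # 0.25 * 2.0 = 0.5  >= 0.25 -> counted
    (0.1, 0.0, 1.0, 4.0),   # 0.01 * 1.0 = 0.01 <  0.25 -> excluded
    (0.0, 0.0, 2.0, 4.0),   # not a support vector      -> excluded
]
n_candidates = count_span_candidates(toy, C=1.0)
```

Dividing such a count by the number of training points would give an estimate in the spirit of the span bound, under the assumptions of Lemma 4.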

1.4 Proof of Lemma 5

Proof

Suppose that α(*) is the optimal solution of the problem (7)–(9). It is sufficient to study the following three cases, respectively:

1. The case \(\alpha_p^q=\alpha_p^{*q}=0\): The left-out point \((x_p^q, y_p^q)\) is not a support vector. Then the objective function value of problem (7)–(9) is equal to that of problem (34)–(36), namely, \(W(\alpha^{(*)})=W_p^q(\alpha_p^{(*)q})\). So the decision function does not change after the point \((x_p^q, y_p^q)\) is left out, and this point is not counted as a leave-one-out error.

2. The case \(\alpha_p^q>0\): The left-out point \((x_p^q, y_p^q)\) is a support vector. Starting from the solution \(\alpha_p^{(*)q}\) of problem (34)–(36), a feasible point \(\beta^{(*)}\) of problem (7)–(9) can be constructed by

$$ \begin{aligned} \beta_i^j&=\left\{\begin{array}{ll} \alpha_{pi}^{qj},& j=1,\ldots,q-1,q+1,\ldots,k,\quad i=1,\ldots,l^j,\\ \alpha_{pi}^{qq},& \alpha_{pi}^{qq}=0,\quad \alpha_{pi}^{qq}=C,\\ \alpha_{pi}^{qq}-\nu_i^q,& i\in M(\alpha_p^{(*)q},q),\\ \alpha_p^q, & j=q,\quad i=p,\\ \end{array}\right.\\ \beta_i^{*j}&=\left\{\begin{array}{ll} \alpha_{pi}^{*qj},& j=1,\ldots,q-1,q+1,\ldots,k,\quad i=1,\ldots,l^j,\\ \alpha_{pi}^{*qq},& \alpha_{pi}^{*qq}=0,\quad \alpha_{pi}^{*qq}=C,\\ \alpha_{pi}^{*qq}-\nu_i^{*q},& i\in M(\alpha_p^{(*)q},q),\\ \alpha_p^{*q}, & j=q,\quad i=p,\\ \end{array}\right. \end{aligned} $$

where \(M(\alpha_p^{(*)q},q)\) is the set of margin support vectors about \(\alpha^{(*)}\) for problem (34)–(36), and \(\nu_i^q\) and \(\nu_i^{*q}\), respectively, satisfy the conditions

$$ \sum_{i\in M(\alpha_p^{(*)q},q)}\nu_i^q=\alpha_p^q, $$
(80)

and

$$ \sum_{i\in M(\alpha_p^{(*)q},q)}\nu_i^{*q}=\alpha_p^{*q}. $$
(81)

It is easy to see that

$$ \begin{aligned} \,&0\leq \beta_i^j,\quad \beta_i^{*j}\leq C,\quad j=1,\ldots,k,\quad i=1,\ldots,l^j, \\ \,&\sum_{i=1}^{l^j}\beta_i^j=\sum_{i=1}^{l^{j+1}}\beta_i^{*j+1},\quad j=1,\ldots,k-1. \end{aligned} $$

Thus \(\beta^{(*)}\) is a feasible solution of problem (7)–(9). After a series of transformations, \(W(\beta^{(*)})\) can be written as

$$ \begin{aligned} W(\beta^{(*)})=&W_p^q(\alpha_p^{(*)q})+(\alpha_p^{*q}+\alpha_p^q)-\frac{1}{ 2}(\alpha_p^{*q}-\alpha_p^q)^2K(x_p^q,x_p^q)\\ &-(\alpha_p^{*q}-\alpha_p^q)\sum_{(j,i)\in I\setminus\{(q,p)\}}(\alpha_{pi}^{*qj}-\alpha_{pi}^{qj})K(x_p^q,x_i^j)\\ &+\sum_{i\in M(\alpha_p^{(*)q},q)}(\nu_i^{*q}-\nu_i^q)\left[1+\sum_{(j',i')\in I\setminus\{(q,p)\}}(\alpha_{pi'}^{*qj'}-\alpha_{pi'}^{qj'})K(x_i^q,x_{i'}^{j'})\right]\\ &+(\alpha_p^{*q}-\alpha_p^q)\sum_{i\in M(\alpha_p^{(*)q},q)}(\nu_i^{*q}-\nu_i^q)K(x_p^q,x_i^q)\\ &-\frac{1}{ 2}\sum_{i\in M(\alpha_p^{(*)q},q)}\sum_{i'\in M(\alpha_p^{(*)q},q)}(\nu_i^{*q}-\nu_i^q)(\nu_{i'}^{*q}-\nu_{i'}^q)K(x_i^q,x_{i'}^q). \end{aligned} $$
(82)

According to the assumption in the lemma, there exists at least one \(\alpha_i^j\in (0, C)\) among the points of each class j = 1,…, k. Therefore,

$$ 1+\sum_{(j',i')\in I\setminus\{(q,p)\}}(\alpha_{pi'}^{*qj'}-\alpha_{pi'}^{qj'})K(x_i^q,x_{i'}^{j'})=b'_q. $$
(83)

From the equalities (80) and (81), we get

$$ \sum_{i\in M(\alpha_p^{(*)q},q)}(\nu_i^{*q}-\nu_i^q)=(\alpha_p^{*q}-\alpha_p^q). $$
(84)

So we rewrite the equality (82) as

$$ \begin{aligned} \,&(\alpha_p^{*q}-\alpha_p^q) {\left [\sum_{(j,i)\in I\setminus\{(q,p)\}}(\alpha_{pi}^{*qj}-\alpha_{pi}^{qj})K(x_p^q,x_i^j)-b'_q\right]} \\ \,&\quad=-W(\beta^{(*)})+W_p^q(\alpha_p^{(*)q})+(\alpha_p^{*q}+\alpha_p^q)-\frac{1}{ 2}(\alpha_p^{*q}-\alpha_p^q)^2K(x_p^q,x_p^q)\\ \,&\quad\quad-\frac{1}{ 2}\sum_{i\in M(\alpha_p^{(*)q},q)}\sum_{i'\in M(\alpha_p^{(*)q},q)}(\nu_i^{*q}-\nu_i^q)(\nu_{i'}^{*q}-\nu_{i'}^q)K(x_i^q,x_{i'}^q),\\ \,&\quad\quad+(\alpha_p^{*q}-\alpha_p^q)\sum_{i\in M(\alpha_p^{(*)q},q)}(\nu_i^{*q}-\nu_i^q)K(x_p^q,x_i^q). \end{aligned} $$
(85)

Similarly, let us construct a feasible solution γ(*) of the problem (34)–(36) based on the solution α(*) of the problem (7)–(9) by

$$ \gamma_i^j= \left\{\begin{array}{ll} \alpha_i^j,& j=1,\ldots,q-1,q+1,\ldots,k, i=1,\ldots,l^j,\\ \alpha_i^q,& \alpha_i^q=0,\quad \alpha_i^q=C,\\ \alpha_i^q+\mu_i^q,& i\in M(\alpha^{(*)},q)\setminus \{(q,p)\}, \end{array}\right. $$

and

$$ \gamma_i^{*j}= \left\{\begin{array}{ll} \alpha_i^{*j},& j=1,\ldots,q-1,q+1,\ldots,k,\quad i=1,\ldots,l^j,\\ \alpha_i^{*q},& \alpha_i^q=0 \hbox{ or } \alpha_i^q=C,\\ \alpha_i^{*q}+\mu_i^{*q},& i\in M(\alpha^{(*)},q)\setminus \{(q,p)\},\\ \end{array}\right. $$

where \(\mu_i^q\geq 0\) and \(\mu_i^{*q}\geq 0\), respectively, satisfy the conditions

$$ \sum_{i\in M(\alpha^{(*)},q)\setminus \{(q,p)\}}\mu_i^q=\alpha_p^q; $$

and

$$ \sum_{i\in M(\alpha^{(*)},q)\setminus \{(q,p)\}}\mu_i^{*q}=\alpha_p^{*q}. $$

It is easy to see that

$$ \begin{aligned} &0\leq \gamma_i^j,\quad \gamma_i^{*j}\leq C, \quad (j,i)\in I\setminus \{(q,p)\},\\ &\sum_{i=1}^{l^j}\gamma_i^j=\sum_{i=1}^{l^{j+1}}\gamma_i^{*j+1},\quad (j,i)\in I\setminus \{(q,p)\}. \end{aligned} $$

So \(\gamma^{(*)}\) is a feasible solution to problem (34)–(36). After a series of transformations, \(W_p^q(\gamma^{(*)})\) can be written as

$$ \begin{aligned} W_p^q(\gamma^{(*)})=&W(\alpha^{(*)})+(\alpha_p^{*q}-\alpha_p^q)+\frac{1}{ 2}(\alpha_p^{*q}-\alpha_p^q)^2K(x_p^q,x_p^q)\\ &+(\alpha_p^{*q}-\alpha_p^q)\sum_{(j,i)\in I\setminus \{(q,p)\}}(\alpha_i^{*j}-\alpha_i^j)K(x_p^q,x_i^j)\\ &-\sum_{i\in M(\alpha^{(*)},q)\setminus \{(q,p)\}}(\mu_i^{*q}-\mu_i^q)\left[1+\sum_{j',i'}(\alpha_{i'}^{*j'}-\alpha_{i'}^{j'})K(x_i^q,x_{i'}^{j'})\right]\\ &-\frac{1}{ 2}\sum_{i\in M(\alpha^{(*)},q)\setminus \{(q,p)\}}\sum_{i'\in M(\alpha^{(*)},q)\setminus \{(q,p)\}}(\mu_i^{*q}-\mu_i^q)(\mu_{i'}^{*q}-\mu_{i'}^q)K(x_i^q,x_{i'}^q)\\ &+(\alpha_p^{*q}-\alpha_p^q)\sum_{i\in M(\alpha^{(*)},q)\setminus \{(q,p)\}}(\mu_i^{*q}-\mu_i^q)K(x_p^q,x_i^q). \end{aligned} $$
(86)

According to the assumption that there exists at least one \(\alpha_i^j\in (0, C)\) among the points of each class j = 1,…, k, we have

$$ 1+\sum_{j',i'}(\alpha_{i'}^{*j'}-\alpha_{i'}^{j'})K(x_i^q,x_{i'}^{j'})=b_q. $$

So we rewrite the equality (86) as

$$ \begin{aligned} -W(\alpha^{(*)})=&-W_p^q(\gamma^{(*)})+(\alpha_p^{*q}-\alpha_p^q)+\frac{1}{ 2}(\alpha_p^{*q}-\alpha_p^q)^2K(x_p^q,x_p^q)\\ &+(\alpha_p^{*q}-\alpha_p^q)\left[\sum_{(j,i)\in I\setminus \{(q,p)\}}(\alpha_i^{*j}-\alpha_i^j)K(x_p^q,x_i^j)-b_q\right]\\ &-\frac{1}{ 2}\sum_{i\in M(\alpha^{(*)},q)\setminus \{(q,p)\}}\sum_{i'\in M(\alpha^{(*)},q)\setminus \{(q,p)\}}(\mu_i^{*q}-\mu_i^q)(\mu_{i'}^{*q}-\mu_{i'}^q)K(x_i^q,x_{i'}^q)\\ &+(\alpha_p^{*q}-\alpha_p^q)\sum_{i\in M(\alpha^{(*)},q)\setminus \{(q,p)\}}(\mu_i^{*q}-\mu_i^q)K(x_p^q,x_i^q). \end{aligned} $$
(87)

Since \(W(\alpha^{(*)})\geq W(\beta^{(*)})\) and \(W_p^q(\alpha_p^{(*)q})\geq W_p^q(\gamma^{(*)})\), the equalities (85) and (87) give

$$ \begin{aligned} \,&(\alpha_p^{*q}-\alpha_p^q) {\left [\sum_{(j,i)\in I\setminus\{(q,p)\}}(\alpha_{pi}^{*qj}-\alpha_{pi}^{qj})K(x_p^q,x_i^j)-b'_q \right]} \geq (\alpha_p^{*q}-\alpha_p^q) {\left [\sum_{(j,i)\in I\setminus \{(q,p)\}}(\alpha_i^{*j}-\alpha_i^j)K(x_p^q,x_i^j)-b_q \right]} \\ \,&\quad-\frac{1}{ 2}\sum_{i\in M(\alpha^{(*)},q)\setminus \{(q,p)\}}\sum_{i'\in M(\alpha^{(*)},q)\setminus \{(q,p)\}}(\mu_i^{*q}-\mu_i^q)(\mu_{i'}^{*q}-\mu_{i'}^q)K(x_i^q,x_{i'}^q)\\ \,&\quad+(\alpha_p^{*q}-\alpha_p^q)\sum_{i\in M(\alpha^{(*)},q)\setminus \{(q,p)\}}(\mu_i^{*q}-\mu_i^q)K(x_p^q,x_i^q) \\ \,&\quad-\frac{1}{ 2}\sum_{i\in M(\alpha_p^{(*)q},q)}\sum_{i'\in M(\alpha_p^{(*)q},q)}(\nu_i^{*q}-\nu_i^q)(\nu_{i'}^{*q}-\nu_{i'}^q)K(x_i^q,x_{i'}^q)+(\alpha_p^{*q}-\alpha_p^q)\sum_{i\in M(\alpha_p^{(*)q},q)}(\nu_i^{*q}-\nu_i^q)K(x_p^q,x_i^q). \end{aligned} $$

According to the definitions of \(\nu_i^q,\nu_i^{*q},i\in M(\alpha_p^{(*)q},q)\) and \({\mu_i^q,\mu_i^{*q},i\in M(\alpha^{(*)},q)\setminus \{(q,p)\},}\) we know

$$ \begin{aligned} &(\alpha_p^{*q}-\alpha_p^q)\sum_{i\in M(\alpha^{(*)},q)\setminus \{(q,p)\}}(\mu_i^{*q}-\mu_i^q)K(x_p^q,x_i^q)\geq 0,\\ &(\alpha_p^{*q}-\alpha_p^q)\sum_{i\in M(\alpha_p^{(*)q},q)}(\nu_i^{*q}-\nu_i^q)K(x_p^q,x_i^q)\geq 0. \end{aligned} $$

Furthermore since

$$ \begin{aligned} &\frac{1}{ 2}\sum_{i\in M(\alpha_p^{(*)q},q)}\sum_{i'\in M(\alpha_p^{(*)q},q)}(\nu_i^{*q}-\nu_i^q)(\nu_{i'}^{*q}-\nu_{i'}^q)K(x_i^q,x_{i'}^q)\leq \frac{1}{ 2}(\alpha_p^{*q}-\alpha_p^q)^2R^2,\\ &\frac{1}{ 2}\sum_{i\in M(\alpha^{(*)},q)\setminus \{(q,p)\}}\sum_{i'\in M(\alpha^{(*)},q)\setminus \{(q,p)\}}(\mu_i^{*q}-\mu_i^q)(\mu_{i'}^{*q}-\mu_{i'}^q)K(x_i^q,x_{i'}^q)\leq \frac{1}{ 2}(\alpha_p^{*q}-\alpha_p^q)^2R^2, \end{aligned} $$

where \(R^2=\max\{K(x_i^j,x_i^j)\,|\,j=1,\ldots,k,\ i=1,\ldots,l^j\}\), we arrive at

$$ \begin{aligned} \,&(\alpha_p^{*q}-\alpha_p^q)\left[\sum_{(j,i)\in I\setminus\{(q,p)\}}(\alpha_{pi}^{*qj}-\alpha_{pi}^{qj})K(x_p^q,x_i^j)-b'_q\right] \\ \,&\quad\geq (\alpha_p^{*q}-\alpha_p^q)\left[\sum_{(j,i)\in I\setminus \{(q,p)\}}(\alpha_i^{*j}-\alpha_i^j)K(x_p^q,x_i^j)-b_q\right]-(\alpha_p^{*q}-\alpha_p^q)^2R^2, \end{aligned} $$

namely,

$$ \begin{aligned} \,&\left[\sum_{(j,i)\in I\setminus\{(q,p)\}}(\alpha_{pi}^{*qj}-\alpha_{pi}^{qj})K(x_p^q,x_i^j)-b'_q\right] \\ \,&\quad\leq \left[\sum_{j,i}(\alpha_i^{*j}-\alpha_i^j)K(x_p^q,x_i^j)-b_q\right]-(\alpha_p^{*q}-\alpha_p^q)(K(x_p^q,x_p^q)+R^2). \end{aligned} $$

3. The case α *q p  > 0: Following the argumentation in the case (2), an LOO error can occur only if

$$ \begin{aligned} \,&-\left[\sum_{(j,i)\in I\setminus \{(q,p)\}}(\alpha_{pi}^{*qj}-\alpha_{pi}^{qj})K(x_p^q,x_i^j)-b'_{q-1}\right]\\ \,&\quad\leq-\left[\sum_{j,i}(\alpha_i^{*j}-\alpha_i^j)K(x_p^q,x_i^j)- b_{q-1}\right]+(\alpha_p^{*q}-\alpha_p^q)(K(x_p^q,x_p^q)+R^2). \end{aligned} $$

\(\square\)
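The final inequality of case 2 can be read as a test that needs only the solution of the full problem (7)–(9). The sketch below assumes, as in the proof of Lemma 4, that an LOO error at a point with \(\alpha_p^q>0\) corresponds to a positive left-hand side; under that assumption an error is possible only if the right-hand side is positive. The function names and arguments are hypothetical.

```python
def kernel_radius_sq(kernel_diag):
    """R^2 = max over all training points of K(x, x)."""
    return max(kernel_diag)


def loo_error_possible(g_full, alpha_p, alpha_p_star, k_pp, r2):
    """Necessary condition for an LOO error in the case alpha_p^q > 0.

    g_full : sum_{j,i} (alpha*_i^j - alpha_i^j) K(x_p^q, x_i^j) - b_q,
             evaluated with the solution of the full problem
    k_pp   : K(x_p^q, x_p^q)
    r2     : R^2 as defined in the proof
    """
    rhs = g_full - (alpha_p_star - alpha_p) * (k_pp + r2)
    # The left-out decision value is bounded above by rhs, so it can be
    # positive (an error, under the stated assumption) only if rhs > 0.
    return rhs > 0


# Toy values (assumed for illustration only).
r2 = kernel_radius_sq([1.0, 0.8, 0.9])
safe = loo_error_possible(g_full=-1.0, alpha_p=0.2, alpha_p_star=0.0,
                          k_pp=1.0, r2=r2)
risky = loo_error_possible(g_full=-1.0, alpha_p=0.8, alpha_p_star=0.0,
                           k_pp=1.0, r2=r2)
```

Case 3 would use the mirrored inequality with \(b_{q-1}\) in place of \(b_q\).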

1.5 Computation of the S-span

In this subsection, we address the problem of computing \(S^2(q,p)\) and \(S^{*2}(q,p)\) appearing in Definitions 3 and 4. Each can be obtained by solving a quadratic programming problem. We only discuss the computation of \(S^2(q,p)\) since \(S^{*2}(q,p)\) can be computed similarly.

The following lemma gives an equivalent description to Definition 3.

Lemma 6

Introducing the following notation:

$$ \{\lambda_i^q, i\in M_p^q(\alpha,q)\}=\{\lambda_1,\ldots,\lambda_r\}, $$
(88)
$$ \{\lambda_i^{q+1}, i\in M_p^q(\alpha^*,q+1)\}=\{\lambda_{r+1},\ldots,\lambda_s\}, $$
(89)
$$ \{x_i^q,i\in M_p^q(\alpha,q)\}=\{x_1,\ldots,x_r\}, $$
(90)
$$ \{x_i^{q+1},i\in M_p^q(\alpha^*,q+1)\}=\{x_{r+1},\ldots,x_s\}, $$
(91)
$$ \{\alpha_i^q,\alpha_i^{*q},i\in M_p^q(\alpha,q)\}=\{\alpha_1,\ldots,\alpha_r,\alpha^*_1,\ldots,\alpha^*_r\}, $$
(92)
$$ \{\alpha_i^{q+1},\alpha_i^{*q+1},i\in M_p^q(\alpha^*,q+1)\}=\{\alpha_{r+1},\ldots,\alpha_s,\alpha^*_{r+1},\ldots,\alpha^*_s\}, $$
(93)

Definition 3 is equivalent to the following description: For any margin support vector \(x_p^q\) about α, its S-span is

$$ S^2(q,p):=\min\{\|x_p^q-\tilde{x}_p^q\|^2|\tilde{x}_p^q\in\Uplambda_p^q\}, $$
(94)

where \( \Uplambda_p^q:= \left\{\sum\limits_{i=1}^s\lambda_ix_i\right\},\) subject to constraints

$$ \begin{aligned} \,&0\leq \alpha_i+\lambda_i\alpha_p^{q}\leq C,\quad 0\leq \alpha_i^*+\lambda_i\alpha_p^{*q}\leq C, \quad i=1,\ldots,r,\\ \,&0\leq \alpha_i-\lambda_i\alpha_p^{*q}\leq C,\quad 0\leq\alpha_i^*-\lambda_i\alpha_p^{q}\leq C ,\quad i=r+1,\ldots,s, \\ \,& \sum_{i=1}^s\lambda_i=1,\quad\lambda_p^q=-1. \end{aligned} $$

Proof

It is sufficient to substitute (88)–(93) into Definition 3. \(\square\)

The following theorem shows that \(S^2(q,p)\) in Definition 3 can be obtained by solving a quadratic programming problem.

Theorem 3

\(S^2(q,p)\) defined in Definition 3 can be obtained by solving the following quadratic programming problem:

$$ \min_\lambda \quad \sum_{i=1}^s\sum_{j=1}^s\lambda_i\lambda_j(x_i\cdot x_j)-2\sum_{i=1}^s\lambda_i(x_p^q\cdot x_i), $$
(95)
$$ \hbox{s.t.}\quad -{\frac{\alpha_i}{\alpha_p^q}}\leq \lambda_i\leq {\frac{C-\alpha_i}{\alpha_p^{q}}},\quad -{\frac{\alpha^*_i} {\alpha_p^{*q}}}\leq \lambda_i\leq {\frac{C-\alpha^*_i} {\alpha_p^{*q}}},\quad i=1,\ldots,r, $$
(96)
$$ {\frac{\alpha_i-C}{\alpha_p^{*q}}}\leq \lambda_i\leq {\frac{\alpha_i}{\alpha_p^{*q}}},{\frac{\alpha^*_i-C} {\alpha_p^{q}}}\leq \lambda_i\leq {\frac{\alpha^*_i} {\alpha_p^{q}}},\quad i=r+1,\ldots,s, $$
(97)
$$ \sum_{i=1}^s\lambda_i=1. $$
(98)

Proof

We want to minimize the quantity in (94) with respect to \(\{\lambda_i\}\):

$$ \begin{aligned} \|x_p^q-\tilde{x}_p^q\|^2&=((x_p^q-\tilde{x}_p^q)\cdot (x_p^q-\tilde{x}_p^q)) \\ &=(x_p^q\cdot x_p^q)+(\tilde{x}_p^q\cdot \tilde{x}_p^q)-2(x_p^q\cdot\tilde{x}_p^q) \\ &=(x_p^q\cdot x_p^q)+\sum_{i=1}^s\sum_{j=1}^s\lambda_i\lambda_j(x_i\cdot x_j)-2\left(x_p^q\cdot \left(\sum_{i=1}^s\lambda_ix_i\right)\right), \end{aligned} $$

subject to constraints

$$ \begin{aligned} \,&-{\frac{\alpha_i}{\alpha_p^q}}\leq \lambda_i\leq {\frac{C-\alpha_i} {\alpha_p^{q}}},\quad -{\frac{\alpha^*_i}{\alpha_p^{*q}}}\leq \lambda_i\leq {\frac{C-\alpha^*_i}{\alpha_p^{*q}}},\quad i=1,\ldots,r, \\ \,&{\frac{\alpha_i-C}{\alpha_p^{*q}}}\leq \lambda_i\leq {\frac{\alpha_i} {\alpha_p^{*q}}},\quad {\frac{\alpha^*_i-C}{\alpha_p^{q}}}\leq \lambda_i\leq {\frac{\alpha^*_i}{\alpha_p^{q}}},\quad i=r+1,\ldots,s, \\ \,&\sum_{i=1}^s\lambda_i=1, \end{aligned} $$

namely, solving the quadratic programming (95)–(98). \(\square\)
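The box constraints of the quadratic program follow from the two-sided inequalities of Lemma 6 by solving each one for \(\lambda_i\). A minimal sketch of that step; the helper names are hypothetical, and the treatment of a zero multiplier (no restriction on \(\lambda_i\) from that inequality) is an assumption made here to avoid division by zero, not something stated in the text.

```python
import math


def interval(a, c, C):
    """All lam with 0 <= a + lam * c <= C, assuming 0 <= a <= C."""
    if c > 0:
        return (-a / c, (C - a) / c)
    if c < 0:
        return ((C - a) / c, -a / c)
    return (-math.inf, math.inf)   # c == 0: the inequality does not involve lam


def lambda_bounds(alpha_i, alpha_i_star, alpha_p, alpha_p_star, C, first_group):
    """Lower and upper bound on lambda_i.

    first_group=True  : i = 1..r,   constraints with +lambda_i
    first_group=False : i = r+1..s, constraints with -lambda_i
    """
    if first_group:
        lo1, hi1 = interval(alpha_i, alpha_p, C)
        lo2, hi2 = interval(alpha_i_star, alpha_p_star, C)
    else:
        lo1, hi1 = interval(alpha_i, -alpha_p_star, C)
        lo2, hi2 = interval(alpha_i_star, -alpha_p, C)
    # Both inequalities must hold, so intersect the two intervals.
    return max(lo1, lo2), min(hi1, hi2)


# Toy values (assumed for illustration only): alpha_p^q = 0.5, alpha_p^{*q} = 0.
b1 = lambda_bounds(0.2, 0.0, 0.5, 0.0, 1.0, first_group=True)
b2 = lambda_bounds(0.0, 0.4, 0.5, 0.0, 1.0, first_group=False)
```

These bounds, together with the equality constraint on the sum of the \(\lambda_i\), could then be handed to any quadratic programming solver.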

In order to compute \(S^2(p,q)\) and \(S^{*2}(p,q)\) in Theorem 1, we only need to replace the inner product (x · x′) in (95)–(98) by the kernel function K(x, x′).
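The kernel substitution can be illustrated by evaluating the squared distance in (94) through kernel values only. The sketch below is a plain-Python illustration with hypothetical function names; the RBF kernel and its parameter `gamma` are assumptions chosen for the example, not a choice made in the text.

```python
import math


def linear_kernel(u, v):
    """Ordinary inner product (x . x')."""
    return sum(a * b for a, b in zip(u, v))


def rbf_kernel(u, v, gamma=1.0):
    """Gaussian kernel, used here only as an example of K(x, x')."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def span_objective(lam, xs, x_p, kernel=linear_kernel):
    """||x_p - sum_i lam_i x_i||^2 computed from kernel evaluations.

    This equals K(x_p, x_p) plus the objective (95); the constant
    K(x_p, x_p) does not affect the minimiser.
    """
    n = len(xs)
    quad = sum(lam[i] * lam[j] * kernel(xs[i], xs[j])
               for i in range(n) for j in range(n))
    lin = sum(lam[i] * kernel(x_p, xs[i]) for i in range(n))
    return kernel(x_p, x_p) + quad - 2.0 * lin


xs = [(1.0, 0.0), (0.0, 1.0)]
lam = [0.5, 0.5]
x_p = (1.0, 1.0)
d_lin = span_objective(lam, xs, x_p)                     # input-space distance
d_rbf = span_objective(lam, xs, x_p, kernel=rbf_kernel)  # feature-space distance
```

With the linear kernel the value coincides with the squared Euclidean distance between `x_p` and the combination (0.5, 0.5); with another kernel the same expression measures the distance in the induced feature space.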


Cite this article

Yang, Z., Tian, Y. & Deng, N. Leave-one-out bounds for support vector ordinal regression machine. Neural Comput & Applic 18, 731–748 (2009). https://doi.org/10.1007/s00521-008-0217-z
