header
SVM/SMO

Tag:
Edit edit   Starstar

Program Information

Name: SVM/SMO
Domain: Graph and Image
Functionality: A computer program to implement the SMO algorithm.
Input: A data point is a tuple $\langle \overrightarrow{x_i},y_i \rangle$ in a K-dimensional space where $\overrightarrow{x_i}$ is a vector that consists of K attributes and is described as $[u_i^{(1)},\ldots ,u_i^{(K)}]$. Values of each attribute belong to a set $D(u_i^{(j)} \in D)$. D may be a subset of real numbers $\mathbb{R}$ such as the interval $[0,1]$. $y_i$ is a label, and takes a value of -1 or +1, $y_i \in L$ where $L = \{-1,+1\}$.A dataset $X$ is a set of $N$ data points such that $X \in \mathbb{P}(D^K \times L)$ and $X=\{\langle \overrightarrow{x_i},y_i\}$.
Output: We introduce translations $T$ on datasets $X$ such that $T(X)$ returns a new dataset,$T\in \mathbb{P}(D^K \times L) \rightarrow \mathbb{P}(D^M \times L)$.We sometimes use a higher order function $map$ as well, $map\in (A\rightarrow B) \times \mathbb{P}(A) \rightarrow \mathbb{P}(B)$ where $A$ and $B$ refer to certain domains. The function $map(f,X)$ applies a function $f$ of $A\rightarrow B$ to each element of the set $X$ of $\mathbb{P}(A)$ to produce another set of $\mathbb{P}(B)$. We introduce translations $T_c$ on datasets, essentially to increase the population of the input dataset. $T_c$ adds data points so that a newly calculated hyperplane is predictable with respect to the current one. $X^{(m)}$ depicts the dataset obtained from the $m$ iteration. Furthermore,we assume that the hyperplane is $y^{(m)}(\overrightarrow{x})$ for the dataset $X^{(m)}$.We will present below how to obtain $X^{(m+1)}$ from $X^{(m)}$. $X^{(m+1)}$ contains more data points than $X^{(m)}$. 

Reference

 Dataset Coverage for Testing Machine Learning Computer Programs https://doi.org/10.1109/APSEC.2016.049 

MR Information

MR1------ Reorder Data Points

Description: MR for Pseudo Oracles
Property: A dataset $X$ consists of $N$ data points $\langle \overrightarrow{x_n},y_n \rangle$, $X = \{\langle \overrightarrow{x_1},y_1 \rangle,\ldots ,\langle \overrightarrow{x_i},y_i \rangle,\ldots , \langle \overrightarrow{x_j},y_j \rangle, \ldots ,\langle \overrightarrow{x_N},y_N \rangle \}$. The translation $T^{(1)}$, for reordering data points, returns a dataset in which the data points with their indices $i$ and $j$ of the data points are swapped. $T^{(1)}(X) = \{\langle \overrightarrow{x_1},y_1 \rangle,\ldots ,\langle \overrightarrow{x_j},y_j \rangle,\ldots , \langle \overrightarrow{x_i},y_i \rangle, \ldots ,\langle \overrightarrow{x_N},y_N \rangle \}$ where the indices $i$ and $j$ are arbitrary if they are between 1 and $N$ and $i \neq j$. For this translation $T^{(1)}$, the resultant Lagrange multipliers $\alpha_i$ and $\alpha_j$ are swapped as well.The hyperplane remains the same.
Source input:
Source output:
Follow-up input:
Follow-up output:
Input relation:
Output relation:
Pattern:

MR2------ Reverse Labels 

Description: MR for Pseudo Oracles
Property: We introduce a function $\tau^{(2)}$ on a data point, $\tau^{(2)} \in D^K \times L \rightarrow D^K \times L$ such that it returns a data point with a reversed label, $\tau^{(2)}(\langle \overrightarrow{x},l \rangle) = \langle \overrightarrow{x},-l \rangle$. Then, $T^{(2)}(X)=map(\tau^{(2)},X)$. For this translation $T^{(2)}$,he resultant Lagrange multipliers remain the same. Signs of the hyperplane are reversed, namely $\overrightarrow{\omega}$ and $b$ are now $-\overrightarrow{\omega}$ and $-b$ respectively.
Source input:  
Source output:  
Follow-up input:  
Follow-up output:  
Input relation:  
Output relation:  
Pattern:

MR3------ Reorder Attributes 

Description: MR for Pseudo Oracles
Property: For a K-dimensional vector, $\overrightarrow{x}=[u_1,\ldots ,u_i,\ldots ,u_j,\ldots ,u_K]$, we have a function $\tau^{(3)}$ on a data point such that $\tau^{(3)}(\langle [u_1,\ldots ,u_i,\ldots ,u_j,\ldots ,u_K],y \rangle) = \langle [u_1,\ldots ,u_j,\ldots ,u_i,\ldots ,u_K],y \rangle$ where the indices $i$ and $j$ are arbitrary if they are between 1 and $N$ and $i \neq j$. Then, $T^{(3)}(X)=map(\tau^{(3)},X)$.  For this translation $T^{(3)}$, the resultant Lagrange multipliers remain the same, and so does the hyperplane.
Source input:  
Source output:  
Follow-up input:  
Follow-up output:  
Input relation:  
Output relation:  
Pattern:

MR4------ Add a Constant Attribute 

Description: MR for Pseudo Oracles
Property: Let $c$ be a constant $\in D$. $\tau^{(4)}$ essentially adds the constant $c$ to form a K-dimensional vector $\overrightarrow{x}$ to be $(K+1)$-dimensional, $\tau^{(4)} \in D^K \times L \Rightarrow D^{K+1} \times L$. $\tau^{(4)}(\langle [u_1,\ldots,u_K],y \rangle) = \langle [u_1,\ldots,u_K,c],y \rangle$. Then, $T^{(4)}(X)= map(\tau^{(4)},X)$. For this translation $T^{(4)}$, the resultant  Lagrange multipliers may be changed, but the hyperplane remains the same. The newly added attribute does not have any effect on the classification of data points. Note that $T^{(4)}$ is of $\mathbb{P}(D^K \times L) \rightarrow \mathbb{P}(D^{K+1} \times L)$. 
Source input:  
Source output:  
Follow-up input:  
Follow-up output:  
Input relation:  
Output relation:  
Pattern:

MR5------ Change Attribute Values 

Description: MR for Pseudo Oracles
Property: Let $op$ be an operation for changing attribute values, $op \in D\rightarrow D$. $\tau^{(5)}$ is a function to apply $op$ to a particular attribute, $\tau^{(5)} \in D^K \times L \rightarrow D^K \times L$. $\tau^{(5)}(\langle [u_1,\ldots ,u_i,\ldots ,u_K],y \rangle) = \langle [u_1,\ldots ,op(u_i),\ldots ,u_K],y \rangle$. Then,$T^{(5)}(X) = map(\tau^{(5)},X)$. This translation introduces invariants which depends on the choice of $op$ and the kernel function. For example, if op is an addition of a constant to an attribute and the kernel function is Gaussian, then the resultant Lagrange multipliers and the hyperplane are unchanged. 
Source input:  
Source output:  
Follow-up input:  
Follow-up output:  
Input relation:  
Output relation:  
Pattern:

MR6------ Reduce Margin

Description: MR for Dataset Generation
Property: $T_r$ is an instance of $T_c$ with two additional data points $\langle \overrightarrow{x}_{N+1},+1 \rangle$ and $\langle \overrightarrow{x}_{N+2},+1 \rangle$ such that the both lie on the hyperplane to $X$, $T_r \in \mathbb{P}(D^N \times L) \rightarrow \mathbb{P}(D^{N+2} \times L)$. If the current hyperplane is such that $y^{(m)}(\overrightarrow{x}) = \overrightarrow{\omega}^{(m)}\cdot \overrightarrow{x} + b^{(m)}$, then $y^{(m)}(\overrightarrow{x}_{N+1})=0$ and $y^{(m)}(\overrightarrow{x}_{N+2})=0$. We denote $X^{(m+1)}$ to be $T_r(X^{(m)})$, and that $X^{(m+1)} = X^{(m)} \cup \{\langle \overrightarrow{x}_{N+1},+1 \rangle ,\langle \overrightarrow{x}_{N+2},+1 \rangle \}$.
Source input:
Source output:
Follow-up input:
Follow-up output:
Input relation:
Output relation:
Pattern:

MR7------ Insert Noise 

Description: MR for Dataset Generation 
Property: A translation $T_c^{(1)}$ adds a noise to the current dataset $X^{(m)}$ whose hyperplane exists.If a new data point $\overrightarrow{x}_{N+1}$ is introduced such that $y^{(m)}(\overrightarrow{x}_{N+1}) = l$, then $T_c^{(1)}(X^{(m)})$ returns a new dataset $X^{(m+1)}$ to add $\langle \overrightarrow{x}_{N+1},-l \rangle$ to $X^{(m)}$, $X^{(m+1)} = X^{(m)} \cup \{\langle \overrightarrow{x}_{N+1},-l \rangle\}$  The data point is not classifiable, but has a penalty of the soft margin method. The Lagrange multiplier for this new data point is such that $\alpha_{N+1} = C$
Source input:  
Source output:  
Follow-up input:  
Follow-up output:  
Input relation:  
Output relation:  
Pattern:

MR8------ Insert Separable 

Description: MR for Dataset Generation 
Property: Alternatively, another translation $T_c^{(2)}$ adds a separable data point to the current dataset $X^{(m)}$, whose hyperplane does exist. If a new data point $\overrightarrow{x}_{N+1}$ is such that $y^{(m)}(\overrightarrow{x}_{N+1}) = l$, then $T_c^{(2)}(X^{(m)})$ returns a new data set $X^{(m+1)}$ to add $\langle \overrightarrow{x}_{N+1},l \rangle$ to $X^{(m)},X^{(m+1)} = X^{(m)} \cup \{\langle \overrightarrow{x}_{N+1},l \rangle\}$ The data point is classifiable, but is not a support vector. The Lagrange multiplier for this new data point is such that $\alpha_{N+1} = 0$ 
Source input:  
Source output:  
Follow-up input:  
Follow-up output:  
Input relation:  
Output relation:  
Pattern:

MR9------ Inconsistent Labels 

Description: MR for Dataset Generation 
Property: We could consider a special composed translation, \textit{Inconsistent Labels
Source input:  
Source output:  
Follow-up input:  
Follow-up output:  
Input relation:  
Output relation:  
Pattern:
Insert title here