Machine Learning on Encrypted Data Using the SEAL v2.2 Library

I ran an experiment that follows the method of the paper Machine Learning on Encrypted Data, using the SEAL v2.2 library. The paper was published in 2012 and SEAL was released in 2015. With the computational capacity of SEAL, the method described in the paper can be implemented in a straightforward way.

The paper introduces the binary perceptron, a machine learning algorithm that forms the basis of neural networks. The author considers the simple Linear Means and Fisher's Linear Discriminant classifiers, both of which require only class-conditional statistics to be evaluated. The algorithm has two stages: a training stage and a testing stage. In the training stage there is a prediction function $f(x) = w^{\star}x+c^{\star}$, in which $w^{\star}$ is the weight vector and $c^{\star}$ is the bias. The aim is to obtain suitable parameters from the sample data with a training algorithm, which involves a lot of data processing. We compute the statistical characteristics of the data and give each attribute a specific weight $w$, which yields a first pair of parameters. Some training samples will not fit the resulting function, so we adjust the weights $w$, for example by adding a step of size $\eta$. Repeating this process until the parameters fit the training data gives the parameters we use in the prediction function. In the testing stage, a test sample $x_0$ is fed into the function obtained in the training stage, and the score $f(x_0)$ indicates which class the sample belongs to.
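
The adjust-and-repeat loop described above can be sketched as the classic perceptron update rule. This is a minimal plaintext illustration, not the code from the paper or from my experiment. The function names, the toy data, the default step size `eta=1.0` and the epoch limit are all assumptions made for the example.

```python
def train_perceptron(samples, labels, eta=1.0, max_epochs=100):
    """Perceptron rule for f(x) = w.x + c with labels in {+1, -1}."""
    dim = len(samples[0])
    w = [0.0] * dim
    c = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + c
            if y * score <= 0:  # sample is misclassified or on the boundary
                # Move the weights and the bias towards the correct side.
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                c += eta * y
                mistakes += 1
        if mistakes == 0:  # every training sample now fits the function
            break
    return w, c


def predict(w, c, x):
    """Return +1 or -1 according to the sign of the score f(x)."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + c
    return 1 if score > 0 else -1


# Made-up, linearly separable toy data (not the breast cancer data).
samples = [[2.0, 1.0], [3.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]]
labels = [1, 1, -1, -1]
w, c = train_perceptron(samples, labels)
predictions = [predict(w, c, x) for x in samples]
```

The loop only stops early when the training data are linearly separable; otherwise it runs until `max_epochs` is reached.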

Let $I_y = \{i \in \{1,\cdots ,m\} \mid y_i = y\}$ be the index set of training examples with label $y \in \{+1,-1\}$. I implemented the simple Linear Means classifier on SEAL. The training steps are as follows:

  1. Calculate the sum of the feature vectors of each class: $S_y = \sum_{i \in I_y} x_i$.
  2. Calculate the class-conditional mean vectors: $M_y = \frac{S_y}{\left| I_y \right|}$.
  3. Obtain the weight vector: $W^{*} = M_{+1} - M_{-1}$.
  4. Obtain the bias: $c^* = (M_{+1} - M_{-1})^{T}(M_{+1}+M_{-1})/2$.
  5. Prediction function: $f^{*}(X) = W^{*T}X - c^{*}$.
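
The five steps above can be sketched in plain Python on unencrypted data. This is only an illustration of the arithmetic, not the SEAL implementation. The function names and the toy data are assumptions made for the example.

```python
def train_linear_means(samples, labels):
    """Linear Means training: returns (W, c) for f(X) = W^T X - c."""
    dim = len(samples[0])
    sums = {+1: [0.0] * dim, -1: [0.0] * dim}
    counts = {+1: 0, -1: 0}
    # Step 1: per-class sums S_y of the feature vectors.
    for x, y in zip(samples, labels):
        sums[y] = [s + xi for s, xi in zip(sums[y], x)]
        counts[y] += 1
    # Step 2: class-conditional means M_y = S_y / |I_y|.
    means = {y: [s / counts[y] for s in sums[y]] for y in (+1, -1)}
    # Step 3: weight vector W = M_{+1} - M_{-1}.
    w = [p - n for p, n in zip(means[+1], means[-1])]
    # Step 4: bias c = W^T (M_{+1} + M_{-1}) / 2.
    c = sum(wi * (p + n) for wi, p, n in zip(w, means[+1], means[-1])) / 2
    return w, c


def score(w, c, x):
    """Step 5: prediction function f(X) = W^T X - c."""
    return sum(wi * xi for wi, xi in zip(w, x)) - c


# Made-up toy data (not the breast cancer data).
samples = [[2.0, 1.0], [4.0, 3.0], [-1.0, -2.0], [-3.0, -2.0]]
labels = [1, 1, -1, -1]
w, c = train_linear_means(samples, labels)
f_pos = score(w, c, [3.0, 2.0])    # score at the mean of class +1
f_neg = score(w, c, [-2.0, -2.0])  # score at the mean of class -1
```

With this choice of bias the decision boundary passes through the midpoint of the two class means, so the two means receive scores of equal size and opposite sign.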

With the help of homomorphic encryption, all of these computations can be evaluated on encrypted data. The SEAL library supports all the operations of the FV encryption scheme. I do not describe the FV scheme here, because the SEAL documentation explains it in detail. The training data come from the Wisconsin Breast Cancer data set. I used up to 30 attributes and tested on 100 samples in each run, and the results are sound. The final table reports the number of errors for each configuration.
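
FV provides addition and multiplication on ciphertexts but no division, so the division in the class means has to be avoided somehow. One possible way, shown here as a plaintext sketch with ordinary Python integers and not as my actual SEAL code, is to scale the classifier by the positive factor $2\left|I_{+1}\right|^2\left|I_{-1}\right|^2$, which leaves the sign of the score unchanged. The function names and the toy data are assumptions, and the sketch assumes the labels are available in the clear so that the samples can be grouped by class.

```python
def dot(a, b):
    """Inner product using only additions and multiplications."""
    return sum(ai * bi for ai, bi in zip(a, b))


def train_scaled(samples, labels):
    """Division-free Linear Means on integers.

    Returns (w_scaled, c_scaled), equal to W and c multiplied by
    2 * n_pos**2 * n_neg**2, where n_pos and n_neg are the class sizes.
    """
    dim = len(samples[0])
    s_pos, s_neg = [0] * dim, [0] * dim
    n_pos = n_neg = 0
    for x, y in zip(samples, labels):
        if y == 1:
            s_pos = [s + xi for s, xi in zip(s_pos, x)]
            n_pos += 1
        else:
            s_neg = [s + xi for s, xi in zip(s_neg, x)]
            n_neg += 1
    # d = n_pos * n_neg * (M_pos - M_neg), computed without dividing.
    d = [n_neg * p - n_pos * n for p, n in zip(s_pos, s_neg)]
    # t = n_pos * n_neg * (M_pos + M_neg), computed without dividing.
    t = [n_neg * p + n_pos * n for p, n in zip(s_pos, s_neg)]
    w_scaled = [2 * n_pos * n_neg * di for di in d]
    c_scaled = dot(d, t)
    return w_scaled, c_scaled


def scaled_score(w_scaled, c_scaled, x):
    """Scaled score; it has the same sign as W^T X - c."""
    return dot(w_scaled, x) - c_scaled


# Made-up integer toy data (not the breast cancer data).
samples = [[2, 1], [4, 3], [-1, -2], [-3, -2]]
labels = [1, 1, -1, -1]
w_scaled, c_scaled = train_scaled(samples, labels)
s = scaled_score(w_scaled, c_scaled, [3, 2])
```

Every step uses only integer addition, subtraction and multiplication, which are the kinds of operations a scheme like FV can evaluate on ciphertexts. In an encrypted setting the final sign would be read off after decryption.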

$$
\begin{array}{|l|l|l|l|}\hline
\text{features} & \text{training samples} & \text{test samples} & \text{errors} \\ \hline
9 & 20 & 100 & 5 \\ \hline
9 & 60 & 100 & 8 \\ \hline
30 & 60 & 100 & 3 \\ \hline
\end{array}
$$