Tuesday, March 6, 2007

Follow-up on: Can Machine Learning Be Secure?

In the previous post I mentioned that I did not understand the relative distance metric they used for analyzing the security of a simple learning problem. However, in the paper they refer to a Master's thesis that explains the metric in more detail.

The key steps to understanding the derivation are the following:
  1. To move the decision boundary as much as possible, each attack point is placed at the previous mean plus the radius: X_{t-1} + R.
  2. At iteration t this point is injected a_t times, contributing (X_{t-1} + R) * a_t.
  3. The next mean is then X_t = [X_{t-1} * (n + a_1 + ... + a_{t-1}) + (X_{t-1} + R) * a_t] / (n + a_1 + ... + a_{t-1} + a_t), where n is the number of previous (pre-attack) data points.
  4. Simplifying the expression gives: X_t = X_{t-1} + R * a_t / (n + a_1 + ... + a_t).
  5. Assuming the attacker has complete control of the learning process, n = 0, so: X_t = X_{t-1} + R * a_t / (a_1 + ... + a_t).
  6. Writing M_t = a_1 + ... + a_t for the attacker's cumulative effort and unrolling the recursion (the first term is 1 because a_1/M_1 = 1) gives: X_t = X_0 + R * [1 + a_2/M_2 + ... + a_t/M_t].
  7. Thus: (X_t - X_0)/R = 1 + a_2/M_2 + ... + a_t/M_t.
  8. Notice that a_i/M_i = (M_i - M_{i-1})/M_i = 1 - M_{i-1}/M_i.
  9. So the relative displacement is (X_t - X_0)/R = 1 + a_2/M_2 + ... + a_t/M_t = 1 + (1 - M_1/M_2) + ... + (1 - M_{t-1}/M_t) = t - [M_1/M_2 + ... + M_{t-1}/M_t].
  10. For t = T this gives the relative distance (or displacement) as in the paper (a small numerical check is sketched below):
    • D({M_i}) = T − [M_1/M_2 + ... + M_{T-1}/M_T]
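To convince myself the algebra is right, here is a minimal Python sketch (my own, not code from the paper or the thesis) that simulates the greedy attack on a one-dimensional mean-based detector and compares the measured relative displacement with the closed-form D({M_i}). The function names and the particular choice of a_t values are mine.

    # Minimal numerical check of the displacement formula (my own sketch, not
    # code from the paper or the thesis). Assumptions: one-dimensional data,
    # the learner's state is just the mean of the points seen so far, the
    # attacker fully controls the training data (n = 0), and at iteration t
    # it injects a_t copies of the point X_{t-1} + R.

    def simulate_attack(alphas, R=1.0, X0=0.0):
        """Run the greedy attack and return the relative displacement (X_T - X0)/R."""
        total_weight = 0.0   # a_1 + ... + a_t (n = 0)
        weighted_sum = 0.0   # sum of all injected points
        mean = X0            # current mean X_t
        for a in alphas:
            attack_point = mean + R           # place the point at X_{t-1} + R
            weighted_sum += a * attack_point  # inject it a_t times
            total_weight += a
            mean = weighted_sum / total_weight
        return (mean - X0) / R

    def closed_form(alphas):
        """D({M_i}) = T - [M_1/M_2 + ... + M_{T-1}/M_T]."""
        M, s = [], 0.0
        for a in alphas:
            s += a
            M.append(s)
        T = len(alphas)
        return T - sum(M[i - 1] / M[i] for i in range(1, T))

    alphas = [1, 3, 2, 5, 4]            # arbitrary per-iteration efforts a_1..a_T
    print(simulate_attack(alphas))      # ~2.8045
    print(closed_form(alphas))          # matches the simulation

A nice special case: if the attacker spends the same effort a at every iteration, then M_i = i * a and the formula collapses to D = 1 + 1/2 + ... + 1/T, the T-th harmonic number, so the displacement grows only logarithmically in the number of attack iterations.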
