Tuesday, August 31, 2010

Re: [Geopriv] location obscuring

Hi Richard and all,

instead of trying to present the whole idea in one piece,
let me discuss the problem in shorter mails.

Today:
1. The Problem to be solved
2. Why there is no perfect solution (and still, ones are better than others)

---
1. The Problem to be solved

A user wants to obscure his location: even if the system
knows his location with great preciseness, the user wants
that his location is presented (to certain users or
applications) as a region with a certain "pretended
uncertainty" or "obscurity distance". Instead of saying
"Peter is in this point", the algorithm should return
"Peter is in this region".

There are two different locations, which are some regions
in space, the original location and the reported one. A
condition we have to meet is that the reported location
must contain the original location.

We will discuss the case of the locations as being circles.
Their radius is the uncertainty associated with them.

- The original location can be precise (a circle of radius
0) or imprecise. Without loss of generality we may assume
that the original location is precise. (If the requested
obscurity is smaller than the original uncertainty, we
must live with the original uncertainty).

- The reported location (which is, if needed, obscured, by
adding a "pretended uncertainty"). Without loss of
generality we may assume that the reported location is
imprecise and has a radius chosen by the user.

Thus we will assume that:

- the original location is a point "o" and

- the reported location is a circle C(r,d) with center "r"
and radius "d".

Let us call:

r= reported center
o= original location (or original center)
d= reported uncertainty, or "distance"

The difference of the two centers is a vector:

reported center - original center

that we may call the "shift in reporting". Its length (or
absolute value or norm) can be called "distance to reported
center" and its direction (angle to some "x-axis") the
"direction of the shift".

What we want is an algorithm that given
o= a precise original location and
d= a distance
provides a reported location, which is a circle of radius d
around a point
r= reported center
in such a way that this circle contains the original
location and "leaks as less information about the real
location" as possible.

Thus the algorithm has inputs o and d and produces r.

An initial design would be:
Algo1: choose randomly any point within the distance d from
o and report the circle of the required radius d around this
point. In the moment that the user leaves this region,
choose another r.

---
2. Why there is no perfect solution (and still, ones are better than others)

The algorithm Algo1, when the location changes, leaks a lot
of information about the location of the user. Anyone
observing the reported locations knows that the user has
just left one circle in an arc contained by the new circle.
Thus we know that the user is (or just was) in a "region"
(the arc) of area 0. We say, the algorithm "leaks 0-area"
locations.

In principle, this kind of problem is unavoidable. One
would like to have an algorithm that never discloses that a
user was in an area of size 0 (that is, does not leak
0-area). One would also like not to disclose the velocity
of the movement.

It is impossible to obtain such an algorithm. One reason
why there are no perfect algorithms is because there is
public information about streets, planes, speed limits,
etc.

As to the velocity: suppose I want to hide my location
within 10 miles, but I a moving at 80 mi per hour. If an
observer sees my reported locations, he may, no matter how
I choose the set of reported centers r1, r2, r3, ..., if he
sees enough measurements, he can calculate quite accurately
my (mean) velocity.

Two examples are show that it is impossible not to leak
"0-area" when continuously moving: if my location is
reported to be Paris (+- 100 miles) at 10 am and New York
at 6 pm of the same day (+- 100 miles), then you may
conclude (with a very high probability, and that is what
matters) that at some point of the day I was over the
meridian of Greenwich, which has area 0. Or: if you find my
location to be moving west at roughly 80 mi per hour in an
area where there is only one highway (say, in a country
with speed limit 80 mi per hour), then any algorithm
discloses quite precisely my exact location with any
precision after enough time.

Of course, Algo1 is worse than, say:

Algo2: choose randomly any point r within the distance d
from o and report the circle of the required radius d
around this point. Choose randomly a second radius, the
"trigger radius", a value between 0 and d. In the moment
that the user leaves the "trigger circle", C(r, d1), choose
another r.

This one is better because when the reported location of
the user changes continuously, the algorithm does not leak
0-area by providing those two reported values. (But as
mentioned before, an observer may notice than in a certain
time span the user crossed a border or another region of
area 0).

This algorithm, as many others have a different problem
(which, I think, is worse from the point of view of privacy
than 0-area leakage): suppose I go once a week to a
certain town, to visit somebody, say a friend or a business
partner. (Or every night I go home, or everyday I go to
work, etc..). I let my location to be know with an
uncertainty of, say, 10 miles. So the algorithm chooses
some location r within 10 mi and reports a circle around
this r. But after 20 times, the 20 different circles it
provided intersect in a small region.

Thus observers obtain information about what the user
regularly does. That is bad.

This problem has all algorithms that create a shift in
reporting (recall: the vector vector r - o) randomly every
time. Of course, the shift in reporting can not be chosen
in a deterministic way, even if the value depends on some
secret of the user (those secrets will leak much to quickly
fro many reasons. For instance, the exact location of the
person can be observed at certain places, etc).
---

ciao,
Jorge
_______________________________________________
Geopriv mailing list
Geopriv@ietf.org
https://www.ietf.org/mailman/listinfo/geopriv