Wednesday, September 1, 2010

Re: [Geopriv] location obscuring

Hi Martin,

> This sort of analysis is interesting, but I'm trying to
> imagine what sort of outcome you're hoping to achieve.

What I am hoping to achieve: an algorithm that protects
privacy as well as we can manage. I do not want an algorithm
that discloses where I am or which places I visit regularly.
I do think we can have an algorithm that protects privacy
better.

> For 2, I'm not yet convinced of the conclusion that there
> is no perfect solution.

Good! I would like to know about it.

> The model is still too loosely defined to make that
> assessment. In all cases, there is some information
> revealed; but the question is to what extent that
> information can be considered significant. Now, we could
> spend a lot of time on this sort of analysis, but that's
> not helping to address the immediate need for a solution.
> From my perspective, and with the work as it is, I'd be
> more interested in a shooting gallery. Propose an
> algorithm.

OK. In short it looks like this:

==Proposed algorithm

For a given uncertainty "d" construct a grid of "landmarks"
at roughly distance 1.5d from each other. A "region" is
roughly-a-circle of radius d centered on a landmark. Each
point in space must be in at least one region.

Given a point, choose randomly (with equal probabilities or
with a bias towards the previously reported location, a
preferred location, or towards closest landmarks) a region
to which this point belongs. But if there are too many
regions to choose from (say, >2, 3 or 4), then choose only
from the 2 (or 3 or 4) regions whose landmarks are closest
to the point.
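
A minimal sketch of this selection step in Python, under several
assumptions that are not part of the description above: flat x/y
coordinates in a single length unit, a triangular grid of landmarks
with spacing exactly 1.5d, perfectly circular regions, and equal
probabilities. The names landmarks_near and choose_region are
hypothetical.

```python
import math
import random

def landmarks_near(x, y, d):
    """Landmarks whose region (circle of radius d) contains (x, y), closest first."""
    s = 1.5 * d                     # assumed landmark spacing
    row_h = s * math.sqrt(3) / 2    # row height of a triangular grid
    j0 = round(y / row_h)
    found = []
    for j in range(j0 - 2, j0 + 3):
        offset = 0.5 * s if j % 2 else 0.0   # odd rows are shifted by half a step
        i0 = round((x - offset) / s)
        for i in range(i0 - 2, i0 + 3):
            lx, ly = i * s + offset, j * row_h
            dist = math.hypot(x - lx, y - ly)
            if dist <= d:
                found.append((dist, (lx, ly)))
    found.sort()
    return [landmark for _, landmark in found]

def choose_region(x, y, d, max_choices=3, rng=random):
    """Pick, with equal probability, one of the max_choices closest regions."""
    candidates = landmarks_near(x, y, d)[:max_choices]
    return rng.choice(candidates)
```

With this spacing, every point is within about 0.87d of its nearest
landmark, so the candidate list is never empty. A bias towards the
previously reported region could replace rng.choice with a weighted
choice.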

Any time you move, you can use the same algorithm to
provide a new (or same) region. (But instead of "any time
you move" this can be implemented using a "trigger circle").
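
One possible reading of the "trigger circle", sketched in Python: the
region is recomputed only once the device has left a circle around the
position where the region was last chosen. This interpretation, the
class name TriggerCircle, and the choose_region callable are
assumptions made for illustration.

```python
import math

class TriggerCircle:
    """Re-run the region choice only after leaving the trigger circle."""

    def __init__(self, choose_region, radius):
        self.choose_region = choose_region  # hypothetical callable: (x, y) -> region
        self.radius = radius                # assumed trigger radius
        self.center = None                  # position of the last recomputation
        self.region = None                  # last reported region

    def report(self, x, y):
        left_circle = (
            self.center is None
            or math.hypot(x - self.center[0], y - self.center[1]) > self.radius
        )
        if left_circle:
            self.center = (x, y)
            self.region = self.choose_region(x, y)
        return self.region
```

While the device stays inside the circle, repeated reports return the
same region, so small movements do not produce new random draws.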

That's it!!

You can choose a geodetic grid or a civil one; in the first
case you can choose a triangular or a rectangular grid, or
other ones. Most of my past mails described in detail
how to construct the grids.

The thing to care about is how to implement the algorithm
efficiently. This was also a topic of my past mails. The
basic idea is simple: The borders of the regions divide the
space into connected components called "blocks": Two points
are in the same block iff they are in exactly the same
regions. We do not want blocks that are too small, nor
points that are in too many regions. We get rid of small
blocks by rearranging the borders of the regions in those
small areas. Thus we are cutting the small blocks into
pieces and lumping them into neighboring blocks that are
larger. In this way, the regions are not perfect circles,
but roughly-a-circle.
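
The "same block" test can be sketched in Python as a comparison of
region sets. This assumes flat x/y coordinates, perfectly circular
regions, and an explicit list of landmarks; the function names are
hypothetical, and the rearranging of borders around small blocks is
not shown.

```python
import math

def block_signature(point, landmarks, d):
    """Set of landmarks whose region (circle of radius d) contains the point."""
    x, y = point
    return frozenset(
        lm for lm in landmarks if math.hypot(x - lm[0], y - lm[1]) <= d
    )

def same_block(p, q, landmarks, d):
    """Two points share a block iff they lie in exactly the same regions."""
    return block_signature(p, landmarks, d) == block_signature(q, landmarks, d)
```

For example, with landmarks at (0, 0) and (150, 0) and d = 100, the
points (75, 0) and (70, 5) lie in both regions and so share a block,
while (10, 0) lies only in the first region.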

> Describe what information is revealed and under what
> circumstances and maybe assumptions. Then we pick one and
> keep the shortcomings visible. We could pick more than
> one and leave it to the policy to dictate which algorithm
> is applied, but that is far from optimal for many
> reasons.

Agreed!

==Weaknesses

The information leaked by the algorithm described is, at
worst, which intersection of regions you are moving around
in, or are frequently visiting, etc. This is not too bad,
as you can choose those intersections to be not too small.


As to the algorithm you proposed:

> (1)
> Scenario:
> The Target visits the same location multiple times over time.
>
> Assumptions required by recipient:
> The Target is visiting the same location.
>
> Constraints on scenario:
>
> Each time, location is only reported while at that
> location, not on the approach to, or leaving from the
> location. This requires that the Target be unable to
> located on approach to or exodus from the location. This
> constraint is met by having the means of location
> disabled while in transit - e.g. by turning off their
> phone.

This is not entirely correct: if you approach your home
every evening via the same route, say, and you report your
location while approaching, the recipient will get a very
close approximation of the route that you are using.

Another question:

If you have several devices providing information, are we
sure that all the provided locations are processed by the same
*instance* of the algorithm (same server, same local data)?

-- Jorge