The most interesting thing about the “Thanksgiving Effect” study is what it tells us about the limits of data anonymization

mostlysignssomeportents:

Late last year, a pair of economists released an interesting paper
that used mobile location data to estimate the likelihood that
political polarization had shortened family Thanksgiving dinners in
2016.

The conclusions were indeed interesting, but far more telling is the
methodology. The researchers were able to buy location data from a
marketing broker (the same kind of shadowy figure that sells ER patients’ identities to ambulance-chasing lawyers, and that sometimes leaks the location data of everyone in the USA and Canada to anyone in the world, continuously, for years). By tracking how long people stayed at dinner on Thanksgiving, they were able to calculate the duration of the meal.
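
To make that concrete, here is a minimal sketch of how a dwell time might be estimated from raw pings. It is not the researchers' actual method: the `(timestamp, lat, lon)` ping format, the 50-metre radius, and the function names are all assumptions chosen for illustration.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def dwell_minutes(pings, place, radius_m=50.0):
    """Minutes between the first and last ping within radius_m of place.

    pings: iterable of (datetime, lat, lon) for ONE device (hypothetical format).
    place: (lat, lon) of the dinner location.
    """
    times = [t for t, lat, lon in pings
             if haversine_m(lat, lon, place[0], place[1]) <= radius_m]
    if not times:
        return 0.0
    return (max(times) - min(times)).total_seconds() / 60


# Made-up pings: three near the dinner table, one after driving away.
dinner = (40.0, -75.0)
pings = [
    (datetime(2016, 11, 24, 15, 0), 40.0001, -75.0),
    (datetime(2016, 11, 24, 16, 0), 40.0000, -75.0),
    (datetime(2016, 11, 24, 18, 30), 40.0001, -75.0),
    (datetime(2016, 11, 24, 19, 0), 40.1000, -75.0),
]
print(dwell_minutes(pings, dinner))
```

A couple dozen lines and a "harmless" feed of timestamps and coordinates is all it takes, which is rather the point.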

Then the researchers used the same brokerages to get the location of the
precinct where their subjects had voted, and they used that to infer
the subjects’ political alignment (precinct-level voting is a matter of
public record and tends to be very homogeneous).
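
The inference step can be sketched just as briefly. This is an illustration under stated assumptions, not the paper's procedure: the vote-count dictionary, the party labels, and the 10-point margin threshold are all hypothetical, and labelling an individual by their precinct's majority is only a guess about that person.

```python
def precinct_lean(votes, min_margin=0.10):
    """Return the leading party if it wins by at least min_margin, else None.

    votes: dict mapping party label -> vote count for one precinct
           (hypothetical format; real public records vary by state).
    """
    total = sum(votes.values())
    if total == 0:
        return None
    ranked = sorted(votes.values(), reverse=True)
    runner_up = ranked[1] if len(ranked) > 1 else 0
    leader = max(votes, key=votes.get)
    margin = (votes[leader] - runner_up) / total
    return leader if margin >= min_margin else None


# Made-up precinct tallies.
print(precinct_lean({"D": 700, "R": 300}))  # lopsided precinct
print(precinct_lean({"D": 510, "R": 490}))  # too close to call
```

The more homogeneous the precinct, the more confidently a home address translates into a political label.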

It’s not hard to imagine how the re-identification process could have
gone further – for example, you could look up the owner of the house
where the diners ate, then look for people with the same surname living
at the addresses they went home to.

The tech industry is in the midst of a largely invisible re-identification crisis.
Much of the promise of machine learning and other Big Data applications
rests on the idea that potentially compromising data can be rendered
safe for use and sharing through “de-identification” (the GDPR has a
huge loophole that absolves companies of most of their responsibilities
if they “de-identify” data before sharing it!). The existence of
reliable de-identification is taken as an article of faith by
industry and regulators alike, even though computer scientists are deeply
skeptical that it can be effective or even possible.

The really interesting thing about the Thanksgiving Effect is how trivial
it is to extract people’s political alignment, familial relations, and other
personal (and confidential) information from supposedly
harmless marketing data.

https://boingboing.net/2018/06/01/location-privacy-is-hard.html

Oh wow yeah that’s not creepy at all…nope
