Twitter’s vast metadata haul is a privacy nightmare for users

Posted July 10, 2018

Working with publicly available metadata from Twitter, a machine learning algorithm was able to identify users with 96.7 per cent accuracy

Metadata is everywhere: in everything you tweet, every picture you take, and every status update you post on Facebook. It’s used by police and security forces to identify people who try to hide their identities and locations, while metadata in selfies can inadvertently ensnare criminals unaware that the data can destroy their alibi.

And metadata on Twitter can also be used to identify each and every one of us with extreme precision, according to a new paper by researchers at University College London and the Alan Turing Institute. Your tweets, it turns out, no matter how anonymous you might think they are, can be traced back to you with unerring accuracy. All someone needs to do is look at the metadata.

The scientists used tweets and the associated metadata to identify any user in a group of 10,000 Twitter users with 96.7 per cent accuracy. Even when muddling up to 60 per cent of the metadata, the model could still pinpoint a single person with more than 95 per cent accuracy.

“Metadata is much larger when compared to the actual content of a tweet,” says Savvas Zannettou, a PhD student at the Cyprus University of Technology. People wrongly assume that because the data is online, they aren’t vulnerable to identification, adds Beatrice Perez of University College London, a co-author of the paper.

No right-thinking person would tell a total stranger what their address is if approached on the street. But they might tell them how often they turn their bedroom light on and off. “That’s the mentality with metadata,” says Perez. “People think it’s not a big deal. But couple it with another piece of information and I know when you’re home or not.”

It’s a commonly held belief, agrees Zannettou. “The average person doesn’t recognise that she can be easily identified using metadata.” Most Twitter users, he reckons, have no idea that Twitter holds 144 pieces of metadata on them, which is publicly accessible through the site’s API.

Being anonymous won't help

The researchers took a corpus of five million Twitter users and ran 14 pieces of metadata from their tweets (including the time the account was created, the time a tweet was published, and the number of favourites, followers and following) through three different machine learning algorithms.

The algorithm that identified individual accounts most accurately was also one of the most basic machine learning algorithms, say the researchers. It showed that it’s possible to identify an individual almost exactly using just a handful of pieces of metadata.

It does so by training the model with a known dataset of users, demonstrating that they behave in a certain way on Twitter based on the metadata of their tweets. When the model is run “in the wild”, using new tweets from the same users, it can unpick people’s behaviour from metadata, identifying them as a specific individual.
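The train-then-identify loop described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' code: the article does not name the algorithm, so a plain nearest-neighbour lookup stands in for it, and the three fields, the `make_rows` helper and the synthetic numbers are all invented for the example. The toy users are deliberately well separated, so real accounts would overlap far more than this.

```python
import random

# Hypothetical subset of metadata fields; the study used 14 real ones.
FIELDS = ("followers", "following", "post_hour")

def make_rows(user_id, count, rng):
    """Synthetic metadata rows for one user: a stable habit plus small jitter."""
    base = (100 + 37 * user_id, 50 + 11 * user_id, (7 * user_id) % 24)
    jitter = (3, 1, 1)
    return [
        tuple(b + rng.randint(-j, j) for b, j in zip(base, jitter))
        for _ in range(count)
    ]

def sq_dist(a, b):
    """Squared Euclidean distance between two metadata rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def identify(row, training):
    """Return the user whose known row lies closest to the new row."""
    return min(training, key=lambda item: sq_dist(item[0], row))[1]

def run_experiment(num_users=20, seed=0):
    rng = random.Random(seed)
    # "Known dataset": labelled rows the model is trained on.
    training = [(row, u) for u in range(num_users) for row in make_rows(u, 5, rng)]
    # "In the wild": new rows from the same users, labels held back for scoring.
    fresh = [(row, u) for u in range(num_users) for row in make_rows(u, 3, rng)]
    hits = sum(identify(row, training) == u for row, u in fresh)
    return hits / len(fresh)

ACCURACY = run_experiment()
print(f"identified {ACCURACY:.0%} of fresh tweets")
```

No tweet text is involved at any point: the lookup works purely on numbers attached to the tweet.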

Trying to anonymise the data collected by social networks isn’t the answer, says Perez. “It’s very hard to anonymise a data set,” she explains. Triangulation using one or more sets of data is easy to do, and can often undo any attempts to remove identifying information.

Perez and her colleagues proved that by obfuscating the dataset they had from Twitter, removing some fields to try and make it more difficult for their system to pinpoint individuals. “If we had a few data points not blurred, it was still easy,” she says. The identification rate stayed largely stable right up to the point at which all unique elements were removed, and it became impossible to discern one person from any other.
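A toy version of that obfuscation test, assuming four made-up profiles and three hypothetical fields (none of this comes from the paper): hide fields one at a time and count how many users still have a row nobody else shares.

```python
from collections import Counter

# Hypothetical fields and invented profiles, for illustration only.
FIELDS = ("account_year", "followers", "post_hour")
PROFILES = {
    "alice": (2009, 120, 8),
    "bob": (2011, 131, 22),
    "carol": (2011, 430, 8),
    "dave": (2015, 975, 22),
}

def obfuscate(row, hidden):
    """Blank out every field whose name is in `hidden`."""
    return tuple(None if name in hidden else value
                 for name, value in zip(FIELDS, row))

def identifiable_fraction(hidden):
    """Share of users whose obfuscated row is still unique in the dataset."""
    masked = {user: obfuscate(row, hidden) for user, row in PROFILES.items()}
    counts = Counter(masked.values())
    return sum(counts[row] == 1 for row in masked.values()) / len(masked)

RESULTS = {
    "nothing hidden": identifiable_fraction(set()),
    "year hidden": identifiable_fraction({"account_year"}),
    "year and hour hidden": identifiable_fraction({"account_year", "post_hour"}),
    "everything hidden": identifiable_fraction(set(FIELDS)),
}

for label, share in RESULTS.items():
    print(f"{label}: {share:.0%} of users still unique")
```

In this toy data the follower count alone separates everyone, so hiding the other two fields changes nothing. Only when the last distinguishing field goes do the users become interchangeable.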

Things are likely to improve following the introduction of GDPR in late May. “I think we’re going to find more scrutiny around metadata,” explains Pat Walshe, a data protection consultant. Article 25 of GDPR calls for “data protection by design and by default”. That provision, which builds on the principle of data minimisation, requires that companies process only the specific data needed to carry out a task.

But the bigger question, beyond whether it’s right or not that companies can hold so much identifying information about us all, is whether the average person values their privacy in the first place. “For sure, the average user should care,” says Zannettou. “But I’m sceptical if they do.”


Posted July 10, 2018

Hundreds of articles have been written warning people that if they are on the internet, they and their data are not secure. If you use any website, whether through a commercial VPN or proxy, a free one, or even TOR, you can be traced. Normal users may not be able to track you, but any of the thousands of law enforcement and intelligence organizations can, some as easily as you open a web page. For years it has been public knowledge that people can be traced on Facebook, Twitter, etc. by anyone. The show Catfish demonstrated every week how they tracked people down on Facebook and Twitter, so it doesn't take a rocket scientist to see that a bunch of code and software can do it.

Posted July 11, 2018

10 hours ago, straycat19 said:

Hundreds of articles have been written warning people that if they are on the internet, they and their data are not secure. If you use any website, whether through a commercial VPN or proxy, a free one, or even TOR, you can be traced.

There's a big difference between being traced and being served malware that the FBI or some other agency bought from a hacker who is no better than the people they are targeting, or being some careless user with poor cybersecurity who is on sites like Twitter and Facebook to begin with. In many countries, using Tor and open-source social media is a matter of life or death, because people speak on subjects their government doesn't allow. Things that people like you take for granted, like freedom of the press, can get you killed in many countries. These are Tor users too, and their lives depend on this software. Funny how they were never able to catch up with Snowden until after he escaped, and all the carnies they put on the sites he used didn't do a bit of good after the fact. The last time I read anything about them exploiting Tor was some years ago, when they were using it to catch a pedo ring.

Basically every article tells you how they got caught, and none of it was as simple as tracing someone. They have only ever posted about the very few they caught, or the ones they knew got away, because the majority never got caught. For every person they took out on the dark web, there were hundreds they never caught. Drug markets on the dark web are so bad that the idiots used to come onto the clearnet, on sites like Reddit, and tell each other where to buy. So Reddit banned the sub, which just did these junkies a favor by shutting them up where the FBI and others could not see them. The further they drive them underground, the harder they are to catch. It's like all the hackers who are using Twitter: they're idiots. As soon as one of them does something serious, the FBI will have enough to catch them on.

Quote

hackerfactor:

How do Tor users get caught later down the track after years?

(A) The user was careless and leaked enough information about themselves outside of TOR.

(B) The user was targeted by malware or a hostile site (even a hostile hidden service) that exposed enough information to determine their identity.

(C) A hostile exit node alters content, permitting tracking. (E.g., call-home JavaScript, replacing downloaded executables, altering SSL certificates to force a non-tunnel CA lookup, tracking bitcoin transfers, etc.)

(D) A server creates a profile of the user that is specific enough to identify the user/browser/computer outside of TOR.

(E) The user's non-TOR activities led to the capture. After acquiring the computer (via warrant), they also identify past TOR activity.

(F) The user is lured out of TOR. E.g., "We should meet in person..."

You wrote: "Isn't the whole idea of Tor to remain anonymous?" No. TOR only anonymizes the network and transport layers. It does nothing for anonymizing the session, presentation, or application layers. And it does nothing to anonymize the user (your name or any information that you reveal about yourself).

You also mentioned people who were captured years later. Even if law enforcement knows who you are right now, they have no obligation to arrest you right now. They may find you due to a minor crime and wait until you commit a major infraction before sweeping in. Or in the case of child porn, they may queue up the people so they can arrest them all at the same time. With a big bust, there's no early warning that the cops are coming. (Basically, rather than arresting one person and warning the others -- potentially allowing criminals to flee/hide, the police will wait and arrest everyone at once.)

As you can see from the quote above, smart hackers know how you get caught and don't even need to be told, because they have discussed it a million times. The ones on Twitter want media attention, are trying to make a name for themselves, and are reckless to even be posting about hacking on such a site. If everyone got caught online, the dark markets and hacking forums would not even exist. Most of them stay one step ahead, or crime would go away on the internet. And every day someone is getting exploited with bitcoin miners, which are legal software you can even download from the Windows Store to mine on your own PC.

White-hat hackers would be without jobs, there would be no need for billions of people to do security updates or use antivirus, and law enforcement would be able to put all their efforts into solving crimes in the street, if they always caught the bad guys on the internet in the perfect fairytale world you make cybercrime solving out to be. Even the government uses security software, because there is no agency that has never been hacked.

The Equifax hack alone put the data of something like half of US users at risk, and that was just one hack. They're really doing a super job stopping it.

https://money.cnn.com/2018/02/09/pf/equifax-hack-senate-disclosure/index.html


tao


straycat19


steven36
