It’s no secret that social networks can gather enough usable information about their users to predict behaviors, desires, and advertising leanings. What might come as a surprise to some is that you no longer even need to sign up for one of these services for those same social networks to gain the same information.
You can thank your friends for that.
At least, that’s one of the main takeaways from new research published in the journal Nature Human Behaviour.
The research, brought forward by scientists at the University of Vermont and the University of Adelaide, indicates that machine learning algorithms are able to predict what a user would say in a tweet without technically looking at the person’s past Twitter behavior.
The machine learning-powered models were fed over thirty million posts from close to 14,000 Twitter users. From an information pool that large, a model would already be expected to digest everything a person has said and predict what they might say next with a fairly high degree of accuracy. The latest research bore this out once again.
Where the research really proved its worth was in the finding that these models could reach almost the same level of “potential predictive accuracy” using not the user’s own data, but the data of the person’s contacts. Typically, 8-9 contacts proved sufficient to reach 95 percent of the predictive accuracy the model achieved when fed data directly from the user.
Taking the notion a step further, the same predictive accuracy is said to hold after the user in question leaves the social network or, worse still, has never joined the social network in the first place.
Privacy has become an increasingly important and controversial issue in the wake of scandal after scandal and in particular when it comes to social networks and the vast wealth of data they are made privy to.
However, the how-much-data-should-you-share debate has always revolved around the notion of the user offering up their data in the first place. This research, although only research at the moment, suggests the question of data collection is likely to be more fundamental and more all-encompassing than previously thought.
Regardless of whether any one particular social network makes use of such tactics, the suggestion is that sites could get enough data to predict what a user might say, do, or think simply by having access to other people who know the user. That opens new doors of possibility when it comes to understanding and regulating what data a company can or should collect, and to the very definition of data profiling.
If this sounds like a far-fetched idea, you only have to look as far as the TV advertising world for an example of the real-world implications of this research. In this world, TV ads have become the new currency, and this spans TV makers through to content providers, with all those involved down the ad food chain collecting viewing habits to sell or to use to serve ads that are directly relevant to the viewer.
Previously, any or all of these data-seeking entities would have needed access to the user’s viewing habits to establish the associations needed to generate elevated accuracy in ad-serving. Based on the research here, it is very possible those same advertisers might be able to gain the same information just from known associates.
While this might not be as much of an issue to those who are never exposed to the service, or never will be, it is still food for thought on what privacy means in the current climate and how data can be collected. For others, it might be the case that the moment you sign up for that new video, music, or social network service, the information the service has already collected from your friends and associates (who probably also recommended the service to you in the first place) is enough to serve ads hand-picked just for you.
Almost as well as if you had been a subscriber to the service for years.