Researchers from the UpGuard Cyber Risk team say they recently discovered as much as 146 gigabytes of Facebook user data exposed to the public internet via Amazon servers in one of two separate leaks. That collection, the larger of the two datasets discovered, was managed by Mexico-based media company Cultura Colectiva and contained 540 million records. Details held include Facebook IDs, account names, comments, likes, reactions, and more.
The data remained exposed from at least January 10, when it was discovered, through April 3, and Cultura Colectiva could not be reached about the apparent breach during that time. That's despite Amazon Web Services (AWS) being informed of the situation in early February and stating that the owner of the AWS S3 storage bucket, named "cc-datalake," was aware of the problem.
The storage bucket was reportedly secured only after Bloomberg contacted Facebook directly about the matter, and there's no indication of how long it had been visible before it was discovered.
The second of two leaks
As noted above, the "cc-datalake" storage bucket was the larger of two separate data exposures, but the second could be more damaging even though it was secured more quickly.
Stored as a backup on AWS servers for a Facebook-integrated app called "At the Pool," the second leak discovered by UpGuard contained similar data but also exposed unprotected passwords from the app for as many as 22,000 users. Researchers say that it didn't expose Facebook passwords directly.
As with the data exposed by Cultura Colectiva, At the Pool's data was exposed for an indeterminate amount of time, but it was secured while UpGuard's team was still viewing it. No emails had been sent to the owner of the storage, so the circumstances surrounding its removal aren't immediately clear.
Both the At the Pool app and its developers have been out of operation since at least 2014, but any users who reused the same password across services and haven't changed it in the meantime may be vulnerable.
Is this the new normal?
This is far from Facebook's first leak in 2019, despite changes to the company's practices and the efforts it has made since the Cambridge Analytica scandal to clean up third-party data collection. It's not clear whether the data was collected before or after those changes were made but, as UpGuard points out, the discovery may indicate a much deeper problem with mass data collection and storage.
The datasets in question were, in fact, no longer under Facebook's management or control after having been skimmed from the social media platform. As a high-ranking digital media publisher focused on Latin America, Cultura Colectiva was likely using the data to better predict how well future content would spread, gain popularity on similar platforms, and generate traffic. But the problem of stored data doesn't go away once it has left Facebook.
Instead, once data is collected and stored outside the control of the originating company, in the hands of the platform's developers, the surface area in which breaches and leaks of that platform's data can occur only increases.