Microsoft worker accidentally exposes 38TB of sensitive data in GitHub blunder

A Microsoft employee accidentally exposed 38 terabytes of private data while publishing a bucket of open-source AI training data on GitHub, according to Wiz security researchers who spotted the leaky account and reported it to the Windows giant.

And Redmond, in a Monday write-up, downplayed the blunder, and said it was merely “sharing the learnings” to help customers avoid making similar mistakes. This is despite Wiz claiming the leaky data bucket had private keys, passwords, and over 30,000 internal Microsoft Teams messages, as well as backup data from two employees’ workstations.

“No customer data was exposed, and no other internal services were put at risk because of this issue,” the Microsoft Security Response Center team said. “No customer action is required in response to this issue.”

In a report published on Monday, Wiz researchers Hillai Ben-Sasson and Ronny Greenberg detailed what happened. While they were scanning for misconfigured storage containers, they came across a GitHub repository belonging to the Microsoft AI research team that provides open-source code and machine learning models for image recognition.

This repository contained a URL with an overly-permissive Shared Access Signature (SAS) token for a Microsoft-owned internal Azure storage account containing private data.

A SAS token is a signed URL that grants some level of access to Azure Storage resources. The user can customize the level of access, from read-only to full-control, and in this case, the SAS token was misconfigured with full-control permissions.
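To make that concrete, here is a rough sketch of what a SAS URL carries in its query string. The field names `sp` (signed permissions), `se` (signed expiry), and `sig` (signature) are standard SAS parameters; the account name, container, expiry date, and signature in `EXAMPLE_URL` are invented placeholders, not the values from the leaked token, and `describe_sas_url` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, parse_qs

# Common permission letters found in the `sp` (signed permissions) field.
PERMISSION_NAMES = {
    "r": "read",
    "a": "add",
    "c": "create",
    "w": "write",
    "d": "delete",
    "l": "list",
}


def describe_sas_url(url):
    """Summarize the access a SAS URL claims to grant (hypothetical helper)."""
    query = parse_qs(urlsplit(url).query)
    permissions = query.get("sp", [""])[0]
    return {
        "permissions": [PERMISSION_NAMES.get(ch, "unknown(%s)" % ch) for ch in permissions],
        "expires": query.get("se", [None])[0],
        "has_signature": "sig" in query,
    }


# Placeholder URL for illustration only; every value here is made up.
EXAMPLE_URL = (
    "https://exampleaccount.blob.core.windows.net/models"
    "?sv=2021-08-06&sr=c&sp=racwdl&se=2030-01-01T00:00:00Z&sig=PLACEHOLDER"
)

print(describe_sas_url(EXAMPLE_URL))
```

Anyone holding the URL gets whatever the `sp` letters allow until the `se` time passes, which is why a token with write and delete rights is so much more dangerous than a read-only one.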

This meant the Wiz team, and potentially more nefarious-minded snoops, could not only view everything in the storage account but also delete or alter existing files.

“Our scan shows that this account contained 38TB of additional data — including Microsoft employees’ personal computer backups,” Ben-Sasson and Greenberg said. “The backups contained sensitive personal data, including passwords to Microsoft services, secret keys, and over 30,000 internal Microsoft Teams messages from 359 Microsoft employees.”

Microsoft, for its part, says the personal computer backups belonged to two former employees. After being notified about the exposure on June 22, Redmond says it revoked the SAS token to block all external access to the storage account, plugging the leak on June 24.

“Additional investigation then took place to understand any potential impact to our customers and/or business continuity,” the MSRC report says. “Our investigation concluded that there was no risk to customers as a result of this exposure.”

Also in the write-up, Redmond recommended a series of best practices for SAS to minimize the risk of overly permissive tokens. This includes limiting the scope of the URLs to the smallest set of resources required, and also limiting permissions to only those needed by the application.

There’s also a feature that allows users to set an expiration time, and Microsoft recommends one hour or less for SAS URLs. This is all good advice; it’s just a pity Redmond didn’t eat its own dog food in this instance.
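Those three recommendations (narrow scope, minimal permissions, short lifetime) lend themselves to an automated check. What follows is a minimal sketch, not Microsoft's scanning tool: `audit_sas_url` is a hypothetical function, treating read plus list as "read-only" is our assumption, and the check relies only on the standard `sp`, `se`, and `srt` query fields. The `srt` field appears on account-level SAS tokens, which cover a whole storage account rather than a single container or blob.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlsplit, parse_qs

MAX_LIFETIME = timedelta(hours=1)  # the ceiling recommended in the write-up
READ_ONLY = set("rl")              # assumption: read + list counts as read-only


def audit_sas_url(url, now):
    """Return a list of findings for a SAS URL (hypothetical helper)."""
    query = parse_qs(urlsplit(url).query)
    findings = []

    # Permissions: flag anything beyond read and list.
    extra = set(query.get("sp", [""])[0]) - READ_ONLY
    if extra:
        findings.append("grants more than read/list: " + "".join(sorted(extra)))

    # Scope: an account SAS carries `srt` and spans the whole account.
    if "srt" in query:
        findings.append("account-level SAS; scope is the whole storage account")

    # Lifetime: compare the expiry against the one-hour recommendation.
    expiry_text = query.get("se", [None])[0]
    if expiry_text is None:
        findings.append("no se field; expiry may come from a stored access policy")
    else:
        expiry = datetime.fromisoformat(expiry_text.replace("Z", "+00:00"))
        if expiry.tzinfo is None:
            expiry = expiry.replace(tzinfo=timezone.utc)
        if expiry - now > MAX_LIFETIME:
            findings.append("expires more than one hour from now")

    return findings


# Placeholder URL for illustration only; every value here is made up.
risky_url = (
    "https://exampleaccount.blob.core.windows.net/"
    "?sv=2021-08-06&ss=b&srt=sco&sp=rwdl&se=2030-01-01T00:00:00Z&sig=PLACEHOLDER"
)
for finding in audit_sas_url(risky_url, datetime.now(timezone.utc)):
    print(finding)
```

A check like this only inspects what the URL advertises. It cannot tell whether a token has been revoked, so it is a complement to, not a substitute for, rotating keys and tracking issued tokens.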

Finally, Redmond promises to do better on its end of things: “Microsoft is also making ongoing improvements to our detections and scanning toolset to proactively identify such cases of over-provisioned SAS URLs and bolster our secure-by-default posture.”

This, of course, isn’t Microsoft’s only issue with key-based authentication in recent months.

In July, Chinese spies stole a secret Microsoft key and used it to break into US government email accounts. Wiz researchers weighed in on that security snafu, too. ®