More than 11,000 files were found to be easily accessible on cloud storage containers, some of which exposed sensitive data such as email addresses, passwords, and credit card transaction logs. That’s according to our research into the risks surrounding infrastructure-as-a-service cloud environments, which found that administrators have been leaking private information through their cloud accounts thanks to simple misconfiguration errors.
As part of our research, we demonstrated an attack scenario, showing how an amateur attacker could access thousands of files stored in the cloud without needing any user names and passwords. Based on our investigations, an attacker could have discovered 16,000 valid domain prefixes for users’ cloud accounts, leading the attacker to more than 11,000 publicly accessible files hosted through just one cloud service provider.
Storage container attack
Meet Jack, a misguided script kiddie who wanted to get his hands on data in the cloud. Considering how valuable data is, and that data breaches increased by 23 percent in 2014, Jack believed that there was a good market for stolen data and that he could jump on the bandwagon by compromising a company’s cloud resources. His plan was to sell the stolen data for a good price on cybercrime forums.
To plan and carry out the attack, Jack had to find out a bit more about his target. Jack registered a trial account for Microsoft’s Azure cloud service to try out the environment. While setting up a test data storage bucket, he created a unique URL that is used to access his own storage bucket.
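As a rough illustration of what such a URL looks like, the sketch below assembles an address from an account name (the domain prefix), a bucket name, and a file name. The `blob.core.windows.net` endpoint pattern is an assumption taken from Azure's public documentation, not from our research notes, and the function name and example values are hypothetical.

```python
from urllib.parse import quote

# Assumed public endpoint suffix for Azure blob storage.
BLOB_ENDPOINT_SUFFIX = "blob.core.windows.net"


def build_blob_url(account: str, container: str, blob_name: str) -> str:
    """Assemble the address of a single file in a storage bucket.

    account   -- the domain prefix chosen by the account owner
    container -- the bucket (top-level folder) name
    blob_name -- the file name, which may contain "/" separators
    """
    return (
        f"https://{account}.{BLOB_ENDPOINT_SUFFIX}/"
        f"{quote(container, safe='')}/{quote(blob_name, safe='/')}"
    )


# Hypothetical example values, not a real account.
print(build_blob_url("exampleaccount", "backup", "2014/db.bacpac"))
```

Every part of the address apart from the fixed suffix is chosen by the account owner, so knowing the address is not a substitute for access control.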
Once Jack knew the URL structure for Azure data storage buckets, he realized that he could potentially find the buckets of other users simply by guessing the URL. As it happens, some administrators had misconfigured their access permissions, leaving the access rights wide open. This meant that their files, including the ones containing sensitive data, were accessible to anybody on the internet, as long as they knew the address of the bucket.
All that Jack needed was the domain prefix and the name of the target’s bucket. There is no central listing of all domain prefixes available from the cloud service provider, but since the domain names follow a simple schema, Jack was able to use a simple script to conduct dictionary attacks and guess countless combinations of domains. The script iterated through a wordlist of common names, and after it had run for a few hours, Jack saw the first hits roll in.
Within a day, the script aggregated a list of over 16,000 valid domain prefixes from the cloud service provider. But the domain names alone were not enough to let Jack access the users’ buckets, even if they had subfolders without viewing restrictions. To view the subfolders, Jack needed to know their names. There is no publicly available directory listing for the root folder. But it’s just as easy to guess the folder names as it is to find the domain prefix in the first place. All that Jack had to do was modify his dictionary attack script to search for commonly used directory names such as “backup,” “archive” or “logs.”
These names suggest that the found folders could hold valuable information which he could sell to cybercriminals.
Fortunately, in many cases the administrator had set the permissions correctly, preventing Jack from accessing the files. However, Jack’s script still identified 51 open directories across the 16,000 domains. This represents a hit ratio of roughly 0.3 percent, which doesn’t sound like much. But keep in mind that each bucket may have contained multiple files. As a result, Jack’s final list included around 11,000 accessible files. If Jack had searched with larger wordlists of directory names, he might have found even more data.
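The arithmetic behind these figures can be checked in a few lines. The inputs are the rounded numbers quoted above ("over 16,000" prefixes, "around 11,000" files), so the results are approximations rather than exact measurements.

```python
# Rounded figures quoted in the text.
valid_prefixes = 16_000
open_directories = 51
accessible_files = 11_000

# Share of discovered domains that exposed an open directory.
hit_ratio_percent = open_directories / valid_prefixes * 100

# Average number of exposed files behind each open directory.
files_per_open_directory = accessible_files / open_directories

print(f"Hit ratio: {hit_ratio_percent:.2f}%")  # Hit ratio: 0.32%
print(f"Files per open directory: {files_per_open_directory:.0f}")  # Files per open directory: 216
```

A small hit ratio still translates into a large exposure because each open directory held, on average, a couple of hundred files.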
Not all of the accessible data blobs contained sensitive information. Some files were just images or public HTML files. However, other files included sensitive data. One blob belonging to a payment processing company contained some “bacpac” files, a file extension used for database backups. The backed-up database included a lot of sensitive data, including credit card transaction logs, user IDs, email addresses, and passwords. This was exactly the kind of data that Jack was looking for.
Figure. Leaked database schema
Jack didn’t have to use any password to get to this data. The domain prefix and the folder names were all that Jack needed to access a storage bucket where the owner failed to restrict access permissions.
While Jack is just a fictional character, our research shows that this attack method is highly feasible, and the sensitive data that we uncovered is real. This is not just a hypothetical attack scenario.
The relevant payment company was notified of this issue. The problems illustrated in this research are not limited to a single cloud service provider. Similar attacks could be carried out against other cloud infrastructures.
Mitigation
Don’t become Jack’s next victim. Symantec advises administrators to adhere to the following steps to better secure their IaaS cloud resources:
- Ensure that you understand the settings of your cloud resources and configure them accordingly
- Enable event logging to keep track of who is accessing data in the cloud
- Read the cloud providers’ service-level agreements to learn how data in the cloud is secured
- Include cloud IP addresses in vulnerability management processes and perform audits on any services that are provided through the cloud
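As a starting point for the first and last steps, an administrator could check from outside whether any of their own buckets answer an anonymous listing request. The following is a minimal sketch, not a complete audit. It assumes the Azure endpoint pattern and the `restype=container&comp=list` query from the public REST documentation, treats only an HTTP 200 response as "open," and uses hypothetical function, account, and bucket names. Run it only against storage accounts that you own.

```python
import urllib.error
import urllib.request


def list_url(account: str, container: str) -> str:
    """Build the anonymous listing URL for one bucket (assumed endpoint pattern)."""
    return (
        f"https://{account}.blob.core.windows.net/{container}"
        "?restype=container&comp=list"
    )


def is_anonymously_listable(status: int) -> bool:
    """Treat only a successful response as an open bucket."""
    return status == 200


def probe_status(url: str, timeout: float = 10.0) -> int:
    """Send one unauthenticated GET request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Error responses arrive as exceptions; the status code is still useful.
        return err.code


def audit(account: str, containers: list[str]) -> dict[str, bool]:
    """Map each of your own bucket names to True if it can be listed anonymously."""
    return {
        name: is_anonymously_listable(probe_status(list_url(account, name)))
        for name in containers
    }


# Hypothetical usage against an account you administer:
# print(audit("exampleaccount", ["backup", "archive", "logs"]))
```

This check only detects buckets that allow anonymous listing. A bucket that blocks listing but still allows anonymous reads of individual files would pass it, so the provider's own permission settings remain the authoritative source.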
You can find more information about this example, along with the other attack scenarios that we researched, in our whitepaper covering the risks to IaaS environments.