ProxySG is a Symantec Secure Web Gateway (SWG) that can serve as a forward or reverse proxy. In both deployment modes, it uses its highly efficient caching capabilities to improve a customer’s Internet experience. In forward proxy mode, the customer is typically an enterprise whose employees get faster Internet access because the caching resources are close to them. Note that in this mode there may be additional caching devices upstream (think Content Delivery Networks, reverse proxies, load balancers, etc.). In reverse proxy mode, ProxySG is deployed in front of the Origin Content Server (OCS) and is typically the last caching device on the way to the web server.
Recently, an interesting piece of research called “Web Cache Deception” appeared online. The original research dates back to February 2017, but it gained additional publicity when Omer Gil presented it at Black Hat USA this July. In parallel, a more detailed white paper was published. The research describes a new attack vector that exploits the discrepancies between caching behavior on a caching device and resource retrieval behavior on the web server that serves the resource behind it.
This simple attack exploits the sometimes-undefined behavior that occurs when a non-existent but cacheable resource is requested from the OCS. Depending on the web framework and server configuration, the OCS might fall back to the last known resource while retrieving the page. The researcher provides several specific examples of this behavior in PHP, Django and ASP.NET. The focus of this article is ProxySG caching behavior rather than the OCS, so we will use the simplest example, a PHP page, for demonstration purposes.
When accessing the most basic authenticated PHP page:
Upon successful authentication, the default PHP/Apache configuration on Ubuntu 12.04 returns status 200 and serves the content of secret_w_auth.php:
For simplicity, we will leave the query string and request/response headers aside for now.
On the caching side, seeing status 200 rather than 404, the caching device assumes nonexistent.css was served and caches the resource under the requested URL. This is an example of impedance mismatch, this time between the logic at the middlebox (caching engine) and the logic at the endpoint (OCS). It arises because the caching device does not always know which web servers or web frameworks reside upstream and, arguably, it shouldn’t know. The researcher provides several examples of caching devices that make the attack possible (Cloudflare, IIS ARR and NGINX). In addition, several affected CDN vendors have published their own analyses (see References section). In the next section, we will explore ProxySG caching behavior in the context of this attack in both forward proxy and reverse proxy modes.
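The mismatch described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the request path /secret_w_auth.php/nonexistent.css, the prefix fallback in origin_fetch, and the extension-based rule in NaiveCache are all hypothetical stand-ins, not the behavior of any specific web server or caching product.

```python
# Hypothetical origin content: the only page that exists, and it is private.
PAGES = {"/secret_w_auth.php": "private account data"}

# Assumed rule: the naive cache treats these extensions as static content.
STATIC_EXTENSIONS = (".css", ".js", ".png")


def origin_fetch(path: str, authenticated: bool) -> tuple[int, str]:
    """Toy origin that falls back to the longest existing prefix of the path."""
    candidate = path
    while candidate:
        if candidate in PAGES:
            if not authenticated:
                return 401, ""
            return 200, PAGES[candidate]
        # Drop the last path segment and try again.
        candidate = candidate.rpartition("/")[0]
    return 404, ""


class NaiveCache:
    """Caches any 200 response whose URL looks static; ignores who asked."""

    def __init__(self) -> None:
        self.store: dict[str, str] = {}

    def get(self, path: str, authenticated: bool) -> tuple[int, str]:
        if path in self.store:
            # Cache hit: the origin (and its authentication check) is skipped.
            return 200, self.store[path]
        status, body = origin_fetch(path, authenticated)
        if status == 200 and path.endswith(STATIC_EXTENSIONS):
            self.store[path] = body
        return status, body


cache = NaiveCache()
deceptive_path = "/secret_w_auth.php/nonexistent.css"
victim = cache.get(deceptive_path, authenticated=True)     # populates the cache
attacker = cache.get(deceptive_path, authenticated=False)  # served from cache
print(victim, attacker)
```

In this model the victim's authenticated request stores the private page under a URL that looks like a stylesheet, and the attacker's later unauthenticated request is answered from the cache without the origin ever being consulted.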
ProxySG Caching Logic
By default, ProxySG is very careful when caching an object. The out-of-the-box configuration obeys all the accepted cache controls, such as Cache-Control headers and expiration timestamps. Other factors also affect the default caching behavior, such as the presence of cookies and of the Authorization header. The rule of thumb is not to cache private or user-specific information.
Caching Authenticated Data
Taking a closer look at the previous example, the GET request carries the Authorization header:
ProxySG has a feature for caching authenticated data, which is turned on by default and can be controlled via configuration. All the factors that affect the cacheability of an HTTP request or response in a non-authenticated flow (such as Cache-Control) also apply when authenticated data is cached. In addition, authenticated cached data is marked with an “authenticated” flag when it is stored on disk, which indicates that future requests for such content will always require clients to authenticate to the server before the cached content is served. Note that a similar flow applies to other authentication methods; HTTP basic authentication is chosen here only for the sake of simplicity.
In these cases, ProxySG always issues a GET request with an “If-Modified-Since” header to verify that the client has provided valid authentication credentials to the origin server, even when the cached authenticated data has not yet expired. Therefore, an unauthenticated user cannot obtain cached authenticated data that the server would have refused to serve had the user requested it directly without authentication. When the cached object is fresh and the origin server allows access to it, the origin server can reply to the proxy with a 304 (instead of a 200) response, saving server-side bandwidth.
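The revalidation flow described above can be sketched as follows. This is a simplified model, not ProxySG's implementation: the origin function, the CachedObject type, the serve function, the credentials, and the timestamp are all hypothetical names and values chosen for illustration.

```python
import base64
from dataclasses import dataclass

# Hypothetical values used only for this sketch.
VALID_CREDENTIALS = "Basic " + base64.b64encode(b"user:pass").decode()
LAST_MODIFIED = "Tue, 01 Aug 2017 00:00:00 GMT"
BODY = "private account data"
origin_log: list[int] = []  # status codes the toy origin has returned


def origin(headers: dict) -> tuple[int, str]:
    """Toy origin: checks credentials first, then honors a conditional GET."""
    if headers.get("Authorization") != VALID_CREDENTIALS:
        status, body = 401, ""
    elif headers.get("If-Modified-Since") == LAST_MODIFIED:
        status, body = 304, ""
    else:
        status, body = 200, BODY
    origin_log.append(status)
    return status, body


@dataclass
class CachedObject:
    body: str
    last_modified: str
    authenticated: bool  # set when the object was fetched with credentials


def serve(cache: dict, url: str, client_headers: dict) -> tuple[int, str]:
    entry = cache.get(url)
    if entry is not None and not entry.authenticated:
        # Unauthenticated objects are served without contacting the origin
        # (freshness checks are omitted in this sketch).
        return 200, entry.body
    # Forward the client's credentials, if any, to the origin.
    upstream = {k: v for k, v in client_headers.items() if k == "Authorization"}
    if entry is not None:
        # Authenticated objects are always revalidated with the origin.
        upstream["If-Modified-Since"] = entry.last_modified
    status, body = origin(upstream)
    if status == 304 and entry is not None:
        return 200, entry.body  # origin accepted the credentials
    if status == 200:
        cache[url] = CachedObject(body, LAST_MODIFIED, "Authorization" in client_headers)
    return status, body


cache: dict = {}
good = {"Authorization": VALID_CREDENTIALS}
print(serve(cache, "/secret", good))  # first fetch, stored with the flag set
print(serve(cache, "/secret", good))  # revalidated, body comes from the cache
print(serve(cache, "/secret", {}))    # no credentials, origin refuses
```

Because the origin sees every request for a flagged object, the decision to grant access stays with the server; the cache only saves the transfer of the body when the server answers 304.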
Caching Unauthenticated Data
For unauthenticated cached objects, ProxySG does not contact the origin server if the object is still fresh in cache. So, the deception is certainly possible, but it does no harm, because the server would have served the same content to all users even if no caching were involved.
This brings us to cookie-based authentication and the original PayPal vulnerability from the aforementioned white paper. The following is the request-response flow when visiting the most basic PHP web page that uses a custom login form and standard session management support:
The initial login page redirects an authenticated user to the next page, which contains private information. The PHP session module takes care of session management and embeds the cookie value in HTTP requests, as seen in the screenshot. Because the presence of a cookie in the request or response is considered a sign of private information, one of the default ProxySG caching behaviors is to bypass caching for these transactions. So, the exploit is not possible with the out-of-the-box configuration.
To override this default behavior, the ProxySG administrator would have to deliberately use the dangerous force_cache(personal_pages) policy gesture (marked “for advanced users only” in the ProxySG CPL reference). This would open up the possibility of the exploit discussed above, so it should be used very cautiously and avoided when in doubt.
As in many other web applications, the authentication state for a PayPal session is stored in cookies that will be present even when retrieving paypal.com/home/account/nonexistent.css. However, a caching middlebox would have to disregard the presence of cookies in both HTTP requests and HTTP responses for this exploit to succeed.
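The cookie-based bypass and its override can be modeled with a short decision function. This is a hypothetical sketch of the general idea, not ProxySG's actual logic: is_cacheable and its ignore_cookies parameter are invented for illustration, and the override is assumed to relax only the cookie check.

```python
def _lower_keys(headers: dict) -> dict:
    """Normalize header names, since HTTP header names are case-insensitive."""
    return {name.lower(): value for name, value in headers.items()}


def is_cacheable(request_headers: dict, response_headers: dict,
                 status: int, ignore_cookies: bool = False) -> bool:
    """Conservative cacheability check for a single transaction.

    ignore_cookies is a hypothetical switch standing in for an
    administrator override; it defaults to the safe behavior.
    """
    req = _lower_keys(request_headers)
    resp = _lower_keys(response_headers)
    if status != 200:
        return False
    # Respect explicit cache controls from the origin.
    cache_control = resp.get("cache-control", "").lower()
    if "no-store" in cache_control or "private" in cache_control:
        return False
    # Treat any cookie in either direction as a sign of personal content.
    carries_cookie = "cookie" in req or "set-cookie" in resp
    if carries_cookie and not ignore_cookies:
        return False
    return True


session = {"Cookie": "PHPSESSID=abc123"}  # hypothetical session cookie
print(is_cacheable(session, {}, 200))                       # default: bypass
print(is_cacheable(session, {}, 200, ignore_cookies=True))  # override: cached
```

With the default setting, the deceptive request is never stored because it carries the session cookie; only the explicit override makes the transaction cacheable and reopens the attack.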
From the very beginning, caching controls were developed by the Internet community to standardize caching behavior across the various devices on the web. As such, following the RFCs and common recommendations on the way to and from the OCS inherently minimizes the infamous impedance mismatch. In addition, smart middleboxes can look for other signs of user-specific content, such as “Authorization”, “Cookie”, or “Vary” headers, to protect against serving private information when an origin server fails to set the standard cache controls correctly. ProxySG administrators should not need to do much when using ProxySG with the recommended or default settings. However, caching overrides such as force_cache() should be used with extreme caution.
ProxySG also provides additional controls to identify content that may vary per user or that should be served only after verifying server authentication. The ProxySG Content Policy Language (CPL) provides the cookie_sensitive() and ua_sensitive() properties to modify caching behavior by declaring that the requested object varies based on cookie values or user agent, respectively. It also provides the check_authorization() property to identify content subject to authentication when standard authentication headers are not used.