Cross-site scripting prevention gets a lot harder than it should because people hear “sanitize input” and stop there. That’s not enough. If you remember one thing from this tutorial, make it this: XSS is prevented at output time, based on the exact context where data is rendered.
Output encoding is the boring, reliable workhorse of XSS defense. It’s not flashy, but it’s the thing that stops untrusted data from turning into executable HTML, JavaScript, or CSS in the browser.
In this guide, I’ll walk through what output encoding actually is, why context matters so much, and how to do it correctly with practical examples.
What output encoding means
Output encoding means transforming untrusted data before placing it into a page so the browser treats it as data, not code.
For example, if a user submits this as their display name:
<script>alert(1)</script>
and you render it directly into HTML like this:
<div>Welcome, <script>alert(1)</script></div>
the browser sees a real <script> tag and runs it.
If instead you HTML-encode the dangerous characters, you get:
<div>Welcome, &lt;script&gt;alert(1)&lt;/script&gt;</div>
Now the browser shows the text literally instead of executing it.
That’s the whole idea. Simple in principle. Easy to mess up in practice.
The core rule: encode for the specific output context
This is where most XSS bugs come from. Developers use the wrong encoder for the place where data is inserted.
There is no single “escape everything” function that works everywhere.
Different contexts need different encoding:
- HTML element content
- HTML attribute values
- JavaScript strings
- CSS values
- URL parameters
If you use HTML encoding inside a JavaScript string, you can still get popped. If you use URL encoding where HTML attribute encoding is needed, same story.
You need to ask: Where exactly will this untrusted value end up in the final document?
Context 1: HTML element content
This is the most common and safest place to render untrusted text.
Example:
<p>Hello, USERNAME</p>
If USERNAME is untrusted, encode these characters for HTML:
- & → &amp;
- < → &lt;
- > → &gt;
- " → &quot;
- ' → &#x27;
Example in a server-rendered template:
<p>Hello, {{ username }}</p>
In a safe templating engine, {{ username }} is usually escaped automatically. That’s good. Keep it that way.
Unsafe manual rendering:
<p>Hello, ${username}</p>
Safe result after HTML encoding:
<p>Hello, &lt;img src=x onerror=alert(1)&gt;</p>
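If you ever have to do this by hand, here is a minimal sketch of an HTML encoder for element content. The escapeHtml name and the entity table are my own illustration, not a built-in API; in real code, prefer the encoder that ships with your framework or template engine.

```javascript
// Hypothetical helper: minimal HTML encoder for element content.
const HTML_ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#x27;',
};

function escapeHtml(value) {
  // Coerce to a string first so numbers and null don't throw.
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

const username = '<img src=x onerror=alert(1)>';
const html = `<p>Hello, ${escapeHtml(username)}</p>`;
console.log(html);
```

The single regex pass matters: replacing & in a separate, later step would re-encode the ampersands produced by the other replacements.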
Good practice
Use templating engines and frameworks that auto-escape HTML by default. Modern tools usually do the right thing unless you explicitly bypass them.
Examples of dangerous bypasses include things like:
dangerouslySetInnerHTML
v-html
innerHTML
raw template output
unescaped server-side interpolation
If you use those, output encoding often gets skipped entirely.
Context 2: HTML attribute values
Attributes are trickier than plain HTML text because breaking out of the attribute can create new attributes or event handlers.
Unsafe:
<input value="USERNAME">
If USERNAME is:
" autofocus onfocus="alert(1)
the output becomes:
<input value="" autofocus onfocus="alert(1)">
Now you’ve got XSS.
To prevent this, attribute-encode the value and always quote attributes.
Safe:
<input value="&quot; autofocus onfocus=&quot;alert(1)">
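One conservative way to attribute-encode is to leave ASCII letters and digits alone and turn every other character into a numeric character reference. The sketch below assumes that approach; escapeAttr is a hypothetical name, and it produces more verbose output than the &quot; form, but a browser decodes it to the same harmless text.

```javascript
// Hypothetical helper: aggressive encoder for quoted HTML attribute values.
function escapeAttr(value) {
  // Keep ASCII letters and digits; encode every other code point
  // as a hexadecimal numeric character reference.
  return Array.from(String(value), (ch) =>
    /[A-Za-z0-9]/.test(ch)
      ? ch
      : `&#x${ch.codePointAt(0).toString(16).toUpperCase()};`
  ).join('');
}

const username = '" autofocus onfocus="alert(1)';
const html = `<input value="${escapeAttr(username)}">`;
console.log(html);
```

Because quotes, spaces, and = are all encoded, the value cannot close the attribute or start a new one. This is still only a sketch: it does not make event handler attributes or href safe.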
Important rules for attributes
- Always wrap attribute values in quotes
- Use attribute encoding, not just generic string replacement
- Never place untrusted data into event handler attributes like:
onclick
onload
onerror
This is a bad pattern even if you think you’re escaping it:
<button onclick="doSomething('USER_INPUT')">Click</button>
Don’t do this. Put untrusted data in safe attributes like data-* and read them from JavaScript.
Better:
<button data-name="ENCODED_USER_INPUT">Click</button>
Then in JavaScript:
const name = button.dataset.name;
That’s much easier to reason about.
Context 3: JavaScript context
Putting untrusted data directly into inline JavaScript is one of the easiest ways to create XSS.
Unsafe:
<script>
const username = 'USER_INPUT';
</script>
If USER_INPUT contains:
'; alert(1); //
you get:
<script>
const username = ''; alert(1); //';
</script>
Game over.
The right fix
Don’t inject untrusted data into inline scripts if you can avoid it. This is my strong opinion: most inline JavaScript with dynamic user data should be treated as a design smell.
Prefer one of these patterns:
Option 1: Store data in HTML and read it
<div id="profile" data-username="ENCODED_ATTRIBUTE_VALUE"></div>
Then:
const username = document.getElementById('profile').dataset.username;
Option 2: Return JSON from the server
Instead of embedding data in HTML, fetch it as JSON:
fetch('/api/profile')
  .then(r => r.json())
  .then(data => {
    document.getElementById('name').textContent = data.username;
  });
This is much cleaner.
Option 3: If you absolutely must embed into JS, use JS string encoding
You need JavaScript string escaping, not HTML escaping.
For example, quotes, backslashes, newlines, and the closing </script> sequence all need careful treatment.
Safer conceptually:
<script>
const username = "\u003cscript\u003ealert(1)\u003c/script\u003e";
</script>
Even then, this area is easy to get wrong. Avoid it if possible.
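If you do end up here, one common approach is to serialize the value with JSON.stringify and then escape the few characters that can still cause trouble inside a script block. This is a sketch under that assumption; toInlineScriptLiteral is a hypothetical name, and the value must be JSON-serializable.

```javascript
// Hypothetical helper: serialize a value as a JS literal that is safe to
// place inside an inline <script> block.
function toInlineScriptLiteral(value) {
  return JSON.stringify(value)
    .replace(/</g, '\\u003c')  // prevents "</script>" and "<!--" sequences
    .replace(/>/g, '\\u003e')
    .replace(/&/g, '\\u0026')
    .replace(/\u2028/g, '\\u2028')  // line separators break string literals
    .replace(/\u2029/g, '\\u2029'); // in older JavaScript engines
}

const username = '</script><script>alert(1)</script>';
const literal = toInlineScriptLiteral(username);
const page = `<script>\nconst username = ${literal};\n</script>`;
console.log(page);
```

JSON.stringify handles quotes, backslashes, and newlines; the extra replacements keep the HTML parser from seeing a closing script tag in the middle of the string. The escaped literal still evaluates to the original value.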
Context 4: URL context
Untrusted data often ends up in links:
<a href="/search?q=USER_INPUT">Search</a>
This needs URL encoding for the parameter value:
/search?q=hello%20world
But there’s a catch: if the URL is then placed into an HTML attribute, you often need both URL encoding and attribute encoding.
Example
Build the URL safely first:
/search?q=%3Cscript%3Ealert(1)%3C%2Fscript%3E
Then insert it into the href attribute with attribute encoding if needed.
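Here is a rough sketch of the two steps together. It uses the built-in encodeURIComponent for the parameter value and a hypothetical escapeHtml helper for the attribute; the extra page=2 parameter is only there to show the second step doing something.

```javascript
// Hypothetical helper: HTML-encode a value for a quoted attribute.
function escapeHtml(value) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#x27;' };
  return String(value).replace(/[&<>"']/g, (ch) => entities[ch]);
}

const query = '<script>alert(1)</script>';

// Step 1: URL-encode the untrusted parameter value.
const url = `/search?q=${encodeURIComponent(query)}&page=2`;

// Step 2: HTML-encode the finished URL for the quoted href attribute.
const link = `<a href="${escapeHtml(url)}">Search</a>`;

console.log(url);
console.log(link);
```

The second step is not redundant: encodeURIComponent leaves characters such as ' untouched, and the & separating parameters should become &amp; inside an attribute.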
Dangerous URL schemes
Encoding alone won’t save you if you allow attacker-controlled schemes like:
javascript:alert(1)
data:text/html,<script>alert(1)</script>
If users can control all or part of a URL, validate allowed schemes explicitly. Usually only allow:
- http
- https
- maybe mailto
Anything else should be rejected.
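A scheme allowlist can be sketched with the standard URL parser, which normalizes case and strips leading whitespace before you inspect the protocol. The safeUrl name and the https://example.com/ base used to resolve relative links are assumptions for illustration.

```javascript
// Allowlisted schemes; note that URL.protocol includes the trailing colon.
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:', 'mailto:']);

// Hypothetical helper: returns the normalized URL if its scheme is
// allowed, otherwise null.
function safeUrl(input, base = 'https://example.com/') {
  let parsed;
  try {
    parsed = new URL(input, base); // relative URLs resolve against base
  } catch {
    return null; // unparseable input is rejected
  }
  return ALLOWED_PROTOCOLS.has(parsed.protocol) ? parsed.href : null;
}

console.log(safeUrl('/profile'));
console.log(safeUrl('javascript:alert(1)'));
```

Checking the parsed protocol is more robust than string tricks like input.startsWith('javascript:'), which miss mixed case and leading whitespace.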
Context 5: CSS context
Putting untrusted data into CSS is usually a bad idea. CSS can be abused in weird ways, and historically browsers have had some ugly parsing behaviors.
Unsafe:
<div style="width: USER_INPUT">
If users control CSS values, you can end up with style injection and, depending on the context and browser behavior, potentially worse.
My practical advice: don’t put raw user input into style attributes or style blocks. Use predefined classes instead.
Bad:
<div style="color: USER_COLOR">
Better:
<div class="theme-red">
Then map approved values on the server or client:
if (userColor === 'red') className = 'theme-red';
else className = 'theme-default';
That’s much safer than trying to escape CSS perfectly.
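With more than a couple of options, the same idea can be written as a lookup table. This is a small sketch; the theme-blue and theme-green classes and the themeClassFor name are made up for the example.

```javascript
// Hypothetical allowlist: maps user-facing color names to predefined classes.
const THEME_CLASSES = new Map([
  ['red', 'theme-red'],
  ['blue', 'theme-blue'],
  ['green', 'theme-green'],
]);

function themeClassFor(userColor) {
  // Anything not on the allowlist falls back to the default class.
  return THEME_CLASSES.get(userColor) ?? 'theme-default';
}

console.log(themeClassFor('red'));
console.log(themeClassFor('red; background: url(//evil.example)'));
```

The user's value is only ever used as a lookup key, so it never reaches the page at all.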
Safe sinks: use browser APIs that treat data as text
A huge part of XSS prevention is using safe DOM APIs.
Use these:
element.textContent = userInput;
element.setAttribute('title', userInput);
input.value = userInput;
Be careful with this one though:
element.setAttribute('href', userInput);
This avoids HTML parsing, but you still need to validate the URL scheme.
Avoid these unless content is trusted or sanitized for HTML:
element.innerHTML = userInput;
element.outerHTML = userInput;
document.write(userInput);
If your goal is to display text, textContent is almost always the right answer.
Example: unsafe vs safe DOM insertion
Unsafe:
results.innerHTML = `<li>${userInput}</li>`;
Safe:
const li = document.createElement('li');
li.textContent = userInput;
results.appendChild(li);
That one change eliminates a whole category of bugs.
A quick example in a few common stacks
Server-side template example
Unsafe:
<p>Comment: {{{ comment }}}</p>
Triple-brace or raw output syntax in template engines usually disables escaping.
Safe:
<p>Comment: {{ comment }}</p>
Default escaped output is what you want.
React example
Safe by default:
function Greeting({ name }) {
  return <p>Hello, {name}</p>;
}
React escapes inserted values in normal JSX.
Unsafe:
function Bio({ html }) {
  return <div dangerouslySetInnerHTML={{ __html: html }} />;
}
Only use that with trusted or properly sanitized HTML.
Plain JavaScript example
Unsafe:
document.getElementById('msg').innerHTML = userInput;
Safe:
document.getElementById('msg').textContent = userInput;
This is one of the simplest and highest-value fixes you can make.
Output encoding is not the same as sanitization
These terms get mixed up constantly.
- Encoding: makes data safe for a specific context
- Sanitization: strips out anything that is not on an allowlist, so only approved HTML remains
If you want to display plain text, encode it. If you want to allow limited HTML, sanitize it with a well-maintained HTML sanitizer.
For example, if users can submit rich text comments with <b> and <i>, HTML encoding would show the tags literally. That may not be what you want. In that case, use sanitization with an allowlist and then render the cleaned HTML carefully.
But if you just need to show a username, product name, comment text, or search query, output encoding is the correct default.
Common mistakes I see all the time
1. Encoding on input instead of output
People store encoded data in the database and think they solved XSS.
They didn’t. The same data may later be used in a different context where the old encoding is wrong, double-encoded, or bypassed.
Store raw data. Encode when rendering.
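To see why, here is a small sketch of what happens when a value is encoded before storage and then encoded again by the template. The escapeHtml helper is hypothetical and stands in for whatever your template engine does.

```javascript
// Hypothetical helper standing in for a template engine's HTML escaping.
function escapeHtml(value) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#x27;' };
  return String(value).replace(/[&<>"']/g, (ch) => entities[ch]);
}

const raw = 'Tom & Jerry <3';

// Mistake: encode before storing, then the template encodes again.
const storedEncoded = escapeHtml(raw);
const doubleEncoded = escapeHtml(storedEncoded);

// Preferred: store raw, then encode once for each output context.
const forHtml = escapeHtml(raw);
const forUrl = encodeURIComponent(raw);

console.log(doubleEncoded);
console.log(forHtml);
console.log(forUrl);
```

With the double-encoded value, the user sees literal entity text like &amp; on the page. With the raw value, each context gets exactly the encoding it needs.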
2. Using one encoder everywhere
There is no universal XSS escape function. Context matters.
3. Forgetting about attributes and JavaScript
Developers often protect HTML body text but ignore values placed in href, title, data-*, or script blocks.
4. Using innerHTML for convenience
This is probably the most common client-side footgun.
5. Assuming framework defaults protect every case
Frameworks help a lot, but they all have escape hatches. Those escape hatches are where bugs live.
Defense in depth still matters
Output encoding is foundational, but pair it with other controls:
- Content Security Policy
- HttpOnly cookies
- input validation
- safe templating defaults
- dependency hygiene
- avoiding inline script where possible
And test your site. Scan your site for XSS vulnerabilities and other security issues at headertest.com - free, instant, no signup required.
A practical checklist
When rendering any untrusted value, ask:
- Is this data trusted? If not, treat it as hostile.
- Where will it be inserted?
- HTML text
- attribute
- JavaScript
- CSS
- URL
- Am I using the correct encoder for that exact context?
- Can I avoid this risky context entirely?
- Am I using a safe sink like textContent instead of innerHTML?
- If this is a URL, am I validating the allowed scheme?
- If this is HTML, should I sanitize instead of encode?
Final takeaway
The safest mental model is this: untrusted data should enter the page only through APIs and template features that treat it as text, not markup or code.
If you stick to escaped template output, quoted attributes, and safe DOM APIs like textContent, and you avoid inline JavaScript and raw HTML insertion, you’ll prevent most XSS issues before they start.
Output encoding isn’t glamorous, but it works. And in web security, boring and reliable wins every time.