Cross-site scripting in PHP usually happens because someone escaped the wrong thing, in the wrong place, at the wrong time.

I’ve seen teams add htmlspecialchars() everywhere and still ship XSS. Not because the function is bad, but because XSS prevention is context-sensitive. A value that is safe in HTML text is not automatically safe in an attribute, a JavaScript string, or a URL.

If you remember one rule, make it this one:

Validate on input. Escape on output. Escape for the specific output context.

Here are the mistakes I see most often in PHP apps, and the fixes that actually hold up.

Mistake #1: Escaping input instead of output

A classic bad pattern is escaping as soon as data enters the system.

$name = htmlspecialchars($_POST['name'], ENT_QUOTES, 'UTF-8');
// save $name to database

This feels safe, but it creates a mess:

  • data is permanently transformed
  • the same value may later be used in a different context
  • double-encoding bugs show up
  • developers stop thinking about output context
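
To make the double-encoding point concrete, here is a minimal sketch. The sample string is made up, and $stored simply stands in for whatever your database column would hold.

```php
<?php
// Hypothetical user input; $stored stands in for a database column.
$input = 'Tom & "Jerry" <3';

// The mistake: escape on the way in, then store the result.
$stored = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// Later, a template that (correctly) escapes on output escapes it again.
$rendered = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');

echo $stored, PHP_EOL;
echo $rendered, PHP_EOL;
```

The second string is what the browser would receive, so the page ends up showing entity text such as &amp; and &lt; instead of the characters the user typed.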

Fix

Store raw data, after validation. Escape only when rendering.

$name = trim($_POST['name'] ?? '');

if ($name === '') {
    throw new RuntimeException('Name is required');
}

// save raw $name

Then, when outputting into HTML:

echo htmlspecialchars($name, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

That is the form I prefer in PHP. ENT_QUOTES escapes both single and double quotes, and ENT_SUBSTITUTE replaces invalid byte sequences with U+FFFD instead of making htmlspecialchars() return an empty string.

Mistake #2: Using the same escaping everywhere

This is the root of most XSS bugs. Developers learn one escaping function and try to use it in every context.

That does not work.

HTML text context

Safe:

<p><?= htmlspecialchars($comment, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') ?></p>

HTML attribute context

Also safe, if quoted:

<input type="text" value="<?= htmlspecialchars($name, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') ?>">

But this becomes dangerous when people build unquoted attributes:

<!-- bad -->
<input value=<?= htmlspecialchars($name, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') ?>>

Always quote attributes.
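
Here is a small sketch of why the unquoted form fails. The payload is a hypothetical example, chosen because it contains nothing htmlspecialchars() would change.

```php
<?php
// Hypothetical payload: no quotes, no angle brackets, so escaping changes nothing.
$name = 'x onfocus=alert(1) autofocus';
$escaped = htmlspecialchars($name, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// Unquoted: the spaces end the value, and the rest parses as new attributes.
echo '<input value=' . $escaped . '>', PHP_EOL;

// Quoted: the whole string stays inside the value attribute.
echo '<input value="' . $escaped . '">', PHP_EOL;
```

The escaping is identical in both lines. Only the quotes decide whether onfocus becomes a real attribute.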

JavaScript context

This is where people get burned. htmlspecialchars() is not a JavaScript escaping function.

Bad:

<script>
    const username = '<?= htmlspecialchars($name, ENT_QUOTES, 'UTF-8') ?>';
</script>

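htmlspecialchars() does not escape backslashes or line breaks, so a value that ends in a backslash or contains a newline breaks the string literal. Entities are also not decoded inside <script>, so a name like O'Brien reaches JavaScript as O&#039;Brien.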

Fix

Use json_encode() when embedding server data into JavaScript.

<script>
    const username = <?= json_encode($name, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT) ?>;
</script>

That gives you a valid JavaScript string literal, not a half-escaped guess.
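
As a rough check of what those flags buy you, the sketch below encodes a deliberately hostile value. The value itself is made up for illustration. The failure check is my addition, not part of the snippet above.

```php
<?php
// Hypothetical hostile value: a quote, a closing script tag, and a trailing backslash.
$name = "O'Reilly </script><script>alert(1)</script> \\";

$flags = JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT;
$js = json_encode($name, $flags);

// json_encode() returns false on failure (for example invalid UTF-8),
// so check before echoing into a script block.
if ($js === false) {
    throw new RuntimeException(json_last_error_msg());
}

// No raw <, >, & or ' survives, so the value cannot close the script element.
echo "const username = {$js};", PHP_EOL;
```

Because the angle brackets are emitted as \u escapes, the HTML parser never sees a literal </script> inside the data.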

Mistake #3: Trusting rich HTML from users

If users can submit HTML, plain escaping is not enough because you’re explicitly allowing markup.

A lot of apps do this:

echo $postBody;

Or this:

echo strip_tags($postBody, '<p><a><strong><em>');

strip_tags() is not an HTML sanitizer. It removes tags, but it does not safely reason about attributes, protocols, malformed markup, parser edge cases, or browser behavior.
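
A quick sketch shows the gap. The post body is a hypothetical payload that uses only a tag from the allowlist above, plus a script tag.

```php
<?php
// Hypothetical post body: an allowed <a> tag carrying hostile attributes.
$postBody = '<a href="javascript:alert(1)" onmouseover="alert(2)">click me</a>'
    . '<script>alert(3)</script>';

$out = strip_tags($postBody, '<p><a><strong><em>');

// The <script> tags are gone, but the <a> tag keeps every attribute it came with.
echo $out, PHP_EOL;
```

The allowlist applies to tag names only, so the event handler and the javascript: link both pass through untouched.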

Fix

If you do not need HTML, escape everything.

echo nl2br(htmlspecialchars($postBody, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8'));

If you do need user HTML, sanitize it with a real HTML sanitizer and a strict allowlist. In PHP, that usually means bringing in a dedicated library rather than inventing one in a helper function.

And be strict about attributes. Allowing <a> but forgetting to restrict href protocols is how javascript: payloads survive.

Mistake #4: Building URLs unsafely

Developers often escape URLs as HTML and assume they’re done.

Bad:

<a href="<?= htmlspecialchars($url, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') ?>">Profile</a>

This protects the HTML attribute, but not the meaning of the URL. If $url is javascript:alert(1), the browser still treats it as executable script.

Fix

Validate the URL first, then escape for the attribute.

$url = $_POST['website'] ?? '';

if (!filter_var($url, FILTER_VALIDATE_URL)) {
    $url = '';
}

$scheme = parse_url($url, PHP_URL_SCHEME);
$allowedSchemes = ['http', 'https'];

if (!in_array(strtolower((string)$scheme), $allowedSchemes, true)) {
    $url = '';
}

Then render:

<?php if ($url !== ''): ?>
    <a href="<?= htmlspecialchars($url, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') ?>">Website</a>
<?php endif; ?>
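
If you do this in more than one place, the two checks fit in a small function. This is only a sketch: safe_http_url() is a hypothetical name, and the test URLs are made up.

```php
<?php
/**
 * Hypothetical helper: returns the URL if it is a valid http(s) URL, else ''.
 */
function safe_http_url(string $url): string
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return '';
    }

    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));

    return in_array($scheme, ['http', 'https'], true) ? $url : '';
}

$candidates = [
    'https://example.com/profile',
    'javascript:alert(1)',
    'javascript://example.com/%0aalert(1)',
    'not a url',
];

foreach ($candidates as $candidate) {
    var_dump(safe_http_url($candidate));
}
```

The return value still needs htmlspecialchars() when it goes into the href attribute. Validation and escaping solve different problems.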

Mistake #5: Rendering JSON into the page with string concatenation

A lot of PHP apps pass data to frontend code like this:

<script>
    window.APP = {
        name: '<?= $name ?>',
        email: '<?= $email ?>'
    };
</script>

This is fragile and one payload away from XSS.

Fix

Encode the whole object once.

<?php
$bootstrap = [
    'name' => $name,
    'email' => $email,
    'isAdmin' => $isAdmin,
];
?>

<script>
    window.APP = <?= json_encode($bootstrap, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT) ?>;
</script>

Even better, put JSON in a non-executable script block and read it from JavaScript:

<script type="application/json" id="app-data">
<?= json_encode($bootstrap, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT) ?>
</script>

Then in JS:

const data = JSON.parse(document.getElementById('app-data').textContent);

I like this pattern because it separates data from code.

Mistake #6: Assuming templates make everything safe

Template engines help, but they don’t magically solve XSS.

Twig, Blade, and other engines usually auto-escape for the HTML context. That’s good. But developers bypass it all the time, for example with Blade’s raw echo (Twig’s equivalent is the raw filter):

{!! $content !!}

Or they mark values as safe too early and forget where they end up later.

Fix

Keep auto-escaping enabled. Treat “raw” output as a security-sensitive exception, not a convenience feature.

If a value must be rendered raw, document why, where it came from, and what sanitizer touched it.

My rule is simple: if I see raw output in a review, I assume it’s a bug until proven otherwise.

Mistake #7: Using CSP as a bandage for bad output encoding

Content Security Policy helps a lot, but it does not replace correct escaping. If your app freely injects attacker-controlled HTML and JavaScript into the page, CSP may reduce impact, or it may get bypassed because the policy is weak.

Common weak policy:

Content-Security-Policy: script-src 'self' 'unsafe-inline';

'unsafe-inline' lets any injected inline script or event-handler attribute run, which largely defeats the point.

Fix

Use CSP as a second layer, not the first. Start with proper output encoding, then add a strict policy with nonces or hashes.

A better direction looks like this:

$nonce = base64_encode(random_bytes(16));
header("Content-Security-Policy: default-src 'self'; script-src 'self' 'nonce-$nonce'; object-src 'none'; base-uri 'self';");

Then apply the nonce:

<script nonce="<?= htmlspecialchars($nonce, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') ?>">
    console.log('trusted inline script');
</script>

For implementation details and rollout strategy, https://csp-guide.com is useful.

Mistake #8: Forgetting DOM-based XSS on the frontend

PHP developers sometimes secure server-rendered output and then lose everything in JavaScript.

Example:

<div id="message"><?= htmlspecialchars($message, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') ?></div>

Later, frontend code does this:

document.getElementById('preview').innerHTML =
    document.getElementById('message').textContent;

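That example is not safe, even though it reads from textContent. textContent returns the decoded text, so the &lt; and &gt; entities PHP emitted come back as literal < and > characters, and innerHTML then parses them as markup. The same bug in its more common form looks like this: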

document.getElementById('preview').innerHTML = userControlledValue;

Fix

Prefer safe DOM APIs:

document.getElementById('preview').textContent = userControlledValue;

Or build elements explicitly:

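const a = document.createElement('a');
a.textContent = profile.name;
// Only accept http(s) links; a javascript: URL in href would still run on click.
const url = new URL(profile.url, location.href);
if (url.protocol === 'http:' || url.protocol === 'https:') {
    a.href = url.href;
}
container.appendChild(a);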

Server-side PHP escaping won’t save you from unsafe browser-side DOM manipulation.

Mistake #9: Rolling your own escaping helper badly

I’ve seen helpers like this in old PHP codebases:

function e($value) {
    return htmlspecialchars($value);
}

Looks fine until you notice:

  • no encoding specified
  • no flags specified, so before PHP 8.1 single quotes are not escaped
  • arrays blow up (a TypeError on PHP 8)
  • developers use it in JavaScript and URL contexts

Fix

If you create a helper, make it explicit and narrow.

function e_html(?string $value): string
{
    return htmlspecialchars($value ?? '', ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

Then use names that force intent:

  • e_html()
  • e_attr()
  • e_js_json() maybe, if you wrap json_encode()

I’d rather have slightly noisy code than a fake universal escape function that trains people into dangerous habits.
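
Here is one way that set of helpers could look. It is a sketch, not a standard API: e_attr() and e_js_json() are the hypothetical names from the list above, e_html() is repeated so the block runs on its own, and JSON_THROW_ON_ERROR assumes PHP 7.3 or later.

```php
<?php
/** Escape for HTML text nodes. */
function e_html(?string $value): string
{
    return htmlspecialchars($value ?? '', ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

/** Escape for a quoted HTML attribute value. Same escaping, different intent. */
function e_attr(?string $value): string
{
    return htmlspecialchars($value ?? '', ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

/**
 * Encode a JSON-serializable value for use inside a <script> block.
 * Throws JsonException instead of silently returning false.
 */
function e_js_json($value): string
{
    return json_encode(
        $value,
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR
    );
}

echo e_html('<b>Tom & Jerry</b>'), PHP_EOL;
echo e_js_json(['name' => "O'Reilly"]), PHP_EOL;
```

e_attr() deliberately duplicates e_html(). The separate name is there so a reviewer can see which context the author had in mind, and so the two can diverge later without touching every template.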

A practical baseline for PHP apps

If I’m hardening a plain PHP app, my default checklist is:

  1. Store raw input after validation
  2. Escape on output
  3. Use htmlspecialchars(..., ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') for HTML
  4. Use json_encode() for JavaScript data
  5. Validate URL schemes before rendering links
  6. Never trust strip_tags() as a sanitizer
  7. Keep template auto-escaping on
  8. Use CSP as defense in depth
  9. Avoid innerHTML unless content is fully trusted and sanitized

That’s the boring answer, but boring is what you want in XSS prevention. The flashy custom solution is usually the thing that gets you popped.