XSS in Jekyll Templates: Where Static Sites Go Wrong

Jekyll is a static site generator, which makes people assume it’s automatically safe from XSS. That’s a bad assumption.

Static output can still ship dangerous HTML and JavaScript to every visitor. If untrusted content gets into your templates, markdown, front matter, data files, or generated JSON, you can absolutely create stored XSS in a Jekyll site. The fact that the site is “just files” doesn’t help once the browser starts parsing them.

I’ve seen this happen in docs sites, blogs with community-submitted content, marketing sites pulling from YAML/JSON data, and GitHub Pages setups where people trust front matter way too much.

Why Jekyll sites get XSS

Jekyll itself doesn’t execute browser-side code. The browser does. Jekyll’s job is to assemble strings into HTML. If those strings contain attacker-controlled markup and you render them in the wrong place, the browser will happily execute it.

Common sources of untrusted data in Jekyll:

Front matter fields
_data/*.yml, .json, .csv
Markdown content from contributors
CMS-backed content feeding Jekyll builds
Query-string values used by client-side JavaScript
JSON blobs embedded into pages
HTML passed through Liquid includes

The main problem is context. Escaping for text inside HTML is not the same as escaping for an HTML attribute, JavaScript string, URL, or raw HTML block.
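To make the context point concrete, here is a minimal Python sketch (not Jekyll code) that encodes one payload for four different sinks. The function name `encode_for_contexts` is made up for illustration; the stdlib calls stand in for what `escape`, `jsonify`, and `url_encode` do on the Liquid side, and their exact output differs slightly from Liquid’s.

```python
import html
import json
from urllib.parse import quote


def encode_for_contexts(value):
    """Return the same string encoded for four different output contexts."""
    return {
        # Text between tags: &, < and > must be neutralized.
        "html_text": html.escape(value, quote=False),
        # Quoted attribute value: quotes must be neutralized as well.
        "attribute": html.escape(value, quote=True),
        # JavaScript string inside a script block: JSON-encode, then hide "<"
        # so the value cannot contain a closing script tag.
        "js_string": json.dumps(value).replace("<", "\\u003c"),
        # Single URL path or query component: percent-encode everything unsafe.
        "url_component": quote(value, safe=""),
    }


encoded = encode_for_contexts('"><script>alert(1)</script>')
for context, text in encoded.items():
    print(f"{context:14} {text}")
```

Four contexts, four different encodings. None of them is a safe substitute for another.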

The classic unsafe template

Here’s a simple layout that looks harmless:

<h1>{{ page.title }}</h1>
<p>{{ page.description }}</p>

If page.title is attacker-controlled, front matter like this is all it takes:

---
title: '<img src=x onerror=alert(1)>'
description: 'Hello'
---

Rendered HTML:

<h1><img src=x onerror=alert(1)></h1>
<p>Hello</p>

That’s game over.

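Liquid in Jekyll does not HTML-escape output by default, so the value lands in the page exactly as written unless something upstream cleaned it. Whether that is exploitable depends on who can influence the value, but the safe habit is simple: always escape untrusted content for the output context.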

Safer version:

<h1>{{ page.title | escape }}</h1>
<p>{{ page.description | escape }}</p>

This turns < into &lt;, quotes into entities, and prevents the browser from interpreting the string as markup.

`escape` is necessary, but not enough

A lot of Jekyll advice stops at | escape. That’s fine for plain HTML text nodes, but it does not solve every context.

Safe in HTML body text

<div class="author-bio">{{ author.bio | escape }}</div>

Good.

Risky in attributes

<a href="{{ page.url }}">Read more</a>

If page.url is untrusted, escaping alone won’t stop javascript: URLs:

url: 'javascript:alert(1)'

Rendered:

<a href="javascript:alert(1)">Read more</a>

That’s still dangerous.

For URLs, validate allowed schemes and preferably restrict to relative URLs or trusted hosts.

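A practical pattern that accepts only root-relative and http(s) URLs:

{% comment %} Liquid has no starts-with operator, so compare prefixes taken with slice. {% endcomment %}
{% assign safe_url = page.url | append: '' %}
{% assign first_char = safe_url | slice: 0 %}
{% assign first_seven = safe_url | slice: 0, 7 %}
{% assign first_eight = safe_url | slice: 0, 8 %}
{% assign allowed = false %}
{% if first_char == '/' or first_seven == 'http://' or first_eight == 'https://' %}
  {% assign allowed = true %}
{% endif %}

{% if allowed %}
  <a href="{{ safe_url | escape }}">Read more</a>
{% else %}
  <a href="/">Read more</a>
{% endif %}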

Liquid isn’t great for serious validation logic, so my opinion: if you can, normalize URLs before they ever reach the template.
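As a sketch of what “normalize before the template” could look like, here is a small Python helper you might run in a pre-build step over front matter or data values. The name `normalize_url`, the prefix allowlist, and the `/` fallback are my assumptions, not anything Jekyll provides.

```python
ALLOWED_PREFIXES = ("/", "#", "http://", "https://")


def normalize_url(value, fallback="/"):
    """Return a cleaned URL if it matches the allowlist, else the fallback."""
    text = "" if value is None else str(value)
    # Browsers ignore leading whitespace and control characters, and drop
    # tabs and newlines inside a URL, so remove them before comparing.
    cleaned = "".join(ch for ch in text if ch > " ")
    if cleaned.lower().startswith(ALLOWED_PREFIXES):
        return cleaned
    return fallback


for candidate in ["/docs/intro/", "https://example.com/a",
                  "javascript:alert(1)", " JaVaScRiPt:alert(1)"]:
    print(repr(candidate), "->", normalize_url(candidate))
```

An allowlist of prefixes is deliberately strict: anything that isn’t obviously a site path, a fragment, or an http(s) link falls back to the home page. The template should still apply `escape` when it writes the value into the attribute.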

The `markdownify` trap

People often render contributor content like this:

<div class="content">
  {{ page.user_bio | markdownify }}
</div>

This is risky because Markdown processors often allow raw HTML. If the input contains:

Hi there <script>alert(1)</script>

or

<img src=x onerror=alert(1)>

you may emit executable HTML directly into the page.

If the content is not fully trusted, don’t allow raw HTML through Markdown. Sanitize it before build time or use a markdown pipeline configured to strip dangerous HTML.
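One blunt way to enforce “no raw HTML in contributor Markdown” is a pre-build check that rejects any field containing something that looks like the start of a tag. The sketch below is deliberately conservative and the names (`contains_raw_html`, `fields_with_raw_html`) are hypothetical. It will also flag HTML inside code samples and autolinks written in angle brackets, and it does nothing about Markdown links that point at javascript: URLs.

```python
import re

# "<" followed by a letter, "/", "!" or "?" is how a tag, comment,
# or declaration begins. A bare "1 < 2" does not match.
TAG_START = re.compile(r"<[A-Za-z/!?]")


def contains_raw_html(markdown_text):
    """True if the text contains anything that could open an HTML tag."""
    return TAG_START.search(markdown_text) is not None


def fields_with_raw_html(fields):
    """Return the names of the fields that should be rejected or reviewed."""
    return [name for name, text in fields.items() if contains_raw_html(text)]


submission = {
    "user_bio": "Hi there <script>alert(1)</script>",
    "tagline": "I like **Markdown** and 1 < 2",
}
print(fields_with_raw_html(submission))
```

Failing the build on a hit is crude, but for short fields like a bio it is usually an acceptable trade.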

A common mistake is chaining filters like this:

{{ page.user_bio | escape | markdownify }}

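That blocks raw tags, but it is not a fix. Markdown syntax itself can still produce risky output, such as a link whose target is a javascript: URL, and escaping first mangles legitimate Markdown: a > blockquote marker becomes &gt; and stops working. The right fix is usually sanitization before rendering, not hoping filter order saves you.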

`strip_html` is not a sanitizer

I still see this:

{{ page.comment | strip_html }}

People assume this makes content safe. It doesn’t.

strip_html is a text cleanup tool, not a security boundary. It can remove tags, but it’s not designed to safely handle malformed HTML, encoded payloads, or context-sensitive output. And if you later place that content into JavaScript or attributes, you’ve just moved the problem.

Use escape for text output. Use sanitization when you intentionally allow limited HTML.

JSON inside `<script>` tags

This is one of the easiest ways to introduce XSS in Jekyll.

Unsafe:

<script>
  window.siteData = {
    title: "{{ page.title }}",
    author: "{{ page.author }}"
  };
</script>

If page.title contains a quote or </script>, the script block can break out:

title: '"; alert(1); //'

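Safer approach: serialize as JSON, not hand-built JavaScript.

<script>
  window.siteData = {{ page.data_object | jsonify | replace: '<', '\u003c' }};
</script>

Or for individual fields:

<script>
  window.siteData = {
    title: {{ page.title | jsonify | replace: '<', '\u003c' }},
    author: {{ page.author | jsonify | replace: '<', '\u003c' }}
  };
</script>

jsonify is the right starting point because JavaScript string escaping is not the same thing as HTML escaping. It takes care of quotes and backslashes, but Ruby’s JSON output leaves < untouched by default, so a value containing </script> can still close the block early. The extra replace swaps every < for the JSON escape \u003c, which JavaScript reads back as the same character.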
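If you generate script data outside Liquid, the same idea applies. This Python sketch uses a hypothetical helper, `script_safe_json`, to show a JSON serializer handling the quote payload from above, plus one extra step that keeps a closing script tag inside a value from ending the block. Python’s `json.dumps` does not escape < on its own, which is why the replacements are there.

```python
import json


def script_safe_json(value):
    """Serialize value as JSON that is safe to embed in an inline script block."""
    text = json.dumps(value)
    # In valid JSON, <, > and & can only occur inside strings, so replacing
    # them with \uXXXX escapes keeps the decoded data identical.
    return (text.replace("<", "\\u003c")
                .replace(">", "\\u003e")
                .replace("&", "\\u0026"))


quote_payload = script_safe_json('"; alert(1); //')
tag_payload = script_safe_json({"title": "</script><script>alert(1)</script>"})
print(quote_payload)
print(tag_payload)
```

The quote arrives backslash-escaped, so it stays inside the string literal, and the tag payload no longer contains a literal <.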

Data files are a huge footgun

Jekyll _data files feel “internal,” so people trust them too much.

Example _data/team.yml:

- name: Alice
  role: Engineer
- name: '<img src=x onerror=alert(1)>'
  role: Security

Template:

<ul>
  {% for member in site.data.team %}
    <li>
      <strong>{{ member.name }}</strong>
      <span>{{ member.role }}</span>
    </li>
  {% endfor %}
</ul>

If that YAML comes from a CMS export, partner feed, or non-technical editor, you’ve got stored XSS.

Safer:

<ul>
  {% for member in site.data.team %}
    <li>
      <strong>{{ member.name | escape }}</strong>
      <span>{{ member.role | escape }}</span>
    </li>
  {% endfor %}
</ul>

My rule is simple: treat all content as untrusted unless you personally own the input path and format.

Includes can hide dangerous rendering

A lot of Jekyll sites centralize HTML in includes:

{% include card.html title=page.title summary=page.summary %}

Then card.html does this:

<div class="card">
  <h2>{{ include.title }}</h2>
  <p>{{ include.summary }}</p>
</div>

This is where XSS bugs become hard to track. The page looks clean, but the include is rendering raw values.

Safer include:

<div class="card">
  <h2>{{ include.title | escape }}</h2>
  <p>{{ include.summary | escape }}</p>
</div>

Better yet, decide in your team whether includes expect raw data or pre-sanitized data. Mixing both is how bugs survive code review.
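Because unescaped output hides so easily in includes, a rough audit script can help during review. This is a naive Python sketch, not a Liquid parser: it finds `{{ ... }}` output tags and reports the ones with no encoding filter. The filter list and the name `unencoded_outputs` are my assumptions, and a literal | inside a quoted argument would confuse it.

```python
import re

OUTPUT_TAG = re.compile(r"\{\{-?\s*(.*?)\s*-?\}\}", re.DOTALL)
ENCODING_FILTERS = {
    "escape", "escape_once", "xml_escape",
    "jsonify", "url_encode", "cgi_escape", "uri_escape",
}


def filters_used(expression):
    """Return the filter names in a Liquid output expression (naive split on |)."""
    parts = expression.split("|")[1:]
    return {part.split(":", 1)[0].strip() for part in parts}


def unencoded_outputs(template_text):
    """Return (line_number, expression) for outputs without an encoding filter."""
    findings = []
    for match in OUTPUT_TAG.finditer(template_text):
        expression = match.group(1)
        if not filters_used(expression) & ENCODING_FILTERS:
            line = template_text.count("\n", 0, match.start()) + 1
            findings.append((line, expression))
    return findings


card = (
    '<div class="card">\n'
    "  <h2>{{ include.title }}</h2>\n"
    "  <p>{{ include.summary | escape }}</p>\n"
    "</div>\n"
)
print(unencoded_outputs(card))
```

Every hit is a prompt for a human decision, not automatically a bug: `{{ content }}` will always show up, and that is the point.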

`raw` HTML fields need sanitization, not hope

Sometimes you actually want to allow HTML, like a marketing callout or CMS block:

promo_html: '<a href="/sale">Big sale</a>'

Then:

<div class="promo">{{ page.promo_html }}</div>

That’s only safe if the HTML has already been sanitized with an allowlist. If you allow arbitrary HTML from editors or external systems, you need a sanitizer in the content pipeline before Jekyll sees it.

Good allowlist examples:

Safe tags like p, a, strong, em, ul, li
Safe attributes like href, title
Restricted URL schemes for links
No event handlers
No style unless you really know what you’re doing
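To show what an allowlist means in practice, here is a minimal sketch of a sanitizer built on Python’s standard `html.parser`. The tag and attribute sets mirror the list above; the class and function names are hypothetical. Python’s parser is not a browser-grade HTML parser, so treat this as an illustration of the approach. For real content I’d reach for a maintained sanitizer library.

```python
from html import escape
from html.parser import HTMLParser

ALLOWED_TAGS = {"p", "a", "strong", "em", "ul", "li"}
ALLOWED_ATTRS = {"a": {"href", "title"}}
DROP_CONTENT = {"script", "style"}  # discard these tags and their contents
SAFE_URL_PREFIXES = ("/", "#", "http://", "https://")


def has_safe_scheme(url):
    # Drop whitespace and control characters, which browsers partly ignore.
    cleaned = "".join(ch for ch in url if ch > " ").lower()
    return cleaned.startswith(SAFE_URL_PREFIXES)


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tags are dropped, their text is kept
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue  # drops event handlers, style, and anything unlisted
            if name == "href" and not has_safe_scheme(value):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(sanitize('<a href="/sale" onclick="steal()">Big sale</a>'))
```

The important property is the direction of the default: anything not explicitly allowed is dropped, instead of trying to enumerate everything dangerous.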

Add CSP as a backup, not a primary fix

Content Security Policy won’t fix bad templates, but it can reduce blast radius. A solid CSP can block inline script execution, limit where scripts load from, and make many XSS payloads fail.

If you haven’t done this yet, read CSP implementation details.

For a Jekyll site, I’d start with something like:

Content-Security-Policy: default-src 'self'; script-src 'self'; object-src 'none'; base-uri 'self'; frame-ancestors 'none'

Then tighten it based on your actual asset needs.
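When tightening a policy, it helps to sanity-check the string before it ships. This is a small Python sketch with hypothetical helper names that splits a policy value into directives and flags the script-related weaknesses I’d look for first. It ignores nonces, hashes, and `strict-dynamic`, so treat it as a first pass only.

```python
def parse_csp(header_value):
    """Split a CSP header value into {directive: [sources]}."""
    policy = {}
    for directive in header_value.split(";"):
        tokens = directive.split()
        if tokens:
            # If a directive is repeated, browsers use the first occurrence.
            policy.setdefault(tokens[0].lower(), tokens[1:])
    return policy


def script_policy_warnings(policy):
    """Return human-readable warnings about how scripts are restricted."""
    sources = policy.get("script-src", policy.get("default-src"))
    if sources is None:
        return ["no script-src or default-src: scripts are unrestricted"]
    warnings = []
    for risky in ("'unsafe-inline'", "'unsafe-eval'", "*"):
        if risky in sources:
            warnings.append(f"script sources include {risky}")
    return warnings


starter = ("default-src 'self'; script-src 'self'; object-src 'none'; "
           "base-uri 'self'; frame-ancestors 'none'")
print(script_policy_warnings(parse_csp(starter)))
```

The starter policy above produces no warnings. Adding `'unsafe-inline'` to `script-src` to make an inline snippet work is the change this is meant to catch.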

You can also check your deployed headers with HeaderTest, which is useful when you’re debugging whether CSP and related headers are really being served by your CDN or hosting layer.

Practical safe patterns for Jekyll

Here’s the shortlist I’d enforce in code review:

1. Escape all plain text output

{{ value | escape }}

2. Use `jsonify` for JavaScript data

{{ value | jsonify }}

3. Validate URLs before rendering them into `href` or `src`

Prefer relative URLs or strict allowlists.

4. Never trust Markdown or HTML from contributors by default

Sanitize before build time.

5. Audit includes and partials

That’s where unescaped output often hides.

6. Don’t build HTML or JS strings manually

Use the right serializer for the context.

A before-and-after example

Unsafe component:

<article class="post">
  <h1>{{ page.title }}</h1>
  <p class="byline">
    By <a href="{{ page.author_url }}">{{ page.author }}</a>
  </p>

  <script>
    window.postMeta = {
      title: "{{ page.title }}",
      category: "{{ page.category }}"
    };
  </script>

  <div class="body">
    {{ content }}
  </div>
</article>

Safer version:

<article class="post">
  <h1>{{ page.title | escape }}</h1>
  <p class="byline">
    By
    {% assign author_url = page.author_url | append: '' %}
    {% assign first_char = author_url | slice: 0 %}
    {% assign first_seven = author_url | slice: 0, 7 %}
    {% assign first_eight = author_url | slice: 0, 8 %}
    {% if first_char == '/' or first_seven == 'http://' or first_eight == 'https://' %}
      <a href="{{ author_url | escape }}">{{ page.author | escape }}</a>
    {% else %}
      <a href="/">{{ page.author | escape }}</a>
    {% endif %}
  </p>

  <script>
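    // jsonify leaves "<" as-is; \u003c keeps a closing script tag in a value from ending this block.
    window.postMeta = {
      title: {{ page.title | jsonify | replace: '<', '\u003c' }},
      category: {{ page.category | jsonify | replace: '<', '\u003c' }}
    };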
  </script>

  <div class="body">
    {{ content }}
  </div>
</article>

That last {{ content }} is still a trust boundary. If your post body can contain unsafe HTML, this template is still vulnerable. That’s the part many teams miss.

The real fix

Jekyll XSS isn’t really a “static site” problem. It’s an output encoding problem plus a content trust problem.

If you remember only one thing, make it this: escape by context, sanitize when allowing HTML, and stop treating build-time content as inherently safe.

That mindset catches most Jekyll XSS bugs before they hit production.
