Jekyll is a static site generator, which makes people assume it’s automatically safe from XSS. That’s a bad assumption.
Static output can still ship dangerous HTML and JavaScript to every visitor. If untrusted content gets into your templates, markdown, front matter, data files, or generated JSON, you can absolutely create stored XSS in a Jekyll site. The fact that the site is “just files” doesn’t help once the browser starts parsing them.
I’ve seen this happen in docs sites, blogs with community-submitted content, marketing sites pulling from YAML/JSON data, and GitHub Pages setups where people trust front matter way too much.
Why Jekyll sites get XSS
Jekyll itself doesn’t execute browser-side code. The browser does. Jekyll’s job is to assemble strings into HTML. If those strings contain attacker-controlled markup and you render them in the wrong place, the browser will happily execute it.
Common sources of untrusted data in Jekyll:
- Front matter fields
- _data/*.yml, .json, .csv
- Markdown content from contributors
- CMS-backed content feeding Jekyll builds
- Query-string values used by client-side JavaScript
- JSON blobs embedded into pages
- HTML passed through Liquid includes
The main problem is context. Escaping for text inside HTML is not the same as escaping for an HTML attribute, JavaScript string, URL, or raw HTML block.
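To make the context point concrete, here is a small sketch that encodes one payload three different ways. It uses Python's standard library as a stand-in; the three calls only roughly correspond to what you would reach for in Liquid (escape, jsonify, url_encode), and the variable names are mine.

```python
import html
import json
from urllib.parse import quote

payload = '"><script>alert(1)</script>'

# HTML text node or quoted attribute value: entity-encode markup characters.
as_html = html.escape(payload, quote=True)

# JavaScript string literal: JSON-encode, then replace "<" with a JS escape
# so the value cannot close a surrounding <script> element.
as_js = json.dumps(payload).replace("<", "\\u003c")

# URL query component: percent-encode everything outside the unreserved set.
as_url_param = quote(payload, safe="")

print(as_html)       # &quot;&gt;&lt;script&gt;alert(1)&lt;/script&gt;
print(as_js)         # "\">\u003cscript>alert(1)\u003c/script>"
print(as_url_param)  # %22%3E%3Cscript%3Ealert%281%29%3C%2Fscript%3E
```

Same input, three different safe outputs. None of them is safe in the other two contexts.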
The classic unsafe template
Here’s a simple layout that looks harmless:
<h1>{{ page.title }}</h1>
<p>{{ page.description }}</p>
If page.title is attacker-controlled and Liquid outputs it unescaped, front matter like this is all it takes:
---
title: '<img src=x onerror=alert(1)>'
description: 'Hello'
---
Rendered HTML:
<h1><img src=x onerror=alert(1)></h1>
<p>Hello</p>
That’s game over.
Jekyll’s Liquid does not HTML-escape output on its own, so whether this is exploitable mostly comes down to who can write that front matter. The safe habit is simple: always escape untrusted content for the output context.
Safer version:
<h1>{{ page.title | escape }}</h1>
<p>{{ page.description | escape }}</p>
This turns < into &lt;, > into &gt;, & into &amp;, and quotes into entities, which prevents the browser from interpreting the string as markup.
escape is necessary, but not enough
A lot of Jekyll advice stops at | escape. That’s fine for plain HTML text nodes, but it does not solve every context.
Safe in HTML body text
<div class="author-bio">{{ author.bio | escape }}</div>
Good.
Risky in attributes
<a href="{{ page.url }}">Read more</a>
If page.url is untrusted, escaping alone won’t stop javascript: URLs:
url: 'javascript:alert(1)'
Rendered:
<a href="javascript:alert(1)">Read more</a>
That’s still dangerous.
For URLs, validate allowed schemes and preferably restrict to relative URLs or trusted hosts.
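A practical pattern. Liquid has no to_s filter and no starts-with operator, so this coerces with append and compares leading slices of the string. It allows only site-relative paths and http(s) URLs, which means javascript: and protocol-relative // values fall through to the fallback:
{% assign safe_url = page.url | append: '' %}
{% assign first1 = safe_url | slice: 0 %}
{% assign first2 = safe_url | slice: 0, 2 %}
{% assign first7 = safe_url | slice: 0, 7 %}
{% assign first8 = safe_url | slice: 0, 8 %}
{% assign allowed = false %}
{% if first8 == 'https://' or first7 == 'http://' %}
{% assign allowed = true %}
{% elsif first1 == '/' and first2 != '//' %}
{% assign allowed = true %}
{% endif %}
{% if allowed %}
<a href="{{ safe_url | escape }}">Read more</a>
{% else %}
<a href="/">Read more</a>
{% endif %}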
Liquid isn’t great for serious validation logic, so my opinion: if you can, normalize URLs before they ever reach the template.
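If you do move the check out of the template, a pre-build normalizer could look roughly like this. It is a sketch under my own assumptions: safe_link, ALLOWED_SCHEMES and FALLBACK_URL are hypothetical names, and the policy (http/https or a site-relative path, everything else replaced) is just one reasonable choice.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}  # assumption: adjust to your site's needs
FALLBACK_URL = "/"                   # hypothetical safe default


def safe_link(value: object) -> str:
    """Return value if it is a site-relative path or an http(s) URL, else a fallback."""
    url = str(value or "").strip()
    # Browsers treat backslashes like slashes and drop control characters,
    # so reject both instead of trying to guess how they will be parsed.
    if not url or "\\" in url or any(ord(ch) < 0x20 for ch in url):
        return FALLBACK_URL
    try:
        parts = urlsplit(url)
    except ValueError:
        return FALLBACK_URL
    if parts.scheme:
        ok = parts.scheme.lower() in ALLOWED_SCHEMES and bool(parts.netloc)
        return url if ok else FALLBACK_URL
    # No scheme: accept site-relative paths only, not protocol-relative "//host".
    if url.startswith("/") and not url.startswith("//"):
        return url
    return FALLBACK_URL


print(safe_link("javascript:alert(1)"))  # /
print(safe_link("/docs/install/"))       # /docs/install/
```

The returned value still needs | escape when it lands in an attribute. Validation decides whether the URL is allowed; escaping keeps it from breaking out of the quotes.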
The markdownify trap
People often render contributor content like this:
<div class="content">
{{ page.user_bio | markdownify }}
</div>
This is risky because Markdown processors often allow raw HTML. If the input contains:
Hi there <script>alert(1)</script>
or
<img src=x onerror=alert(1)>
you may emit executable HTML directly into the page.
If the content is not fully trusted, don’t allow raw HTML through Markdown. Sanitize it before build time or use a markdown pipeline configured to strip dangerous HTML.
A common mistake is chaining filters like this:
{{ page.user_bio | escape | markdownify }}
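That may prevent HTML injection, but it can also break expected markdown rendering depending on the content, and Markdown link syntax such as [click](javascript:alert(1)) may still come out as a live link depending on the processor. The right fix is usually sanitization before rendering, not hoping filter order saves you.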
strip_html is not a sanitizer
I still see this:
{{ page.comment | strip_html }}
People assume this makes content safe. It doesn’t.
strip_html is a text cleanup tool, not a security boundary. It can remove tags, but it’s not designed to safely handle malformed HTML, encoded payloads, or context-sensitive output. And if you later place that content into JavaScript or attributes, you’ve just moved the problem.
Use escape for text output. Use sanitization when you intentionally allow limited HTML.
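Here is a quick illustration of why tag stripping doesn’t help once the value lands in an attribute. The strip_tags function below is a naive regex stand-in that I wrote for the demo; it is not Liquid’s actual strip_html implementation, but the point holds for any approach that only removes tags.

```python
import html
import re


def strip_tags(text: str) -> str:
    """Naive tag removal, standing in for a strip_html-style filter (an approximation)."""
    return re.sub(r"<[^>]*>", "", text)


# No angle brackets at all, so there is nothing for a tag stripper to remove.
comment = '" onmouseover="alert(1)'

stripped = strip_tags(comment)
unsafe_attr = f'<input value="{stripped}">'
safe_attr = f'<input value="{html.escape(stripped, quote=True)}">'

print(unsafe_attr)  # <input value="" onmouseover="alert(1)">
print(safe_attr)    # <input value="&quot; onmouseover=&quot;alert(1)">
```

The payload passes through the stripper untouched because it never contained a tag. Only attribute-aware escaping stops the quote from closing value.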
JSON inside <script> tags
This is one of the easiest ways to introduce XSS in Jekyll.
Unsafe:
<script>
window.siteData = {
title: "{{ page.title }}",
author: "{{ page.author }}"
};
</script>
If page.title contains a quote or </script>, the script block can break out:
title: '"; alert(1); //'
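Safer approach: serialize as JSON, not hand-built JavaScript, and neutralize < so a value can’t close the script element.
<script>
window.siteData = {{ page.data_object | jsonify | replace: '<', '\u003c' }};
</script>
Or for individual fields:
<script>
window.siteData = {
title: {{ page.title | jsonify | replace: '<', '\u003c' }},
author: {{ page.author | jsonify | replace: '<', '\u003c' }}
};
</script>
jsonify is the right tool for quoting here because JavaScript string escaping is not the same thing as HTML escaping. On its own it handles quotes and backslashes but leaves < as-is, so a value containing </script> can still end the block early. The replace step swaps every < for the JavaScript escape \u003c, which decodes back to < inside the string. Liquid string literals have no escape sequences, so '\u003c' is emitted literally. It’s worth confirming the rendered output on your own Jekyll version.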
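One caveat worth checking on your own build: a plain JSON encoder quotes the string correctly but generally does not touch <, so a </script> sequence inside a value can still terminate the element. The sketch below shows the effect with Python’s json module as a stand-in for any JSON serializer; script_safe_json is a hypothetical helper name.

```python
import json


def script_safe_json(value: object) -> str:
    """JSON-encode value, then escape "<" so it is inert inside a <script> element."""
    return json.dumps(value).replace("<", "\\u003c")


title = "</script><script>alert(1)</script>"

plain = json.dumps(title)
hardened = script_safe_json(title)

print(plain)     # still contains a literal </script>
print(hardened)  # every "<" is now \u003c
```

JSON only uses < inside string values, so replacing it everywhere in the serialized output doesn’t change the data: the JavaScript parser turns \u003c back into < when it reads the string.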
Data files are a huge footgun
Jekyll _data files feel “internal,” so people trust them too much.
Example _data/team.yml:
- name: Alice
  role: Engineer
- name: '<img src=x onerror=alert(1)>'
  role: Security
Template:
<ul>
{% for member in site.data.team %}
<li>
<strong>{{ member.name }}</strong>
<span>{{ member.role }}</span>
</li>
{% endfor %}
</ul>
If that YAML comes from a CMS export, partner feed, or non-technical editor, you’ve got stored XSS.
Safer:
<ul>
{% for member in site.data.team %}
<li>
<strong>{{ member.name | escape }}</strong>
<span>{{ member.role | escape }}</span>
</li>
{% endfor %}
</ul>
My rule is simple: treat all content as untrusted unless you personally own the input path and format.
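Escaping in the template is the fix, but a pre-build audit of data files is a cheap second check. A minimal sketch, assuming the data has already been parsed into plain dicts and lists (by a JSON or YAML parser of your choice); find_markup and the regex heuristic are mine, and the heuristic will miss payloads that don’t look like tags.

```python
import re
from typing import Any, Iterator, Tuple

# Heuristic: "<" followed by a letter, "/" or "!" looks like a tag, closing tag or comment.
MARKUP_RE = re.compile(r"<[A-Za-z/!]")


def find_markup(node: Any, path: str = "$") -> Iterator[Tuple[str, str]]:
    """Yield (path, value) for every string in nested data that looks like HTML."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from find_markup(child, f"{path}.{key}")
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from find_markup(child, f"{path}[{index}]")
    elif isinstance(node, str) and MARKUP_RE.search(node):
        yield path, node


# Same shape as the _data/team.yml example.
team = [
    {"name": "Alice", "role": "Engineer"},
    {"name": "<img src=x onerror=alert(1)>", "role": "Security"},
]

for where, value in find_markup(team):
    print(f"{where}: {value!r}")  # $[1].name: '<img src=x onerror=alert(1)>'
```

Failing the build on any finding is a blunt policy, but for fields that should only ever hold plain text it’s usually the right one.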
Includes can hide dangerous rendering
A lot of Jekyll sites centralize HTML in includes:
{% include card.html title=page.title summary=page.summary %}
Then card.html does this:
<div class="card">
<h2>{{ include.title }}</h2>
<p>{{ include.summary }}</p>
</div>
This is where XSS bugs become hard to track. The page looks clean, but the include is rendering raw values.
Safer include:
<div class="card">
<h2>{{ include.title | escape }}</h2>
<p>{{ include.summary | escape }}</p>
</div>
Better yet, decide in your team whether includes expect raw data or pre-sanitized data. Mixing both is how bugs survive code review.
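Since includes are where raw output hides, it can help to scan them mechanically. This is a rough linter sketch, not a Liquid parser: it finds output tags with a regex and flags any that have no encoding filter. The filter list and the ALLOW_RAW set are assumptions you would tune to your own policy.

```python
import re
from typing import List, Tuple

# Matches Liquid output tags such as {{ x }} and {{- x | f -}}.
OUTPUT_TAG = re.compile(r"\{\{-?\s*(.*?)\s*-?\}\}", re.DOTALL)

# Filters counted as "encodes for some context". Assumption: tune per project.
ENCODING_FILTERS = {"escape", "escape_once", "xml_escape", "jsonify", "url_encode", "cgi_escape"}

# Variables the team has agreed to render raw (hypothetical policy).
ALLOW_RAW = {"content"}


def unescaped_outputs(template: str) -> List[Tuple[int, str]]:
    """Return (line_number, expression) for output tags with no encoding filter."""
    findings = []
    for match in OUTPUT_TAG.finditer(template):
        expression = match.group(1)
        # Naive split: a "|" inside a quoted filter argument would confuse this.
        parts = [part.strip() for part in expression.split("|")]
        variable = parts[0]
        filters = {part.split(":")[0].strip() for part in parts[1:]}
        if variable in ALLOW_RAW or filters & ENCODING_FILTERS:
            continue
        line_number = template.count("\n", 0, match.start()) + 1
        findings.append((line_number, expression))
    return findings


card_html = """<div class="card">
<h2>{{ include.title }}</h2>
<p>{{ include.summary | escape }}</p>
</div>
"""

for line_number, expression in unescaped_outputs(card_html):
    print(f"line {line_number}: {{{{ {expression} }}}} has no encoding filter")
```

It only checks that some encoding filter is present, not that it’s the right one for the context or that a later filter doesn’t undo it, so treat findings as a review queue rather than a verdict.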
Raw HTML fields need sanitization, not hope
Sometimes you actually want to allow HTML, like a marketing callout or CMS block:
promo_html: '<a href="/sale">Big sale</a>'
Then:
<div class="promo">{{ page.promo_html }}</div>
That’s only safe if the HTML has already been sanitized with an allowlist. If you allow arbitrary HTML from editors or external systems, you need a sanitizer in the content pipeline before Jekyll sees it.
Good allowlist examples:
- Safe tags like p, a, strong, em, ul, li
- Safe attributes like href, title
- Restricted URL schemes for links
- No event handlers
- No style unless you really know what you’re doing
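To show what an allowlist like that means in practice, here is a minimal sanitizer sketch built on Python’s html.parser. It rebuilds the output from scratch, emitting only allowlisted tags and attributes and escaping all text. The names and the exact allowlist are my assumptions, it doesn’t rebalance unclosed tags, and for production content I’d still reach for a maintained sanitizer library.

```python
import html
from html.parser import HTMLParser
from typing import List, Optional, Tuple
from urllib.parse import urlsplit

ALLOWED_TAGS = {"p", "a", "strong", "em", "ul", "li"}
ALLOWED_ATTRS = {"a": {"href", "title"}}
ALLOWED_SCHEMES = {"http", "https", "mailto"}  # assumption: adjust per site
DROP_WITH_CONTENT = {"script", "style"}


def href_allowed(value: str) -> bool:
    """Accept http(s)/mailto URLs, site-relative paths, and fragment links."""
    url = value.strip()
    if "\\" in url or any(ord(ch) < 0x20 for ch in url):
        return False
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    if parts.scheme:
        return parts.scheme.lower() in ALLOWED_SCHEMES
    return url.startswith(("/", "#")) and not url.startswith("//")


class AllowlistSanitizer(HTMLParser):
    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.out: List[str] = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag: str, attrs: List[Tuple[str, Optional[str]]]) -> None:
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tags are dropped; their text children are kept
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue  # drops event handlers, style, and anything unlisted
            if name == "href" and not href_allowed(value):
                continue
            kept.append(f' {name}="{html.escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag: str) -> None:
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data: str) -> None:
        if not self.skip_depth:
            self.out.append(html.escape(data, quote=False))


def sanitize(fragment: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)


promo = '<p onclick="x()">Hi <a href="javascript:alert(1)" title="t">there</a><script>alert(1)</script></p>'
print(sanitize(promo))  # <p>Hi <a title="t">there</a></p>
```

The important design choice is that nothing from the input is copied through verbatim: tags are re-emitted from the allowlist and everything else is either escaped or dropped.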
Add CSP as a backup, not a primary fix
Content Security Policy won’t fix bad templates, but it can reduce blast radius. A solid CSP can block inline script execution, limit where scripts load from, and make many XSS payloads fail.
If you haven’t set one up yet, read up on CSP implementation details first.
For a Jekyll site, I’d start with something like:
Content-Security-Policy: default-src 'self'; script-src 'self'; object-src 'none'; base-uri 'self'; frame-ancestors 'none'
Then tighten it based on your actual asset needs.
You can also check your deployed headers with HeaderTest, which is useful when you’re debugging whether CSP and related headers are really being served by your CDN or hosting layer.
Practical safe patterns for Jekyll
Here’s the shortlist I’d enforce in code review:
1. Escape all plain text output
{{ value | escape }}
2. Use jsonify for JavaScript data, and escape < when the value lands inside a <script> block
{{ value | jsonify | replace: '<', '\u003c' }}
3. Validate URLs before rendering them into href or src
Prefer relative URLs or strict allowlists.
4. Never trust Markdown or HTML from contributors by default
Sanitize before build time.
5. Audit includes and partials
That’s where unescaped output often hides.
6. Don’t build HTML or JS strings manually
Use the right serializer for the context.
A before-and-after example
Unsafe component:
<article class="post">
<h1>{{ page.title }}</h1>
<p class="byline">
By <a href="{{ page.author_url }}">{{ page.author }}</a>
</p>
<script>
window.postMeta = {
title: "{{ page.title }}",
category: "{{ page.category }}"
};
</script>
<div class="body">
{{ content }}
</div>
</article>
Safer version:
<article class="post">
<h1>{{ page.title | escape }}</h1>
<p class="byline">
By
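{% assign author_url = page.author_url | append: '' %}
{% assign first1 = author_url | slice: 0 %}
{% assign first2 = author_url | slice: 0, 2 %}
{% assign first7 = author_url | slice: 0, 7 %}
{% assign first8 = author_url | slice: 0, 8 %}
{% assign url_ok = false %}
{% if first8 == 'https://' or first7 == 'http://' %}{% assign url_ok = true %}{% endif %}
{% if first1 == '/' and first2 != '//' %}{% assign url_ok = true %}{% endif %}
{% if url_ok %}
<a href="{{ author_url | escape }}">{{ page.author | escape }}</a>
{% else %}
<a href="/">{{ page.author | escape }}</a>
{% endif %}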
</p>
<script>
window.postMeta = {
title: {{ page.title | jsonify | replace: '<', '\u003c' }},
category: {{ page.category | jsonify | replace: '<', '\u003c' }}
};
</script>
<div class="body">
{{ content }}
</div>
</article>
That last {{ content }} is still a trust boundary. If your post body can contain unsafe HTML, this template is still vulnerable. That’s the part many teams miss.
The real fix
Jekyll XSS isn’t really a “static site” problem. It’s an output encoding problem plus a content trust problem.
If you remember only one thing, make it this: escape by context, sanitize when allowing HTML, and stop treating build-time content as inherently safe.
That mindset catches most Jekyll XSS bugs before they hit production.