SQL Injection Explained

What Is SQL Injection?

SQL injection (SQLi) is a vulnerability that allows an attacker to interfere with the database queries that an application makes. By inserting malicious SQL code into input fields, URL parameters, or HTTP headers, an attacker can manipulate the database query to do something the developer never intended.

The vulnerability exists whenever an application constructs SQL queries by concatenating user input directly into the query string. The database engine cannot distinguish between the developer's intended SQL and the attacker's injected code -- it executes all of it.

💡

SQL injection has been around since the late 1990s.

Despite being well-understood for over 25 years and having straightforward prevention methods, SQLi remains in the OWASP Top 10 and is still found in production applications. In 2023, OWASP merged it into the broader "Injection" category, which ranked third overall. The root cause is always the same: building queries with string concatenation instead of parameterized queries.

The consequences of a successful SQL injection attack range from unauthorized data access to complete database destruction. An attacker can read sensitive data, modify or delete records, execute administrative operations, and in some cases even gain control of the underlying operating system.

How SQL Injection Works

To understand SQL injection, you need to see how a vulnerable application processes user input. Consider a simple login form.

The Vulnerable Code

# Vulnerable Python code (DO NOT use this pattern)
username = request.form['username']
password = request.form['password']

query = "SELECT * FROM users WHERE username = '" + username + "' AND password = '" + password + "'"
result = db.execute(query)

When a normal user enters alice and mypassword, the query becomes:

SELECT * FROM users WHERE username = 'alice' AND password = 'mypassword'

This works correctly. But what if an attacker enters this as the username?

' OR '1'='1' --

The resulting query becomes:

SELECT * FROM users WHERE username = '' OR '1'='1' --' AND password = ''

The OR '1'='1' condition is always true, and the -- comments out the rest of the query (the password check). The database returns all users, and the application logs the attacker in as the first user in the table -- often the administrator.

⚠️

String concatenation is the root cause.

Every SQL injection vulnerability begins with the application treating user input as SQL code rather than as data. The single quote character ' breaks out of the string literal in the query, allowing the attacker to inject their own SQL. This is why string concatenation must never be used to build SQL queries with user input.

Beyond Authentication Bypass

The login bypass above is the classic introductory example, but SQL injection can do far more damage. An attacker who can inject SQL can:

Extract entire databases -- Read every table, including credentials, personal data, and financial records
Modify data -- Change prices, elevate user privileges, alter records
Delete data -- Drop tables or truncate critical data with DROP TABLE or DELETE FROM
Read system files -- On MySQL, LOAD_FILE('/etc/passwd') reads files from the server filesystem
Write files -- On MySQL, INTO OUTFILE can write files to disk, potentially creating a web shell
Execute OS commands -- On SQL Server, xp_cmdshell can execute arbitrary operating system commands

Types of SQL Injection

SQL injection attacks are classified by how the attacker retrieves results from the database. The type determines the attacker's technique, not the severity -- all types can lead to full database compromise.

Classic (In-Band) SQL Injection

In classic SQLi, the results of the injected query are displayed directly on the page. This is the simplest type to exploit because the attacker can immediately see the data they are extracting.

# Vulnerable URL:
https://shop.example.com/product?id=5

# The server runs:
SELECT name, price, description FROM products WHERE id = 5

# Attacker modifies the URL:
https://shop.example.com/product?id=5 OR 1=1

# The server runs:
SELECT name, price, description FROM products WHERE id = 5 OR 1=1
# Returns ALL products instead of just one

Union-Based SQL Injection

Union-based SQLi uses the UNION SELECT statement to append additional queries to the original. This allows the attacker to extract data from entirely different tables by combining results into the same output.

# Original query returns 3 columns (name, price, description)
# Attacker injects UNION SELECT to pull data from the users table:

https://shop.example.com/product?id=5 UNION SELECT username, password, email FROM users --

# The page now displays user credentials alongside product data
# The attacker must match the number of columns in the UNION

The key challenge in union-based attacks is determining the number and data types of columns in the original query. Attackers typically do this by injecting ORDER BY with increasing column numbers until they get an error, or by using UNION SELECT NULL, NULL, NULL... until the column count matches.

Blind SQL Injection

Blind SQLi occurs when the application does not display query results or error messages on the page. The attacker cannot directly see the data, so they infer information by observing the application's behavior.

Boolean-Based Blind

The attacker asks true/false questions and observes whether the page changes. For example, a page that displays "Product found" or "Product not found" can be used to extract data one character at a time.

# Is the first character of the admin password 'a'?
https://shop.example.com/product?id=5 AND (SELECT SUBSTRING(password,1,1) FROM users WHERE username='admin') = 'a'

# If the page shows "Product found" → first character is 'a'
# If the page shows "Product not found" → first character is NOT 'a'
# Repeat for each character position and value

Time-Based Blind

When even the boolean response is not visible, attackers use time delays. If the injected condition is true, the database waits for a specified duration before responding.

# MySQL time-based blind injection:
https://shop.example.com/product?id=5 AND IF((SELECT SUBSTRING(password,1,1) FROM users WHERE username='admin')='a', SLEEP(5), 0)

# If the response takes 5 seconds → first character is 'a'
# If the response is immediate → first character is NOT 'a'

💡

Blind SQLi is slow but effective.

Extracting data one character at a time is tedious for a human, but automated tools like sqlmap can extract entire databases through blind injection in minutes to hours, depending on the database size and response times.

Real-World SQL Injection Incidents

SQL injection has been behind some of the largest data breaches in history. These incidents demonstrate why prevention is critical.

Heartland Payment Systems (2008) -- SQL injection was the initial entry point in a breach that exposed 130 million credit card numbers. The attackers used SQLi to install packet-sniffing malware on payment processing servers. It was the largest credit card breach at the time.
Sony Pictures (2011) -- Attackers used a simple SQL injection to compromise databases containing personal information of over 77 million PlayStation Network accounts, including names, addresses, and credit card data. The breach cost Sony an estimated $171 million.
TalkTalk (2015) -- A SQL injection attack on the UK telecom provider exposed personal data of 157,000 customers, including bank account details. TalkTalk was fined 400,000 pounds. The attack was carried out by teenagers using basic SQLi techniques.
Equifax (2017) -- While the initial vulnerability was Apache Struts (not SQLi), the attackers used SQL injection techniques post-compromise to extract data from multiple databases, ultimately accessing records of 147 million people.

⚠️

SQLi is often trivially easy to exploit.

Many of these breaches did not require advanced skills or custom tools. Basic SQL injection attacks can be performed with nothing more than a web browser. The TalkTalk breach was carried out by a 15-year-old using freely available tools. Prevention is straightforward -- there is no excuse for shipping vulnerable code.

Prevention: Parameterized Queries

Parameterized queries (also called prepared statements) are the definitive solution to SQL injection. They work by separating the SQL code from the data, ensuring that user input is always treated as a value and never as executable SQL.

How They Work

Instead of concatenating user input into the query string, you define the query with placeholder parameters. The database driver sends the query structure and the data separately. The database engine compiles the query first, then binds the parameters as data values. It is structurally impossible for the data to alter the query logic.

Examples in Multiple Languages

# Python (with psycopg2 for PostgreSQL)
cursor.execute(
    "SELECT * FROM users WHERE username = %s AND password = %s",
    (username, password)
)

# Python (with SQLite3)
cursor.execute(
    "SELECT * FROM users WHERE username = ? AND password = ?",
    (username, password)
)

# PHP (with PDO)
$stmt = $pdo->prepare("SELECT * FROM users WHERE username = :username AND password = :password");
$stmt->execute(['username' => $username, 'password' => $password]);

# Java (with JDBC)
PreparedStatement stmt = conn.prepareStatement(
    "SELECT * FROM users WHERE username = ? AND password = ?"
);
stmt.setString(1, username);
stmt.setString(2, password);
ResultSet rs = stmt.executeQuery();

# Node.js (with mysql2)
const [rows] = await connection.execute(
    'SELECT * FROM users WHERE username = ? AND password = ?',
    [username, password]
);

🎉

Parameterized queries make SQLi impossible.

When using parameterized queries correctly, the database engine never interprets user input as SQL code. Even if an attacker enters ' OR '1'='1' --, the database treats the entire string as a literal username value, not as SQL syntax. The query returns no results because no user has that exact username.

Prevention: ORM Usage

Object-Relational Mappers (ORMs) provide an additional layer of abstraction over raw SQL. When used correctly, they automatically generate parameterized queries, making SQL injection nearly impossible by default.

ORM Examples

# Django ORM (Python) -- automatically parameterized
user = User.objects.filter(username=username, password=password).first()

# SQLAlchemy (Python) -- automatically parameterized
user = session.query(User).filter_by(username=username, password=password).first()

# Eloquent ORM (PHP/Laravel) -- automatically parameterized
$user = User::where('username', $username)->where('password', $password)->first();

# ActiveRecord (Ruby on Rails) -- automatically parameterized
user = User.find_by(username: username, password: password)

ORM Pitfalls

ORMs are safe by default, but most provide escape hatches for raw SQL. These escape hatches reintroduce the risk of SQL injection if used carelessly.

# DANGEROUS: Raw SQL in Django ORM
User.objects.raw("SELECT * FROM users WHERE username = '" + username + "'")

# SAFE: Raw SQL with parameters in Django ORM
User.objects.raw("SELECT * FROM users WHERE username = %s", [username])

# DANGEROUS: Raw SQL in SQLAlchemy
session.execute(text("SELECT * FROM users WHERE username = '" + username + "'"))

# SAFE: Raw SQL with parameters in SQLAlchemy
session.execute(text("SELECT * FROM users WHERE username = :name"), {"name": username})

⚠️

Raw SQL in ORMs requires the same discipline as direct database access.

Whenever you drop down to raw SQL within an ORM, you must use parameterized queries. The ORM's protection only applies to its query builder methods. Raw queries are your responsibility. Code review should flag any raw SQL that uses string concatenation or format strings with user input.

WAF Limitations and Defense in Depth

Web Application Firewalls (WAFs) are sometimes used to block SQL injection attempts by inspecting incoming HTTP requests for malicious patterns. While WAFs provide useful visibility and can catch basic attacks, they should never be your primary or only defense.

Why WAFs Are Not Enough

Bypass techniques -- Attackers use encoding tricks (double URL encoding, Unicode substitution, comment injection) to evade WAF pattern matching. For every WAF rule, there are documented bypasses.
False positives -- Aggressive WAF rules block legitimate requests. A user searching for "O'Brien" might trigger a SQL injection rule. This creates pressure to weaken the rules.
No structural protection -- WAFs pattern-match on text. They cannot understand the SQL query structure or verify that parameterized queries are being used. They treat symptoms, not the disease.
Maintenance burden -- WAF rules need constant updates as new bypass techniques are discovered. A set-and-forget WAF provides diminishing protection over time.

Defense in Depth Strategy

The correct approach is layered defense. No single measure is sufficient on its own.

Layer 1: Parameterized Queries The primary defense. Use prepared statements or ORM query builders for every database interaction. This eliminates the vulnerability at its source.

Layer 2: Input Validation Validate that input conforms to expected types, lengths, and formats. Reject data that does not match (a product ID should be a positive integer, not a string with SQL keywords).

Layer 3: Least-Privilege Database Accounts The application's database user should only have the permissions it needs. A read-only page should use a read-only database account. Never connect as root or a DBA account.

Layer 4: Error Handling Never expose database error messages to users. Detailed SQL errors reveal table names, column names, and query structure -- invaluable information for an attacker. Log errors server-side and show generic messages to users.

Layer 5: WAF Use a WAF as an additional detection and logging layer, not as a substitute for secure code. WAFs are most valuable for monitoring, alerting, and providing temporary protection for known vulnerabilities while code fixes are deployed.

# Example: Least-privilege MySQL user for a product catalog
CREATE USER 'catalog_reader'@'localhost' IDENTIFIED BY 'strong_random_password';
GRANT SELECT ON shop.products TO 'catalog_reader'@'localhost';
GRANT SELECT ON shop.categories TO 'catalog_reader'@'localhost';
-- This user CANNOT modify or delete data, even if SQLi is exploited
-- This user CANNOT access other databases or tables

Testing for SQL Injection

Identifying SQL injection vulnerabilities requires systematic testing of every input vector in the application. Here is a structured approach.

Manual Testing Steps

Identify input points: Map every place the application accepts user input: URL parameters, form fields, HTTP headers (Cookie, User-Agent, Referer), and JSON/XML request bodies.

Submit a single quote: Enter a single quote character (') into each input. If the application returns a database error message, it is very likely vulnerable. Even a generic 500 error is suspicious.

Test boolean conditions: For numeric parameters, try id=5 AND 1=1 (should return normal results) versus id=5 AND 1=2 (should return no results or different behavior). If both behave differently, the input is being interpreted as SQL.

Test time-based payloads: Try id=5 AND SLEEP(5) (MySQL) or id=5; WAITFOR DELAY '0:0:5' (SQL Server). If the response is delayed by 5 seconds, the application is vulnerable.

Automated Tools

sqlmap -- The industry-standard open-source SQLi exploitation tool. It automates detection, exploitation, and data extraction across MySQL, PostgreSQL, SQL Server, Oracle, and SQLite. Use with explicit permission only.
Burp Suite Scanner -- Professional web scanner with comprehensive SQLi detection. Identifies in-band, blind, and time-based vulnerabilities.
OWASP ZAP -- Free open-source scanner with active SQL injection detection modules.

# sqlmap basic usage (only on systems you have permission to test)
sqlmap -u "https://example.com/product?id=5" --batch --level=3 --risk=2

# Options:
# -u          Target URL with injectable parameter
# --batch     Non-interactive mode (accepts defaults)
# --level=3   Thoroughness level (1-5)
# --risk=2    Risk of test payloads (1-3, higher = more aggressive)
# --dbs       Enumerate databases after confirming injection

💡

Practice Legally and Ethically

Only test for SQL injection on applications you own or have written authorization to test. Use practice environments like DVWA, WebGoat, HackTheBox, or PortSwigger Web Security Academy. Unauthorized testing is a criminal offense in most jurisdictions, regardless of whether you cause damage.

Summary

SQL injection is one of the oldest and most damaging web application vulnerabilities. Here is what you learned:

SQL injection occurs when user input is concatenated directly into SQL queries, allowing attackers to manipulate the query logic
Classic SQLi returns results directly on the page; Union-based SQLi combines data from multiple tables; Blind SQLi extracts data through boolean or time-based inference
Real-world SQLi attacks have compromised hundreds of millions of records and caused billions of dollars in damages
Parameterized queries are the definitive prevention -- they structurally separate SQL code from user data, making injection impossible
ORMs use parameterized queries by default, but raw SQL escape hatches must still be parameterized
WAFs are a useful monitoring layer but cannot replace secure coding practices due to bypass techniques and false positives
Defense in depth combines parameterized queries, input validation, least-privilege database accounts, proper error handling, and WAF monitoring

🎉

Strong knowledge!

You now understand how SQL injection works and how to prevent it at every layer. Combined with your knowledge of XSS and security headers, you have a solid foundation in web application security. Continue exploring the OWASP Top 10 to learn about additional vulnerability classes.