The Most Common Vulnerabilities You Don't Understand
Deep technical examination of SQL injection, XSS, CSRF, and SSRF through the abstraction lens, showing how each vulnerability exploits the gap between what an engineer thinks a security mechanism does and what it actually does, with vulnerable and secure code comparisons.
The OWASP Top 10 reads like a greatest hits list that never changes. Injection, broken authentication, XSS, SSRF — the same vulnerability categories appear year after year, decade after decade. The tools change. The frameworks change. The languages change. The vulnerabilities persist.
Why? Because every vulnerability on that list exploits a gap between an abstraction and its implementation. Engineers learn “use parameterized queries” without understanding what parameterization actually does. They learn “React prevents XSS” without understanding where that protection ends. They learn “CORS handles cross-origin security” without understanding what the Same-Origin Policy actually enforces.
Understanding these vulnerabilities at the mechanism level — not just the prevention recipe — is the difference between an engineer who builds secure software and one who builds software that appears secure until someone tests it.
SQL Injection: Deeper Than You Think
You know the classic: '; DROP TABLE users; --. Bobby Tables. The solution is parameterized queries. End of story, right?
Not even close. Parameterized queries prevent value injection — they ensure user input is treated as data, not as SQL syntax. The database driver sends the query structure and the parameter values separately. The database parses the query first, then binds the values. No amount of creative quoting can escape the value context because quoting isn’t involved.
# Parameterized: query structure and data are separated
cursor.execute(
    "SELECT * FROM users WHERE email = %s AND status = %s",
    (user_email, 'active')
)
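The database receives two things: a query template with placeholder slots, and a list of values. It compiles the template into an execution plan, then fills in the values. The values never pass through the SQL parser. That’s the mechanism. That’s why it works. (Some drivers, psycopg2 among them, instead escape and interpolate the values on the client and send a single statement. The guarantee still holds as long as the values go through the parameter argument, but the protection then comes from the driver’s escaping rather than from the wire protocol.)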
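The difference is easy to observe with a throwaway database. The sketch below is only an illustration: it uses Python’s built-in sqlite3 module (whose placeholder is ? rather than %s), and the table, the helper names, and the payload are all assumptions made up for the example.

```python
import sqlite3

# In-memory database with a single row, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT, status TEXT)")
conn.execute("INSERT INTO users VALUES ('alice@example.com', 'active')")

payload = "nobody@example.com' OR '1'='1"


def find_unsafe(email):
    # String formatting: the payload becomes part of the SQL text.
    query = f"SELECT email FROM users WHERE email = '{email}'"
    return conn.execute(query).fetchall()


def find_safe(email):
    # Placeholder binding: the payload is compared as one plain string value.
    return conn.execute(
        "SELECT email FROM users WHERE email = ?", (email,)
    ).fetchall()


unsafe_rows = find_unsafe(payload)  # the OR clause matches every row
safe_rows = find_safe(payload)      # no user has that literal email
```

With the formatted query the payload rewrites the WHERE clause and the row for alice comes back; with the bound parameter the same payload is just an email address that nobody has.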
Now here’s where ORMs create a false sense of total safety. Django’s ORM parameterizes all standard operations. But engineers reach for escape hatches:
# Django: raw() with string formatting (VULNERABLE)
User.objects.raw(
    f"SELECT * FROM users WHERE name LIKE '%{search_term}%'"
)

# Django: extra() with unsanitized input (VULNERABLE)
User.objects.extra(
    where=[f"username = '{username}'"]
)

# Django: RawSQL expression with interpolation (VULNERABLE)
from django.db.models.expressions import RawSQL

User.objects.annotate(
    val=RawSQL(f"SELECT COUNT(*) FROM orders WHERE user_id = {user_id}", [])
)
Each of these is a place where the engineer left the ORM’s protection boundary without realizing it. The ORM promised safety. These methods let you break the promise.
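The escape hatches are not unsafe in themselves: raw(), extra(), and RawSQL each accept a list of parameters, and the vulnerability comes from formatting values into the SQL string instead of using it. The same idea can be sketched without Django. The example below assumes the standard-library sqlite3 module, a made-up users table, and two hypothetical helpers (escape_like, search_users); it shows the LIKE search from the first snippet with the wildcards moved into the bound value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?)", [("alice",), ("bob",), ("100%_real",)]
)


def escape_like(term, escape_char="\\"):
    # Escape LIKE metacharacters so user input only matches literally.
    return (
        term.replace(escape_char, escape_char * 2)
        .replace("%", escape_char + "%")
        .replace("_", escape_char + "_")
    )


def search_users(term):
    # The surrounding % wildcards are part of the value, not of the SQL text.
    pattern = f"%{escape_like(term)}%"
    rows = conn.execute(
        "SELECT name FROM users WHERE name LIKE ? ESCAPE '\\'", (pattern,)
    ).fetchall()
    return [name for (name,) in rows]
```

Escaping % and _ is not an injection defense (the binding already handles that); it only stops a user from turning a search box into a match-everything query.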
There’s also a class of injection that parameterization can’t solve: identifier injection. Column names, table names, and ORDER BY clauses can’t be parameterized because they’re structural parts of the query, not values. If your application lets users choose which column to sort by:
# VULNERABLE: column names can't be parameterized
def sort_users(sort_column):
    cursor.execute(f"SELECT * FROM users ORDER BY {sort_column}")

# SECURE: whitelist allowed column names
ALLOWED_SORT_COLUMNS = {'name', 'email', 'created_at'}

def sort_users(sort_column):
    if sort_column not in ALLOWED_SORT_COLUMNS:
        sort_column = 'created_at'
    cursor.execute(f"SELECT * FROM users ORDER BY {sort_column}")
The whitelist is the only defense here. You’re making a structural decision about the query, and the database has no mechanism to separate that from the SQL syntax. If you don’t validate the input against a known-safe set, you’re injecting user input into your query structure.
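One variation on the whitelist, sketched here under assumptions, maps the user-facing value to an identifier instead of checking membership, so that no user-supplied string ever reaches the query text. The key names, the direction handling, and build_order_by are hypothetical and not part of the example above.

```python
# User-facing sort keys mapped to real column names (hypothetical schema).
ALLOWED_SORT_COLUMNS = {"name": "name", "email": "email", "created": "created_at"}
ALLOWED_DIRECTIONS = {"asc": "ASC", "desc": "DESC"}


def build_order_by(sort_key, direction):
    # Unknown keys fall back to a default; only dictionary values, which the
    # developer wrote, are ever formatted into the SQL string.
    column = ALLOWED_SORT_COLUMNS.get(sort_key, "created_at")
    order = ALLOWED_DIRECTIONS.get(str(direction).lower(), "ASC")
    return f"SELECT * FROM users ORDER BY {column} {order}"
```

The same treatment applies to the sort direction, which is as much a structural part of the query as the column name.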
XSS: What Sanitization Actually Does
Cross-site scripting happens when user-supplied content is rendered as executable code in another user’s browser. The browser’s HTML parser doesn’t know the difference between markup the developer intended and markup an attacker injected. It parses everything the same way.
Sanitization is the process of transforming user input so it can’t be interpreted as active content. At the byte level, this means replacing characters that have special meaning in HTML with their entity equivalents:
< → &lt;
> → &gt;
" → &quot;
' → &#x27;
& → &amp;
When the browser encounters &lt;script&gt;, it renders the text <script> instead of creating a script element. The bytes are different. The parser treats them differently.
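The transformation can be reproduced with Python’s standard library. This is a small sketch of the encoding step only, using html.escape; the payload is an arbitrary example.

```python
import html

payload = "<script>alert('xss')</script>"

# quote=True also encodes double and single quotes, which matters when the
# value ends up inside an HTML attribute.
encoded = html.escape(payload, quote=True)

# Decoding the entities gives back the original text, which is what the
# browser displays to the user.
decoded = html.unescape(encoded)
```

The encoded string contains no < or > characters, so the HTML parser never sees the start of a tag.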
React’s JSX performs this encoding automatically for all interpolated values:
// Safe: React encodes the output
function UserGreeting({ name }) {
  return <h1>Hello, {name}</h1>;
}
// If name is "<script>alert('xss')</script>",
// React renders it as literal text: Hello, <script>alert('xss')</script>
// No script element is created, so nothing executes.
This works until you explicitly opt out:
// VULNERABLE: dangerouslySetInnerHTML bypasses encoding
function RichContent({ htmlContent }) {
  return <div dangerouslySetInnerHTML={{ __html: htmlContent }} />;
}
The name dangerouslySetInnerHTML is React’s way of screaming at you. But engineers use it for rendering markdown, displaying CMS content, embedding third-party widgets. Every use is a potential XSS vector unless the HTML has been sanitized with a library like DOMPurify (which runs in the browser, or on the server on top of a DOM implementation such as jsdom). A sanitizer of this kind parses the HTML, walks the DOM tree, and removes elements and attributes that could execute code: script tags, event handlers like onerror, javascript: URLs, and data: URLs with executable MIME types.
The abstraction gap: “React prevents XSS” is true for the default rendering path. It’s false for dangerouslySetInnerHTML, for href attributes that can contain javascript: URLs, and for server-side rendering contexts where the initial HTML is constructed outside React’s control.
// VULNERABLE: javascript: URL in href
function UserLink({ url }) {
  return <a href={url}>Click here</a>;
}
// If url is "javascript:alert(document.cookie)", clicking the link runs the script

// SECURE: validate URL protocol
function UserLink({ url }) {
  const safeUrl = /^https?:\/\//.test(url) ? url : "#";
  return <a href={safeUrl}>Click here</a>;
}
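The same check is worth repeating wherever the URL is stored or rendered on the server, since the component is not the only place a link can be emitted. The sketch below assumes a Python backend and a hypothetical helper named safe_href; it parses the URL rather than pattern-matching it, so mixed-case schemes and scheme-relative URLs are handled by the parser.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def safe_href(url, fallback="#"):
    """Return url if it is an absolute http(s) URL, otherwise the fallback.

    Assumes url is a str; urlsplit lowercases the scheme for us.
    """
    candidate = url.strip()
    try:
        parts = urlsplit(candidate)
    except ValueError:
        return fallback
    if parts.scheme in ALLOWED_SCHEMES and parts.netloc:
        return candidate
    return fallback
```

This is an allowlist of schemes, not a blocklist of dangerous ones: anything that is not recognizably http or https falls through to the placeholder.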
CSRF: Why Tokens Exist and Where SOP Fails
Cross-Site Request Forgery exploits the browser’s automatic credential attachment. When your browser sends a request to bank.com, it automatically includes your cookies for bank.com — regardless of which site initiated the request. An attacker’s page at evil.com can submit a form to bank.com/transfer, and the browser will helpfully attach your session cookie.
<!-- On evil.com: auto-submitting form -->
<form action="https://bank.com/api/transfer" method="POST" id="f">
  <input type="hidden" name="to" value="attacker-account" />
  <input type="hidden" name="amount" value="10000" />
</form>
<script>
  document.getElementById("f").submit();
</script>
The Same-Origin Policy (SOP) prevents evil.com from reading the response from bank.com. But it doesn’t prevent evil.com from sending the request. This is the critical distinction that most engineers miss. SOP protects confidentiality (reading cross-origin data), not integrity (sending cross-origin requests).
CSRF tokens close this gap. The server generates a random token, embeds it in the page, and requires it with every state-changing request. Since evil.com can’t read pages from bank.com (SOP prevents that), it can’t obtain the token, so it can’t forge a valid request.
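import secrets

from flask import abort, render_template, request, session

# `app` is the Flask application object, created elsewhere.

# Server: validate the CSRF token on every state-changing request
@app.before_request
def csrf_protect():
    if request.method in ('POST', 'PUT', 'PATCH', 'DELETE'):
        token = request.form.get('csrf_token') or request.headers.get('X-CSRF-Token')
        if not token or token != session.get('csrf_token'):
            abort(403, 'CSRF validation failed')

# Server: generate a token and embed it in the rendered form
@app.route('/form')
def render_form():
    token = secrets.token_hex(32)
    session['csrf_token'] = token
    return render_template('form.html', csrf_token=token)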
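A common hardening step for a check like this is to compare the tokens in constant time, so the comparison does not reveal how many leading characters matched. The following is a minimal sketch of that idea with the standard library; new_csrf_token and csrf_token_matches are hypothetical helper names, not part of any framework.

```python
import hmac
import secrets


def new_csrf_token():
    # 32 random bytes, hex-encoded to a 64-character string.
    return secrets.token_hex(32)


def csrf_token_matches(submitted, expected):
    # A missing token on either side is always a failure.
    if not submitted or not expected:
        return False
    # Encoding to bytes lets compare_digest accept non-ASCII input
    # instead of raising TypeError on it.
    return hmac.compare_digest(submitted.encode(), expected.encode())
```

In practice most frameworks ship this logic (Django’s CSRF middleware, Flask-WTF), and using the built-in version is usually preferable to maintaining a handwritten one.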
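The SameSite cookie attribute provides additional defense by telling the browser when to withhold cookies from cross-site requests. SameSite=Strict withholds the cookie on every cross-site request, including when a user follows a link to your site. SameSite=Lax (the default Chrome applies to cookies that don’t set the attribute) sends the cookie on top-level navigations that use a safe method such as GET, but withholds it on cross-site POST requests and on subrequests such as images, iframes, and fetch calls. Note that “site” is broader than “origin”: subdomains of the same registrable domain count as same-site, so SameSite does nothing against a hostile or compromised sibling subdomain.
But SameSite=Lax still sends cookies on top-level GET navigations, so if your application performs state changes on GET endpoints (a common mistake), SameSite=Lax doesn’t protect you. The abstraction of “modern browsers handle CSRF” has limits, and those limits depend on your application’s design.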
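Relying on the browser default is itself an assumption, so it is reasonable to set the attribute explicitly. The sketch below builds a Set-Cookie value with the standard-library http.cookies module (the samesite attribute needs Python 3.8 or later); the cookie name and the helper session_cookie_header are made up for the example.

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id, samesite="Lax"):
    """Build the value of a Set-Cookie header for a session cookie."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["samesite"] = samesite  # "Strict", "Lax", or "None"
    cookie["session"]["secure"] = True        # only sent over HTTPS
    cookie["session"]["httponly"] = True      # not readable from JavaScript
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()


header_value = session_cookie_header("abc123")
```

Web frameworks expose the same attributes through their own cookie or session settings; the point is that the value is chosen deliberately rather than inherited from whatever the browser does by default.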
SSRF: When Your Server Makes Requests for the Attacker
Server-Side Request Forgery happens when an application fetches a URL provided by the user, and the attacker supplies an internal URL. The server, which sits inside the network perimeter, can reach resources that the attacker can’t.
The canonical example targets cloud metadata endpoints:
# VULNERABLE: fetching user-supplied URL
@app.route('/preview')
def preview_url():
    url = request.args.get('url')
    response = requests.get(url)
    return response.text

# Attacker supplies: http://169.254.169.254/latest/meta-data/iam/security-credentials/
# Server returns: temporary AWS credentials with whatever permissions the EC2 role has
The IP address 169.254.169.254 is the AWS metadata service, accessible only from within the EC2 instance. It serves temporary credentials, instance identity documents, and user data scripts. When your application fetches a URL on behalf of a user, the attacker makes your server fetch its own credentials.
This was the core mechanism in the Capital One breach. A misconfigured WAF allowed SSRF, which accessed the metadata service, which returned credentials with overly broad S3 permissions.
The defense requires validating the URL at multiple levels:
import ipaddress
import socket
from urllib.parse import urlparse

BLOCKED_RANGES = [
    ipaddress.ip_network('169.254.0.0/16'),  # Link-local / metadata
    ipaddress.ip_network('10.0.0.0/8'),      # Private
    ipaddress.ip_network('172.16.0.0/12'),   # Private
    ipaddress.ip_network('192.168.0.0/16'),  # Private
    ipaddress.ip_network('127.0.0.0/8'),     # Loopback
    ipaddress.ip_network('0.0.0.0/8'),       # "This network"; 0.0.0.0 can reach localhost
]

def is_safe_url(url):
    parsed = urlparse(url)
    if parsed.scheme not in ('http', 'https') or not parsed.hostname:
        return False
    try:
        # Resolve DNS to check the actual IP (gethostbyname is IPv4-only)
        resolved_ip = ipaddress.ip_address(socket.gethostbyname(parsed.hostname))
        return not any(resolved_ip in network for network in BLOCKED_RANGES)
    except (socket.gaierror, ValueError):
        return False

@app.route('/preview')
def preview_url():
    url = request.args.get('url')
    if not url or not is_safe_url(url):
        abort(400, 'URL not allowed')
    response = requests.get(url, allow_redirects=False)  # No redirects!
    return response.text
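A hand-maintained blocklist like the one in is_safe_url is easy to leave incomplete, and it says nothing about IPv6. One possible alternative, sketched here as an assumption rather than a drop-in replacement, is to lean on the classification the ipaddress module already provides. The helper name is_public_ip is hypothetical.

```python
import ipaddress


def is_public_ip(address):
    """Return True only for addresses the ipaddress module considers global."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return False
    # An IPv4-mapped IPv6 address (::ffff:a.b.c.d) is judged by the IPv4
    # address it wraps, so the mapping cannot be used to dodge the check.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped
    # is_global is False for private, loopback, link-local, and other
    # reserved ranges, for both IPv4 and IPv6.
    return ip.is_global
```

This inverts the logic from "reject known-bad ranges" to "accept only known-public addresses", which fails closed when a new special-purpose range turns up.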
Note allow_redirects=False in preview_url: without it, an attacker could supply a safe-looking URL that redirects to 169.254.169.254. DNS rebinding is another bypass: a domain that resolves to a public IP for the first lookup and a private IP for the second. The code above is exposed to it, because is_safe_url resolves the hostname once and requests.get resolves it again when it connects. AWS addressed the metadata case on their end with IMDSv2, which requires a PUT request with a TTL header to obtain a session token before any metadata access. But IMDSv2 is opt-in unless you enforce it, and many instances still run with IMDSv1 exposed.
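The shape of the IMDSv2 exchange shows why it blunts a basic SSRF. The sketch below only constructs the two requests with urllib and never sends them; the helper names are hypothetical, while the token path and the two header names are the ones AWS documents for IMDSv2.

```python
from urllib.request import Request

IMDS_BASE = "http://169.254.169.254"


def build_token_request(ttl_seconds=21600):
    # Step 1: a PUT with a TTL header. A fetch-this-URL endpoint that can
    # only issue GETs with no custom headers cannot produce this request.
    return Request(
        f"{IMDS_BASE}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(path, token):
    # Step 2: every metadata read must carry the session token from step 1.
    return Request(
        f"{IMDS_BASE}/latest/meta-data/{path.lstrip('/')}",
        headers={"X-aws-ec2-metadata-token": token},
    )
```

An SSRF that lets the attacker choose the HTTP method and headers can still complete both steps, so IMDSv2 narrows the attack surface but does not replace URL validation.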
The Pattern Across All Four
Every one of these vulnerabilities follows the same structure. An abstraction promises safety: “the ORM prevents injection,” “React prevents XSS,” “SOP prevents cross-origin attacks,” “the firewall prevents internal access.” Each promise is true within a specific boundary. Each boundary has gaps. Attackers exploit the gaps.
The defense isn’t memorizing a checklist of mitigations. The defense is understanding the mechanism — what parameterization actually does at the database protocol level, what character encoding prevents at the parser level, what SOP enforces and what it permits, what network boundaries the metadata service sits behind. When you understand the mechanism, you can reason about novel situations. When you only know the recipe, you’re safe until you encounter a situation the recipe didn’t cover.
That situation arrives more often than anyone wants to admit.