Authentication Fundamentals
Authentication is the process of verifying that a user is who they claim to be. It is the gateway to your entire application. If authentication is broken, every other security control downstream becomes irrelevant because the attacker is already inside.
Authentication answers the question "Who are you?" while authorization answers "What are you allowed to do?" Both are critical, but authentication must come first and must be implemented correctly.
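Authentication factors fall into three categories:
- Something you know - A password, PIN, or passphrase
- Something you have - A phone running an authenticator app, a hardware security key, or a smart card
- Something you are - A biometric such as a fingerprint or face scan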
No single authentication factor is sufficient for high-security applications. Combining factors from different categories (multi-factor authentication) dramatically increases security because an attacker must compromise multiple independent systems.
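As a rough illustration of why the categories matter, the sketch below checks whether a set of credentials actually spans more than one category. The credential names and the mapping are hypothetical and exist only for this example; a password plus a security question is still single-factor, because both are things the user knows.

```python
from enum import Enum


class FactorCategory(Enum):
    KNOWLEDGE = "something you know"
    POSSESSION = "something you have"
    INHERENCE = "something you are"


# Hypothetical mapping from credential types to categories (illustrative only).
CREDENTIAL_CATEGORIES = {
    "password": FactorCategory.KNOWLEDGE,
    "security_question": FactorCategory.KNOWLEDGE,
    "totp_app": FactorCategory.POSSESSION,
    "hardware_key": FactorCategory.POSSESSION,
    "fingerprint": FactorCategory.INHERENCE,
}


def is_multi_factor(credentials):
    """Return True only if the credentials span at least two categories."""
    categories = {CREDENTIAL_CATEGORIES[name] for name in credentials}
    return len(categories) >= 2


print(is_multi_factor(["password", "totp_app"]))           # True
print(is_multi_factor(["password", "security_question"]))  # False
```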
Password Storage: Hashing and Salting
How you store passwords is one of the most consequential security decisions in your application. A database breach is not a matter of "if" but "when." When it happens, the difference between properly hashed passwords and poorly stored ones determines whether millions of accounts are compromised instantly or remain protected.
What NOT to Do
# CATASTROPHIC: Plain text storage
INSERT INTO users (username, password) VALUES ('alice', 'MyP@ssw0rd!');
# If the DB is breached, every password is immediately exposed
# DANGEROUS: Simple hash without salt
import hashlib
password_hash = hashlib.sha256(password.encode()).hexdigest()
# Vulnerable to rainbow tables - precomputed hash-to-password mappings
# Same password always produces the same hash across all users
# DANGEROUS: Hash with a static salt
password_hash = hashlib.sha256(('staticSalt2024' + password).encode()).hexdigest()
# If the salt is discovered (it will be - it's in your source code),
# the attacker builds one rainbow table and cracks all passwords
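The effect of a missing salt can be shown with the standard library alone. This is a demonstration sketch, not a storage scheme: even with a per-user salt, a single SHA-256 pass is far too fast for password storage. The helper names are made up for the example.

```python
import hashlib
import secrets


def unsalted_hash(password):
    """Fast, unsalted hash: identical passwords give identical digests."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def salted_hash(password, salt):
    """Same fast hash, but with a per-user random salt mixed in."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# Two users who happen to choose the same password.
same_unsalted = unsalted_hash("MyP@ssw0rd!") == unsalted_hash("MyP@ssw0rd!")

alice_salt = secrets.token_bytes(16)
bob_salt = secrets.token_bytes(16)
same_salted = (
    salted_hash("MyP@ssw0rd!", alice_salt) == salted_hash("MyP@ssw0rd!", bob_salt)
)

print(same_unsalted)  # True: one cracked hash exposes every user with that password
print(same_salted)    # False: each user's hash must be attacked separately
```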
The Right Way: bcrypt, scrypt, or Argon2
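Use a purpose-built password hashing algorithm that includes automatic salting and is computationally expensive (slow to brute-force). scrypt and Argon2 are additionally memory-hard, which makes them resistant to GPU/ASIC acceleration; bcrypt is not memory-hard, but it remains a sound, widely supported choice.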
# Python: bcrypt
import bcrypt

def hash_password(password):
    """Hash a password with bcrypt. Salt is generated automatically."""
    # Cost factor 12 takes roughly 250 ms per hash on typical hardware; benchmark on your own.
    salt = bcrypt.gensalt(rounds=12)
    return bcrypt.hashpw(password.encode('utf-8'), salt)

def verify_password(password, stored_hash):
    """Verify a password against a stored bcrypt hash."""
    return bcrypt.checkpw(password.encode('utf-8'), stored_hash)

# Usage:
hashed = hash_password('MySecurePassword123!')
# b'$2b$12$LJ3m4ys3Gzl7v5K9Z1X8aeKj2G...' (salt is embedded in the hash)
is_valid = verify_password('MySecurePassword123!', hashed) # True
is_valid = verify_password('WrongPassword', hashed) # False
// Node.js: bcrypt
const bcrypt = require('bcrypt');
const SALT_ROUNDS = 12;

async function hashPassword(password) {
  return await bcrypt.hash(password, SALT_ROUNDS);
}

async function verifyPassword(password, storedHash) {
  return await bcrypt.compare(password, storedHash);
}
MD5, SHA-1, SHA-256, and SHA-512 are designed to be fast. That is the opposite of what you want for password hashing. A modern GPU can compute billions of SHA-256 hashes per second, making brute-force attacks trivial. Purpose-built password hashing algorithms are intentionally slow (hundreds of milliseconds per hash) to make brute-force infeasible.
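For a memory-hard option that needs no third-party package, Python's standard library exposes scrypt through hashlib.scrypt. The following is a minimal sketch; the cost parameters (n, r, p) are illustrative assumptions that use roughly 16 MiB per hash and should be tuned for your hardware, and the function names are made up for the example.

```python
import hashlib
import hmac
import secrets

# Illustrative cost parameters (assumptions, not a recommendation for every system).
SCRYPT_N = 2**14  # CPU/memory cost
SCRYPT_R = 8      # block size
SCRYPT_P = 1      # parallelism


def hash_password_scrypt(password):
    """Return (salt, derived_key) for storage. The salt is random per password."""
    salt = secrets.token_bytes(16)
    key = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt, n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32,
    )
    return salt, key


def verify_password_scrypt(password, salt, expected_key):
    """Recompute the key with the stored salt and compare in constant time."""
    key = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt, n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32,
    )
    return hmac.compare_digest(key, expected_key)


salt, key = hash_password_scrypt("MySecurePassword123!")
print(verify_password_scrypt("MySecurePassword123!", salt, key))  # True
print(verify_password_scrypt("WrongPassword", salt, key))         # False
```

Unlike bcrypt's self-describing hash string, this sketch leaves it to you to store the salt and the parameters alongside the derived key.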
Session Management
After a user authenticates, a session maintains their authenticated state across multiple requests. Poor session management can undermine even the strongest authentication, giving attackers a way to hijack legitimate user sessions.
Session ID Requirements
- Cryptographically random - Use a CSPRNG (e.g., secrets.token_hex(32) in Python, crypto.randomBytes(32) in Node.js). Never use predictable values like sequential IDs or timestamps.
- Sufficient length - At least 128 bits (32 hex characters) of entropy to prevent brute-force guessing
- Transmitted securely - Only over HTTPS. Set the Secure cookie flag to prevent transmission over HTTP.
- Protected from JavaScript - Set the HttpOnly cookie flag to prevent XSS attacks from reading the session cookie
- Scoped correctly - Set appropriate Path and Domain attributes to limit where the cookie is sent
# Secure session cookie configuration
Set-Cookie: session_id=a3f8b2c1d4e5f6...;
Secure;
HttpOnly;
SameSite=Lax;
Path=/;
Max-Age=3600
// Express.js session configuration
const express = require('express');
const session = require('express-session');
const app = express();

app.use(session({
  secret: process.env.SESSION_SECRET, // Load from environment, not code
  name: 'sessionId', // Custom name (don't use default 'connect.sid')
  resave: false,
  saveUninitialized: false,
  cookie: {
    secure: true, // HTTPS only
    httpOnly: true, // No JavaScript access
    sameSite: 'lax', // CSRF protection
    maxAge: 3600000, // 1 hour in milliseconds
    path: '/'
  }
}));
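The same requirements can be sketched framework-free in Python using only the standard library: secrets for the session ID and http.cookies for the header value. The function names are hypothetical, and a real framework would normally build this header for you.

```python
import secrets
from http.cookies import SimpleCookie


def new_session_id():
    """32 random bytes from the OS CSPRNG, hex-encoded (64 characters)."""
    return secrets.token_hex(32)


def build_session_cookie(session_id, max_age_seconds=3600):
    """Return the value of a Set-Cookie header with the hardening attributes set."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    morsel = cookie["session_id"]
    morsel["secure"] = True        # HTTPS only
    morsel["httponly"] = True      # Not readable from JavaScript
    morsel["samesite"] = "Lax"     # Limits cross-site sending
    morsel["path"] = "/"
    morsel["max-age"] = max_age_seconds
    return morsel.OutputString()


header_value = build_session_cookie(new_session_id())
print("Set-Cookie:", header_value)
```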
Session Lifecycle Best Practices
- Regenerate on login - Create a new session ID after successful authentication to prevent session fixation attacks
- Invalidate on logout - Destroy the session on the server side, not just the cookie. A deleted cookie with a valid server session is still usable if intercepted.
- Set absolute timeouts - Sessions should expire after a fixed period (e.g., 8 hours) regardless of activity
- Set idle timeouts - Sessions should expire after a period of inactivity (e.g., 30 minutes)
- Regenerate on privilege change - Generate a new session ID whenever the user's privilege level changes (e.g., after enabling admin mode)
Store session data on the server (in memory, Redis, or a database), not in the cookie itself. The cookie should contain only the session ID. Storing session data in signed tokens such as JWTs requires careful consideration of token size and revocation: without additional server-side state, a token cannot be invalidated before it expires.
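The lifecycle rules above can be sketched as a small in-memory store. This is an illustration under stated assumptions: the class and method names are hypothetical, the timeouts reuse the example values from the list (30 minutes idle, 8 hours absolute), and a production system would keep this state in Redis or a database rather than a Python dict.

```python
import secrets
import time

IDLE_TIMEOUT = 30 * 60          # seconds of inactivity allowed
ABSOLUTE_TIMEOUT = 8 * 60 * 60  # maximum session lifetime in seconds


class SessionStore:
    def __init__(self, clock=time.time):
        self._sessions = {}
        self._clock = clock  # injectable so expiry can be tested

    def create(self, user_id=None):
        """Create a session under a fresh random ID and return that ID."""
        now = self._clock()
        session_id = secrets.token_hex(32)
        self._sessions[session_id] = {
            "user_id": user_id, "created_at": now, "last_seen": now,
        }
        return session_id

    def get(self, session_id):
        """Return the session record, or None if unknown or expired."""
        record = self._sessions.get(session_id)
        if record is None:
            return None
        now = self._clock()
        expired = (now - record["created_at"] > ABSOLUTE_TIMEOUT
                   or now - record["last_seen"] > IDLE_TIMEOUT)
        if expired:
            self.destroy(session_id)
            return None
        record["last_seen"] = now
        return record

    def regenerate(self, old_session_id, user_id):
        """On login or privilege change: discard the old ID, issue a new one."""
        self.destroy(old_session_id)
        return self.create(user_id)

    def destroy(self, session_id):
        """Server-side invalidation, e.g. on logout."""
        self._sessions.pop(session_id, None)
```

Because regenerate destroys the pre-login ID, a session ID planted by an attacker before login is useless afterwards.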
Multi-Factor Authentication
Multi-factor authentication (MFA) requires users to provide two or more independent authentication factors. Even if an attacker steals a password (through phishing, a data breach, or keylogging), they still cannot access the account without the second factor.
TOTP (Time-Based One-Time Passwords)
TOTP is one of the most widely deployed MFA methods. Apps like Google Authenticator, Authy, and KeePassXC generate six-digit codes that change every 30 seconds, derived from a shared secret and the current time.
# Python: Implementing TOTP with pyotp
import pyotp

# During user enrollment:
def generate_totp_secret():
    """Generate a new TOTP secret for user enrollment."""
    secret = pyotp.random_base32()
    # Store this secret securely in the database (encrypted)
    return secret

def get_provisioning_uri(secret, username, issuer='YourApp'):
    """Generate a QR code URI for authenticator apps."""
    totp = pyotp.TOTP(secret)
    # Returns: otpauth://totp/YourApp:alice?secret=BASE32SECRET&issuer=YourApp
    return totp.provisioning_uri(name=username, issuer_name=issuer)

# During login verification:
def verify_totp(secret, code):
    """Verify a TOTP code. Allows 1 window of clock skew."""
    totp = pyotp.TOTP(secret)
    return totp.verify(code, valid_window=1)
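To make "derived from a shared secret and the current time" concrete, here is a standard-library sketch of the derivation described in RFC 6238 (TOTP) and RFC 4226 (HOTP), assuming the common defaults of HMAC-SHA1, a 30-second step, and six digits. It is meant for understanding; in an application, prefer a maintained library such as pyotp.

```python
import base64
import hashlib
import hmac
import struct
import time


def hotp(key, counter, digits=6):
    """HOTP (RFC 4226): HMAC the counter, then dynamically truncate."""
    message = struct.pack(">Q", counter)            # 8-byte big-endian counter
    digest = hmac.new(key, message, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                      # low nibble picks the offset
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10**digits).zfill(digits)


def totp(base32_secret, for_time=None, step=30, digits=6):
    """TOTP (RFC 6238): HOTP where the counter is the number of elapsed steps."""
    key = base64.b32decode(base32_secret, casefold=True)
    if for_time is None:
        for_time = time.time()
    return hotp(key, int(for_time) // step, digits)


# Secret from the RFC test vectors, Base32-encoded as an authenticator app expects.
secret = base64.b32encode(b"12345678901234567890").decode("ascii")
print(totp(secret, for_time=59))  # 287082
```

Both sides compute the same code independently, which is why the server must store the shared secret and why clock drift needs the small tolerance window shown earlier.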
Hardware Security Keys (WebAuthn/FIDO2)
Hardware keys (YubiKey, Google Titan) are the strongest MFA factor available. They use public-key cryptography, are phishing-resistant (they verify the origin domain), and cannot be remotely cloned.
- Phishing-resistant - The key verifies the website's domain before responding, so a phishing site on a different domain receives nothing
- No shared secrets - Uses public/private key pairs instead of shared TOTP secrets. The private key never leaves the hardware device.
- No code to type - User taps the key or uses biometrics; no codes to intercept or phish
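On the server side, part of the phishing resistance comes from checking the client data that the browser signs along with the authenticator's response. The sketch below covers only that step, under stated assumptions: it checks the type, challenge, and origin fields of clientDataJSON, and it omits signature verification, RP ID hash checks, and counter handling, which a WebAuthn library should perform. The function name and parameters are hypothetical.

```python
import base64
import hmac
import json


def check_client_data(client_data_json, expected_challenge, expected_origin,
                      expected_type="webauthn.get"):
    """Check the type, origin, and challenge fields of WebAuthn clientDataJSON.

    client_data_json: raw bytes received from the browser
    expected_challenge: the random bytes this server issued for the ceremony
    expected_origin: e.g. "https://yourapp.com"
    """
    data = json.loads(client_data_json)
    if data.get("type") != expected_type:
        return False
    # A phishing page on another domain yields a different origin here.
    if data.get("origin") != expected_origin:
        return False
    # The challenge travels as unpadded base64url text.
    expected_b64 = base64.urlsafe_b64encode(expected_challenge).rstrip(b"=")
    received = str(data.get("challenge", "")).encode("utf-8")
    return hmac.compare_digest(received, expected_b64)
```

The origin comparison is an exact string match on purpose: a lookalike such as "https://yourapp.com.evil.example" must not pass.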
SMS codes are vulnerable to SIM swapping (an attacker convinces the carrier to transfer your phone number), SS7 network attacks (intercepting SMS in transit), and social engineering. While SMS MFA is better than no MFA at all, use TOTP or hardware keys whenever possible. NIST SP 800-63B classifies SMS delivered over the public telephone network as a "restricted" authenticator, which means organizations that use it are expected to assess the risk and offer users an alternative.
OAuth 2.0 and OpenID Connect
OAuth 2.0 is an authorization framework that allows users to grant third-party applications limited access to their accounts on another service without sharing their password. OpenID Connect (OIDC) is an identity layer built on top of OAuth 2.0 that adds authentication.
When to Use OAuth/OIDC
- "Login with Google/GitHub/Microsoft" - Delegating authentication to a trusted identity provider so you do not need to store passwords at all
- API authorization - Granting third-party apps scoped access to user data (e.g., a scheduling app reading events from your calendar)
- Single Sign-On (SSO) - Allowing users to authenticate once and access multiple related applications
The Authorization Code Flow (Most Secure)
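# Critical security checks in the OAuth callback (Flask-style request/session):
import hmac
import os
import requests
from flask import abort, session

def oauth_callback(request):
    # 1. Verify the state parameter matches what you sent (prevents CSRF).
    #    Reject missing values too: comparing None to None would otherwise pass.
    expected_state = session.pop('oauth_state', None)
    received_state = request.args.get('state')
    if not expected_state or not received_state or not hmac.compare_digest(
            expected_state.encode('utf-8'), received_state.encode('utf-8')):
        abort(403, 'Invalid state parameter')

    # 2. Exchange authorization code for tokens (server-to-server)
    token_response = requests.post('https://provider.com/oauth/token', data={
        'grant_type': 'authorization_code',
        'code': request.args.get('code'),
        'redirect_uri': 'https://yourapp.com/callback',
        'client_id': os.environ['OAUTH_CLIENT_ID'],
        'client_secret': os.environ['OAUTH_CLIENT_SECRET']  # Never expose this
    }, timeout=10)
    token_response.raise_for_status()  # Stop if the provider rejected the code

    # 3. Validate the ID token (if using OIDC)
    #    Verify signature, issuer, audience, and expiration

    # 4. Create a local session for the authenticated user.
    #    With OIDC, take identity claims from the validated ID token
    #    rather than trusting this payload as-is.
    user_info = token_response.json()
    session['user_id'] = find_or_create_user(user_info)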
For single-page applications and mobile apps (where the client_secret cannot be kept secret), use PKCE (Proof Key for Code Exchange). PKCE adds a code_verifier/code_challenge pair that prevents authorization code interception attacks. Many providers now require PKCE for all clients, not just public ones.
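The code_verifier/code_challenge pair can be produced with the standard library. This sketch follows the S256 method from RFC 7636; the endpoint URL, client ID, and function names are placeholders for illustration, and the exact parameters a given provider expects may differ.

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode


def make_code_verifier():
    """High-entropy random string; 32 bytes encode to 43 URL-safe characters."""
    return secrets.token_urlsafe(32)


def make_code_challenge(verifier):
    """S256 challenge: unpadded base64url of SHA-256(verifier)."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def build_authorization_url(authorize_endpoint, client_id, redirect_uri,
                            state, code_challenge):
    """Assemble the redirect that starts the authorization code flow."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "state": state,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
    }
    return f"{authorize_endpoint}?{urlencode(params)}"


verifier = make_code_verifier()          # Keep on the client; send it at token exchange
state = secrets.token_urlsafe(16)        # Store in the session for the callback check
url = build_authorization_url(
    "https://provider.com/oauth/authorize",  # placeholder endpoint
    "example-client-id",                      # placeholder client ID
    "https://yourapp.com/callback",
    state,
    make_code_challenge(verifier),
)
print(url)
```

The client later sends the original verifier with the token request; an attacker who intercepted only the authorization code cannot produce a verifier that hashes to the registered challenge.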
Common Authentication Vulnerabilities
Even well-intentioned authentication implementations frequently contain vulnerabilities. These are the patterns that attackers look for first.
- Username enumeration - Different error messages for "user not found" vs. "wrong password" reveal which usernames exist. Use a generic message like "Invalid username or password" for both cases.
- Timing attacks - If your code returns immediately when the username does not exist but takes time to hash when it does, an attacker can determine valid usernames by measuring response times. Always hash something, even for non-existent users.
- Credential stuffing - Automated attacks using username/password pairs leaked from other breaches. Defend with rate limiting, CAPTCHAs after failed attempts, and monitoring for bulk login attempts from single IPs.
- Session fixation - An attacker sets a known session ID in the victim's browser before they log in. The application must generate a new session ID upon successful authentication.
- Insufficient brute-force protection - No rate limiting or account lockout on login endpoints. Implement progressive delays: 1s after 3 failures, 5s after 5, 30s after 10, temporary lockout after 20.
- Insecure "remember me" - Persistent login tokens stored as predictable values or without proper expiration. Use a cryptographically random token, store only its hash in the database, and require re-authentication for sensitive operations.
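The progressive-delay schedule from the brute-force item can be written as a small lookup. This is a sketch of the thresholds listed above only; how failures are counted (per account, per IP, or both) and how long the temporary lockout lasts are left as decisions for your application, and the names are hypothetical.

```python
LOCKED_OUT = "locked"  # sentinel meaning "temporarily refuse login attempts"


def login_delay(failed_attempts):
    """Map a count of consecutive failed logins to a delay in seconds.

    Returns 0 for no delay, a number of seconds to wait, or LOCKED_OUT.
    """
    if failed_attempts >= 20:
        return LOCKED_OUT
    if failed_attempts >= 10:
        return 30
    if failed_attempts >= 5:
        return 5
    if failed_attempts >= 3:
        return 1
    return 0


for attempts in (0, 3, 5, 10, 20):
    print(attempts, login_delay(attempts))
```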
# Preventing timing-based username enumeration
import bcrypt

# Pre-computed dummy hash (same cost factor as real hashes)
DUMMY_HASH = bcrypt.hashpw(b'dummy_password', bcrypt.gensalt(rounds=12))

def login(username, password):
    user = find_user(username)
    if user is None:
        # IMPORTANT: Still hash the password to prevent timing attacks
        bcrypt.checkpw(password.encode('utf-8'), DUMMY_HASH)
        return None, 'Invalid username or password'
    if bcrypt.checkpw(password.encode('utf-8'), user.password_hash):
        return user, 'Success'
    else:
        return None, 'Invalid username or password' # Same message as above
Secure Password Reset
Password reset flows are a frequent target because they bypass the primary authentication mechanism entirely. A poorly implemented reset flow is equivalent to having no password at all.
Secure Reset Flow
import secrets
import hashlib
from datetime import datetime, timedelta

def create_reset_token(user_id):
    """Generate a secure password reset token."""
    token = secrets.token_urlsafe(32) # 256 bits of randomness
    token_hash = hashlib.sha256(token.encode()).hexdigest()
    expiry = datetime.utcnow() + timedelta(minutes=30)
    # Store the HASH, not the token. If DB is breached, tokens are useless.
    db.execute(
        "INSERT INTO password_resets (user_id, token_hash, expires_at) VALUES (%s, %s, %s)",
        (user_id, token_hash, expiry)
    )
    return token # Send this in the email link

def verify_reset_token(token):
    """Verify a reset token and return the associated user_id."""
    token_hash = hashlib.sha256(token.encode()).hexdigest()
    result = db.execute(
        "SELECT user_id, expires_at FROM password_resets WHERE token_hash = %s",
        (token_hash,)
    )
    if not result:
        return None
    if result.expires_at < datetime.utcnow():
        return None # Token expired
    return result.user_id
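One property the flow above does not show is single use: once a token has reset a password, it should stop working. The following self-contained sketch illustrates that idea with an in-memory dict standing in for the password_resets table; the function names and the dict are assumptions made for the example.

```python
import hashlib
import secrets
from datetime import datetime, timedelta, timezone

RESET_TOKEN_LIFETIME = timedelta(minutes=30)

# Stand-in for the database table: token_hash -> (user_id, expires_at)
_reset_tokens = {}


def _hash_token(token):
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def issue_reset_token(user_id, now=None):
    """Create a token, store only its hash, and return the raw token for the email."""
    now = now or datetime.now(timezone.utc)
    token = secrets.token_urlsafe(32)
    _reset_tokens[_hash_token(token)] = (user_id, now + RESET_TOKEN_LIFETIME)
    return token


def consume_reset_token(token, now=None):
    """Return the user_id for a valid token and invalidate it; otherwise None."""
    now = now or datetime.now(timezone.utc)
    # pop() removes the entry whether or not it is still valid: single use.
    entry = _reset_tokens.pop(_hash_token(token), None)
    if entry is None:
        return None
    user_id, expires_at = entry
    if expires_at < now:
        return None  # Token expired
    return user_id
```

With a real database, the lookup and the delete should happen in one transaction so that two concurrent requests cannot both redeem the same token.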
Questions like "What is your mother's maiden name?" or "What city were you born in?" can often be answered using social media, public records, or simple guessing. Do not use security questions as a sole recovery mechanism. If you must include them, treat them as a weak second factor, not a primary reset method.
Summary
In this tutorial, you learned:
- The three authentication factor categories and why combining them provides defense in depth
- Why password hashing requires purpose-built algorithms (bcrypt, scrypt, Argon2) and not general-purpose hash functions
- How to securely manage sessions with proper cookie attributes, regeneration, and timeouts
- Multi-factor authentication methods from weakest (SMS) to strongest (hardware keys / WebAuthn)
- OAuth 2.0 and OpenID Connect flows, including the critical security checks in the callback handler
- Common authentication vulnerabilities including username enumeration, timing attacks, and session fixation
- How to implement a secure password reset flow with hashed tokens, expiration, and session invalidation
Authentication is the foundation of your application's security. By using proven password hashing algorithms, managing sessions correctly, implementing MFA, and avoiding common pitfalls, you create a strong first line of defense against unauthorized access.