Authentication Fundamentals

Authentication is the process of verifying that a user is who they claim to be. It is the gateway to your entire application. If authentication is broken, every other security control downstream becomes irrelevant because the attacker is already inside.

Authentication answers the question "Who are you?" while authorization answers "What are you allowed to do?" Both are critical, but authentication must come first and must be implemented correctly.

Authentication factors fall into three categories:

  • Something you know - Passwords, PINs, security questions. The most common factor, but also the most vulnerable to theft, guessing, and phishing.
  • Something you have - A phone (for SMS/TOTP codes), a hardware security key (YubiKey), or a smart card. Harder to steal remotely, but can be lost or cloned.
  • Something you are - Biometrics like fingerprints, facial recognition, or iris scans. Convenient but cannot be changed if compromised.
💡
The principle of defense in depth

No single authentication factor is sufficient for high-security applications. Combining factors from different categories (multi-factor authentication) dramatically increases security because an attacker must compromise multiple independent systems.

Password Storage: Hashing and Salting

How you store passwords is one of the most consequential security decisions in your application. A database breach is not a matter of "if" but "when." When it happens, the difference between properly hashed passwords and poorly stored ones determines whether millions of accounts are compromised instantly or remain protected.

What NOT to Do

# CATASTROPHIC: Plain text storage
INSERT INTO users (username, password) VALUES ('alice', 'MyP@ssw0rd!');
# If the DB is breached, every password is immediately exposed

# DANGEROUS: Simple hash without salt
import hashlib
password_hash = hashlib.sha256(password.encode()).hexdigest()
# Vulnerable to rainbow tables - precomputed hash-to-password mappings
# Same password always produces the same hash across all users

# DANGEROUS: Hash with a static salt
password_hash = hashlib.sha256(('staticSalt2024' + password).encode()).hexdigest()
# If the salt is discovered (it will be - it's in your source code),
# the attacker builds one rainbow table and cracks all passwords

The Right Way: bcrypt, scrypt, or Argon2

Use a purpose-built password hashing algorithm that salts automatically, is deliberately expensive to compute (slow to brute-force), and ideally is memory-hard (resistant to GPU/ASIC acceleration). bcrypt provides the first two properties; scrypt and Argon2 provide all three.

# Python: bcrypt
import bcrypt

def hash_password(password):
    """Hash a password with bcrypt. Salt is generated automatically."""
    salt = bcrypt.gensalt(rounds=12)  # Cost factor 12: roughly 250 ms per hash, hardware-dependent; benchmark yours
    return bcrypt.hashpw(password.encode('utf-8'), salt)

def verify_password(password, stored_hash):
    """Verify a password against a stored bcrypt hash."""
    return bcrypt.checkpw(password.encode('utf-8'), stored_hash)

# Usage:
hashed = hash_password('MySecurePassword123!')
# b'$2b$12$LJ3m4ys3Gzl7v5K9Z1X8aeKj2G...'  (salt is embedded in the hash)

is_valid = verify_password('MySecurePassword123!', hashed)  # True
is_valid = verify_password('WrongPassword', hashed)          # False

// Node.js: bcrypt
const bcrypt = require('bcrypt');
const SALT_ROUNDS = 12;

async function hashPassword(password) {
    return await bcrypt.hash(password, SALT_ROUNDS);
}

async function verifyPassword(password, storedHash) {
    return await bcrypt.compare(password, storedHash);
}
  • bcrypt - The most widely supported option. Built-in salt, configurable cost factor. Supported in every major language. Input is limited to 72 bytes; longer passwords are truncated or rejected, depending on the library.
  • scrypt - Memory-hard in addition to CPU-hard. More resistant to ASIC/GPU attacks than bcrypt. Good choice for new projects.
  • Argon2 - Winner of the Password Hashing Competition (2015). The current recommended standard. Configurable memory, time, and parallelism costs. Use the Argon2id variant.
⚠️
Never use general-purpose hash functions for passwords

MD5, SHA-1, SHA-256, and SHA-512 are designed to be fast. That is the opposite of what you want for password hashing. A modern GPU can compute billions of SHA-256 hashes per second, which makes brute-forcing typical user passwords practical. Purpose-built password hashing algorithms are intentionally slow (tens to hundreds of milliseconds per hash), which makes large-scale guessing prohibitively expensive.
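
As a rough illustration of per-password salting with a memory-hard function, the sketch below uses scrypt from Python's standard library (hashlib.scrypt, available when Python is built against OpenSSL 1.1 or later). The function names, the "salt$key" storage format, and the cost parameters are assumptions chosen for the example, not tuned recommendations.

```python
import hashlib
import hmac
import secrets

# Illustrative cost parameters (assumptions for this sketch, not tuned advice).
SCRYPT_N = 2**14  # CPU/memory cost; must be a power of two
SCRYPT_R = 8      # block size
SCRYPT_P = 1      # parallelism
SALT_BYTES = 16
KEY_BYTES = 32


def _derive(password: str, salt: bytes) -> bytes:
    """Derive a key from the password and salt with scrypt."""
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=SCRYPT_N,
        r=SCRYPT_R,
        p=SCRYPT_P,
        maxmem=64 * 1024 * 1024,  # allow up to 64 MiB for the derivation
        dklen=KEY_BYTES,
    )


def hash_password_scrypt(password: str) -> str:
    """Return 'salt_hex$key_hex' using a fresh random salt per password."""
    salt = secrets.token_bytes(SALT_BYTES)
    return f"{salt.hex()}${_derive(password, salt).hex()}"


def verify_password_scrypt(password: str, stored: str) -> bool:
    """Recompute the key with the stored salt and compare in constant time."""
    salt_hex, key_hex = stored.split("$")
    candidate = _derive(password, bytes.fromhex(salt_hex))
    return hmac.compare_digest(candidate, bytes.fromhex(key_hex))
```

Because the salt is random, hashing the same password twice gives two different stored values, so one precomputed table cannot be reused across users.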

Session Management

After a user authenticates, a session maintains their authenticated state across multiple requests. Poor session management can undermine even the strongest authentication, giving attackers a way to hijack legitimate user sessions.

Session ID Requirements

  • Cryptographically random - Use a CSPRNG (e.g., secrets.token_hex(32) in Python, crypto.randomBytes(32) in Node.js). Never use predictable values like sequential IDs or timestamps.
  • Sufficient length - At least 128 bits (32 hex characters) of entropy to prevent brute-force guessing
  • Transmitted securely - Only over HTTPS. Set the Secure cookie flag to prevent transmission over HTTP.
  • Protected from JavaScript - Set the HttpOnly cookie flag to prevent XSS attacks from reading the session cookie
  • Scoped correctly - Set appropriate Path and Domain attributes to limit where the cookie is sent
# Secure session cookie configuration
Set-Cookie: session_id=a3f8b2c1d4e5f6...;
    Secure;
    HttpOnly;
    SameSite=Lax;
    Path=/;
    Max-Age=3600

// Express.js session configuration (express-session middleware)
const session = require('express-session');

app.use(session({
    secret: process.env.SESSION_SECRET,  // Load from environment, not code
    name: 'sessionId',                    // Custom name (don't use default 'connect.sid')
    resave: false,
    saveUninitialized: false,
    cookie: {
        secure: true,      // HTTPS only
        httpOnly: true,     // No JavaScript access
        sameSite: 'lax',   // CSRF protection
        maxAge: 3600000,   // 1 hour in milliseconds
        path: '/'
    }
}));
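
For comparison, here is a minimal Python sketch that generates a session ID with a CSPRNG and builds a cookie value carrying the same attributes, using only the standard library. The function name build_session_cookie and the cookie name session_id are assumptions for illustration; most web frameworks set these attributes for you through their own configuration.

```python
import secrets
from http.cookies import SimpleCookie
from typing import Tuple


def build_session_cookie(max_age_seconds: int = 3600) -> Tuple[str, str]:
    """Return (session_id, Set-Cookie header value) with hardened attributes."""
    session_id = secrets.token_hex(32)  # 32 random bytes -> 64 hex characters

    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    morsel = cookie["session_id"]
    morsel["secure"] = True       # only send over HTTPS
    morsel["httponly"] = True     # hide from JavaScript
    morsel["samesite"] = "Lax"    # limit cross-site sending
    morsel["path"] = "/"
    morsel["max-age"] = max_age_seconds

    # OutputString() yields the header value without the "Set-Cookie:" prefix.
    return session_id, morsel.OutputString()
```

The returned session ID is what the server would store alongside the session data; the header value is what goes to the browser.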

Session Lifecycle Best Practices

  • Regenerate on login - Create a new session ID after successful authentication to prevent session fixation attacks
  • Invalidate on logout - Destroy the session on the server side, not just the cookie. A deleted cookie with a valid server session is still usable if intercepted.
  • Set absolute timeouts - Sessions should expire after a fixed period (e.g., 8 hours) regardless of activity
  • Set idle timeouts - Sessions should expire after a period of inactivity (e.g., 30 minutes)
  • Regenerate on privilege change - Generate a new session ID whenever the user's privilege level changes (e.g., after enabling admin mode)
💡
Server-side session storage

Store session data on the server (in memory, Redis, or a database), not in the cookie itself. The cookie should contain only the session ID. Storing session data in signed tokens or cookies (such as JWTs) requires careful consideration of token size and of revocation: without extra server-side state, a stateless token cannot be invalidated before it expires.
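
The lifecycle rules above can be sketched with a small in-memory store. This is an illustrative model only: the class name SessionStore, its methods, and the timeout constants are assumptions, and a production system would typically keep this state in Redis or a database rather than a Python dict.

```python
import secrets
import time
from typing import Dict, Optional

IDLE_TIMEOUT = 30 * 60          # 30 minutes of inactivity
ABSOLUTE_TIMEOUT = 8 * 60 * 60  # 8 hours regardless of activity


class SessionStore:
    """In-memory server-side session store (illustrative only)."""

    def __init__(self) -> None:
        self._sessions: Dict[str, dict] = {}

    def create(self, user_id: str, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        session_id = secrets.token_hex(32)
        self._sessions[session_id] = {
            "user_id": user_id,
            "created_at": now,
            "last_seen": now,
        }
        return session_id

    def get_user(self, session_id: str, now: Optional[float] = None) -> Optional[str]:
        """Return the user for a live session, enforcing both timeouts."""
        now = time.time() if now is None else now
        record = self._sessions.get(session_id)
        if record is None:
            return None
        expired = (
            now - record["created_at"] > ABSOLUTE_TIMEOUT
            or now - record["last_seen"] > IDLE_TIMEOUT
        )
        if expired:
            self.destroy(session_id)
            return None
        record["last_seen"] = now
        return record["user_id"]

    def regenerate(self, session_id: str) -> Optional[str]:
        """Move the session to a fresh ID (on login or privilege change)."""
        record = self._sessions.pop(session_id, None)
        if record is None:
            return None
        new_id = secrets.token_hex(32)
        self._sessions[new_id] = record
        return new_id

    def destroy(self, session_id: str) -> None:
        """Server-side invalidation, as required on logout."""
        self._sessions.pop(session_id, None)
```

Because regenerate() removes the old ID before issuing the new one, an ID planted by an attacker before login stops working once the user authenticates.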

Multi-Factor Authentication

Multi-factor authentication (MFA) requires users to provide two or more independent authentication factors. Even if an attacker steals a password (through phishing, a data breach, or keylogging), they still cannot access the account without the second factor.

TOTP (Time-Based One-Time Passwords)

TOTP is the most common MFA method. Apps like Google Authenticator, Authy, and KeePassXC generate six-digit codes that change every 30 seconds, derived from a shared secret and the current time.

# Python: Implementing TOTP with pyotp
import pyotp

# During user enrollment:
def generate_totp_secret():
    """Generate a new TOTP secret for user enrollment."""
    secret = pyotp.random_base32()
    # Store this secret securely in the database (encrypted)
    return secret

def get_provisioning_uri(secret, username, issuer='YourApp'):
    """Generate a QR code URI for authenticator apps."""
    totp = pyotp.TOTP(secret)
    return totp.provisioning_uri(name=username, issuer_name=issuer)
    # Returns: otpauth://totp/YourApp:alice?secret=BASE32SECRET&issuer=YourApp

# During login verification:
def verify_totp(secret, code):
    """Verify a TOTP code. Allows 1 window of clock skew."""
    totp = pyotp.TOTP(secret)
    return totp.verify(code, valid_window=1)
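
To make the "shared secret plus current time" derivation concrete, the following sketch computes a TOTP code with only the standard library, following the HMAC-SHA1 construction from RFC 6238 and RFC 4226. It is meant to show what a library such as pyotp does internally, not to replace it; the function name totp and its parameters are assumptions for this example.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret_b32: str, for_time: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Compute a TOTP code from a base32 secret (HMAC-SHA1 variant)."""
    key = base64.b32decode(secret_b32, casefold=True)

    # The moving factor is the number of 30-second steps since the Unix epoch.
    timestamp = time.time() if for_time is None else for_time
    counter = int(timestamp // step)

    # HOTP: HMAC the 8-byte big-endian counter with the shared secret.
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()

    # Dynamic truncation: the low 4 bits of the last byte select an offset,
    # then 31 bits starting at that offset are reduced to the digit count.
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10**digits).zfill(digits)
```

With the RFC 6238 test secret (the ASCII bytes 12345678901234567890, base32-encoded) and a time of 59 seconds, this function yields 287082, which is the published test vector truncated to six digits.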

Hardware Security Keys (WebAuthn/FIDO2)

Hardware keys (YubiKey, Google Titan) are the strongest MFA factor available. They use public-key cryptography, are phishing-resistant (they verify the origin domain), and cannot be remotely cloned.

  • Phishing-resistant - The key verifies the website's domain before responding, so a phishing site on a different domain receives nothing
  • No shared secrets - Uses public/private key pairs instead of shared TOTP secrets. The private key never leaves the hardware device.
  • No code to type - User taps the key or uses biometrics; no codes to intercept or phish
⚠️
SMS-based MFA is the weakest option

SMS codes are vulnerable to SIM swapping (an attacker convinces the carrier to transfer your phone number), SS7 network attacks (intercepting SMS in transit), and social engineering. While SMS MFA is better than no MFA at all, use TOTP or hardware keys whenever possible. NIST SP 800-63B classifies SMS and voice delivery over the public telephone network as a restricted authenticator, which means organizations that rely on it are expected to assess the risk and offer an alternative.

OAuth 2.0 and OpenID Connect

OAuth 2.0 is an authorization framework that allows users to grant third-party applications limited access to their accounts on another service without sharing their password. OpenID Connect (OIDC) is an identity layer built on top of OAuth 2.0 that adds authentication.

When to Use OAuth/OIDC

  • "Login with Google/GitHub/Microsoft" - Delegating authentication to a trusted identity provider so you do not need to store passwords at all
  • API authorization - Granting third-party apps scoped access to user data (e.g., a scheduling app reading your calendar events)
  • Single Sign-On (SSO) - Allowing users to authenticate once and access multiple related applications

The Authorization Code Flow (Most Secure)

1. Your app redirects the user to the identity provider (e.g., Google) with a client_id, redirect_uri, scope, and a random state parameter
2. The user authenticates with the identity provider and grants permission for the requested scopes
3. The provider redirects back to your app with a short-lived authorization code and the state parameter
4. Your server exchanges the authorization code for an access token (and optionally a refresh token) via a server-to-server request
5. Your server uses the access token to fetch the user's profile from the identity provider's userinfo endpoint

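# Critical security checks in the OAuth callback (Flask-style sketch;
# the Flask imports are an assumption based on the request/session/abort usage)
import os
import requests
from flask import abort, session

def oauth_callback(request):
    # 1. Verify the state parameter matches what you sent (prevents CSRF).
    #    pop() makes the value single-use; a missing state must fail too.
    expected_state = session.pop('oauth_state', None)
    if not expected_state or request.args.get('state') != expected_state:
        abort(403, 'Invalid state parameter')

    # 2. Exchange authorization code for tokens (server-to-server)
    token_response = requests.post('https://provider.com/oauth/token', data={
        'grant_type': 'authorization_code',
        'code': request.args.get('code'),
        'redirect_uri': 'https://yourapp.com/callback',
        'client_id': os.environ['OAUTH_CLIENT_ID'],
        'client_secret': os.environ['OAUTH_CLIENT_SECRET']  # Never expose this
    }, timeout=10)
    token_response.raise_for_status()
    tokens = token_response.json()

    # 3. Validate the ID token (if using OIDC)
    # Verify signature, issuer, audience, and expiration

    # 4. Fetch the user's profile with the access token
    #    (the userinfo URL is a placeholder, like the token URL above)
    userinfo_response = requests.get(
        'https://provider.com/oauth/userinfo',
        headers={'Authorization': f"Bearer {tokens['access_token']}"},
        timeout=10,
    )
    userinfo_response.raise_for_status()

    # 5. Create a local session for the authenticated user
    #    (find_or_create_user is an application-specific helper)
    session['user_id'] = find_or_create_user(userinfo_response.json())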
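
The first step of the flow, redirecting the user with a random state value, might look like the sketch below. The endpoint URL, the default scope, and the function name build_authorization_url are placeholders for illustration; the session argument is any dict-like session object, and the oauth_state key matches the value the callback checks.

```python
import secrets
from urllib.parse import urlencode

# Placeholder endpoint; use the authorization URL your provider documents.
AUTHORIZE_ENDPOINT = "https://provider.com/oauth/authorize"


def build_authorization_url(session: dict, client_id: str, redirect_uri: str,
                            scope: str = "openid profile email") -> str:
    """Create a random state, remember it in the session, and build the URL."""
    state = secrets.token_urlsafe(32)  # unguessable, tied to this browser session
    session["oauth_state"] = state     # compared against the callback's state

    params = {
        "response_type": "code",       # requests the authorization code flow
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"
```

Storing the state in the server-side session is what lets the callback reject responses that did not originate from this user's own login attempt.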
💡
PKCE: Required for public clients

For single-page applications and mobile apps (where the client_secret cannot be kept secret), use PKCE (Proof Key for Code Exchange). PKCE adds a code_verifier/code_challenge pair that prevents authorization code interception attacks. Many providers now require PKCE for all clients, not just public ones.
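
As a small illustration of the code_verifier/code_challenge pair, this sketch derives an S256 challenge the way RFC 7636 describes: the challenge is the base64url-encoded SHA-256 digest of the verifier, without padding. The function names are assumptions for this example.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    """Random verifier; 32 bytes encode to 43 URL-safe characters."""
    return secrets.token_urlsafe(32)


def make_code_challenge(verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(verifier)) with '=' padding removed."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client sends the challenge (with code_challenge_method=S256) in the authorization request and the original verifier in the token request, so an intercepted authorization code is useless without the verifier.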

Common Authentication Vulnerabilities

Even well-intentioned authentication implementations frequently contain vulnerabilities. These are the patterns that attackers look for first.

  • Username enumeration - Different error messages for "user not found" vs. "wrong password" reveal which usernames exist. Use a generic message like "Invalid username or password" for both cases.
  • Timing attacks - If your code returns immediately when the username does not exist but takes time to hash when it does, an attacker can determine valid usernames by measuring response times. Always hash something, even for non-existent users.
  • Credential stuffing - Automated attacks using username/password pairs leaked from other breaches. Defend with rate limiting, CAPTCHAs after failed attempts, and monitoring for bulk login attempts from single IPs.
  • Session fixation - An attacker sets a known session ID in the victim's browser before they log in. The application must generate a new session ID upon successful authentication.
  • Insufficient brute-force protection - No rate limiting or account lockout on login endpoints. Implement progressive delays: 1s after 3 failures, 5s after 5, 30s after 10, temporary lockout after 20.
  • Insecure "remember me" - Persistent login tokens stored as predictable values or without proper expiration. Use a cryptographically random token, store only its hash in the database, and require re-authentication for sensitive operations.
# Preventing timing-based username enumeration
import bcrypt

# Pre-computed dummy hash (same cost factor as real hashes)
DUMMY_HASH = bcrypt.hashpw(b'dummy_password', bcrypt.gensalt(rounds=12))

def login(username, password):
    user = find_user(username)

    if user is None:
        # IMPORTANT: Still hash the password to prevent timing attacks
        bcrypt.checkpw(password.encode('utf-8'), DUMMY_HASH)
        return None, 'Invalid username or password'

    if bcrypt.checkpw(password.encode('utf-8'), user.password_hash):
        return user, 'Success'
    else:
        return None, 'Invalid username or password'  # Same message as above
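
The progressive-delay schedule from the brute-force bullet can be expressed as a small lookup. This is a sketch of the policy only; the function name login_delay_seconds and the choice to signal lockout with None are assumptions, and counting failures per account or per IP is left to the application.

```python
from typing import Optional

LOCKOUT_THRESHOLD = 20  # temporary lockout at this many failures

# (minimum failures, delay in seconds), checked from the highest tier down.
DELAY_TIERS = [(10, 30.0), (5, 5.0), (3, 1.0)]


def login_delay_seconds(failed_attempts: int) -> Optional[float]:
    """Return the delay before the next attempt; None means locked out."""
    if failed_attempts >= LOCKOUT_THRESHOLD:
        return None
    for threshold, delay in DELAY_TIERS:
        if failed_attempts >= threshold:
            return delay
    return 0.0
```

A login handler would look up the failure count for the account, apply the returned delay (or refuse the attempt when the result is None), and reset the count after a successful login.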

Secure Password Reset

Password reset flows are a frequent target because they bypass the primary authentication mechanism entirely. A poorly implemented reset flow is equivalent to having no password at all.

Secure Reset Flow

1. User requests reset by entering their email. The application responds with the same message regardless of whether the email exists: "If an account with that email exists, a reset link has been sent."
2. Server generates a token using a CSPRNG (at least 32 bytes), stores its hash (not the token itself) with an expiration timestamp, and sends the token in a URL to the user's email
3. User clicks the link and is taken to a password reset form. The server validates the token hash and checks expiration (15-60 minutes maximum).
4. User sets new password. The server hashes the new password, invalidates the reset token, invalidates all existing sessions for the account, and optionally notifies the user via email that their password was changed

import secrets
import hashlib
from datetime import datetime, timedelta, timezone

# `db` stands for the application's database handle. Its execute() is assumed
# to return the first matching row (or None); adapt this to your driver.
# expires_at is assumed to be a timezone-aware timestamp column.

def create_reset_token(user_id):
    """Generate a secure password reset token."""
    token = secrets.token_urlsafe(32)  # 256 bits of randomness
    # A fast hash is acceptable here: unlike a password, the token is
    # high-entropy, so it cannot be guessed offline.
    token_hash = hashlib.sha256(token.encode()).hexdigest()
    expiry = datetime.now(timezone.utc) + timedelta(minutes=30)

    # Store the HASH, not the token. If DB is breached, tokens are useless.
    db.execute(
        "INSERT INTO password_resets (user_id, token_hash, expires_at) VALUES (%s, %s, %s)",
        (user_id, token_hash, expiry)
    )
    return token  # Send this in the email link

def verify_reset_token(token):
    """Verify a reset token and return the associated user_id."""
    token_hash = hashlib.sha256(token.encode()).hexdigest()
    result = db.execute(
        "SELECT user_id, expires_at FROM password_resets WHERE token_hash = %s",
        (token_hash,)
    )
    if not result:
        return None
    if result.expires_at < datetime.now(timezone.utc):
        return None  # Token expired
    # After a successful reset, delete the row so the token is single-use.
    return result.user_id
⚠️
Security questions are not secure

Questions like "What is your mother's maiden name?" or "What city were you born in?" can often be answered using social media, public records, or simple guessing. Do not use security questions as a sole recovery mechanism. If you must include them, treat them as a weak second factor, not a primary reset method.

Summary

In this tutorial, you learned:

  • The three authentication factor categories and why combining them provides defense in depth
  • Why password hashing requires purpose-built algorithms (bcrypt, scrypt, Argon2) and not general-purpose hash functions
  • How to securely manage sessions with proper cookie attributes, regeneration, and timeouts
  • Multi-factor authentication methods from weakest (SMS) to strongest (hardware keys / WebAuthn)
  • OAuth 2.0 and OpenID Connect flows, including the critical security checks in the callback handler
  • Common authentication vulnerabilities including username enumeration, timing attacks, and session fixation
  • How to implement a secure password reset flow with hashed tokens, expiration, and session invalidation
🎉
You can now build secure authentication!

Authentication is the foundation of your application's security. By using proven password hashing algorithms, managing sessions correctly, implementing MFA, and avoiding common pitfalls, you create a strong first line of defense against unauthorized access.