.. meta::
    :description lang=en: Common Python security vulnerabilities and why legacy cryptographic patterns are insecure, with attack demonstrations
    :keywords: Python, Python3, security, vulnerability, PyCrypto, padding oracle, PKCS1 v1.5, AES-CBC, timing attack, insecure

===============================
Common Security Vulnerabilities
===============================

:Source: `src/security/vulnerability_.py <https://github.com/crazyguitar/pysheeet/blob/master/src/security/vulnerability_.py>`_

.. contents:: Table of Contents
    :backlinks: none

Introduction
------------

This page explains why certain cryptographic patterns are insecure and how
attackers can exploit them. Understanding these vulnerabilities helps you
recognize dangerous code in legacy systems and avoid introducing similar
weaknesses in new projects. For secure implementations, see
:doc:`python-crypto` and :doc:`python-tls`.

AES-CBC Without Authentication (Padding Oracle)
-----------------------------------------------

AES-CBC mode encrypts data but provides no integrity protection. An attacker
who can modify ciphertext and observe whether decryption succeeds can recover
the plaintext byte-by-byte through a **padding oracle attack**. This attack
exploits the PKCS#7 padding validation to leak information.

**Vulnerable Code:**

.. code-block:: python

    # INSECURE: AES-CBC without authentication
    from Crypto.Cipher import AES

    def encrypt_cbc(key, iv, plaintext):
        cipher = AES.new(key, AES.MODE_CBC, iv)
        # Manual PKCS#7 padding
        pad_len = 16 - (len(plaintext) % 16)
        padded = plaintext + bytes([pad_len] * pad_len)
        return cipher.encrypt(padded)

    def decrypt_cbc(key, iv, ciphertext):
        cipher = AES.new(key, AES.MODE_CBC, iv)
        padded = cipher.decrypt(ciphertext)
        # VULNERABLE: Padding validation leaks information
        pad_len = padded[-1]
        if not all(b == pad_len for b in padded[-pad_len:]):
            raise ValueError("Invalid padding")  # Oracle!
        return padded[:-pad_len]

**Why It's Vulnerable:**

The padding validation error reveals whether the decrypted padding is valid.
An attacker can:

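1. Intercept a ciphertext block and the block (or IV) that precedes it
2. Modify the last byte of the preceding block
3. Submit the modified ciphertext to the server and observe whether a padding error occurs
4. Try up to 256 values to determine one plaintext byte
5. Repeat for each remaining byte, working backwards through the block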

.. code-block:: python

    # Simplified padding oracle attack concept
    def padding_oracle_attack(ciphertext, oracle_func):
        """
        oracle_func returns True if padding is valid, False otherwise.
        This leaks enough information to decrypt without the key.
        """
        # For each block, XOR previous block to control decrypted value
        # Try all 256 values until padding is valid
        # Valid padding reveals the intermediate state
        # XOR with known value gives plaintext
        pass  # Full implementation is complex but well-documented
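
The steps above can be sketched end to end with only the standard library.
This is a minimal illustration under stated assumptions, not a production
tool: a toy 4-round Feistel cipher built from SHA-256 stands in for AES so
the example has no third-party dependency (the attack relies only on CBC
chaining and the oracle, not on the cipher itself), and the names
``toy_encrypt_block``, ``make_oracle`` and ``recover_block`` are hypothetical.

.. code-block:: python

    import hashlib
    import os

    BLOCK = 16

    def xor(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    def _f(key, i, half):
        # Round function of the toy Feistel cipher (NOT a real cipher)
        return hashlib.sha256(key + bytes([i]) + half).digest()[:8]

    def toy_encrypt_block(key, block):
        left, right = block[:8], block[8:]
        for i in range(4):
            left, right = right, xor(left, _f(key, i, right))
        return left + right

    def toy_decrypt_block(key, block):
        left, right = block[:8], block[8:]
        for i in reversed(range(4)):
            left, right = xor(right, _f(key, i, left)), left
        return left + right

    def cbc_encrypt(key, iv, plaintext):
        pad_len = BLOCK - len(plaintext) % BLOCK
        padded = plaintext + bytes([pad_len]) * pad_len
        out, prev = b"", iv
        for i in range(0, len(padded), BLOCK):
            prev = toy_encrypt_block(key, xor(padded[i:i + BLOCK], prev))
            out += prev
        return out

    def make_oracle(key):
        """Server side: reports only whether one block's padding is valid."""
        def oracle(prev_block, block):
            plain = xor(toy_decrypt_block(key, block), prev_block)
            pad_len = plain[-1]
            return 1 <= pad_len <= BLOCK and plain.endswith(bytes([pad_len]) * pad_len)
        return oracle

    def recover_block(prev_block, block, oracle):
        """Attacker side: recover one plaintext block without the key."""
        intermediate = bytearray(BLOCK)  # cipher output before the CBC XOR
        for pos in range(BLOCK - 1, -1, -1):
            pad = BLOCK - pos
            forged = bytearray(BLOCK)
            for j in range(pos + 1, BLOCK):
                forged[j] = intermediate[j] ^ pad  # force known tail to `pad`
            for guess in range(256):
                forged[pos] = guess
                if not oracle(bytes(forged), block):
                    continue
                if pos == BLOCK - 1:
                    # Rule out accidental matches such as "\x02\x02"
                    forged[pos - 1] ^= 1
                    still_valid = oracle(bytes(forged), block)
                    forged[pos - 1] ^= 1
                    if not still_valid:
                        continue
                intermediate[pos] = guess ^ pad
                break
            else:
                raise RuntimeError("oracle never reported valid padding")
        return xor(intermediate, prev_block)

    key, iv = os.urandom(16), os.urandom(16)
    secret = b"attack at dawn, bring snacks"
    ciphertext = cbc_encrypt(key, iv, secret)

    oracle = make_oracle(key)  # the attacker may call it but never sees `key`
    blocks = [iv] + [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]
    padded = b"".join(recover_block(p, c, oracle) for p, c in zip(blocks, blocks[1:]))
    recovered = padded[:-padded[-1]]
    print(recovered == secret)

Each byte costs at most 256 oracle queries (plus one confirmation query for
the last byte of a block), so a 16-byte block falls in a few thousand requests.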

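**Secure Alternative:** Use AES-GCM, which provides authenticated encryption:

.. code-block:: python

    import os

    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    plaintext = b"secret message"
    associated_data = b"header"  # authenticated but not encrypted; may be None

    key = AESGCM.generate_key(bit_length=256)
    aesgcm = AESGCM(key)
    nonce = os.urandom(12)  # must never repeat for the same key
    # Encryption includes authentication tag - tampering is detected
    ciphertext = aesgcm.encrypt(nonce, plaintext, associated_data)
    # Decryption raises cryptography.exceptions.InvalidTag if anything changed
    assert aesgcm.decrypt(nonce, ciphertext, associated_data) == plaintext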

RSA PKCS#1 v1.5 Padding (Bleichenbacher Attack)
-----------------------------------------------

RSA with PKCS#1 v1.5 padding is vulnerable to the **Bleichenbacher attack**
(also called the "million message attack"). If a server reveals whether
decryption produced valid PKCS#1 v1.5 padding, an attacker can decrypt
messages or forge signatures.

**Vulnerable Code:**

.. code-block:: python

    # INSECURE: PKCS#1 v1.5 padding
    from Crypto.Cipher import PKCS1_v1_5
    from Crypto.PublicKey import RSA

    def decrypt_rsa_v15(private_key_pem, ciphertext):
        key = RSA.import_key(private_key_pem)
        cipher = PKCS1_v1_5.new(key)
        # VULNERABLE: Different errors for padding vs other failures
        plaintext = cipher.decrypt(ciphertext, sentinel=None)
        if plaintext is None:
            raise ValueError("Decryption failed")  # Oracle!
        return plaintext

**Why It's Vulnerable:**

PKCS#1 v1.5 encryption padding has a fixed structure:
``0x00 0x02 [at least 8 nonzero random bytes] 0x00 [message]``. If the server
responds differently when the padding is invalid than when decryption fails
for other reasons, that difference acts as an oracle. An attacker can:

1. Capture a target ciphertext ``c``
2. Compute ``c' = c * s^e mod n`` for various ``s`` values
3. Submit ``c'`` and check if padding is valid
4. Use valid/invalid responses to narrow down the plaintext
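
The multiplication in step 2 works because textbook RSA is multiplicatively
malleable. The sketch below illustrates only that property, assuming toy
parameters (``p = 61``, ``q = 53``) that are far too small for real use and
no padding at all; it is not an implementation of the full attack.

.. code-block:: python

    # Toy textbook-RSA parameters, chosen only for illustration
    p, q = 61, 53
    n = p * q
    e = 17
    d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (Python 3.8+)

    m = 65            # the victim's plaintext, unknown to the attacker
    c = pow(m, e, n)  # the intercepted ciphertext

    s = 2             # attacker-chosen multiplier
    c_prime = (c * pow(s, e, n)) % n

    # The server decrypts c' to m * s mod n. The attacker never sees this
    # value, only whether it happens to begin with 0x00 0x02.
    decrypted = pow(c_prime, d, n)
    print(decrypted == (m * s) % n)

Every ``s`` that yields valid padding tells the attacker that ``m * s mod n``
lies in a known range, which is what lets the plaintext be narrowed down.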

**Secure Alternative:** Use RSA-OAEP padding:

.. code-block:: python

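    from cryptography.hazmat.primitives.asymmetric import padding, rsa
    from cryptography.hazmat.primitives import hashes

    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()
    plaintext = b"secret message"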

    ciphertext = public_key.encrypt(
        plaintext,
        padding.OAEP(
            mgf=padding.MGF1(algorithm=hashes.SHA256()),
            algorithm=hashes.SHA256(),
            label=None
        )
    )

Timing Attacks on String Comparison
-----------------------------------

Comparing secrets using ``==`` is vulnerable to **timing attacks**. The
comparison stops at the first differing byte, so the time taken reveals
how many leading bytes match. An attacker can recover the secret one byte
at a time.

**Vulnerable Code:**

.. code-block:: python

    # INSECURE: Regular string comparison
    def verify_token(user_token, stored_token):
        return user_token == stored_token  # Timing leak!

    def verify_signature(computed_sig, provided_sig):
        return computed_sig == provided_sig  # Timing leak!

**Why It's Vulnerable:**

.. code-block:: python

    # Demonstration of timing difference
    import time

    secret = b"correct_secret_token_here"

    def insecure_compare(a, b):
        if len(a) != len(b):
            return False
        for x, y in zip(a, b):
            if x != y:
                return False  # Returns early - timing leak
        return True

    # Attacker measures time for different guesses:
    # "a..." - fails fast (wrong first byte)
    # "c..." - takes slightly longer (first byte correct)
    # "co..." - even longer (two bytes correct)
    # Eventually recovers entire secret
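
Real timing measurements are noisy, so the following sketch replaces the
clock with a counter: the hypothetical ``counting_compare`` uses the same
early-exit logic as ``insecure_compare`` but also reports how many byte
comparisons ran. It assumes the attacker already knows the secret's length.
A real attack would need many timed samples per guess and statistical
filtering, but the search strategy is the same.

.. code-block:: python

    def counting_compare(a, b):
        """Early-exit comparison that also returns the number of bytes compared."""
        if len(a) != len(b):
            return False, 0
        steps = 0
        for x, y in zip(a, b):
            steps += 1
            if x != y:
                return False, steps
        return True, steps

    def recover_secret(check, length):
        """Guess one byte at a time, keeping the guess that 'took longest'."""
        known = b""
        for pos in range(length):
            filler = bytes(length - pos - 1)

            def score(byte):
                equal, steps = check(known + bytes([byte]) + filler)
                return steps, equal  # more steps = longer matching prefix

            known += bytes([max(range(256), key=score)])
        return known

    secret = b"correct_secret_token_here"
    recovered = recover_secret(
        lambda guess: counting_compare(guess, secret), len(secret)
    )
    print(recovered == secret)

The search needs at most ``256 * len(secret)`` guesses rather than
``256 ** len(secret)``, which is why the leak matters.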

**Secure Alternative:** Use constant-time comparison:

.. code-block:: python

    import hmac

    def verify_token(user_token, stored_token):
        # hmac.compare_digest takes the same time wherever the inputs differ;
        # pass two bytes objects or two ASCII-only str values
        return hmac.compare_digest(user_token, stored_token)
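
The same function covers the signature check from the vulnerable example. A
minimal sketch, assuming HMAC-SHA256 as the signature scheme and a
hypothetical ``sign`` helper (neither is specified in the original code):

.. code-block:: python

    import hashlib
    import hmac

    def sign(key: bytes, message: bytes) -> bytes:
        # Assumed scheme for this sketch: HMAC-SHA256 over the raw message
        return hmac.new(key, message, hashlib.sha256).digest()

    def verify_signature(key: bytes, message: bytes, provided_sig: bytes) -> bool:
        computed_sig = sign(key, message)
        return hmac.compare_digest(computed_sig, provided_sig)

    key = b"demo-key-not-for-production"
    msg = b"amount=100&to=alice"
    tag = sign(key, msg)

    print(verify_signature(key, msg, tag))
    print(verify_signature(key, b"amount=999&to=mallory", tag))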

Weak Random Number Generation
-----------------------------

Using the ``random`` module for security purposes is dangerous. It uses a
deterministic PRNG (Mersenne Twister) whose future outputs can be predicted
once an attacker has observed enough past outputs.

**Vulnerable Code:**

.. code-block:: python

    # INSECURE: Using random for security
    import random
    import string

    def generate_token():
        # VULNERABLE: Predictable after ~624 outputs observed
        chars = string.ascii_letters + string.digits
        return ''.join(random.choice(chars) for _ in range(32))

    def generate_session_id():
        # VULNERABLE: Can be predicted
        return random.randint(0, 2**64)

**Why It's Vulnerable:**

Mersenne Twister keeps 624 32-bit words of internal state. After observing
624 consecutive 32-bit outputs, an attacker can reconstruct that state and
predict all future outputs.

.. code-block:: python

    # Mersenne Twister state recovery (conceptual)
    # After collecting 624 consecutive 32-bit outputs,
    # attacker can "untemper" them to recover internal state
    # Then predict all future random() calls
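
The recovery described above can be sketched against CPython's own
``random.Random``. This is a minimal illustration that assumes the attacker
sees 624 consecutive, untruncated ``getrandbits(32)`` outputs; the helper
names ``undo_right``, ``undo_left`` and ``untemper`` are hypothetical.
Truncated outputs, such as those from ``random.choice``, need more samples
and more work.

.. code-block:: python

    import random

    def undo_right(y, shift):
        """Invert y = x ^ (x >> shift) for a 32-bit x."""
        x = y
        for _ in range(32 // shift + 1):
            x = y ^ (x >> shift)
        return x

    def undo_left(y, shift, mask):
        """Invert y = x ^ ((x << shift) & mask) for a 32-bit x."""
        x = y
        for _ in range(32 // shift + 1):
            x = y ^ ((x << shift) & mask)
        return x

    def untemper(y):
        # Reverse MT19937's four tempering steps, last one first
        y = undo_right(y, 18)
        y = undo_left(y, 15, 0xEFC60000)
        y = undo_left(y, 7, 0x9D2C5680)
        y = undo_right(y, 11)
        return y

    victim = random.Random()  # seeded from OS entropy, unknown to the attacker
    observed = [victim.getrandbits(32) for _ in range(624)]

    # Rebuild the state; index 624 means "regenerate the state on the next call"
    state = tuple(untemper(y) for y in observed) + (624,)
    clone = random.Random()
    clone.setstate((3, state, None))

    predicted = [clone.getrandbits(32) for _ in range(5)]
    actual = [victim.getrandbits(32) for _ in range(5)]
    print(predicted == actual)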

**Secure Alternative:** Use the ``secrets`` module:

.. code-block:: python

    import secrets

    def generate_token():
        return secrets.token_urlsafe(32)

    def generate_session_id():
        return secrets.token_hex(16)

Hardcoded Secrets and Keys
--------------------------

Embedding secrets in source code exposes them through version control,
logs, error messages, and decompilation.

**Vulnerable Code:**

.. code-block:: python

    # INSECURE: Hardcoded secrets
    import requests

    API_KEY = "sk_live_abc123xyz789"  # Exposed in git history!
    DB_PASSWORD = "super_secret_password"
    ENCRYPTION_KEY = b"0123456789abcdef"

    def connect_to_api(url):
        return requests.get(url, headers={"Authorization": API_KEY})

**Why It's Vulnerable:**

- Secrets in git history persist even after deletion
- Error messages may include variable values
- Compiled Python (.pyc) can be decompiled
- Logs may capture the values

**Secure Alternative:** Use environment variables or secret managers:

.. code-block:: python

    import os

    API_KEY = os.environ.get("API_KEY")
    if not API_KEY:
        raise RuntimeError("API_KEY environment variable required")

    # Or use a secrets manager (AWS Secrets Manager via boto3 shown here)
    import boto3

    client = boto3.client("secretsmanager")
    app_secrets = client.get_secret_value(SecretId="my-app/production")["SecretString"]

SQL Injection
-------------

Building SQL queries with string concatenation allows attackers to inject
malicious SQL commands.

**Vulnerable Code:**

.. code-block:: python

    # INSECURE: String concatenation in SQL
    def get_user(username):
        query = f"SELECT * FROM users WHERE username = '{username}'"
        cursor.execute(query)  # SQL injection!
        return cursor.fetchone()

    # Attacker input: "admin' OR '1'='1"
    # Results in: SELECT * FROM users WHERE username = 'admin' OR '1'='1'
    # The WHERE clause now matches every row, not just admin's

    # Worse: "admin'; DROP TABLE users; --"
    # Deletes the entire table if the driver allows multiple statements per call

**Secure Alternative:** Use parameterized queries:

.. code-block:: python

    def get_user(username):
        # "?" is the sqlite3 placeholder; some drivers (e.g. psycopg2) use "%s"
        query = "SELECT * FROM users WHERE username = ?"
        cursor.execute(query, (username,))  # Safe - parameterized
        return cursor.fetchone()

    # Or with SQLAlchemy
    from sqlalchemy import select
    stmt = select(User).where(User.username == username)
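
The difference between the two approaches can be observed with an in-memory
SQLite database. A minimal sketch, assuming a hypothetical ``users`` table
with three rows:

.. code-block:: python

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT, role TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("admin", "superuser"), ("alice", "user"), ("bob", "user")],
    )

    payload = "nobody' OR '1'='1"

    # Vulnerable: the payload becomes part of the SQL text
    unsafe = conn.execute(
        f"SELECT username FROM users WHERE username = '{payload}'"
    ).fetchall()

    # Safe: the payload is bound as a value and compared literally
    safe = conn.execute(
        "SELECT username FROM users WHERE username = ?", (payload,)
    ).fetchall()

    print(len(unsafe), len(safe))  # 3 0

The interpolated query returns every row even though no user is named
``nobody``, while the parameterized query returns nothing.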

Command Injection
-----------------

Passing user input to shell commands allows arbitrary command execution.

**Vulnerable Code:**

.. code-block:: python

    # INSECURE: Shell injection
    import os
    import subprocess

    def ping_host(hostname):
        os.system(f"ping -c 1 {hostname}")  # Command injection!

    # Attacker input: "google.com; rm -rf /"
    # Executes: ping -c 1 google.com; rm -rf /

    def get_file_info(filename):
        # Also vulnerable with subprocess and shell=True
        result = subprocess.run(
            f"file {filename}",
            shell=True,  # DANGEROUS
            capture_output=True
        )

**Secure Alternative:** Avoid the shell and pass an argument list:

.. code-block:: python

    import re
    import subprocess

    HOSTNAME_RE = re.compile(r"[A-Za-z0-9]([A-Za-z0-9.-]{0,252}[A-Za-z0-9])?")

    def ping_host(hostname):
        # Validate input first; rejecting a leading "-" also prevents
        # the value from being parsed as a ping option
        if not HOSTNAME_RE.fullmatch(hostname):
            raise ValueError("Invalid hostname")
        # Use list of arguments, not shell string
        subprocess.run(["ping", "-c", "1", hostname], check=True)

    def get_file_info(filename):
        # shell=False (default) prevents injection; "--" stops option parsing
        result = subprocess.run(
            ["file", "--", filename],
            capture_output=True,
            check=True
        )
        return result.stdout
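
To see why an argument list is safe, it helps to look at what the child
process actually receives. In this sketch a child Python interpreter stands
in for ``ping`` (an assumption made so the example runs on any platform) and
simply prints its arguments:

.. code-block:: python

    import shlex
    import subprocess
    import sys

    payload = "example.com; echo INJECTED"

    # Stand-in for the real command: prints the arguments it was given
    child = [sys.executable, "-c", "import sys; print(sys.argv[1:])"]

    result = subprocess.run(
        child + [payload], capture_output=True, text=True, check=True
    )
    received = result.stdout.strip()
    print(received)

    # If a shell truly cannot be avoided, quote every untrusted value
    quoted = shlex.quote(payload)
    print(quoted)

The payload arrives as one literal argument, so the ``;`` is never
interpreted as a command separator.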

Insecure Deserialization (Pickle)
---------------------------------

Python's ``pickle`` module can execute arbitrary code during deserialization.
Never unpickle data from untrusted sources.

**Vulnerable Code:**

.. code-block:: python

    # INSECURE: Unpickling untrusted data
    import pickle

    def load_user_data(data):
        return pickle.loads(data)  # Remote code execution!

    # Attacker can craft malicious pickle:
    import os
    class Exploit:
        def __reduce__(self):
            return (os.system, ("rm -rf /",))

    malicious = pickle.dumps(Exploit())
    # When unpickled, executes: os.system("rm -rf /")

**Secure Alternative:** Use safe formats like JSON:

.. code-block:: python

    import json

    def load_user_data(data):
        return json.loads(data)  # Safe - no code execution

    # If you must use pickle, restrict classes
    import pickle
    import io

    class RestrictedUnpickler(pickle.Unpickler):
        ALLOWED_CLASSES = {('mymodule', 'SafeClass')}

        def find_class(self, module, name):
            if (module, name) not in self.ALLOWED_CLASSES:
                raise pickle.UnpicklingError(f"Forbidden: {module}.{name}")
            return super().find_class(module, name)
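
A restricted unpickler only helps if it is actually used in place of
``pickle.loads``. The sketch below adds a hypothetical ``restricted_loads``
wrapper and, for illustration, assumes an empty allow-list. ``Probe`` is a
harmless stand-in for an exploit: it asks the unpickler to call
``os.getcwd`` rather than anything destructive.

.. code-block:: python

    import io
    import os
    import pickle

    class RestrictedUnpickler(pickle.Unpickler):
        # Assumption for this sketch: no importable globals are allowed
        ALLOWED_CLASSES = set()

        def find_class(self, module, name):
            if (module, name) not in self.ALLOWED_CLASSES:
                raise pickle.UnpicklingError(f"Forbidden: {module}.{name}")
            return super().find_class(module, name)

    def restricted_loads(data):
        """Like pickle.loads, but enforces the allow-list."""
        return RestrictedUnpickler(io.BytesIO(data)).load()

    class Probe:
        def __reduce__(self):
            return (os.getcwd, ())

    # Plain containers and scalars never go through find_class
    plain = restricted_loads(pickle.dumps({"name": "alice", "scores": [1, 2, 3]}))

    try:
        restricted_loads(pickle.dumps(Probe()))
        blocked = False
    except pickle.UnpicklingError as exc:
        blocked = True
        print(exc)

Even with an allow-list, this approach limits only which globals can be
loaded; JSON remains the safer choice for untrusted input.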

Summary: Legacy vs Modern
-------------------------

+------------------------+---------------------------+---------------------------+
| Vulnerability          | Legacy (Insecure)         | Modern (Secure)           |
+========================+===========================+===========================+
| Symmetric Encryption   | AES-CBC without auth      | AES-GCM                   |
+------------------------+---------------------------+---------------------------+
| RSA Padding            | PKCS#1 v1.5               | OAEP                      |
+------------------------+---------------------------+---------------------------+
| Secret Comparison      | ``==``                    | ``hmac.compare_digest``   |
+------------------------+---------------------------+---------------------------+
| Random Numbers         | ``random``                | ``secrets``               |
+------------------------+---------------------------+---------------------------+
| Password Hashing       | MD5, SHA1                 | Argon2, bcrypt            |
+------------------------+---------------------------+---------------------------+
| Crypto Library         | PyCrypto                  | ``cryptography``          |
+------------------------+---------------------------+---------------------------+
| SSL/TLS                | ``ssl.wrap_socket``       | ``SSLContext``            |
+------------------------+---------------------------+---------------------------+
