AuditCoreAuditCore
highCWE-22OWASP A01:2021 — Broken Access ControlDirectory TraversalPath Injection

How to fix Path Traversal

Path traversal lets attackers read files outside the directory they're supposed to access — `/etc/passwd`, your `.env`, your database backup, your private SSH keys. The classic `../../../etc/passwd` attack still works on apps written in 2026 because the fix isn't 'strip ../' — it's path canonicalization with allow-listing. Here's the playbook.

What is Path Traversal?

Path traversal happens when an app constructs a file path from user input without validating that the result stays within an intended directory. The pattern is universal: `read_file('/uploads/' + user_filename)`. If the user supplies `../../../etc/passwd` (or URL-encoded `..%2F..%2F..%2Fetc%2Fpasswd`, or the Windows variant `..\..\..\Windows\System32\drivers\etc\hosts`), the resulting path escapes `/uploads/` and reads anywhere on the filesystem the app process has access to.

Naive fixes don't work. Stripping `../` is bypassed by `....//` (which becomes `../` after one strip). Stripping all `..` is bypassed by URL encoding, double encoding, Unicode normalization tricks. The only reliable fix is to canonicalize the resolved path and verify it starts with your intended base directory — using your language's path-resolution standard library, not regex.

Path traversal is OWASP A01 (Broken Access Control) and CWE-22. It's been the root cause of dozens of major breaches: the 2021 Microsoft Exchange ProxyShell chain (CVE-2021-34473) used path traversal to bypass auth. The 2023 GoAnywhere MFT zero-day (CVE-2023-0669) was traversal-based. CitrixBleed in 2023 had a traversal angle. Modern web apps fall to it primarily through file-upload endpoints, attachment-download links, log-viewer admin tools, and any file-serving code that takes a parameter.

What an attacker can do

The concrete impact of leaving Path Traversal unpatched.

Read sensitive files outside the web root

Read `/etc/passwd`, your `.env` file, application source, configuration with API keys, certificate private keys, SSH authorized_keys.

Source code disclosure → exploit chain

Read your application source to find other vulnerabilities (SQLi, hardcoded credentials, business logic flaws). Path traversal often appears in writeups as 'step 1' of a multi-stage exploit.

Write traversal → RCE

Some path traversal vulns allow writing too — write a `.php`/`.jsp` shell to the web root, then execute it via HTTP. Reads escalate to writes when upload code uses unvalidated filenames.

Container escape (less common)

In containerized apps, traversal might read files mounted from the host (Docker socket, K8s service account tokens at `/var/run/secrets/kubernetes.io/serviceaccount/token`). Token gives cluster API access.

How do I know if I'm vulnerable?

Manual: search your codebase for `open()`, `readFile()`, `fs.createReadStream()`, `sendFile()`, `Path.Combine()`, plus your language's path-joining functions, and check whether each takes a user-influenced input. Any file-serving endpoint that accepts a query parameter or URL segment for the filename is suspect.

Automated: AuditCore's Path Traversal Scanner sends 50+ traversal payloads (URL-encoded, double-encoded, null-byte truncation, Windows backslash, Unicode normalization) against parameters discovered by the crawler. Confirms with file content patterns (e.g. `root:x:0:0:` indicates `/etc/passwd` was returned). Static analysis via Semgrep catches the pattern at the source code level when you provide a repo URL.

How to fix Path Traversal

5 ordered steps. Apply them in order — each builds on the previous.

  1. 1

    Use a strict allow-list of filenames if possible

    If the user is choosing from a known set of files (a documentation viewer, a localized template loader), don't accept the raw filename — accept an opaque ID and look up the path server-side.

    This is the only path-traversal-proof design: if the user can't supply a path, they can't traverse one. Use it whenever feasible.

    javascriptStep 1
    // ❌ User provides filename directly
    app.get("/docs/:lang", (req, res) => {
      res.sendFile(`/var/docs/${req.params.lang}.md`);
    });
    
    // ✅ User picks from an allow-list
    const ALLOWED_LANGS = new Set(["en", "pl", "de", "fr", "es"]);
    app.get("/docs/:lang", (req, res) => {
      if (!ALLOWED_LANGS.has(req.params.lang)) {
        return res.status(404).send("Not found");
      }
      res.sendFile(`/var/docs/${req.params.lang}.md`);
    });
  2. 2

    Canonicalize the path and verify the prefix

    When you must accept user-controlled paths (legitimate uploads, generated reports), resolve to absolute path and check it starts with your base directory.

    Use `path.resolve()` (Node), `os.path.realpath()` (Python), `filepath.Clean` + `filepath.IsAbs` (Go), `realpath()` (PHP). All of these handle `..`, multiple slashes, and absolute-path injection. Then string-prefix-check against your base.

    javascriptStep 2
    import path from "node:path";
    import fs from "node:fs/promises";
    
    const UPLOAD_DIR = path.resolve("/var/uploads");
    
    async function readUpload(userFilename) {
      // Resolve to absolute path
      const requested = path.resolve(UPLOAD_DIR, userFilename);
    
      // Verify it stays inside UPLOAD_DIR (with separator to avoid prefix attacks
      // where /var/uploads-evil/ prefix-matches /var/uploads)
      if (!requested.startsWith(UPLOAD_DIR + path.sep)) {
        throw new Error("Path traversal attempt");
      }
    
      return await fs.readFile(requested);
    }
    
    // Python equivalent:
    // import os
    // resolved = os.path.realpath(os.path.join(UPLOAD_DIR, user_filename))
    // if not resolved.startswith(UPLOAD_DIR + os.sep):
    //     raise ValueError("Path traversal")
  3. 3

    Reject suspicious filename characters before path operations

    Even with canonicalization, reject filenames with `..`, leading `/`, leading `\`, null bytes, or control characters. Belt-and-suspenders defense.

    Allow-list characters rather than block-list: `[A-Za-z0-9._-]+` covers most legitimate filenames. If you need Unicode support (international users), be careful — `\u202E` (right-to-left override) and other Unicode tricks can disguise extensions.

    pythonStep 3
    import re
    import os
    
    SAFE_FILENAME = re.compile(r'^[A-Za-z0-9._-]+$')
    
    def safe_open(base_dir: str, filename: str):
        # Reject obvious bad input early
        if not SAFE_FILENAME.match(filename):
            raise ValueError(f"Invalid filename: {filename}")
        if filename.startswith(".") or filename.endswith("."):
            raise ValueError(f"Invalid filename: {filename}")
    
        # Canonicalize and verify
        base = os.path.realpath(base_dir)
        resolved = os.path.realpath(os.path.join(base, filename))
        if not resolved.startswith(base + os.sep):
            raise ValueError(f"Path traversal: {filename}")
    
        return open(resolved, "rb")
  4. 4

    Run the file-serving process with minimal filesystem access

    Defense-in-depth: even if traversal happens, the process can only read what the OS lets it. Use Linux capabilities, AppArmor/SELinux profiles, or run inside a container with read-only root filesystem.

    Don't run web servers as root. Use a dedicated user (`www-data`, `nobody`) with read access only to the directories it needs. In containers, mount only required paths. For sensitive ops (user uploads), consider chroot or systemd's `ProtectSystem=strict` + `ReadWritePaths=`.

    yamlStep 4
    # Docker — read-only root + writable upload volume
    services:
      api:
        image: my-app
        read_only: true
        tmpfs:
          - /tmp
        volumes:
          - uploads:/var/uploads:rw      # writable for legit uploads
          - /etc/myapp:/etc/myapp:ro     # config readable
        user: "1000:1000"                # non-root
        cap_drop: [ALL]
        security_opt:
          - no-new-privileges:true
    
    # systemd unit equivalent:
    # ProtectSystem=strict
    # ProtectHome=true
    # ReadWritePaths=/var/uploads
  5. 5

    Don't reflect the filename in error messages

    When path validation fails, return a generic error. Reflecting the requested path tells attackers exactly what filter you're using, helping them craft bypasses.

    `Error: cannot read /etc/passwd` confirms the traversal got through canonicalization but failed the prefix check. `Error: invalid file` reveals nothing. Log the details server-side; show users only the generic message.

    goStep 5
    // ❌ Leaks the resolved path
    http.Error(w, fmt.Sprintf("Cannot read %s", resolved), http.StatusBadRequest)
    
    // ✅ Generic to user, full detail in logs
    log.Printf("path traversal attempt: user=%s requested=%q resolved=%q",
        userID, userFilename, resolved)
    http.Error(w, "Invalid filename", http.StatusBadRequest)

How to verify the fix

Run AuditCore — the Path Traversal Scanner sends payloads including `../../etc/passwd`, `..%2F..%2Fetc%2Fpasswd`, `....//....//etc/passwd`, `/etc/passwd%00`, and the Windows variant. Findings include the exact payload that worked and the file content prefix as evidence.

Manual smoke: in your browser dev tools, modify URLs that take filename parameters (e.g., `/api/files/document.pdf` → `/api/files/../../../etc/passwd`). On Linux, success returns root:x:0:0:... in the response. Check your logs for the failed-validation messages — frequent attempts indicate active probing.

FAQ

Frequently asked questions

Why doesn't filtering `..` work as a fix?+

Encoding tricks bypass single-pass filters. `....//` becomes `../` after the filter strips `../` once. URL-encoded `..%2F` arrives at your code already decoded. Double-encoded `%252e%252e%252f` decodes to `%2e%2e%2f`, then to `../`. Unicode normalization can produce `..` from sequences like `..\u002F`. Reliable fix is canonicalization + prefix check, not blacklisting.

Is path traversal the same as Local File Inclusion (LFI)?+

Closely related. LFI specifically refers to path traversal where the attacker's input is used in `include`/`require`/`fopen`-style functions that interpret the file (PHP `include`, Python `exec(open().read())`). LFI usually escalates to RCE via log poisoning or by including a file the attacker can write to. The fixes are the same — canonicalization, allow-listing, plus 'don't include user-supplied files in code-execution sinks'.

What about path traversal in zip/tar uploads (zip-slip)?+

Zip-slip is the same vulnerability applied to archive extraction. The attacker creates a zip where a file's name is `../../../etc/cron.d/backdoor`. When extracted naively, it writes outside the intended dir. Fix: validate each entry's resolved path against your extraction dir BEFORE writing. Most modern archive libraries have safe-extraction options now (`tarfile.data_filter` in Python 3.12+).

Can I rely on Linux file permissions to prevent traversal?+

Permissions help (run as non-root, can't read /etc/shadow), but they don't fully fix the issue — the web process needs read access to its own config files (.env, certificates), and traversal can read those. Permissions are defense-in-depth, not the primary fix.

Does using a CDN protect against path traversal?+

No. CDNs cache static files; they pass through dynamic requests to origin. Path traversal lives in your origin code. CloudFlare/Cloudfront WAF rules will catch obvious patterns (`../`) but encoding bypasses them. Don't rely on the WAF as a fix; fix the code.

Don't just guess — scan and verify

AuditCore Free Trial scans your homepage for Path Traversal and 50+ other vulnerability classes. No credit card. Results in 60 seconds.