AuditCore
Enterprise · Phase 5 · Static / Code & Mobile

Semgrep

p/security-audit rules. Part of AuditCore's automated security audit pipeline — runs on every scan in the Enterprise tier and above, with findings normalized into a single severity-rated table.

What is Semgrep?

Semgrep is the SAST (static application security testing) workhorse for AuditCore's static analysis phase. It pattern-matches your source code against a library of security rules — finding things like hardcoded secrets, dangerous eval() usage, SQL queries built via string concatenation, and unsafe deserialization patterns. Unlike traditional SAST tools (Veracode, Checkmarx) that take hours and produce thousands of false positives, Semgrep runs in seconds and uses precise AST-based matching for ~5× lower noise.

AuditCore runs Semgrep with the `p/security-audit` ruleset, Semgrep's curated security pack covering the OWASP Top 10 plus language-specific anti-patterns. Supported languages: JavaScript/TypeScript, Python, Go, Ruby, Java, PHP, C/C++, Rust, Kotlin, Scala. We also add the `p/secrets` pack for credential detection. It overlaps with gitleaks, but the two are complementary: gitleaks is built around regex scanning of git history, while Semgrep matches against the parsed source of the current checkout and can catch inline string assignments that gitleaks' patterns miss.
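As a rough illustration of the two-pack setup and of how raw results might be flattened into a severity-rated table, here is a minimal sketch. It is not the code in `scanners/semgrep_scanner.py`: the function names, the row fields, and the `SEVERITY_MAP` labels are assumptions made for this example. Only the `semgrep scan` flags and the JSON field names (`results`, `check_id`, `path`, `start.line`, `extra.severity`, `extra.message`) come from Semgrep itself.

```python
import json

# Hypothetical mapping from Semgrep rule severities to report labels.
# AuditCore's real mapping is not documented here.
SEVERITY_MAP = {"ERROR": "High", "WARNING": "Medium", "INFO": "Low"}


def build_semgrep_command(target_dir):
    """Return the argv for a scan with both registry packs and JSON output."""
    return [
        "semgrep", "scan",
        "--config", "p/security-audit",
        "--config", "p/secrets",
        "--json",
        "--quiet",
        target_dir,
    ]


def normalize_findings(semgrep_json):
    """Flatten Semgrep's JSON report into one row per finding."""
    report = json.loads(semgrep_json)
    rows = []
    for result in report.get("results", []):
        extra = result.get("extra", {})
        rows.append({
            "rule": result.get("check_id", ""),
            "file": result.get("path", ""),
            "line": result.get("start", {}).get("line"),
            "severity": SEVERITY_MAP.get(extra.get("severity", ""), "Unrated"),
            "message": extra.get("message", ""),
        })
    return rows
```

The command list would be handed to `subprocess.run` and its stdout passed to `normalize_findings`; severities outside the map fall back to "Unrated" rather than being dropped.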

Static analysis is included in the Enterprise tier and above when you provide a repository URL during scan setup. For URL-only scans, we skip Semgrep because there is no source code to analyze. The trade-off: SAST findings have less context than dynamic findings (we know the code is dangerous, not whether it's reachable in production), but they catch issues *before* they ship, and a bug found earlier in the SDLC is cheaper to fix.

Common gotchas: the OSS edition of Semgrep won't trace data flow across files. Its taint mode works within a single file, and cross-file analysis is part of Semgrep's paid engine. It also misses logic bugs (broken auth, race conditions), which need our dynamic scanners. Use Semgrep findings as a hit list, not as the final word; some rules have caveats that require human judgment.

Where it runs in the AuditCore pipeline

Phase 5/5 · Static / Code & Mobile
Source-code, dependency and mobile-binary analysis — Semgrep rules, gitleaks secrets, Trivy CVEs, APK / IPA manifest, permissions, strings, network and native-binary hardening.

Source: scanners/semgrep_scanner.py

Sample findings

SQL query built via f-string in Python

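High. `cursor.execute(f"SELECT * FROM users WHERE id={user_id}")` is vulnerable even if `user_id` looks numeric: a value such as `1 OR 1=1` changes the query's logic, and on drivers that allow stacked statements `1; DROP TABLE users` runs a second statement. Mitigation: use the parameterized form, `cursor.execute("SELECT * FROM users WHERE id=%s", (user_id,))`. The placeholder is driver-specific (`%s` for psycopg2 and the MySQL drivers, `?` for sqlite3).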
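The difference between the two forms can be seen with a small in-memory database. This is a sketch using the standard-library `sqlite3` module, so the placeholder is `?` rather than `%s`; the table contents and function names are made up for the demonstration. Because `sqlite3` refuses to run more than one statement per `execute()` call, the payload here is `1 OR 1=1` rather than a stacked `DROP TABLE`.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory users table with two rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (id, name) VALUES (?, ?)",
        [(1, "alice"), (2, "bob")],
    )
    return conn


def lookup_unsafe(conn, user_id):
    # The value is spliced into the SQL text, so it can change the query.
    return conn.execute(f"SELECT name FROM users WHERE id={user_id}").fetchall()


def lookup_safe(conn, user_id):
    # The value is bound as data and never parsed as SQL.
    return conn.execute("SELECT name FROM users WHERE id=?", (user_id,)).fetchall()
```

With the payload `"1 OR 1=1"`, `lookup_unsafe` returns every row in the table, while `lookup_safe` compares the `id` column against that literal string and returns nothing.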

Hardcoded AWS access key in committed file

Critical. `AWS_ACCESS_KEY_ID = 'AKIA…'` found in `config/dev.py`. Even if it's a 'dev' key, it stays in git history after the file is changed, and dev keys often carry broader IAM permissions than anything publicly exposed should have. Mitigation: rotate the key immediately (scrubbing history does not un-leak it), move it to an environment variable or AWS Secrets Manager, then scrub it from git history with `git filter-repo`.
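One way the environment-variable option could look is sketched below. The function name and the returned dictionary shape are assumptions for this example; `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are the standard variable names the AWS SDKs and CLI read.

```python
import os


def load_aws_credentials(environ=os.environ):
    """Read AWS credentials from the environment instead of source code.

    `environ` is injectable so the function can be exercised without
    touching the real process environment.
    """
    try:
        return {
            "aws_access_key_id": environ["AWS_ACCESS_KEY_ID"],
            "aws_secret_access_key": environ["AWS_SECRET_ACCESS_KEY"],
        }
    except KeyError as exc:
        # Fail loudly at startup rather than falling back to a default key.
        raise RuntimeError(f"missing environment variable: {exc.args[0]}") from None
```

Failing when a variable is absent is deliberate: a silent fallback to a baked-in key would reintroduce the finding.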

subprocess call with `shell=True` and user input

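Critical. `subprocess.run(f'convert {filename} output.jpg', shell=True)` allows command injection if `filename` is `'a.jpg; rm -rf /'`. Mitigation: pass arguments as a list (`subprocess.run(['convert', filename, 'output.jpg'])`), which runs the program directly without invoking a shell. Still validate `filename`, since a value starting with `-` would be read by `convert` as an option.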
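A harmless version of the same injection can be reproduced with `echo` in place of `convert`. This sketch assumes a POSIX shell for the unsafe case; the function names and the `INJECTED` marker are illustrative only.

```python
import subprocess
import sys

# Tiny child program that prints its first argument exactly as received.
PRINT_ARG = "import sys; print(sys.argv[1])"


def run_unsafe(filename):
    # The shell parses the whole string, so ';' starts a second command.
    result = subprocess.run(
        f"echo {filename}", shell=True, capture_output=True, text=True
    )
    return result.stdout


def run_safe(filename):
    # Each list element is passed to the child as one argument; no shell runs.
    result = subprocess.run(
        [sys.executable, "-c", PRINT_ARG, filename],
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Given `"a.jpg; echo INJECTED"`, `run_unsafe` executes two commands and prints two lines, whereas `run_safe` hands the entire string to the child as a single argument.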

JWT decoded with signature verification disabled

Critical. `jwt.decode(token, options={'verify_signature': False})` skips signature checking entirely, so anyone can forge a JWT claiming any user identity (the same outcome as accepting `alg: none`). Mitigation: `jwt.decode(token, secret, algorithms=['HS256'])` with an explicit algorithm allow-list. Never set `'verify_signature': False` in production.
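To show what the signature check and the algorithm allow-list actually protect against, here is a standard-library-only sketch of HS256 verification. It is an illustration of the mechanism, not a replacement for PyJWT: it does not validate `exp`, `aud`, or other claims, and the helper names are made up for this example.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text):
    # Restore the padding that JWTs strip.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(payload, secret):
    """Build a compact HS256 token for a JSON-serializable payload."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    ]
    signing_input = ".".join(parts)
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(signature)


def verify_hs256(token, secret):
    """Return the payload, or raise ValueError if the token is not trusted."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    # Allow-list: reject 'none' and every algorithm other than HS256.
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        raise ValueError("algorithm not allowed")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison of the expected and presented signatures.
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(payload_b64))
```

A token with `"alg": "none"` and an empty signature is rejected at the allow-list check, and a token whose payload was edited after signing fails the HMAC comparison. Skipping verification removes both checks at once.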

Available in Enterprise tier and above

Full pentest suite. Adds BOLA / BFLA, sqlmap, SSRF, deep GraphQL, race conditions, AI agent / prompt injection, business logic, mobile binary analysis, code review. Per-domain license — pay once, rescan unlimited.

FAQ

Does Semgrep replace my existing SAST tool (Veracode, SonarQube)?

It can replace SonarQube's security rules: Semgrep is faster (~30s vs ~30min on a medium repo), produces fewer false positives, and is free and open source. Veracode and Checkmarx have deeper enterprise features (compliance reporting, ticketing integrations, taint analysis across files) that Semgrep OSS doesn't match. If you need those, Semgrep's paid Pro/Enterprise offering exists; AuditCore uses the OSS version.

How long does Semgrep take on a typical repo?

10-60 seconds for repos up to ~100K lines. Larger monorepos can take 2-5 minutes. Semgrep parallelizes across CPU cores (its `--jobs` option takes a number and defaults to the detected core count), so scans use all available CPU. AuditCore caps Semgrep runtime at 5 minutes; past that, we report whatever it found and move on (rare for typical web apps).
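A runtime cap of this kind can be sketched with `subprocess.run` and its `timeout` argument. This is an assumption about how such a cap might be implemented, not AuditCore's actual code; `run_with_cap` is a hypothetical helper, and a real scanner would also need to handle output that was cut off mid-document.

```python
import subprocess


def run_with_cap(cmd, cap_seconds):
    """Run cmd, returning (stdout, timed_out).

    If the process exceeds cap_seconds it is killed and whatever it had
    written to stdout so far is returned.
    """
    try:
        done = subprocess.run(
            cmd, capture_output=True, text=True, timeout=cap_seconds
        )
        return done.stdout, False
    except subprocess.TimeoutExpired as exc:
        partial = exc.stdout or ""
        # Depending on the Python version, partial output may arrive as bytes.
        if isinstance(partial, bytes):
            partial = partial.decode("utf-8", errors="replace")
        return partial, True
```

For a 5-minute cap the call would be `run_with_cap(command, 300)`. Output captured from a killed process may be truncated, so it should be parsed defensively.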

Can I add custom Semgrep rules for AuditCore to run?

Not in the current self-serve flow. Custom rules support is on the Enterprise roadmap — useful for codifying internal coding standards (e.g. 'never call our deprecated `legacyAuth()` function'). Contact us if needed.

Why does Semgrep flag patterns that I know are safe?

Pattern-matching can't always tell whether user input actually reaches a sink. A finding like 'SQL via string concat' might be safe if the string is hardcoded or comes from a trusted enum. To suppress one, add a `nosemgrep` comment on the flagged line or the line directly above it (optionally `nosemgrep: <rule-id>` to silence a single rule); Semgrep respects these inline suppressions. For broader exclusions, add a `.semgrepignore` file. AuditCore reports raw findings; tuning happens in your repo.
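As an illustration of a "known safe" pattern with an inline suppression, the sketch below interpolates a column name only after checking it against a fixed allow-list. The table, function name, and allow-list are invented for this example, and whether a given rule fires on this exact line depends on the ruleset.

```python
import sqlite3

# Column names cannot be bound as query parameters, so they are
# restricted to a fixed allow-list before being interpolated.
ALLOWED_SORT_COLUMNS = {"id", "name"}


def list_users(conn, sort_by):
    if sort_by not in ALLOWED_SORT_COLUMNS:
        raise ValueError(f"unsupported sort column: {sort_by!r}")
    # Safe by construction: sort_by is one of the literals above.
    return conn.execute(f"SELECT id, name FROM users ORDER BY {sort_by}").fetchall()  # nosemgrep
```

A bare `# nosemgrep` silences every rule on that line; naming the rule ID after a colon keeps other rules active and is usually the better habit.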

Does Semgrep work on minified JavaScript or compiled binaries?

No. Semgrep needs human-readable source code with stable structure. For minified bundles, run Semgrep against the original source before bundling. For compiled binaries, AuditCore uses Trivy (dependency CVE) and our APK Binary Analyzer (mobile binaries) instead.