The product

A SAST tool whose results you can defend.

Most security scanners ship results. Scanipy ships evidence: every finding carries a witness, a fingerprint, and a verifiable provenance chain.

How it's built

Three properties set Scanipy apart.

Reproducibility

For a fixed spec set and analysis environment, the output of the deterministic-core partition is a deterministic function of the source code. Same code in, same SARIF out, byte for byte.
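One ingredient of byte-for-byte output is a canonical serialisation step. The sketch below is illustrative only and is not Scanipy's serialiser: the function names and finding fields (`rule_id`, `file`, `line`) are assumptions, and plain JSON stands in for SARIF. It shows how sorting findings and fixing the encoding removes ordering and whitespace noise from the output.

```python
import hashlib
import json


def canonical_report(findings: list[dict]) -> bytes:
    """Serialise findings so equal finding sets always yield equal bytes."""
    # Sort by a stable key so discovery order cannot leak into the output.
    ordered = sorted(findings, key=lambda f: (f["rule_id"], f["file"], f["line"]))
    # sort_keys and fixed separators remove dict-order and whitespace variation.
    text = json.dumps(ordered, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return text.encode("utf-8")


def report_digest(findings: list[dict]) -> str:
    """Hex SHA-256 of the canonical report bytes."""
    return hashlib.sha256(canonical_report(findings)).hexdigest()


# Hypothetical findings, emitted in two different orders with different key order.
run_a = [
    {"rule_id": "CWE-89", "file": "db/raw.py", "line": 92},
    {"rule_id": "CWE-22", "file": "api/files.py", "line": 40},
]
run_b = [dict(reversed(list(f.items()))) for f in reversed(run_a)]

assert canonical_report(run_a) == canonical_report(run_b)
```

Canonical output only helps if the analysis itself is deterministic. The serialiser removes incidental variation; it cannot repair a solver that visits nodes in a nondeterministic order and reaches different results.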

Incrementality

Analysis cost scales with the semantic delta of a commit, not the size of the repository. A typed rename re-analyses kilobytes of code, not the whole repository.

Provenance

Every finding ships with a signed chain: source commit, snapshot digest, spec version, env digest, witness, rule id, signature. Auditors verify without re-running.
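To make "verify without re-running" concrete, here is a minimal sketch of signing and checking a provenance record. It is an assumption-laden illustration, not Scanipy's format: the field names mirror the chain listed above, the digest values are placeholders, and an HMAC with a shared key stands in for whatever signature scheme the product actually uses (an auditor-facing deployment would more plausibly use an asymmetric signature).

```python
import hashlib
import hmac
import json

# Hypothetical field names mirroring the chain described in the text.
CHAIN_FIELDS = ("source_commit", "snapshot_digest", "spec_version",
                "env_digest", "witness_digest", "rule_id")


def _payload(record: dict) -> bytes:
    # Canonical encoding of exactly the signed fields.
    body = {k: record[k] for k in CHAIN_FIELDS}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_record(record: dict, key: bytes) -> dict:
    """Return a copy of the record with an HMAC-SHA256 signature attached."""
    sig = hmac.new(key, _payload(record), hashlib.sha256).hexdigest()
    return {**record, "signature": sig}


def verify_record(signed: dict, key: bytes) -> bool:
    """Recompute the signature from the fields alone; no analysis is re-run."""
    expected = hmac.new(key, _payload(signed), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed.get("signature", ""))


key = b"demo-key-not-for-production"
record = {
    "source_commit": "7f3a2c9",
    "snapshot_digest": "sha256:placeholder-snapshot",
    "spec_version": "placeholder-spec",
    "env_digest": "sha256:placeholder-env",
    "witness_digest": "sha256:placeholder-witness",
    "rule_id": "CWE-89",
}
signed = sign_record(record, key)
tampered = {**signed, "rule_id": "CWE-22"}
```

Here `verify_record(signed, key)` succeeds and `verify_record(tampered, key)` fails, because changing any chained field invalidates the signature.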

Deterministic core

Two partitions, honest labels.

Taint-style classes (injection, path traversal, SSRF, deserialisation) run through a precise IFDS/IDE solver over a canonical code-property graph. We call these deterministic-core and back them with a reproducibility theorem.

Pattern queries, CodeQL queries, and memory-safety checks on C/C++ run through external engines. We call these oracle-passthrough and report their measured reproduction rate.

Every finding declares its origin. The dashboard never blurs the two.

core (ifds, ide) ≡ byte-identical

oracle (semgrep, codeql, cpg-query) ≈ reported rate

Both partitions ship. Only one carries the theorem.
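The rule that every finding declares its origin can be sketched as a lookup from engine to partition. The engine names come from the split above; the `Origin` enum, the `label_finding` helper, and the finding shape are hypothetical names introduced for illustration.

```python
from enum import Enum


class Origin(Enum):
    CORE = "deterministic-core"
    ORACLE = "oracle-passthrough"


# Engine-to-partition mapping, taken from the two partitions described above.
ENGINE_ORIGIN = {
    "ifds": Origin.CORE,
    "ide": Origin.CORE,
    "semgrep": Origin.ORACLE,
    "codeql": Origin.ORACLE,
    "cpg-query": Origin.ORACLE,
}


def label_finding(finding: dict) -> dict:
    """Attach the partition label for the engine that produced the finding."""
    # Unknown engines raise KeyError instead of defaulting,
    # so no finding can reach a report without a label.
    return {**finding, "origin": ENGINE_ORIGIN[finding["engine"]].value}


labelled = label_finding({"rule_id": "CWE-89", "engine": "ifds"})
```

Failing loudly on an unmapped engine is a deliberate choice in this sketch: a silent default would blur exactly the distinction the labels exist to protect.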

git push · pre-receive hook · scanipy

→ snapshotting commit 7f3a2c9...
→ parent: e1d8f4a (cached cpg)
→ closed-world precondition: held
→ affected: 8 files (of 14,328)
→ replaying ifds over slice...
✓ analysis complete in 6.4s
✓ 0 new findings
✓ attestation: byte-identical to parent

Incremental by default

Scan the delta, not the repo.

Scanipy treats every commit as a delta against its parent. The code-property graph from the parent commit is reused; only the changed declarations and their transitive callers are re-analysed. Refactor a class? Touch a config file? You pay for the change, not the codebase.

On open-world snapshots we publish a median ≥5× speedup, p95 ≥2×, with an explicit fallback rate.
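The "changed declarations and their transitive callers" set can be computed with a breadth-first walk over a reversed call graph. The sketch below assumes a call graph is already available as a plain mapping from caller to callees; the function name and the declaration names are hypothetical, and a real code-property graph carries far more than call edges.

```python
from collections import deque


def affected_declarations(calls: dict[str, set[str]], changed: set[str]) -> set[str]:
    """Return the changed declarations plus everything that transitively calls them."""
    # Invert caller -> callees into callee -> callers.
    callers: dict[str, set[str]] = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    affected = set(changed)
    queue = deque(changed)
    while queue:
        decl = queue.popleft()
        for caller in callers.get(decl, ()):
            if caller not in affected:
                affected.add(caller)
                queue.append(caller)
    return affected


# Hypothetical call graph: api.search -> db.search -> db.raw_query.
calls = {
    "api.search": {"db.search"},
    "db.search": {"db.raw_query"},
    "api.health": set(),
}
result = affected_declarations(calls, {"db.raw_query"})
```

With this graph, a change to `db.raw_query` marks `db.search` and `api.search` for re-analysis and leaves `api.health` untouched, so the work scales with the size of the affected set.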

Witnesses & fingerprints

A finding without a witness is a guess.

Every taint-style finding ships with the exact program path that produced it: source statement, every propagation step, sink statement. The witness is content-addressed and signed. Hand it to an engineer and the fix is obvious; hand it to an auditor and the case is closed.

Slice-fingerprints survive cosmetic refactors. Rename, reorder, extract, move: the same vulnerability keeps the same identity. No false-new alerts on style fixes.
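As a rough illustration of a fingerprint that ignores cosmetic change, the sketch below hashes the statements of a witness after replacing variable names with positional placeholders. It is not Scanipy's fingerprinting algorithm, which the text does not specify: `slice_fingerprint` is a hypothetical name, the input is assumed to be Python statements, and only renames, whitespace, and line moves are covered. Reorder and extract would need normalisation at the slice level.

```python
import ast
import hashlib


class _Canonicalise(ast.NodeTransformer):
    """Replace each distinct variable name with a positional placeholder."""

    def __init__(self) -> None:
        self.names: dict[str, str] = {}

    def visit_Name(self, node: ast.Name) -> ast.Name:
        # First name seen becomes v0, the next new one v1, and so on.
        placeholder = self.names.setdefault(node.id, f"v{len(self.names)}")
        return ast.copy_location(ast.Name(id=placeholder, ctx=node.ctx), node)


def slice_fingerprint(statements: list[str]) -> str:
    """Hash the witness statements with variable names abstracted away."""
    renamer = _Canonicalise()
    digest = hashlib.sha256()
    for stmt in statements:
        tree = renamer.visit(ast.parse(stmt))
        # ast.dump omits line and column info by default,
        # so moving or reformatting a statement does not change the hash.
        digest.update(ast.dump(tree).encode("utf-8"))
    return digest.hexdigest()


original = ['q = req.args["q"]', "rows = db.search(q)"]
renamed = ['query   = req.args["q"]', "rows = db.search(query)"]
different = ['q = req.args["q"]', "rows = db.lookup(q)"]
```

`original` and `renamed` produce the same fingerprint, while `different` does not, because attribute names such as `search` are part of the hashed structure.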

High · CWE-89 · Deterministic-core

SQL injection in lookup helper

api/search.py · L23 → db/raw.py · L91

api/search.py
23  q = req.args["q"]  # ← source
24  rows = db.search(q)

↓ taint propagates ↓

db/raw.py
91  sql = f"SELECT * FROM items WHERE name LIKE '%{term}%'"
92  cursor.execute(sql)  # ← sink
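For a witness like this one, the usual remedy is to bind the tainted value as a query parameter instead of formatting it into the SQL text. The sketch below assumes the standard-library `sqlite3` driver and its `?` placeholder style; the source does not say which database driver `db/raw.py` uses, and `search_items` is a hypothetical helper name.

```python
import sqlite3


def search_items(conn: sqlite3.Connection, term: str) -> list[tuple]:
    """Look up items by substring without splicing user input into SQL."""
    # The user-controlled term travels as a bound parameter, never as SQL text.
    pattern = f"%{term}%"
    cursor = conn.execute("SELECT name FROM items WHERE name LIKE ?", (pattern,))
    return cursor.fetchall()


# Throwaway in-memory database for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")
conn.executemany("INSERT INTO items VALUES (?)", [("widget",), ("gadget",)])

hostile = "' OR '1'='1"
safe_hit = search_items(conn, "wid")
hostile_hit = search_items(conn, hostile)
```

The hostile input is now matched as literal text and returns no rows. Note that `%` and `_` inside `term` still act as LIKE wildcards; escaping them is a separate concern from injection.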

Languages, by stage

Honest about what's ready.

Front-end fidelity dominates the schedule of any cross-language SAST tool. We don't pretend otherwise. Every language is gated by a parse-fidelity test before we publish recall claims for it.

Java & Python

Core · Stage A

Strongest front-ends. Full IFDS support for injection, path traversal, SSRF, deserialisation.

JavaScript & TypeScript

Core · Stage B

Determinism-attested after Stage A. Full IFDS for the four core classes.

Go

Core · Stage C

Points-to analysis investment underway. Oracle-passthrough until the fidelity gate passes.

Ruby & PHP

Oracle · Stage D

Oracle-passthrough today. Core port tracked, gated on proprietary front-end work.

C / C++

Oracle

Memory-safety via CodeQL throughout v3. Core port tracked but explicitly post-v3.