Architecture¶
An overview of how iscc-lib is structured — the crate model, module layout, streaming pattern, and conformance testing approach.
Hub-and-Spoke Crate Model¶
iscc-lib uses a hub-and-spoke architecture. A single pure-Rust core crate (iscc-lib) contains
all ISCC algorithms. Each binding crate depends on the core and translates its API to the target
language. The core has no FFI concerns — it publishes to crates.io independently.
graph TD
PY["iscc-py<br/><small>Python · PyO3</small>"] --> CORE["iscc-lib<br/><small>Pure Rust core</small>"]
NAPI["iscc-napi<br/><small>Node.js · napi-rs</small>"] --> CORE
WASM["iscc-wasm<br/><small>WebAssembly · wasm-bindgen</small>"] --> CORE
FFI["iscc-ffi<br/><small>C FFI · extern "C"</small>"] --> CORE
All binding crates are thin wrappers — they contain no algorithm logic. This ensures that every language produces identical results for the same inputs.
Workspace Layout¶
The repository follows kreuzberg's crates/ directory pattern with centralized dependency
management via workspace.dependencies in the root Cargo.toml.
iscc-lib/
├── Cargo.toml # Virtual workspace root
├── pyproject.toml # Python project (uv)
├── zensical.toml # Documentation config
├── mise.toml # Tool versions + tasks
├── .pre-commit-config.yaml # prek hooks
├── docs/ # Documentation site
├── notes/ # Architecture notes
├── benchmarks/
│ └── python/ # Comparative Python benchmarks
├── crates/
│ ├── iscc-lib/ # Core Rust library
│ │ ├── Cargo.toml
│ │ ├── src/ # Algorithm implementations
│ │ └── tests/ # Conformance test vectors
│ ├── iscc-py/ # Python bindings
│ │ ├── Cargo.toml
│ │ ├── pyproject.toml # maturin build config
│ │ ├── src/lib.rs # PyO3 wrappers
│ │ └── python/iscc_lib/ # Python package + type stubs
│ ├── iscc-napi/ # Node.js bindings
│ │ ├── Cargo.toml
│ │ ├── package.json # npm package config
│ │ ├── src/lib.rs # napi-rs wrappers
│ │ └── __tests__/ # Node.js conformance tests
│ ├── iscc-wasm/ # WebAssembly bindings
│ │ ├── Cargo.toml
│ │ ├── package.json # npm package config
│ │ ├── src/lib.rs # wasm-bindgen exports
│ │ └── tests/ # WASM integration tests
│ └── iscc-ffi/ # C FFI bindings
│ ├── Cargo.toml
│ ├── src/lib.rs # extern "C" functions
│ └── tests/ # C test program
└── .github/workflows/
├── ci.yml # Test + lint
└── docs.yml # Documentation deployment
Crate Summary¶
| Crate | Produces | Build Tool | Published To |
|---|---|---|---|
iscc-lib |
Rust library | cargo | crates.io |
iscc-py |
Python wheel | maturin + PyO3 | PyPI |
iscc-napi |
Native Node.js addon | napi-rs | npm |
iscc-wasm |
WASM package | wasm-bindgen | npm |
iscc-ffi |
Shared library (.so/.dll/.dylib) | cargo | Source |
Internal Module Structure¶
The iscc-lib core crate uses a tiered API to control what gets exposed to bindings and
downstream Rust consumers.
Tier 1 — Stable Public API¶
The 9 gen_*_v0 functions are the stable entrypoints, bound in all languages. They are pub
functions in the crate root (lib.rs). Changes require a SemVer MAJOR bump.
// All gen_*_v0 functions are pub in the crate root
pub fn gen_meta_code_v0(name, description, meta, bits) -> IsccResult<String>
pub fn gen_text_code_v0(text, bits) -> IsccResult<String>
pub fn gen_image_code_v0(pixels, bits) -> IsccResult<String>
pub fn gen_audio_code_v0(cv, bits) -> IsccResult<String>
pub fn gen_video_code_v0(frame_sigs, bits) -> IsccResult<String>
pub fn gen_mixed_code_v0(codes, bits) -> IsccResult<String>
pub fn gen_data_code_v0(data, bits) -> IsccResult<String>
pub fn gen_instance_code_v0(data, bits) -> IsccResult<String>
pub fn gen_iscc_code_v0(codes, wide) -> IsccResult<String>
Tier 2 — Public Rust API¶
The codec module is public for Rust consumers but not exposed through FFI bindings. It provides
base32 encoding/decoding and header manipulation utilities. May change in MINOR releases.
Internal Modules¶
These modules implement the core algorithms and are pub(crate) — never exposed to bindings or
external Rust consumers. Free to change at any time.
| Module | Purpose |
|---|---|
cdc |
Content-Defined Chunking for Data-Code |
dct |
Discrete Cosine Transform for Image-Code |
minhash |
MinHash algorithm for Text-Code and Data-Code |
simhash |
SimHash algorithm for Meta-Code and Audio-Code |
utils |
Text normalization, hashing helpers |
wtahash |
Winner-Take-All Hash for Video-Code |
Streaming Pattern¶
ISCC operations are CPU-bound, not I/O-bound — hashing and chunking are pure computation on byte slices. The core library takes bytes, not file paths. File I/O is the caller's responsibility.
This drives a key design decision: the Rust core is synchronous. Each binding adapts to its runtime's async model outside the core.
Core Pattern¶
Data-Code and Instance-Code support streaming via a three-phase pattern:
This matches the approach used by std::io::Write and blake3::Hasher. Callers feed data in chunks
of any size, then finalize to get the result.
Per-Binding Adaptation¶
| Binding | Async Strategy |
|---|---|
| Python | Sync API, GIL released during update(). Callers use asyncio.to_thread() if needed |
| Node.js | napi-rs AsyncTask offloads to libuv thread pool, returning Promise<T> |
| WASM | Sync exports — no threading in browser WASM |
| C FFI | ctx_new() / ctx_update() / ctx_finalize() / ctx_free() — standard streaming C API |
Why not async in the core?
Adding async fn to the core would force all consumers to bring a runtime (tokio), including WASM
and C FFI where that makes no sense. There are no awaitable operations — hashing and chunking are
pure computation. The rule: never expose Rust async across FFI boundaries.
Conformance Testing¶
All bindings share the same conformance test vectors — a vendored snapshot of data.json from the
official iscc-core Python reference implementation.
How It Works¶
- The canonical
data.jsonfile lives incrates/iscc-lib/tests/data.json - Every binding loads the same file (via relative path or
include_str!) - Test vectors contain inputs and expected outputs for all 9
gen_*_v0functions - Tests are parametrized — each test case name maps to a key in the JSON
Cross-Language Test Matrix¶
| Language | Test Runner | Vector Access |
|---|---|---|
| Rust | cargo test |
include_str! at compile time |
| Python | pytest | Relative path from test file |
| Node.js | node:test |
Relative path from __tests__/ |
| WASM | cargo test |
include_str! (no filesystem) |
| C | gcc + runtime | Linked against shared library |
This approach catches cross-language drift — encoding differences, rounding behavior, default parameter mismatches — immediately when any binding diverges from the reference.