Validation & Testing

How do we know whether our implementation is correct?

1. Validation Strategy: Cross-Library Testing

When developing fingerprinting algorithms, we compare our results against an established reference implementation, RDKit (written in C++). However, bit-for-bit comparison of the fingerprint vectors is not a valid correctness test, because the two libraries differ in how they hash features to bits:

  • Hashing Variance: Our native Julia implementation uses Julia's built-in hash() function, whereas RDKit uses specific PRNG (Pseudo-Random Number Generator) seeds in C++.
  • Result: The same feature maps to different bit indices in each library, so the bit vectors differ even when the extracted features agree.
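The effect can be illustrated with a minimal sketch. Here fold_features is a hypothetical helper, the feature identifiers are made up, and the seed argument merely stands in for "a different hashing scheme"; this is not the actual hashing code of MolecularFingerprints.jl or RDKit.

```julia
# Hypothetical helper: fold integer feature identifiers into a fixed-length
# bit vector. `seed` only stands in for "a different hashing scheme".
function fold_features(features::Vector{Int}, nbits::Int; seed::UInt = UInt(0))
    fp = falses(nbits)
    for f in features
        idx = Int(hash(f, seed) % UInt(nbits)) + 1   # map the hash into 1:nbits
        fp[idx] = true
    end
    return fp
end

features = [11, 42, 97, 1024]   # made-up subgraph identifiers
fp_a = fold_features(features, 2048; seed = UInt(1))
fp_b = fold_features(features, 2048; seed = UInt(2))

# Same features, but the positions of the set bits depend on the hashing scheme.
println("bits set (scheme A): ", findall(fp_a))
println("bits set (scheme B): ", findall(fp_b))
```

Both vectors encode the same four features, yet the printed index lists will generally not match, which is why comparing bit positions across libraries says nothing about correctness.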

Verification via Rank Correlation

Instead of bitwise equality, we validate using statistical correlation. If the chemical logic (subgraph extraction) is identical, the Tanimoto similarities computed for the same pairs of molecules should be highly correlated across both libraries. They need not be identical, because each hashing scheme produces its own bit collisions.
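As a rough sketch of this check, the snippet below computes pairwise Tanimoto similarities for two sets of fingerprints and then their Spearman rank correlation. The function names (tanimoto, pairwise_tanimoto, ranks, spearman) and the toy data are assumptions made for illustration, not the package's API. The "reference" fingerprints are simply a bit permutation of ours, standing in for a second library that sets different bit positions.

```julia
using Statistics  # standard library; provides `cor`

# Tanimoto similarity: |A ∩ B| / |A ∪ B| for two equal-length bit vectors.
function tanimoto(a::BitVector, b::BitVector)
    union_count = count(a .| b)
    union_count == 0 && return 1.0   # convention chosen here for two empty fingerprints
    return count(a .& b) / union_count
end

# Similarities for every unordered pair (i < j) of fingerprints.
function pairwise_tanimoto(fps::Vector{BitVector})
    n = length(fps)
    return [tanimoto(fps[i], fps[j]) for i in 1:n-1 for j in i+1:n]
end

# Fractional ranks: tied values share the average of their rank positions.
function ranks(x::AbstractVector{<:Real})
    order = sortperm(x)
    r = zeros(length(x))
    i = 1
    while i <= length(x)
        j = i
        while j < length(x) && x[order[j + 1]] == x[order[i]]
            j += 1
        end
        r[order[i:j]] .= (i + j) / 2
        i = j + 1
    end
    return r
end

# Spearman correlation is the Pearson correlation of the ranks.
spearman(x, y) = cor(ranks(x), ranks(y))

# Toy data: four made-up 8-bit fingerprints.
ours = [BitVector([1, 1, 0, 0, 1, 0, 0, 0]),
        BitVector([1, 0, 0, 0, 1, 0, 1, 0]),
        BitVector([0, 1, 1, 0, 0, 0, 0, 1]),
        BitVector([1, 1, 1, 0, 1, 0, 0, 0])]
perm = [8, 3, 5, 1, 7, 2, 6, 4]                  # shuffles bit positions
reference = BitVector[fp[perm] for fp in ours]   # different bits, same chemistry

rho = spearman(pairwise_tanimoto(ours), pairwise_tanimoto(reference))
println("Spearman correlation: ", rho)
```

Because a permutation leaves every Tanimoto value unchanged, the toy data correlates perfectly. Real cross-library data would typically fall somewhat short of that, so the acceptance threshold is a project decision rather than something this sketch prescribes.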

Concretely, if RDKit ranks Molecule B as the most similar molecule to Molecule A, MolecularFingerprints.jl should produce the same ranking, even though the underlying bit vectors differ.
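The ranking check can be sketched as a nearest-neighbor comparison. The names nearest_neighbors and tanimoto and the toy fingerprints below are illustrative assumptions. In a real test the second set would come from RDKit rather than from a bit permutation.

```julia
# Tanimoto similarity for two equal-length bit vectors.
function tanimoto(a::BitVector, b::BitVector)
    union_count = count(a .| b)
    union_count == 0 && return 1.0   # convention chosen here for two empty fingerprints
    return count(a .& b) / union_count
end

# For each molecule, the index of the most similar *other* molecule.
function nearest_neighbors(fps::Vector{BitVector})
    n = length(fps)
    return map(1:n) do i
        others = [j for j in 1:n if j != i]
        others[argmax([tanimoto(fps[i], fps[j]) for j in others])]
    end
end

# Toy data: made-up fingerprints, plus a bit-permuted copy that plays
# the role of the reference library.
ours = [BitVector([1, 1, 0, 0, 1, 0, 0, 0]),
        BitVector([1, 0, 0, 0, 1, 0, 1, 0]),
        BitVector([0, 1, 1, 0, 0, 0, 0, 1]),
        BitVector([1, 1, 1, 0, 1, 0, 0, 0])]
perm = [8, 3, 5, 1, 7, 2, 6, 4]
reference = BitVector[fp[perm] for fp in ours]

nn_ours = nearest_neighbors(ours)
nn_ref = nearest_neighbors(reference)

# Fraction of molecules whose top-ranked neighbor agrees in both sets.
agreement = count(nn_ours .== nn_ref) / length(nn_ours)
println("nearest-neighbor agreement: ", agreement)
```

Note that argmax returns the first maximum, so ties for the top rank are broken by index order. A stricter test might compare whole sets of tied neighbors instead.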

2. Testing Framework

We use Julia's standard-library Test module to check functional correctness, and we aim for high code coverage.
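For orientation, a test set in this style might look like the sketch below. toy_fingerprint is a hypothetical stand-in defined only for this example. A real test would call the package's own fingerprint functions instead.

```julia
using Test

# Hypothetical stand-in for a fingerprint function (not the package's API):
# sets bit mod1(f, nbits) for every integer feature f.
function toy_fingerprint(features::Vector{Int}, nbits::Int)
    fp = falses(nbits)
    for f in features
        fp[mod1(f, nbits)] = true
    end
    return fp
end

@testset "toy fingerprint" begin
    fp = toy_fingerprint([3, 17, 40], 16)
    @test fp isa BitVector
    @test length(fp) == 16
    @test 1 <= count(fp) <= 3                       # at most one bit per feature
    @test fp == toy_fingerprint([3, 17, 40], 16)    # deterministic output
end
```

Checking properties such as length, bit count, and determinism keeps the test independent of the exact bit positions, in line with the validation strategy described earlier.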

Running Tests

The simplest way to run the tests is from the Pkg mode of the Julia REPL (press ] to enter it). From the root of the repository:

# Method 1: The standard Pkg way (Recommended)
pkg> activate .
pkg> test

# Method 2: Running the test script directly from the terminal
# julia --project=test test/runtests.jl

Managing the Test Environment

The tests live in a separate environment in the test/ directory. This keeps the main package's dependencies lightweight, because testing-only packages (like RDKitMinimalLib) stay out of the package's own environment.

Adding a new test-only dependency:

pkg> activate test
# Example: adding a stats package for validation
pkg> add Statistics

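Syncing your local changes for testing: If you change the source code in src/, make sure the test environment tracks your local checkout rather than a released version. This only needs to be done once per environment; afterwards, edits in src/ are picked up the next time the package is loaded:

pkg> activate test
pkg> dev .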
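For scripted setups, the same steps can be expressed through the Pkg API. This is a sketch only: setup_test_env and repo_root are hypothetical names, and the function is defined but deliberately not called here, since calling it modifies the test environment.

```julia
using Pkg

# Scripted equivalent of the Pkg REPL commands above.
# `setup_test_env` and `repo_root` are hypothetical names for this sketch.
function setup_test_env(repo_root::AbstractString = pwd())
    Pkg.activate(joinpath(repo_root, "test"))   # same as: pkg> activate test
    Pkg.develop(path = abspath(repo_root))      # same as: pkg> dev .
    Pkg.instantiate()                           # install the remaining test dependencies
    return nothing
end

# Run from the repository root when needed:
# setup_test_env()
```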

CI Integration

Every pull request is automatically tested against multiple Julia versions and operating systems via GitHub Actions. We also track code coverage, so please make sure that any new fingerprint type you add has corresponding tests in test/runtests.jl.