SciRouter Labs · Beta Proof of Concept · First JEPA model from SciRouter · Open beta · no login

Disease to mechanistic hypothesis,
in under five seconds.

Sci-JEPA v1.0 is SciRouter's first drug-discovery model built on a Joint-Embedding Predictive Architecture. Type a disease in plain English and Sci-JEPA surfaces the proteins associated with it and the compounds most likely to bind them, clustered by mechanism, with one-click physics verification (Boltz-2) and novel-analog generation (REINVENT4).

Fully open beta · no signup, no API key, no data collection

the numbers

Not a ranking on a leaderboard. A production-shaped result.

v1.0 was evaluated on retrieval — the question an actual user asks. Out of a 5,000-compound library, does the correct answer land in the top 50? Three held-out tests, one bar: was the mechanism recoverable?
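To make the metric concrete, here is a minimal sketch of recall@k for the single-answer case described above: each query has one correct compound, and the query counts as a hit when that compound appears in the top k of the ranked library. The function names, compound IDs, and toy rankings are illustrative assumptions, not SciRouter's evaluation code.

```python
from typing import Dict, List, Sequence


def hit_at_k(ranked_ids: Sequence[str], correct_id: str, k: int = 50) -> bool:
    """Return True if the correct compound appears in the top-k ranked results."""
    return correct_id in ranked_ids[:k]


def recall_at_k(rankings: Dict[str, List[str]], answers: Dict[str, str], k: int = 50) -> float:
    """Fraction of queries whose correct compound lands in the top k."""
    hits = sum(hit_at_k(rankings[q], answers[q], k) for q in answers)
    return hits / len(answers)


# Toy data (hypothetical IDs): a 5,000-compound library ranked for two queries.
library = [f"c_{i}" for i in range(5000)]
rankings = {
    "query_1": library,                  # correct compound ranked 11th
    "query_2": list(reversed(library)),  # correct compound ranked 4,990th
}
answers = {"query_1": "c_10", "query_2": "c_10"}

print(recall_at_k(rankings, answers, k=50))  # 0.5
```

One of the two toy queries recovers its compound inside the top 50, so recall@50 is 0.5. The reported scores apply the same kind of count to 20, 12, and 20 held-out queries.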

mean recall@50: 0.328 (across Tests A / B / C)
vs. contrastive baseline: 4.2× (+25 pp absolute)
prospective 2024+ test: 0.05 → 0.15 (Test C, OOD)
end-to-end latency: <5 s (disease → mechanism → verify)

what Tests A / B / C measure

A · Polypharmacology
0.500

Drugs with multiple validated targets (e.g. dasatinib, sorafenib). Given one target pocket, retrieve the drug — does the model place a single compound near all members of its target set?

20 queries

B · Clinical
0.333

FDA-approved or Phase III/IV drugs paired with their primary named target (imatinib/ABL1, sotorasib/KRAS-G12C). Direct test of recovering real-world drug-target links.

12 queries

C · Prospective 2024+
0.150

Drug-target pairs first reported in 2024+ literature — strictly after training-data cutoff. Cannot have been memorised; retrieval here is evidence of real generalisation.

20 OOD queries
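The headline figures can be cross-checked from the three per-test scores. This is a minimal sketch under two assumptions: the mean is an unweighted average over the three tests, and "4.2×" and "+25 pp" both compare that mean against the contrastive baseline. The baseline value itself is not stated on this page; the number below is only what the ratio implies.

```python
# Per-test recall@50 as reported above.
scores = {"A": 0.500, "B": 0.333, "C": 0.150}

# Assumption: unweighted mean across the three tests.
mean_recall = sum(scores.values()) / len(scores)
print(round(mean_recall, 3))  # 0.328

# Assumption: the baseline's mean recall is whatever the 4.2x ratio implies.
implied_baseline = mean_recall / 4.2
absolute_gain_pp = (mean_recall - implied_baseline) * 100
print(round(implied_baseline, 3))  # 0.078
print(round(absolute_gain_pp))     # 25
```

Under those assumptions the three published numbers (0.328, 4.2×, +25 pp) are mutually consistent.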

how it works

Four steps, under five seconds.

Sci-JEPA replaces per-pair affinity prediction with a single shared map where compounds and pockets live. Search becomes a matmul.
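As a rough illustration of "search becomes a matmul", the sketch below scores every pocket latent against every compound latent with dot products and then takes the top hits per pocket. Dot-product similarity, the latent dimension, and the toy sizes are all assumptions; this page does not state the actual scoring function or embedding size.

```python
import random
from typing import List

Matrix = List[List[float]]


def score_matrix(pockets: Matrix, compounds: Matrix) -> Matrix:
    """Dot-product scores: result[i][j] = pocket i . compound j (P times C-transpose)."""
    return [
        [sum(p * c for p, c in zip(pocket, compound)) for compound in compounds]
        for pocket in pockets
    ]


def top_k(scores_row: List[float], k: int) -> List[int]:
    """Indices of the k highest-scoring compounds for one pocket."""
    order = sorted(range(len(scores_row)), key=lambda j: scores_row[j], reverse=True)
    return order[:k]


# Toy shapes (assumptions): 4 pockets, 100 compounds, 8-dimensional latents.
rng = random.Random(0)
dim = 8
pockets = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(4)]
compounds = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(100)]

scores = score_matrix(pockets, compounds)  # 4 x 100 score matrix
best = top_k(scores[0], k=5)               # top 5 compound indices for pocket 0
print(len(scores), len(scores[0]), best)
```

Plain Python keeps the sketch dependency-free. With precomputed latents held in an array library, the same computation is a single matrix product.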

01

Disease → 40-protein network

We resolve your plain-English query through OpenTargets into a weighted set of disease-associated proteins.

02

JEPA scoring (5k × 40)

Every protein pocket is scored against 5,000 compound latents in one matmul. No search, no iteration — just matrix multiplication.

03

Mechanism clustering

HDBSCAN groups top candidates by the pathways they jointly hit. You see the shape of the hypothesis, not a flat list.

04

Verify & extend

Any candidate can be physics-verified (Boltz-2) or expanded into novel analogs (REINVENT4) inline.

training-scale ablation

More data helps — but the gains are saturating.

v1.0 is a cross-modal JEPA trained with the LeWorldModel recipe (MSE prediction + SIGReg, no stop-gradient, no EMA). This ablation charts how mean recall@50 moved as we grew the training corpus to the final 1.8 M pairs.

scaling curve · real data

Training-data scaling through v1.0

[Chart: mean recall@50 (y-axis, 0.00 to 0.30) against training pairs (x-axis, log scale, 1k to 1M) for Sci-JEPA (LeWM recipe) and the contrastive baseline; v1.0 is marked at 0.328.]
Training-scale lift: 10k→35k +0.239 (steep) · 35k→110k +0.050 (healthy) · 110k→1.8M +0.017 (saturating at v1.0). More data alone won't take us much further — v1.1's next push is architectural.
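The individual points on the curve can be recovered from the caption. A minimal sketch, assuming the three lifts are additive differences in mean recall@50 and that the curve ends at the v1.0 value of 0.328; the intermediate values are derived here, not quoted from the page.

```python
# Reported lifts in mean recall@50 between training-set sizes (from the caption).
lifts = [("10k->35k", 0.239), ("35k->110k", 0.050), ("110k->1.8M", 0.017)]
final_recall = 0.328  # v1.0 at 1.8M pairs

# Assumption: lifts are additive, so earlier points follow by subtraction.
points = [("1.8M", final_recall)]
value = final_recall
for label, lift in reversed(lifts):
    value = round(value - lift, 3)
    points.append((label.split("->")[0], value))

for size, recall in reversed(points):
    print(f"{size:>5}: {recall:.3f}")
# Prints:
#   10k: 0.022
#   35k: 0.261
#  110k: 0.311
#  1.8M: 0.328
```

Read this way, most of the total gain arrives between 10k and 35k pairs, and the final 16-fold increase in data adds less than two points.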

what it can do

Built for the query a biologist actually wants to ask.

instant

Matmul-native retrieval

Offline-precomputed latents mean 5,000 × 40 scoring is a single matrix multiplication — 200 k scores in under 100 ms on an M-series chip.

mechanism-first

Not a single prediction

Returns clusters of compounds that hit multiple proteins in the same pathway — the shape of a real hypothesis, not a flat list of scores.

verifiable

Boltz-2 + REINVENT4 wired

Any candidate can be verified with physics (Boltz-2, reporting ΔG, cost, and latency) or extended into novel analogs (REINVENT4), inline.

sub-5s

Live end-to-end

Disease query through mechanism clustering to candidate spotlight in under five seconds against a live API — no batch jobs, no queues.

pure JEPA

LeWM recipe, verbatim

MSE + SIGReg, no stop-gradient, no EMA target. The 2026 frontier JEPA recipe, applied to drug-target retrieval as SciRouter's first JEPA release.

open beta

No signup, no key

Every query runs against the public v1.0 checkpoint. We collect no query logs. If you want to run it locally, the repo ships with the 42 MB checkpoint.

Try it live

The full v1.0 retrieval engine — live at /scijepa/demo

Open the live demo and type any disease. The v1.0 engine resolves it through OpenTargets, scores 5,000 compounds against 40 implicated proteins in a single matmul, clusters by pathway, and lets you verify or extend any candidate — all inline, all in under five seconds. No signup, no API key, no data collection.

live API contract
# Resolve a disease
curl "https://scirouter.ai/api/scijepa/disease/resolve?q=pancreatic+cancer"

# Screen 5k compounds against the 40 top proteins
curl -X POST https://scirouter.ai/api/scijepa/screen \
  -H "content-type: application/json" \
  -d '{"disease_id":"EFO_0002618"}'

# Verify a candidate (Boltz-2)
curl -X POST https://scirouter.ai/api/scijepa/verify \
  -H "content-type: application/json" \
  -d '{"compound_id":"c_123","smiles":"...","target_uniprot_id":"P01116","pocket_sequence":"..."}'
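For callers who prefer a script to curl, the first two calls above can be sketched in Python with only the standard library. The endpoint paths and JSON fields are taken from the curl examples; the helper names, the timeout, and the choice to return raw parsed JSON are assumptions, since the response schema is not documented on this page.

```python
import json
import urllib.parse
import urllib.request
from typing import Any

BASE_URL = "https://scirouter.ai/api/scijepa"


def resolve_request(query: str) -> urllib.request.Request:
    """Build the GET request that resolves a plain-English disease name."""
    params = urllib.parse.urlencode({"q": query})
    return urllib.request.Request(f"{BASE_URL}/disease/resolve?{params}", method="GET")


def screen_request(disease_id: str) -> urllib.request.Request:
    """Build the POST request that screens the compound library for a disease."""
    body = json.dumps({"disease_id": disease_id}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/screen",
        data=body,
        headers={"content-type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> Any:
    """Send a request and return the parsed JSON body (schema not documented here)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs live network calls, so it is left commented out):
# resolved = send(resolve_request("pancreatic cancer"))
# screened = send(screen_request("EFO_0002618"))
```

Building the requests separately from sending them makes it possible to inspect the URL and payload before anything goes over the network.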