⚡ Physics-Based • Zero-Shot • Deterministic

Biology is Structural Physics.
Now We Can Calculate It.

HexaGene is a deterministic physics engine that calculates structural stress, resilience, and failure risk from DNA sequences and biomarkers, without machine learning and without training data.

Try the Explorer → 🩺 Clinical Demo → View Dataset
T=16.39
ClinVar Separation
4.62σ
NHANES Discrimination
98.4%
Classification Accuracy
p<10⁻⁶⁰
Statistical Certainty

Structural Physics Explorer

Enter any DNA sequence to see its binary lattice encoding and structural damage score in real time.

Physics of the Genetic Code

HexaGene applies established thermodynamic and biophysical principles to quantify structural stress in DNA sequences.

DNA is not just information: it is a physical polymer with measurable mechanical properties. Base-pair stacking energies, hydrogen bond strengths, and local flexibility are governed by well-understood thermodynamics. HexaGene formalizes these principles into a deterministic scoring framework.

Base-Pair Thermodynamics → Binary Encoding

G ≡ C
3 Hydrogen Bonds
ΔG ≈ -21 kJ/mol
→ 1 (Rigid)
A = T
2 Hydrogen Bonds
ΔG ≈ -14 kJ/mol
→ 0 (Flexible)
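The mapping above can be sketched in a few lines of Python. This is only an illustration of the stated rule (G/C to 1, A/T to 0); the function name `encode_binary` is hypothetical and the page does not say how HexaGene handles ambiguous bases, so this sketch simply rejects them.

```python
# Hypothetical helper: only the G/C -> 1, A/T -> 0 rule comes from the page.
RIGID = {"G": 1, "C": 1, "A": 0, "T": 0}


def encode_binary(seq: str) -> list[int]:
    """Map each base to 1 (G/C, three hydrogen bonds) or 0 (A/T, two)."""
    seq = seq.upper()
    unknown = set(seq) - set(RIGID)
    if unknown:
        # Assumption: ambiguous codes such as N are treated as errors.
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return [RIGID[base] for base in seq]


print(encode_binary("GATTACA"))  # [1, 0, 0, 0, 0, 1, 0]
```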
🔬

Nearest-Neighbor Thermodynamics

The stability of DNA depends on stacking interactions between adjacent base pairs, the same principle used in RNA folding models and PCR primer design.

ΔG = Σ ΔG(stack) + ΔG(init)
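As a rough illustration of the sum above, the sketch below adds one stacking term per adjacent base pair plus an initiation term for each duplex end. It is the standard nearest-neighbor model, not HexaGene's implementation. The numeric values are quoted from the SantaLucia (1998) unified parameter set in kcal/mol at 37 °C (the page's per-base-pair figures are in kJ/mol) and should be checked against the published table before use; the symmetry correction for self-complementary duplexes is omitted, and `duplex_delta_g` is a hypothetical name.

```python
# Stacking free energies (kcal/mol, 37 degrees C), quoted from the SantaLucia
# (1998) unified set. Treat them as illustrative and verify before relying on them.
_UNIQUE_STACKS = {
    "AA": -1.00, "AT": -0.88, "TA": -0.58, "CA": -1.45, "GT": -1.44,
    "CT": -1.28, "GA": -1.30, "CG": -2.17, "GC": -2.24, "GG": -1.84,
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def _reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


# A dinucleotide step and its reverse complement describe the same stack,
# so expand the 10 unique entries to all 16 dinucleotides.
STACK = dict(_UNIQUE_STACKS)
for step, energy in _UNIQUE_STACKS.items():
    STACK[_reverse_complement(step)] = energy

# Initiation penalty per duplex end, depending on the terminal base pair.
INIT = {"G": 0.98, "C": 0.98, "A": 1.03, "T": 1.03}


def duplex_delta_g(seq: str) -> float:
    """Delta G = sum of stacking terms + initiation terms (kcal/mol)."""
    seq = seq.upper()
    if len(seq) < 2 or set(seq) - set("ACGT"):
        raise ValueError("need at least two unambiguous bases")
    stacking = sum(STACK[seq[i:i + 2]] for i in range(len(seq) - 1))
    initiation = INIT[seq[0]] + INIT[seq[-1]]
    return stacking + initiation


print(round(duplex_delta_g("GATTACA"), 2))  # -4.64
```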

Sequence Context Effects

A mutation's impact depends on its neighbors. CpG dinucleotides, trinucleotide contexts, and codon position all modulate local physical stability.

6bp sliding window analysis
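A 6 bp sliding window can be sketched as below. The page does not say which quantity HexaGene computes inside each window, so this example assumes the simplest one, the fraction of rigid (G/C) bases; `window_rigidity` is a hypothetical name.

```python
def window_rigidity(seq: str, width: int = 6) -> list[float]:
    """G/C fraction of every `width`-bp window, advancing one base at a time."""
    bits = [1 if base in "GC" else 0 for base in seq.upper()]
    if len(bits) < width:
        return []
    total = sum(bits[:width])
    fractions = [total / width]
    for i in range(width, len(bits)):
        # Slide the window: add the entering base, drop the leaving one.
        total += bits[i] - bits[i - width]
        fractions.append(total / width)
    return fractions


# Three windows for an 8 bp sequence: GGCCAA, GCCAAT, CCAATT.
print(window_rigidity("GGCCAATT"))
```

A point mutation changes every window that overlaps it, which is one way a single substitution can be scored in its local context.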
⚖️

Structural Stress Quantification

Mutations that disrupt local stiffness, introduce torsional strain, or break symmetric patterns create measurable "structural dissonance."

SDS = f(stiffness, lability, harmony)
Seven Biophysical Features
Nucleotide Transition Severity
Transversions cost more energy than transitions
T = 9.51
GC-Content Perturbation
Local rigidity changes from base composition shifts
T = 7.70
Harmonic Balance
Symmetry of the central 4-base "nuclear core"
T = 7.36
Local Sequence Stiffness
Mechanical resistance to conformational change
Validated
Codon Position Impact
Wobble vs. critical position effects
Validated
Compositional Complexity
Information density of local context
Validated
Neighbor-Codon Transitions
Elemental conflicts between adjacent codons
T = 2.33
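Two of the features above have textbook definitions that are easy to show in code: whether a substitution is a transition (purine to purine, or pyrimidine to pyrimidine) or a transversion, and how it shifts the local G/C count. The sketch below covers only those definitions; the weights HexaGene assigns to them are not published on this page, and both function names are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_class(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or {ref, alt} - (PURINES | PYRIMIDINES):
        raise ValueError("expected two different bases from A, C, G, T")
    same_ring_type = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_ring_type else "transversion"


def gc_shift(ref: str, alt: str) -> int:
    """Change in G/C count caused by the substitution: -1, 0, or +1."""
    def is_gc(base: str) -> bool:
        return base.upper() in {"G", "C"}
    return int(is_gc(alt)) - int(is_gc(ref))


print(substitution_class("A", "G"), gc_shift("A", "G"))  # transition 1
print(substitution_class("A", "C"), gc_shift("G", "C"))  # transversion 0
```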
Thermodynamic Basis
🧬 Context-Aware
✓ ACMG-Compatible
🔄 Null-Model Validated

Pattern Recognition Has Hit Its Ceiling

AI and ML tools learn from historical data. They cannot predict what they haven't seen. Biology doesn't work that way: it follows physics.

🧬

Variants of Uncertain Significance

40%+ of genetic findings are VUS. Existing tools cannot classify novel or rare variants because they weren't in training data.

$15B genetic testing market held back by uncertainty
💊

Late-Stage Drug Failures

90% of candidates fail clinical trials. Current tools miss structural instability that manifests only under biological stress.

$2.6B average cost per approved drug
🏭

Expression & Aggregation Failures

Codon optimization tools achieve 0.14 correlation with expression. Aggregation causes 30% of biologics batch failures.

$400B biologics manufacturing at risk
⏰

Disease Detection Lags Pathology

Standard risk tools achieve 70-80% accuracy and 1.5-2σ separation. Structural deterioration starts years before diagnosis.

$50B+ preventable healthcare costs

One Core Engine. Multiple Applications.

HexaCore is the deterministic physics engine at the foundation. Each module is a validated application of the same underlying physics.

HEXACORE

Structural Physics Engine

Deterministic • Explainable • Zero-Shot

MODULE 01

Manufacturing

Improve yield, stability, and predictability in biological drug manufacturing.

Validated: ρ = -0.92, p < 10⁻⁶
MODULE 02

Drug Discovery

Identify structurally robust therapeutic sequences early in development.

Validated: Detects silent risks
MODULE 03

Diagnostics

Assess functional risk of genetic variants including synonymous mutations.

Validated: T = 16.39, p < 10⁻⁶⁰
MODULE 04

Longevity & Inverse

Quantify system-level resilience from routine biomarkers.

Validated: 4.62σ, 98.4% accuracy

Proven on Real Biological Data

Every module is independently validated on published datasets. No simulations. No synthetic benchmarks.

Primary Validation

ClinVar Pathogenicity Study

38,000 genetic variants from NCBI ClinVar. The engine was blind to clinical labels. The physics-based risk calculation separated benign from pathogenic variants with T = 16.39 (p ≈ 10⁻⁶⁰).

T = 16.39
T-Statistic
10⁻⁶⁰
P-Value
38,000
Variants Tested
Zero-Shot
No Training
Orthogonality Proof • VUS Rescue

REVEL Grey Zone Resolution Study

2,000 ClinVar missense variants benchmarked against the REVEL ensemble predictor. HexaGene maintains discrimination where conservation-based tools fail, and its low correlation with REVEL (r = 0.27) indicates a largely independent, complementary signal.

T = 13.20
Overall Separation
T = 3.62
Grey Zone (0.4-0.6)
AUC = 0.67
VUS Rescue
r = 0.27
REVEL Correlation
145
Grey Zone Variants
5.1×10⁻⁴
Grey Zone P-Value

Fragile Genes (High Volatility)

T = 10.30
TTN, BRCA2, cardiac panels

Robust Genes (High Stiffness)

T = 7.33
40% stronger in fragile genes
Population Health

NHANES Reverse Imputation

7,939 participants from CDC national survey. Structural constants inferred from routine biomarkers without genetic data.

4.62σ
Separation
98.4%
Accuracy
Manufacturing

β-Lactam Enzyme Stability

IPNS enzymes across 15 organisms. Structural conflict rate predicts expression stability with near-perfect correlation.

ρ = -0.92
Correlation
10⁻⁷
P-Value

Open Science, Protected IP

Validation datasets and results are publicly available. Core methodology is patent-protected.

📦

GitHub Repository

Validation scripts, benchmark datasets, and reproducible analysis pipelines.

VUS Rescue Paper

Technical manuscript: HexaGene resolves VUS in ML grey zones with T=3.62 where REVEL scores 0.4-0.6.

🗄️

NHANES Validation Data NEW

Complete dataset: 7,939 subjects, 10-marker panels, outcome validation. DOI for citation.

🩺

Clinical Oracle Demo

Interactive metabolic risk demo with 3,097 NHANES subjects. Enter your own biomarkers for real-time prediction.

Understanding HexaGene

All
Core Engine
Manufacturing
Diagnostics
Longevity
CORE What scientific standards does HexaGene align with?
HexaGene builds on established principles already standard across genomics and biophysics:

1. Nearest-Neighbor Thermodynamics: The same energy models used in RNA folding (ViennaRNA, Mfold) and PCR primer design.

2. Sequence-Context Modeling: Consistent with CpG mutability, trinucleotide signatures, and codon context effects in translation.

3. ACMG Orthogonal Evidence: Clinical variant interpretation explicitly encourages independent lines of evidence; HexaGene qualifies because it does not reuse conservation features.

4. Null-Model Validation: Features validated against sequence-shuffled controls and GC-matched nulls, matching gold-standard biophysics methodology.

HexaGene is not new biology: it formalizes well-understood physical constraints into a deterministic framework.
CORE What makes HexaGene different from AI/ML tools?
HexaGene is a deterministic physics engine, not a machine learning model. It calculates structural stress, friction, and decay from first principles using equations, not patterns learned from data. This means it can assess novel sequences, rare variants, and never-seen-before mutations because it computes physics, not history. There is no training, no fitting, no black box.
CORE What does "zero-shot capability" mean?
Zero-shot means HexaGene can assess sequences it has never encountered before. Traditional tools require similar examples in their training data. HexaGene calculates the physical stress on any DNA structure from scratch, including ultra-rare variants (AF < 0.0001) and completely novel synthetic sequences. This was validated in the ClinVar study, where the engine successfully predicted pathogenicity for variants with no population history.
CORE What are k, μ, λ, and SRI?
These are the four structural constants that HexaGene calculates:

k (Structural Resilience): How quickly the system returns to equilibrium after stress.
μ (Metabolic Friction): Turbulence, viscosity, and inflammatory drag in the system.
λ (Structural Decay): Accelerated aging and accumulated material fatigue.
SRI (Structural Risk Index): Composite measure of global structural tension.

These are analogous to material constants in engineering: they describe how biological structures respond to stress.
CORE Does HexaGene replace existing tools?
No. HexaGene complements existing tools by adding a physics-based layer. Conservation scores (SIFT, PolyPhen) tell you what evolution has filtered. Codon optimization (CAI) tells you about translation efficiency. HexaGene tells you about structural stress, an orthogonal dimension. The best results come from combining HexaGene with existing pipelines, not replacing them.
CORE Is the methodology published?
Validation results and benchmark datasets are publicly available on GitHub and Zenodo. A technical preprint describing the REVEL benchmark study is available on bioRxiv. The core mathematical framework is protected by US Provisional Patent #63/918,749. We believe in open science for validation while protecting the underlying intellectual property.
CORE How reproducible are the results?
Fully reproducible. The pipeline is deterministic: the same input always produces the same output. There is no training, no random initialization, and no stochastic component. Any institution with access to the same public datasets (ClinVar, NHANES) can reproduce our validation results exactly. Validation scripts are available on GitHub.
MFG How does HexaGene improve expression yields?
HexaGene identifies structural stress patterns in DNA sequences that lead to expression failure, independent of codon adaptation. In our β-lactam enzyme study, structural conflict rate predicted stability with ρ = -0.92 (p < 10⁻⁶). This means we can flag problematic sequences before wet-lab work, reducing failed batches and optimization cycles.
MFG Can HexaGene predict aggregation?
Yes. In our GLP-1 peptide validation, structural conflict rate at junction regions correlated with aggregation propensity (ρ = 0.67, p = 0.002). Constructs that failed due to aggregation had 14% higher junction conflict rates. This allows early identification of aggregation-prone sequences before formulation development.
DX How accurate is HexaGene for variant classification?
In the ClinVar validation study (38,000 variants), HexaGene achieved a T-statistic of 16.39 (p = 3.43 × 10⁻⁶⁰). For context, a |T| of about 2 corresponds to the conventional p < 0.05 significance threshold; 16.39 represents exceptional separation between the benign and pathogenic groups. The engine was completely blind to clinical labels during assessment.
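A two-sample T-statistic of this kind can be computed as sketched below. The page does not say which variant of the test was used, so this example assumes Welch's unequal-variance form; the scores are made-up toy numbers, not ClinVar results, and `welch_t` is a hypothetical name.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t: difference in means over its unequal-variance standard error."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


# Toy scores for illustration only.
pathogenic_scores = [0.62, 0.71, 0.58, 0.80, 0.66]
benign_scores = [0.41, 0.52, 0.47, 0.39, 0.55]
print(welch_t(pathogenic_scores, benign_scores))
```

The statistic grows with both the gap between group means and the sample size, so a very large T on 38,000 variants shows that the groups differ on average, which is a different question from how well any single variant is classified.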
DX Can HexaGene explain WHY a variant is pathogenic?
Yes, and this is a key differentiator. While ML tools output "probably damaging" without explanation, HexaGene provides mechanistic interpretation: "High friction mutation in low-stiffness region causes structural failure." The physics constants (stiffness, friction, decay) explain what makes a mutation break the structure. This supports clinical reasoning and suggests intervention targets.
DX Is HexaGene a diagnostic test?
No. HexaGene does not diagnose disease or assign clinical labels. It provides structural state metrics (stiffness, friction, decay, risk scores) that support clinical reasoning. Think of it as a "structural health lens": a complementary layer of physics-based evidence to help interpret variants, not replace clinical judgment.
DX How does HexaGene perform in REVEL grey zones?
When REVEL scores fall between 0.4-0.6 (the "grey zone" affecting 15-25% of variants), HexaGene maintains discrimination with T = 3.62 (p = 5.1×10⁻⁴) and AUC = 0.67. The low correlation with REVEL (r = 0.27) confirms HexaGene measures a distinct biological signal: structural physics rather than evolutionary conservation. This makes it particularly valuable for VUS rescue.
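The AUC quoted above has a direct interpretation: the probability that a randomly chosen pathogenic variant receives a higher score than a randomly chosen benign one. The sketch below computes it by pairwise comparison, which is practical for a set the size of the 145 grey-zone variants. The function name and the toy scores are assumptions for illustration.

```python
def auc(positive_scores, negative_scores):
    """Share of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


print(auc([3, 4], [1, 2]))  # 1.0, perfect separation
print(auc([1, 2], [1, 2]))  # 0.5, no better than chance
```

On this scale 0.5 is chance and 1.0 is perfect, so 0.67 indicates a real but moderate signal inside the grey zone.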
LONG How does reverse imputation work?
HexaGene can infer structural physics from biomarker patterns without requiring genetic data. The same equations that map DNA → structure can be inverted to map biomarkers → structure. In the NHANES validation, we computed k, μ, and λ from routine labs (HbA1c, triglycerides, albumin, CRP, etc.) and achieved 4.62σ separation between healthy and metabolically stressed cohorts.
LONG What biomarkers are required?
The validated minimal panel includes: HbA1c, fasting glucose, triglycerides or LDL, albumin, creatinine, and hs-CRP. Optional additions include WBC and lipid panel. These are standard markers available in any routine blood panel; no specialized tests are required. The engine can also integrate wearable data (HRV, activity) in hybrid mode.
LONG How early can HexaGene detect risk?
The NHANES validation showed that structural decay (λ) and friction (μ) rise before biomarkers cross diagnostic thresholds. Early estimates suggest 6-18 months of lead time for metabolic syndrome detection. This is because HexaGene measures system-wide structural stress, not just single pathway markers. Standard risk tools typically achieve 1.5-2σ separation; HexaGene achieves 4.62σ.
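The page does not define how "σ separation" is measured. One common reading, assumed here, is the gap between the two cohort means expressed in units of their pooled standard deviation. The sketch below implements only that assumed definition with toy numbers; `sigma_separation` is a hypothetical name and the result is not comparable to the 4.62σ figure unless the same definition was used.

```python
from math import sqrt
from statistics import mean, variance


def sigma_separation(cohort_a, cohort_b):
    """Absolute difference in means divided by the pooled standard deviation."""
    na, nb = len(cohort_a), len(cohort_b)
    pooled_sd = sqrt(
        ((na - 1) * variance(cohort_a) + (nb - 1) * variance(cohort_b))
        / (na + nb - 2)
    )
    return abs(mean(cohort_a) - mean(cohort_b)) / pooled_sd


# Toy cohorts: means 2 and 5, standard deviation 1 each.
print(sigma_separation([1, 2, 3], [4, 5, 6]))  # 3.0
```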

Partner With Us

HexaGene is seeking validation partners in pharmaceutical manufacturing, drug discovery, clinical diagnostics, and precision medicine.

✉ Contact Us 📄 View Dataset