MassSpecGym Reconstructor: Fingerprint Prediction Transformer

This model is a Spectral Transformer optimized for reconstructing molecular fingerprints from MS/MS spectra. It treats identification as a multi-label classification task, with each fingerprint bit serving as one binary label.

Model Details

  • Architecture: Transformer Encoder (2 layers, 4 heads) with Fourier Feature m/z encoding.
  • Objective: Focal Loss (gamma = 2.0) to address sparsity in molecular fingerprints.
  • Input: MS/MS fragment peaks (m/z and intensity) + Precursor mass.
  • Output: 4096-dimensional binary fingerprint vector.
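To make the objective concrete, here is a minimal, framework-free sketch of a binary focal loss with gamma = 2.0. It is not taken from the repository: the function names, the clamping constant `eps`, and the absence of an alpha class-balancing term are assumptions made for illustration.

```python
import math

def binary_focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Mean binary focal loss over the bits of one fingerprint.

    probs   : predicted probabilities that each bit is 1
    targets : ground-truth bits (0 or 1)
    gamma   : focusing parameter (2.0 as stated in the model card)
    eps     : clamping constant, an assumption for numerical safety
    """
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)       # keep log() finite
        p_t = p if y == 1 else 1.0 - p        # probability given to the true class
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)

def binary_cross_entropy(probs, targets, eps=1e-7):
    """Plain BCE, which is the focal loss with gamma = 0."""
    return binary_focal_loss(probs, targets, gamma=0.0, eps=eps)

# An "easy negative": a zero bit already predicted with low probability.
easy_focal = binary_focal_loss([0.05], [0])
easy_bce = binary_cross_entropy([0.05], [0])

# A "hard positive": an active bit the model is currently missing.
hard_focal = binary_focal_loss([0.1], [1])
hard_bce = binary_cross_entropy([0.1], [1])
```

In this sketch the easy negative keeps only (0.05)^2 = 0.25% of its cross-entropy loss, while the hard positive keeps (0.9)^2 = 81%, so confidently predicted zero bits contribute far less to the total.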

Performance (MassSpecGym Test Set)

The model focuses on structural fidelity and recovering rare active substructures:

  • Sample-wise F1-Score: 28.27%
  • Effectiveness: Avoids the "all-zero" prediction trap common in sparse chemical data, where a model predicts every fingerprint bit as inactive because most bits are zero.
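For reference, a sample-wise F1 score can be computed as sketched below: F1 is calculated per spectrum over its fingerprint bits and then averaged across spectra. The function names and the convention of returning 0.0 when both vectors are empty are assumptions; the evaluation code in the repository may differ.

```python
def sample_f1(pred_bits, true_bits):
    """F1 score between one predicted and one true binary fingerprint."""
    tp = sum(1 for p, t in zip(pred_bits, true_bits) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred_bits, true_bits) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred_bits, true_bits) if p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    # Assumed convention: no active bits on either side scores 0.0.
    return 2 * tp / denom if denom else 0.0

def samplewise_f1(pred_batch, true_batch):
    """Average of per-sample F1 scores over a batch of fingerprints."""
    scores = [sample_f1(p, t) for p, t in zip(pred_batch, true_batch)]
    return sum(scores) / len(scores)

# Toy 4-bit fingerprints (the real output has 4096 bits).
true_fp = [1, 0, 1, 0]
partial = sample_f1([1, 1, 0, 0], true_fp)    # one hit, one false alarm, one miss
all_zero = sample_f1([0, 0, 0, 0], true_fp)   # the "all-zero" prediction
batch = samplewise_f1([[1, 1, 0, 0], [0, 0, 0, 0]], [true_fp, true_fp])
```

An all-zero prediction has no true positives, so its F1 is 0 whenever the true fingerprint has any active bit. This is why the metric exposes the all-zero trap that per-bit accuracy would hide.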

Key Features

  • Focal Loss Optimization: Down-weights "easy negatives" (zero bits the model already predicts confidently) so that training focuses on rare structural fragments.
  • Isotope Awareness: Uses Fourier Features to distinguish small mass shifts.
  • Attention Pooling: Learns to ignore spectral noise and focus on diagnostic fragments.
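The two architectural features above can be sketched in plain Python. The model card does not specify the frequency schedule or the pooling parameterization, so everything here is an illustrative assumption: the geometric wavelength range, `num_freqs`, the dot-product scoring, and the fixed `query` vector, which in a trained model would be a learned parameter.

```python
import math

def fourier_features(mz, num_freqs=8, min_wavelength=0.001, max_wavelength=1000.0):
    """Encode an m/z value as sin/cos pairs at geometrically spaced wavelengths.

    Short wavelengths resolve small mass shifts; long ones capture coarse mass.
    The wavelength range is a hypothetical choice, not the model's setting.
    """
    feats = []
    for i in range(num_freqs):
        frac = i / (num_freqs - 1) if num_freqs > 1 else 0.0
        wavelength = min_wavelength * (max_wavelength / min_wavelength) ** frac
        angle = 2.0 * math.pi * mz / wavelength
        feats.extend((math.sin(angle), math.cos(angle)))
    return feats

def attention_pool(peak_vectors, query):
    """Softmax-weighted average of peak vectors, scored against a query."""
    scores = [sum(q * x for q, x in zip(query, v)) for v in peak_vectors]
    top = max(scores)                          # subtract max for stability
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    dim = len(peak_vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, peak_vectors))
              for d in range(dim)]
    return pooled, weights

# A peak and the same peak shifted by roughly one 13C isotope spacing.
mono = fourier_features(100.0)
iso = fourier_features(101.00335)
shift_distance = math.dist(mono, iso)

# Two toy peak embeddings; the query favors the first dimension.
pooled, weights = attention_pool([[5.0, 1.0], [0.0, 2.0]], query=[1.0, 0.0])
```

In this toy setup the two encodings end up clearly separated even though the masses differ by about 1 Da, and the pooling weights concentrate on the peak that matches the query, which is the mechanism by which noise peaks can be down-weighted.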

Usage

For full implementation and evaluation details, visit the GitHub Repository.
