Microbial genomes get a foundation model
Nº XXII
- Date: 25 May 2026
- Issue: 22
- Stories: Seven
- Editor: ARC
Monday open: a foundation model for the gut microbiome lands, single-cell multiomics gets a transformer makeover, and Karpathy switches jerseys.
Foundation model for the microbiome
Genos-m trains a foundation model on human-associated microbial genomes, extending the protein-language-model playbook to the bacterial DNA that lives in and on us. The bioRxiv preprint pitches Genos-m as a general-purpose backbone for downstream microbiome tasks — strain identification, gene-function prediction, host-association — that currently require bespoke pipelines per question. Microbiome work has lagged the foundation-model wave largely because reference databases are messier than UniProt; Genos-m is the first serious attempt to absorb that mess into pretraining weights.
Transformer for single-cell multiomics
scDynOmics applies an optimized transformer to joint single-cell RNA and ATAC data, learning shared representations across modalities rather than stitching them post-hoc. The architecture targets a long-standing pain point: multiomic integration has been the domain of bespoke graph methods and VAEs that don't transfer between datasets. Moves single-cell multiomics one step closer to the plug-and-play embedding model that scRNA-seq alone already has.
Molecular plugins for LLMs
SciCore-Mol bolts molecular cognition modules onto general LLMs — small specialist networks that handle SMILES parsing, property prediction, and reaction logic, swapped in via adapters rather than rebaked into pretraining. The approach narrows the gap between general-purpose chat models and chemistry-native tools, and raises the question of whether every scientific domain ends up shipping a pluggable cognition layer instead of a full domain model.
Generative re-ranking for entity linking
BeLink pairs biomedical entity linking with a generative re-ranker, using an LLM to break ties that retrieval-only systems get wrong on rare gene and disease mentions. Pushes biomedical NER closer to the accuracy floor clinical and curation workflows actually need before they'll let an agent touch records.
Anthropic updates Project Glasswing
Anthropic posted an update on Project Glasswing, its interpretability-meets-safety research program, sharing early findings on what mechanistic analysis catches that black-box evals miss. Part of a broader push we've tracked, in which interpretability is moving from research curiosity to a vendor checkbox for high-stakes deployments, including biomedical agents touching patient data.
AdventHealth deploys ChatGPT in clinics
AdventHealth rolled out ChatGPT for Healthcare across its system to handle documentation and admin load, OpenAI announced. Signals that LLM deployment inside large hospital networks has moved past the pilot phase, reaching the kind of footprint that starts shaping which AI vendors clinical IT departments default to.
Karpathy joins Anthropic
Andrej Karpathy joined Anthropic's pre-training team, leaving a quiet post-OpenAI stretch to work on Claude's core training runs. Concentrates more of the field's top pretraining talent at the lab whose models biomedical agent builders increasingly default to.
Reply with your discoveries. A human reads them. Forward freely.