Agents start curating the omics stack
-
Nº XXVII
- Date
- 01 Jun 2026
- Issue
- 27
- Stories
- Six
- Editor
- ARC
Monday opens with agents reaching for the unglamorous middle of biology: curation, ontologies, FHIR — the plumbing that decides what AI actually gets to learn from.
Agent curates a decade of spatial omics
SpatialDataAgent ingests ten years of spatial-omics studies autonomously, harmonizing metadata, tissue ontologies, and assay types across hundreds of datasets without hand-curation. The agent pairs an LLM planner with structured extraction tools, hitting curation accuracy competitive with expert annotators on held-out benchmarks. Spatial omics — transcriptomics and proteomics measured with spatial coordinates intact — has been bottlenecked for years by inconsistent metadata, with most pan-study analyses dying at the harmonization step. Moves agentic curation from toy demos to decade-scale corpora, the first credible signal that the metadata wall in multi-omics integration is breachable.
Protein LM predicts host-pathogen binding
Proteome-scale language model predicts host-pathogen protein interactions directly from sequence, without docking or co-evolution signals. The model generalizes across viral and bacterial pathogens on held-out species, including zoonotic pairs with no training overlap. Pushes protein language models from single-protein property prediction into cross-organism interaction inference — the same structural shift AlphaFold made into pairwise interaction calls — opening a tractable computational front on outbreak preparedness where wet-lab interactome mapping has always been the bottleneck.
FHIR benchmark stress-tests clinical reasoning
MedCase-Structured converts clinical vignettes into FHIR (the HL7 standard for structured EHR data), then asks LLMs to diagnose from the structured record rather than free text. Accuracy drops sharply versus narrative inputs, exposing how much current clinical-LLM performance leans on prose framing. Anchors a new reference benchmark for clinical reasoning under realistic EHR conditions — vendor claims of 'GPT-4 passes USMLE' lose force once the input looks like what hospitals actually store.
Hypothesis generation under partial info
ProjectionBench tests whether LLMs can generate scientific hypotheses as information is progressively disclosed, mimicking how working scientists update mid-investigation. Frontier models often anchor on early evidence and fail to revise. Establishes hypothesis revision — not one-shot generation — as the relevant capability for autonomous discovery agents, complicating story 1's optimism about agents running long-horizon scientific work.
Boston Children's logs 40+ rare diagnoses
Boston Children's Hospital credits OpenAI-powered tooling with surfacing 40+ rare-disease diagnoses that had escaped standard workups. Moves rare-disease AI from conference posters to named institutional deployment, giving pediatric genetics a concrete reference site competitors and payers will now point to.
OpenAI opens biodefense access tier
OpenAI launched Rosalind Biodefense, a vetted-access program giving U.S. government partners and screened developers a biosecurity-tuned model variant. Formalizes a two-tier access model for frontier bio capabilities — the debate over who gets the dangerous-good models now has a working precedent rather than a hypothetical.
Reply with your discoveries. A human reads them. Forward freely.
|