We’re announcing new products and capabilities in structured data curation and delivery:
- A 30M cell observational atlas spanning 150 diseases, 200 tissues and 27 technologies
- A partnership with expert data curators Pythia Biosciences and Miraomics
- An agentic human-in-the-loop Python framework for mass scRNA-seq curation
We’re approaching a new era of biology too complex to navigate with unaided human cognition. The industry needs large volumes of structured molecular data to develop a new class of foundation models and build large atlases. It is becoming clear that data is the oil of modern biotech.
Latch is addressing this need by building curation infrastructure, tools and delivery portals for data solution providers. By equipping our partners with powerful new tools, our ambition is to organize the public molecular information scattered across the Internet and provide it for immediate download on a usage basis.
A 30 Million Single Cell Atlas
The need for large-scale data has been met initially by purpose-built industrial data-generation projects. These efforts are incredible but do not sample a sufficiently broad observational space, especially for rare indications.
Public datasets remain the largest and most diverse reservoir of diseases, tissues and patients. For indications with small patient populations or for complex diseases demanding fine-grained stratification, statistical models must draw on these niche biological states to achieve translational utility.
Our first 30M cell atlas represents thousands of hours of human labor, the combined efforts of the Pythia Biosciences, Miraomics and LatchBio curation teams. It spans a broad space of relevant biology: 150 diseases, 200 tissues and 27 technologies.
This atlas is available in a public portal. Customers can search by study title, abstract or controlled ontology terms and immediately download their data with transparent pricing.
A Partnership with Data-Focused Solution Providers
LatchBio is a platform for solution providers. Our team builds compute infrastructure, tools, delivery portals and white labeling features so our customers can focus on their strengths: developing new assays, kits and services.
Our partnership with Miraomics and Pythia Biosciences expands our focus to data delivery. They bring their expertise in data curation. We provide them with new technology to increase curation throughput and quality and to distribute their services to more customers.
An Agentic Human-in-the-Loop Framework for Curation
Recent advances in foundation models and agentic workflows show promise in autonomous scientific reasoning and software development. We hypothesized the structure of the curation problem was particularly suited for these emerging technologies.
We’re also introducing latch-curate, an agentic Python framework that guides an expert scientist through an ordered, step-by-step curation lifecycle and helps them perform tasks like count matrix construction, cell typing and metadata harmonization with greater efficiency and accuracy.
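To make the idea of an ordered, human-in-the-loop lifecycle concrete, here is a minimal Python sketch. It is not latch-curate's API: the step names, the `CurationRun` class and the `propose`/`review` callbacks are all hypothetical, chosen only to illustrate the pattern of an agent proposing a result and an expert approving it before the next step unlocks.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical step names, borrowed from the tasks named in the post.
# They are illustrative and do not reflect latch-curate's real interface.
STEPS = ["construct_counts", "type_cells", "harmonize_metadata"]


@dataclass
class CurationRun:
    """Tracks which steps of a strictly ordered lifecycle have been approved."""

    approved: List[str] = field(default_factory=list)

    def next_step(self) -> Optional[str]:
        # The next step is the first one that has not been approved yet.
        done = len(self.approved)
        return STEPS[done] if done < len(STEPS) else None

    def run_step(
        self,
        propose: Callable[[str], str],       # stand-in for the agent
        review: Callable[[str, str], bool],  # stand-in for the expert
    ) -> bool:
        step = self.next_step()
        if step is None:
            return False  # lifecycle already complete
        proposal = propose(step)
        if review(step, proposal):
            self.approved.append(step)  # only approval advances the run
            return True
        return False  # rejected: the same step is retried next time


if __name__ == "__main__":
    run = CurationRun()
    # Auto-approving reviewer, used here only to show the ordering.
    while run.run_step(lambda s: f"draft for {s}", lambda s, p: True):
        pass
    print(run.approved)
```

The design choice the sketch highlights is that a rejected proposal never advances the run, so the expert stays in control of every stage while the agent does the drafting.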
A detailed description of the system design and function can be found in this whitepaper. Reach out to our team for access to this framework.
Our team is building many interesting products at the intersection of molecular data curation and delivery. If you are a solution provider interested in trying our new tools or hosting your data on our portal, please reach out to kenny@latch.bio.