AI virtual cells: shaping the future of biology and human health
AI and humans in focus
"Modelling a human cell is complex. It contains more than 20,000 genes and 6 billion protein molecules, all interacting dynamically in a microscopic space. Traditional mechanistic cell models can't capture this scale and complexity," says Wei Ouyang, head of the AICell Lab at KTH.
His lab aims to address this challenge by developing the Human Cell Simulator, a data-driven, generative AI model that could change the way we approach biology, drug discovery and personalized medicine.
The inspiration for the Human Cell Simulator comes from real-world problems in research areas such as cancer and the COVID-19 pandemic. Despite advances in computational biology, there remains a gap in our ability to fully simulate a human cell. This is where Wei Ouyang's team believes AI can step in. Recent breakthroughs in generative AI, such as diffusion models, offer new ways to simulate complex biological systems. "These technologies can help create fully data-driven models of human cells, allowing researchers to study cellular processes more efficiently than ever before," he says.
There is still a data gap
While the potential of AI in the life sciences is huge, the path forward remains uncertain and fraught with challenges. Wei Ouyang points out that building a whole human cell model requires vast amounts of biological data. Large AI models like ChatGPT are trained on trillions of words, whereas current biological datasets, such as the Human Protein Atlas, contain on the order of 100,000 high-quality cell images. Expanding these datasets to cover a wider variety of cell types and conditions will be crucial for driving further advances in AI-driven research.
In addition, the computing power required to process and analyse this data is immense.
"Training such large AI models requires hundreds of GPUs [graphics processing units] and sophisticated data management systems," explains Wei Ouyang.
The lab is addressing this challenge by developing platforms such as REEF - an AI-powered microscopy imaging farm that autonomously acquires data - and Hypha, a system for managing and streaming data to GPU clusters such as Berzelius for training AI models. Together, these tools are fundamental to the success of the Human Cell Simulator project.
Towards personalized medicine
If successful, the Human Cell Simulator could have a transformative impact on medicine. One key application is in personalized medicine, where patient-specific data could be used to create customized cell models. In cancer treatment, for example, doctors could simulate how a patient's cells respond to different drugs, leading to more accurate treatments.
In drug development, virtual experiments performed on AI-modelled cells could accelerate the discovery of new treatments. Traditional lab-based experiments are both costly and time-consuming, but in silico experiments [performed by computer simulation] using AI-driven models could dramatically reduce these barriers, making the process more efficient and accurate.
What's next for the AICell Lab?
The lab is also working on long-term projects that could take AI-driven research even further. The next big step is the development of 'schema agents', autonomous AI systems powered by large language models like ChatGPT and designed to plan and run experiments.
"These AI agents could operate instruments, perform real-time data analysis and generate insights to inform the next round of experiments, creating a continuous cycle of scientific discovery. Our ultimate vision is to develop AI life scientists - autonomous systems capable of independently driving scientific exploration, which could revolutionize research by accelerating experimentation. It's an incredibly exciting step for the future of AI in the life sciences," concludes Wei Ouyang.
Text: Marta Marko Tisch (martamt@kth.se)