AI for Science: a paradigm shift for scientific discovery and translation
15 April 2024
By Theresa Mayer, PhD
To harness the power of AI for science and society, we need to change the fundamental way scientific research is conducted
From the landmark US CHIPS and Science Act legislation to major initiatives such as Canada’s Acceleration Consortium, leaders around the world are making big investments in scientific research and innovation to solve today’s grand challenges and those yet to come. To meet the growing urgency of our time, the fundamental way scientific research is conducted in the future must be transformed by advancing and connecting recent breakthrough technologies across the physical and virtual worlds, from remotely controlled, high-throughput automated experimental laboratories at scale all the way to new and powerful foundational AI and physics-based models driven by high-performance computing and rich data resources not previously available.
By harnessing the integrated power of these technologies, AI for Science has the potential to accelerate the pace of innovation like never before. New AI algorithms and models will bring unparalleled capabilities to assist scientists with analyzing huge and complex datasets and identifying patterns that guide their decisions and experimental designs. This will unlock groundbreaking solutions that currently elude scientists, who have been largely limited to making decisions using real-life experience and focused datasets. In parallel, automated scientific experiments will deliver many rich new datasets with the requisite scale and reproducibility to close the experimental loop for AI for Science. These experimental platforms will democratize access and participation across a broad and inclusive population, enhance interdisciplinary and nimble collaboration around the world and speed up translation from scientific discovery into practice. The potential benefits to society are vast: dramatically reducing the time and cost to discover new lifesaving drugs; developing new high-performance materials for clean energy, buildings, and manufacturing; and advancing cellular agriculture for improved food production, to name a few.
This post is from the Not Alone newsletter, a monthly publication that showcases new perspectives on global issues directly from research and academic leaders.
The opportunities today are reminiscent of a period of rapid technological change nearly four decades ago. At that time, an unsolicited proposal to the National Science Foundation unleashed the era of high-performance computing in the US. The proposal gave rise to large and coordinated investments to create a network of regionally distributed supercomputing centers. The centers dramatically expanded access to advanced computing capabilities at a scale not previously possible and helped spark the growth of regional innovation hubs. Since then, sustained investments in hardware, networking capabilities, user-facing services, and workforce have catalyzed an even more robust and fully integrated ecosystem of advanced cyber-infrastructure resources that now benefits research institutions across the country regardless of their type, scale or geography.
3 pillars of investment in a critical research and education infrastructure
Today, strategic investments in AI for Science have the potential to launch an even more powerful era of discovery and translation. Realizing the full potential of AI for Science will require strong leadership together with significant and sustained investment, by both the public and private sectors, across three pillars of a new, transformative critical research and education infrastructure:
High-performance computing and data facilities with high-performance GPU and CPU computing at scale that facilitates the use of large, multimodal data sets and AI-enhanced decision making, alongside attendant hardware, methodologies, and training innovations, with safety and security intentionally designed and integrated throughout;
Remote-access automated science laboratories with code-driven, fully integrated, robotically controlled experimental instruments for biology, chemistry, and materials science deployed at scale, enhancing the practice of science through fully traceable, high-throughput experimental research workflows and delivering high-fidelity experimental data and metadata to the cloud for open-access use; and
AI STEM education and workforce training designed to equip a diverse and inclusive AI and science talent pool across all levels with skills and experience needed to thrive at the forefront of an economy forever changed by AI for Science.
Benefits of AI for Science
AI for Science will be built on a foundation of these enabling technologies and capabilities, leaving behind outdated scientific discovery processes that have been largely based on intuition-driven decision making using limited data. It will enable a broad range of additional benefits, including:
Acceleration, reproducibility and accuracy: By combining AI, machine learning and high-throughput laboratory automation with safety and security incorporated from the outset, experiments can be designed and conducted far faster than ever before and with unmatched accuracy and reproducibility. This can address the reproducibility crisis in science and create viable regulatory pathways, reducing bottlenecks in scientific progress and technology adoption.
Stronger, more equitable and nimble collaborations: By creating a unified, remotely accessible network of infrastructure for computation, experimentation and open data, AI for Science facilitates international collaboration among researchers and educators working at the intersection of biology, chemistry, materials science, computer science and engineering. It also enables robust and nimble private-public cross-sector partnerships.
Larger impact for the funds invested: By reducing the duplication of costly infrastructure and providing remote access to large integrated and automated toolsets, complex scientific research workflows can be run at much lower cost and without geographic boundaries. This provides opportunities for more equitable and democratized access to the most advanced capabilities across all education levels, types of schools and communities.
New technologies and industry: By creating new AI-enabled technologies, AI for Science will accelerate the pace of innovation, seeding entirely new industries and associated workforce and job opportunities. This will provide economic advancement opportunities for individuals across all levels of educational attainment and entrepreneurial capacity.
An agenda to advance AI for Science
Delivering on AI for Science in the next decade will be shaped in large part by a commitment to collectively define and advance an integrated strategy across the three pillars of infrastructure. Initial concepts and pilots are under active discussion and development. Here are some examples.
An integrated network of AI computing and data resources
To advance AI for Science, researchers from universities, nonprofits, mid-sized and small companies, and start-ups require access to unprecedented levels of high-performance computing and data resources for foundation AI and physics-based models and computation. In the US, the Implementation Plan for the National AI Research Resource (NAIRR) provides a framework for public and private investment in the major components of a next-generation shared infrastructure for AI, including state-of-the-art computing clusters along with access to data, software, models, training and user support services. A two-year pilot was launched to expedite a proof-of-concept of the NAIRR vision, while future investments are assembled to democratize access.
A unified interoperable network of automated experimental laboratories
AI for Science requires a corresponding focus and investment in remote-access automated science labs for experimental biology, chemistry and materials research. A network of such labs would support complex, high-throughput experimental research workflows with unified and interoperable collection and storage of high-fidelity datasets. A focused agenda on the science of AI safety and security must be embedded from the start, with early pilot labs serving as testbeds to apply, measure and test responsible AI methods. The network would democratize access to advanced experimental capabilities for researchers, educators and entrepreneurs at institutions of any size or location across the US and the world.
Automated labs such as these are being developed and deployed. In recent years, leading biopharmaceutical, chemical and materials companies have made significant advances in deploying labs customized to address their market needs. Building on our pioneering advances in robotics, automation and AI, Carnegie Mellon University (CMU) has become the first academic institution to build a large-scale automated science lab on a university campus, which will open for shared use later this spring. The CMU Cloud Lab will provide 24/7 remote access to over 200 scientific instruments that are integrated by a common software platform developed by the Emerald Cloud Lab, a company founded by two CMU alumni. The lab leverages robotics and human-in-the-loop technical support to run up to 80 different high-throughput research workflows at the same time. It is enhanced with auxiliary sensors to enable end-to-end traceability of the workflows and to collect and store comprehensive, high-fidelity datasets for subsequent analysis and use in AI models.
AI for science education and workforce training
Already, progress in AI-enabled scientific discovery, translation and high-throughput experimentation is creating entirely new career pathways and providing economic advancement opportunities for individuals across the full range of educational attainment levels. As AI for Science advances, it has the potential to fundamentally transform the way science education is delivered. For example, students will transition away from learning tedious and time-consuming experimental protocols to writing computer code to remotely run experimental workflows, and from designing experiments by intuition using limited data sets to AI-augmented design using large multimodal data sets. Additionally, education should infuse responsible AI methods and tools throughout the curriculum.
CMU is leading the way by offering the first master’s degree in Automated Science and delivering undergraduate chemistry courses in the Cloud Lab. This work includes a pilot with Morehouse College to co-develop curricular materials and processes for delivery at Historically Black Colleges and Universities (HBCUs) and Minority Serving Institutions (MSIs) across the US. A second pilot is underway with an initial group of high schools, two- and four-year colleges and universities in the Pittsburgh region to increase access across educational levels. Efforts such as these will also ensure that AI for Science will open doors to well-paying jobs with new technical degrees and credentials in adjacent areas such as robotics and automation, software and IT systems, and lab operation and management.
Conclusion
AI for Science is the future of science. The path forward is clear, and the possibilities are game-changing. Let’s seize the moment for a better world through scientific and technological innovation.
Is academia ready for AI governance?
Not according to a new study by Elsevier and Ipsos: View from the Top: Academic Leaders’ and Funders’ Insights on the Challenges Ahead. Of the academic leaders interviewed, 64% said AI governance is a high priority but only 23% are well prepared for this challenge.