For the past two years, the U.S. National Science Foundation's National Artificial Intelligence Research Resource (NAIRR) pilot program has supported more than 700 research projects across the nation, spanning areas such as protein prediction and infectious disease outbreak management.
NVIDIA contributed to the NAIRR pilot through cloud-based resources that give researchers dedicated access to a minimum of four NVIDIA DGX nodes for at least a month. NVIDIA also provided technical support to onboard and assist researchers throughout their projects. With dedicated resources built on NVIDIA's AI infrastructure and DGX reference architecture, researchers have shortened workflow timelines and developed technologies that could reshape industries such as healthcare, agriculture and energy.
One major initiative is led by Polymathic AI, a coalition of international scientists from Flatiron Institute, Cambridge University and Lawrence Berkeley National Lab. With the help of NVIDIA GPUs and NVIDIA NVLink interconnect technology, this group is strengthening physical, fluidlike simulations with its large-scale dataset called the Well. The dataset is being used to train the largest and most broadly applicable foundation model for fluidlike behavior to date, named Walrus, which has been made publicly available along with its data, code and pretrained weights. The approach addresses current limitations in scale and pretraining diversity, and the team plans to explore scaling laws to accelerate development of more powerful foundation models for scientific applications.
At the University of Michigan, Professor Venkat Viswanathan in the Department of Aerospace Engineering leads research on a model-fusion framework that combines domain-specific molecular AI with general-purpose large language models. The goal is to help computational scientists more easily explore chemical space, ask chemistry-specific questions in natural language and identify promising materials for next-generation energy technologies. The family of molecular foundation models, called MIST, or Molecular Insight SMILES Transformers, is designed for discovery and exploration across chemical space. MIST models were pretrained on large unlabeled molecular datasets and use a novel tokenizer called Smirk to better capture nuclear, electronic, geometric, isotopic and stereochemical information from molecular representations. MIST models have been fine-tuned on more than 400 structure-property relationships and can match or exceed state-of-the-art performance across benchmarks spanning electrochemistry, quantum chemistry, physiology and other domains. MIST was developed on a 40-GPU NVIDIA DGX cluster that the researchers accessed through a NAIRR allocation, along with an additional 200,000 NVIDIA GPU hours on ALCF's Polaris cluster. Fusing MIST with general-purpose LLMs makes accurate quantum-chemical calculations more broadly accessible and accelerates the design of the energy storage and conversion systems needed to enable widespread electrification of transportation in the heavy-duty and aviation sectors.
Boston University's Hariri Institute for Computing and the Center on Emerging Infectious Diseases are training and evaluating an LLM using NVIDIA accelerated compute through an AI pipeline to support an outbreak monitoring program called BEACON, or Biothreats Emergence, Analysis and Communications Network. The LLM is being trained on a large corpus of documents on infectious diseases and epidemic-prone priority pathogens to support the field experts and outbreak analysts working on BEACON. The model will be capable of analyzing online posts about emerging disease outbreaks on a global scale to extract features for downstream categorization and prioritization. BEACON processes signals from a variety of sources, including the global disease-tracking platform HealthMap, news and social media feeds, subject-matter experts and individual communications via community boards or social media, to generate concise outbreak reports. These analyses can inform clinical practice guidelines for emerging infectious diseases and identify gaps where further data is needed. Ioannis Paschalidis, director of Boston University's Hariri Institute, noted that infectious disease experts used to spend several hours composing outbreak reports, but with the BEACON pipeline, producing a report now takes roughly two minutes. Internationally deployed doctors, government organizations and academic researchers are already using the BEACON model to quickly identify and treat infectious diseases.
Many other universities including Harvard, Stanford and Colorado State University are pioneering scientific breakthroughs with the help of NAIRR and NVIDIA. With scientists gaining broader access to AI and accelerated computing, innovation for a safer and healthier nation is more tangible than ever.