Pegasus AI is the next-generation, AI-driven scientific workflow management system, developed by a multi-institutional team funded by the U.S. National Science Foundation.
ML models to predict resource needs and optimize execution.
Real-time anomaly detection and automatic plan adjustment.
AI-augmented tools for creation, monitoring, and debugging.
Curated datasets to advance cyberinfrastructure research.
Pegasus AI acts as the intelligent orchestration layer across the scientific computing continuum. By bridging human intent with distributed resources, it creates a seamless flow between data generation and large-scale execution.
ML models that predict task runtime, memory usage, and resource needs to enable smarter scheduling and planning decisions.
Real-time performance tracking and debugging tools that detect anomalies, surface bottlenecks, and support live workflow introspection.
LLM-powered tools that help scientists compose, validate, and refine scientific workflows through natural language interaction.
Educational integration of Pegasus AI into academic curricula, enabling students to engage with real-world scientific computing.
Implementation within Pegasus's modular architecture to ensure compatibility with the NSF cyberinfrastructure (CI) ecosystem.
REAL-WORLD PEGASUS AI WORKFLOWS
A quick-start template demonstrating how to sketch and prototype Pegasus AI workflows using a minimal, expressive style.
A bioinformatics workflow that searches the NCBI Sequence Read Archive (SRA) at scale using distributed Pegasus execution.
Seismic data processing pipeline for large-scale earthquake simulation and ground motion analysis across HPC infrastructure.
Remote sensing and ML pipeline for monitoring agricultural crop health using satellite imagery and distributed compute.
Environmental data workflow for ingesting, processing, and analyzing air quality sensor data at regional and national scale.
A centralized hub for discovering, sharing, and exploring Pegasus AI workflows across the community.
NSF-funded program providing unified access to over 20 compute and storage systems.
Partnership to Advance Throughput Computing for large-scale, distributed U.S. open science.
MULTI-INSTITUTIONAL COLLABORATORS
Research · Updates · Insights