XArray NASA Open Source Tools, Frameworks, and Libraries Funding 2021 #5828
scottyhq
announced in Announcements
We've posted a copy of the full proposal here for reference: https://figshare.com/articles/preprint/Enhancing_analysis_of_NASA_data_with_the_open_source_Python_Xarray_library/16689265
🎉 Xarray was selected as one of 8 open-source projects for funding from NASA's Open Source Tools, Frameworks and Libraries program in 2021 🎉
The proposal was written by @scottyhq, @dcherian, @JessicaS11, @andersy005, and @snowman2. In this announcement, we're copying the project summary text, which is also available as a PDF along with other funded project summaries here.
We welcome feedback and ideas in this discussion thread. We will post related announcements here and track development supported by this funding in the Xarray issue tracker (for example, #4648)!
Project Summary
The accelerating data deluge from modern sensors presents an unprecedented challenge for all four NASA Science Mission Directorate (SMD) science divisions. SMD collectively stores over 100 Petabytes (PB) of data and estimates generating an additional 100 PB per year within the next five years. Consequently, science today requires software that enables expressive and easily parallelized workflows on gigabyte- to petabyte-sized datasets. Xarray is an actively developed open source library that reduces the barriers to analyzing NASA datasets at scale, leading to greater scientific return and faster discoveries.
Xarray’s impact on the scientific community is significant, growing steadily since the library’s public release in 2014. Xarray provides scientists with a powerful interface for parallelized computation with multi-dimensional raster datasets (e.g. image stacks), which are prevalent today across all scientific domains. The scalability that Xarray unlocks plays an essential role in NASA’s transition to hosting large data archives on public cloud computing infrastructure.
High-impact scientific studies relying on Xarray’s capabilities span many topics including ocean dynamics, glacier dynamics, atmospheric and climatological change, and satellite-derived snow properties. We aim to significantly expand the use of Xarray for research using NASA data and sustain development of the library through specific maintenance and outreach activities:
Science users of Xarray integrate multiple Python libraries to achieve results, requiring integration and benchmarking tests to ensure error-free, stable, and performant code. We will greatly expand an existing limited benchmarking test suite that focuses on performance of Xarray for common operations on NASA data.
Domain-specific extensions of Xarray are necessary to meet the needs of unique NASA remote sensing data (imagery swaths, laser altimeters, etc.). We will decouple and expand upon geoscience-specific functionality such as Coordinate Reference System (CRS) management, reprojection, and clipping in the RioXarray extension library.
There is a critical need for documentation of complete workflows that illustrate both core functionality and domain-specific applications of Xarray. We will create novel, interactive documentation for scientific workflows that focus on scalable distributed computing for domain-specific analysis tasks with NASA data.
The paradigm shift towards cloud computing and community-developed software is a major change for scientists accustomed to operating on small local files with custom code. We will address this socio-technical challenge by hosting public "Xarray for NASA data" monthly virtual office hours along with a regular online Xarray tutorial series.
Ultimately, the success of this project will be measured by increased adoption of Xarray among scientists using NASA data for research. This is a transitional moment in which the scientific community can either be stymied by data management or empowered with enhanced open-source software tools such as Xarray. Our tasks will enable the SMD community to fully unlock Xarray’s potential for efficiently exploring petabyte-scale NASA data, accelerating scientific discoveries in this age of cloud computing and big data.