What is the relationship between hydrofunctions, Ulmo, dataretrieval, HyRiver, and others? #79
Comments
I think this is a good summary, Martin. I would actually be keen to transfer my maintenance efforts to supporting your script. I personally haven't contributed to other repositories because I don't feel experienced enough to make meaningful improvements. Everyone has their own way of accessing the NWIS services and making them available through commands. I am used to the commands in my library and know exactly what they are doing, but I wouldn't be opposed to collaborating. I like how you have added tests and good documentation to your library. I am also an earth-science educator, but just for introductory geology.
Thanks Martin, The summary of HyRiver is good enough. As you said, I mostly focus on watershed data. Your package was the reason that I didn't add coverage for all NWIS services and just added daily data, which I needed for my research at the time. I designed the HyRiver project with extensibility in mind. Each of the six packages in this software stack has a specific purpose, works as a standalone project, and can be used by other packages. For example, PyGeoOGC and PyGeoUtils are the engines of the project that all the other packages rely on for generating queries and converting responses to dataframes for other web services. These two packages are general and can be used for any geospatial web service. Regarding coordination for further development, I agree. @emiliom created the hydro_pycommunity repo for this purpose. I think a good starting point would be establishing a framework for the projects that are within the scope of this collaboration. For example, we can create a repo that provides guidelines for developers starting a new project, such as a categorized list of existing efforts (an awesome-style repo) and steps for creating a new project. We can come up with a template (maybe using cookiecutter) so that projects meet some minimum requirements for linting, documentation, etc. The README file should include specific sections, for example Motivation, Scope, Usage, Installation, Credits, etc.
hi @mroberge et al! i just wanted to mention that i just received funding for @pyOpenSci and we will be starting an effort to help subcommunities organize for exactly this purpose. we also have considered needs such as finding other maintainers and such and i'm super open to what exactly the needs are to better support open source python. I'm a little under the weather this week with what has happened in my town of Boulder, but wonder if there is a way in a few weeks to circle back and check in on whether pyopensci could facilitate helping you and this group build some community around your (and our) tools. I use hydrofunctions in my courses and really appreciate the package and the effort it takes to maintain a package like this.
Hi @lwasser! Congratulations on getting funding! I've been following earthlab since @mbjoseph contacted me. I'm interested! @pyOpenSci looks like a great idea and I would be happy to contribute and work with you.
Thanks @mroberge. With respect to pywaterinfo, it is indeed a (small) Python wrapper around the API used by the Flemish environmental agency to access the data available on https://www.waterinfo.be/Meetreeksen/ (they provide stream data, but also water quality parameters). In terms of maintenance and development, I cite @thodson-usgs
I'm certainly interested in a more community-oriented approach. I do have the impression pywaterinfo is the only package oriented toward non-USGS data, but we can always see at which level some common ground can be found. For example, could we agree on an output (dataframe) format that aligns with the other packages, so users can easily reuse a given analysis on data sets from different sources (waterinfo, USGS,...)? The documentation/cookiecutter-template/... guidelines described by @cheginit are a very useful proposal, but I think we should build on the excellent work @lwasser and @pyOpenSci are already doing instead of defining a new/separate set of guidelines. Looking forward to further collaboration.
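To make the shared output format idea concrete, here is a minimal sketch of what a common long-format dataframe could look like. Nothing here has been agreed on by the packages in this thread: the column names in `COMMON_COLUMNS`, the helper `to_common_format`, the raw column names, and the site identifiers are all hypothetical placeholders.

```python
import pandas as pd

# Hypothetical shared schema; the packages discussed here have not agreed on one.
COMMON_COLUMNS = ["datetime", "site_id", "variable", "value", "source"]


def to_common_format(df, *, time_col, value_col, site_id, variable, source):
    """Reshape one package's raw output into the assumed shared long format."""
    out = pd.DataFrame(
        {
            # Normalize every timestamp to timezone-aware UTC.
            "datetime": pd.to_datetime(df[time_col], utc=True),
            "site_id": site_id,
            "variable": variable,
            # Values that cannot be parsed become NaN instead of raising.
            "value": pd.to_numeric(df[value_col], errors="coerce"),
            "source": source,
        }
    )
    return out[COMMON_COLUMNS].sort_values("datetime").reset_index(drop=True)


# Made-up raw tables standing in for what two different packages might return.
usgs_raw = pd.DataFrame(
    {"dateTime": ["2021-05-26T16:00:00Z", "2021-05-26T16:15:00Z"],
     "00060": ["120", "118"]}
)
waterinfo_raw = pd.DataFrame(
    {"Timestamp": ["2021-05-26T16:00:00Z", "2021-05-26T16:15:00Z"],
     "Value": [0.52, 0.55]}
)

combined = pd.concat(
    [
        to_common_format(usgs_raw, time_col="dateTime", value_col="00060",
                         site_id="SITE_A", variable="discharge", source="USGS"),
        to_common_format(waterinfo_raw, time_col="Timestamp", value_col="Value",
                         site_id="SITE_B", variable="stage", source="waterinfo"),
    ],
    ignore_index=True,
)
```

With every source reduced to the same columns, an analysis written against `combined` would not need to know which package fetched the data. Units are left out of this sketch and would need their own column or convention.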
@mroberge @stijnvanhoey this all sounds great to me! i am guessing we will begin real work in May or June but i'd love to see how PyOS can work with you both and this community. we also have a cookie cutter -- and will be developing better standards in our contributing & dev guide over the next year. i'd love to get this community's input as we develop resources to support exactly this use case!
Hi all, I got invited to give a 15-min (virtual) talk at Pangeo Showcase about HyRiver, tomorrow (May 25th) at 4 pm EDT. I am going to talk about the state of the project and its future direction. I think it would be a good opportunity to meet and have a discussion. I would be happy to see you there. Edit: The correct date, as Martin mentioned, is May 26th, 4 pm EDT.
Looking forward to seeing it! (4pm Wednesday, May 26)
@mroberge this is an interesting issue, it's good to see so much community engagement! @cheginit told me about this issue and suggested I mention the project I help develop and maintain, HydroTools, at the Office of Water Prediction. HydroTools is a namespace, toolbox-like package that is designed with data scientists in mind. As such, we've taken a different approach than it appears The two motivations for building our Our work does not access any NWIS services other than the instantaneous value service and we don't offer any plotting, quality control, or data resampling methods.
Thank you @aaraney ! I've been looking over HydroTools and love so many of its features:
In response to this issue DOI-USGS/dataretrieval-python#8 and comments from @emiliom @jkreft-usgs @DanCodigaMWRA
There are several open source software projects that allow you to request, parse, and analyze hydrology data from the USGS NWIS. Why are there overlapping projects? My guess is that it is a combination of scientists writing code to meet their very specific research needs, people creating projects without searching for what already exists, and people sometimes feeling uncomfortable trying to work with people they don't know yet. I'd like to work with the maintainers of other projects to eliminate some of the redundancy and improve cooperation.
My name is Martin Roberge, and I'm the author of hydrofunctions. I do research on stream hydrology and I'm an educator. I made hydrofunctions to meet my specific needs: I download lots of stream gauge data from the USGS which I then analyze inside Pandas dataframes in a Jupyter notebook. Since most of my students do not come from programming backgrounds, I have spent most of my time trying to make hydrofunctions easy for beginners to use.
My main goals for hydrofunctions are:
The problem is that I am just one person, and every hour I spend adding functionality to hydrofunctions is an hour I could have spent measuring how fast flood waves travel down a river, or whatever I'm up to that day. I would love to collaborate with someone else.
Other projects that work with NWIS data are:
ulmo.usgs.nwis.get_site_data() is the function that requests stream gauge data. Ulmo processes the original WaterML and returns a dictionary that needs further processing to use the data in a dataframe. It can be finicky when you are requesting stream gauge data, and I can't always figure out what is wrong with my requests. Emilio Mayorga @emiliom is the lead developer now.
Please let me know if anyone thinks that I have mischaracterized their project.
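As an illustration of the "further processing" step for the Ulmo dictionary mentioned above, here is a rough sketch. It assumes the dictionary is keyed by parameter code and that each entry holds a "values" list of records with "datetime" and "value" keys. That layout is an assumption for illustration and should be checked against what Ulmo actually returns. The helper `site_data_to_dataframe` is hypothetical, and the sample data is hand-made so the example runs without a network request.

```python
import pandas as pd


def site_data_to_dataframe(site_data):
    """Turn a {parameter_code: {"values": [...]}} dictionary into a dataframe.

    The input layout is an assumption; adjust the keys to match the real output.
    """
    columns = {}
    for code, record in site_data.items():
        values = record.get("values", [])
        index = pd.to_datetime([v["datetime"] for v in values])
        columns[code] = pd.Series(
            # Values often arrive as strings; unparseable ones become NaN.
            pd.to_numeric([v["value"] for v in values], errors="coerce"),
            index=index,
        )
    # One column per parameter code, aligned on the timestamp index.
    return pd.DataFrame(columns).sort_index()


# Hand-made sample in the assumed layout (not real gauge data).
sample = {
    "00060:00003": {
        "values": [
            {"datetime": "2021-05-25T00:00:00", "value": "120"},
            {"datetime": "2021-05-26T00:00:00", "value": "118"},
        ]
    }
}

df = site_data_to_dataframe(sample)
```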
I would love to hear your opinion about how these different projects could collaborate or how we could 'stake out ground' so that we don't replicate functionality. Why re-invent the wheel?