Function to transform an xarray grid into a pandas dataframe #5

Closed
leouieda opened this issue Apr 26, 2018 · 5 comments
Labels
enhancement Idea or request for a new feature good first issue Good for newcomers (doesn’t require deep knowledge of the project)

@leouieda
Member

The built-in to_dataframe method in xarray uses the coordinates as the dataframe's 2D index (a MultiIndex), so in practice it's not really spelling out all of the point coordinates as columns. What I want is to turn the grid into an xyz format with the coordinates as columns of the dataframe.
This will allow us to use the data in functions that don't like grids, like the forward modeling and inversion functions in Fatiando.

The function should be grid_to_table(grid) and it spits out a table with the columns having the correct names taken from the grid.
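A minimal sketch of what this could look like, assuming a 2D DataArray with named dimensions. Only the name grid_to_table comes from this issue; the body, the fallback column name "scalars", and the example coordinate names are assumptions for illustration, not the implementation that ended up in the project.

```python
import numpy as np
import pandas as pd
import xarray as xr


def grid_to_table(grid):
    """Flatten an xarray.DataArray into a table with one row per grid point.

    Hypothetical sketch: the name comes from the issue, the body is an assumption.
    """
    # to_dataframe refuses unnamed DataArrays, so supply a fallback column name
    # ("scalars" is an arbitrary choice made for this example).
    name = grid.name if grid.name is not None else "scalars"
    # to_dataframe puts the dimension coordinates in a MultiIndex;
    # reset_index moves them into ordinary columns and leaves a 0..N-1 index.
    return grid.to_dataframe(name=name).reset_index()


# Small example grid (made-up values and coordinate names).
grid = xr.DataArray(
    np.arange(6.0).reshape(2, 3),
    coords={"northing": [0.0, 10.0], "easting": [100.0, 200.0, 300.0]},
    dims=("northing", "easting"),
    name="gravity",
)

indexed = grid.to_dataframe()  # coordinates live in a 2-level MultiIndex
table = grid_to_table(grid)    # coordinates are plain columns next to the data
```

Here table has the columns northing, easting, and gravity, with one row per grid point, which is the xyz layout that functions expecting scattered points can consume.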

@leouieda leouieda added enhancement Idea or request for a new feature help wanted good first issue Good for newcomers (doesn’t require deep knowledge of the project) labels May 11, 2018
@jessepisel
Contributor

I've started working on this and have something that works on a few cases. I'll expand it, run it through some tests, and put in a pull request next week. I am assuming it's just a pandas dataframe with columns named after the coordinates and the variable, and an incremental index? What are your thoughts on adding support for multiple variables?

@leouieda
Member Author

leouieda commented Oct 6, 2018

Hi @jessepisel, just got back from vacation. Thanks for taking a stab at this! I really appreciate it.

I am assuming it's just a pandas dataframe with columns named after the coordinates and the variable, and an incremental index?

Yep, that was the basic premise. It's meant as a utility to help with classes/functions that don't support xarray input, like BlockReduce.

What are your thoughts on adding support for multiple variables?

That should probably be the default if given a Dataset instead of DataArray. A column for each variable should be good. This will also handle the cases where there are extra coordinates (like lat, lon on a projected grid).

But we could start off with just a single DataArray and then expand to support Dataset later on. The API wouldn't change from one to the other so it's not a problem.
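As a rough illustration of the Dataset case, the sketch below assumes the same to_dataframe plus reset_index approach. The variable names, coordinate names, and values are made up for the example. With a Dataset, each data variable becomes a column, and non-dimension coordinates (here a hypothetical longitude and latitude on a projected grid) come along as columns too.

```python
import numpy as np
import xarray as xr

# Made-up projected grid: 3 northing values by 2 easting values.
easting = [0.0, 1000.0]
northing = [0.0, 1000.0, 2000.0]
shape = (len(northing), len(easting))

dataset = xr.Dataset(
    {
        # Two data variables defined on the same grid.
        "gravity": (("northing", "easting"), np.ones(shape)),
        "height": (("northing", "easting"), np.zeros(shape)),
    },
    coords={
        "northing": northing,
        "easting": easting,
        # Extra non-dimension coordinates defined at every grid point.
        "longitude": (("northing", "easting"), np.full(shape, -40.0)),
        "latitude": (("northing", "easting"), np.full(shape, -20.0)),
    },
)

# Same idea as for a single DataArray: flatten, then move the
# dimension coordinates out of the MultiIndex and into columns.
table = dataset.to_dataframe().reset_index()
```

The resulting table has one row per grid point and a column for each dimension coordinate, extra coordinate, and data variable, so the call would look the same for a DataArray or a Dataset.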

@leouieda leouieda added this to the 1.1.0 milestone Oct 11, 2018
@jessepisel
Contributor

Any ideas on creating some tests for the PR that I just submitted?

@ahartikainen

OT. just dropping a comment here:

Hi, this is a function that should be implemented in xarray. Meaning that, given an nD object, there should be a melt / pivot method to transform nD --> 2D table.

Basically, xarray.to_dataframe() outputs data in a long format, and what we usually want is to have the data in a wide format.

This is similar to a problem we had in the arviz library.

https://github.com/arviz-devs/arviz/pull/335/files

I'm just saying, maybe there should be an effort to upgrade xarray functionality.

@leouieda
Member Author

@ahartikainen I was thinking the same thing. You are absolutely right. We could get this into Verde for now and then ping the xarray folks to see if they have interest in this. We can then port it over to them and deprecate/remove our function.

It could even be an option for to_dataframe instead of a new method, since that was the method I looked at when I first needed this functionality.


3 participants