Is this project advancing? #79
Comments
pandas is definitely one ;-) |
I think the content of this repo is somehow superseded by the pandas
roadmap:
https://pandas.pydata.org/pandas-docs/stable/development/roadmap.html
But there are good ideas here and the documentation is still valid, as a
direction for the development of pandas.
But no work specific to a pandas 2 project is happening in parallel to
pandas and Apache Arrow development.
…On Wed, Sep 25, 2019 at 8:28 AM Pietro Battiston ***@***.***> wrote:
If not, I suggest adding a note to redirect potential enthusiasts to
similar projects where contributions are needed.
pandas <https://github.com/pandas-dev/pandas/issues> is definitely one ;-)
My understanding is that these proposed changes will gradually be implemented in pandas, and there won't be a pandas2. While I see that rewriting the Block Manager is on the roadmap, I don't see the item "Building “libpandas” in C++11/14 for lowest level implementation tier" there. I am curious to know whether there was a decision to stick with Cython. If I am wrong and C++ is being embraced, I imagine that some implementations will have to coexist: methods in |
A big part of the ideas listed in this repo (certainly the page on the "data structure changes") evolved into Wes starting Arrow. So for now that is where the C++ work is being done; there are no short-term plans to do that in pandas itself (but we might start using Arrow). For the BlockManager rewrite, there is currently no concrete decision whatsoever (so also no decision to stick with Cython), except that it could be beneficial. That's a roadmap item that needs to be discussed and detailed further. We should probably update the README of this repo to better reflect this status.
pyarrow is an example of that. |
@sursu indeed one of my primary motivations in developing the Apache Arrow project (which has more or less been my primary focus since sometime in 2015) is to develop next-generation data frame internals, and to do so in a way that doesn't create another large codebase owned by a small Python-only core development team. We're developing Arrow with the help of a much larger core community. pandas has millions of users so advancing the goals from the "pandas2" discussion will take years of work to make progress without disrupting existing users. There is also the very important question of who will pay for the work. |
Just a question:
I see that the latest commit in this repository was made more than 2 years ago. Is this project meant to replace pandas, and if so, is it advancing?
This discussion on Reddit offers few answers.
The ideas proposed seem really appealing to me. Especially the "judicious and responsible use of modern C++".
If not, I suggest adding a note to redirect potential enthusiasts to similar projects where contributions are needed.