Handling of duplicate column names by join #1333
As a side note, base R and pandas may produce duplicate column names by adding a suffix, whereas dplyr is careful enough to avoid them.
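A quick pandas sketch of how suffixing can itself collide with a pre-existing column. The outcome is version-dependent: older pandas silently produced duplicate labels, while newer versions refuse such merges, so the example prints whichever happens rather than asserting one:

```python
import pandas as pd

# Both frames share the non-key column "val"; the left frame also
# already contains "val_x", the very name the default suffix creates.
left = pd.DataFrame({"id": [1], "val": [10], "val_x": [99]})
right = pd.DataFrame({"id": [1], "val": [20]})

try:
    merged = left.merge(right, on="id")  # default suffixes ("_x", "_y")
    # Older pandas: two columns are now both named "val_x".
    print("duplicate labels:", merged.columns.duplicated().any())
except Exception as err:
    # Newer pandas refuses suffixes that would create duplicates.
    print("merge rejected:", type(err).__name__)
```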
How annoying would it be to simply throw an error for duplicate column names, forcing the user to handle them manually? We could also prefix duplicate column names with the name of the table they came from, e.g. joining …
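A minimal sketch of the error-throwing option, written in pandas terms for illustration; the helper name strict_merge is hypothetical and not an API of DataFrames or pandas:

```python
import pandas as pd

def strict_merge(left, right, on):
    # Refuse to join when non-key column names clash, forcing the
    # caller to rename columns explicitly beforehand.
    clash = (set(left.columns) & set(right.columns)) - {on}
    if clash:
        raise ValueError(f"duplicate column names in join: {sorted(clash)}")
    return left.merge(right, on=on)

a = pd.DataFrame({"id": [1], "val": [1]})
b = pd.DataFrame({"id": [1], "val": [2]})
# strict_merge(a, b, "id")  # raises ValueError because "val" clashes
```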
Now we have … Here is a snippet that could be considered to fix the names by suffixing if …
CC @oxinabox - this is directly relevant to what you have asked. Do you have a preference for the API?
Note that in some joins we allow more than 2 data frames.
When working on this, the following case should be tested against, as I think it is potentially confusing: …

Also, a general question: if we rename columns with suffixes, should …
I think we should go for …

Any comments on this?
That sounds right to me.

Having to give a vector of one suffix per joined table seems like it is going to get annoying in the cases where you are just joining 2 things.

Suffix only: the choice of suffix only seems off to me. Further, depending on what your table names are, a prefix or a suffix may be more natural. I suggest instead we should accept a (vector of) functions, like … This also means we can use …
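The accept-a-function idea could look roughly like the sketch below, written in pandas terms for illustration; join_with_renamers and its arguments are hypothetical names, not an existing API. Each side supplies a renaming function applied to clashing columns before the join, so callers can choose prefixes, suffixes, or anything else:

```python
import pandas as pd

def join_with_renamers(left, right, on, lrename, rrename):
    # Find non-key columns present in both tables, rename them on
    # each side with the caller-supplied function, then join.
    clash = (set(left.columns) & set(right.columns)) - {on}
    left = left.rename(columns={c: lrename(c) for c in clash})
    right = right.rename(columns={c: rrename(c) for c in clash})
    return left.merge(right, on=on)

a = pd.DataFrame({"k": [1], "v": [1]})
b = pd.DataFrame({"k": [1], "v": [2]})
out = join_with_renamers(a, b, "k", lambda c: "a_" + c, lambda c: c + "_b")
print(list(out.columns))  # ['k', 'a_v', 'v_b']
```

A prefix on one side and a suffix on the other, as above, is exactly the flexibility a single suffix argument cannot express.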
I don't know what a technical column is.
Technical columns: currently it is only the column specified by … Regarding …
Yes, I think symmetry with …
Following the discussion in #1308 (comment), here is the status of what other packages do: pandas join has lsuffix and rsuffix arguments, pandas merge has a suffixes argument, and dplyr has suffix.

I guess we should add something similar. Any comments?
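For reference, the two pandas variants mentioned above look like this:

```python
import pandas as pd

a = pd.DataFrame({"k": [1, 2], "v": [1.0, 2.0]})
b = pd.DataFrame({"k": [1, 2], "v": [3.0, 4.0]})

# merge takes one suffixes pair covering left and right.
m = a.merge(b, on="k", suffixes=("_a", "_b"))
print(list(m.columns))  # ['k', 'v_a', 'v_b']

# join (index-based) takes separate lsuffix/rsuffix arguments,
# applied to every overlapping column name.
j = a.join(b, lsuffix="_a", rsuffix="_b")
print(list(j.columns))  # ['k_a', 'v_a', 'k_b', 'v_b']
```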