"New File" appears after "New File (1)" #3
Comments
Can you be more specific? What is Windows' order? I don't use Windows. |
Which version are you using? I'm using 3.0.1. >>> import natsort
>>> a = ["Folder (3)", "Folder (2)", "Folder"]
>>> natsort.natsorted(a)
['Folder', 'Folder (2)', 'Folder (3)'] I get the output you expect on my Mac. |
Okay, it doesn't work when including a path: >>> import natsort
>>> a=["C:\Folder\File", "C:\Folder (2)\File", "C:\Folder (3)\File"]
>>> natsort.natsorted(a)
['C:\\Folder (2)\\File', 'C:\\Folder (3)\\File', 'C:\\Folder\\File'] Even after unicode: >>> import natsort
>>> a=[unicode("C:\Folder\File"), unicode("C:\Folder (2)\File"), unicode("C:\Folder (3)\File")]
>>> natsort.natsorted(a)
[u'C:\\Folder (2)\\File', u'C:\\Folder (3)\\File', u'C:\\Folder\\File'] 3.0.1 on Python2.7 |
This is outside the scope of the natsort algorithm. When natsort parses a string, it creates a tuple of strings and numbers, and then sorts the tuples using Python's built-in mechanisms. The strings you gave would be parsed as ('C:\\Folder (', 2, ')\\File',)
('C:\\Folder (', 3, ')\\File',)
('C:\\Folder\\File',) Python sorts tuples by their first element; where several tuples share the same first element, it compares the second element, and so on. Here the first elements already differ right after "Folder": a space sorts before a backslash, so the "Folder (" entries come before "Folder\". I assume that Windows treats these cases specially to sort folders in the manner you show so that it is more user friendly. There is really no way to make a general algorithm that will do this correctly because the characters immediately following "Folder" are different. In the first case, where it isn't a full path, they are parsed as ('Folder (', 2, ')',)
('Folder (', 3, ')',)
('Folder',) In this case, "Folder" comes first because it and "Folder (" share the same leading characters, but "Folder (" has extra trailing characters. To work around this, I recommend making a list parallel to the paths that contains only the "Folder" part, and using index_natsorted to obtain the sort order: >>> paths = [r"C:\Folder\File", r"C:\Folder (2)\File", r"C:\Folder (3)\File"]
>>> names = [r"Folder", r"Folder (2)", r"Folder (3)"]
>>> index = natsort.index_natsorted(names)
>>> [paths[i] for i in index]
['C:\\Folder\\File', 'C:\\Folder (2)\\File', 'C:\\Folder (3)\\File'] Or, you could use dictionary keys >>> paths = { r"Folder": r"C:\Folder\File", r"Folder (2)": r"C:\Folder (2)\File", r"Folder (3)": r"C:\Folder (3)\File", }
>>> [paths[key] for key in natsort.natsorted(paths)]
['C:\\Folder\\File', 'C:\\Folder (2)\\File', 'C:\\Folder (3)\\File'] A third workaround is to replace "Folder" with "Folder (1)": >>> paths = [r"C:\Folder (1)\File", r"C:\Folder (2)\File", r"C:\Folder (3)\File"]
>>> natsort.natsorted(paths)
['C:\\Folder (1)\\File', 'C:\\Folder (2)\\File', 'C:\\Folder (3)\\File'] |
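For readers following along, here is a minimal sketch of the parsing described in the comment above. It is not natsort's actual implementation: `parse` is a hypothetical helper that only approximates the "tuple of strings and numbers" idea with a regular expression, but it reproduces both orderings reported in this thread.

```python
import re


def parse(text):
    """Split text into alternating string and integer chunks.

    Hypothetical stand-in for natsort's parsing, for illustration only.
    re.split with a capturing group returns strings at even indices and
    the captured digit runs at odd indices.
    """
    chunks = re.split(r"(\d+)", text)
    return tuple(int(c) if i % 2 else c for i, c in enumerate(chunks))


names = ["Folder (3)", "Folder (2)", "Folder"]
paths = [r"C:\Folder\File", r"C:\Folder (2)\File", r"C:\Folder (3)\File"]

# Bare names: "Folder" is a prefix of "Folder (", so it sorts first.
sorted_names = sorted(names, key=parse)

# Full paths: the character after "Folder" is a space or a backslash,
# and a space sorts first, so the unnumbered folder ends up last.
sorted_paths = sorted(paths, key=parse)

print(parse("Folder (2)"))
print(sorted_names)
print(sorted_paths)
```

The point of the sketch is that the unexpected order comes from ordinary string comparison on the first chunk, not from the number handling.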
Did any of these suggestions help? |
I believe index_natsorted would work for a trivial case. The problem is that each folder could contain a similarly named set of subdirectories, which would also have to be indexed and sorted. This was a major performance hit for large directory structures, even after using the relatively fast os.walk. I have decided to just leave this as a limitation, thanks anyway. |
Sorry I couldn't help. Did you try replacing "Folder" with "Folder (1)"? I'm not sure if that would help. I suppose that in the worst case, you could take the sorted list and then move the last element to the front, since at least you know that the "Folder" always gets put last. Best of luck! |
Unfortunately there is no guarantee that the name of the folders will be "Folder", and it is not really worth the effort to figure out which unnumbered folder name matches which list. Thanks! |
I realize it's been a while and you might have moved on, but I think I have thought of a way to make this work for you. You need to tell natsort to split each path into its components, for example with path.py: natsorted(paths, key=lambda x: path(x).splitall()) Or, you can adapt something from this page to do the same without the path.py dependency. I am on vacation without access to a computer, so I cannot check this, but I imagine it should work for you. I am thinking it might be something nice to include as an example in the documentation. |
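I'd like to point out that the reason this works is that each path component is then compared on its own, so "Folder" is no longer compared against the separator that follows it. I really hope this works, because if it does I think it will be really helpful to lots of people! Please let me know if you get it to work! |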
I have just verified that this works! >>> import natsort
>>> import path
>>> a = ['/p/folder/test', '/p/folder (5)/test', '/p/folder (10)/test', '/p/folder (1)/test']
>>> natsort.natsorted(a)
['/p/folder (1)/test', '/p/folder (5)/test', '/p/folder (10)/test', '/p/folder/test']
>>> natsort.natsorted(a, key=lambda x: path.path(x).splitall())
['/p/folder/test', '/p/folder (1)/test', '/p/folder (5)/test', '/p/folder (10)/test'] |
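The same idea can be sketched with only the standard library, for anyone who wants to see why splitting into components helps. This is an illustration under stated assumptions, not what natsort or path.py does internally: `natural_key` and `path_key` are hypothetical helpers, and `pathlib.PurePosixPath(...).parts` stands in for `splitall()`.

```python
import re
from pathlib import PurePosixPath


def natural_key(text):
    """Alternating string/integer chunks (hypothetical helper)."""
    chunks = re.split(r"(\d+)", text)
    return tuple(int(c) if i % 2 else c for i, c in enumerate(chunks))


def path_key(path):
    """Build one natural key per path component (hypothetical helper).

    Each component is compared on its own, so "folder" is compared with
    "folder (1)" rather than with the separator that follows it.
    """
    return tuple(natural_key(part) for part in PurePosixPath(path).parts)


a = ['/p/folder/test', '/p/folder (5)/test',
     '/p/folder (10)/test', '/p/folder (1)/test']

whole_string = sorted(a, key=natural_key)  # unnumbered folder lands last
by_component = sorted(a, key=path_key)     # unnumbered folder lands first

print(whole_string)
print(by_component)
```

For the Windows-style paths from earlier in the thread, `pathlib.PureWindowsPath` could be swapped in for `PurePosixPath`; that substitution is an assumption and was not tested in this thread.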
In the next release, I am planning on adding an option to natsort that will cause it to interpret input as paths, so that the user need not depend on path.py. My proposed API is something like this: >>> a = ['/p/folder/test', '/p/folder (5)/test', '/p/folder (10)/test', '/p/folder (1)/test']
>>> natsort.natsorted(a, as_path=True)
['/p/folder/test', '/p/folder (1)/test', '/p/folder (5)/test', '/p/folder (10)/test'] Any objections or opinions on this? |
This has been added as of commit d3bd9e4. Use the as_path option. I hope this helps. |
@catmanjan Check out version 3.4.0, which has support for sorting this correctly. |
Hello from the future. The preferred way to handle this now is the ns.PATH algorithm flag: >>> from natsort import natsorted, ns
>>> a = ['/p/folder/test', '/p/folder (5)/test', '/p/folder (10)/test', '/p/folder (1)/test']
>>> natsorted(a, alg=ns.PATH)
['/p/folder/test', '/p/folder (1)/test', '/p/folder (5)/test', '/p/folder (10)/test'] |
Not sure if there is a workaround for this, but when sorting path names the order does not conform to Windows' order.