Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A sort that doesn't remove duplicates could be useful #1163

Open
aarroyoc opened this issue Jan 5, 2022 · 7 comments
Open

A sort that doesn't remove duplicates could be useful #1163

aarroyoc opened this issue Jan 5, 2022 · 7 comments

Comments

@aarroyoc
Copy link
Contributor

aarroyoc commented Jan 5, 2022

An equivalent predicate to sort/2 that doesn't remove duplicates.

In SICStus Prolog, it seems that samsort/2 does the trick: https://sicstus.sics.se/sicstus/docs/4.1.0/html/sicstus/lib_002dsamsort.html#lib_002dsamsort

In SWI-Prolog, msort/2 and sort/4 are different options to get this behavior.

@triska
Copy link
Contributor

triska commented Jan 5, 2022

I think samsort/2 can be defined in terms of keysort/2?

:- use_module(library(pairs)).

samsort(Ls0, Ls) :-
        pairs_keys(Pairs0, Ls0),
        keysort(Pairs0, Pairs),
        pairs_keys(Pairs, Ls).

Examples:

?- samsort("bbbbccccaa", Ls).
   Ls = "aabbbbcccc"
;  false.

?- samsort([X,Y,X,Z,Y], Ls).
   Ls = [X,X,Y,Y,Z]
;  false.

keysort/2 is a powerful building block. I first saw how versatile it is in @UWN's efficient version of list_to_set/2, which can be studied in library(lists).

The nondeterminism occurs because we do not yet have JIT indexing (#192) which could help to distinguish the clauses of pairs_keys/2 based on the second (instantiated) argument. As a workaround, we can write it like this:

:- use_module(library(pairs)).
:- use_module(library(lists)).

samsort(Ls0, Ls) :-
        same_length(Ls0, Pairs0),
        pairs_keys(Pairs0, Ls0),
        keysort(Pairs0, Pairs),
        pairs_keys(Pairs, Ls).

Yielding:

?- samsort("bbbbccccaa", Ls).
   Ls = "aabbbbcccc".

?- samsort([X,Y,X,Z,Y], Ls).
   Ls = [X,X,Y,Y,Z].

@pmoura
Copy link
Contributor

pmoura commented Jan 5, 2022

Several Prolog systems provide msort/2 as a built-in predicate. A fallback Prolog definition for it from Logtalk's library is:

msort([], []) :- !.
msort([X], [X]) :- !.
msort([X, Y| Xs], Ys) :-
	split([X, Y| Xs], X1s, X2s),
	msort(X1s, Y1s),
	msort(X2s, Y2s),
	merge(Y1s, Y2s, Ys).

merge([X| Xs], [Y| Ys], [X| Zs]) :-
	X @=< Y, !,
	merge(Xs, [Y| Ys], Zs).
merge([X| Xs], [Y| Ys], [Y| Zs]) :-
	X @> Y, !,
	merge([X | Xs], Ys, Zs).
merge([], Xs, Xs) :- !.
merge(Xs, [], Xs).

Fell free to reuse/improve on it. For tests, see https://github.com/LogtalkDotOrg/logtalk3/tree/master/tests/prolog/predicates/msort_2

@aarroyoc
Copy link
Contributor Author

aarroyoc commented Jan 5, 2022

@triska I have used keysort/2 to get this behavior in #1160 but still, I'm not sure it's good to sort using pairs if pairs are not needed at all. Seems like an excessive overhead in memory.

@UWN
Copy link

UWN commented Jan 6, 2022

This is from Quintus which was merged into SICStus 4. Prior to that SICStus did not have it.

@Qqwy
Copy link

Qqwy commented Dec 5, 2024

For whatever it's worth, I'd like to mention that I find it very surprising that the default of sort is to remove duplicates.

We're of course stuck with it since it is part of the ISO, but a built-in alternative that doesn't remove duplicates would be very welcome.

@UWN
Copy link

UWN commented Dec 5, 2024

For whatever it's worth, I'd like to mention that I find it very surprising that the default of sort is to remove duplicates.

We're of course stuck with it since it is part of the ISO, but a built-in alternative that doesn't remove duplicates would be very welcome.

sort/2 as we have it exists since at least 1982, that is the (second) DEC10 manual. And there was never any other definition. It was added to the standard in Cor.2:2012 being completely uncontroversial. Not sure where you deduce from that that "we're of course stuck". For 17 years there was critique everywhere that it was not part of the standard.

@Qqwy
Copy link

Qqwy commented Dec 5, 2024

Thank you for providing some context. And apologies for my phrasing in my previous comment. The only thing I wanted to convey was that a builtin msort or samsort or other duplicate-preserving sort would be a nice addition. But re-reading my comment it seems way more critical than I meant it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants