SrcLoc sorting is non-transitive and a false total order and equality (panics in 1.81) #1126
I've added a better backtrace from a release build with debug symbols enabled. Problem is |
It seems like this is the incorrect c2rust/c2rust-transpile/src/c_ast/mod.rs Lines 671 to 686 in 9eaf8a1
Maybe this is also why the declarations were always backwards in |
I think it will end up being |
Yeah, it seems like it's in there. @fw-immunant or @rinon, do you remember what was going on here and where a non-total ordering might be coming from? c2rust/c2rust-transpile/src/c_ast/mod.rs Lines 209 to 235 in 9eaf8a1
|
I'm not familiar with this code but the last semantically relevant change here seems to be 5d3b0aa. |
"Minimal" reproducer. Anybody know of something like creduce for JSON?
use std::{cmp::Ordering, fs::File};
use serde::Deserialize;

#[derive(Copy, Debug, Clone, PartialOrd, PartialEq, Ord, Eq, Deserialize)]
pub struct SrcLoc {
    pub fileid: u64,
    pub line: u64,
    pub column: u64,
}

#[derive(Copy, Debug, Clone, PartialOrd, PartialEq, Ord, Eq, Deserialize)]
pub struct SrcSpan {
    pub fileid: u64,
    pub begin_line: u64,
    pub begin_column: u64,
    pub end_line: u64,
    pub end_column: u64,
}

impl SrcSpan {
    pub fn begin(&self) -> SrcLoc {
        SrcLoc {
            fileid: self.fileid,
            line: self.begin_line,
            column: self.begin_column,
        }
    }
    pub fn end(&self) -> SrcLoc {
        SrcLoc {
            fileid: self.fileid,
            line: self.end_line,
            column: self.end_column,
        }
    }
}

pub type FileId = usize;

#[derive(Deserialize)]
struct Temp {
    spans: Vec<Option<SrcSpan>>,
    file_map: Vec<FileId>,
    include_map: Vec<Vec<SrcLoc>>,
}

use Ordering::*;

pub fn main() {
    let file = File::open("sort_data.json").unwrap();
    let mut temp: Temp = serde_json::from_reader(file).unwrap();
    println!("{}", temp.spans.len());
    sort_top_decls(&mut temp.spans[..], &temp.file_map, &temp.include_map);
}

pub fn sort_top_decls(spans: &mut [Option<SrcSpan>], file_map: &[FileId], include_map: &[Vec<SrcLoc>]) {
    // Group and sort declarations by file and by position
    spans.sort_unstable_by(|a, b| {
        match (a, b) {
            (None, None) => Equal,
            (None, _) => Less,
            (_, None) => Greater,
            (Some(a), Some(b)) => {
                compare_src_locs(file_map, include_map, &a.begin(), &b.begin())
            },
        }
    });
}

pub fn compare_src_locs(file_map: &[FileId], include_map: &[Vec<SrcLoc>], a: &SrcLoc, b: &SrcLoc) -> Ordering {
    /// Compare `self` with `other`, without regard to file id
    fn cmp_pos(a: &SrcLoc, b: &SrcLoc) -> Ordering {
        (a.line, a.column).cmp(&(b.line, b.column))
    }
    let path_a = include_map[file_map[a.fileid as usize]].clone();
    let path_b = include_map[file_map[b.fileid as usize]].clone();
    for (include_a, include_b) in path_a.iter().zip(path_b.iter()) {
        if include_a.fileid != include_b.fileid {
            return cmp_pos(include_a, include_b);
        }
    }
    match path_a.len().cmp(&path_b.len()) {
        Less => {
            // compare the place b was included in a's file with a
            let b = path_b.get(path_a.len()).unwrap();
            cmp_pos(a, b)
        }
        Equal => cmp_pos(a, b),
        Greater => {
            // compare the place a was included in b's file with b
            let a = path_a.get(path_b.len()).unwrap();
            cmp_pos(a, b)
        }
    }
}
|
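On the creduce-for-JSON question: one possible workaround is to shrink the deserialized input in-process rather than reducing the JSON text. The sketch below is a simplified, greedy delta-debugging loop. It is not an existing tool, and `shrink` and `still_fails` are hypothetical names introduced here for illustration. It repeatedly tries to drop chunks of the input and keeps a removal whenever the failure predicate still holds.

```rust
/// Greedily removes chunks of `items` while `still_fails` keeps returning true.
/// Hypothetical helper (not part of c2rust); the result is smaller, but not
/// guaranteed to be globally minimal.
fn shrink<T: Clone>(mut items: Vec<T>, still_fails: impl Fn(&[T]) -> bool) -> Vec<T> {
    // Start with large chunks and halve the chunk size after each full pass.
    let mut chunk = (items.len() / 2).max(1);
    while chunk >= 1 {
        let mut start = 0;
        while start < items.len() {
            let end = (start + chunk).min(items.len());
            // Candidate input with items[start..end] removed.
            let mut candidate = items[..start].to_vec();
            candidate.extend_from_slice(&items[end..]);
            if still_fails(candidate.as_slice()) {
                // The failure survives the removal: keep the smaller input
                // and retry at the same offset.
                items = candidate;
            } else {
                // This chunk is needed to reproduce: move past it.
                start = end;
            }
        }
        chunk /= 2;
    }
    items
}
```

For this issue, `still_fails` could be a closure that runs a transitivity check over the remaining spans. The reduced list could then be serialized back to JSON, assuming `Serialize` is also derived for the types involved.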
Thanks for that reproduction! That's very helpful. I don't know of any analogous tool like
use serde::Deserialize;
use std::cmp::Ordering;
use std::fs::File;

#[derive(Copy, Debug, Clone, PartialOrd, PartialEq, Ord, Eq, Deserialize)]
pub struct SrcLoc {
    pub fileid: u64,
    pub line: u64,
    pub column: u64,
}

#[derive(Copy, Debug, Clone, PartialOrd, PartialEq, Ord, Eq, Deserialize)]
pub struct SrcSpan {
    pub fileid: u64,
    pub begin_line: u64,
    pub begin_column: u64,
    pub end_line: u64,
    pub end_column: u64,
}

impl SrcSpan {
    pub fn begin(&self) -> SrcLoc {
        SrcLoc {
            fileid: self.fileid,
            line: self.begin_line,
            column: self.begin_column,
        }
    }
    pub fn end(&self) -> SrcLoc {
        SrcLoc {
            fileid: self.fileid,
            line: self.end_line,
            column: self.end_column,
        }
    }
}

pub type FileId = usize;

#[derive(Deserialize)]
struct Temp {
    spans: Vec<Option<SrcSpan>>,
    file_map: Vec<FileId>,
    include_map: Vec<Vec<SrcLoc>>,
}

pub fn sort_top_decls(
    spans: &mut [Option<SrcSpan>],
    file_map: &[FileId],
    include_map: &[Vec<SrcLoc>],
) {
    use Ordering::*;
    // Group and sort declarations by file and by position
    spans.sort_unstable_by(|a, b| match (a, b) {
        (None, None) => Equal,
        (None, _) => Less,
        (_, None) => Greater,
        (Some(a), Some(b)) => compare_src_locs(file_map, include_map, &a.begin(), &b.begin()),
    });
}

pub fn compare_src_locs(
    file_map: &[FileId],
    include_map: &[Vec<SrcLoc>],
    a: &SrcLoc,
    b: &SrcLoc,
) -> Ordering {
    /// Compare `self` with `other`, without regard to file id
    fn cmp_pos(a: &SrcLoc, b: &SrcLoc) -> Ordering {
        (a.line, a.column).cmp(&(b.line, b.column))
    }
    let path_a = &include_map[file_map[a.fileid as usize]][..];
    let path_b = &include_map[file_map[b.fileid as usize]][..];
    for (include_a, include_b) in path_a.iter().zip(path_b.iter()) {
        if include_a.fileid != include_b.fileid {
            return cmp_pos(include_a, include_b);
        }
    }
    use Ordering::*;
    match path_a.len().cmp(&path_b.len()) {
        Less => {
            // compare the place b was included in a's file with a
            let b = &path_b[path_a.len()];
            cmp_pos(a, b)
        }
        Equal => cmp_pos(a, b),
        Greater => {
            // compare the place a was included in b's file with b
            let a = &path_a[path_b.len()];
            cmp_pos(a, b)
        }
    }
}

struct MappedSrcLoc<'a> {
    src_loc: SrcLoc,
    file_map: &'a [FileId],
    include_map: &'a [Vec<SrcLoc>],
}

impl PartialEq for MappedSrcLoc<'_> {
    fn eq(&self, other: &Self) -> bool {
        self.partial_cmp(other) == Some(Ordering::Equal)
    }
}

impl Eq for MappedSrcLoc<'_> {}

impl PartialOrd for MappedSrcLoc<'_> {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(compare_src_locs(
            self.file_map,
            self.include_map,
            &self.src_loc,
            &other.src_loc,
        ))
    }
}

impl Ord for MappedSrcLoc<'_> {
    fn cmp(&self, other: &Self) -> Ordering {
        self.partial_cmp(other).unwrap()
    }
}

pub fn main() {
    let file = File::open("sort_data.json").unwrap();
    let temp: Temp = serde_json::from_reader(file).unwrap();
    let Temp {
        spans,
        file_map,
        include_map,
    } = temp;
    let file_map = &file_map;
    let include_map = &include_map;
    // sort_top_decls(&mut spans[..], file_map, include_map);
    let spans = spans
        .into_iter()
        .flatten()
        .collect::<Vec<_>>();
    // spans.sort_unstable_by(|a, b| compare_src_locs(file_map, include_map, &a.begin(), &b.begin()));
    let locs = spans
        .into_iter()
        .map(|span| span.begin())
        .collect::<Vec<_>>();
    // locs.sort_unstable_by(|a, b| compare_src_locs(file_map, include_map, &a, &b));
    let mapped_locs = locs
        .into_iter()
        .map(|src_loc| MappedSrcLoc {
            src_loc,
            file_map,
            include_map,
        })
        .collect::<Vec<_>>();
    // mapped_locs.sort_unstable();
    let n = mapped_locs.len();
    for i in 0..n {
        let a = &mapped_locs[i];
        for j in 0..n {
            let b = &mapped_locs[j];
            for k in 0..n {
                let c = &mapped_locs[k];
                if a < b && b < c {
                    assert!(a < c);
                }
                if a > b && b > c {
                    assert!(a > c);
                }
                if a == b && b == c {
                    if a != c {
                        dbg!(a.src_loc);
                        dbg!(b.src_loc);
                        dbg!(c.src_loc);
                    }
                    assert!(a == c);
                }
            }
        }
        if i % 10 == 0 {
            println!("{i}");
        }
    }
}
> cargo run --release
...
[src/main.rs:177:25] a.src_loc = SrcLoc {
    fileid: 1,
    line: 31,
    column: 1,
}
[src/main.rs:178:25] b.src_loc = SrcLoc {
    fileid: 2,
    line: 214,
    column: 1,
}
[src/main.rs:179:25] c.src_loc = SrcLoc {
    fileid: 1,
    line: 32,
    column: 1,
}
...
|
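The failing triple above can be reproduced without `sort_data.json`. The following is a minimal sketch under stated assumptions: the include map is made-up data (file 1 is included from a hypothetical file 3, file 2 from a hypothetical file 4, both at line 5, column 1), the `file_map` indirection is dropped, and `find_equality_violation` and `demo_violation` are illustrative helper names. The comparator otherwise follows the logic of the reproducer.

```rust
use std::cmp::Ordering;

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
struct SrcLoc {
    fileid: u64,
    line: u64,
    column: u64,
}

/// Position-only comparison, as in the reproducer: the file id is ignored.
fn cmp_pos(a: &SrcLoc, b: &SrcLoc) -> Ordering {
    (a.line, a.column).cmp(&(b.line, b.column))
}

/// Simplified copy of the reproducer's comparator (no `file_map` lookup).
fn compare_src_locs(include_map: &[Vec<SrcLoc>], a: &SrcLoc, b: &SrcLoc) -> Ordering {
    let path_a = &include_map[a.fileid as usize];
    let path_b = &include_map[b.fileid as usize];
    for (inc_a, inc_b) in path_a.iter().zip(path_b.iter()) {
        if inc_a.fileid != inc_b.fileid {
            // Include sites in *different* files are compared by position only.
            return cmp_pos(inc_a, inc_b);
        }
    }
    match path_a.len().cmp(&path_b.len()) {
        Ordering::Less => cmp_pos(a, &path_b[path_a.len()]),
        Ordering::Equal => cmp_pos(a, b),
        Ordering::Greater => cmp_pos(&path_a[path_b.len()], b),
    }
}

/// Returns the first triple with a == b and b == c but a != c, if any.
fn find_equality_violation<T: Copy>(
    items: &[T],
    cmp: impl Fn(&T, &T) -> Ordering,
) -> Option<(T, T, T)> {
    for a in items {
        for b in items {
            for c in items {
                if cmp(a, b) == Ordering::Equal
                    && cmp(b, c) == Ordering::Equal
                    && cmp(a, c) != Ordering::Equal
                {
                    return Some((*a, *b, *c));
                }
            }
        }
    }
    None
}

/// Runs the check on hypothetical data shaped like the output above.
fn demo_violation() -> Option<(SrcLoc, SrcLoc, SrcLoc)> {
    // Assumed include sites: same line/column, but in different files.
    let at = |fileid: u64| SrcLoc { fileid, line: 5, column: 1 };
    let include_map: Vec<Vec<SrcLoc>> =
        vec![vec![], vec![at(3)], vec![at(4)], vec![], vec![]];
    let locs = [
        SrcLoc { fileid: 1, line: 31, column: 1 },
        SrcLoc { fileid: 2, line: 214, column: 1 },
        SrcLoc { fileid: 1, line: 32, column: 1 },
    ];
    find_equality_violation(&locs, |a, b| compare_src_locs(&include_map, a, b))
}
```

With this data, `demo_violation()` returns the same shape of triple as the output above: the locations in files 1 and 2 compare as equal, while the two locations in file 1 do not.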
If you also
a == b because a and b are from different files but are included at the same position. Same for b == c. If you change the start of
fn cmp_pos(a: &SrcLoc, b: &SrcLoc) -> Ordering {
    (a.line, a.column).cmp(&(b.line, b.column))
}
fn cmp_pos_include(a: &SrcLoc, b: &SrcLoc) -> Ordering {
    (a.fileid, a.line, a.column).cmp(&(b.fileid, b.line, b.column))
}
let path_a = &include_map[file_map[a.fileid as usize]][..];
let path_b = &include_map[file_map[b.fileid as usize]][..];
for (include_a, include_b) in path_a.iter().zip(path_b.iter()) {
    if include_a.fileid != include_b.fileid {
        return cmp_pos_include(include_a, include_b);
    }
}
then everything works. But that does go against the comment
If this is the correct way to fix this, I'll create a nicer patch and open a pull request.
|
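To illustrate the effect of the proposed change, here is a hedged sketch that applies `cmp_pos_include` to a small made-up input. The include map is hypothetical (file 1 included from file 3, file 2 from file 4, both at line 5, column 1), `file_map` is omitted, and `compare_src_locs_fixed` and `sorted_locs` are names chosen for this example. It only shows that the tie between different files is broken by the file id on this data. It is not a proof that the ordering is total in general, and it is not necessarily what the eventual patch does.

```rust
use std::cmp::Ordering;

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
struct SrcLoc {
    fileid: u64,
    line: u64,
    column: u64,
}

/// Position-only comparison, used once both locations are in the same file.
fn cmp_pos(a: &SrcLoc, b: &SrcLoc) -> Ordering {
    (a.line, a.column).cmp(&(b.line, b.column))
}

/// Comparison of include sites that also takes the file id into account.
fn cmp_pos_include(a: &SrcLoc, b: &SrcLoc) -> Ordering {
    (a.fileid, a.line, a.column).cmp(&(b.fileid, b.line, b.column))
}

/// Comparator with the proposed change applied (simplified: no `file_map`).
fn compare_src_locs_fixed(include_map: &[Vec<SrcLoc>], a: &SrcLoc, b: &SrcLoc) -> Ordering {
    let path_a = &include_map[a.fileid as usize];
    let path_b = &include_map[b.fileid as usize];
    for (inc_a, inc_b) in path_a.iter().zip(path_b.iter()) {
        if inc_a.fileid != inc_b.fileid {
            // Different including files no longer compare as equal.
            return cmp_pos_include(inc_a, inc_b);
        }
    }
    match path_a.len().cmp(&path_b.len()) {
        Ordering::Less => cmp_pos(a, &path_b[path_a.len()]),
        Ordering::Equal => cmp_pos(a, b),
        Ordering::Greater => cmp_pos(&path_a[path_b.len()], b),
    }
}

/// Sorts three hypothetical locations and returns them as (fileid, line).
fn sorted_locs() -> Vec<(u64, u64)> {
    // Assumed include sites: same line/column, but in different files.
    let at = |fileid: u64| SrcLoc { fileid, line: 5, column: 1 };
    let include_map: Vec<Vec<SrcLoc>> =
        vec![vec![], vec![at(3)], vec![at(4)], vec![], vec![]];
    let mut locs = vec![
        SrcLoc { fileid: 2, line: 214, column: 1 },
        SrcLoc { fileid: 1, line: 32, column: 1 },
        SrcLoc { fileid: 1, line: 31, column: 1 },
    ];
    locs.sort_by(|a, b| compare_src_locs_fixed(&include_map, a, b));
    locs.iter().map(|l| (l.fileid, l.line)).collect()
}
```

On this input the two locations from file 1 end up adjacent and in line order, followed by the location from file 2, because the including file with the smaller id sorts first.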
Yeah, I realized that, too. I'm not sure why we're doing that.
Note also that And I changed the
I'm not sure either. Let me look into it. |
I think it's for sure more correct than the current order, so you're welcome to open a PR for it. |
This implementation is simplified compared to the previous one. It is also almost twice as slow in the exhaustive test (15 vs 25 seconds) in immunant#1126 (comment) However, in real sort usage the impact should be significantly less. Fixes immunant#1126
I've created a patch in #1128 where |
…otal order (#1128) This implementation is simplified compared to the previous one. It is also almost twice as slow in [the exhaustive test](#1126 (comment)) (15 vs 25 seconds). However, in real sort usage the impact should be significantly less. * Fixes #1126
It seems that one (or more) of the types used in the transpiler has a wrong Ord implementation. To reproduce: