Save files may contain invalid UTF-8 #27

Open
robojumper opened this issue Oct 20, 2020 · 0 comments

#26 provided a persist.map.json that contains the null-terminated byte sequence 72 6F 6F 80 00 as an object name, presumably a room name. Room names are generated by taking the string roo and appending a single ASCII character to it. The game seems to have run out of room names and gone past the ASCII character range with the byte 80. This makes the object name invalid UTF-8, and both implementations reject it.
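
To make the failure concrete, here is a minimal standalone sketch (not code from either implementation; `BAD_NAME` and `is_valid_utf8_name` are names made up for illustration). It checks the reported bytes with the same `CStr` to `str` conversion the Rust impl uses, and shows that a lossy decode substitutes U+FFFD, so re-encoding the decoded name no longer matches the original bytes.

```rust
use std::ffi::CStr;

/// The object name from the report: "roo", then 0x80, then the NUL terminator.
const BAD_NAME: [u8; 5] = [0x72, 0x6F, 0x6F, 0x80, 0x00];

/// Returns true if the NUL-terminated bytes decode as strict UTF-8.
/// (Hypothetical helper, only for this illustration.)
fn is_valid_utf8_name(bytes: &[u8]) -> bool {
    match CStr::from_bytes_with_nul(bytes) {
        Ok(cs) => cs.to_str().is_ok(),
        Err(_) => false,
    }
}

fn main() {
    // 0x80 is a continuation byte with no lead byte, so strict decoding fails.
    assert!(!is_valid_utf8_name(&BAD_NAME));
    // A name that stays within ASCII, such as "roo~", decodes fine.
    assert!(is_valid_utf8_name(b"roo~\0"));

    // Lossy decoding replaces the bad byte with U+FFFD, so the decoded
    // string no longer encodes back to the original four bytes.
    let lossy = String::from_utf8_lossy(&BAD_NAME[..4]);
    assert_eq!(lossy, "roo\u{FFFD}");
    assert_ne!(lossy.as_bytes(), &BAD_NAME[..4]);
}
```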

I'm inclined not to fix this. It could possibly be fixed in the Java impl here

static String readName(byte[] data, int start, int len) throws ParseException {
    // Field names can be UTF-8
    byte[] str = Arrays.copyOfRange(data, start, start + len);
    String name = new String(str, StandardCharsets.UTF_8);
    if (!Arrays.equals(name.getBytes(StandardCharsets.UTF_8), str) || data[start + len] != 0) {
        throw new ParseException(
                String.format("%d: Wrong name length: Name %s, expected %d but has null bytes in wrong place",
                        start, name, len),
                start);
    }
    return name;
}

and in the Rust impl here

let name = {
    let cs = CStr::from_bytes_with_nul(&field_name)?.to_str()?;
    NameType::from(cs)
};
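
If this were ever fixed, one possible direction on the Rust side would be to keep the raw bytes when strict decoding fails instead of returning an error. The sketch below is an assumption, not the project's design: `RawName` and `read_name_lenient` are hypothetical names, and the real `NameType` may not be able to hold non-UTF-8 data at all.

```rust
use std::ffi::CStr;

/// Hypothetical name type: valid UTF-8 stays a string,
/// anything else keeps its raw bytes (without the NUL terminator).
#[derive(Debug, PartialEq)]
enum RawName {
    Utf8(String),
    Bytes(Vec<u8>),
}

/// Parses a NUL-terminated field name without rejecting invalid UTF-8.
fn read_name_lenient(field_name: &[u8]) -> Option<RawName> {
    // Still require exactly one trailing NUL and no interior NULs.
    let cs = CStr::from_bytes_with_nul(field_name).ok()?;
    Some(match cs.to_str() {
        Ok(s) => RawName::Utf8(s.to_owned()),
        Err(_) => RawName::Bytes(cs.to_bytes().to_vec()),
    })
}

fn main() {
    assert_eq!(
        read_name_lenient(b"rooA\0"),
        Some(RawName::Utf8("rooA".to_owned()))
    );
    assert_eq!(
        read_name_lenient(&[0x72, 0x6F, 0x6F, 0x80, 0x00]),
        Some(RawName::Bytes(vec![0x72, 0x6F, 0x6F, 0x80]))
    );
    // A missing terminator is still an error.
    assert_eq!(read_name_lenient(b"roo"), None);
}
```

The cost of this approach is that every consumer of names has to handle the `Bytes` case, which is part of why leaving the strict behavior in place is a reasonable choice.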

and here (need support for byte escape sequences)

pub fn escape(arg: &str) -> Cow<str> {
    if arg
        .chars()
        .any(|c| matches!(c, '\x08' | '\x0C' | '\n' | '\r' | '\t' | '\\' | '"'))
    {
        let mut s = String::new();
        for c in arg.chars() {
            match c {
                '\x08' => s.push_str("\\b"),
                '\x0C' => s.push_str("\\f"),
                '\n' => s.push_str("\\n"),
                '\r' => s.push_str("\\r"),
                '\t' => s.push_str("\\t"),
                '"' => s.push_str("\\\""),
                '\\' => s.push_str("\\\\"),
                _ => s.push(c),
            }
        }
        Cow::Owned(s)
    } else {
        Cow::Borrowed(arg)
    }
}

pub fn unescape(arg: &str) -> Option<Cow<str>> {
    // Bare control characters are disallowed
    if arg
        .chars()
        .any(|c| matches!(c, '\x08' | '\x0C' | '\n' | '\r' | '\t'))
    {
        return None;
    }
    if arg.find('\\').is_some() {
        let mut s = String::new();
        let mut it = arg.chars();
        while let Some(c) = it.next() {
            match c {
                '\\' => {}
                c => {
                    s.push(c);
                    continue;
                }
            }
            s.push(match it.next() {
                Some('b') => '\x08',
                Some('f') => '\x0C',
                Some('n') => '\n',
                Some('r') => '\r',
                Some('t') => '\t',
                Some('"') => '\"',
                _ => return None,
            });
        }
        return Some(Cow::Owned(s));
    }
    Some(Cow::Borrowed(arg))
}
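
As a rough idea of what "byte escape sequences" could look like, here is a standalone sketch that works on bytes rather than `&str`. The `\xNN` syntax is an assumption (it is not standard JSON), `escape_bytes` and `unescape_bytes` are hypothetical names, and the control-character escapes handled by the functions above are left out to keep the example short.

```rust
/// Escapes raw bytes into a string: valid UTF-8 passes through, `\` becomes `\\`,
/// and each byte that is not part of valid UTF-8 becomes `\xNN` (assumed syntax).
fn escape_bytes(mut input: &[u8]) -> String {
    let mut out = String::new();
    while !input.is_empty() {
        let (valid, bad_len) = match std::str::from_utf8(input) {
            Ok(s) => (s, 0),
            Err(e) => {
                let n = e.valid_up_to();
                // The first n bytes were just validated, so this cannot fail.
                let s = std::str::from_utf8(&input[..n]).unwrap();
                // error_len() is None for a truncated sequence at the end of input.
                (s, e.error_len().unwrap_or(input.len() - n))
            }
        };
        for c in valid.chars() {
            if c == '\\' {
                out.push_str("\\\\");
            } else {
                out.push(c);
            }
        }
        let start = valid.len();
        for b in &input[start..start + bad_len] {
            out.push_str(&format!("\\x{:02X}", b));
        }
        input = &input[start + bad_len..];
    }
    out
}

/// Reverses `escape_bytes`; returns None on an unknown or incomplete escape.
fn unescape_bytes(input: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut it = input.chars();
    while let Some(c) = it.next() {
        if c != '\\' {
            let mut buf = [0u8; 4];
            out.extend_from_slice(c.encode_utf8(&mut buf).as_bytes());
            continue;
        }
        match it.next()? {
            '\\' => out.push(b'\\'),
            'x' => {
                let hi = it.next()?.to_digit(16)?;
                let lo = it.next()?.to_digit(16)?;
                out.push((hi * 16 + lo) as u8);
            }
            _ => return None,
        }
    }
    Some(out)
}

fn main() {
    let name = [0x72, 0x6F, 0x6F, 0x80];
    let escaped = escape_bytes(&name);
    assert_eq!(escaped, "roo\\x80");
    assert_eq!(unescape_bytes(&escaped), Some(name.to_vec()));
}
```

Because the escaped form is plain ASCII for the offending byte, it can travel through a `&str`-based pipeline and be turned back into the exact original bytes on the way out.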
