
Fun: What to do about duplicated fields? #11

Open
robojumper opened this issue May 30, 2019 · 2 comments
Labels
impl-java Pertaining to the Java implementation

Comments

@robojumper
Owner

robojumper commented May 30, 2019

```java
// Whether this File has duplicate fields that will get lost when converting
// to string. This doesn't seem to be causing any issues, but is important
// for test coverage, because files with duplicate fields will re-encode to
// a different size.
public boolean hasDuplicateFields() {
    Set<String> fields = new HashSet<>();
    for (int i = 0; i < rootFields.size(); i++) {
        if (!fields.add(rootFields.get(i).name)) {
            return true;
        }
        if (hasDuplicateFields(rootFields.get(i))) {
            return true;
        }
    }
    return false;
}

private boolean hasDuplicateFields(DsonField field) {
    Set<String> fields = new HashSet<>();
    if (field.type == FieldType.TYPE_OBJECT) {
        for (int i = 0; i < field.children.length; i++) {
            if (!fields.add(field.children[i].name)) {
                return true;
            }
            if (hasDuplicateFields(field.children[i])) {
                return true;
            }
        }
    }
    return false;
}
```

```java
// Files with duplicate fields will not have the same size anyway.
// Filter them out here.
if (!dupeFieldFiles.contains(i)) {
    assertEquals(reEncodedFiles.get(i).length, files.get(i).length,
            fileList.get(i) + " encodes to different number of bytes");
}
```

The decoder eats duplicated fields.

This was an issue before; I simply excluded the failing file name from the byte size equality tests. In #9, a new duplicated field turned up; excluding that entire file name would regress test coverage. Files therefore now have a method that can be used to check whether discrepancies between the original file size and the re-encoded file size are expected. This is still not ideal.

Here's where the "Fun:" part comes in: One could find a reasonable upper bound for allowed differences, as in "we know field X is duplicated with name of length Y and data size of Z -- the file is allowed to have as many as 12+16+Y+Z+3 more bytes" (12+16 for header, 3 for alignment).
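That bound could be sketched as a small helper. Note that `allowedExtraBytes` and `sizeWithinBound` are hypothetical names, and the split of the 12 and 16 into specific header structures is an assumption taken from the estimate above:

```java
// Sketch of the upper-bound estimate above: a duplicated field with a name of
// length nameLen and data of size dataLen may account for at most
// 12 + 16 (header entries, per the estimate) + nameLen + dataLen
// + 3 (alignment padding) extra bytes in the original file.
final class DuplicateFieldBound {
    // Assumption: 12 and 16 are the per-field header entry costs from the
    // comment above; 3 is the maximum alignment padding.
    private static final int HEADER_ENTRY_COST = 12 + 16;
    private static final int MAX_ALIGNMENT_PADDING = 3;

    static int allowedExtraBytes(int nameLen, int dataLen) {
        return HEADER_ENTRY_COST + nameLen + dataLen + MAX_ALIGNMENT_PADDING;
    }

    // The strict equality assert could then become a bounded check:
    // re-encoding drops duplicates, so the original may only be larger.
    static boolean sizeWithinBound(int originalLen, int reEncodedLen, int allowedExtra) {
        int diff = originalLen - reEncodedLen;
        return diff >= 0 && diff <= allowedExtra;
    }
}
```

For example, a duplicated field with a 5-byte name and 10 bytes of data would allow up to 46 extra bytes.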

One could take this even further and derive a number of allowed differing bytes for files without duplicated fields: assuming that one or two bytes per Meta2 block contain garbage bits and that the header contains a bunch of garbage, we could have a more fine-grained test.
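A fine-grained test of that shape might look like the following sketch. The per-block and header allowances are placeholder estimates taken from the paragraph above, not measured values, and the helper names are hypothetical:

```java
// Sketch: tolerance for byte-level differences between an original file and
// its re-encoding, for files *without* duplicated fields. Allows up to
// GARBAGE_PER_META2_BLOCK differing bytes per Meta2 block plus a fixed
// allowance for garbage bits in the header. Both constants are placeholder
// estimates from the discussion above, not measured values.
final class ReEncodeTolerance {
    private static final int GARBAGE_PER_META2_BLOCK = 2;
    private static final int HEADER_GARBAGE_ALLOWANCE = 8; // placeholder

    static int maxDifferingBytes(int numMeta2Blocks) {
        return HEADER_GARBAGE_ALLOWANCE + GARBAGE_PER_META2_BLOCK * numMeta2Blocks;
    }

    static int countDifferingBytes(byte[] original, byte[] reEncoded) {
        if (original.length != reEncoded.length) {
            // Size mismatch: count every byte as differing.
            return Math.max(original.length, reEncoded.length);
        }
        int diff = 0;
        for (int i = 0; i < original.length; i++) {
            if (original[i] != reEncoded[i]) {
                diff++;
            }
        }
        return diff;
    }
}
```

A test would then assert `countDifferingBytes(original, reEncoded) <= maxDifferingBytes(numMeta2Blocks)` instead of strict equality.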

@robojumper robojumper added the impl-java Pertaining to the Java implementation label Mar 22, 2020
@thanhnguyen2187

Hi there. Any update on this? I am also trying to write my own version of Darkest Dungeon Save Editor, and encountered this issue in persist.progression.json. It was something like this, where slay_a_squiffy_with_jester gets mentioned twice as a field. Is it safe to ignore the duplicated data?

```json
{
  "__revision_dont_touch": 1683488768,
  "base_root": {
    "version": 2,
    "dungeon": {
      ...
    },
    "completed_plot_quests_data": {
      ...
    },
    ...
    "achievements": {
      ...
    },
    "real_achievements": {
      ...
      "slay_a_squiffy_with_jester": {
        "rtti": 1935132924,
        "id": "slay_a_squiffy_with_jester",
        "completed": false,
        "awarded": false,
        "conditions": {
          "0": {
            "enemies_killed": 0
          },
          "1": {
            "enemies_killed": 0
          }
        }
      },
      ...
    },
    "infestation": {
      ...
    },
    "flashback_completion_counts": {
      ...
    }
  }
}
```

By the way, thanks for the awesome tool and documentation!

@robojumper
Owner Author

I can't give you an authoritative answer on how to deal with these. I'm not aware of any issues caused by dropping these duplicates, but the scope of this project was the binary encoding of the save data, not the actual semantics of the data, so there might be lots of issues I don't know about.

I'm glad you're finding this project useful though!
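For what it's worth, dropping duplicates could be sketched as a first-wins pass over the field tree, mirroring the HashSet-based detection in hasDuplicateFields above. The DsonField stand-in here is minimal and hypothetical, and whether first-wins matches the game's own behavior is unverified:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Minimal stand-in for a field node, mirroring the snippet above.
final class DsonField {
    final String name;
    final List<DsonField> children = new ArrayList<>();
    DsonField(String name) { this.name = name; }
}

final class Dedup {
    // Drops duplicated field names, keeping the first occurrence.
    // Each object scopes its own names, so recurse per child list.
    static void dropDuplicates(List<DsonField> fields) {
        Set<String> seen = new HashSet<>();
        fields.removeIf(f -> !seen.add(f.name));
        for (DsonField f : fields) {
            dropDuplicates(f.children);
        }
    }
}
```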
