Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issues with uint64 enums #5265

Merged
merged 3 commits into from
May 2, 2019
Merged

Conversation

vglavnyy
Copy link
Contributor

Refactoring of EnumDef and EnumVal classes (#5161)

  • fix enum with uint64
  • add support of empty enums
  • hide the implementation of enums from code generators

The main idea: disable direct access to an enum value.

  1. The consumers of enums are code generators. They need only string presentations of enums.
  2. This PR hides implementation details of enums and delegates all operation with EnumVal to the owners (EnumDef). Added the internal builder of enums which hides the implementation of enums from the Parser.

All available tests passed.
New tests will be added to this PR before landing to cover most of the changes.

Copy link
Collaborator

@aardappel aardappel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this is a ton of cleanup, making the code generators more maintainable, and new tests too, thanks!

My comments are minor, and the CI failure is a flake, so I think we can merge as is.. you think it is ready?

Offset<reflection::EnumVal> Serialize(FlatBufferBuilder *builder, const Parser &parser) const;

bool Deserialize(const Parser &parser, const reflection::EnumVal *val);

uint64_t GetAsUInt64() const { return static_cast<uint64_t>(value); }
int64_t GetAsInt64() const { return value; }
bool IsZero() const { return 0 == value; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two methods are nicely descriptive, but a bit superfluous, as using an int as a bool is fine in C++

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IsZero and IsNotZero were added for visibility during refactoring.
The 'value` became private and I left these methods.

Offset<reflection::EnumVal> Serialize(FlatBufferBuilder *builder, const Parser &parser) const;

bool Deserialize(const Parser &parser, const reflection::EnumVal *val);

uint64_t GetAsUInt64() const { return static_cast<uint64_t>(value); }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I appreciate that this refactor is cleaner, we also have to think about backwards compatibility, since this class is a public API. In that way, I'd prefer for value to keep existing for the signed case, and there to be a method to get the unsigned one?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it is no big deal for people that access value to replace it with your new method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EnumVal and EnumDef are internal public. They are not used outside in the external (generated) code.

@vglavnyy
Copy link
Contributor Author

Until Monday, I will add new tests to cover the changes. If all will pass we can try merge to master.

@aardappel
Copy link
Collaborator

Ok, let me know.

@vglavnyy
Copy link
Contributor Author

vglavnyy commented Apr 3, 2019

@aardappel
Probably, cpp-parts of the EnumDef (parser and code generator) are ready.
Can you review these commits before squashing?
After C++, all other code generators should be updated and tested (at least with monster_enum.fbs).

I propose to limit the enums with bit_flags attribute to unsigned data types.

enum E: int8 (bit_flags) { s = 7 };
table T { y : E = s }

The binary value of s is -128 = 0xFFFFFFFFFFFFFF80LL (int64_t).
ANY value of this enum is also negative (see monster_enum_generated.h).
Checking that flag Fx is the member of ANY isn't easy after cast to signed or unsigned integer.
This cast may be implicit by promotion to int type.

auto r = cast<T*>(GetRoot());
int32_t i32 = static_cast<int32_t>(r->y()); // get default
uint32_t i32 = static_cast<uint32_t>(r->y());
i32 == -128; // FFFFFF80
u32 == 4294967168; // FFFFFF80
i32 & (1<<11) == true
u32 & (1<<11) == true

This is unexpected, even with experience in C++.

@aardappel
Copy link
Collaborator

aardappel commented Apr 5, 2019

Yes, likely constraining bit_flags to unsigned shouldn't break too many people. Or we could make it a warning for now.

Copy link
Collaborator

@aardappel aardappel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow this is a crazy amount of changes, hard for me to review what impact this has. But I can also see it would have been hard to split up. I think we're just going to have to merge it and see the impact.

I was thinking of doing a 1.11 release very soon. It may be wise to merge this right after, to reduce risk. On the other hand, merging it before has the advantage that the improvements will get to more people. WDYT?

src/idl_parser.cpp Outdated Show resolved Hide resolved
@vglavnyy
Copy link
Contributor Author

vglavnyy commented Apr 6, 2019

Better will be after 1.11.

  • I will try to extract all independent and test-passed changes to standalone requests.
  • After this, will do rebase to the master.
  • Constrain bit_flags underlying type to unsigned.
  • After the time, I will be able to review my code.
  1. I caught memory leakage in the HEAD of the master trying to extract extended proto enum from this PR:
/// Enum doc comment.
enum ProtoEnum {
  option allow_alias = true;
  FOO = 1;
  /// Enum 2nd value doc comment misaligned.
  BAR = 5;
  // Aliases
  FOO_A1 = 1;
  BAR_A1 = 5;
  FOO_A2 = 1;
}
  1. Is this enum declaration valid from point of view of Protobuf?
    /// Enum doc comment.
    enum ProtoEnum {
    FOO = 1;
    /// Enum 2nd value doc comment misaligned.
    BAR = 5;
    }

Protobuf enum decl

  • There must be a zero value, so that we can use 0 as a numeric default value.
  • The zero value needs to be the first element, for compatibility with the proto2 semantics where the first enum value is always the default.

vglavnyy added a commit to vglavnyy/flatbuffers that referenced this pull request Apr 6, 2019
aardappel pushed a commit that referenced this pull request Apr 8, 2019
@aardappel
Copy link
Collaborator

What was the memory leak? What was leaking?

And yes, sounds like our test does not conform to Protobuf. Not sure if that is a problem, if FlatBuffers is more lenient than Protobuf we don't necessarily care to enforce ProtoBuf rules. In fact, most of the --proto parser is written under the assumption that the .proto is already correct according to protoc, if you feed it illegal protos it likely will give unhelpful errors or silently allow it.

@aardappel
Copy link
Collaborator

And yes, I will get started with the 1.11 release process right then.

@vglavnyy
Copy link
Contributor Author

vglavnyy commented Apr 9, 2019

What was the memory leak? What was leaking?

GCC8.2, GCC7.3, Clang 7.0, MSVC2017 - detect leakage if proto-enum has aliased values.
I have checked Travis-CI does not detect it.

#0 0x6460c2 in operator new(unsigned long) (/home/osboxes/flatbuffers/.project/flattests+0x6460c2)
#1 0x69de90 in flatbuffers::Parser::ParseEnum(bool, flatbuffers::EnumDef**) /home/osboxes/flatbuffers/src/idl_parser.cpp:1668:19
#2 0x6b17f6 in flatbuffers::Parser::ParseProtoDecl() /home/osboxes/flatbuffers/src/idl_parser.cpp:2030:5
#3 0x6c4dc7 in flatbuffers::Parser::DoParse(char const*, char const**, char const*, char const*) /home/osboxes/flatbuffers/src/idl_parser.cpp:2584:7
#4 0x6c4a54 in flatbuffers::Parser::DoParse(char const*, char const**, char const*, char const*) /home/osboxes/flatbuffers/src/idl_parser.cpp:2573:16
#5 0x6beeb7 in flatbuffers::Parser::ParseRoot(char const*, char const**, char const*) /home/osboxes/flatbuffers/src/idl_parser.cpp:2420:3
#6 0x6975a3 in flatbuffers::Parser::Parse(char const*, char const**, char const*) /home/osboxes/flatbuffers/src/idl_parser.cpp:2401:13
#7 0x923d7a in ParseProtoTest() /home/osboxes/flatbuffers/tests/test.cpp:927:3
#8 0x942739 in FlatBufferTests() /home/osboxes/flatbuffers/tests/test.cpp:2637:5
#9 0x942de3 in main /home/osboxes/flatbuffers/tests/test.cpp:2695:3

If I right, the problem is located here:

flatbuffers/src/idl_parser.cpp

Lines 2033 to 2042 in 23bb574

auto &v = enum_def->vals.vec;
std::sort(v.begin(), v.end(), compareEnumVals);
// Temp: remove any duplicates, as .fbs files can't handle them.
for (auto it = v.begin(); it != v.end();) {
if (it != v.begin() && it[0]->value == it[-1]->value)
it = v.erase(it);
else
++it;
}

There are two issues:

  • memory leak: SymbolTable::vec.erase() does not call the destructor of EnumVal
  • invalid pointers in the SymbolTable::dict

The current PR fix both issues.

@aardappel
Copy link
Collaborator

Ok, but we said we weren't going to merge this one until after 1.11, so do you want to make a small PR that just fixes the leak (& invalid pointers) ?

@vglavnyy
Copy link
Contributor Author

Ok, I'll prepare the patch.

@vglavnyy vglavnyy force-pushed the enum_u64_support branch 5 times, most recently from 530a63c to 061da18 Compare April 12, 2019 17:26
vglavnyy added a commit to vglavnyy/flatbuffers that referenced this pull request Apr 12, 2019
- update the bit_flags enum monster_test::Color to unsigned type
- declare the explicit size of generated enum names
@aardappel
Copy link
Collaborator

Ok, 1.11 landed, so this can now go in. Can you rebase?

@vglavnyy vglavnyy force-pushed the enum_u64_support branch 3 times, most recently from 0aab2d3 to 513e0bd Compare April 27, 2019 10:02
- hide the implementation of enums from code generators
- fix uint64 the issue in the cpp-generator
- fix google#5108
- new tests
- enums with bit_flags attribute should be unsigned
@vglavnyy
Copy link
Contributor Author

@aardappel
Ready for review.

FLATBUFFERS_CHECKED_ERROR AssignEnumeratorValue(const std::string &value) {
user_value = true;
auto ascending = false;
if (enum_def.IsUInt64()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot of code in this file that goes along this pattern of if (enum_def.IsUInt64()) followed by two very similar bits of code. I guess you evaluated that it wasn't worth templatizing?

Also, not necessarily for this PR, but worth a thought for the future: maybe we should start splitting up idl_parser.cpp into multiple files?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about IsUInt64() generalization. Probably, this is impossible by design.
The IsUInt64() can't be used for a template specialization at compile time.
The EnumDef is the switch-based strategy to get access to enum values. This strategy emulates a virtual table for methods like ToString.
I have made IsUInt64() private to isolate from a high-level code.
It is possible to apply classical inheritance if it is needed.

About splitting idp_parser.cpp I thought too.
In time, a solution will be found.

vglavnyy added 2 commits May 1, 2019 10:59
- move EnumDef::ReverseLookup implementation to idl_parser.cpp
- fix typos
@vglavnyy
Copy link
Contributor Author

vglavnyy commented May 1, 2019

Ready for landing if two latest commits are ok.

@aardappel
Copy link
Collaborator

Yes, looks good.. thanks for your hard work, again :)

@aardappel aardappel merged commit b8ef8c1 into google:master May 2, 2019
@vglavnyy vglavnyy deleted the enum_u64_support branch June 2, 2019 12:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants