Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-33701: [C++] Add support for LTO (link time optimization) build #33847

Merged
merged 1 commit into from
Jan 27, 2023

Conversation

kou
Copy link
Member

@kou kou commented Jan 24, 2023

Rationale for this change

Some base type classes don't have hidden method implementations. It may define these classes in each translation unit. It may cause one-definition-rule violation with -flto.

What changes are included in this PR?

Define at least one hidden method to prevent defining these base type classes in each translation unit.

Are these changes tested?

How to reproduce:

CFLAGS="-flto" CXXFLAGS="-flto" cmake ...
cmake --build ...

This reports link errors without this change.

Are there any user-facing changes?

No.

Some base type classes don't have hidden method implementations. It
may define these classes in each translation unit. It may cause
one-definition-rule violation with `-flto`.

How to reproduce:

```bash
CFLAGS="-flto" CXXFLAGS="-flot cmake ...
cmake --build
```
@kou
Copy link
Member Author

kou commented Jan 24, 2023

@github-actions crossbow submit test-r-rhub-debian-gcc-devel-lto-latest

@github-actions
Copy link

Revision: f84418b

Submitted crossbow builds: ursacomputing/crossbow @ actions-a60255052f

Task Status
test-r-rhub-debian-gcc-devel-lto-latest Azure

@kou
Copy link
Member Author

kou commented Jan 24, 2023

@paleolimbot This is another approach.

@@ -288,31 +288,46 @@ std::shared_ptr<DataType> GetPhysicalType(const std::shared_ptr<DataType>& type)
class ARROW_EXPORT FixedWidthType : public DataType {
public:
using DataType::DataType;
// This is only for preventing defining this class in each
// translation unit to avoid one-definition-rule violation.
~FixedWidthType() override;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious why it causes ODR violation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, I don't know much about this but...

I tried to create a small example to reproduce this case but couldn't create it... So I explain what I observed:

If we don't have this, .o of a C++ code that has std::make_shared<arrow::XXXType> such as std::make_shared<arrow::ListType> has the following nm result:

$ nm --demangle src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/scalar_cast_numeric.cc.o | grep 'vtable for arrow::FloatingPointType'
00000000 W vtable for arrow::FloatingPointType

W is a global weak symbol.

https://linux.die.net/man/1/nm

If lowercase, the symbol is local; if uppercase, the symbol is global (external).

"W"
"w"

The symbol is a weak symbol that has not been specifically tagged as a weak object symbol.

If we have this, the symbol becomes a undefined symbol:

$ nm --demangle src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/scalar_cast_numeric.cc.o | grep 'vtable for arrow::FloatingPointType'
         U vtable for arrow::FloatingPointType

https://linux.die.net/man/1/nm

"U"

The symbol is undefined.

If there are multiple W vtable for arrow::FloatingPointType in our .o. linking libarrow.so with -flto is failed:

[650/1111] Linking CXX shared library debug/libarrow.so.1100.0.0
FAILED: debug/libarrow.so.1100.0.0 
: && /bin/c++ -fPIC -Wno-noexcept-type -flto -fdiagnostics-color=always  -Wall -Wno-conversion -Wno-sign-conversion -Wunused-result -fno-semantic-interposition -msse4.2  -g -Werror -O0 -ggdb -shared -Wl,-soname,libarrow.so.1100 -o debug/libarrow.so.1100.0.0 ...(**/*.o)... && :
cpp/src/arrow/type.h:313: error: virtual table of type ‘struct FloatingPointType’ violates one definition rule [-Werror=odr]
  313 | class ARROW_EXPORT FloatingPointType : public NumberType {
      | 
cpp/src/arrow/type.h:313:20: note: the conflicting type defined in another translation unit
  313 | class ARROW_EXPORT FloatingPointType : public NumberType {
      |                    ^
<built-in>: note: virtual method ‘__cxa_pure_virtual’
...

It seems that W and -flto is a bad combination. If we change W to U by defining a not-inlined deconstructor, we don't get the ODR error.

Do you notice anything from this observation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why it happens. Clang doesn't complain ODR issue.
cc @pitrou, @westonpace, @lidavidm for comments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand why adding that destructor changes W to U. You are defining a non-inline key function. Therefore the vtable is defined in only one place.

However, I don't understand why it is an ODR error. For example, in the clang docs:

When a class has no non-pure, non-inline, virtual functions, there is no key function, and the compiler is forced to emit the vtable in every translation unit that references the class. In this case, it is emitted in a COMDAT section, which allows the linker to eliminate all duplicate copies. This is still wasteful in terms of object file size and link time, so it’s always advisable to ensure there is at least one eligible function that can serve as the key function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that this is only happen with GCC (GNU ld?).
I couldn't reproduce this with CC=clang CXX=clang++ CFLAGS="-flto" CXXFLAGS="-flto" cmake .... (W symbol is still generated but there is no link error.)

Is it acceptable that we add this change only for GCC and LTO?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change is okay. And I wonder how you find this solution, it's not obvious from the error log.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just a try and error...

There are some abstract classes for arrow::DataType family. Most of them are in error messages but arrow::DataType isn't shown. I noticed that arrow::DataType is U and others are W in nm --demangle result. I found that arrow::DataType::~DataType() is defined in type.cc but other abstract classes don't define their destructors. So I tried to define them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a class has no non-pure, non-inline, virtual functions, there is no key function, and the compiler is forced to emit the vtable in every translation unit that references the class. In this case, it is emitted in a COMDAT section, which allows the linker to eliminate all duplicate copies. This is still wasteful in terms of object file size and link time, so it’s always advisable to ensure there is at least one eligible function that can serve as the key function.

Does it mean if a class has virtual functions, it's recommended that at least one of them should be in .cc?
I find it hard to understand a class has no non-pure, non-inline, virtual functions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is what it means. I think "wasteful" is too strong of a word though. I think the consequence should generally be minor enough that it isn't really a concern.

@paleolimbot
Copy link
Member

Thank you @kou!

I don't know enough about LTO or how/why this is a fix to say that this should or should not be defensively merged. Not having warnings during LTO linking makes my life easier (I don't have to figure out how to silence that nightly job or to attempt justifying anything to CRAN; however, I don't know enough about the details to make a recommendation otherwise.

@@ -288,31 +288,46 @@ std::shared_ptr<DataType> GetPhysicalType(const std::shared_ptr<DataType>& type)
class ARROW_EXPORT FixedWidthType : public DataType {
public:
using DataType::DataType;
// This is only for preventing defining this class in each
// translation unit to avoid one-definition-rule violation.
~FixedWidthType() override;
Copy link
Member

@pitrou pitrou Jan 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably don't need to define the destructor separately?

Suggested change
~FixedWidthType() override;
~FixedWidthType() override = default;

(see https://stackoverflow.com/questions/70387962/does-declaring-a-constructor-default-in-a-header-file-break-the-odr)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried = default. If we add = default, generated symbol is still W not U. It seems that we need to define the destructor separately for U.

Copy link
Contributor

@cyb70289 cyb70289 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@ursabot
Copy link

ursabot commented Jan 27, 2023

Benchmark runs are scheduled for baseline = 108a028 and contender = 24da3be. 24da3be is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.48% ⬆️0.0%] test-mac-arm
[Failed ⬇️1.85% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.31% ⬆️0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 24da3be9 ec2-t3-xlarge-us-east-2
[Failed] 24da3be9 test-mac-arm
[Failed] 24da3be9 ursa-i9-9960x
[Finished] 24da3be9 ursa-thinkcentre-m75q
[Finished] 108a0286 ec2-t3-xlarge-us-east-2
[Failed] 108a0286 test-mac-arm
[Failed] 108a0286 ursa-i9-9960x
[Finished] 108a0286 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

thisisnic pushed a commit to thisisnic/arrow that referenced this pull request Jan 30, 2023
…ild (apache#33847)

### Rationale for this change

Some base type classes don't have hidden method implementations. It may define these classes in each translation unit. It may cause one-definition-rule violation with `-flto`.

### What changes are included in this PR?

Define at least one hidden method to prevent defining these base type classes in each translation unit.

### Are these changes tested?

How to reproduce:

```bash
CFLAGS="-flto" CXXFLAGS="-flto" cmake ...
cmake --build ...
```

This reports link errors without this change.

### Are there any user-facing changes?

No.
* Closes: apache#33701

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Yibo Cai <yibo.cai@arm.com>
sjperkins pushed a commit to sjperkins/arrow that referenced this pull request Feb 10, 2023
…ild (apache#33847)

### Rationale for this change

Some base type classes don't have hidden method implementations. It may define these classes in each translation unit. It may cause one-definition-rule violation with `-flto`.

### What changes are included in this PR?

Define at least one hidden method to prevent defining these base type classes in each translation unit.

### Are these changes tested?

How to reproduce:

```bash
CFLAGS="-flto" CXXFLAGS="-flto" cmake ...
cmake --build ...
```

This reports link errors without this change.

### Are there any user-facing changes?

No.
* Closes: apache#33701

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Yibo Cai <yibo.cai@arm.com>
gringasalpastor pushed a commit to gringasalpastor/arrow that referenced this pull request Feb 17, 2023
…ild (apache#33847)

### Rationale for this change

Some base type classes don't have hidden method implementations. It may define these classes in each translation unit. It may cause one-definition-rule violation with `-flto`.

### What changes are included in this PR?

Define at least one hidden method to prevent defining these base type classes in each translation unit.

### Are these changes tested?

How to reproduce:

```bash
CFLAGS="-flto" CXXFLAGS="-flto" cmake ...
cmake --build ...
```

This reports link errors without this change.

### Are there any user-facing changes?

No.
* Closes: apache#33701

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Yibo Cai <yibo.cai@arm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[R] Link-time optimization reports violations of one-definition rule in the R package
6 participants