arc binutils generates incorrect thread local storage (TLS) offsets #688

Open
keith-packard opened this issue Jul 20, 2023 · 11 comments
Labels
area: Toolchain Issues related to Toolchain (Binutils+GCC+GDB+libs)

Comments

@keith-packard
Collaborator

I'm working on standalone arc testing for picolibc. One of the tests checks to make sure TLS offsets and relocations work correctly for values that require larger alignment. I discovered that the relocations generated in the output file don't match the positions of the values within the TLS segment. With this bug present, TLS cannot be used on arc.

I've managed to reproduce this in a stand-alone test case.

Source code:

__thread int data_var = 12;
__attribute__((__aligned__(128))) __thread int data_var_128 = 128;
__thread int bss_var;
__attribute__((__aligned__(256))) __thread int bss_var_256;

int __start(void)
{
	return data_var + data_var_128 + bss_var + bss_var_256;
}

Compile command:

$ arc-zephyr-elf-gcc -ffreestanding -nostdlib foo.c -mtp-regno=26
$ arc-zephyr-elf-objdump -St a.out

objdump output:


a.out:     file format elf32-littlearc

SYMBOL TABLE:
00000100 l    d  .text	00000000 .text
00002200 l    d  .tdata	00000000 .tdata
00002300 l    d  .tbss	00000000 .tbss
00002300 l    d  .got	00000000 .got
0000230c l    d  .bss	00000000 .bss
0000230c l    d  .noinit	00000000 .noinit
00000000 l    d  .comment	00000000 .comment
00000000 l    d  .ARC.attributes	00000000 .ARC.attributes
00000000 l    df *ABS*	00000000 foo.c
00000000 l    df *ABS*	00000000 
00002300 l     O .got	00000000 _GLOBAL_OFFSET_TABLE_
00000100 g       .text	00000000 __JLI_TABLE__
0000240c g       .got	00000000 __SDATA_BEGIN__
00000000 g       .tdata	00000004 data_var
00000100 g     F .text	0000003c __start
00000200 g       .tbss	00000004 bss_var_256
00000100 g       .tbss	00000004 bss_var
0000230c g       .got	00000000 __bss_start
00000080 g       .tdata	00000004 data_var_128
0000230c g       .got	00000000 _edata
0000230c g       .bss	00000000 _end



Disassembly of section .text:

00000100 <__start>:
 100:	1cfc b6c8           	st.aw	fp,[sp,-4]
 104:	439b                	mov_s	fp,sp
 106:	2200 3f82 0000 0080 	add	r2,gp,0x80
 10e:	4344                	ld_s	r3,[r2,0]
 110:	2200 3f82 0000 0100 	add	r2,gp,0x100
 118:	4244                	ld_s	r2,[r2,0]
 11a:	635b                	add_s	r3,r3,r2
 11c:	2200 3f82 0000 0200 	add	r2,gp,0x200
 124:	4244                	ld_s	r2,[r2,0]
 126:	635b                	add_s	r3,r3,r2
 128:	2200 3f82 0000 0300 	add	r2,gp,0x300
 130:	4244                	ld_s	r2,[r2,0]
 132:	635a                	add_s	r2,r3,r2
 134:	4040                	mov_s	r0,r2
 136:	1404 341b           	ld.ab	fp,[sp,4]
 13a:	7ee0                	j_s	[blink]

You can see that all of the TLS variables are referenced with incorrect offsets:

                Actual offset    Value in code    Difference

data_var        00000000         0x80             0x80
data_var_128    00000080         0x100            0x80
bss_var         00000100         0x200            0x100
bss_var_256     00000200         0x300            0x100

If this required a computable offset to the TLS base address based on the TLS block alignment (as on ARM), that would be fixable in the linker script and/or TLS register setting code. However, the error is not constant: data_var and data_var_128 are off by 0x80, while bss_var and bss_var_256 are off by 0x100.

@keith-packard keith-packard added the area: Toolchain Issues related to Toolchain (Binutils+GCC+GDB+libs) label Jul 20, 2023
@abrodkin
Collaborator

@keith-packard so is that an issue due to misbehavior of GCC for ARC?

@keith-packard
Collaborator Author

> @keith-packard so is that an issue due to misbehavior of GCC for ARC?

No, GCC generates correct code. It's the linker which appears to write the wrong relocation value into the resulting binary.

Here's the assembly output from gcc for this test:

	.file	"foo.c"
	.cpu HS
	.arc_attribute Tag_ARC_PCS_config, 2
	.arc_attribute Tag_ARC_ABI_rf16, 0
	.arc_attribute Tag_ARC_ABI_pic, 0
	.arc_attribute Tag_ARC_ABI_tls, 1
	.arc_attribute Tag_ARC_ABI_sda, 2
	.arc_attribute Tag_ARC_ABI_exceptions, 0
	.arc_attribute Tag_ARC_CPU_variation, 2
	.section	.text
	.global	data_var
	.section	.tdata,"awT",@progbits
	.align 4
	.type	data_var, @object
	.size	data_var, 4
data_var:
	.word	12
	.global	data_var_128
	.align 128
	.type	data_var_128, @object
	.size	data_var_128, 4
data_var_128:
	.word	128
	.global	bss_var
	.section	.tbss,"awT",@nobits
	.align 4
	.type	bss_var, @object
	.size	bss_var, 4
bss_var:
	.zero	4
	.global	bss_var_256
	.align 256
	.type	bss_var_256, @object
	.size	bss_var_256, 4
bss_var_256:
	.zero	4
	.section	.text
	.align 4
	.global	__start
	.type	__start, @function
__start:
	st.a	fp,[sp,-4]	;26
	mov_s	fp,sp	;4
	add r2,gp,@data_var@tpoff
	ld_s	r3,[r2]		;15
	add r2,gp,@data_var_128@tpoff
	ld_s	r2,[r2]		;15
	add_s r3,r3,r2 ;1
	add r2,gp,@bss_var@tpoff
	ld_s	r2,[r2]		;15
	add_s r3,r3,r2 ;1
	add r2,gp,@bss_var_256@tpoff
	ld_s	r2,[r2]		;15
	add_s r2,r3,r2 ;1
	mov_s	r0,r2	;4
	ld.ab	fp,[sp,4]	;23
	j_s	[blink]
	.size	__start, .-__start
	.ident	"GCC: (Zephyr SDK 0.16.2-rc1) 12.2.0"
	.section	.note.GNU-stack,"",@progbits

You can see that GCC emits special symbol names like @data_var@tpoff, which the assembler converts into relocation entries that the linker should update with the resulting offset.

@abrodkin
Collaborator

Ok, so then it's GNU LD which does something wrong? ;)
Anyway, my question really was if you need any help with that from ARC toolchain people?

@keith-packard
Collaborator Author

> Ok, so then it's GNU LD which does something wrong? ;)
> Anyway, my question really was if you need any help with that from ARC toolchain people?

Well, GNU LD works correctly on every other platform, so it's not a general issue, it's specific to the arc platform. I figured I'd start by bugging the arc toolchain people here. We'll need to track this issue in sdk-ng anyways as that will drive whether we can enable TLS in the arc targets for Zephyr. Given this bug, we should probably go disable this feature in Zephyr until it's fixed here though.

@abrodkin
Collaborator

@claziss do you have any ideas what might be wrong here with TLS?

@claziss

claziss commented Jul 21, 2023

Do you build with this patch?
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619640.html

@keith-packard
Collaborator Author

We use -ftls-model=local-exec for Zephyr. I'm sorry I hadn't validated my little test case above with the same compiler flag. Re-checking this morning, I get the same values though. But, it does raise an interesting question about whether there are bits of this logic in the linker relocation code that might be causing this effect. As far as I can tell, this is purely a linker issue. Here's the output of objdump -rdt:

foo.o:     file format elf32-littlearc

SYMBOL TABLE:
00000000 l    df *ABS*	00000000 foo.c
00000000 l    d  .text	00000000 .text
00000000 l    d  .data	00000000 .data
00000000 l    d  .bss	00000000 .bss
00000000 l    d  .tdata	00000000 .tdata
00000000 l    d  .tbss	00000000 .tbss
00000000 l    d  .note.GNU-stack	00000000 .note.GNU-stack
00000000 l    d  .comment	00000000 .comment
00000000 l    d  .ARC.attributes	00000000 .ARC.attributes
00000000 g       .tdata	00000004 data_var
00000080 g       .tdata	00000004 data_var_128
00000000 g       .tbss	00000004 bss_var
00000100 g       .tbss	00000004 bss_var_256
00000000 g     F .text	0000003c __start



Disassembly of section .text:

00000000 <__start>:
   0:	1cfc b6c8           	st.aw	fp,[sp,-4]
   4:	439b                	mov_s	fp,sp
   6:	2200 3f82 0000 0000 	add	r2,gp,0
			a: R_ARC_TLS_LE_32	data_var
   e:	4344                	ld_s	r3,[r2,0]
  10:	2200 3f82 0000 0000 	add	r2,gp,0
			14: R_ARC_TLS_LE_32	data_var_128
  18:	4244                	ld_s	r2,[r2,0]
  1a:	635b                	add_s	r3,r3,r2
  1c:	2200 3f82 0000 0000 	add	r2,gp,0
			20: R_ARC_TLS_LE_32	bss_var
  24:	4244                	ld_s	r2,[r2,0]
  26:	635b                	add_s	r3,r3,r2
  28:	2200 3f82 0000 0000 	add	r2,gp,0
			2c: R_ARC_TLS_LE_32	bss_var_256
  30:	4244                	ld_s	r2,[r2,0]
  32:	635a                	add_s	r2,r3,r2
  34:	4040                	mov_s	r0,r2
  36:	1404 341b           	ld.ab	fp,[sp,4]
  3a:	7ee0                	j_s	[blink]

@claziss

claziss commented Jul 24, 2023

Hi,
I cannot reproduce your environment, but here is an example of how the linker resolves this TLS relocation.

R_ARC_TLS_LE_32 is defined as S + A + TLS_TBSS - TLS_REL, where

  • S is the base address of the symbol in the memory
  • A is the symbol addend
  • TLS_TBSS is the TLS Thread Control Block size (aligned)
  • TLS_REL is the base of the TLS section

Now, let's consider:

SYMBOL TABLE:
00000000 l    d  .text	00000000 .text
00002030 l    d  .data	00000000 .data
00002100 l    d  .tdata	00000000 .tdata
00002200 l    d  .tbss	00000000 .tbss
00002200 l    d  .got	00000000 .got
0000220c l    d  .bss	00000000 .bss
00000000 l    d  .comment	00000000 .comment
00000000 l    d  .ARC.attributes	00000000 .ARC.attributes
00000000 l    df *ABS*	00000000 t01.c
00000000 l    df *ABS*	00000000 
00002200 l     O .got	00000000 _GLOBAL_OFFSET_TABLE_
00000004 g       .tdata	00000004 data_var
00000000 g     F .text	00000030 __start
00000100 g       .tbss	00000004 bss_var_256
00000104 g       .tbss	00000004 bss_var
0000220c g       .got	00000000 __bss_start
00000000 g       .tdata	00000004 data_var_128
0000220c g       .got	00000000 _edata
0000220c g       .bss	00000000 _end

Hence, for data_var (at offset 4 in .tdata):

  • S = 0x00002100 (.tdata) + 4 = 0x2104
  • A = 0
  • TLS_REL = 0x00002100 (.tdata)
  • TLS_TBSS = (8 + (1 << 7) - 1) & (-( 1<<7)) = 0x80

Adding everything up, we get 0x84, which is consistent with your example, where data_var is at offset 0 and the relocation resolved to 0x80.

I hope this helps you.

REF: https://github.com/foss-for-synopsys-dwc-arc-processors/arc-ABI-manual/blob/master/arcv3-elf.md

@keith-packard
Collaborator Author

It's not terribly helpful -- it doesn't explain why the computed offsets appear to be wrong in the local exec case. The docs say you take the value in gp and add the computed offset to get the address of the TLS variable and that it should work for both .tdata and .tbss symbols. There's no value I can place in gp which generates correct addresses for all four variables, and it appears to be a problem caused by the additional alignment constraints. Without those, things "just work", but when alignment is added, the offsets do not seem correct.

@claziss

claziss commented Jul 24, 2023

Indeed, the alignment power of .tbss is 8 (256 bytes). Hence, TLS_TBSS = (8 + (1 << 8) - 1) & (-(1 << 8)) = 0x100, which matches your findings. Thus, the linker expects preferential treatment for .tdata and .tbss. I need to check if this is a feature or a bug.

@claziss

claziss commented Aug 1, 2023

@keith-packard thank you for reporting this bug. I have created a patch for it here: foss-for-synopsys-dwc-arc-processors/binutils-gdb@e7e04c7
I'll upstream the patch too.
