Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Negated structure offsets #573

Open
0x6d696368 opened this issue May 6, 2019 · 23 comments
Open

Negated structure offsets #573

0x6d696368 opened this issue May 6, 2019 · 23 comments
Assignees
Labels
Type: Enhancement New feature or request

Comments

@0x6d696368
Copy link

Is your feature request related to a problem?

When code traverses a multiple linked list data structure, e.g. traversing _PEB_LDR_DATA via InInitializationOrderLinks:

typedef struct _LDR_DATA_TABLE_ENTRY
{
  /* 0x0000 */ struct _LIST_ENTRY InLoadOrderLinks;
  /* 0x0010 */ struct _LIST_ENTRY InMemoryOrderLinks;
  /* 0x0020 */ struct _LIST_ENTRY InInitializationOrderLinks;
  /* 0x0030 */ void* DllBase;
  /* 0x0038 */ void* EntryPoint;
...
} LDR_DATA_TABLE_ENTRY, *PLDR_DATA_TABLE_ENTRY; /* size: 0x00e0 */

You get pointers to the above structure with an offset of 0x10 (offset of InInitializationOrderLinks).
If you give this pointer the type struct _LDR_DATA_TABLE_ENTRY all the offsets are obviously wrong.

Describe the solution you'd like

IDA handles this by allowing the user to supply a struct offset, see: https://www.hexblog.com/?p=63
So you hit T on the usage of the struct and define the offset 0x10 and you get your correct types.

Describe alternatives you've considered

Currently I copy the old structure and create a new structure with its name prefixed by the offset _0x010 and deleting the first 2 entries in the structure.
However, this is:

  • annoying
  • doesn't work when the pointer uses struct members that are before the offset and hence got removed from the struct by this work around.

Additional context

When working with linked list data structures this features is needed often.

@0x6d696368 0x6d696368 added the Type: Enhancement New feature or request label May 6, 2019
@saruman9
Copy link
Contributor

Any progress on this?

@emteere
Copy link
Contributor

emteere commented May 24, 2019

From the blog entry, I understand the issue in general.
Just to make sure I understand the pointers you are referring to in Ghidra, are these problems occurring just in the listing with the inferred markup, or are they also occurring with memory references and the decompiler as well?

Do you have some example bytes from a function that exhibits this behavior. I have an example I've coded up to see if it matches what you are seeing.

I just want to make sure the changes in consideration will solve this issue and others related to this in other github issues.

@emteere
Copy link
Contributor

emteere commented May 24, 2019

Original:

//
// CONTAINING_RECORD macro
// Value of structure member (field), given the (type) and the List_Entry (head) in original containing type
//
#define CONTAINING_RECORD(address, type, field) (\
    (type *)((char*)(address) -(unsigned long)(&((type *)0)->field)))
    
typedef struct _LIST_ENTRY {
   struct _LIST_ENTRY *Flink;
   struct _LIST_ENTRY *Blink;
} LIST_ENTRY, *PLIST_ENTRY;

typedef struct _LDR_DATA_TABLE_ENTRY
{
   struct _LIST_ENTRY InLoadOrderLinks;
   struct _LIST_ENTRY InMemoryOrderLinks;
   struct _LIST_ENTRY InInitializationOrderLinks;
   void* DllBase;
   void* EntryPoint;
} LDR_DATA_TABLE_ENTRY, *PLDR_DATA_TABLE_ENTRY;

int countIt(PLDR_DATA_TABLE_ENTRY list) {
   int count = 0;

	PLIST_ENTRY PListHead = &(list->InMemoryOrderLinks);
	PLIST_ENTRY pEntry = PListHead->Flink;

	while(pEntry != PListHead)
	  {
	    PLDR_DATA_TABLE_ENTRY entry = CONTAINING_RECORD(pEntry,LDR_DATA_TABLE_ENTRY,InMemoryOrderLinks);;
	    printf("num1=%x %x\n", entry->DllBase, entry->EntryPoint);
	    count++;
	    pEntry = pEntry->Flink;
	  }
   return count;
}

Decompiled Output:

ulong countIt(PLDR_DATA_TABLE_ENTRY param_1)
{
  _LIST_ENTRY *p_Var1;
  uint count;
  
  p_Var1 = (param_1->InMemoryOrderLinks).Flink;
  count = 0;
  while (&param_1->InMemoryOrderLinks != p_Var1) {
    count = count + 1;
    printf("num1=%x %x\n",p_Var1[2].Flink,p_Var1[2].Blink);
    p_Var1 = p_Var1->Flink;
  }
  return (ulong)count;
}

@emteere
Copy link
Contributor

emteere commented May 24, 2019

@0x6d696368
Copy link
Author

Just to make sure I understand the pointers you are referring to in Ghidra, are these problems occurring just in the listing with the inferred markup, or are they also occurring with memory references and the decompiler as well?

This affects both the decompiler and the listing view.

Do you have some example bytes from a function that exhibits this behavior. I have an example I've coded up to see if it matches what you are seeing.

Your example is reproducing the problem perfectly.

I just want to make sure the changes in consideration will solve this issue and others related to this in other github issues.

What other related Github issues? I searched but couldn't find any relating to negative structure offsets. Maybe they run under a different term. But I would like to take a look.

@MrSapps
Copy link

MrSapps commented Jun 4, 2019

Can I just say that CONTAINING_RECORD isn't the only thing that would cause this? Sometimes the compiler generates code where a structure is accessed from another "base" address.

So for example:

struct Foo
{
int a;
int b;
int c;
};

void DoFoos(Foo* foos)
{
  for (int i=0; i<10; i++)
  {
    foo[i].a = 1;
    foo[i].b = 2;
    foo[i].c = 3;
  }
}

Would look like:

void DoFoos(Foo* foos)
{
  int* offSetToFoo = &foo->b;
  for (int i=0; i<10; i++)
  {
    offSetToFoo[-1] = 1;
    offSetToFoo[0] = 2;
    offSetToFoo[2] = 3;
    offSetToFoo += 3;
  }
}

But obviously this gets crazy with non trivial types..

@dennisbabkin
Copy link

I'm wondering if there's any progress on this?

Also @0x6d696368 did you find a workaround (except creation of a new struct by cutting out first members, which is awful inconvenient)?

@0x6d696368
Copy link
Author

Also @0x6d696368 did you find a workaround (except creation of a new struct by cutting out first members, which is awful inconvenient)?

Unfortunately, no.

@saidelike
Copy link

saidelike commented Nov 13, 2019

Even though I agree with all the above, IDA 7.2 actually improved it even more with shifted pointers. See https://www.hex-rays.com/products/ida/support/idadoc/1695.shtml and https://www.hex-rays.com/products/ida/7.2/index.shtml.

Basically, you can define

int *__shifted(mystruct,20) myptr;

where:

        struct mystruct
        {
          char buf[16];
          int dummy;
          int value;            // <- myptr points here
          double fval;
        };

I think having that in Ghidra too would solve all the issues above, and generically.

@chq-matteo
Copy link

Has there been any progress on this issue?

Can we help in any way?

@saruman9
Copy link
Contributor

saruman9 commented Feb 5, 2020

@emteere, are you working on this or is the community needing to think how to fix it?

@dennisbabkin
Copy link

@saruman9 I think it's the latter.

@Adubbz
Copy link

Adubbz commented Mar 14, 2020

An equivalent to IDA's shifted pointers in Ghidra is also something I'm finding myself increasingly in need of. Hopefully someone manages to figure something out.

@Popax21
Copy link

Popax21 commented Aug 6, 2020

For everyone needing this (including me): I'm currently working on implementing shifted pointers, expect a PR in the next few days ;)

@Popax21
Copy link

Popax21 commented Aug 7, 2020

Update: I now have a working implementation (I'm going to create a PR this afternoon), but it isn't ready to merge yet (mainly because 3% of the integration tests fail, and there are no new unit tests)

Popax21 added a commit to Popax21/ghidra that referenced this issue Aug 7, 2020
@Popax21
Copy link

Popax21 commented Aug 7, 2020

Second Update: The PR (#2189) is now out! Might take a while to merge though, because it affects so much of the codebase.

@saidelike
Copy link

What is the ghidra base commit you used in case we want to test your PR?

@Popax21
Copy link

Popax21 commented Aug 7, 2020

I rebased my commit right before pushing it, so it should be the current HEAD of the master branch

PS: There was actually a merge conflict, because in the week I was working on the patch, there was actually a change to one part of the code I was simultaneously working on

saruman9 pushed a commit to saruman9/ghidra that referenced this issue Sep 4, 2020
@IvanDSM
Copy link

IvanDSM commented Aug 22, 2021

Still a problem as of 10.1. Really need this feature :/

@EuanKirkhope
Copy link

Would be really nice to have this feature!!!

@0xBEEEF
Copy link

0xBEEEF commented Sep 19, 2021

I'll join you here: Such a feature is really needed as soon as possible. From my personal experience gathered so far, the analysis is very limited because you have to "convert" everything first to get to the originally meant field. From my point of view it is essential for the analysis of OOP with inheritance.

@dechamps
Copy link

dechamps commented Jan 30, 2022

I noticed that a similar problem occurs when dealing with some functions that were compiled as C++ virtual member functions. For example in the case of virtual Foo::Bar(), the first (auto) parameter is not a pointer to Foo, it's a pointer to the field in Foo that itself points to the relevant vftable. If you're lucky and this is the first vftable then the offset is zero, but obviously that's not always the case. The body of Foo::Bar() will then access Foo fields using adjusted offsets. As far as I can tell there is no way to tell Ghidra that a given variable is a pointer to a specific field of Foo or a specific offset in Foo. This adds to the pain of analysing code that makes use of virtual functions, on top of #516.

@mvanotti
Copy link

I think this issue affects also thread-local variables.

See the example code and how it is decompiled by ghidra:

tls.c compile with CFLAGS="-std=c11 -Wall -Wextra -O0 -ggdb -Werror" make tls

#include <threads.h>
#include <stdio.h>
#include <assert.h>

thread_local int foo = 0;

int main(void) {
	assert(scanf("%d", &foo) == 1);
	printf("foo is: %d\n", foo);
	return 0;
}

Ghidra Decompiler:
image

It would be nice to be able to define the thread-local variables + stack canary when they are used.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Enhancement New feature or request
Projects
None yet
Development

No branches or pull requests