How to find 'debug_logger' function? #1

Open
mediotex opened this issue Jul 4, 2024 · 37 comments

Comments

@mediotex

mediotex commented Jul 4, 2024

Where in the Ghidra directory should the FIDB files (ecos-mips.be.32.fidb and ecos-mips.le.32.fidb) be placed?
Also, regarding the eCos Broadcom Function Auto-Renaming (Ghidra) script BcmDebugLogsRenameFunctions.java: how do I find the 'debug_logger' function in the firmware so that auto-renaming works?

@mediotex changed the title from "r2pipe dependency for ecos_bootloader_analysis.py" to "How to find 'debug_logger' function in firmware?" on Jul 11, 2024
@Anonymous941

Anonymous941 commented Jul 20, 2024

I just had the same problem, and managed to figure it out:

  1. First, click Search > Memory, and search for any named function (one that worked for me was setBridgePortEnable). Make sure format is set to "string" and memory block types is set to "all blocks"
    Screenshot
  2. Right-click on the string name, and click References > Show references to (name)
    Screenshot
  3. Now it should be clear which function it is (but obviously it won't already be named like mine is)
    Screenshot
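
The same search can be approximated outside Ghidra, as a quick check that a name is present in the unpacked image at all. This is only a minimal sketch, assuming the image is a raw decompressed binary; `find_string_offsets` is a hypothetical helper and not part of recos or Ghidra.

```python
def find_string_offsets(data: bytes, name: str) -> list[int]:
    """Return every file offset at which the ASCII string `name` occurs."""
    needle = name.encode("ascii")
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets


# In-memory example; for a real image, read the file first, e.g.
#   data = open("image1.out", "rb").read()
sample = b"\x00\x00setBridgePortEnable\x00pad\x00setBridgePortEnable\x00"
print(find_string_offsets(sample, "setBridgePortEnable"))  # [2, 26]
```

An empty list suggests the name is not in this image, or that the image is still compressed. The values are file offsets, not load addresses, so they will not match Ghidra addresses unless the image base is added.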

@mediotex
Author

mediotex commented Jul 20, 2024

I searched for setBridgePortEnable, but the search returned no results. The same goes for 'debug_logger'.
In general, I have to find a specific function that a) actually exists in my eCos version and b) whose exact name I know, correct? Then, using that function as a basis, I identify the other functions. Is that right? As noted in the recos manual, the FunctionID database files work for eCos 1.0, 2.0, and parts of 3.0. My eCos revision is 3.0.2, so the FIDB files may not be valid for it.
Is it possible to find the exact name of a target function using the CLI shell? I have eCos BFC Application Layer Revision: 3.0.2

@Anonymous941

Anonymous941 commented Jul 21, 2024

Is it possible to find the exact name of a target function using the CLI shell?

Unfortunately, there isn't a direct way to do that that I know of

In general, I have to find a specific function that a) actually exists in my eCos version and b) whose exact name I know, correct?

That's right, but you only need to find one named function (any will do), and then using that named function you can find debug_logger. Once you find debug_logger, the script should be able to automatically find the rest.

Is this a bootloader image or an image1/image2 image? If it's a bootloader image, you can try searching for the function names listed here. If none of those have any results, you can try the instructions below

If it's an image1/image2 image, then it might not have the function named setBridgePortEnable. You can try going to Window > Defined strings and looking for anything that looks like a function name, then follow my instructions in the previous comment

If Defined strings is empty, make sure you unpacked it properly with ProgramStore (by using the -f and -f2 arguments), applied FunctionID and set up memory mappings
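
As a rough cross-check of what Defined strings should contain, the unpacked image can be scanned for NUL-terminated strings that look like identifiers. This is a sketch under stated assumptions: `candidate_names` and its pattern are hypothetical, and it only catches strings that are preceded and followed by a NUL byte.

```python
import re

# One C/C++-style identifier, optionally qualified as Class::Method.
IDENT_RE = re.compile(rb"[A-Za-z_][A-Za-z0-9_]{2,}(?:::[A-Za-z_~][A-Za-z0-9_]*)?")


def candidate_names(data: bytes, min_len: int = 6) -> list[str]:
    """List NUL-delimited chunks that consist of exactly one identifier."""
    names = []
    for chunk in data.split(b"\x00"):
        if len(chunk) >= min_len and IDENT_RE.fullmatch(chunk):
            names.append(chunk.decode("ascii"))
    return names


sample = b"\x01\x02\x00BcmEcosSocket::Bind\x00hello world\x00UcdMsgEvent\x00ab\x00"
print(candidate_names(sample))  # ['BcmEcosSocket::Bind', 'UcdMsgEvent']
```

If this finds nothing on the real file, the image is probably still packed, which matches the advice above to re-check the ProgramStore step first.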

@mediotex
Author

mediotex commented Jul 21, 2024

image1.out is the actual image1 (not the bootloader), which I dumped from RAM using bcm2dump.
I then decompressed it to image1.out using ./ProgramStore -f image1.bin -x

In Window > Defined strings, I tried strings that look like function names. There are many potential candidates (GetSingletonInstance, BcmAgentCoreHelper, BcmAmdFlashDevice, BcmArpFilterSnoop, getAPExpAlgInfo), but in every case, when I right-click the string name and click References > Show references to (name), there is only "Show Reference To Address", not a reference to a function name.
By the way, two recos scripts have errors, BcmDebugLogsRenameFunctions.java and BcmRenameLabelVTable.java: when I add them through the Ghidra Script Manager, they are shown in red with an error and cannot be selected as valid scripts:

BcmDebugLogsRenameFunctions.java:34: error: class DebugLogsAnnotateFunctions is public, should be declared in a file named DebugLogsAnnotateFunctions.java
public class DebugLogsAnnotateFunctions extends GhidraScript {
       ^
BcmRenameLabelVTable.java:25: error: class RenameLabelVTable is public, should be declared in a file named RenameLabelVTable.java
public class RenameLabelVTable extends GhidraScript {
       ^
skipping /home/teknos/ghidra_scripts/BcmRenameLabelVTable.java
skipping /home/teknos/ghidra_scripts/BcmDebugLogsRenameFunctions.java

As far as I understand, this is because the name of a public class must match the name of the .java file that contains it (e.g. public class Foo{} must be placed in Foo.java). So I should either rename the file from BcmDebugLogsRenameFunctions.java to DebugLogsAnnotateFunctions.java, or rename the class from public class DebugLogsAnnotateFunctions { to public class BcmDebugLogsRenameFunctions {
I chose the second option, and did the same for the other script.
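
The mismatch can be spotted before loading a script into Ghidra. Below is a minimal sketch, assuming a single top-level `public class` declaration per file; `public_class_mismatch` is a hypothetical helper and not something Ghidra or recos provides.

```python
import re
from pathlib import Path

PUBLIC_CLASS_RE = re.compile(r"^\s*public\s+class\s+(\w+)", re.MULTILINE)


def public_class_mismatch(file_name: str, source: str):
    """Return (expected, found) when the public class name differs from the
    file's base name; return None when they match or no class is declared."""
    match = PUBLIC_CLASS_RE.search(source)
    if match is None:
        return None
    expected = Path(file_name).stem
    found = match.group(1)
    return None if found == expected else (expected, found)


source = "public class DebugLogsAnnotateFunctions extends GhidraScript {\n}\n"
print(public_class_mismatch("BcmDebugLogsRenameFunctions.java", source))
# ('BcmDebugLogsRenameFunctions', 'DebugLogsAnnotateFunctions')
```

Either side of the returned pair can be changed to fix the error: rename the file to the found class name, or rename the class to the expected one.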

Also, something is unclear about image1.out in Ghidra: some functions are not shown under "Functions" in the Symbol Tree pane; the FUN_007 label is missing. Why is it missing, and where are these functions?
Screenshot

@Anonymous941

Anonymous941 commented Jul 21, 2024

image1.out is the actual image1 (not the bootloader), which I dumped from RAM using bcm2dump. I then decompressed it to image1.out using ./ProgramStore -f image1.bin -x

Is there also an image2? If so, it might (but I'm not sure) be required to get the full firmware. That's what I did for my router, and I was able to decompile it. Try running ./ProgramStore -f image1.bin -f2 image2.bin -x, then use that file. Maybe that will have setBridgePortEnable

when right-click on the string name, and click References > Show references to (name), there are only "Show Reference To Address", not to function name.

That should probably work as well, try that. If it doesn't work, you can try searching for the function names you see now (see my first comment)

As far as I understand, this is because the name of a public class must match the name of the .java file that contains it (e.g. public class Foo{} must be placed in Foo.java). So I should either rename the file from BcmDebugLogsRenameFunctions.java to DebugLogsAnnotateFunctions.java, or rename the class from public class DebugLogsAnnotateFunctions { to public class BcmDebugLogsRenameFunctions { I chose the second option, and did the same for the other script.

That's exactly what I did, and it fixed it for me too (and the script successfully extracted the function names, so there aren't any other issues). It's probably because the scripts are for an old Ghidra version

@mediotex
Author

mediotex commented Jul 22, 2024

Yes, there is an image2. It has the same size but a different MD5. Normally, the second image in a cable modem is used as a backup in case something happens to image1, so I figured they should be identical. Not in this case: their MD5s differ, and image2 has SIP in its name. I decompressed it with ./ProgramStore -f image1.bin -f2 image2.bin -x and will take a look tomorrow.
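
Comparing two images by digest, as described above, can be scripted. This is a small sketch using only the Python standard library; the function names and any file names passed to them are placeholders.

```python
import hashlib


def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hex MD5 digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def same_content(path_a: str, path_b: str) -> bool:
    """True when both files hash to the same MD5 digest."""
    return md5_of(path_a) == md5_of(path_b)
```

For example, `same_content("image1.bin", "image2.bin")` returning False would confirm that the two slots hold different firmware builds rather than a primary and an identical backup.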

@Anonymous941

Normally, the second image in cable modem is used as a backup if something happens with image1. I figured they should be identical.

I didn't know that, for me it's a different size so maybe it varies by router

@mediotex
Author

mediotex commented Jul 22, 2024

The image decompressed with ./ProgramStore -f image1.bin -f2 image2.bin -x has the same MD5 as image1, so they are the same. The decompressed image2 (with SIP) is a newer version, so I'll use it.
It's not fully clear how to split RAM into two regions: .text (code) and .data (data). The image2 memory layout is as follows:

.text start: 0x80004000
.text end: 0x8079e4d0
.text length: 0x79a4d0
.data start: 0x8079e4d4
.data end: 0x8093e220
.data length: 0x19fd4c
.bss_start: 0x80aad090
.bss_end: 0x80d19760
stack start: 0x80b15660
stack end: 0x80b19660

In the example above for setting the memory mapping, the Start Address is 80004000, but mine is 00000000. Should I change this, and what should I enter in the address fields? Does Block Length in the screenshot mean the length of each part of the memory layout?
Screenshot_2024-07-22_18-32-40

@mediotex
Author

I found two functions, BcmAmdFlashDevice and UcdMsgEvent, but as I noted in previous posts, there is only "Show Reference To Address", not a reference to a function name, and no references to those addresses are found. Maybe the addresses are wrong.

@Anonymous941

Anonymous941 commented Jul 22, 2024

It's not fully clear, how to split RAM in two regions: .text (code) and .data (data). My memory layout is as follows:

Ah, I think that's been the issue all along. I found the instructions confusing too, and the screenshots on the page were extremely helpful (albeit still confusing). Here's how I did it:

  • The start address is set when you import the firmware image into Ghidra. Unless you specifically change it, it'll default to 0x00000000, which is why that's the starting address for you. You have to click the house icon and change it to the value of .text start
  • For some reason, Ghidra's length is always 1 more than what the recos script reports. So set the end to .text end, and the length in Ghidra should automatically be set to 1 more than .text length
  • For the new block, set its name to .data, and leave length as it is
  • Now click split again and name it .bss, set its start address to .bss_start, and its length to .bss_end - .bss_start. You have to manually do this in a hexadecimal calculator, in this case it's 0x80d19760 - 0x80aad090 = 0x26c6d0
  • Click split a final time, name it stack, and do the same thing as .bss (length is 0x80b19660 - 0x80b15660 = 0x4000)
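
The hexadecimal subtraction in the last two steps can be done in Python instead of a calculator. A minimal sketch using the addresses quoted in this thread; the helper names are made up for illustration.

```python
def region_length(start: int, end: int) -> int:
    """Length of a region from the start/end values printed by map_memory.py."""
    return end - start


def is_inside(inner: tuple[int, int], outer: tuple[int, int]) -> bool:
    """True when the inner (start, end) range lies within the outer one."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


bss = (0x80AAD090, 0x80D19760)
stack = (0x80B15660, 0x80B19660)

print(f".bss length:  {region_length(*bss):#x}")    # 0x26c6d0
print(f"stack length: {region_length(*stack):#x}")  # 0x4000
print("stack inside .bss:", is_inside(stack, bss))  # True
```

With these particular numbers the stack range falls inside the .bss range, which is worth keeping in mind when adding both as separate blocks.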

Once all these are set, hopefully it will finally tell you where those strings are referenced

@mediotex
Author

mediotex commented Jul 22, 2024

I'm confused by step 4:

Now click split again and name it .bss, set its start address to .bss_start, and its length to .bss_end - .bss_start.

Should I select and split here the ram or .data section, or click Add a new block to memory?

recos instruction says "Once that’s done, we can add new regions. We can add the BSS as an overlay"?

Screen

@mediotex
Author

Not sure if it's correct:

Screen2
I don't see a .text section

@Anonymous941

Anonymous941 commented Jul 22, 2024

Should I select and split here the ram or .data section, or click Add a new block to memory?

Start by splitting ram and data, then use Add a new block to memory for .bss and stack

I don't see .text section

That only exists in map_memory.py's output, not in Ghidra. In Ghidra, it's called ram

Not sure if its correct:

Here's what I have:

Screenshot

Here's the map_memory.py output:

.text start: 0x80004000
.text end: 0x80d5e618
.text length: 0xd5a618
.data start: 0x80d5e61c
.data end: 0x80fc64d0
.data length: 0x267eb4
.bss_start: 0x8140ece8
.bss_end: 0x81642430
stack start: 0x8151a6e8
stack end: 0x8151e6e8

Your screen looks mostly correct, but there's an off-by-one error: "length" should match the map_memory.py output, and "end" should be one less than the map_memory.py end. E.g. ram's length should be 0x79a4d0, not 0x79a4d1

@mediotex
Author

mediotex commented Jul 24, 2024

Your screen looks mostly correct, but there's an off-by-one error, "length" should be what you see in map_memory.py and "end" should be one off. eg ram's length should be 0x79a4d0, not 0x79a4d1

I'm not sure the ram length should be set to 0x79a4d0: if I set "Block Length" to 0x79a4d0, Ghidra automatically recalculates the "End Address" as 0x8079e4cf, not the 0x8079e4d0 given in the map_memory.py output.

@Anonymous941

@mediotex The screenshots in the recos article show end, not length, being one off from the map_memory.py output. Setting either field makes Ghidra recalculate the other
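
The off-by-one comes from Ghidra treating the end address as inclusive, which is consistent with the values reported above. A short sketch of both directions of the calculation, using the .text numbers from this thread; the function names are illustrative only.

```python
def inclusive_end(start: int, length: int) -> int:
    """Last address covered by a block of `length` bytes starting at `start`."""
    return start + length - 1


def length_from_inclusive_end(start: int, end: int) -> int:
    """Block length when `end` is the last address that belongs to the block."""
    return end - start + 1


start = 0x80004000
# Entering the map_memory.py length gives an end one below the reported end.
print(f"{inclusive_end(start, 0x79A4D0):#x}")               # 0x8079e4cf
# Entering the map_memory.py end gives a length one above the reported length.
print(f"{length_from_inclusive_end(start, 0x8079E4D0):#x}")  # 0x79a4d1
```

Both conventions describe nearly the same block; they differ by a single byte at the boundary, so the choice mainly matters for where the next block starts.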

@mediotex
Author

mediotex commented Jul 26, 2024

Ok, got it. I adjusted the Memory Map and re-analyzed the image. I found the function s_BcmAmdFlashDevice_807a316c: how do I rename the function, and to what?

reference

Screenshot_

What if that logging function is not present in the code? For example, I can't find any string called debug_logger in the code.
How did you rename the function in your post, and to what name?

@mediotex changed the title from "How to find 'debug_logger' function in firmware?" to "How to find 'debug_logger' function?" on Jul 28, 2024
@mediotex
Author

mediotex commented Aug 1, 2024

The functions s_BcmAgentDescriptor::Bind_80868acc and s_BcmEcosSocket_808a0014 are available, but there is no debug_logger string. Can you explain what specifically should be renamed? The script description notes:


// Identify calls to debug logging functions, re-construct the debug
// logging parameters and rename the calling function based on that
// string.
//
// Example: a function do: debug_logger(2, "Entering func: BcmEcosSocket::Bind"),
// we can consider the calling function is BcmEcosSocket::Bind.

It's not clear at all what should be renamed there.

@Anonymous941

Anonymous941 commented Aug 3, 2024

Sorry for the delay, I forgot to check my GitHub notifications

What if that logging function is not available in the code? For example, I can't find in code any strings called debug_logger. How did you rename the function and to what name in your post?

There isn't actually a debug_logger string in any router; you have to manually find and rename it yourself in Ghidra (recos just calls it that). My screenshot showed it after I already renamed it, but you have to find it from context, since there aren't any strings.
You can tell which function it is because it always has the calling function's name passed into it, so you just have to find one function with a string (which you have), and then you should be able to figure it out

After you find that one function, then these recos scripts will simply rename everything else (that it can), no more manual renaming required

Judging from your screenshot, you're 99% of the way there - just right-click on FUN_802da918 in the disassembly, choose Rename Function, and name it debug_logger yourself. Then hopefully you should be able to just run the BcmDebugLogsRenameFunctions script and it will rename all the other functions! If it fails, try renaming FUN_802714f4 to debug_logger instead of FUN_802da918

@mediotex
Author

mediotex commented Aug 4, 2024

Oh, that changes things.
Is this what I should get? Not so many, only 36 functions.

rename-func

And for C++ vtable Auto-Renaming: should I just run BcmRenameLabelVTable.java?

@Anonymous941

Anonymous941 commented Aug 4, 2024

Is this what I should get? Not so many, only 36 functions.

I don't think so, I got hundreds of functions. Can you show what the function names are on the symbol tree? It should be under Labels at the top. Also, can you show a screenshot of the decompile window scrolled to the right?

And for C++ vtable Auto-Renaming: should I just run BcmRenameLabelVTable.java?

Yes, you can just execute it after they're renamed (but I'm not sure that it worked correctly)

@mediotex
Author

mediotex commented Aug 4, 2024

I'm not sure how to expand the functions tree to show everything. Under Window > Functions I have a total of 31813 functions, but they were already there right after I analyzed the image, before I ran the script. From the recos author's web pages, it appears that there are several debug_loggers (up to 5?). Also, I have a lot of FID_conflict:_GLOBAL__I$52000_files labels.

Screen3

Screen4

Screen1

@abduqasem

I used the Ghidra loader for eCos, and it does all the work. The problem is that when I try to load the image using the angr framework, I can't get the same function addresses. Has anyone faced the same issue?

@Anonymous941

Anonymous941 commented Aug 5, 2024

From the recos author's web pages appears that there are a few debug_loggers (up to 5?)

Not sure what you mean...

Window > Functions: I have total 31813 functions, but they were already present there (just after I've analyzed image), before I ran the script

I meant under "Symbol Tree" (shown in your screenshot), there should be something you can expand called "Labels"

The function in your screenshot (FUN_8004578) should definitely be automatically renamed by the script to BcmAmdFlashDevice, but for some reason it isn't. debug_logger is definitely the right function, so the only reasons I can think of are that the memory map is incorrect or that there's a bug in recos. @ecos-wtf any ideas?

@mediotex
Author

mediotex commented Aug 5, 2024

In the document's "Automated Function Renaming" chapter, there are several debug_logger names (debug_logger2 and debug_logger5, then debug_logger3 and debug_logger4 in the following pictures). I'm not sure what these mean; @ecos-wtf's explanation is unclear on this point.

debug_logger_2

Here is the function tree, expanded.
For the Memory Map I used the values from the script, and I only added the "stub entry vector" block as shown in the picture.

@qkaiser
Member

qkaiser commented Sep 24, 2024

Hi ! I was AFK for the last 3 months, I'll look into this in the coming days :)

@Anonymous941

Hi ! I was AFK for the last 3 months, I'll look into this in the coming days :)

btw thanks for making these tools! They've been really helpful for modding my router

@qkaiser
Member

qkaiser commented Sep 26, 2024

Where should be placed the FIDB files (ecos-mips.be.32.fidb and ecos-mips.le.32.fidb) in Ghidra directory?

You can put the FIDB files under ./Ghidra/Features/FunctionID/data/ or you can import it manually using the GUI (Tools -> Function ID -> Attach existing FIDB).
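
For the first option, copying the files into place can be scripted. This is only a sketch: the subdirectory is the one named above, while `install_fidb` and the install root passed to it are assumptions for illustration.

```python
import shutil
from pathlib import Path

# Directory named in the comment above, relative to the Ghidra install root.
FIDB_SUBDIR = Path("Ghidra") / "Features" / "FunctionID" / "data"


def install_fidb(ghidra_root: str, fidb_paths: list[str]) -> list[Path]:
    """Copy the given .fidb files into the Function ID data directory."""
    target = Path(ghidra_root) / FIDB_SUBDIR
    if not target.is_dir():
        raise FileNotFoundError(f"{target} not found; is this a Ghidra install?")
    return [Path(shutil.copy2(path, target)) for path in fidb_paths]


# Hypothetical usage; adjust the install root to your system:
#   install_fidb("/opt/ghidra", ["ecos-mips.be.32.fidb", "ecos-mips.le.32.fidb"])
```

Attaching the database through the GUI, as described above, avoids touching the install directory and may be preferable on shared or read-only installs.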

Also about eCOS Broadcom Function Auto-Renaming (Ghidra) - BcmDebugLogsRenameFunctions.java script: how to find 'debug_logger' function in firmware for auto-renaming?

All the debug_logger functions are functions that were manually reversed and renamed, as explained by @Anonymous941. They're not present under that name when you load the firmware. Cross-referencing debug strings usually does the trick to find these functions. You can choose a different name too; just edit the Java files to reflect that.

@qkaiser
Member

qkaiser commented Sep 26, 2024

As noted in the recos manual, the FunctionID database files work for eCOS 1.0, 2.0, and parts of 3.0. My eCos revision is 3.0.2, so possible, the FIDB files are not valid.

I don't think there were lots of changes between revisions 3.0 and 3.0.2 so most standard functions will be matched by Function ID. The FIDB is built for MIPS targets so make sure that's what your firmware is running on.

@qkaiser
Member

qkaiser commented Sep 26, 2024

From document, chapter "Automated Function Renaming", we can see that there are a few 'debug_logger' names - debug_logger2, debug_logger5, in next pics debug_logger3, debug_logger4: not sure what this mean, but @ecos-wtf explanations are unclear in this point.

These are just logging functions that were manually renamed during the reversing process. They're named differently because they have different signatures (which the Ghidra script needs to know since it's resolving the log string argument using Ghidra's API).
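
To illustrate why the signature matters, here is a toy model in Python rather than the script's actual Java code. The table is hypothetical: only the `debug_logger` entry follows the example in the script header (`debug_logger(2, "Entering func: ...")`); the index for `debug_logger2` is invented for the sake of the example.

```python
# Hypothetical mapping: which positional argument carries the log string.
STRING_ARG_INDEX = {
    "debug_logger": 1,   # matches debug_logger(2, "Entering func: ...")
    "debug_logger2": 2,  # made-up index, for illustration only
}


def log_string_from_call(callee: str, args: list):
    """Return the log string of a logger call, or None if it can't be found."""
    index = STRING_ARG_INDEX.get(callee)
    if index is None or index >= len(args):
        return None
    value = args[index]
    return value if isinstance(value, str) else None


print(log_string_from_call("debug_logger", [2, "Entering func: BcmEcosSocket::Bind"]))
# Entering func: BcmEcosSocket::Bind
```

A logger renamed to the wrong variant would make a lookup like this read the wrong argument, which is one plausible reason a rename pass could come back with fewer results than expected.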

@qkaiser
Member

qkaiser commented Sep 26, 2024

I used the Ghidra loader for eCos, and it does all the work. The problem is that when I try to load the image using the angr framework, I can't get the same function addresses. Has anyone faced the same issue?

Can you expand on that ? Any specific reason you want to use angr rather than Ghidra for this ?

@qkaiser
Member

qkaiser commented Sep 26, 2024

@mediotex do you need more help with this ? Otherwise I'll close the ticket.

@mediotex
Author

I will try to rename function tomorrow and answer.

@mediotex
Author

I'm back on this issue. First of all, a vast number of functions are labeled FID_conflict:_xxx
I guess this is related to the FIDB database recognizing functions incorrectly. Why does it happen?

So I renamed the function FUN_802da918 (BcmAmdFlashDevice) to "debug_logger", then ran the BcmDebugLogsRenameFunctions script: it renamed 36 functions. Is this what I'm supposed to get from these actions? What is the benefit of this renaming? What did it give me?

rename

@qkaiser
Member

qkaiser commented Sep 28, 2024

I'm back to the issue. First of all, there are a vast number of functions designated as FID_conflict:_xxx
I guess this related to FIDB database and incorrect recognition of functions. Why?

Small or inlined functions can be recognized as corresponding to multiple FIDB signatures, which leads to conflicts. It's not really problematic as it should not affect main standard functions.

So I tried to rename function FUN_802da918 BcmAmdFlashDEvice to "debug_logger", then I ran the BcmDebugLogsRenameFunctions script: it renamed 36 functions. Is this what I am supposed to get from these actions?

Yes. The script renames functions calling debug loggers based on the function name located in the debug string provided to the debug logger.

You can double check that this function is indeed called 36 times by tracing cross-references to the now renamed debug_logger. There are other debug logging functions. The more you spot, the more you'll be able to rename functions and understand what's going on.
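
The renaming idea can be sketched outside Ghidra as follows. This is a simplified Python model, not the script's Java implementation; it assumes log strings of the form shown in the script header ("Entering func: Name"), and `plan_renames` is a hypothetical helper.

```python
import re

# Assumed log format, taken from the example in the script header.
FUNC_RE = re.compile(r"func:\s*([A-Za-z_][\w:~]*)")


def name_from_log_string(text: str):
    """Extract the function name embedded in a debug string, if any."""
    match = FUNC_RE.search(text)
    return match.group(1) if match else None


def plan_renames(calls):
    """Map each still-unnamed caller (FUN_*) to the first name it logs.

    `calls` is an iterable of (caller_name, log_string) pairs, one per
    cross-reference to the debug logger.
    """
    renames = {}
    for caller, text in calls:
        name = name_from_log_string(text)
        if name and caller.startswith("FUN_"):
            renames.setdefault(caller, name)
    return renames


calls = [
    ("FUN_802714f4", "Entering func: BcmEcosSocket::Bind"),
    ("FUN_802714f4", "Entering func: SomethingElse"),
    ("FUN_80001000", "no function name in this string"),
]
print(plan_renames(calls))  # {'FUN_802714f4': 'BcmEcosSocket::Bind'}
```

In this model each caller yields at most one rename, and only when its log string actually contains a name, so the number of renamed functions is bounded by the number of distinct callers of the logger you identified.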

What is the benefit and convenience of this renaming? What did it give me?

All reverse engineering work, whether you're looking at raw embedded firmware or malware, starts by renaming functions, renaming variables, inferring function signatures, and reconstructing structs. This is done to reduce the cognitive load on the reverser who's trying to understand code paths.

From my own experience (and @Anonymous941's, apparently), FIDB and the renaming scripts together allow for the recovery of approximately 5000 functions (function name and the VTable they're linked to). This leads to easier understanding and analysis of large embedded firmware images. Without this, you're basically trying to find something within 50,000 obfuscated functions.

@qkaiser
Member

qkaiser commented Sep 28, 2024

@mediotex I think it would help if you could explain what you're trying to achieve with your firmware image and this toolkit. Do you have any experience with reverse engineering ?

@mediotex
Author

mediotex commented Sep 28, 2024

Generally, following this logic, I now need to search for the UcdMsgEvent function (as an example), find the references to it, and rename the FUN again? Then search again for other functions containing debug logging events, and repeat. But what if the specific function I'm interested in isn't related to debug logging in any way? Do these renames just narrow the search by excluding logging functions?
I think these eCos functions vary across different BCM processors and firmware implementations.

I think it would help if you could explain what you're trying to achieve with your firmware image and this toolkit. Do you have any experience with reverse engineering ?

I'm trying to make the firmware handier by tweaking some features. I don't have much experience in reverse engineering; I'm just at the very beginning of my journey.

About assigning memory regions in the Memory Map: are there special cases where we also need to define the locations of vectors (common vector, stub entry vector, debug vector, vsr_table, virtual vector table), in addition to .text, .data, .bss and stack? Are their start and end addresses the same across Broadcom chips?

@Anonymous941

Anonymous941 commented Sep 29, 2024

But what if the specific function I'm interested in isn't related to debug logging in any way: these renames just narrow the search by excluding logging functions?

You misunderstand: whether a function has anything to do with debug logging doesn't matter here. For testing, certain functions (I'm not sure what decides which ones) contain strings that are used when reporting crashes. This makes it easier for Broadcom engineers to debug crashes during development, because the report would say, for example, something like "Crash in UcdMsgEvent" rather than "Crash in 0xabcdef". Because of an oversight, these strings stay in retail routers and can be used to recover some of the original function names, which would otherwise be impossible

Generally, following this logic, now I need to search for UcdMsgEvent function (as example), find the references to it and rename FUN again?

No, rename functions only if you can manually figure out what they do. Then it makes it easier to understand other things

As I think, these eCos functions vary across different BCM processors and f/w implementations.

Yes, that's correct

I'm trying to make the firmware more handy by tweaking some features.

Do you mind making this open source if you end up doing it? I'd be interested to see (kind of) the first softmod for eCos routers, and might even port it to my DDW36C. Also here's an article that might be useful (written by @qkaiser actually), it could be used to give you remote access, and also is an example of modifying this firmware image

About assigning memory regions in the Memory Map: are there a special cases where we also need to define locations of vectors (common vector, stub entry vector, debug vector, vsr_table, virtual vector table), in addition to .text, .data, .bss and stack? Are their Start / End addresses same for Broadcom chips?

I doubt it, these are probably just for other platforms that have these things

I have no much experience in reverse engineering, I'm just at the very beginning of my journey.

I recommend trying this on a smaller scale first, try finding example binaries like in this article and seeing if you can figure out the password. You can also try compiling open source projects and seeing how things correspond to the original source code. This will hopefully give you a better idea of why function names are useful

4 participants