Other sizes of data (group size and Endianness) #104

Closed
ACleverDisguise opened this issue Oct 23, 2020 · 8 comments
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

ACleverDisguise commented Oct 23, 2020

I frequently have to dump data files (ADC output, for example) that don't just have byte-oriented data. It would be nice to be able to specify data width in the dump so I get the hex data grouped in the natural data size instead of having to do the little-endian two-step and mentally group indistinguishable bytes by 2 or 4 or whatever. Something like:

--word-size=1 (uint8_t, default)
--word-size=2 (uint16_t)
--word-size=4 (uint32_t)
--word-size=8 (uint64_t)
--word-size=16 (uint128_t)

That covers the common-ish types. If you want to be really brave you could do weird crap like 3-byte or 17 byte, but that is likely low return on investment.

Not all such data is little-endian, so an extra flag for those cases where word-size > 1 would be:

--little-endian (default)
--big-endian

Also, interpretation could be signed or unsigned:

--signed
--unsigned (default)

Of course with this you'd drop the byte-oriented colouration (but maybe with --signed you'd highlight negative numbers in red or something).
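
For illustration, the endianness and signedness choices above amount to reading the same bytes in different ways. Below is a minimal Rust sketch for the 2-byte case. The helper names `decode_u16` and `decode_i16` and the `Endian` enum are hypothetical (they are not taken from any existing tool); only the standard library's `from_le_bytes` / `from_be_bytes` are real APIs.

```rust
/// Byte order of a multi-byte word (hypothetical type for this sketch).
#[derive(Clone, Copy)]
enum Endian {
    Little,
    Big,
}

/// Interpret two bytes as an unsigned 16-bit word.
fn decode_u16(bytes: [u8; 2], endian: Endian) -> u16 {
    match endian {
        Endian::Little => u16::from_le_bytes(bytes),
        Endian::Big => u16::from_be_bytes(bytes),
    }
}

/// Interpret the same two bytes as a signed (two's complement) 16-bit word.
fn decode_i16(bytes: [u8; 2], endian: Endian) -> i16 {
    match endian {
        Endian::Little => i16::from_le_bytes(bytes),
        Endian::Big => i16::from_be_bytes(bytes),
    }
}

fn main() {
    // Two bytes as they appear in the file, in file order.
    let raw = [0xFE, 0xFF];

    // Little-endian: the word is 0xFFFE.
    println!("{}", decode_u16(raw, Endian::Little)); // 65534
    println!("{}", decode_i16(raw, Endian::Little)); // -2

    // Big-endian: the word is 0xFEFF.
    println!("{}", decode_u16(raw, Endian::Big)); // 65279
    println!("{}", decode_i16(raw, Endian::Big)); // -257
}
```

The same pattern extends to 4- and 8-byte words via `u32`/`i32` and `u64`/`i64`.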

@sharkdp
Owner

sharkdp commented Oct 24, 2020

Thank you for the feedback.

It's not entirely clear to me what the output would look like.

Say I choose --word-size=2 (uint16_t) and the input contains 0xAB 0xCD 0x12 0x34. Would you like to see

CDAB 3412

for --little-endian and

ABCD 1234

for --big-endian?

@ACleverDisguise
Author

That's pretty much exactly what I was picturing, yes.
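
The behavior agreed on above can be sketched in a few lines of Rust. This is only an illustration of the requested output format, not hexyl's implementation: `group_hex` and `Endian` are hypothetical names, and how a trailing partial word should be displayed is an assumption (here it is simply printed with whatever bytes remain).

```rust
/// Byte order used when displaying each word (hypothetical type for this sketch).
#[derive(Clone, Copy)]
enum Endian {
    Little,
    Big,
}

/// Format `data` as uppercase hex, grouped into words of `word_size` bytes.
/// For little-endian display, the bytes of each word are reversed so that the
/// most significant byte is printed first.
fn group_hex(data: &[u8], word_size: usize, endian: Endian) -> String {
    assert!(word_size > 0, "word size must be at least 1");
    data.chunks(word_size)
        .map(|word| {
            let mut bytes = word.to_vec();
            if let Endian::Little = endian {
                bytes.reverse();
            }
            // Two hex digits per byte, no whitespace inside a word.
            bytes
                .iter()
                .map(|b| format!("{:02X}", b))
                .collect::<String>()
        })
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    let input = [0xAB, 0xCD, 0x12, 0x34];
    println!("{}", group_hex(&input, 2, Endian::Little)); // CDAB 3412
    println!("{}", group_hex(&input, 2, Endian::Big)); // ABCD 1234
}
```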

@sharkdp
Owner

sharkdp commented Oct 31, 2020

This looks similar to xxd's -groupsize option, if I am not mistaken:

       -g bytes | -groupsize bytes
              Separate the output of every <bytes> bytes (two hex characters or  eight
              bit-digits  each)  by  a whitespace.  Specify -g 0 to suppress grouping.
              <Bytes> defaults to 2 in normal mode, 4 in little-endian mode and  1  in
              bits mode.  Grouping does not apply to postscript or include style.

I recently came across this when reading this blog post which makes use of -g to inspect ELF64 executables.

sharkdp added the enhancement and help wanted labels Oct 31, 2020
@ACleverDisguise
Author

ACleverDisguise commented Nov 2, 2020

It is similar to -g and -e in xxd, yes, but I'm not a huge fan of their nomenclature and their rather bizarre default assumptions. (Like the assumption that "normal" is big-endian, which hasn't been "normal" for decades now.) I can understand, perhaps, that you might want to keep it compatible for easier transition for users, though, so I'm only going to express a mild preference for breaking free from it.

@sharkdp
Owner

sharkdp commented Dec 5, 2022

@RinHizakura If you find the time, could you maybe summarize what is and what is not possible with your new option in #170? (released today)

@RinHizakura
Contributor

The new option --group-bytes provides the functionality to group multiple octets as a unit, which means that several bytes will be shown together without whitespace. It is quite similar to the -groupsize option in xxd; however, the group size is currently limited to 1, 2, 4, or 8.

On the other hand, the groups can currently only be shown in big-endian format. Little-endian dumps are not supported yet.
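
To make the two limitations concrete, here is a rough Rust sketch of a group-size check restricted to 1, 2, 4, or 8, plus grouping in file order (i.e. the big-endian view only). The function names `parse_group_size` and `group_in_file_order` are made up for this example and are not hexyl's actual code.

```rust
/// Accept only the group sizes mentioned above: 1, 2, 4, or 8.
/// (Hypothetical helper; the error messages are invented for this sketch.)
fn parse_group_size(arg: &str) -> Result<usize, String> {
    let n: usize = arg
        .trim()
        .parse()
        .map_err(|_| format!("not a number: {}", arg))?;
    match n {
        1 | 2 | 4 | 8 => Ok(n),
        _ => Err(format!("unsupported group size {}; expected 1, 2, 4, or 8", n)),
    }
}

/// Show the bytes of each group in file order, without whitespace inside a
/// group. No byte swapping is done, so this corresponds to a big-endian view.
fn group_in_file_order(data: &[u8], size: usize) -> String {
    data.chunks(size)
        .map(|group| {
            group
                .iter()
                .map(|b| format!("{:02x}", b))
                .collect::<String>()
        })
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    let data = [0xAB, 0xCD, 0x12, 0x34];
    match parse_group_size("2") {
        Ok(size) => println!("{}", group_in_file_order(&data, size)), // abcd 1234
        Err(msg) => eprintln!("{}", msg),
    }
    // A size outside the supported set is rejected.
    assert!(parse_group_size("16").is_err());
}
```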

@sharkdp
Owner

sharkdp commented Dec 7, 2022

> The new option --group-bytes provides the functionality to group multiple octets as a unit, which means that several bytes will be shown together without whitespace. It is quite similar to the -groupsize option in xxd; however, the group size is currently limited to 1, 2, 4, or 8.

I think this limitation is fine for now. 16 would probably be nice, but I understand that it probably interferes with --panels.

> On the other hand, the groups can currently only be shown in big-endian format. Little-endian dumps are not supported yet.

Right. I agree with @ACleverDisguise that this would be a really nice feature to have. So let's keep this ticket open for now.

sharkdp changed the title from "Other sizes of data." to "Other sizes of data (group size and Endianness)" Dec 7, 2022
@sharkdp
Owner

sharkdp commented Apr 25, 2023

I think the main functionality requested in this ticket is now supported, with #189 by @RinHizakura also merged.

sharkdp closed this as completed Apr 25, 2023