Other sizes of data (group size and Endianness) #104

ACleverDisguise · 2020-10-23T06:18:58Z

I frequently have to dump data files (ADC output, for example) that don't just have byte-oriented data. It would be nice to be able to specify data width in the dump so I get the hex data grouped in the natural data size instead of having to do the little-endian two-step and mentally group indistinguishable bytes by 2 or 4 or whatever. Something like:

--word-size=1 (uint8_t, default)
--word-size=2 (uint16_t)
--word-size=4 (uint32_t)
--word-size=8 (uint64_t)
--word-size=16 (uint128_t)

That covers the common-ish types. If you want to be really brave you could do weird crap like 3-byte or 17-byte words, but that is likely low return on investment.

Not all such data is little-endian, so an extra flag for those cases where word-size > 1 would be:

--little-endian (default)
--big-endian

Also, interpretation could be signed or unsigned:

--signed
--unsigned (default)

Of course with this you'd drop the byte-oriented colouration (but maybe with --signed you'd highlight negative numbers in red or something).
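To make the requested semantics concrete, here is a minimal Python sketch of how word size, byte order, and signedness could combine. It only illustrates the idea and is not hexyl code: the helper name `decode_words` and its parameters are hypothetical, and ignoring a trailing partial word is an assumption made for this example.

```python
def decode_words(data: bytes, word_size: int = 1,
                 byteorder: str = "little", signed: bool = False) -> list:
    """Split data into word_size-byte words and decode each as an integer.

    Hypothetical helper for illustration only. Trailing bytes that do not
    fill a whole word are ignored (an assumption, not part of the proposal).
    """
    if word_size < 1:
        raise ValueError("word_size must be at least 1")
    usable = len(data) - len(data) % word_size
    return [
        int.from_bytes(data[i:i + word_size], byteorder=byteorder, signed=signed)
        for i in range(0, usable, word_size)
    ]


sample = bytes([0x01, 0x80, 0xFF, 0x7F])
print(decode_words(sample, word_size=2))                    # little-endian, unsigned
print(decode_words(sample, word_size=2, byteorder="big"))   # big-endian, unsigned
print(decode_words(sample, word_size=2, signed=True))       # little-endian, signed
```

Because `int.from_bytes` accepts any length, odd sizes such as 3 bytes fall out of the same code path in this sketch.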

sharkdp · 2020-10-24T17:24:49Z

Thank you for the feedback.

It's not entirely clear to me what the output would look like.

Say I choose --word-size=2 (uint16_t) and the input contains 0xAB 0xCD 0x12 0x34. Would you like to see

CDAB 3412

for --little-endian and

ABCD 1234

for --big-endian?

ACleverDisguise · 2020-10-24T23:37:52Z

That's pretty much exactly what I was picturing, yes.
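The grouping agreed on above can be sketched in a few lines of Python. This is an illustration of the expected output only, not how hexyl implements it: `group_hex` and its parameters are hypothetical names, and reversing a short trailing group as-is is an assumption.

```python
def group_hex(data: bytes, group_size: int = 1, little_endian: bool = False) -> str:
    """Format data as uppercase hex, group_size bytes per group.

    Hypothetical helper for illustration only. With little_endian=True the
    bytes inside each group are reversed, so the most significant byte of a
    little-endian word is printed first.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    groups = []
    for i in range(0, len(data), group_size):
        chunk = data[i:i + group_size]
        if little_endian:
            chunk = chunk[::-1]  # assumption: a short final group is reversed too
        groups.append(chunk.hex().upper())
    return " ".join(groups)


data = bytes([0xAB, 0xCD, 0x12, 0x34])
print(group_hex(data, group_size=2, little_endian=True))   # CDAB 3412
print(group_hex(data, group_size=2, little_endian=False))  # ABCD 1234
```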

sharkdp · 2020-10-31T09:39:06Z

This looks similar to xxd's -groupsize option, if I am not mistaken:

       -g bytes | -groupsize bytes
              Separate the output of every <bytes> bytes (two hex characters or  eight
              bit-digits  each)  by  a whitespace.  Specify -g 0 to suppress grouping.
              <Bytes> defaults to 2 in normal mode, 4 in little-endian mode and  1  in
              bits mode.  Grouping does not apply to postscript or include style.

I recently came across this when reading this blog post which makes use of -g to inspect ELF64 executables.

ACleverDisguise · 2020-11-02T05:30:41Z

It is similar to -g and -e in xxd, yes, but I'm not a huge fan of their nomenclature and their rather bizarre default assumptions. (Like the bizarre assumption that "normal" is big-endian, which hasn't been "normal" for decades now.) I can understand, perhaps, that you might want to keep it compatible for easier transition for users, though, so I'm only going to express a mild preference for breaking free from it.

sharkdp · 2022-12-05T21:04:04Z

@RinHizakura If you find the time, could you maybe summarize what is and what is not possible with your new option in #170? (released today)

RinHizakura · 2022-12-06T15:21:46Z

The new option --group-bytes will provide the functionality to group multiple octets as a unit, which means that several bytes will be shown together without whitespace. It is quite similar to the option -groupsize in xxd, however, the possible group size should only be 1, 2, 4, or 8 currently.

On the other hand, this could only be shown in the big-endian format. The little-endian dump is not supported now.

sharkdp · 2022-12-07T20:57:17Z

> The new option --group-bytes will provide the functionality to group multiple octets as a unit, which means that several bytes will be shown together without whitespace. It is quite similar to the option -groupsize in xxd, however, the possible group size should only be 1, 2, 4, or 8 currently.

I think this limitation is fine for now. 16 would probably be nice, but I understand that it probably interferes with --panels.

> On the other hand, this could only be shown in the big-endian format. The little-endian dump is not supported now.

Right. I agree with @ACleverDisguise that this would be a really nice feature to have. So let's keep this ticket open for now.

sharkdp · 2023-04-25T07:21:11Z

I think the main functionality requested in this ticket is now supported, with #189 by @RinHizakura now also merged.

sharkdp added enhancement New feature or request help wanted Extra attention is needed labels Oct 31, 2020

sharkdp mentioned this issue Oct 31, 2020

(Partial) compatibility with xxd? #121

Closed

2 tasks

whisperity mentioned this issue Jan 8, 2022

Decimal values in place of the hex characters #147

Closed

RinHizakura mentioned this issue Oct 4, 2022

Support byte grouping #170

Merged

sharkdp changed the title ~~Other sizes of data.~~ Other sizes of data (group size and Endianness) Dec 7, 2022

RinHizakura mentioned this issue Apr 18, 2023

Support both little and big endian dump #189

Merged

sharkdp closed this as completed Apr 25, 2023
