
use length of the field value for prefixer, not length of encoded data #227

Merged: 6 commits into master on May 26, 2023

Conversation

@alovak (Contributor) commented Apr 22, 2023

This PR fixes the issue we have with packing/unpacking field values for which one input byte does not map to one output byte.

Context

In the current approach, when we pack the field value, we pass the length of the encoded data to the prefixer. Here is the code for the String field:

func (f *String) Pack() ([]byte, error) {
	data := []byte(f.value)
	// ...
	packed, err := f.spec.Enc.Encode(data)
	// ...
	packedLength, err := f.spec.Pref.EncodeLength(f.spec.Length, len(packed))

The issue here is that the prefixer encodes the length of the packed value, not the original value.

Why haven't we noticed this bug before?

Because the code worked correctly in the common case where one input byte maps to one output byte. Say you have a string and you encode it into ASCII, or you have a slice of bytes (Binary field) and you encode it using Binary encoding. In such cases, len(packed) is equal to len(f.value).

But for BCD, where 1 byte represents 2 digits, it does not work. This was reported by @ddvk in issue #220.

So when you encode "12" into BCD, you get a single byte, 0x12. Currently, when we pack the field, the prefixer encodes a length of 1 byte, not 2 digits. On unpacking, the prefixer returns a length of 1 (a byte count), but the BCD decoder expects a digit count: with 1 as the data length argument to Decode, it drops the first digit and returns "2" instead of "12".

Here is how we unpack the value:

	// prefixer returns length of the data: dataLen
	dataLen, prefBytes, err := f.spec.Pref.DecodeLength(f.spec.Length, data)
	if err != nil {
		return 0, fmt.Errorf("failed to decode length: %w", err)
	}

	// BCD decoder knows that for two digits it should read 1 byte 
	raw, read, err := f.spec.Enc.Decode(data[prefBytes:], dataLen)
	if err != nil {
		return 0, fmt.Errorf("failed to decode content: %w", err)
	}
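
This digit-versus-byte mismatch can be reproduced with a self-contained sketch. The BCD codec below is a simplified stand-in written for illustration, not the library's encoder:

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// bcdEncode packs decimal digits into BCD, two digits per byte
// (left-padded with a zero digit when the length is odd).
func bcdEncode(digits string) []byte {
	if len(digits)%2 != 0 {
		digits = "0" + digits
	}
	out := make([]byte, len(digits)/2)
	for i := range out {
		out[i] = (digits[2*i]-'0')<<4 | (digits[2*i+1] - '0')
	}
	return out
}

// bcdDecode recovers dataLen digits, mirroring how a BCD decoder
// interprets the length returned by the prefixer.
func bcdDecode(b []byte, dataLen int) string {
	nBytes := (dataLen + 1) / 2
	s := hex.EncodeToString(b[:nBytes])
	return s[len(s)-dataLen:] // drop the pad digit for odd lengths
}

func main() {
	packed := bcdEncode("12")
	fmt.Printf("len(value)=2, len(packed)=%d\n", len(packed)) // 2 digits fit in 1 byte

	// The old code encoded len(packed)=1 in the prefix, so the decoder
	// was asked for 1 digit and dropped the first one:
	fmt.Println(bcdDecode(packed, 1)) // "2" -- the bug
	// With the value length (2 digits) the round trip is correct:
	fmt.Println(bcdDecode(packed, 2)) // "12"
}
```

With the fix, the prefixer receives the value length (2) rather than len(packed) (1), so the decoder reads the right number of digits.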

Changes

To fix this, we changed how fields are packed: they now pass the length of the field value, not the length of the encoded value, to the prefixer.

Breaking Change

This PR breaks some tests and may well break your code if you use the following together:

  • use a String field to store a hex string, e.g. NewStringValue("210720")
  • use the ASCIIHexToBytes encoder to encode the hex string into bytes

Such a combination in tests is frequently used for BerTLV composite fields (EMV, ICC, etc.).

Why does it break?

First, when we pack the value, ASCIIHexToBytes converts the hex string to bytes, so the length of the encoded value does not match the length of the original value (the hex string is twice as long). With the changes made to packing, we now use the length of the original value instead of the length of the encoded value, as was the case before.

Second, when we unpack the value, the BerTLV prefixer returns the length of the value in bytes. Using a hex string to store the tag value will not work, since the hex string's length does not match the value's length in bytes.

How to fix it?

The main fix is to stop using the ASCIIHexToBytes encoding with the String field.

There are two options:

  1. Use the new Hex field added in this PR together with the Binary encoding. The Hex field lets you work with a hex string when you set or get the field's value, but when the field is packed or unpacked, it uses the encoded bytes instead of the string.
	// ...
	Subfields: map[string]Field{
			"9A": NewHex(&Spec{
				Description: "Transaction Date",
				Enc:         encoding.Binary,
				Pref:        prefix.BerTLV,
			}),

	// ...

	// work with hex string, which will be treated as bytes during packing/unpacking
	data := &TLVTestData{
		F9A:   NewHexValue("210720"),
		F9F02: NewHexValue("000000000501"),
	}
  2. Use the Binary field together with the Binary encoding and manually set the values of the Binary field.
	// define BerTLV tags as Binary
	"9F02": NewBinary(&Spec{
		Description: "Amount, Authorized (Numeric)",
		Enc:         encoding.Binary,
		Pref:        prefix.BerTLV,
	}),

	// manually convert hex string into bytes
	f9f02, err := hex.DecodeString("000000000501")
	// ...

	err = composite.SetData(&TLVTestData{
		// work with binary value here
		F9F02: NewBinaryValue(f9f02),
	})

fixes #220

@codecov-commenter commented Apr 22, 2023

Codecov Report

Patch coverage: 70.52%; project coverage change: -0.23% ⚠️

Comparison is base (0a8cb89) 73.94% compared to head (f8a46ed) 73.72%.

❗ Current head f8a46ed differs from pull request most recent head 30d24b1. Consider uploading reports for the commit 30d24b1 to get more accurate results


Additional details and impacted files
@@            Coverage Diff             @@
##           master     #227      +/-   ##
==========================================
- Coverage   73.94%   73.72%   -0.23%     
==========================================
  Files          41       42       +1     
  Lines        2069     2158      +89     
==========================================
+ Hits         1530     1591      +61     
- Misses        325      339      +14     
- Partials      214      228      +14     
Impacted Files Coverage Δ
field/composite.go 82.81% <ø> (ø)
field/spec.go 66.66% <ø> (ø)
field/hex.go 68.53% <68.53%> (ø)
field/binary.go 60.67% <100.00%> (ø)
field/numeric.go 83.14% <100.00%> (ø)
field/string.go 81.92% <100.00%> (ø)
field/track1.go 69.02% <100.00%> (ø)
field/track2.go 68.22% <100.00%> (ø)
field/track3.go 61.79% <100.00%> (ø)


@alovak marked this pull request as ready for review on May 4, 2023 at 15:22
@alovak (Contributor, Author) commented May 9, 2023

I have to add tests for the Hex field.

@adamdecaf (Member) commented

I appreciate the callout for a breaking change with solutions for upgrading. Do we know how common this break would be? (e.g., looking at the private 8583 specs for card brands, are we aware of any?)

Also, do we know of an encoding change with any of the private 8583 specs? i.e., something to call out for upgrading callers.

@alovak (Contributor, Author) commented May 10, 2023

@adamdecaf I don't know how common it is. It depends on how the spec is described. In my experience it's rare, and option 1 worked well as a fix.

@adamdecaf (Member) commented
Sounds good. Let's merge!

@FacundoMora (Contributor) left a comment

I have some doubts about this: with this approach we are coupling the field type to the encoder, which forces extra considerations when defining fields, as you explain in the comments.

One idea I had was to give the encoder the responsibility of deciding the prefix value, maybe as an output value; this value would then be used by the prefixer to encode it. I think this makes sense, considering that the interpretation of the prefix value is tied to the encoder type. Let me know what you think.

Review comments (resolved): field/hex.go, field/composite_test.go, field/string.go
@alovak (Contributor, Author) commented May 11, 2023

> I have some doubts about this: with this approach we are coupling the field type to the encoder, which forces extra considerations when defining fields, as you explain in the comments.
>
> One idea I had was to give the encoder the responsibility of deciding the prefix value, maybe as an output value; this value would then be used by the prefixer to encode it. I think this makes sense, considering that the interpretation of the prefix value is tied to the encoder type. Let me know what you think.

I understand your concerns, and it's important we discuss this thoroughly to ensure we're making the right design choices.

In the current design, the data length is a property of the field, irrespective of how it's encoded: the physical data length equals the logical data length. That matches what we see in a spec (a.k.a. application data): 2an, 2 bytes, etc. To work with it, we have the String, Numeric, Binary, and Bitmap (also binary) fields. Their physical data length equals their logical data length, i.e. how the application uses them.

Moving the decision about the logical length from the field itself to the encoder might mean that the logical length is calculated differently depending on the encoder, potentially leading to inconsistencies. I don't see what benefit it would give us: as it stands, the field length works well everywhere except for the Hex field introduced here, whose physical length does not match its logical length (physically it's a hex string, but logically it's binary data).

The suggested idea would couple the Encoder to the input encoding, which means more encodings like ASCIIHexToBytes, SourceToDestination, etc.

By adding the Hex field, I wanted to give us a convenient instrument for working with binary data via hex strings.

Maybe, as an alternative to Hex, I should have added NewBinaryValueFromHex("AABBCC"); String() on the Binary field already returns a hex string. Such a change would cause less confusion.
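
A minimal sketch of what such a constructor could look like (hypothetical: binaryValueFromHex stands in for the proposed NewBinaryValueFromHex and is not part of the library; only encoding/hex is a real API):

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// binaryValueFromHex is a hypothetical helper: it decodes a hex
// string into the raw bytes a Binary field would store, so packing
// would see the byte length, not the hex-string length.
func binaryValueFromHex(s string) ([]byte, error) {
	return hex.DecodeString(s)
}

func main() {
	v, err := binaryValueFromHex("AABBCC")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%X: %d hex chars become %d bytes\n", v, len("AABBCC"), len(v))
}
```

The point of the sketch: the logical value is the 3 bytes, while the hex string is just a convenient input representation, which is what the Hex field formalizes.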

I believe either Hex or NewBinaryValueFromHex would provide the convenience we're aiming for, without introducing additional complexity into the Encoder. Let's discuss this further to ensure we're making the most effective choices for our design.

@FacundoMora, I want to sincerely thank you for bringing up this point of discussion. It's given us an opportunity to revisit and clarify our design.

@FacundoMora (Contributor) commented May 17, 2023

Yes, but the problem I see here is that because we couple the length to the field type, the interpretation of the length stays chained to the field type, and any encoder that interprets the length differently will be problematic (or simply not work as expected).

For example, say I would like to implement an encoder that takes all of the field's chars and packs them as EBCDIC, but removes the last two chars. This encoder would not work, because the field value would be, say, 10 chars long while the encoded result would be 8 bytes.

Maybe this example is a little contrived, but my aim is to show the potential problem of decoupling the length interpretation from the encoder. I think this could prevent implementing custom encoders, or even adding more encoders to this library.

However, while I take into account your concern about coupling the encoder to the input, I can't fully understand it; can you give me an example?

Let me know your thoughts

@alovak (Contributor, Author) commented May 18, 2023

> Yes, but the problem I see here is that because we couple the length to the field type, the interpretation of the length stays chained to the field type, and any encoder that interprets the length differently will be problematic (or simply not work as expected).

It's not for the encoder to understand the value length; it's the length of the value stored in the field.

Currently, it's the field's responsibility to know its length and value type, and that is exactly what I want us to stick to: the field, not the encoder. In this package, fields contain alphanumeric data (in most cases 7-bit ASCII chars), integers, or binary data (raw binary or different forms of TLVs). That's why we store them in String, Numeric, and Binary fields and use the corresponding Go types for the value: string, int, or []byte.

The rest is up to the engineers. They can create new types of fields, new encoders, and new prefixes and combine them in an arbitrary way to get desired output.

@alovak (Contributor, Author) commented May 26, 2023

@FacundoMora @adamdecaf thanks for the support with this!

@alovak merged commit fb3b4df into master on May 26, 2023
@alovak deleted the add-hex-field branch on May 26, 2023 at 11:32
This was referenced Nov 29, 2023
@gijsde1ste commented

@alovak

I believe I ran into the opposite of this issue. The concrete use case for me is Visa field 60, defined as follows:

variable length
1 byte, binary +
12 N, 4-bit BCD (unsigned packed), 7 bytes total

The spec I created for this looks like this:

60: field.NewString(&field.Spec{
  Length:      12,
  Description: "Additional POS Information",
  Enc:         encoding.BCD,
  Pref:        prefix.Binary.L,
})

Then, when I assign the string value "112233445566" (12 digits), I would expect it to be packed (in hex, 7 bytes total) as "06112233445566", but since this change it's encoded as "0C112233445566". According to some Visa test parsing tools, the length really should be encoded as "06". How would you define this spec, given that I want to assign it a 12-character string?
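
The two interpretations of the length prefix can be compared directly (illustrative arithmetic only, not library code; 4-bit BCD packing of the digits is assumed):

```go
package main

import "fmt"

func main() {
	value := "112233445566" // 12 BCD digits

	digitLen := len(value)          // length of the field value (what the library now encodes)
	byteLen := (len(value) + 1) / 2 // length of the BCD-packed data (what the Visa tools expect here)

	fmt.Printf("digit length: %d -> prefix 0x%02X\n", digitLen, digitLen)
	fmt.Printf("byte length:  %d -> prefix 0x%02X\n", byteLen, byteLen)
}
```

So the same 12-digit value yields a 0x0C prefix under the value-length interpretation and 0x06 under the byte-length interpretation, which is exactly the discrepancy described above.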

@gijsde1ste commented

Just realized that Visa makes it impossible to implement this consistently, so you can ignore the message above.

Field 2 (PAN) is defined as follows:

variable length
1 byte, binary +
19 N, 4-bit BCD (unsigned packed); maximum 11 bytes

But for this field they explicitly state that the length should be the number of digits in the PAN, not the number of bytes. So even though the definition is almost exactly the same, they want the length prefix to behave the way it currently works. Through the comments above, I found a way to encode field 60 using a Hex field, which handles the length prefix differently.

Development

Successfully merging this pull request may close these issues.

prefix.BCD.Fixed doesn't calculate the encoding length correctly
5 participants