
use length of the field value for prefixer, not length of encoded data #227

Merged: 6 commits into master on May 26, 2023

Conversation

@alovak (Contributor) commented Apr 22, 2023

This PR fixes the issue we have with packing/unpacking field values for which one input byte does not map to one output byte.

Context

In the current approach, when we pack the field value, we pass the length of the encoded data to the prefixer. Here is the code for the String field:

func (f *String) Pack() ([]byte, error) {
	data := []byte(f.value)
	// ...
	packed, err := f.spec.Enc.Encode(data)
	// ...
	packedLength, err := f.spec.Pref.EncodeLength(f.spec.Length, len(packed))

The issue here is that the prefixer encodes the length of the packed value, not the original value.

Why haven't we noticed this bug before?

Because the code worked correctly in the common case where one input byte maps to one output byte. Say you have a string and you encode it into ASCII, or you have a slice of bytes (Binary field) and you encode it using Binary encoding. In such cases, len(packed) is equal to len(f.value).

But for BCD, where 1 byte represents 2 digits, it does not work. This was reported by @ddvk in issue #220.

So when you encode "12" into BCD, you get a single byte, 0x12. Currently, when we pack the field, the prefixer encodes a length of 1 byte, not 2 digits. On unpacking, the prefixer returns a length of 1 (a byte count), but the BCD decoder expects a digit count: with 1 as the data length argument to Decode, it drops the first digit and returns "2" instead of "12".

Here is how we unpack the value:

	// prefixer returns length of the data: dataLen
	dataLen, prefBytes, err := f.spec.Pref.DecodeLength(f.spec.Length, data)
	if err != nil {
		return 0, fmt.Errorf("failed to decode length: %w", err)
	}

	// BCD decoder knows that for two digits it should read 1 byte 
	raw, read, err := f.spec.Enc.Decode(data[prefBytes:], dataLen)
	if err != nil {
		return 0, fmt.Errorf("failed to decode content: %w", err)
	}
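
This digit-versus-byte mismatch can be reproduced with a self-contained sketch. The BCD codec below is a simplified stand-in written for illustration, not the library's encoder:

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// bcdEncode packs decimal digits into BCD, two digits per byte
// (left-padded with a zero digit when the length is odd).
func bcdEncode(digits string) []byte {
	if len(digits)%2 != 0 {
		digits = "0" + digits
	}
	out := make([]byte, len(digits)/2)
	for i := range out {
		out[i] = (digits[2*i]-'0')<<4 | (digits[2*i+1] - '0')
	}
	return out
}

// bcdDecode recovers dataLen digits, mirroring how a BCD decoder
// interprets the length returned by the prefixer.
func bcdDecode(b []byte, dataLen int) string {
	nBytes := (dataLen + 1) / 2
	s := hex.EncodeToString(b[:nBytes])
	return s[len(s)-dataLen:] // drop the pad digit for odd lengths
}

func main() {
	packed := bcdEncode("12")
	fmt.Printf("len(value)=2, len(packed)=%d\n", len(packed)) // 2 digits fit in 1 byte

	// The old code encoded len(packed)=1 in the prefix, so the decoder
	// was asked for 1 digit and dropped the first one:
	fmt.Println(bcdDecode(packed, 1)) // "2" -- the bug
	// With the value length (2 digits) the round trip is correct:
	fmt.Println(bcdDecode(packed, 2)) // "12"
}
```

With the fix, the prefixer receives the value length (2) rather than len(packed) (1), so the decoder reads the right number of digits.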

Changes

To fix this, we changed how fields are packed: they now pass the length of the field value, not the length of the encoded value, to the prefixer.

Breaking Change

This PR breaks some tests and may well break your code if you use the following together:

  • use a String field to store a hex string, e.g. NewStringValue("210720")
  • use the ASCIIHexToBytes encoder to encode the hex string into bytes

Such a combination in tests is frequently used for BerTLV composite fields (EMV, ICC, etc.).

Why does it break?

First, when we pack the value, ASCIIHexToBytes converts the hex string to bytes, so the length of the encoded value does not match the length of the original value (the hex string is twice as long). With the changes made to packing, we now use the length of the original value instead of the length of the encoded value, as was the case before.

Second, when we unpack the value, the BerTLV prefixer returns the length of the value in bytes. Using a hex string to store the tag value will not work, since the hex string's length does not match the value's length in bytes.

How to fix it?

The main fix is to stop using the ASCIIHexToBytes encoding with the String field.

There are two options:

  1. Use the new Hex field added in this PR together with the Binary encoding. The Hex field lets you work with a hex string when you set or get the field's value, but when the field is packed or unpacked, it uses the encoded bytes instead of the string.
	// ...
	Subfields: map[string]Field{
			"9A": NewHex(&Spec{
				Description: "Transaction Date",
				Enc:         encoding.Binary,
				Pref:        prefix.BerTLV,
			}),

	// ...

	// work with hex string, which will be treated as bytes during packing/unpacking
	data := &TLVTestData{
		F9A:   NewHexValue("210720"),
		F9F02: NewHexValue("000000000501"),
	}
  2. Use the Binary field together with the Binary encoding and manually set the values of the Binary field.
	// define BerTLV tags as Binary
	"9F02": NewBinary(&Spec{
		Description: "Amount, Authorized (Numeric)",
		Enc:         encoding.Binary,
		Pref:        prefix.BerTLV,
	}),

	// manually convert hex string into bytes
	f9f02, err := hex.DecodeString("000000000501")
	// ...

	err = composite.SetData(&TLVTestData{
		// work with binary value here
		F9F02: NewBinaryValue(f9f02),
	})

fixes #220

@codecov-commenter commented Apr 22, 2023

Codecov Report

Patch coverage: 70.52%; project coverage change: -0.23% ⚠️

Comparison is base (0a8cb89) 73.94% compared to head (f8a46ed) 73.72%.

❗ Current head f8a46ed differs from pull request most recent head 30d24b1. Consider uploading reports for the commit 30d24b1 to get more accurate results


Additional details and impacted files
@@            Coverage Diff             @@
##           master     #227      +/-   ##
==========================================
- Coverage   73.94%   73.72%   -0.23%     
==========================================
  Files          41       42       +1     
  Lines        2069     2158      +89     
==========================================
+ Hits         1530     1591      +61     
- Misses        325      339      +14     
- Partials      214      228      +14     
Impacted Files Coverage Δ
field/composite.go 82.81% <ø> (ø)
field/spec.go 66.66% <ø> (ø)
field/hex.go 68.53% <68.53%> (ø)
field/binary.go 60.67% <100.00%> (ø)
field/numeric.go 83.14% <100.00%> (ø)
field/string.go 81.92% <100.00%> (ø)
field/track1.go 69.02% <100.00%> (ø)
field/track2.go 68.22% <100.00%> (ø)
field/track3.go 61.79% <100.00%> (ø)


@alovak marked this pull request as ready for review on May 4, 2023 at 15:22
@alovak (Contributor, Author) commented May 9, 2023

I have to add tests for the Hex field.

@adamdecaf (Member) commented

I appreciate the callout for a breaking change with solutions for upgrading. Do we know how common this break would be? (e.g., looking at the private 8583 specs for card brands, are we aware of any?)

Also, do we know of an encoding change with any of the private 8583 specs? i.e., something to call out for upgrading callers.

@alovak (Contributor, Author) commented May 10, 2023

@adamdecaf I don't know how common it is. It depends on how the spec is described. In my experience it's rare, and option 1 worked well as a fix.

@adamdecaf (Member) commented
Sounds good. Let's merge!

@FacundoMora (Contributor) left a comment

I have some doubts about this: with this approach we are coupling the field type to the encoder, which forces extra considerations when defining fields, as you explain in the comments.

One idea I had was to give the encoder the responsibility of deciding the prefix value, maybe as an output value; this value would then be used by the prefixer to encode it. I think this makes sense, considering that the interpretation of the prefix value is tied to the encoder type. Let me know what you think.

Review comments (resolved): field/hex.go, field/composite_test.go, field/string.go
@alovak (Contributor, Author) commented May 11, 2023

> I have some doubts about this: with this approach we are coupling the field type to the encoder, which forces extra considerations when defining fields, as you explain in the comments.
>
> One idea I had was to give the encoder the responsibility of deciding the prefix value, maybe as an output value; this value would then be used by the prefixer to encode it. I think this makes sense, considering that the interpretation of the prefix value is tied to the encoder type. Let me know what you think.

I understand your concerns, and it's important we discuss this thoroughly to ensure we're making the right design choices.

In the current design, the data length is a property of the field, irrespective of how it's encoded: the physical data length equals the logical data length. That matches what we see in a spec (a.k.a. application data): 2an, 2 bytes, etc. To work with it, we have the String, Numeric, Binary, and Bitmap (also binary) fields. Their physical data length equals their logical data length, i.e. how the application uses them.

Moving the decision about the logical length from the field itself to the encoder might mean that the logical length is calculated differently depending on the encoder, potentially leading to inconsistencies. I don't see what benefit it would give us: as it stands, the field length works well everywhere except for the Hex field introduced here, whose physical length does not match its logical length (physically it's a hex string, but logically it's binary data).

The suggested idea would couple the Encoder to the input encoding, which means more encodings like ASCIIHexToBytes, SourceToDestination, etc.

By adding the Hex field, I wanted to give us a convenient instrument for working with binary data via hex strings.

Maybe, as an alternative to Hex, I should have added NewBinaryValueFromHex("AABBCC"); String() on the Binary field already returns a hex string. Such a change would cause less confusion.
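
A minimal sketch of what such a constructor could look like (hypothetical: binaryValueFromHex stands in for the proposed NewBinaryValueFromHex and is not part of the library; only encoding/hex is a real API):

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// binaryValueFromHex is a hypothetical helper: it decodes a hex
// string into the raw bytes a Binary field would store, so packing
// would see the byte length, not the hex-string length.
func binaryValueFromHex(s string) ([]byte, error) {
	return hex.DecodeString(s)
}

func main() {
	v, err := binaryValueFromHex("AABBCC")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%X: %d hex chars become %d bytes\n", v, len("AABBCC"), len(v))
}
```

The point of the sketch: the logical value is the 3 bytes, while the hex string is just a convenient input representation, which is what the Hex field formalizes.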

I believe either Hex or NewBinaryValueFromHex would provide the convenience we're aiming for, without introducing additional complexity into the Encoder. Let's discuss this further to ensure we're making the most effective choices for our design.

@FacundoMora, I want to sincerely thank you for bringing up this point of discussion. It's given us an opportunity to revisit and clarify our design.

@FacundoMora (Contributor) commented May 17, 2023

Yes, but the problem I see here is that because we couple the length to the field type, the interpretation of the length stays chained to the field type, and any encoder that interprets the length differently will be problematic (or simply not work as expected).

For example, say I would like to implement an encoder that takes all of the field's chars and packs them as EBCDIC, but removes the last two chars. This encoder would not work, because the field value would be, say, 10 chars long while the encoded result would be 8 bytes.

Maybe this example is a little contrived, but my aim is to show the potential problem of decoupling the length interpretation from the encoder. I think this could prevent implementing custom encoders, or even adding more encoders to this library.

However, while I take into account your concern about coupling the encoder to the input, I can't fully understand it; can you give me an example?

Let me know your thoughts

@alovak (Contributor, Author) commented May 18, 2023

> Yes, but the problem I see here is that because we couple the length to the field type, the interpretation of the length stays chained to the field type, and any encoder that interprets the length differently will be problematic (or simply not work as expected).

It's not for the encoder to understand the value length; it's the length of the value stored in the field.

Currently, it's the field's responsibility to know its length and value type, and that is exactly what I want us to stick to: the field, not the encoder. In this package, fields contain alphanumeric data (in most cases 7-bit ASCII chars), integers, or binary data (raw binary or different forms of TLVs). That's why we store them in String, Numeric, and Binary fields and use the corresponding Go types for the value: string, int, or []byte.

The rest is up to the engineers. They can create new types of fields, new encoders, and new prefixes and combine them in an arbitrary way to get desired output.

@alovak (Contributor, Author) commented May 26, 2023

@FacundoMora @adamdecaf thanks for the support with this!

@alovak merged commit fb3b4df into master on May 26, 2023
@alovak deleted the add-hex-field branch on May 26, 2023 at 11:32
This was referenced Nov 29, 2023
@gijsde1ste commented

@alovak

I believe I ran into the opposite of this issue. The concrete use case for me is Visa field 60, defined as follows:

variable length
1 byte, binary +
12 N, 4-bit BCD (unsigned packed), 7 bytes total

The spec I created for this looks like this:

60: field.NewString(&field.Spec{
  Length:      12,
  Description: "Additional POS Information",
  Enc:         encoding.BCD,
  Pref:        prefix.Binary.L,
})

Then, when I assign the string value "112233445566" (12 digits), I would expect it to be packed (in hex, 7 bytes total) as "06112233445566", but since this change it's encoded as "0C112233445566". According to some Visa test parsing tools, the length really should be encoded as "06". How would you define this spec, given that I want to assign it a 12-character string?
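
The two interpretations of the length prefix can be compared directly (illustrative arithmetic only, not library code; 4-bit BCD packing of the digits is assumed):

```go
package main

import "fmt"

func main() {
	value := "112233445566" // 12 BCD digits

	digitLen := len(value)          // length of the field value (what the library now encodes)
	byteLen := (len(value) + 1) / 2 // length of the BCD-packed data (what the Visa tools expect here)

	fmt.Printf("digit length: %d -> prefix 0x%02X\n", digitLen, digitLen)
	fmt.Printf("byte length:  %d -> prefix 0x%02X\n", byteLen, byteLen)
}
```

So the same 12-digit value yields a 0x0C prefix under the value-length interpretation and 0x06 under the byte-length interpretation, which is exactly the discrepancy described above.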

@gijsde1ste commented

Just realized that Visa makes it impossible to implement this consistently, so you can ignore the message above.

Field 2 (PAN) is defined as follows:

variable length
1 byte, binary +
19 N, 4-bit BCD (unsigned packed); maximum 11 bytes

But for this field they explicitly state that the length should be the number of digits in the PAN, not the number of bytes. So even though the definition is almost exactly the same, they want the length prefix to behave the way it currently works. Through the comments above, I found a way to encode field 60 using a Hex field, which handles the length prefix differently.

Development

Successfully merging this pull request may close these issues.

prefix.BCD.Fixed doesn't calculate the encoding length correctly
5 participants