Save element name in extension element if it has any new lines #3946

BertMatthys · 2024-08-12T16:10:29Z

If an element name has any newlines, save it in an extension element instead of in an attribute because new lines are changed into spaces when xml attribute is parsed

Unit tests: YES
Documentation: NA

tstephen · 2024-08-19T10:06:35Z

Hi, I'm not sure this is a good idea. New lines in XML should be encoded as (line feed) or possibly &#xD (line feed, carriage return) if you prefer.

filiphr · 2024-08-19T10:15:35Z

Thanks for the feedback @tstephen. We were looking into that option actually as well. We tried using 
 when generating the attribute. However, the problem is that the default XMLStreamWriter the com.sun.xml.internal.stream.writers.XMLStreamWriterImpl when writing the value of the attribute it is going to escape the characters we pass it to. For some reason they are not escaping the \n. However, if there is & then it gets escaped. Which means that we can't encode it on our side before passing it.

When writing it the generated XML looks like:

<sequenceFlow id="bpmnSequenceFlow_6" name="abc
def
ghi" sourceRef="bpmnTask_3" targetRef="bpmnEndEvent_5">

However, when this is read the new lines are getting removed (by the XMLStreamReader.

If you have some better suggestion for this, we are more than happy to look into it

tstephen · 2024-08-19T13:14:43Z

How extraordinary! And annoying!

Nonetheless I think this is a very visible obstacle to interchange, so I hope some solution can be found.

I think the problem stems from an underspecification of XMLStreamWriter. The Java tutorial states:

Note that XMLStreamWriter implementations are not required to perform well-formedness or validity checks on input.

Ideas I can think of:

Woodstox appears it may support more complete escaping
I don't know why the cursor Stax implementation was chosen but I note that this page (at the end) seems to be pushing towards the Iterator implementation because it is 'more extensible' and 'future-proof'.
Even if we were persuaded it was a good idea to use reflection to defeat the module system and set setEscapeCharacters(false) it does not help as it applies only to writeCharacters not writeAttribute

None of these seem particularly attractive to be honest.

I am not sure what the new behaviour is but it apears that the standard name may not be written if it contains new line? (line 173 of BaseBpmnXMLConverter.java)?
Perhaps the least disruptive would be to remove this if (!BpmnXMLUtil.containsNewLine(name)) so that the standard attribute is written in addition to the new one?

filiphr · 2024-08-19T13:24:49Z

Woodstox appears it may support more complete escaping

I'm not sure that it is worth adding another dependency for the purpose of this.

I don't know why the cursor Stax implementation was chosen but I note that this page (at the end) seems to be pushing towards the Iterator implementation because it is 'more extensible' and 'future-proof'.

I'm not sure when the initial XML parsing was added, but it's been in there for a really long time. The page you linked also says

If performance is your highest priority—for example, when creating low-level libraries or infrastructure—the cursor API is more efficient.

which makes the Cursor API more attractive.

I am not sure what the new behaviour is but it apears that the standard name may not be written if it contains new line? (line 173 of BaseBpmnXMLConverter.java)?

Indeed the standard name will not output the default attribute name if it contains a new line. However, it will output a custom extension element, which we then use to load it.

Why do you think that it is a problem if the default attribute name is not written down?

tstephen · 2024-08-19T13:34:31Z

Why do you think that it is a problem if the default attribute name is not written down?

Because it means that interchange between tools will lose the name. This is a slippery slope.

filiphr · 2024-08-19T15:36:09Z

Because it means that interchange between tools will lose the name. This is a slippery slope.

We are talking about an extreme edge case, right? This is when you do an explicit new line in the name. Let's take following scenarios into consideration:

Generate XML from another tool - Here everything stays the same. Even if you deploy into Flowable and export again through our java API the name attribute will be used. We saw that if there are new lines (non encoded) in the XML the Java parsing will ignore them and transfer them into empty spaces
Use Flowable Java API to generate XML - In this scenario the name attribute will be used as long as you do not use new lines in the names. This I believe is acceptable
Use Flowable Design - This is the same as the previous scenario.

Basically, the only thing that will not work is if you are using explicitly new lines in names using Flowable Java API and / or Flowable Design.

Do you have some other scenario in mind?

tstephen · 2024-08-19T16:03:45Z

I was thinking about your first scenario....

So if I give a name containing 
 and query the Java API I'll get exactly the same name back?

filiphr · 2024-08-19T16:21:28Z

So for the first scenario nothing changes. Depending on how an XML with an attribute name with 
 is getting parsed by the XML Parser we are using. If it is properly parsed as a new line then the Java API will return a new line.

The only thing that this affects is if you are using the Flowable BPMN XML Converter to convert from the Flowable BPMN Model Java API to XML.

In any case, we are going to add a flag in the converter to use or not use the new extension element when exporting. This way nothing should change. Keep in mind that 3rd party XML -> Flowable is not affected by this change anyways.

tstephen · 2024-08-19T16:41:06Z

Thanks for bearing with me Filip, very glad to hear this.

filiphr · 2024-08-28T09:22:00Z

@tstephen not sure if you saw #3951. We applied the feedback you provided.

tstephen · 2024-08-28T09:37:16Z

Yes I did @filiphr, though I've not had a chance to try it out, it looked good. Many thanks!

BertMatthys added 2 commits August 12, 2024 18:07

Save element name in extension element if it has any new lines

b6b8f94

Improve solution

6913f45

filiphr merged commit 234f9d2 into flowable:main Aug 19, 2024
2 checks passed

BertMatthys mentioned this pull request Aug 21, 2024

Make element name extension element in case of new line optional #3951

Merged

filiphr pushed a commit that referenced this pull request Aug 28, 2024

Save element name in extension element if it has any new lines (#3946)

68468df

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Save element name in extension element if it has any new lines #3946

Save element name in extension element if it has any new lines #3946

BertMatthys commented Aug 12, 2024

tstephen commented Aug 19, 2024

filiphr commented Aug 19, 2024

tstephen commented Aug 19, 2024

filiphr commented Aug 19, 2024

tstephen commented Aug 19, 2024

filiphr commented Aug 19, 2024

tstephen commented Aug 19, 2024

filiphr commented Aug 19, 2024

tstephen commented Aug 19, 2024

filiphr commented Aug 28, 2024

tstephen commented Aug 28, 2024

Save element name in extension element if it has any new lines #3946

Save element name in extension element if it has any new lines #3946

Conversation

BertMatthys commented Aug 12, 2024

tstephen commented Aug 19, 2024

filiphr commented Aug 19, 2024

tstephen commented Aug 19, 2024

filiphr commented Aug 19, 2024

tstephen commented Aug 19, 2024

filiphr commented Aug 19, 2024

tstephen commented Aug 19, 2024

filiphr commented Aug 19, 2024

tstephen commented Aug 19, 2024

filiphr commented Aug 28, 2024

tstephen commented Aug 28, 2024