Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updates in BratWriter #1169

Closed
wants to merge 4 commits into from
Closed

Updates in BratWriter #1169

wants to merge 4 commits into from

Conversation

joancf
Copy link

@joancf joancf commented Dec 15, 2017

Well.. I would like the pull request to be only 2 files, the ones of brat.. but...
I did some changes (improvements) to BratWriter.
The idea was to solve some problems I found:

  1. add a parameter to indicate where the html template is. With the current version was inside the jar. So it could not be customizable
  2. Change the way that is used to extract the information. In the previous version there was a kind of defualt option extracting everything.... So the changes include
  • List of elements to extract mandatory
  • The order of the elements to extract influences how they are exported (this can be combined with a change to the brat javascript to force that brat represents them in the same sorting order. It was difficult to understand the output when the annotations over a token are sorted according to "size". Because the same annotation layer was in different positions over different tokens.
  • Indication of which field to display for each type. It was very frustrating to have "token" for each word in order to display dependencies, now each annotations can decide which field to display (for the moment only one field is allowed and annotation can appear only once (but I think is easy to extend this))
  • Parent Types. Extend the usage of parent types. For example to allow the representation of all kinds of dependencies and avoid to have just "dependency" and solved an error with "root"
    for example here there is a the ouptut with this code:
    image

instead of I could get with the original component
image

1.10.0-SNAPSHOT
1 allow to have a custom html template (previously was part of the code)
2 Changes in parameter to define which element has to be shown (before
was the type name)
3 Changes in parent type to allow selection of all subtypes
4 The order used to define the types to be shown is used as the order to
stack them over the text (but it needs some changes in bratt js)
@ukp-svc-jenkins
Copy link

Can one of the admins verify this patch?

So the same annotation can be shown in different layers displaying
different fields.
@joancf
Copy link
Author

joancf commented Dec 15, 2017

Yes, I just updated it to allow multiple times the same annotation.
I also include the changes to brat visualizer.js, that respects the sorting.

var fragmentComparator = function(a, b) {
var tmp;
var aSpan = a.span;
var bSpan = b.span;

    if (aSpan.id < bSpan.id) {
      return -1;
    } else if (aSpan.id > bSpan.id) {
      return 1;

}

I got some problems to run the default html files, because it takes the js from brat and the browser does not allow it. So I had to do my own installation of brat, and change the html (this was the reason to set the html template as a parameter)

@reckart reckart changed the base branch from master to 1.9.x December 15, 2017 13:17
@reckart reckart changed the base branch from 1.9.x to master December 15, 2017 13:18
@reckart
Copy link
Member

reckart commented Dec 15, 2017

Thanks for the PR. Could you somehow fix it to include only the files that you actually want to contribute? A (probably) sensible process for creating a PR is like this:

  • clone the DKPro Core to your local machine
  • create a feature branch based off the official "master" and include your changes there
  • fork DKPro Core on GitHub
  • add the forked version as a second remote to your local clone
  • push the feature branch from the local clone to your fork
  • go to the github website of your fork to create a PR

After the PR is accepted, delete the feature branch. For new PRs, again create a fresh feature branch off the latest version of the official "master". Best not commit any changes to the master branch in your forked repo - it is bound to be confusing.

Copy link
Member

@reckart reckart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments, but there is likely more. However, it doesn't make too much sense to comment exhaustively if the PR is re-done anyway.

{
this("T" + aId, aType, aBegin, aEnd, aText);
{
this("T" + String.format("%04d", aId), aType, aBegin, aEnd, aText);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you introducing 4-digit IDs? What would happen if IDs grow over 9999?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is just to be able to sort by Id, so 10 goes not before 2. If the number is higher than 4 digits the program will not fail but sorting will not be optimal.

else if (fs instanceof AnnotationFS) {
warnings.add("Assuming annotation type ["+fs.getType().getName()+"] is span");
writeTextAnnotation(doc, (AnnotationFS) fs);
// NO writeTextAnnotation(doc, (AnnotationFS) fs);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well maybe the full loop should be redone, as first it selects spans (new sorted loop) and then this second loop should be only for relations.I did not understand why after selecting/discarting the types defined by the user it was selecting everything remaining.

value=aFS.getFeatureValueAsString(aFS.getType().getFeatureByBaseName(splits[0]));
}
}
} catch (Exception E){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kind of exception are you expecting here? Normally, if there is a problem that cannot be handled, the exception should be allows to fail the execution.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that there were some cases (the ROOT of the dpendencies ) that the getFeatureValue(currentAnnot.getType().getFeatureByBaseName(splits[f])) was getFeatureValue(null) and giving an error

@joancf joancf mentioned this pull request Dec 18, 2017
@joancf
Copy link
Author

joancf commented Dec 18, 2017

new pull request #1170

@reckart
Copy link
Member

reckart commented Dec 18, 2017

Closing as it is superseded by #1170.

@reckart reckart closed this Dec 18, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants