Un-escape %2C to comma etc for GFF3 to EMBL #52

Merged
2 commits merged into sanger-pathogens:master on May 3, 2016

Conversation

peterjc
Contributor

@peterjc peterjc commented May 2, 2016

This closes #50 with what is essentially a workaround, using the urllib.unquote function.

The existing code assumes that genometools does not do the unescaping, and therefore can handle multi-value attributes using this pattern:

attribute_values = attribute_value.split(',')

Therefore, rather than applying the unescaping as early as possible in the GFF3 parsing, I have applied it as late as possible in the EMBL output.
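
To illustrate why the order matters, here is a minimal sketch rather than the code in this PR. The function names and the sample attribute value are hypothetical, and it uses the Python 3 spelling `urllib.parse.unquote` (the Python 2 equivalent is `urllib.unquote`, as mentioned above).

```python
from urllib.parse import unquote  # Python 2 equivalent: urllib.unquote


def split_then_unescape(attribute_value):
    """Late unescaping: split on literal commas first, then decode each value."""
    return [unquote(value) for value in attribute_value.split(',')]


def unescape_then_split(attribute_value):
    """Early unescaping: the decoded %2C becomes indistinguishable from a separator."""
    return unquote(attribute_value).split(',')


# Hypothetical GFF3 column 9 value holding two values, the first with an escaped comma.
raw = "kinase%2C putative,transferase"

late = split_then_unescape(raw)   # two values, comma restored inside the first
early = unescape_then_split(raw)  # three values, the first one wrongly split
```

With late unescaping the escaped comma survives the `split(',')` and is only restored afterwards, so the first value stays intact. With early unescaping the same input yields three fragments.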

See also genometools/genometools#198

peterjc added 2 commits May 2, 2016 14:59
This closes sanger-pathogens#50 with what is essentially a workaround.

The existing code assumes that genometools does not do the
unescaping, and therefore can handle multi-value attributes
using this pattern:

attribute_values = attribute_value.split(',')

Therefore, rather than applying the unescaping as early as
possible in the GFF3 parsing, I have applied it as late as
possible in the EMBL output.

See also genometools/genometools#198
@andrewjpage
Member

Thanks a million for the fix.

@andrewjpage andrewjpage merged commit 58e59dd into sanger-pathogens:master May 3, 2016
Development

Successfully merging this pull request may close these issues.

Should unescape %2C in GFF column 9 as comma in EMBL
2 participants