Null pointer exception for some rare gene-keyword combinations #401

Closed
AjitPS opened this issue Aug 6, 2019 · 9 comments
@AjitPS
Collaborator

AjitPS commented Aug 6, 2019

Using test human OXL on babvs73 test VM, searches work fine except when having 2 specific genes in gene list (FOXL2 and FMR1):

API calls work, e.g.,
http://babvs73.rothamsted.ac.uk:8080/ws/humanknet/genome?keyword=ovarian&list=ESH4 .

@AjitPS AjitPS added the bug label Aug 6, 2019
@AjitPS AjitPS changed the title humanknet - certain gene throw null pointer exception test humanknet - some genes in geneList throw exceptions Aug 6, 2019
@AjitPS
Collaborator Author

AjitPS commented Aug 6, 2019

API call for those 2 genes works but rendering MapView xml throws error; details at:
human_FOXL2_FMR1_search.log

@AjitPS
Collaborator Author

AjitPS commented Aug 6, 2019

Oddly, searching the UI with keyword: ovarian and list: ENSG00000183770, ENSG00000102081, i.e., the exact accessions of those 2 genes, works without any errors!

@josephhearnshaw
Contributor

josephhearnshaw commented Aug 6, 2019

Searching for GCK under the terms "type 2 diabetes mellitus" OR "type 2 diabetes" OR "T2DM" will return GCK and GCKR. GCKR is the superset gene in this case (regulatory protein - matching name in string) but it returns results. GCKR contains evidence related to T2DM terms whereas FMR1NB (superset of FMR1) doesn't when searching for "premature ovarian failure".

Furthermore, FMR1 contains the Disease concept "Fragile X syndrome", which its counterpart FMR1NB does not. Searching for this term with FMR1 produces the same error. The same pattern appears when you test it for other diseases too. Search for something that appears in both, though, e.g. "Antineutrophile cytoplasmic antibody-associated vasculitis", and everything works fine.

@AjitPS
Collaborator Author

AjitPS commented Aug 6, 2019

Exactly, this looks unrelated to the OXL and to the new thread-safe code.
It was likely always there in the code, but we never saw it because our supersets always had evidence on them. It could be a future fix.

@AjitPS AjitPS changed the title test humanknet - some genes in geneList throw exceptions geneList - geneNames (with supersets having 0 evidences for keyword) throws null Aug 6, 2019
@KeywanHP
Member

KeywanHP commented Aug 6, 2019

This issue happens if the search term is found in the Lucene index but is not part of any of the evidence concepts for gene X.
image

If the term is not found in the Lucene index at all, then KnetMiner behaves correctly and shows gene X with 0 Evidence:
image
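
The failure mode described above (the keyword exists in the index, but the gene has no evidence entry for it) can be illustrated with a minimal sketch. Every name here (`EvidenceCount`, `countFor`, `evidenceByGene`) is hypothetical and not taken from the KnetMiner code; the sketch only shows how a missing map entry can be reported as 0 evidence instead of being dereferenced as `null`.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names come from the KnetMiner code base.
class EvidenceCount {

    // Returns how many evidence terms are recorded for the gene.
    // A gene without an entry yields 0 instead of a null dereference.
    static int countFor(Map<String, Set<String>> evidenceByGene, String gene) {
        return evidenceByGene.getOrDefault(gene, Collections.emptySet()).size();
    }

    public static void main(String[] args) {
        // Illustrative data only, not real KnetMiner output.
        Map<String, Set<String>> evidenceByGene = new HashMap<>();
        evidenceByGene.put("FMR1", Collections.singleton("premature ovarian failure"));

        System.out.println(countFor(evidenceByGene, "FMR1"));
        // FMR1NB has no entry; evidenceByGene.get("FMR1NB").size() would throw here.
        System.out.println(countFor(evidenceByGene, "FMR1NB"));
    }
}
```

Whether the actual exception comes from a lookup of this shape is an assumption; the thread only establishes when it occurs, not where.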

@AjitPS AjitPS removed their assignment Aug 7, 2019
@josephhearnshaw
Contributor

josephhearnshaw commented Aug 27, 2019

I've created a temporary work-around for this, but it involved removing the genes lacking evidence from the Gene Table & Map View. I am unable to resolve this while keeping the synchronisation blocks.

My modified block is as follows (edited as per Marco's reply & further testing):

    ArrayList<ONDEXConcept> matchingGenes = new ArrayList<>();
    scoredCandidates.entrySet().parallelStream()
        .filter((m) -> (genes.contains(m.getKey())))
        .forEach((m) -> {
            matchingGenes.add((ONDEXConcept) m.getKey());
        });

Once we have the matching genes, we iterate over matchingGenes wherever the code iterates over the gene ONDEXConcepts (i.e. within the block for (ONDEXConcept c : matchingGenes) {...}, line 1815 onwards in writeAnnotationXML). This mitigates the 'bug', where the userGenes do not match what is present in scoredCandidates when using the synchronization blocks & additional changes. The change was made in writeGeneTable & writeAnnotationXML within OndexServiceProvider.java.

It'd be good if we didn't need to modify this particular behaviour; we need the synchronisation block RE: #352

@marco-brandizi
Member

marco-brandizi commented Aug 27, 2019

I know little about this, but looking at the code fragment above, I wonder if forEachOrdered() is actually needed, rather than forEach(), and also if entrySet().parallelStream() is possible in place of .stream(). Both should speed up the population of matchingGenes.
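
One caveat on the parallel variant: `ArrayList.add` is not thread-safe, so calling it from `forEach` on a parallel stream can lose elements or throw. A minimal sketch of the same filter that lets the stream build the list is below. The class and method names (`MatchingGenes`, `select`) are hypothetical, and plain strings stand in for `ONDEXConcept` so that the example is self-contained.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch: strings stand in for ONDEXConcept.
class MatchingGenes {

    // Keeps the keys of scoredCandidates that also occur in genes,
    // in the iteration order of the map's key set.
    static <K> List<K> select(Map<K, ?> scoredCandidates, Set<K> genes) {
        return scoredCandidates.keySet().stream()
                .filter(genes::contains)
                .collect(Collectors.toList()); // no shared mutable list
    }

    public static void main(String[] args) {
        // Illustrative scores only.
        Map<String, Integer> scored = new LinkedHashMap<>();
        scored.put("GCK", 3);
        scored.put("GCKR", 2);
        scored.put("FOXL2", 1);
        Set<String> genes = new HashSet<>(Arrays.asList("FOXL2", "GCK"));
        System.out.println(select(scored, genes));
    }
}
```

Because `collect` merges per-thread partial results itself, replacing `stream()` with `parallelStream()` in this sketch stays safe and keeps the encounter order.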

@KeywanHP KeywanHP changed the title geneList - geneNames (with supersets having 0 evidences for keyword) throws null NullPointer Exception for some rare gene-keyword combinations Aug 28, 2019
@KeywanHP KeywanHP changed the title NullPointer Exception for some rare gene-keyword combinations Null pointer exception for some rare gene-keyword combinations Aug 28, 2019
@josephhearnshaw
Contributor

See my above commit in my branch.

Instead, I am now sorting the scoredCandidates genes as follows:

    sortedCandidates = scoredCandidates
        .entrySet()
        .stream()
        .sorted(Collections.reverseOrder(Map.Entry.comparingByValue()))
        .collect(toMap(Map.Entry::getKey, Map.Entry::getValue,
            (e1, e2) -> e2, LinkedHashMap::new)); // Sort the candidate values

Otherwise, the issue is caused by ValueComparator. I am also ensuring that the appropriate values are placed into the new GeneMap in OndexLocalDataSource.

Performs the expected behaviour and sorts correctly.
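
For readers who want to try the sorting approach in isolation, here is a minimal self-contained sketch. The class and method names (`SortByScore`, `sortDescending`) are hypothetical, and strings with made-up scores stand in for the real concepts.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch: not the KnetMiner implementation.
class SortByScore {

    // Copies the entries into a LinkedHashMap ordered by descending value.
    static <K, V extends Comparable<? super V>> Map<K, V> sortDescending(Map<K, V> scores) {
        return scores.entrySet().stream()
                .sorted(Map.Entry.<K, V>comparingByValue().reversed())
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        Map.Entry::getValue,
                        (first, second) -> second, // unreachable: source keys are unique
                        LinkedHashMap::new));
    }

    public static void main(String[] args) {
        // Illustrative scores only.
        Map<String, Double> scores = new HashMap<>();
        scores.put("GCK", 1.5);
        scores.put("GCKR", 4.0);
        scores.put("FMR1", 2.5);
        System.out.println(sortDescending(scores).keySet());
    }
}
```

A `LinkedHashMap` only remembers insertion order, so `get` and `containsKey` rely on `hashCode`/`equals` rather than on a comparator. A sorted map driven by a value-based comparator, by contrast, also uses that comparator for lookups. Whether that is the exact way `ValueComparator` failed here is an assumption; the comment above only states that it caused the issue.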

marco-brandizi added a commit that referenced this issue Aug 30, 2019
@KeywanHP
Member

Thank you @josephhearnshaw for the fix!

AjitPS added a commit that referenced this issue Oct 5, 2020