Update Stanford NLP #643

Draft. Wants to merge 11 commits into base: master


kwalcock
Member

No description provided.

kwalcock and others added 9 commits June 8, 2022 14:15
Otherwise get "[error] [launcher] error during sbt launcher: java.lang.UnsupportedOperationException: The Security Manager is deprecated and will be removed in a future release" on Java 18.
It seems to be incredibly slow
@MihaiSurdeanu
Contributor

This might be superseded by PR #644 now, no?

@kwalcock
Member Author

#644 was merged into this one. Weren't you saying that the version number needs a significant bump here, like to 9, and that some projects, maybe reach and possibly eidos, would need to stay at 8? This one might be 9.5.2 and the old might be 8.5.2 and for some time there might be a need to update them in parallel. If there is some change in 9.6 that should be available to old programs, it would be added to 8.6. That would probably necessitate having a processors8 branch that parallels master. If someone is fiddling with branches, maybe it's time to swap out master for main as well.

Before this is merged, I should test once more on Java 18 locally. I'd also like to review the old issues about the instability of the Stanford output.

@MihaiSurdeanu
Contributor

This strategy of keeping parallel minor versions is new to me. I was thinking of simply bumping this to 9.0.0. You prefer your idea because it allows us to develop in parallel?

@kwalcock
Member Author

The merge is currently waiting for me to figure out this error observed when the change is integrated into Eidos. It's unlikely to be the only problem.

[info] TestEidosTokenizer:
[info] org.clulab.wm.eidos.system.TestEidosTokenizer *** ABORTED ***
[info]   java.lang.IllegalArgumentException: Could not load codec 'Lucene62'.  Did you forget to add lucene-backward-codecs.jar?
[info]   at org.apache.lucene.index.SegmentInfos.readCodec(SegmentInfos.java:428)
[info]   at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:360)
[info]   at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:291)
[info]   at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:61)
[info]   at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:58)
[info]   at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:688)
[info]   at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:81)
[info]   at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:63)
[info]   at org.clulab.geonorm.GeoNamesIndex.<init>(GeoNamesIndex.scala:156)
[info]   at org.clulab.wm.eidos.context.GeoNormFinder$.fromConfig(GeoNormFinder.scala:37)
[info]   ...
[info]   Cause: java.lang.IllegalArgumentException: An SPI class of type org.apache.lucene.codecs.Codec with name 'Lucene62' does not exist.  You need to add the corresponding JAR file supporting this SPI to your classpath.  The current classpath supports the following names: [Lucene70]
[info]   at org.apache.lucene.util.NamedSPILoader.lookup(NamedSPILoader.java:116)
[info]   at org.apache.lucene.codecs.Codec.forName(Codec.java:116)
[info]   at org.apache.lucene.index.SegmentInfos.readCodec(SegmentInfos.java:424)
[info]   at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:360)
[info]   at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:291)
[info]   at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:61)
[info]   at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:58)
[info]   at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:688)
[info]   at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:81)
[info]   at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:63)
[info]   .

@kwalcock
Member Author

The version number is not critical for me. There can be a more complicated mapping. Maybe it's more important to have a logical progression of numbers in the main branch and not skip numbers based on (perceived) needs of a side branch.

@MihaiSurdeanu
Contributor

  • I would favor a logical, incremental progression. We can keep track of these parallel versions for any backporting.
  • Is the Lucene bug caused by mismatched Lucene versions? I think CoreNLP also requires Lucene. Maybe Eidos does too?

@kwalcock
Member Author

The problem above was solved with

    "org.apache.lucene"           % "lucene-backward-codecs"   % luceneVer,
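
For context, here is a rough sketch of how that line might sit in an sbt build. The thread does not show the surrounding build definition or the value of `luceneVer`, so both are assumptions here; the version string is only a placeholder for whichever Lucene 7.x release the project already resolves (the error above lists `Lucene70` as the only codec on the classpath).

```scala
// build.sbt (illustrative fragment, not the actual Eidos build)

// ASSUMPTION: placeholder value. Use the Lucene version already on the
// classpath so that the codec jars and lucene-core stay in sync.
val luceneVer = "7.5.0"

libraryDependencies ++= Seq(
  // Supplies older codecs such as 'Lucene62', so that an index written
  // by an earlier Lucene release can still be opened for reading.
  "org.apache.lucene" % "lucene-backward-codecs" % luceneVer
)
```

The alternative would be to rebuild the GeoNames index with the newer Lucene, but adding the backward-codecs jar leaves the existing index untouched.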

After that, the tests run. For my records, the tests listed below then fail. I suspect the failures come from changed tags, such as:

TO -> IN
nmod -> obl
nmod_to -> obl_to
dobj -> obj

I don't know whether it's worth updating any rules or hard-coded tags. Eidos doesn't necessarily need to use this processors update.
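
If updating the rules turned out not to be worthwhile, one conceivable stopgap would be to translate the new dependency labels back to the old ones before the rules see them. The sketch below is purely hypothetical: `LegacyLabels` and `toLegacy` are names invented for illustration, nothing like this exists in processors or Eidos as far as this thread shows, and it covers only the dependency pairs listed above. The part-of-speech change (TO -> IN) is left out on purpose, because IN also covers other prepositions and cannot be mapped back from the tag alone.

```scala
// Hypothetical compatibility shim; names are illustrative only.
object LegacyLabels {
  // Exact label pairs taken from the list above (new -> old).
  private val exact: Map[String, String] = Map(
    "obj" -> "dobj",
    "obl" -> "nmod"
  )

  // Maps a new-style dependency label to its old-style counterpart.
  // Labels that are not known to have changed pass through untouched.
  def toLegacy(label: String): String =
    if (label.startsWith("obl_")) "nmod_" + label.stripPrefix("obl_")
    else exact.getOrElse(label, label)
}
```

For example, `LegacyLabels.toLegacy("obl_to")` yields `"nmod_to"`, while a label such as `"nsubj"` is returned as is. A shim like this would hide the new distinctions rather than make use of them, so it is at best a way to keep old tests passing while the rules are reviewed.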

[error] Failed tests:
[error]         org.clulab.wm.eidos.text.english.raps.TestRaps
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc5
[error]         org.clulab.wm.eidos.serialization.jsonld.TestJLDSerializer
[error]         org.clulab.wm.eidos.text.english.raps.TestRaps1
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc8
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP1
[error]         org.clulab.wm.eidos.text.english.cag.TestExtraText
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP0
[error]         org.clulab.wm.eidos.text.englishGrounding.TestSpecificGroundings
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc2
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP4
[error]         org.clulab.wm.eidos.system.TestHedging
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc3
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc6
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP3
[error]         org.clulab.wm.eidos.system.TestNegation
[error]         org.clulab.wm.eidos.rule.TestJointAdjectives
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc1
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc4
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP6
[error]         org.clulab.wm.eidos.system.TestEidosMention
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP2
[error]         org.clulab.wm.eidos.utils.TestMentionUtils
[error]         org.clulab.wm.eidos.system.TestEidosActions
[error]         org.clulab.wm.eidos.system.TestFiltering
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc7

@MihaiSurdeanu
Contributor

Probably true. But do you have more concrete examples where some of these fail?
