Update Stanford NLP #643

kwalcock · 2022-06-15T16:52:03Z

No description provided.

MihaiSurdeanu · 2022-06-16T08:25:59Z

This might be superseded by PR #644 now, no?

kwalcock · 2022-06-16T15:42:19Z

#644 was merged into this one. Weren't you saying that the version number needs a significant bump here, like to 9, and that some projects, maybe Reach and possibly Eidos, would need to stay at 8? This one might be 9.5.2 and the old one 8.5.2, and for some time there might be a need to update them in parallel. If some change in 9.6 should also be available to old programs, it would be added to 8.6. That would probably necessitate having a processors8 branch that parallels master. If someone is fiddling with branches, maybe it's time to swap out master for main as well.

Before this is merged, I should test once more on Java 18 locally. I'd also like to look over the old issues about the instability of the Stanford output.

MihaiSurdeanu · 2022-06-16T18:24:13Z

This strategy of keeping parallel minor versions is new to me. I was thinking of simply bumping this to 9.0.0. Do you prefer your idea because it allows us to develop in parallel?

kwalcock · 2022-06-18T02:03:24Z

The merge is currently waiting for me to figure out this error observed when the change is integrated into Eidos. It's unlikely to be the only problem.

[info] TestEidosTokenizer:
[info] org.clulab.wm.eidos.system.TestEidosTokenizer *** ABORTED ***
[info]   java.lang.IllegalArgumentException: Could not load codec 'Lucene62'.  Did you forget to add lucene-backward-codecs.jar?
[info]   at org.apache.lucene.index.SegmentInfos.readCodec(SegmentInfos.java:428)
[info]   at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:360)
[info]   at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:291)
[info]   at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:61)
[info]   at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:58)
[info]   at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:688)
[info]   at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:81)
[info]   at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:63)
[info]   at org.clulab.geonorm.GeoNamesIndex.<init>(GeoNamesIndex.scala:156)
[info]   at org.clulab.wm.eidos.context.GeoNormFinder$.fromConfig(GeoNormFinder.scala:37)
[info]   ...
[info]   Cause: java.lang.IllegalArgumentException: An SPI class of type org.apache.lucene.codecs.Codec with name 'Lucene62' does not exist.  You need to add the corresponding JAR file supporting this SPI to your classpath.  The current classpath supports the following names: [Lucene70]
[info]   at org.apache.lucene.util.NamedSPILoader.lookup(NamedSPILoader.java:116)
[info]   at org.apache.lucene.codecs.Codec.forName(Codec.java:116)
[info]   at org.apache.lucene.index.SegmentInfos.readCodec(SegmentInfos.java:424)
[info]   at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:360)
[info]   at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:291)
[info]   at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:61)
[info]   at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:58)
[info]   at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:688)
[info]   at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:81)
[info]   at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:63)
[info]   .

kwalcock · 2022-06-18T02:07:26Z

The version number is not critical for me. There can be a more complicated mapping. Maybe it's more important to have a logical progression of numbers in the main branch and not skip numbers based on (perceived) needs of a side branch.

MihaiSurdeanu · 2022-06-18T07:04:06Z

I would favor a logical, incremental progression. We can keep track of these parallel versions for any backporting.
Is the Lucene bug caused by different Lucene versions? I think CoreNLP also requires Lucene, and maybe Eidos does too.

kwalcock · 2022-06-20T18:45:19Z

The problem above was solved with

    "org.apache.lucene"           % "lucene-backward-codecs"   % luceneVer,
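
For context, a minimal sketch of how that line might sit in a `build.sbt`. The version value and the `lucene-core` entry are assumptions for illustration, not taken from the Eidos build; only the `lucene-backward-codecs` line comes from this thread. The stack trace above reports that the classpath knows only `Lucene70` while the GeoNames index was written with `Lucene62`, so the extra module is there to supply the older codec.

```scala
// build.sbt fragment (sketch, not the actual Eidos build)

// Assumed placeholder: use whatever Lucene version the build already pins.
val luceneVer = "7.5.0"

libraryDependencies ++= Seq(
  // Assumed to be present already; it provides the current default codec.
  "org.apache.lucene" % "lucene-core"            % luceneVer,
  // The line from this thread: registers older codecs such as Lucene62
  // so that an index written by an earlier Lucene can still be opened.
  "org.apache.lucene" % "lucene-backward-codecs" % luceneVer
)
```

Keeping both artifacts on the same `luceneVer` avoids mixing Lucene releases on the classpath.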

After that, the tests run. For my records, the tests listed below are then failing. I suspect the failures come from changed tags, such as:

TO -> IN
nmod -> obl
nmod_to -> obl_to
dobj -> obj

I don't know whether it's worth updating any rules or hard-coded tags. Eidos doesn't necessarily need to use this processors update.
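
If hard-coded tags were to be updated, one option would be a small lookup table. The following is only a sketch under assumptions: `TagMigration` and its members are hypothetical names, the table holds just the four pairs listed above, and this thread does not establish that every `nmod` becomes `obl` in the new output.

```scala
object TagMigration {
  // Old label -> new label: exactly the pairs listed above, nothing more.
  val posChanges: Map[String, String] = Map("TO" -> "IN")
  val dependencyChanges: Map[String, String] = Map(
    "nmod"    -> "obl",
    "nmod_to" -> "obl_to",
    "dobj"    -> "obj"
  )

  // Return the new label if the old one is in the table; otherwise leave it unchanged.
  def migrate(label: String, changes: Map[String, String]): String =
    changes.getOrElse(label, label)

  // Labels a rule could accept in order to match both the old and the new output.
  def eitherVersion(label: String, changes: Map[String, String]): Set[String] =
    Set(label, migrate(label, changes))
}
```

For example, `TagMigration.migrate("dobj", TagMigration.dependencyChanges)` returns `"obj"`, while a label that is not in the table, such as `"nsubj"`, is returned unchanged. Accepting both labels via `eitherVersion` would let the same rules run against either processors version during a transition.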

[error] Failed tests:
[error]         org.clulab.wm.eidos.text.english.raps.TestRaps
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc5
[error]         org.clulab.wm.eidos.serialization.jsonld.TestJLDSerializer
[error]         org.clulab.wm.eidos.text.english.raps.TestRaps1
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc8
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP1
[error]         org.clulab.wm.eidos.text.english.cag.TestExtraText
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP0
[error]         org.clulab.wm.eidos.text.englishGrounding.TestSpecificGroundings
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc2
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP4
[error]         org.clulab.wm.eidos.system.TestHedging
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc3
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc6
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP3
[error]         org.clulab.wm.eidos.system.TestNegation
[error]         org.clulab.wm.eidos.rule.TestJointAdjectives
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc1
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc4
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP6
[error]         org.clulab.wm.eidos.system.TestEidosMention
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP2
[error]         org.clulab.wm.eidos.utils.TestMentionUtils
[error]         org.clulab.wm.eidos.system.TestEidosActions
[error]         org.clulab.wm.eidos.system.TestFiltering
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc7

MihaiSurdeanu · 2022-06-21T07:48:13Z

Probably true. But do you have more concrete examples where some of these fail?

kwalcock and others added 9 commits June 8, 2022 14:15

Update sbt

682c28c

Otherwise get "[error] [launcher] error during sbt launcher: java.lang.UnsupportedOperationException: The Security Manager is deprecated and will be removed in a future release" on Java 18.

Increment to val corenlpV = "4.4.0"

5c7647d

Ignore .bsp directory

8c11025

Fix Integer constructor warning

a34a334

Use artifactory without https

7846e1a

Update Scala versions

020a3b8

Solve reflection problem one way

0f3cf7d

It seems to be incredibly slow

Work on autoClose

dc42ff6

fixed some unit tests

2cda16c

kwalcock mentioned this pull request Jun 15, 2022

It should work with new versions of corenlp #639

Open

Check StanfordCoreNLP version

8361b53

Merge pull request #644 from clulab/kwalcock/modernize

4865896

Check StanfordCoreNLP version

kwalcock mentioned this pull request Jun 23, 2022

What if processors was updated clulab/eidos#1132

Closed

kwalcock marked this pull request as draft February 15, 2023 17:01
