The performance of StanfordCoreNLP vs. spaCy #3

Shiyun-W · 2024-01-13T03:27:48Z

Hi,

Thank you very much for your extraction tool.
I need to extract clauses from sentences. I see that you have improved the sentence-to-clause splitting, so I tried the modul_stanfordSent module in your package, but it failed on this example sentence: "Because Mary and Samantha arrived at the bus station before noon, I did not see them at the station.".
The error is:
`IndexError Traceback (most recent call last)
Cell In[8], line 3
1 from extractreq.modul_stanfordSent import stanford_clause
2 sent = "Because Mary and Samantha arrived at the bus station before noon, I did not see them at the station."
----> 3 stanford_clause().get_clause_list(sent)

File c:\Program Files\softwares\Anaconda3\envs\pytorch\lib\site-packages\extractreq\modul_stanfordSent.py:129, in stanford_clause.get_clause_list(self, sent)
127 del t[i]
128 for i in sub_conj_pos:
--> 129 del t[i]
130 subject_phrase = ' '.join(t.leaves())
131 for i in verb_phrases: # update the clause_list

File c:\Program Files\softwares\Anaconda3\envs\pytorch\lib\site-packages\nltk\tree\parented.py:135, in AbstractParentedTree.__delitem__(self, index)
133 # del ptree[(i,)]
134 elif len(index) == 1:
--> 135 del self[index[0]]
136 # del ptree[i1, i2, i3]
137 else:
138 del self[index[0]][index[1:]]

File c:\Program Files\softwares\Anaconda3\envs\pytorch\lib\site-packages\nltk\tree\parented.py:124, in AbstractParentedTree.__delitem__(self, index)
122 raise IndexError("index out of range")
123 # Clear the child's parent pointer.
...
--> 155 return list.__getitem__(self, index)
156 elif isinstance(index, (list, tuple)):
157 if len(index) == 0:

IndexError: list index out of range`
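
Note: the traceback shows `del t[i]` failing inside a loop over stored positions. The package source is not shown here, so the cause is an assumption, but a common way to hit this kind of `IndexError` is deleting positions in ascending order, so that an earlier deletion shifts or removes a later target. Below is a minimal sketch of that failure mode and one possible workaround. Plain nested lists stand in for the NLTK tree, and `delete_naive` and `delete_safe` are hypothetical helper names, not part of extractreq or NLTK.

```python
import copy


def _delete_at(tree, pos):
    """Delete the node at tree position `pos` (a tuple of child indices)."""
    parent = tree
    for i in pos[:-1]:
        parent = parent[i]
    del parent[pos[-1]]


def delete_naive(tree, positions):
    """Delete in the given order; later positions may no longer be valid."""
    for pos in positions:
        _delete_at(tree, pos)
    return tree


def delete_safe(tree, positions):
    """Delete from the largest position down, so that no deletion
    shifts a node that still has to be deleted."""
    for pos in sorted(positions, reverse=True):
        _delete_at(tree, pos)
    return tree


# Toy stand-in for a parse tree: a nested list of words.
words = ["Because", ["Mary", "and", "Samantha"], "arrived", ",", "I"]
positions = [(0,), (4,)]  # positions computed before any deletion

try:
    delete_naive(copy.deepcopy(words), positions)
except IndexError as exc:
    print("naive deletion failed:", exc)

print(delete_safe(copy.deepcopy(words), positions))
```

Whether this applies to `get_clause_list` depends on how `sub_conj_pos` is built, which only the package source can confirm.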

My question is: have you ever compared the clause-extraction performance of Stanford CoreNLP and spaCy? If I use the spaCy module, will the performance be worse than with the Stanford module?

I would greatly appreciate it if you could answer my question. This is very important for my thesis.
Thank you in advance!

Shiyun-W · 2024-01-13T04:00:25Z

Hi, I also tried another example with the spaCy module: "we conclude that the regulated membrane localization of tiam1 through its nh2-terminal ph domain determines the activation of distinct rac-mediated signaling pathways.", but the result was: ['we conclude that the regulated membrane localization of tiam1 through its nh2-terminal ph domain determines the activation of distinct rac-mediated signaling pathways',
'that the regulated membrane localization of tiam1 through its nh2-terminal ph domain determines the']
Clearly, this is not the expected output, because the second clause is cut off after "determines the". How can I solve this problem?

asyrofist · 2024-01-13T18:10:34Z

Hi,
Actually, for several of the issues we ran into with spaCy and other language models, we now use LangChain. You can check out my newest code repository, which covers this, here:
https://github.com/asyrofist/LangChainProposed

In that repository we use LangChainProposed to solve this issue, because it relies on a Large Language Model (LLM) that fixes the problem. For me, that is the best way to learn how NLP works with generative AI.

Shiyun-W · 2024-01-14T04:20:50Z

Thank you very much! I will try it.

But I wonder how you define the task for the LLM: do you use it to extract the frame directly from the literature, or do you first split the literature into sentences and then use the LLM for the classification task?

asyrofist · 2024-01-14T07:29:56Z

Actually, since these are generative AI models, we just use a prompt that splits sentences into clauses, or into other atomic units such as words or subwords.
You can try it out by experimenting with several prompts. Have fun!
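
The prompt-based approach described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the prompt wording is invented here and is not taken from LangChainProposed, and `call_llm` is a hypothetical callable (prompt string in, reply string out) that you would back with whatever model client you use.

```python
import json
from typing import Callable, List

# Assumed prompt wording; experiment with your own variants.
PROMPT_TEMPLATE = (
    "Split the following sentence into its clauses. "
    "Return only a JSON array of strings, one clause per element.\n\n"
    "Sentence: {sentence}"
)


def build_clause_prompt(sentence: str) -> str:
    """Fill the template with one sentence."""
    return PROMPT_TEMPLATE.format(sentence=sentence.strip())


def parse_clauses(reply: str) -> List[str]:
    """Extract the JSON array from the reply; return [] if none is found."""
    start, end = reply.find("["), reply.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        items = json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return []
    return [c.strip() for c in items if isinstance(c, str) and c.strip()]


def split_into_clauses(sentence: str, call_llm: Callable[[str], str]) -> List[str]:
    """Send one sentence to the model and parse the clauses it returns."""
    return parse_clauses(call_llm(build_clause_prompt(sentence)))


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call, used only to exercise the code."""
    return (
        'Here you go: ["Because Mary and Samantha arrived at the bus station '
        'before noon", "I did not see them at the station"]'
    )


sentence = (
    "Because Mary and Samantha arrived at the bus station before noon, "
    "I did not see them at the station."
)
print(split_into_clauses(sentence, fake_llm))
```

Asking for a JSON array keeps the reply machine-readable, and the parser tolerates extra text around the array, which models often add.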

Shiyun-W · 2024-01-14T13:02:19Z

I understand. Thank you very much for your kindness and patience!
