Issue282 handling syntax errors in projects config #300

juhoinkinen · 2019-07-17T12:58:36Z

Raise a ConfigurationException instead of showing a full traceback in the following situations:

When trying to use (train/suggest) a project that has no backend set in projects.cfg:
```
$ annif train nobackend -p tests/projects.cfg any_trainingdata.tsv 
Error: Project 'nobackend': backend setting is missing
```
This is in line with the existing error for a missing vocab: "Error: Project 'maui-fi': vocab setting is missing".
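
The check for a missing setting can be sketched roughly as below. This is a minimal illustration, not Annif's actual code: the `ConfigurationException` class shown here, the `Project` wrapper, and the `_require` helper are hypothetical stand-ins, and only the message format is taken from the output quoted above.

```python
class ConfigurationException(Exception):
    """Hypothetical stand-in for the exception reported to the user."""

    def __init__(self, message, project_id=None):
        super().__init__(message)
        self.message = message
        self.project_id = project_id

    def format_message(self):
        # Mirrors the "Project '<id>': <message>" format quoted above
        if self.project_id is not None:
            return "Project '{}': {}".format(self.project_id, self.message)
        return self.message


class Project:
    """Hypothetical wrapper around one section of projects.cfg."""

    def __init__(self, project_id, config):
        self.project_id = project_id
        self.config = config

    def _require(self, key):
        # Turn a missing key into a short, readable configuration error
        try:
            return self.config[key]
        except KeyError:
            raise ConfigurationException(
                "{} setting is missing".format(key),
                project_id=self.project_id) from None

    @property
    def backend_id(self):
        return self._require('backend')
```

A command-line wrapper would then only need to catch this one exception type and print its formatted message instead of letting the traceback through.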

When trying to use a projects.cfg that has duplicated sections or entries:
```
$ annif train any_backend -p tests/projects_invalid.cfg any_trainingdata.tsv
Error: Error: While reading from 'tests/projects_invalid.cfg' [line 10]: section 'duplicatedproject' already exists
```
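
The wording of that message matches what Python's standard `configparser` produces for a duplicated section, so one way to get this behaviour is to catch the parser's duplicate errors and re-raise them as a configuration error. The sketch below assumes that approach; `read_projects_config` and the `ConfigurationException` class defined here are hypothetical names, not Annif's actual code.

```python
import configparser


class ConfigurationException(Exception):
    """Hypothetical stand-in for the exception reported to the user."""


def read_projects_config(path):
    # strict=True (the default) rejects duplicate sections and options
    config = configparser.ConfigParser(strict=True)
    try:
        with open(path, encoding='utf-8') as cfgfile:
            config.read_file(cfgfile)
    except (configparser.DuplicateSectionError,
            configparser.DuplicateOptionError) as err:
        # Keep the parser's one-line description, drop the traceback
        raise ConfigurationException(str(err)) from None
    return config
```

Catching only the two duplicate-related errors keeps other parsing problems visible; catching the broader `configparser.Error` would be an alternative if all syntax errors should be reported the same way.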

This closes #282.

osma · 2019-08-08T10:31:23Z

The Travis Python 3.5 build is failing; I think VW needs to be upgraded first. Maybe merge #296?

osma · 2019-08-08T10:31:54Z

failing build log: https://travis-ci.org/NatLibFi/Annif/jobs/559957266

codecov · 2019-08-08T10:58:01Z

Codecov Report

Merging #300 into master will increase coverage by 19.59%.
The diff coverage is 100%.

```
@@             Coverage Diff             @@
##           master     #300       +/-   ##
===========================================
+ Coverage    79.8%   99.39%   +19.59%
===========================================
  Files          55       55
  Lines        2842     2969      +127
===========================================
+ Hits         2268     2951      +683
+ Misses        574       18      -556
```

| Impacted Files | Coverage Δ | |
|---|---|---|
| annif/default_config.py | `100% <100%> (ø)` | ⬆️ |
| tests/test_project.py | `100% <100%> (+7.75%)` | ⬆️ |
| annif/project.py | `100% <100%> (+1.21%)` | ⬆️ |
| annif/corpus/subject.py | `100% <0%> (ø)` | ⬆️ |
| annif/corpus/types.py | `100% <0%> (ø)` | ⬆️ |
| annif/corpus/document.py | `100% <0%> (ø)` | ⬆️ |
| annif/corpus/convert.py | `100% <0%> (ø)` | ⬆️ |
| tests/test_corpus.py | `100% <0%> (ø)` | ⬆️ |
| annif/analyzer/__init__.py | `92.59% <0%> (+3.7%)` | ⬆️ |
| ... and 12 more | | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Last update d13f369...47d4f2a.

osma · 2019-08-12T09:56:41Z

Looks good, but there is one further case that could be covered. If analyzer is left unspecified, I get this:

```
  File "/home/oisuomin/git/Annif/annif/project.py", line 188, in _create_vectorizer
    tokenizer=self.analyzer.tokenize_words)
AttributeError: 'NoneType' object has no attribute 'tokenize_words'
```

Should this case be covered too, or left for another issue/PR?

juhoinkinen · 2019-08-12T10:03:20Z

> Looks good, but there is one further case that could be covered. If analyzer is left unspecified, I get this:
>
>       File "/home/oisuomin/git/Annif/annif/project.py", line 188, in _create_vectorizer
>         tokenizer=self.analyzer.tokenize_words)
>     AttributeError: 'NoneType' object has no attribute 'tokenize_words'
>
> Should this case be covered too, or left for another issue/PR?

I'll check whether this can be covered easily and implement it if so; otherwise I'll open a new issue about it.

juhoinkinen · 2019-08-15T08:01:29Z

> Looks good, but there is one further case that could be covered. If analyzer is left unspecified, I get this:
>
>       File "/home/oisuomin/git/Annif/annif/project.py", line 188, in _create_vectorizer
>         tokenizer=self.analyzer.tokenize_words)
>     AttributeError: 'NoneType' object has no attribute 'tokenize_words'
>
> Should this case be covered too, or left for another issue/PR?

Implemented raising ConfigurationException for this in the previous commit. For the test I changed the [noanalyzer] project to use the TFIDF backend, because the dummy backend does not need an analyzer and would not raise the ConfigurationException. That project was included for issue #148. I did not see any reason for the backend to remain dummy, but could @osma confirm?

osma

In general, changing [noanalyzer] to use tfidf should be fine. It was originally added in this commit, where the intent was to verify that it's possible to have a project without an analyzer. But it might be necessary to add more project configurations, e.g. "fasttext-noanalyzer", "vw_multi-noanalyzer" etc., to cover all the situations where an analyzer is needed but missing.
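
The idea discussed here, raising an error only when the analyzer is actually accessed but missing, could look roughly like the sketch below. All names in it (`Project`, `uses_analyzer`, `ignores_analyzer`, and this `ConfigurationException` class) are hypothetical, and the message text is assumed by analogy with the backend and vocab messages; it is not taken from the merged code.

```python
class ConfigurationException(Exception):
    """Hypothetical stand-in for the exception reported to the user."""


class Project:
    """Hypothetical wrapper around one section of projects.cfg."""

    def __init__(self, project_id, config):
        self.project_id = project_id
        self.config = config

    @property
    def analyzer(self):
        # Fail only when something asks for the analyzer, so backends
        # that never touch it keep working without the setting.
        spec = self.config.get('analyzer')
        if spec is None:
            raise ConfigurationException(
                "Project '{}': analyzer setting is missing".format(
                    self.project_id))
        return spec


def uses_analyzer(project):
    # Stand-in for a backend that tokenizes text (e.g. a TF-IDF one)
    return project.analyzer


def ignores_analyzer(project):
    # Stand-in for a backend that never reads the analyzer setting
    return 'dummy result'
```

With this shape, a project without an analyzer still works for a backend like `ignores_analyzer`, while `uses_analyzer` fails with a readable message rather than an `AttributeError` on `None`.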


juhoinkinen added 4 commits July 12, 2019 11:29

Test for ConfigurationException for missing backend entry in project

bd38123

Raise ConfigurationException for missing backend entry in project

9d5a248

Test for ConfigurationException for invalid project.cfg (duplications)

ab6003c

Raise ConfigurationException for duplications in project.cfg

863a574

juhoinkinen marked this pull request as ready for review August 8, 2019 10:23

osma mentioned this pull request Aug 8, 2019

Use vw version 8.7 #296

Merged

juhoinkinen self-assigned this Aug 12, 2019

juhoinkinen added the enhancement label Aug 12, 2019

Raise ConfigurationError for unspecifed analyzer but needed by backend

f03c320

osma requested changes Aug 20, 2019

annif/project.py (outdated, resolved)

annif/project.py (resolved)

Ignore needs_subject_vectorizer; error if analyzer field is accessed but missing

47d4f2a

osma approved these changes Aug 23, 2019

osma merged commit b660eed into master Aug 23, 2019

osma deleted the issue282-handling-syntax-errors-in-projects-config branch August 23, 2019 12:17

juhoinkinen added this to the 0.42 milestone Sep 2, 2019
