Releases · pangaea-data-publisher/fuji

29 Apr 12:00

huberrob

v3.2.0

1af7412

v3.2.0 Latest

Changes from 3.1.0 to 3.2.0

Integrated FAIR testing for software; for more details see the following pull request:
- #478
Improved DCAT handling: existing license and access rights info is no longer overwritten; fixed incorrect handling of distribution info (bytesize type).
Re3data metadata lookup is now always performed; previously it was performed only if no service endpoint was given.
Improved RDFa handling: image tags are now ignored.
Upgraded to connexion v3 and Python 3.11.
Improved XML handling and schema recognition, e.g. for DDI formats.
Improved handling of non-HTML “landing content” for DOIs, see: #492
Improved handling of CC licenses; previously these were not always correctly recognized as valid.

10 Nov 18:41

huberrob

v3.1.0

0303a8b

v3.1.0

The main change in this release is the data_harvester behavior, which now uses threads to download data objects/files. This allows more data files to be included in the assessment. In detail, F-UJI now tries to analyse up to 5 files per mime type (as listed in the metadata).
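The threaded download with a per-type limit can be sketched roughly as follows. This is a minimal illustration, not F-UJI's actual data_harvester code: the names `select_files` and `download_all`, the `type`/`url` dictionary keys, and the caller-supplied `fetch` function are all assumptions made for the example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

MAX_FILES_PER_TYPE = 5  # limit stated in the release notes


def select_files(data_objects, limit=MAX_FILES_PER_TYPE):
    """Keep at most `limit` entries per claimed mime type, in listed order."""
    counts = defaultdict(int)
    selected = []
    for obj in data_objects:
        mime = obj.get("type")
        if counts[mime] < limit:
            counts[mime] += 1
            selected.append(obj)
    return selected


def download_all(data_objects, fetch, max_workers=4):
    """Fetch the selected objects concurrently.

    `fetch` is a hypothetical caller-supplied function taking a URL and
    returning the content; it stands in for the real HTTP download.
    """
    selected = select_files(data_objects)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as its input
        contents = list(pool.map(lambda obj: fetch(obj["url"]), selected))
    return list(zip(selected, contents))
```

Passing `fetch` in as a parameter keeps the sketch testable without network access; a real harvester would also need timeouts and error handling per download.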
Some other changes to note:

All: Fixed incorrect handling of some landing pages, which caused the evaluator to stop.
R1.1: Licenses packed as lists are now unpacked and correctly identified.
I3: In some cases scores for I3 improve because schema.org/citation is now included as a scanned relation property.
R1: File sizes given or interpreted as strings like 'None' were accepted as valid content, which caused incorrect (too high) scoring of R1. In these cases the score may now be lower, but it is correct.
R1: Improved handling of mime types that include e.g. charset info (text/plain; charset=US-ASCII) may result in a higher score for R1 (FsF-R1-01MD-3).
R1: Improved parsing of content length byte units may improve the scoring.
F2: Improved handling of RDF graphs containing DC or schema.org terms to describe the content may improve findability and other scores.
R1.3: F-UJI now uses threads to download more data objects (up to five files/links per claimed content type), which improves its capability to evaluate data content.
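The R1 items above (mime types carrying charset info, byte units in content lengths, and stringified 'None' values) all come down to normalizing metadata values before comparing them. The following is a minimal sketch of that idea, not F-UJI's implementation: the function names, the unit table, and the choice of decimal (kB = 1000) versus binary (KiB = 1024) factors are assumptions.

```python
import re

# Assumed unit table: decimal factors for kB/MB/GB, binary for KiB/MiB/GiB.
UNIT_FACTORS = {
    "b": 1, "byte": 1, "bytes": 1,
    "kb": 1000, "mb": 1000**2, "gb": 1000**3,
    "kib": 1024, "mib": 1024**2, "gib": 1024**3,
}

_SIZE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([a-zA-Z]*)\s*$")


def normalize_mime(value):
    """Return the bare mime type without parameters such as charset."""
    if not isinstance(value, str):
        return None
    mime = value.split(";", 1)[0].strip().lower()
    # Reject empty values and stringified nulls such as 'None'.
    if mime in ("", "none", "null"):
        return None
    return mime


def parse_size(value):
    """Return a size in bytes, or None if the value is not a usable size."""
    if isinstance(value, bool):
        return None
    if isinstance(value, int):
        return value if value >= 0 else None
    if not isinstance(value, str):
        return None
    match = _SIZE_RE.match(value)
    if not match:
        return None  # covers 'None' and other non-numeric strings
    number, unit = match.groups()
    factor = UNIT_FACTORS.get(unit.lower() or "b")
    if factor is None:
        return None
    return int(float(number) * factor)
```

With this, `normalize_mime("text/plain; charset=US-ASCII")` yields `"text/plain"`, and `parse_size("None")` yields `None` instead of being treated as valid content.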

13 Oct 12:16

huberrob

v3.0.0

2731632

v3.0.0

This new release allows configuration of the metric YAML, which also affects how tests are performed. More documentation about this will be published soon in the README.

Some changes in F-UJI's behaviour should be noted:

The role of the YAML metric definition file is more important now. It also allows defining individual scores and maturity levels, which are no longer hardcoded.
Metrics and tests which are not listed in the YAML file are not performed/assessed; this makes it possible to switch metrics and tests on or off and to define community-specific metrics in dedicated YAML files.
F-UJI is now able to use different metrics: the REST API has an additional parameter ‘metric_version’ by which the YAML file can be selected (default: metrics_v0.5.yaml).
F-UJI > 2.3.0 implements more tests, which makes it possible to define metrics and tests in specific YAML files that are more compatible with RDA and The Evaluator:

FsF-F1-01DD unique identifier of data
FsF-F1-02DD persistent identifier of data
FsF-F1-01M which will replace FsF-F1-01D unique identifier of metadata
FsF-F1-02M which will replace FsF-F1-02D persistent identifier of metadata
FsF-F3-02M (metadata include identifier of dataset)
FsF-F4-01M-2 which tests if OAI-PMH, SPARQL or CSW is used to offer metadata

For F3, A1, R1 and R1.3, F-UJI no longer uses the first data object but the first data object which is accessible (HTTP 200).
Fixed a bug which caused wrong scores for R1 because FsF-R1-01MD-3 was sometimes ignoring matching file sizes and types.
F-UJI now also recognizes resource types for R1 if given as URI, e.g. schema.org/Dataset.
Fixed a bug in 2.2.5 due to which signposting links to JSON-LD files were incorrectly accepted as a valid search engine support mechanism.
Fixed a bug which accepted stringified ‘None’ as an entry for file type and size and caused wrong scores for R1.
Improved license recognition
Improved JSON-LD handling
F-UJI truncates very large data files prior to testing, which caused R1 test FsF-R1-01MD-3 (Data content matches file type and size specified in metadata) to incorrectly compare the expected file size with the truncated size. F-UJI now compares the expected size with the size given in the HTTP header (if given) to perform this test for truncated files.
Prior to version 2.3.0, F-UJI correctly detected valid domain-agnostic metadata standards in R1.3 (FsF-R1.3-01M-3) but did not assign any score for this. This bug is fixed in F-UJI >= 2.3.0.
Prior to version 3.0.0, F-UJI accepted content negotiation in addition to HTML embedding and microdata as a search engine friendly way to offer metadata in FsF-F4-01M (Metadata is offered in such a way that it can be retrieved programmatically). Additionally, F-UJI did not verify the metadata standard and content offered via RDFa/microdata. Now F-UJI exclusively expects schema.org, DC or DCAT as search engine friendly metadata formats offered via HTML embedding and microdata/RDFa. It no longer considers empty RDFa content as it did before.
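The size check for truncated files described for FsF-R1-01MD-3 can be illustrated with a short sketch. It is an assumption-laden example rather than F-UJI's code: the function names, the exact-equality comparison, and the three-valued result (True, False, or None when no comparison is possible) are choices made for this illustration.

```python
def header_size(headers):
    """Read Content-Length from a header mapping, ignoring key case."""
    for key, value in headers.items():
        if key.lower() == "content-length":
            try:
                return int(value)
            except (TypeError, ValueError):
                return None
    return None


def size_matches(expected, downloaded, truncated, headers):
    """Compare the size claimed in the metadata with the observed size.

    Returns True or False for a comparison, or None if nothing reliable
    is available to compare against.
    """
    if expected is None:
        return None
    if not truncated:
        # Whole file was downloaded: compare against what was received.
        return expected == downloaded
    # File was cut off: the downloaded size is meaningless, so fall back
    # to the size the server reported in the HTTP header.
    reported = header_size(headers)
    if reported is None:
        return None
    return expected == reported
```

For a 10 MB file truncated to 1 MB, `size_matches(10_000_000, 1_000_000, True, {"Content-Length": "10000000"})` compares against the header value rather than the truncated download.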

15 Sep 07:19

huberrob

v2.2.5

fdb78cf

v2.2.5

This release makes it possible to reproduce the behavior of the F-UJI release used on f-uji.net since February 2023.

07 Oct 18:27

huberrob

v.2.0.2

71001bd

v2.0.2

Full Changelog: v1.4.9...v.2.0.2

This release is the first which is based on the completely restructured metadata_harvesting class. All metadata and PID collecting methods have moved there from fair_check. This allows easier testing and also using the harvester for other purposes.

16 May 08:18

huberrob

v1.4.9

9b252ac

v.1.4.9

Includes 1.7.9b

This will be the last version which uses metric 0.4

Improvements:

Improved signposting handling: better recognition in HTML as well as header; now focusses on metadata and identifier related links and ignores e.g. ORCID author links.
Improved JSON-LD handling, now tries to identify dataset (preferred) or creative work metadata in case several JSON-LD snippets are given (e.g. one for Webpage and another one for Dataset)
More mime types now recognized
Content negotiation now adds a preferred type, e.g. the one found in typed links
Namespace recognition now case insensitive
Improved Dublin Core parsing, now case insensitive
Improved XML mime type recognition
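The JSON-LD selection described above (prefer Dataset, then creative work, when a page embeds several snippets) might look roughly like this. It is a sketch under assumptions: `pick_snippet` and `PREFERRED_TYPES` are hypothetical names, and the matching only looks at the top-level `@type` of each snippet.

```python
import json
import re

# Order of preference taken from the release note: dataset first.
PREFERRED_TYPES = ("Dataset", "CreativeWork")


def _types(doc):
    """Return the set of local type names declared in a JSON-LD object."""
    value = doc.get("@type", [])
    types = value if isinstance(value, list) else [value]
    # Keep only the last path segment so that "https://schema.org/Dataset"
    # and "schema:Dataset" both match "Dataset".
    return {re.split(r"[/:#]", str(t))[-1] for t in types}


def pick_snippet(snippets):
    """Pick the preferred JSON-LD object from a list of raw JSON strings."""
    docs = []
    for raw in snippets:
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip snippets that are not valid JSON
        if isinstance(doc, dict):
            docs.append(doc)
    for wanted in PREFERRED_TYPES:
        for doc in docs:
            if wanted in _types(doc):
                return doc
    return None
```

Returning `None` when neither type is present is a design choice of this sketch; a harvester could equally fall back to the first parseable snippet.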

16 Mar 08:16

huberrob

v1.4.7

3e2f6d2

v1.4.7

includes 1.4.7b

19 Feb 12:12

huberrob

v.1.4.6

e30278c

v1.4.6

Merge pull request #252 from ignpelloz/master

Fix pyld missing (issue #251)

04 Jan 14:31

huberrob

v.1.4.3

2a5ac79

v.1.4.3 Pre-release

pyyaml see https://github.com/pangaea-data-publisher/fuji/security/dependabot/pyproject.toml/PyYAML/open

02 Aug 08:27

huberrob

v.1.3.5

7248d41

v.1.3.5 Pre-release

Merge pull request #193 from pangaea-data-publisher/robbranch2

v1.3.5
