Releases: CDRH/datura

ingest documentation, text spacing, and date_standardize

19 Jan 00:55

This release takes v0.2.0 out of beta and makes some minor changes:

Added

  • minor test for Datura::Helpers.date_standardize
  • documentation for web scraping
  • documentation for CsvToEs (transforming CSV files and posting to Elasticsearch)
  • instructions for installing JavaScript runtime files for Saxon

Changed

  • date_standardize now relies on strftime instead of manual zero padding for month, day
  • minor corrections to documentation
  • XPath: "text" is now ingested as an array and will be displayed delimited by spaces
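
The move from manual zero padding to strftime can be pictured with a minimal sketch. `standardize_date_sketch` is a hypothetical stand-in, not the signature or body of Datura::Helpers.date_standardize; it assumes that year, month, and day arrive as integers and that the target format is YYYY-MM-DD.

```ruby
require "date"

# Hypothetical stand-in for illustration; not Datura's actual helper.
# Returns "YYYY-MM-DD", or nil when the parts do not form a real date.
def standardize_date_sketch(year, month = 1, day = 1)
  return nil unless Date.valid_date?(year, month, day)

  # strftime pads month and day to two digits, so no manual "0" prefixing
  Date.new(year, month, day).strftime("%Y-%m-%d")
end

puts standardize_date_sketch(1887, 3, 5) # prints 1887-03-05
```

Letting the Date class do the formatting also means impossible dates (such as February 30) are caught before a string is produced.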

Migration

  • check that the "text" XPath still produces the desired output
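
One way to reason about this check is to look at how several matched pieces end up as one space-delimited string. The sketch below uses plain Ruby arrays rather than Datura itself; `display_text_sketch`, the sample values, and the trimming of whitespace are all assumptions made for illustration.

```ruby
# Hypothetical illustration; the sample strings stand in for the values
# that several "text" XPath matches might return.
def display_text_sketch(pieces)
  # Trimming and dropping empty matches is an assumption of this sketch;
  # the release notes only state that pieces are delimited by spaces.
  pieces.map { |piece| piece.gsub(/\s+/, " ").strip }
        .reject(&:empty?)
        .join(" ")
end

matches = ["Dear  Sir,", "", "  I received your letter.\n"]
puts display_text_sketch(matches) # prints Dear Sir, I received your letter.
```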

Changes to field and xpath behavior

24 Aug 16:51
3748b4a

This is a beta release; some issues are expected.

Added

  • Fields (and therefore methods) for ES JSON, such as extent, alternative, spatial, etc
  • Methods on the XToEs classes that format fields to accommodate default behavior
  • ES JSON uri now populated using default Orchid item path
  • Tests and fixtures for all supported formats except CustomToEs
  • get_elements returns nodeset given xpath arguments
  • spatial nested fields spatial.type and spatial.title

Changed

  • Arguments for get_text, get_list, and get_xpaths
  • XPaths for VRA and TEI to Elasticsearch
  • Default behavior for CsvToEs for some fields
  • Documentation updated
  • Install instructions now include RVM and gemset naming conventions
  • API field coverage_spatial is now just spatial
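
For documents already shaped around the older field name, the rename amounts to moving one key. A minimal sketch on a plain Ruby hash follows; the sample document, its values, and `rename_spatial_sketch` are hypothetical, while the subfield names type and title come from the notes above.

```ruby
# Hypothetical helper: returns a copy of the document with the
# coverage_spatial key moved to spatial. Other keys are left untouched.
def rename_spatial_sketch(doc)
  return doc unless doc.key?("coverage_spatial")

  renamed = doc.dup
  renamed["spatial"] = renamed.delete("coverage_spatial")
  renamed
end

old_doc = {
  "identifier" => "example.001", # made-up sample values
  "coverage_spatial" => [{ "type" => "city", "title" => "Lincoln" }]
}
new_doc = rename_spatial_sketch(old_doc)
```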

Migration

  • Change coverage_spatial nested field to spatial
  • get_text, get_list, and get_xpaths require changing arguments to keyword (like xml and keep_tags)
  • Recommend checking xpaths and behavior of fields after updating to this version, as some defaults have changed
  • Possible to refactor previous FileCsv overrides to use new CsvToEs abilities, but not necessary
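
The keyword-argument migration can be sketched as follows. `get_text_sketch` is a hypothetical method that borrows only the keyword names xml and keep_tags from the note above; its body, its other parameters, and the hash used as a stand-in document are assumptions and do not reflect Datura's implementation.

```ruby
# Stand-in "document": maps an XPath string to the markup it would select.
# A real override would pass parsed XML instead; this keeps the sketch runnable.
sample_doc = { "//title" => "<hi>Letter</hi> to a friend" }

# Hypothetical method mirroring only the keyword names mentioned above.
def get_text_sketch(xpath, xml:, keep_tags: false)
  raw = xml.fetch(xpath, "")
  keep_tags ? raw : raw.gsub(/<[^>]+>/, "")
end

plain  = get_text_sketch("//title", xml: sample_doc)
tagged = get_text_sketch("//title", xml: sample_doc, keep_tags: true)

# Passing the same options positionally raises instead of being accepted.
positional_error = nil
begin
  get_text_sketch("//title", sample_doc, true)
rescue ArgumentError => e
  positional_error = e.message
end
```

Overrides that still pass these options by position should fail loudly with an ArgumentError, which makes them straightforward to find after upgrading.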

Improvements to CSV, WEBS transformers and adds Custom transformer

24 Apr 13:42

Added

  • CsvToEs class added which imitates style of other XToEs classes for easier overriding / maintenance
  • Custom formats now supported, although no functionality provided since the type of format cannot be predicted
  • Adds documentation for custom format setup

Changed

  • CSV to ES transformation no longer accepts default column names, but instead looks for columns matching ES fields to use
  • FileType elasticsearch transform now has swappable component when reading XML-type files. Webscraping script altered to manipulate HTML instead of XML object type
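
The idea of looking for columns that match ES fields can be illustrated with Ruby's standard CSV library. This is a sketch under stated assumptions, not CsvToEs itself: `rows_to_docs_sketch`, the default field list, and the sample rows are all hypothetical.

```ruby
require "csv"

# Hypothetical sketch: keep only the columns whose header matches a known
# Elasticsearch field name. The field list is an assumption, not the real schema.
def rows_to_docs_sketch(csv_text, es_fields: %w[identifier title extent])
  CSV.parse(csv_text, headers: true).map do |row|
    row.to_h.select { |header, _value| es_fields.include?(header) }
  end
end

csv_text = <<~ROWS
  identifier,title,internal_note
  doc.001,First letter,check date
  doc.002,Second letter,
ROWS

docs = rows_to_docs_sketch(csv_text)
# The internal_note column is ignored because it is not in es_fields.
```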

Removed

  • CSV to ES transformation used to automatically assume columns were ES fields; this functionality has been removed

VRA to Solr Alterations

11 Feb 14:55
5c8fc03

Minimal fixes and alterations to fields in VRA to Solr XSLT transformation.

PB Update

17 Sep 18:25
4cc7911

Changed

  • Removed match on pb/@xml:id for tei-to-html

IIIF Manifests

21 Aug 17:20

Added

  • IIIF output format and documentation
  • Changelog

Changed

  • nokogiri gem restricted to moving minor version instead of patch

Removed

  • pkg builds of gem
  • outdated comment line

Web scraping support, post by update time fix

15 May 16:34

Added webs format for minimal support of web scraping by specific apps

  • currently, collections using this feature will need to write all of their own processing code
  • no defaults or recommendations about config settings implemented at this time

Fixed --update flag, which was broken

  • added "today" shortcut for those who don't wish to type in the entire date
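
A shortcut like "today" can be resolved before the value is used for filtering. The sketch below is an assumption about how such a resolver might look, not the gem's code: `resolve_update_value_sketch` is hypothetical, and the ISO 8601 date format is assumed rather than taken from the release notes.

```ruby
require "date"

# Hypothetical resolver for the value given to --update; the flag name comes
# from the notes above, but this logic and the date format are assumptions.
def resolve_update_value_sketch(value, today: Date.today)
  return today.iso8601 if value.to_s.strip.casecmp?("today")

  # Date.iso8601 raises ArgumentError on malformed input
  Date.iso8601(value).iso8601
end

resolved = resolve_update_value_sketch("today", today: Date.new(2019, 5, 15))
```

Accepting `today:` as a parameter keeps the sketch testable without depending on the system clock.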

Misc other typo fixes, etc

Pre and post file transformation hooks

01 Mar 19:51

Adds ability to manipulate files before and after transformation
Accommodates ruby 2.6.x
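
The general shape of pre and post hooks can be sketched with overridable no-op methods. Every name below (`FileTransformSketch`, `before_transform`, `after_transform`) is invented for this illustration and should not be read as Datura's actual hook names; the upcase call is only a placeholder for a real transformation.

```ruby
# Hypothetical base class: the hook names are invented for this sketch
# and are not taken from Datura's source.
class FileTransformSketch
  def initialize(contents)
    @contents = contents
  end

  # No-op hooks that a collection could override
  def before_transform(text)
    text
  end

  def after_transform(text)
    text
  end

  def run
    after_transform(transform(before_transform(@contents)))
  end

  private

  def transform(text)
    text.upcase # placeholder for the real transformation
  end
end

# A collection-specific subclass overrides only the hooks it needs.
class TrimmedTransformSketch < FileTransformSketch
  def before_transform(text)
    text.strip
  end

  def after_transform(text)
    "#{text}\n"
  end
end

result = TrimmedTransformSketch.new("  hello  ").run
```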

Datura Gem Launch

21 Nov 16:02
44f60a1

Implements previous "data" repository functionality as a ruby gem, "datura"

Original Data Repository

12 Nov 17:50
d1247d7

This release contains the code of the data repository as it was when it was a collection of scripts. After this point, it will be a gem named datura.