Archipelago 2021 (first quarter) Roadmap:1.0.0-RC2 #103

Open
DiegoPino opened this issue Mar 18, 2021 · 0 comments
Labels
Cloud-ify Swap and match Concrete blocks for Cloudy skies Deployment Strategies What every vendor would love to Copy and pasta Docker Containers All about those tiny little critters documentation Improvements or additions to documentation Drupal9 Drupal9 is the new Drupal8 which was the new Drupal7 wich was the... enhancement New feature or request Future Release Duties We are all duty here, heavy duty tigresses and bears Community work and Archipelago Travel
Milestone

Comments

@DiegoPino
Member

DiegoPino commented Mar 18, 2021

Archipelago Spring 2021 Roadmap

See also #5 and #35 and #79 and #80 for the whole historic recreation.
This is our working enumeration of concrete tasks until April of 2021 (a year with many sub-years), per Component and Service, for public evaluation (new ideas, requests, criticism and comments welcome).

Calm waters around Archipelago allow us to navigate smoothly. The large number of instances already deployed, public and running, and the home-alone repositories we suspect exist, make us wonder how much more we can all discover or re-discover together.

Illustration from the children’s book ‘Oh wie schön ist Panama’ or ‘The trip to Panama’ by Janosch

Checked tasks are ready, unchecked are in progress or planned. Priority is not given by this order.

Please feel free to comment, request more info or ask for clarification. Feature requests are also highly appreciated and taken into account (always, please!).

Strawberryfield

  • Field Property exposure to Drupal strategies
    • JSON KEY Provider (flattener)
    • JSON Flatten Keys
    • JSONPATH/JMESPATH
    • Entity Reference Casting Provider (Using UUID loading and configurable entity type) using JSON based hints to expose any semantic relationship to Search API. New to RC2 with Entity Type Selector allowing also Terms/Taxonomies, etc.
    • JSON stored Service Endpoints with extended logic (e.g HOCR) - A.k.a Strawberry Flavor Data Source.
    • Multi Map/join: many properties to a single one, e.g. all Authority keys referring to creators, contributors, etc. unified as Agent keys. This leads to Fractal Ontologies and our Buckets approach.
  • File downloads and streaming
  • Ranged Request Streamer with back-to-front S3 management and buffer/memory management. For any exposed Binary Endpoint. Also streaming. This one cost us some sleep!
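
To make the "JSON Flatten Keys" idea above more concrete, here is a minimal sketch in Python. It is not Archipelago's actual implementation (which lives in Drupal/PHP); the function name `flatten_keys` and the sample metadata are assumptions made for illustration only. It shows how nested JSON can be reduced to dotted key paths, each holding every leaf value found under it, which is the kind of shape a search index can consume.

```python
from typing import Any, Dict, List


def flatten_keys(value: Any, prefix: str = "") -> Dict[str, List[Any]]:
    """Collect every leaf value under a dotted key path, ignoring list indexes."""
    flat: Dict[str, List[Any]] = {}
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            for sub_path, leaves in flatten_keys(child, path).items():
                flat.setdefault(sub_path, []).extend(leaves)
    elif isinstance(value, list):
        # Lists do not add a path segment; their members share the parent key.
        for child in value:
            for sub_path, leaves in flatten_keys(child, prefix).items():
                flat.setdefault(sub_path, []).extend(leaves)
    else:
        flat.setdefault(prefix, []).append(value)
    return flat


# Hypothetical descriptive metadata, not a real Archipelago schema.
ado = {
    "label": "A trip to Panama",
    "creator": [{"name": "Janosch", "role": "author"}],
    "subjects": ["tigers", "bears"],
}
flat = flatten_keys(ado)
```

With this sample, `flat["creator.name"]` holds `["Janosch"]` and `flat["subjects"]` holds both subject strings.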

JSON representation and enrichment

  • Better File management (Better than Drupal)
    • File referencing via UUID instead of via Entity ID
    • Handle temporary files when moving from TEMP storage to PERMANENT
    • Increment file usage count on new versions
    • Decrement file usage count on version removal
    • Change file usage on Delete, EDIT on existing active content and versions
    • Add Webform based UI management (reorder, replace, delete) for files
    • File based Post processing
      • TECHMD
      • ~~ ZIP/UNZIP ~~ MOVED to Strawberry Runners.
      • Derivative for larger MEDIA (video and Sound) MOVED to Strawberry Runners.
      • Pronom Service/Preservation
  • New JSON Service Architecture reference
  • On Node save, deposit the whole, self-sustaining Strawberry JSON blob in S3/Minio/FileSystem
  • Keep track of Service and action on Ingest/edit using Activity Streams
  • Add more agent information on our activity streams for provenance and tracking. AMI now also adds Set IDs
  • Add more, and better, Event Driven Subscribers.
  • Hook-able and override-able storage Pattern for files. @alliomeria we need to doc this for Developers.
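
As a rough illustration of tracking Service and action on ingest/edit with Activity Streams, the sketch below builds a small Activity Streams 2.0 style record in Python. The helper name `build_activity`, the actor name and the UUID are hypothetical; the exact keys Archipelago writes into its JSON may differ.

```python
import json
from datetime import datetime, timezone


def build_activity(activity_type: str, actor_name: str, object_uuid: str, when: datetime) -> dict:
    """Build a minimal Activity Streams 2.0 style provenance entry."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": activity_type,  # e.g. "Create" on ingest, "Update" on edit
        "actor": {"type": "Person", "name": actor_name},
        "object": {"type": "Document", "id": f"urn:uuid:{object_uuid}"},
        "published": when.isoformat(),
    }


# Hypothetical values, for illustration only.
activity = build_activity(
    "Create",
    "demo-user",
    "5d4f0c3e-0000-4000-8000-000000000001",
    datetime(2021, 3, 18, 12, 0, tzinfo=timezone.utc),
)
serialized = json.dumps(activity, indent=2)
```

Keeping one such entry per save is one way to answer "who did what, and when" from the JSON alone.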

Webforms integration

  • Webform Driven UI Ingest with custom handler and widget
  • Handler allows direct CRUD without any node attached and also prepopulation of data using an existing node UUID @alliomeria we need docs here too 🥰 🥰
  • Create a set of Demo Webforms that cover base of our GLAM source data needs
  • Full Autosaving during Creation (sessions are kept alive for a week). Users can skip Steps, jump back and forth, and Validation will still happen, but at the end. Log out, come back, continue.
  • Allow Webform Field Widget selection be driven by RDF type and permissions.
  • Webform Widgets can start Open/Rendered or closed via settings and support hiding "cancel edit" to avoid users leaving the edit realm.
  • New Solr Aware Entity Select Views (with code to handle Solr to Entity) which allows
    • Complex autocomplete elements (like "get me all Digital Objects of Type Book with a green Cover the user can see")
  • New Fine grained Entity (node to node) reference possible through this.
  • CSV to JSON importer element
  • XML to JSON importer element
  • Strawberry transplanter. Any JSON into filled Webform Elements (display) using a twig template.
  • Special Date element ISO8601, with Ranges, Single Dates and free form representation.
  • Create new, better, LoD Webform elements
    • WIKIDATA
    • LoC (with support for any Suggest endpoint)
    • LoC with support MADS RDF Types
    • WIKIDATA Agents with LD Roles
    • WIKIDATA using custom SPARQL
    • Viaf
    • Multi Source, Multi Agent Element. Agents/Corporate can use now multiple Authority Controls.
    • Getty with exact and fuzzy search (updated to be better!)
    • Nominatim Geo reconciliation. Normal and Reverse.
    • Panorama Tour Building App (like 1200 lines of code, gosh!)
    • Image and EXIF extraction on upload for UI/facing previews.
  • Create Stub (temporary) WIKIDATA entities if query shows desired WIKIDATA entity does not exist upstream.
    • "publish" to wikibase functionality
    • Replace repo wide stub uri with official one once pushed.
    • Keep track, on the stub, of who is referencing it (bidirectional reference?)
  • Move Strawberryfield harvest Webform handler's logic to Event Subscribers. Stronger capabilities now.
    • Deal with as:images
    • Deal with as:documents, as:video, as:sound, as:dataset elements
    • Deal with as:models
  • Allow anonymous submits to be converted into proper Nodes by Admin (Self deposit, crowd sourced metadata) WOHO! This also allows self standing endpoints and custom mappings.
  • Make Webform API Interaction work with States (JS) by removing one Form wrapper.
  • Make Webform API Interaction more versatile for our use. Use as schema validator. WIP. AMI.
  • Add JS to avoid main node CRUD to submit/validate embedded Webform as widget
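
The "Special Date element" above accepts single dates, ranges and free form text. A minimal sketch of that three-way handling, assuming ranges are written as ISO 8601 intervals with a `/` separator and that anything unparseable is kept verbatim as free form, could look like this in Python. The function name `parse_date_value` and the returned tuple shape are assumptions, not the element's real API.

```python
from datetime import date
from typing import Optional, Tuple


def parse_date_value(raw: str) -> Tuple[Optional[date], Optional[date], Optional[str]]:
    """Return (start, end, free_form); free_form is set only when parsing fails."""
    text = raw.strip()
    try:
        if "/" in text:
            start_text, end_text = text.split("/", 1)
            start = date.fromisoformat(start_text)
            end = date.fromisoformat(end_text)
            if end < start:
                raise ValueError("range ends before it starts")
            return start, end, None
        single = date.fromisoformat(text)
        return single, single, None
    except ValueError:
        # Not a valid ISO 8601 date or range: keep what the user typed.
        return None, None, text


single = parse_date_value("1978-03-15")
ranged = parse_date_value("1990-01-01/1995-12-31")
free = parse_date_value("circa 1920")
```

Returning the free form text instead of raising keeps values such as "circa 1920" usable for display while still letting valid dates be indexed as dates.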

Media Displays

  • Display settings, new tab that shows only the active View Mode for an ADO
  • Admin/contextual block that shows how ADO to Type was chosen by the system (admin hint)
  • Add expected mime/type output to Media displays. Allows tagging media displays as JSON, XML, CSV, JSON-LD or HTML only.
    • React to mime type to allow JSON or XML output to be downloaded too.
    • Native/self rendering and Content-Type tagging with caching.
    • Automatic extraction from the template of required/used variables (context). Not front facing yet, but for sure useful for building a pick-and-choose (or Data color picker) to aid in Twig Template building
  • Webforms are injected as Context. So a Webform Element Title can be used to match its value.
  • AMI set id and URLs are injected as Context during batch ingest
  • Add new Data Views Plugin integration to allow Media Displays to preprocess values on views exposed as API endpoints
  • Version Media Displays (This is config)
  • Inline Preview with ADO selection. Means users can see the data, test the data and see the output with Live Updates even without saving
  • Per Metadata Display Extra data injection via any strawberry field that is added. @alliomeria we need docs!
  • Provide example Twig templates for
    • MODS
    • DC and
    • JSON-LD
    • GEOJSON
    • IIIF Manifest 2.1
    • IIIF Manifest 3.0
    • EAD2002 (With recursive C Element generation from CSV)
    • EAD3 (With recursive C Element generation from CSV)
    • a Carousel
  • Metadata Display Exposed endpoints (reuse as Standalone API/download/streams)
  • API builder via UI using Endpoints. Any API, OAI, IIIF, etc. Allows a VIEW to be injected to feed data. Arguments are filtered and fully customizable. WIP. Coming to RC2
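
In Archipelago these outputs are produced by Twig templates. As a language-neutral illustration of what a template for "IIIF Manifest 3.0" has to emit, here is a minimal Python sketch that assembles a single-canvas IIIF Presentation 3.0 manifest. The function name `build_manifest` and all URLs are hypothetical placeholders; a real template would pull them from the ADO's JSON.

```python
import json


def build_manifest(base_url: str, label: str, image_url: str, width: int, height: int) -> dict:
    """Assemble a one-canvas IIIF Presentation 3.0 manifest."""
    canvas_id = f"{base_url}/canvas/1"
    return {
        "@context": "http://iiif.io/api/presentation/3/context.json",
        "id": f"{base_url}/manifest",
        "type": "Manifest",
        "label": {"en": [label]},
        "items": [{
            "id": canvas_id,
            "type": "Canvas",
            "width": width,
            "height": height,
            "items": [{
                "id": f"{canvas_id}/page",
                "type": "AnnotationPage",
                "items": [{
                    "id": f"{canvas_id}/annotation",
                    "type": "Annotation",
                    "motivation": "painting",
                    "body": {
                        "id": image_url,
                        "type": "Image",
                        "format": "image/jpeg",
                        "width": width,
                        "height": height,
                    },
                    # The painting annotation targets the canvas it belongs to.
                    "target": canvas_id,
                }],
            }],
        }],
    }


# Hypothetical URLs, for illustration only.
manifest = build_manifest(
    "https://example.org/do/1234",
    "A trip to Panama",
    "https://example.org/iiif/1234/full/max/0/default.jpg",
    4000,
    3000,
)
manifest_json = json.dumps(manifest, indent=2)
```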

Field Formatters

  • Static IIIF Images
  • Open Seadragon IIIF Images
    • W3C Web Annotations! Box and Polygon, fully IIIF compliant with CRUD endpoints. Caches until you are ready to save.
    • Add thumbnail navigation
  • IABookreader IIIF Images
  • Panorama via IIIF now with webGL max texture calculator and max Image size/memory preprocessing to avoid breaking Cantaloupe when using 400MP images.
  • Panorama Tours via other Panorama Objects and IIIF, including Hotspots of many types
  • Metadata up-casters
  • Metadata up-casters with download endpoint (Metadata Display Exposed endpoints)
  • Video (HTML5)
  • Audio (HTML5)
  • PDF (custom, derived from the base PDF.js library. Not fancy, but Mozilla asks people NOT to use their fancy one directly and we agreed.)
  • Web annotations (IIIF) with JMESPATH fine grained selector of which Files to attach
  • Complex nested structures (Whole graphs)
  • 3D! (Three + JSM)
  • 3D UV Mapping using IIIF Sources and Scene/Light settings
  • 3D Point Clouds from JSON or URLS
  • Mirador 3.0 (With Resource comparison and multi sourced IIIF manifests, using full release now)
  • Mirador 3 (second JS) with HOCR/Text Highlights using https://github.com/dbmdz/mirador-textoverlay
  • Expose View Mode to JSON Type value mapping that triggers automatic View Mode Selection
  • Webrecorder.io native player (WARC replay) with WACZ capabilities version 1.3.2
  • Lazy Image Loading via CSS class. JS driven, only loads (when used) Images when visible by the user (+100 px to give them some time to load while users navigate)
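
The "View Mode to JSON Type value mapping" above can be pictured as an ordered lookup: the first configured type found in the object's JSON decides the View Mode. The Python sketch below assumes the type lives under a `type` key and uses made-up View Mode names; both are illustrative assumptions, since the real mapping is site configuration.

```python
from typing import List, Tuple

# Hypothetical mapping; real type values and View Mode names are site configuration.
TYPE_TO_VIEW_MODE: List[Tuple[str, str]] = [
    ("Book", "digital_object_with_bookreader"),
    ("Photograph", "digital_object_with_seadragon"),
    ("Panorama", "digital_object_with_panorama"),
]


def select_view_mode(metadata: dict, default: str = "default") -> str:
    """Return the View Mode of the first mapped type found in the metadata."""
    declared = metadata.get("type", [])
    types = [declared] if isinstance(declared, str) else list(declared)
    for json_type, view_mode in TYPE_TO_VIEW_MODE:
        if json_type in types:
            return view_mode
    return default


book_mode = select_view_mode({"type": "Book"})
mixed_mode = select_view_mode({"type": ["Map", "Panorama"]})
fallback_mode = select_view_mode({"label": "untyped"})
```

Because the mapping is ordered, moving an entry up or down changes which View Mode wins when an object declares several types.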

API Ingest, Migration and backup

  • Strawberryfield Normalizer: expands JSON string as a JSON when exporting
  • Strawberryfield denormalizer: string-ify JSON when importing
  • Wrap JSONAPI in a set of Drush scripts (Strawberry Seeds) to
    • Allow file and node ingest from a single command line invocation
    • Create virtual field Entity "bucket" to allow Media to be ingested into those as links and routed to internal Strawberryfield elements (utility methods for ingest)
  • AMI (Archipelago Multi Import) First iteration
    • API Source (Other repos, ContentDM, Solr)
    • Google Spreadsheets (same as IMI)
    • Complete Drush 9 integration
    • AMI Set Entities
    • AMI Sets Entity processing via Batch or Enqueuing (for Hydroponics)
    • AMI Sets Delete Ingested ADOs by this Set via batch (to clear and reingest)
    • Reusable, canned public facing AMI ingest strategies. Users can only add the source data, all the rest is pre-setup.
    • S3 Sources for AMI
    • Local file (server) Sources for AMI
    • Remote HTTP sources for AMI
    • ZIP (in the works)
    • Folder as a source (in the works)
    • Vouchers
  • Filesystem drop-and-forget ingest. You save a JSON file into S3, Archipelago creates entities and relationships.
  • Use JSON API to allow seamless moving of dependent assets between repositories and also for backups
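
The normalizer/denormalizer pair at the top of this list boils down to a round trip: the field stores JSON as a string, exports expand it into real JSON, and imports turn it back into a string. A minimal Python sketch of that round trip, with hypothetical function names and sample data, might be:

```python
import json


def normalize(field_value: str) -> dict:
    """Export side: expand the stored JSON string into a JSON structure."""
    return json.loads(field_value)


def denormalize(payload: dict) -> str:
    """Import side: string-ify the JSON structure for storage in the field."""
    return json.dumps(payload, ensure_ascii=False, sort_keys=True)


# Hypothetical stored field value.
stored = '{"label": "A trip to Panama", "type": "Book"}'
exported = normalize(stored)
reimported = denormalize(exported)
```

Expanding on export means API consumers receive nested JSON rather than an escaped string they have to decode a second time.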

Service Architecture (Strawberry Runners)

  • Develop webhook driven notification service for derivatives
  • Custom, user facing Plugins. Build your own derivative workflows (system calls, JSON processing, etc)
  • Document/deploy webhook triggers for minio S3 per mimetype
  • Document/deploy webhook triggers for AWS S3 (via lambda) per mimetype
  • Develop Shell processing using Custom Plugins (Processors) and user configurable for each case (rule system)
  • Allow Processor to be chained! And have multiple outputs.
  • Queue-worker processing
  • Generate JSON reference-able Services (plugins) for complex non descriptive metadata and data
    • HOCR
    • TECHMD
    • WACZ
    • Web Annotations
    • Tabular datasets
    • Transcripts (similar to Web Annotations, mostly dependent)
    • File Conversions (any that your Shell allows) with reingest
    • Smart checks on existing processed output to avoid double processing.
    • ~~ Build slim Content entity that can be used to index natively that content into Solr via search API ~~ This is now a fully capable Search API Datasource that can hold any output. one (node) to many (files) to even more sequences.
    • Allow Services to self-describe their capabilities. WIP on how we expose this to the world. Probably GET will be allowed
    • Two Hydroponics approaches: a Single Thread linear one (default) and a Multi Child one with a configurable number of spawned children. All using ReactPHP

SEO and API

  • Allow Media displays output to be embedded in HTML head for SEO
  • Test/Develop nested DATA VIEWS integration for OAI-ORE and OAI-PMH (See Format Strawberryfield and API builder)
  • Create (TWIG, metadata displays) and expose as endpoints full set of IIIF API JSON outputs.
  • Add helper methods and twig extensions to allow Metadata displays to access pre existing views (like object listings for a collection) to help build those lists.

ACL / Permissions

  • Integrate custom ACL with JSON Paths into per NODE ACL. This allows permissions to be applied to individual metadata elements/paths.
  • Same but needs better UI for referenced Services and Media
  • Allow Metadata (rule) to trigger ACL permissions, e.g. if embargo_date matches a given condition, remove public access
  • Allow for ACL inheritance (from parent, recursive) without hard copies.
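
A metadata rule triggering an ACL decision, as in the embargo example above, could be as small as the following Python sketch. It assumes `embargo_date` is an ISO 8601 date string and that public access is withheld until that date is reached; both the rule and the function name `public_can_view` are illustrative assumptions, since this item is still planned.

```python
from datetime import date


def public_can_view(metadata: dict, today: date) -> bool:
    """Public access is allowed unless an embargo date lies in the future."""
    embargo = metadata.get("embargo_date")
    if embargo is None:
        return True
    return date.fromisoformat(embargo) <= today


# Hypothetical metadata values.
today = date(2021, 3, 18)
open_access = public_can_view({"label": "No embargo"}, today)
embargoed = public_can_view({"embargo_date": "2022-01-01"}, today)
lifted = public_can_view({"embargo_date": "2020-06-30"}, today)
```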

Deployment and DevOPS

  • Sync Configurations and remove non used ones for minio branch / periodic for each Drupal release
  • Site-build and remove orphan blocks
  • Add more utility views
  • Enable JSONAPI by default on minio branch
  • Create jsonapi user with jsonapi credentials for minio branch
  • Create basic scripts to automate Docker/Bash operations
  • Update AWS deployer to match minio including docs and Cloud Services integration
  • XDEBUG integration. 2 PHP 7.4.9 Containers, Cookie based, routed by NGINX
  • Natural Language processing Service via Docker
  • Catmandu Docker container for large data mangling
  • Update all Strawberryfield modules script.
  • Drupal 8.9.13 and bumps on every module
  • Drupal 9.1.6 and bumps on every module
  • Solr 8.7 or 8.8.2, MYSQL 8.
  • D9 readiness proven and working of course 😄
  • DDEV deployment strategy
  • Archipelago Live with optimized folder structure and Production-ready AWS EC2 Docker deployment

Batch Operations

  • Bulk Batch Views PURE TEXT plugin (all via JSONPATCH, so it supports any operation) to
    • Replace existing JSON values
  • Bulk Batch Views JSONPATH plugin (all via JSONPATCH, so it supports any operation) to
    • Replace existing JSON values
    • Add to existing Values
    • Respect data type casted values, (entities, file references)
  • Bulk Batch Views MEDIA plugin to
    • Replace Media
    • Add Media
  • Bulk Batch Views ACL plugin to
    • Replace ACL and inheritance
    • Replace ACL individual Control List Elements
    • Add ACL individual Control List Elements
  • Integrate into Solr Results and Strawberryfield Taxonomy Term pages
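
Since the bulk plugins above are described as working via JSONPATCH, here is a minimal Python sketch of the two operations the list names, replacing existing JSON values and adding to existing values, expressed as JSON Patch (RFC 6902) operations. It is a deliberately small subset written for illustration: only `add` and `replace` are handled, root-level paths are not supported, and the helper names are hypothetical.

```python
import copy
from typing import Any, List, Tuple


def _walk(document: Any, pointer: str) -> Tuple[Any, str]:
    """Return (parent container, final token) for a non-root JSON Pointer."""
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in pointer.split("/")[1:]]
    parent = document
    for token in tokens[:-1]:
        parent = parent[int(token)] if isinstance(parent, list) else parent[token]
    return parent, tokens[-1]


def apply_patch(document: Any, operations: List[dict]) -> Any:
    """Apply 'add' and 'replace' operations to a copy of the document."""
    result = copy.deepcopy(document)
    for op in operations:
        parent, token = _walk(result, op["path"])
        if op["op"] == "replace":
            if isinstance(parent, list):
                parent[int(token)] = op["value"]
            elif token in parent:
                parent[token] = op["value"]
            else:
                raise KeyError(f"cannot replace missing key: {token}")
        elif op["op"] == "add":
            if isinstance(parent, list):
                # "-" means append to the end of the array.
                if token == "-":
                    parent.append(op["value"])
                else:
                    parent.insert(int(token), op["value"])
            else:
                parent[token] = op["value"]
        else:
            raise ValueError(f"unsupported operation: {op['op']}")
    return result


# Hypothetical ADO metadata and patch.
ado = {"label": "Old title", "subjects": ["tigers"]}
patch = [
    {"op": "replace", "path": "/label", "value": "New title"},
    {"op": "add", "path": "/subjects/-", "value": "bears"},
]
patched = apply_patch(ado, patch)
```

Applying the same patch list to every object selected in a View is what makes the approach suitable for batch work; the original object is left untouched until the patched copy is saved.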

Future roadmap

  • [x] Solr Cloud/ Consortial ensemble
  • Native Wikibase/Wikidata publishing

Documentation:

  • Devops and new repository deployers
  • Migration to and from.
  • Backup and restoring
  • Permissions, access and ACLs.
  • Metadata Professionals, JSON schema and schema-less. AS, DR and AP internal ontologies. UPDATED
  • [x] Metadata Professionals, Key concepts of Archipelago
  • [x] Metadata, Ingest and edit workflows.
  • [x] Displays, Formatters and Media Plugins (Twig)
  • Views Integration (Solr and Blocks)
  • Strawberry Field Exposed Keys and Plugins
    • Property Exposing strategies and configs
  • Media Management
  • Solr and Discovery
  • Extending and Coding
  • SEO
@DiegoPino DiegoPino added Cloud-ify Swap and match Concrete blocks for Cloudy skies Deployment Strategies What every vendor would love to Copy and pasta Docker Containers All about those tiny little critters documentation Improvements or additions to documentation Drupal9 Drupal9 is the new Drupal8 which was the new Drupal7 wich was the... enhancement New feature or request Future Release Duties We are all duty here, heavy duty tigresses and bears Community work and Archipelago Travel labels Mar 18, 2021
@DiegoPino DiegoPino modified the milestones: 1.0.0-RC2, 1.0.0 Mar 18, 2021
@DiegoPino DiegoPino pinned this issue Mar 18, 2021
@DiegoPino DiegoPino unpinned this issue Apr 11, 2023