This repository has been archived by the owner on Jan 5, 2022. It is now read-only.
Assessment
Michael J. Giarlo edited this page Jun 14, 2018
As part of assessing Vitro, we need to:
- Test the scalability of Vitro (and, by extension, VIVO) for RIALTO-type data at RIALTO phase 1 data volumes.
- Assess the ease, and the cost in time, of working with, developing on, and maintaining Vitro.
- Incorporate anything we have learned from the scaling questions.
- Highlight any open questions about Vitro (and VIVO) usage.
- If Vitro scales adequately, decide between Vitro and VIVO:
  - Driven by data modeling
  - Driven by ease of use / cost of use
The tests below assess the performance of Vitro (or other options identified in future) as the canonical RIALTO data store. Ingest options are listed in priority order, and starred columns mark the preferred test configurations. Every test loads the same standard sample: 4,596,065 triples (3,158,059 unique) from 73,479 N-Triples files.
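For reference, total and unique triple counts like those quoted above can be reproduced with a short script. The following is a minimal sketch, not the method actually used for this assessment. It assumes one triple per line (as N-Triples requires) and treats two triples as identical only when their lines match exactly after trimming whitespace, so it will overcount unique triples if the same statement is serialized in different ways. The function names and the `*.nt` file pattern are hypothetical.

```python
from pathlib import Path
from typing import Iterable, Iterator, Tuple


def count_triples(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (total, unique) triple counts for a stream of N-Triples lines.

    Assumption: uniqueness is decided by exact line text after stripping
    surrounding whitespace, not by RDF-level equivalence.
    """
    total = 0
    seen = set()
    for line in lines:
        statement = line.strip()
        # Skip blank lines and N-Triples comment lines.
        if not statement or statement.startswith("#"):
            continue
        total += 1
        seen.add(statement)
    return total, len(seen)


def iter_lines(directory: str, pattern: str = "*.nt") -> Iterator[str]:
    """Yield every line from every matching file in a directory (hypothetical layout)."""
    for path in sorted(Path(directory).glob(pattern)):
        with path.open(encoding="utf-8") as handle:
            yield from handle
```

Calling `count_triples(iter_lines("sample-data"))` would then report both numbers for a directory of sample files, where `sample-data` is a placeholder path.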
- inf == inferencing (within Vitro)
- index == indexing (within Vitro, default mappings to Solr)
- ⭐ == primary foci
| Vitro Ingest | no inf & no index | inf & no index | no inf & index ⭐ | inf & index ⭐ |
|---|---|---|---|---|
| 1 SPARQL Update API (Vitro) | Unable to run w/Vitro | Unable to run w/Vitro | Docs, Metrics | Dismissed |
| 2 Jena tdbloader (cmd line) | Dismissed | Dismissed | Dismissed | Dismissed |
| 3 TDB Java API (Vitro) | Dismissed | Dismissed | Dismissed | Dismissed |
| 4 Jena SPARQL Update (ARQ) | Dismissed | Dismissed | Dismissed | Dismissed |
| 5 Fuseki SPARQL Update API | n/a | Dismissed | Docs, Metrics | Dismissed |
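To illustrate what a SPARQL-Update-over-HTTP ingest involves, here is a minimal sketch that wraps the contents of one N-Triples file in an `INSERT DATA` update and prepares a POST request in the style of the SPARQL 1.1 Protocol. It is an illustration under assumptions, not the ingest code used in these tests: the function name, the endpoint URL, and the optional graph URI are all hypothetical, and the Vitro SPARQL Update API has its own request conventions that this sketch does not attempt to reproduce.

```python
import urllib.request
from typing import Optional


def build_update_request(
    endpoint: str,
    ntriples: str,
    graph_uri: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a SPARQL Update POST for a block of N-Triples.

    N-Triples statements are valid triple syntax inside INSERT DATA, so the
    file contents can be embedded as-is.
    """
    body = ntriples.strip()
    if graph_uri:
        # Target a named graph when one is given (hypothetical parameter).
        update = f"INSERT DATA {{ GRAPH <{graph_uri}> {{\n{body}\n}} }}"
    else:
        update = f"INSERT DATA {{\n{body}\n}}"
    return urllib.request.Request(
        endpoint,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update; charset=utf-8"},
        method="POST",
    )


if __name__ == "__main__":
    # Assumed endpoint: a local Fuseki dataset named "ds".
    request = build_update_request(
        "http://localhost:3030/ds/update",
        "<http://example.org/s> <http://example.org/p> <http://example.org/o> .",
    )
    print(request.data.decode("utf-8"))
```

Sending the request with `urllib.request.urlopen(request)` issues one HTTP call per file, which is why time per request appears as a separate metric for the HTTP-based options.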
- Sample Data Load Times
  - Total time to load all sample data
  - Time per RDF record / file
  - Time per request (if multiple requests, i.e., HTTP)
  - Scaling graph based on the above
- Sample Data Index Times (if indexing)
- Ops Metrics (taken from https://sulstats.stanford.edu/dashboard/db/servers):
  - CPU usage
  - memory usage
  - overall load
  - swap usage
  - perhaps an I/O-related metric would be good too
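The load-time metrics listed above can be collected with a small timing harness. The sketch below is one possible shape for it, assuming each file is loaded by a caller-supplied function; `LoadTimings`, `time_load`, and `load_one` are hypothetical names, and the total reported is the sum of per-file load times, so it excludes any overhead between files.

```python
import time
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, Iterable, List


@dataclass
class LoadTimings:
    """Per-file load durations, in seconds, in load order."""

    per_file_seconds: List[float] = field(default_factory=list)

    @property
    def total_seconds(self) -> float:
        # Sum of per-file durations; time spent between files is not included.
        return sum(self.per_file_seconds)

    @property
    def mean_seconds_per_file(self) -> float:
        return mean(self.per_file_seconds) if self.per_file_seconds else 0.0

    def cumulative(self) -> List[float]:
        """Running total after each file, suitable as points for a scaling graph."""
        running = 0.0
        points = []
        for seconds in self.per_file_seconds:
            running += seconds
            points.append(running)
        return points


def time_load(
    files: Iterable[str],
    load_one: Callable[[str], None],
    clock: Callable[[], float] = time.perf_counter,
) -> LoadTimings:
    """Time `load_one` for each file; `clock` is injectable for testing."""
    timings = LoadTimings()
    for name in files:
        start = clock()
        load_one(name)
        timings.per_file_seconds.append(clock() - start)
    return timings
```

Plotting `cumulative()` against the file index gives a simple scaling graph: a curve that bends upward suggests per-file load time grows as the store fills. When each file is sent as a single HTTP request, time per file and time per request coincide.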