New repo robots.txt, sitemap.xml, and dataset landing json-ld reporting #21

Open
iannesbitt opened this issue Mar 16, 2023 · 3 comments
Labels: enhancement (New feature or request), v0.1.2 (Version 0.1.2 item)

@iannesbitt
Contributor

@mbjones has suggested that the onboarding process would benefit from a report on a new repository's robots.txt, sitemap.xml, and a dataset landing page: whether each exists and, perhaps, a simplified summary of its contents, including the JSON-LD extracted from the landing page. This would be a quick way to establish how ready the repo is for schema.org harvesting.
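
As a rough illustration of the three checks described above, here is a minimal sketch using only the Python standard library. It assumes the file contents have already been fetched as strings; the function names (`sitemaps_from_robots`, `locs_from_sitemap`, `jsonld_from_html`) are hypothetical and are not part of mnonboard.

```python
import json
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Namespace used by the sitemaps.org protocol for <urlset> and <sitemapindex>.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Return the URLs listed on 'Sitemap:' lines of a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments, then split on the first colon only,
        # because the URL itself contains colons.
        field, sep, value = line.split("#", 1)[0].partition(":")
        if sep and field.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


def locs_from_sitemap(sitemap_xml: str) -> list[str]:
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [el.text.strip() for el in root.iter(f"{SITEMAP_NS}loc") if el.text]


class _JsonLdParser(HTMLParser):
    """Collect the raw text of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))


def jsonld_from_html(html: str) -> list:
    """Return the parsed JSON-LD documents embedded in a landing page."""
    parser = _JsonLdParser()
    parser.feed(html)
    parser.close()
    docs = []
    for block in parser.blocks:
        try:
            docs.append(json.loads(block))
        except json.JSONDecodeError:
            # Malformed JSON-LD is skipped here; a real report would flag it.
            continue
    return docs
```

A report could then state, for each file, whether it was found and how many sitemap entries or JSON-LD blocks it yielded. Fetching, error handling, and validation against a profile are deliberately left out of this sketch.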

Still to be decided: whether this works best as an independent script, or as a set of functions within mnonboard that an independent script can call.

@iannesbitt iannesbitt added the enhancement New feature or request label Mar 16, 2023
@iannesbitt iannesbitt self-assigned this Mar 16, 2023
@mbjones
Member

mbjones commented Mar 16, 2023

Ideally what I would like is for that script to be both callable from the command line and deployable as a web service. For example:

$ ./so-report.py --profile dataone-full "https://arcticdata.io/catalog/view/doi%3A10.18739%2FA2SB3X09D"
# OR as a web service call:
$ curl -H "Accept: text/csv" https://api.dataone.org/so-report/dataone-full/https%3A%2F%2Farcticdata.io%2Fcatalog%2Fview%2Fdoi%3A10.18739%2FA2SB3X09D

The intent of the "profile" parameter is to select among different SHACL profiles (the profile file name without the .ttl extension). This obviously needs more thought, especially the second form, which would presumably also have a default text/html option for returning a report.
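
To make the two call forms above concrete, here is a small sketch of how a profile name and a percent-encoded target URL might be resolved. Everything in it is an assumption for illustration: the function names, the `profile_dir` argument, and the `/so-report/<profile>/<encoded-url>` path layout are taken from the example call rather than from any existing service.

```python
from pathlib import Path
from urllib.parse import unquote


def profile_path(profile: str, profile_dir: Path) -> Path:
    """Map a profile name such as 'dataone-full' to '<profile_dir>/dataone-full.ttl'."""
    # Reject empty names and anything containing a path separator, so a
    # request cannot point outside the profile directory.
    if not profile or "/" in profile or "\\" in profile:
        raise ValueError(f"invalid profile name: {profile!r}")
    return profile_dir / f"{profile}.ttl"


def parse_service_path(path: str) -> tuple[str, str]:
    """Split '/so-report/<profile>/<percent-encoded URL>' into (profile, url)."""
    # Split at most twice so the encoded URL stays in one piece.
    parts = path.strip("/").split("/", 2)
    if len(parts) != 3 or parts[0] != "so-report":
        raise ValueError(f"unexpected path: {path!r}")
    _, profile, encoded_url = parts
    return profile, unquote(encoded_url)
```

One thing the sketch surfaces: decoding the service path once yields `https://arcticdata.io/catalog/view/doi:10.18739/A2SB3X09D`, whereas the command-line example passes the identifier still percent-encoded (`doi%3A10.18739%2FA2SB3X09D`). Whether the service should double-encode the identifier or normalize both forms is left open here.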

@iannesbitt
Contributor Author

This is a great idea. It would require me to rework part of the onboarding script, but it's probably worth it. I'm not as familiar with the process of turning it into a web service, but I assume it wouldn't be too difficult given a well-defined working command line tool.

One question I have: is there a place where we've collected the profiles we'd be testing against here? I just have the one (soso 1.2.3) so far.

@mbjones
Member

mbjones commented Mar 20, 2023

That soso 1.2.3 is the main one we have settled on, but we've discussed having others, and there are examples of others in the same directory as the soso1.2.3.

Getting your script set up as a web service should be straightforward if you have everything encapsulated in well-defined functions, and don't have any logic in main() or the command-line parsing functions for the CLI.
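
The separation described above might look roughly like the following sketch, in which `main()` only parses arguments and prints. The names `build_report` and `render_csv`, the column names, and the placeholder row are all hypothetical; the real checks are not shown, and `--profile` is made required here only to avoid assuming a default.

```python
import argparse
import csv
import io

# Hypothetical report columns, chosen only for this illustration.
FIELDS = ["url", "profile", "check", "result"]


def build_report(url: str, profile: str) -> list[dict]:
    """Core logic: run the checks and return one row per check.

    Placeholder only. The real robots.txt, sitemap.xml, and JSON-LD checks
    would be called from here.
    """
    return [{"url": url, "profile": profile,
             "check": "placeholder", "result": "not implemented"}]


def render_csv(rows: list[dict]) -> str:
    """Serialize report rows as CSV text (one possible output format)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def main(argv=None) -> int:
    """Thin CLI wrapper: parse arguments, call the functions, print."""
    parser = argparse.ArgumentParser(prog="so-report.py")
    parser.add_argument("--profile", required=True,
                        help="profile name without the .ttl extension")
    parser.add_argument("url", help="dataset landing page URL")
    args = parser.parse_args(argv)
    print(render_csv(build_report(args.url, args.profile)), end="")
    return 0

# A script entry point would finish with: raise SystemExit(main())
```

With this layout, a web service handler could call `build_report` and `render_csv` directly (or an HTML renderer for the text/html case) without going through argument parsing.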

@iannesbitt iannesbitt added the v0.1.2 Version 0.1.2 item label Oct 4, 2023
@iannesbitt iannesbitt added this to the 0.1.2 milestone Oct 4, 2023