API wishlist

Except for `uri`, all of the parameters below can be specified either at the root (to query everything) or within a specific partition or collection to restrict the scope.

Parameter	Status	Proposed by	Description
`uri`	Live		Perform an identifier lookup; if the item is found, the response is a 30x redirect to it
`q`	Live		Locate items containing the specified text
`limit`	Live		Limit the result set to n items
`offset`	Live		Return results starting at item #n
`class`	Live		Restrict results to those having the specified class URI
`media`	Live		Restrict results to those Creative Works and Concepts which have associated media of the specified class (`any`, `collection`, `dataset`, `video`, `image`, `interactive`, `software`, `audio`, `text`, or class URI)
`type`	Live		Restrict results to those Creative Works and Concepts which have associated media delivered as the specified MIME type (e.g., `text/html`, `audio/mp4`)
`for`	Live		Include media whose restricted-audience URI matches the given URI
`score`	Live		Set the minimum prominence score that items must have to appear in results
`mode`	Live		Set to `autocomplete` in order to perform stem matching
`lang`	Live		When performing text-based queries, specify the language of the search terms (e.g., `cy-GB`)
`about`	Proposed	MM	Restrict results to those items which have one or more of the specified concept URIs as a topic
`duration-min`, `duration-max`	Dev	Covatic	Restrict results to works with media whose duration, in seconds, falls within the specified range (either bound is optional)
`date`	Proposed		Restrict results to (a) events occurring on the specified date; and (b) works with media which was published or broadcast on the specified date
`similar`	Proposed	GS	Restrict results to those items within a certain (optionally specifiable) distance of the n-dimensional coordinates of the specified item(s)
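
As a rough illustration of how the parameters in the table combine, the sketch below builds a query URL using only the Python standard library. The base URL `https://example.org/` and the `build_query` helper are hypothetical; only the parameter names are taken from the table.

```python
from urllib.parse import urlencode, urljoin

# Hypothetical base URL; substitute the real root, partition or collection URL.
BASE = "https://example.org/"


def build_query(scope="", **params):
    """Return a query URL for the given scope ("" means the root).

    Keyword arguments use underscores in place of hyphens, so
    duration_min is sent as the duration-min parameter.
    """
    # Drop unset parameters and restore the hyphenated parameter names.
    pairs = [(k.replace("_", "-"), v) for k, v in params.items() if v is not None]
    query = urlencode(pairs)
    url = urljoin(BASE, scope)
    return url + "?" + query if query else url


# Third page of ten Welsh-language text matches that have associated video.
url = build_query(q="castles", media="video", lang="cy-GB", limit=10, offset=20)
print(url)
```

Passing a scope path instead of the default queries a partition or collection rather than the root; the path layout itself is not specified here.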

Future work for the Datalab graph:

Strengthen the graph (make it more usable)

Put on stable platform
Proper ETLs to move the data
Get data from the authoritative sources (rather than some of the shortcuts we have used to date)
Make relationships (between content) queryable
Make it easy to mass extract for analytics and machine learning (e.g. all content names and descriptions)
Better search against the content
Different access requirements for different data

Widen the graph (add more content)

Long-form articles (news and sports)
Interactive
Bitesize
Taster
Recipes
Weather

Deepen the graph (add more data for the content)

Channel

Ought to be present in the data, indexing/query TBC

Screening times

Present but not meaningfully indexed (broadcast events are first-order entities)

Existing tags (as already in the system)

Straightforward

ML-based descriptors (with confidence)

Named graphs with their own attributes → index confidence factors

Key people (director, actors, etc.)

As with existing tags

Target audience age
Audience numbers
Production costs
Mood (using http://mood.bbcredux.com/)

List of example requests we want to be able to run against the graph:

Give me … pieces of content of a specific type with a specific length that cover these topics …

`limit`, `type`, `duration-min` and `duration-max`, `about`
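
A minimal sketch of how this request might map onto a query string. Note that `about` is still only proposed and the duration parameters are in development, so this cannot be run against the live API. The `topic_query` helper and the concept URI are hypothetical, and repeating `about` once per concept URI is an assumption; the way multiple URIs are passed has not been decided.

```python
from urllib.parse import urlencode


def topic_query(limit, mime_type, min_seconds, max_seconds, topics):
    """Build the query string for 'n items of a type and length about these topics'."""
    pairs = [("limit", limit), ("type", mime_type)]
    # Either duration bound is optional, so only send the ones that are set.
    if min_seconds is not None:
        pairs.append(("duration-min", min_seconds))
    if max_seconds is not None:
        pairs.append(("duration-max", max_seconds))
    # Assumption: one `about` parameter per concept URI.
    pairs.extend(("about", uri) for uri in topics)
    return urlencode(pairs)


# Five MP4 videos between one and five minutes long on a (made-up) concept.
qs = topic_query(5, "video/mp4", 60, 300, ["http://example.org/concepts/castles"])
print(qs)
```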

Give me … pieces of content of a specific type with a specific length that are similar to these pieces of content…

`limit`, `type`, `duration-min` and `duration-max`, `similar` (see the description of `similar` in the table above)
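
The `similar` parameter is described in the table as restricting results to items within some distance of the specified items' n-dimensional coordinates. The sketch below shows one way such a filter could behave. It assumes Euclidean distance and uses made-up two-dimensional coordinates; neither the metric nor the coordinate space is specified here, and `within_distance` is a hypothetical name.

```python
import math


def within_distance(items, reference_ids, max_distance):
    """Return the ids of items within max_distance of any reference item.

    `items` maps an item id to its coordinates; the reference items
    themselves are left out of the result.
    """
    refs = [items[ref_id] for ref_id in reference_ids]
    return [
        item_id
        for item_id, coords in items.items()
        if item_id not in reference_ids
        # Assumption: Euclidean distance between coordinate tuples.
        and any(math.dist(coords, ref) <= max_distance for ref in refs)
    ]


# Made-up coordinates for three items.
items = {"a": (0.0, 0.0), "b": (0.3, 0.4), "c": (3.0, 4.0)}
nearby = within_distance(items, ["a"], 1.0)
print(nearby)
```

Making the distance an explicit argument mirrors the "optionally specifiable" distance in the proposal; a service-side default would replace it when the caller omits one.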

Tell me how much content we have on …
Tell me how much content we have on … of length … that was created before …
Tell me all the names and descriptors of news articles that were created since …
Tell me the average length of content on … and how that compares based on which year it was created in
Tell me how many minutes of total content we have on …
Give me all the descriptions used for content that was created in the last … months

Provide feedback

Saved searches