Rob Speer edited this page Dec 7, 2015 · 50 revisions

There are three methods for accessing data through the ConceptNet 5 API: lookup, search, and association.

  • Lookup is for when you know the URI of an object in ConceptNet, and want to see a list of edges that include it.
  • Search finds a list of edges that match certain criteria.
  • Association is for finding concepts similar to a particular concept or a list of concepts.

Lookup

To look up an object by its URI, go to http://conceptnet5.media.mit.edu/data/5.4 followed by the URI. For example, the concept "toast", with URI /c/en/toast, can be found at: http://conceptnet5.media.mit.edu/data/5.4/c/en/toast

You can set the following GET arguments to modify the returned results:

  • limit = n: change the number of results from the default of 50.
  • offset = n: skip the first n results.
  • filter=core: filter the returned results to only those in the ConceptNet 5 Core (no Creative Commons ShareAlike-licensed results).

In English, the terms you supply in URIs should be normalized to their WordNet roots, and in any language, the terms should have spaces replaced by underscores. The conceptnet5.nodes.standardized_concept_uri function applies these transformations, or see the URI standardization section below.

Examples

To get the 6th through 10th highest-weight edges for "toast", you could go to: http://conceptnet5.media.mit.edu/data/5.4/c/en/toast?offset=5&limit=5

To see 50 statements submitted by user "rspeer", go to: http://conceptnet5.media.mit.edu/data/5.4/s/contributor/omcs/rspeer
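
As a minimal sketch of how the lookup arguments combine, the following Python builds a lookup URL from a URI that has already been standardized. The helper name lookup_url and its keyword arguments are assumptions made for this example; they are not part of ConceptNet.

```python
from urllib.parse import urlencode

BASE = "http://conceptnet5.media.mit.edu/data/5.4"


def lookup_url(uri, limit=None, offset=None, core_only=False):
    """Build a lookup URL for an already-standardized URI (hypothetical helper)."""
    params = {}
    if offset is not None:
        params["offset"] = offset      # skip the first n results
    if limit is not None:
        params["limit"] = limit        # override the default of 50
    if core_only:
        params["filter"] = "core"      # restrict to the ConceptNet 5 Core
    query = urlencode(params)
    return BASE + uri + ("?" + query if query else "")


# The 6th through 10th highest-weight edges for "toast":
print(lookup_url("/c/en/toast", limit=5, offset=5))
```

The printed URL is the same one given in the first example above.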

URI standardization

You can use the /data/5.4/uri endpoint to find out what the ConceptNet URI is for a given text, applying steps such as reducing English words to their root form.

It requires two parameters:

  • language: the code for the language to construct the URI in, such as 'en'
  • text: the text you want to standardize and turn into a URI

It returns a dictionary containing only a "uri" member, whose value is the appropriate URI.

For convenience in constructing this API query, you can write the text using underscores in place of spaces. This is harmless, because spaces will almost certainly be turned into underscores in the resulting URI anyway.

Example

To find the somewhat unintuitive URI for the term "ground beef", go to: http://conceptnet5.media.mit.edu/data/5.4/uri?language=en&text=ground_beef
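
A rough sketch of building this query and reading the reply in Python follows. The function names standardize_url and extract_uri are invented for this example, and the response body passed to extract_uri is a placeholder, not real output from the API.

```python
import json
from urllib.parse import urlencode

URI_ENDPOINT = "http://conceptnet5.media.mit.edu/data/5.4/uri"


def standardize_url(text, language="en"):
    """Build the /uri query; spaces are written as underscores for convenience."""
    params = {"language": language, "text": text.replace(" ", "_")}
    return URI_ENDPOINT + "?" + urlencode(params)


def extract_uri(body):
    """Pull the single "uri" member out of the JSON dictionary the endpoint returns."""
    return json.loads(body)["uri"]


print(standardize_url("ground beef"))

# Placeholder body, used only to show the shape of the reply.
print(extract_uri('{"uri": "/c/en/example"}'))
```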

Search

This endpoint searches for ConceptNet edges that match several criteria at once.

The base URL for searching is http://conceptnet5.media.mit.edu/data/5.4/search. You add GET arguments to this to specify what to search for.

The following arguments are supported:

  • { id, uri, rel, start, end, dataset, license } = URI: giving a ConceptNet URI for any of these parameters will return edges whose corresponding fields start with the given path.
  • limit = n: change the number of results from the default of 50.
  • offset = n: skip the first n results.
  • features = str: a feature string (an assertion with one open slot). Returns edges that have exactly that string as one of their features. Look at the features field of returned results for examples.
  • filter=core: filter the returned results to only those in the ConceptNet 5 Core (no Creative Commons ShareAlike-licensed results).

Examples

To find 10 things that are parts of a car, you can do this: http://conceptnet5.media.mit.edu/data/5.4/search?rel=/r/PartOf&end=/c/en/car&limit=10
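
The same query can be assembled programmatically. This is a minimal Python sketch in which search_url and SUPPORTED are names made up for the example; the set of accepted argument names is copied from the list above.

```python
from urllib.parse import urlencode

SEARCH_BASE = "http://conceptnet5.media.mit.edu/data/5.4/search"

# Argument names taken from the list of supported arguments.
SUPPORTED = {"id", "uri", "rel", "start", "end", "dataset", "license",
             "limit", "offset", "features", "filter"}


def search_url(**criteria):
    """Build a search URL from keyword arguments (hypothetical helper)."""
    unknown = set(criteria) - SUPPORTED
    if unknown:
        raise ValueError("unsupported arguments: " + ", ".join(sorted(unknown)))
    # safe="/" leaves the slashes in ConceptNet URIs unescaped, as in the examples.
    return SEARCH_BASE + "?" + urlencode(criteria, safe="/")


# Ten things that are parts of a car:
print(search_url(rel="/r/PartOf", end="/c/en/car", limit=10))
```

Rejecting unknown argument names is a design choice for this sketch; it catches typos such as "relation" before a request is sent.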

Result format

The result is a JSON data structure containing:

  • numFound: an estimate of how many matches there are total.
  • edges: the list of edges. Each edge is a JSON data structure containing all the fields of a ConceptNet edge.
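
To illustrate reading this structure, here is a small Python sketch. The response body is invented for the example and is deliberately reduced to the start, rel, and end fields; real edges carry more fields than shown. The function name summarize is also an assumption.

```python
import json

# Made-up response body for illustration only; not real API output.
SAMPLE_BODY = """
{"numFound": 2,
 "edges": [
   {"start": "/c/en/wheel", "rel": "/r/PartOf", "end": "/c/en/car"},
   {"start": "/c/en/engine", "rel": "/r/PartOf", "end": "/c/en/car"}
 ]}
"""


def summarize(body):
    """Return the estimated match count and a (start, rel, end) triple per edge."""
    data = json.loads(body)
    triples = [(edge["start"], edge["rel"], edge["end"]) for edge in data["edges"]]
    return data["numFound"], triples


found, triples = summarize(SAMPLE_BODY)
print("estimated matches:", found)
for start, rel, end in triples:
    print(start, rel, end)
```

Because numFound is an estimate, it should not be assumed to equal the length of the edges list.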

Association

The base URL is http://conceptnet5.media.mit.edu/data/5.4/assoc . This URL can be followed by:

  • A concept URI, in which case it will show you the most similar concepts to that concept.
  • A path of the form /list/<language>/<term list>, which finds the most similar concepts to a list of terms, as described below.

You can set the following GET arguments to modify the returned results:

  • limit = n: change the number of results from the default of 10.
  • filter = URI: return only results that start with the given URI. For example, filter=/c/en returns results in English. To return only results that include an exact URI, end the URI with /., as in /c/en/dog/.

Term lists

A term list is a comma-separated list of components. A component is a word or phrase in natural language, optionally followed by an @ sign and a weight, which changes the relative importance of that concept from its default of 1. For example, the term list "dog,dog_food@0.5" counts an association with the phrase "dog food" half as much as it counts an association with "dog".

Every term in the term list will be normalized according to the language you specify, so for example /list/en/dogs is the same as /list/en/dog.
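
As a sketch of the term list syntax, the following Python joins (phrase, weight) pairs into a term list string. The helper name term_list is an assumption for this example, and writing spaces as underscores follows the "dog_food" example above.

```python
def term_list(terms):
    """Join (phrase, weight) pairs into a comma-separated term list."""
    parts = []
    for phrase, weight in terms:
        name = phrase.replace(" ", "_")
        # A weight of 1 is the default, so it can be left out.
        parts.append(name if weight == 1 else f"{name}@{weight}")
    return ",".join(parts)


print(term_list([("dog", 1), ("dog food", 0.5)]))   # dog,dog_food@0.5
print(term_list([("happy", 1), ("sad", -1)]))       # happy,sad@-1
```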

Examples

To measure how similar cats and dogs are: http://conceptnet5.media.mit.edu/data/5.4/assoc/c/en/cat?filter=/c/en/dog/.&limit=1

To see 20 terms with the most positive affect: http://conceptnet5.media.mit.edu/data/5.4/assoc/list/en/happy,sad@-1?limit=20

To see 20 things in English with the most positive affect: http://conceptnet5.media.mit.edu/data/5.4/assoc/list/en/happy,sad@-1?limit=20&filter=/c/en

To see terms associated with breakfast foods: http://conceptnet5.media.mit.edu/data/5.4/assoc/list/en/toast,cereal,juice,egg
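
These association URLs can also be generated in code. The Python below is a minimal sketch; assoc_url and its parameter names are invented for the example, and the path argument is expected to be either a concept URI or a /list/<language>/<term list> path.

```python
from urllib.parse import urlencode

ASSOC_BASE = "http://conceptnet5.media.mit.edu/data/5.4/assoc"


def assoc_url(path, limit=None, filter_uri=None):
    """Build an association URL (hypothetical helper)."""
    params = {}
    if filter_uri is not None:
        params["filter"] = filter_uri   # keep only results starting with this URI
    if limit is not None:
        params["limit"] = limit         # override the default of 10
    query = urlencode(params, safe="/")  # leave URI slashes unescaped
    return ASSOC_BASE + path + ("?" + query if query else "")


# How similar are cats and dogs?
print(assoc_url("/c/en/cat", limit=1, filter_uri="/c/en/dog/."))

# Terms associated with breakfast foods:
print(assoc_url("/list/en/toast,cereal,juice,egg"))
```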

Version info

The ConceptNet API endpoints have remained unchanged since ConceptNet 5.1. The ConceptNet 5.0 API is no longer available, and 5.1 now redirects to 5.2.

ConceptNet 5.2 made assertions (each built from multiple edges) the unit that the API returns, instead of the individual edges plus occasional assertions that 5.1 would give you. The assertions follow the same general format as the edges in 5.1, with extra fields such as source_uri.

ConceptNet 5.3 includes new data sources and more powerful /assoc/ lookups. It removed the full-text search API, but added the more specific /normalize (now /uri) endpoint for looking up URIs for text.

ConceptNet 5.4 includes some new data, and changed the way URIs are made. URIs now apply a simple tokenizer to the text that finds segments that are not whitespace or punctuation, and joins them with underscores.
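
The tokenize-and-join step can be approximated in a few lines of Python. This is only a rough sketch and not ConceptNet's actual code: it treats every character that is not a letter or digit as whitespace or punctuation, and it leaves out the other standardization steps, such as reducing English words to their root form. The function name join_tokens is made up for the example.

```python
import re


def join_tokens(text):
    """Approximate the tokenizer: find runs of letters/digits, join with underscores."""
    # [^\W_]+ matches one or more word characters, excluding the underscore itself.
    tokens = re.findall(r"[^\W_]+", text)
    return "_".join(tokens)


print(join_tokens("ground beef"))     # ground_beef
print(join_tokens("Hello, world!"))   # Hello_world
```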
