Apium

Apium is an API to access all public Center for Digital Research in the Humanities resources. It is also an invasive weed in Nebraska.

Overview

The CDRH has the metadata and text of several thousand documents such as letters, posters, novels, and images in an Elasticsearch index. This API is a wrapper around that index which provides convenient ways to search and filter those items.

Below, you can find instructions about basic functionality like sorting, pagination (start / num), and selecting the fields you want to get back.

There are a couple of features that may need a brief introduction.

Facets provide you a way of combining and counting the values of a field related to a query. For example, if you search for "horse" and get 100 results, a facet on the author name field might tell you that 90 of those results were by Buffalo Bill, 9 from Meriwether Lewis, and 1 from Jane Austen. You can add facets on keyword fields and date fields, but not on text fields.
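
To make the idea concrete, here is a small Python sketch that reproduces the counts from the example above on an in-memory list. The `hits` data is made up for illustration, and the sketch does not call the API; it only shows what "combining and counting the values of a field" means.

```python
from collections import Counter

# Hypothetical search hits for "horse"; only the author field matters here.
hits = (
    [{"author": "Buffalo Bill"}] * 90
    + [{"author": "Meriwether Lewis"}] * 9
    + [{"author": "Jane Austen"}] * 1
)

# A facet groups the hits by one field's value and counts each group.
author_facet = Counter(hit["author"] for hit in hits)

# List the groups from most to least common.
for name, count in author_facet.most_common():
    print(f"{name}: {count}")
```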

Highlights are a cool way to preview the results of a text query. For example, if you searched for "horse," highlights might look like "...Oglala Sioux Nation. American Horse was the son of Sitting Bear..." and "...Stout as a horse, affectionate, haughty, electrical..." This preview helps users decide which result is most relevant to them. You can add highlighting to any text field.

item query

facets

Lists the number of documents matching each value of a keyword field

Defaults:

  • no defaults

Standard fields

facet[]=keyword_field

facet[]=category
facet[]=category&facet[]=title

Nested fields

facet[]=nested_field.keyword_field

facet[]=creator.name
facet[]=creator.name&facet[]=creator.role

Date ranges (currently supports days or years)

facet[]=date_field.range

facet[]=date.year
  #=> { 1889 : 10, 1890 : 20 }

facet[]=date
  #=> { 01-02-1889 : 2, 03-04-1889 : 8 }

Limiting the number of facets returned and sorting them alphabetically (by default, facets sort by count)

facet_limit=number&facet_sort=term|direction

facet_limit=100
facet_sort=term|asc

facet_limit=30&facet_sort=term|desc

Sorting facets

Defaults:

  • no selection: score|desc
  • term selection, no order: term|desc

Facets sort by score (the document count) descending by default. To sort alphabetically, use "term" and a direction. To sort by score ascending, use "score" and "asc". Multiple sorts for a single facet and distinct sorts for separate facets are not supported at this time.

facet_sort=type|direction

facet_sort=term|desc
facet_sort=score|asc
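
As a rough sketch of how the facet parameters above combine into one query string, the following Python helper assembles `facet[]`, `facet_limit`, and `facet_sort`. The name `facet_query` is hypothetical and not part of Apium. The sketch only builds the query string and makes no assumption about the API's base URL.

```python
from urllib.parse import urlencode


def facet_query(fields, limit=None, sort=None):
    """Build a query string with one facet[] entry per field (hypothetical helper)."""
    params = [("facet[]", field) for field in fields]
    if limit is not None:
        params.append(("facet_limit", str(limit)))
    if sort is not None:
        # sort is a "type|direction" string such as "term|asc"
        params.append(("facet_sort", sort))
    # Keep "[", "]" and "|" unescaped so the output matches the notation used here.
    return urlencode(params, safe="[]|")


print(facet_query(["creator.name", "creator.role"], limit=30, sort="term|desc"))
```

A stricter client may percent-encode the brackets and the pipe instead; under standard URL decoding both forms carry the same parameters.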

field list

The fields returned by a query

Defaults:

  • returns all possible fields

Restrict the fields displayed per document in the response. Use ! to exclude a field. Wildcards in field names are supported.

fl=yes,!no

fl=title,!date*,date_written
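
A minimal sketch of assembling the `fl` value in Python, assuming only the syntax shown above (comma-separated names, `!` for exclusion). The helper name `field_list` is hypothetical. Entries are kept in the order given, since this README does not say whether order matters.

```python
from urllib.parse import urlencode


def field_list(entries):
    """Build an fl parameter from ordered (field_name, keep) pairs (hypothetical helper)."""
    parts = [name if keep else "!" + name for name, keep in entries]
    # Leave ",", "!" and "*" unescaped so the output matches the notation used here.
    return urlencode({"fl": ",".join(parts)}, safe=",!*")


print(field_list([("title", True), ("date*", False), ("date_written", True)]))
```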

filters

Filters by keyword field across the possible documents

Defaults:

  • no filters applied except _type for collection

Standard fields

f[]=field|type

f[]=category|Writings
f[]=category|Writings&f[]=format|manuscript

Nested fields

f[]=nested.keyword|type

f[]=creator.name|Cather, Willa
f[]=contributor.role|Editor
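
The keyword filters above can be assembled with a short Python helper. This is a sketch: `filter_query` is a hypothetical name, and the choice to encode spaces as `%20` is an assumption about what a client would want, not something this README specifies.

```python
from urllib.parse import quote, urlencode


def filter_query(filters):
    """Build repeated f[] parameters from (field, value) pairs (hypothetical helper)."""
    params = [("f[]", f"{field}|{value}") for field, value in filters]
    # quote_via=quote encodes spaces as %20; "[", "]", "|" and "," stay readable.
    return urlencode(params, safe="[]|,", quote_via=quote)


print(filter_query([("category", "Writings"), ("format", "manuscript")]))
print(filter_query([("creator.name", "Cather, Willa")]))
```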

Date field

If given one date, it will be used as both start and end.

You can give a year range or a specific date range.

f[]=field|range_start|(range_end)

f[]=date|1884
  #=> 01-01-1884 to 12-31-1884
f[]=date|1884|1887
  #=> 01-01-1884 to 12-31-1887

f[]=date|1884-02-01|1887-03-01
  #=> 02-01-1884 to 03-01-1887
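
The expansion shown in these examples can be mirrored locally. The Python sketch below reproduces the documented behavior (a bare year widens to the whole year, and a single value serves as both start and end). It is an illustration of the examples only, not the server's actual implementation, and the function names are hypothetical.

```python
from datetime import date


def expand_bound(value, is_end):
    """Expand 'YYYY' to the first or last day of that year; parse full dates as-is."""
    if len(value) == 4 and value.isdigit():
        year = int(value)
        return date(year, 12, 31) if is_end else date(year, 1, 1)
    return date.fromisoformat(value)  # expects YYYY-MM-DD


def date_filter_range(start, end=None):
    """Return the (start, end) dates implied by f[]=date|start|(end)."""
    # A single value is used as both start and end.
    end = start if end is None else end
    return expand_bound(start, is_end=False), expand_bound(end, is_end=True)


print(date_filter_range("1884"))
print(date_filter_range("1884-02-01", "1887-03-01"))
```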

highlighting

Returns context of text match results

Defaults:

  • hl=true
  • hl_chars=100
  • hl_fl=text
  • hl_num=3

Disabling Highlighting

If you wish to turn highlighting off:

hl=false

Characters

This sets the number of characters that will be returned around a highlight match

hl_chars=number

hl_chars=100

Field List

Highlights will always be returned for the text field, but if you are searching multiple fields, you may wish to see highlights on those fields, also. You do not need to send text when specifying additional fields.

hl_fl=field1,field2,field3

hl_fl=annotations
hl_fl=annotations,catherwords

Number

The number of highlights returned per field. If you set hl_num=3 for text and annotations you could receive up to 6 highlights, 3 from each field.

hl_num=number

hl_num=1
hl_num=5
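
As a sketch, the highlighting options can be gathered into one query string like this. `highlight_query` is a hypothetical helper name. Any option left out is simply not sent, so the defaults listed above would apply.

```python
from urllib.parse import urlencode


def highlight_query(enabled=True, chars=None, fields=None, num=None):
    """Build the hl* parameters; omitted options fall back to the API defaults."""
    if not enabled:
        return urlencode({"hl": "false"})
    params = {}
    if chars is not None:
        params["hl_chars"] = chars
    if fields:
        params["hl_fl"] = ",".join(fields)
    if num is not None:
        params["hl_num"] = num
    # Keep the comma between field names unescaped for readability.
    return urlencode(params, safe=",")


print(highlight_query(fields=["annotations", "catherwords"], num=1))
print(highlight_query(enabled=False))
```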

sorting

Specify the order of results

Defaults:

When no sort or only a partial sort is supplied:

  • query present: sort by "relevancy" descending
  • given term is "relevancy", no order provided: sort descending
  • given term is not "relevancy", no order provided: sort ascending

You may pass multiple fields to be sorted. The first one appearing in the URL parameters will take precedence over the other(s).

sort[]=field|direction

sort[]=date|desc&sort[]=title|asc
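
The following Python sketch builds the `sort[]` parameters and spells out the default directions described above, so the resulting URL is explicit about ordering. The API applies these defaults itself; the helper names `sort_entry` and `sort_query` are hypothetical.

```python
from urllib.parse import urlencode


def sort_entry(field, direction=None):
    """Return 'field|direction', filling in the documented default direction."""
    if direction is None:
        # "relevancy" defaults to descending; any other field defaults to ascending.
        direction = "desc" if field == "relevancy" else "asc"
    return f"{field}|{direction}"


def sort_query(sorts):
    """Build repeated sort[] parameters; earlier entries take precedence."""
    params = [("sort[]", sort_entry(*entry)) for entry in sorts]
    return urlencode(params, safe="[]|")


print(sort_query([("date", "desc"), ("title",)]))
```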

Sorting facets

Please refer to the section on facets for information about how to sort facets, specifically.

start and num

Manual pagination of results

Defaults:

  • start=0
  • num=50

Note: Zero indexed

start=number
num=number

start=0&num=50   # returns first 50 results
start=50&num=50  # returns second 50 results
start=10&num=10  # returns second 10 results
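
A small Python sketch for computing these values from a page number, assuming `start` is the zero-based offset of the first result returned, so that page n begins at n × num. The helper name `page_params` is hypothetical.

```python
from urllib.parse import urlencode


def page_params(page, num=50):
    """Return start/num for a zero-based page number (hypothetical helper)."""
    if page < 0 or num < 1:
        raise ValueError("page must be >= 0 and num must be >= 1")
    # With zero-based offsets, page 0 starts at 0, page 1 at num, and so on.
    return {"start": page * num, "num": num}


print(urlencode(page_params(0)))
print(urlencode(page_params(1)))
print(urlencode(page_params(1, num=10)))
```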

text search

Please refer to the Elasticsearch query string syntax for a list of all possibilities for text searching.

Basic search

q=word

q=multiple words
q=word

Multiple fields

By default, this will search the "text" field. You can specify a different field or multiple fields. If you add fields, make sure that your highlights include fields beyond "text".

q=field:word
q=field:word AND otherfield:other
q=field:word OR otherfield:other

Advanced search

q="phrase of words"
q=wildcard*
q=word OR other
q=word AND other
q=(word OR other) OR -nothanks
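
Queries with spaces, quotes, or colons need to be URL-encoded before they are sent. A minimal Python sketch, where `text_query` is a hypothetical helper name and only the `q` parameter is built:

```python
from urllib.parse import parse_qs, quote, urlencode


def text_query(q):
    """Percent-encode a query string expression for the q parameter."""
    # quote_via=quote encodes spaces as %20 rather than "+".
    return urlencode({"q": q}, quote_via=quote)


encoded = text_query('"phrase of words"')
print(encoded)

# Decoding gives back the original expression unchanged.
print(parse_qs(text_query("field:word AND otherfield:other"))["q"][0])
```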