
Searching and ordering

LeXofLeviafan edited this page Sep 29, 2024 · 1 revision

As of now, buku includes several options for filtering and sorting bookmarks in its output (whether in CLI, interactive shell, webUI or library API).

As the basis for CLI, there is a --print option (with an optional range argument for specifying indices, e.g. 1-10, 50, -3; a negative number means “last N records”). In the interactive shell, p is the equivalent command (here, the index range is required).
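To make the range notation concrete, here is a minimal Python sketch of how such arguments could be expanded into indices. It is not buku's own parser: the resolve_indices name is hypothetical, and the sketch assumes that indices run from 1 to the total number of records without gaps.

```python
def resolve_indices(specs, total):
    """Expand range arguments such as '1-10', '50' or '-3' into a sorted index list."""
    indices = set()
    for spec in specs:
        if spec.startswith('-'):
            # a negative number means "the last N records"
            count = int(spec[1:])
            indices.update(range(max(1, total - count + 1), total + 1))
        elif '-' in spec:
            # an inclusive range such as '1-10'
            low, high = (int(part) for part in spec.split('-', 1))
            indices.update(range(min(low, high), max(low, high) + 1))
        else:
            # a single index
            indices.add(int(spec))
    # drop indices that do not exist (assumption: valid indices are 1..total)
    return sorted(i for i in indices if 1 <= i <= total)


print(resolve_indices(['1-3', '50', '-3'], total=100))
# [1, 2, 3, 50, 98, 99, 100]
```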

Sampling (random)

Suppose you have a bunch of unread articles bookmarked, and want to pick something to read… Buku provides a way to pick one or multiple random bookmarks out of the current selection.

CLI

Adding the --random option to your buku invocation will limit your output to a single random entry (applicable to --print, --export, and search operations). The number of entries can be increased by providing it as the optional argument (e.g. --random 3).

buku --print -10 --random  # a random bookmark out of the last 10

(Also, invoking --open with no argument results in a single random bookmark being opened.)

Interactive shell

In the interactive mode, buku provides an R command with an optional number argument (defaulting to 1). It prints out the specified number of bookmarks from the last search; if invoked with a negative number (or if there was no search yet), the sample is taken from all bookmarks instead.

R -3
t todo
R

(Print 3 random bookmarks; search by ‘todo’ tag; print a random bookmark from the search.)

WebUI

The Bookmarks page in Bukuserver includes a “Random” button; clicking it will display data of a random bookmark from the currently displayed list.
[Screenshot: random button]

The random bookmark dialog has a “Pick another” button, which refreshes the dialog, loading data of another random bookmark (from the same list).
[Screenshot: pick another]

(Note that the “View record” header is actually a link – clicking it will open the bookmark page which then can be edited or deleted.)

Library API

While most of the sampling is done externally in buku (i.e. by fetching the records and then applying random.sample() as needed), the .exportdb() method has a pick parameter which applies it internally (thus allowing a randomized export even when no resultset is passed):

bukudb.exportdb('sample.md', pick=10)  # export 10 random bookmarks to 'sample.md'
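For the "external" approach mentioned above, a minimal sketch of sampling from an already fetched list could look like this. The pick_random helper and the sample data are hypothetical; only random.sample() comes from the text.

```python
import random


def pick_random(records, count=1):
    """Return up to `count` distinct records chosen at random."""
    # random.sample() raises ValueError when asked for more items than exist,
    # so the requested count is capped at the size of the list
    return random.sample(records, min(count, len(records)))


unread = ['article A', 'article B', 'article C', 'article D']
print(pick_random(unread))      # a list holding one random entry
print(pick_random(unread, 10))  # all four entries, in random order
```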

Ordering

A sorting order can be described as a list of sorting fields, each optionally preceded by + or - (meaning natural/reversed order; when not specified, defaulting to “natural”). E.g. +title, -tags, +url means this:

  • records are ordered by title
  • when they have the same title, they're ordered by their tags, in “decreasing” order (i.e. from 'z' to 'a')
  • if both the title and the tags are identical, they're ordered by URL instead

The following names can be used to describe order: index/id, url/uri, title/metadata, description/desc, tags. (These are names used in JSON output and in the DB.)

Default/fallback ordering can always be assumed as +index.
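The rules above can be sketched in plain Python. This is an illustration of the described semantics rather than buku's implementation: parse_order and sort_records are hypothetical names, records are modelled as dictionaries, and only the canonical field names (no aliases) are handled.

```python
def parse_order(spec):
    """Turn '+title, -tags, url' into [('title', False), ('tags', True), ('url', False)]."""
    order = []
    for name in spec.split(','):
        name = name.strip()
        if not name:
            continue  # tolerate empty items, e.g. a leading comma
        reverse = name.startswith('-')
        order.append((name.lstrip('+-'), reverse))
    return order


def sort_records(records, spec, ignore_case=True):
    """Sort dictionaries by the given order description, falling back to +index."""
    def key_for(field):
        def key(record):
            value = record[field]
            return value.lower() if ignore_case and isinstance(value, str) else value
        return key

    # fallback ordering: +index
    result = sorted(records, key=lambda record: record['index'])
    # Python's sort is stable, so the fields are applied from the least
    # significant one to the most significant one
    for field, reverse in reversed(parse_order(spec)):
        result.sort(key=key_for(field), reverse=reverse)
    return result


records = [
    {'index': 1, 'title': 'Beta', 'tags': 'a', 'url': 'https://b.example'},
    {'index': 2, 'title': 'alpha', 'tags': 'a', 'url': 'https://a.example'},
    {'index': 3, 'title': 'Alpha', 'tags': 'z', 'url': 'https://c.example'},
]
print([r['index'] for r in sort_records(records, '+title, -tags, +url')])
# [3, 2, 1]
```

Both "alpha" records share a title when case is ignored, so the reversed tags field decides their relative order.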

CLI

In CLI, use the --order option to specify output order for printed bookmarks (applicable to --print, --export, and search operations). It accepts one or more arguments, each containing a comma-separated order description (multiple arguments will be combined into one).

# first 100 bookmarks sorted by title (natural order) then URL (reversed order)
buku --print 1-100 --order title,-url

Note that a CLI argument which is not an option cannot start with -; when the order description begins with a reversed field, precede it with a comma:

# first 10 bookmarks sorted by URL (reversed order)
buku --print 1-10 --order ,-url
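The leading-comma workaround can be reproduced with Python's argparse module. The sketch below assumes an --order option that takes one or more arguments. The parser definition and the order_fields helper are illustrative only, not buku's actual code.

```python
import argparse

# hypothetical stand-in for the real command-line parser
parser = argparse.ArgumentParser()
parser.add_argument('--order', nargs='+', default=[])


def order_fields(argv):
    """Combine all --order arguments into one flat list of field names."""
    args = parser.parse_args(argv)
    fields = []
    for argument in args.order:
        # empty items (such as the one before a leading comma) are dropped
        fields += [name.strip() for name in argument.split(',') if name.strip()]
    return fields


print(order_fields(['--order', 'title,-url']))            # ['title', '-url']
print(order_fields(['--order', ',-url']))                 # ['-url']
print(order_fields(['--order', 'title', ',-url,index']))  # ['title', '-url', 'index']
# order_fields(['--order', '-url']) would exit with a usage error,
# because argparse treats '-url' as an option rather than a value
```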

Interactive shell

When running buku in the interactive mode, the v command can be used to specify sort order for print and search commands (arguments should be separated by space and/or commas):

v title, -url
p 1-10

(Set ordering to +title, -url; print first 10 bookmarks.)

WebUI

The Bookmarks page in Bukuserver includes an “order” filter, which can be used for defining order of entries:
[Screenshot: order filters]

Library API

A number of methods accept an order parameter to describe sorting:

# fetching records with specified sorting
records = bukudb.get_rec_all(order=['+title', '-url'])

Additionally, there's a _sort() method that applies ordering to already extracted records. It can also be used for case-sensitive sorting (buku normally uses case-insensitive ordering).

# applying a case-sensitive sorting
records = bukudb._sort(records, ['+title', '-url'], ignore_case=False)

Searching

Buku supports the following search functionality:

  • all bookmarks including/excluding specified tag(s)
  • all bookmarks matching one or more specified regular expressions (in any text field)
    • search results will include all entries that match at least 1 regex (ordered by number of matches)
    • a special “markers mode” can be used to search in specific fields
  • all bookmarks including/excluding one or more specified keywords (in any text field)
    • unless “deep” mode is used, only full-word matches are accepted (e.g. the keyword “tar” will not match the word “start”)
    • searching for the single blank keyword (in “all keywords” mode) will produce all bookmarks with empty title/tag list
    • searching for the single immutable keyword (in “all keywords” mode) will produce all bookmarks marked as “immutable”
    • if “all keywords” mode is not used, search results will include all entries that match at least 1 keyword (ordered by number of matches)
    • a special “markers mode” can be used to search in specific fields
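The difference between full-word and “deep” matching, and between the “all keywords” and “any keyword” modes, can be illustrated with a small Python sketch. It is not how buku queries its database. The function names, the dictionary-based records, the case-insensitive comparison, and the reading of “number of matches” as “number of matched keywords” are all assumptions made for the example.

```python
import re

FIELDS = ('url', 'title', 'tags', 'desc')


def count_matches(record, keywords, deep=False):
    """Count how many keywords occur in at least one text field of the record."""
    matched = 0
    for keyword in keywords:
        pattern = re.escape(keyword)
        if not deep:
            # full-word match only: 'tar' must not match 'start'
            pattern = r'\b' + pattern + r'\b'
        if any(re.search(pattern, record[field], re.IGNORECASE) for field in FIELDS):
            matched += 1
    return matched


def search(records, keywords, all_keywords=False, deep=False):
    scored = [(count_matches(record, keywords, deep), record) for record in records]
    if all_keywords:
        return [record for score, record in scored if score == len(keywords)]
    # any-keyword mode: keep records with at least one match, best matches first
    hits = [(score, record) for score, record in scored if score > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in hits]


records = [
    {'url': 'https://example.com/start', 'title': 'Getting started', 'tags': 'guide', 'desc': ''},
    {'url': 'https://example.com/tar', 'title': 'tar manual', 'tags': 'cli,archive', 'desc': 'tar archive tool'},
]
print([r['title'] for r in search(records, ['tar'])])
# ['tar manual']
print([r['title'] for r in search(records, ['tar'], deep=True)])
# ['Getting started', 'tar manual']
```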

Markers

(Note: this is loosely based on search functionality of the Bukubrow web extension)

When the markers are enabled, each keyword (or regular expression) will be applied to a single field (instead of all fields), based on which special character it begins with (…the prefix itself will not be searched for, of course):

  • . for title (e.g. .Stack Overflow)
  • : for URL (e.g. :.org/, :http://)
  • > for description (e.g. >Python Package Index)
  • # for tags (e.g. #genre:,todo; in case of a keyword, it's treated as a comma-separated list)
    • #, disables deep mode for this keyword (which is enabled otherwise, ignoring the global setting)
  • * means any field (default behaviour if no prefix found)

Thus, specifying the following list of keywords:

:news.ycombinator.com #,todo,article #db: .nested query *performance

would mean searching for bookmarks with news.ycombinator.com in the URL, tags todo & article, another tag containing db:, nested query in the title (ignoring the same string in description), and performance in any field.

Additionally, when the input is specified as a single string, it is split into keywords by detecting markers preceded by whitespace (so #db:postgres will not be split into two keywords), with said whitespace being removed. Thus, the example above could be described as:

:news.ycombinator.com  #,todo,article #db:  .nested query *performance

(Note that “no prefix” in this case would only be possible for the first keyword.)
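The splitting rule and the prefix-to-field mapping can be sketched with a regular expression. This is an illustration of the behaviour described above, not buku's split_by_marker(); the names below are hypothetical, and the special handling of #, is left out.

```python
import re

# marker prefix -> field name (None stands for "any field")
MARKER_FIELDS = {'.': 'title', ':': 'url', '>': 'description', '#': 'tags', '*': None}


def split_markers_sketch(text):
    """Split a string into keywords at whitespace that is followed by a marker."""
    return [part for part in re.split(r'\s+(?=[.:>#*])', text.strip()) if part]


def parse_keyword(keyword):
    """Return (field, text) for a keyword; the prefix itself is not searched for."""
    if keyword[:1] in MARKER_FIELDS:
        return MARKER_FIELDS[keyword[0]], keyword[1:]
    return None, keyword  # no prefix: any field


query = ':news.ycombinator.com  #,todo,article #db:  .nested query *performance'
for keyword in split_markers_sketch(query):
    print(parse_keyword(keyword))
# ('url', 'news.ycombinator.com')
# ('tags', ',todo,article')
# ('tags', 'db:')
# ('title', 'nested query')
# (None, 'performance')
```

Because the split only happens at whitespace followed by a marker, the space inside "nested query" and the colon inside "#db:postgres" do not start a new keyword.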

CLI

The options --sreg, --sall and --sany can be used to invoke the search in regex, all-keywords or any-keywords mode respectively.

  • Keywords are supplied as their arguments.
  • Invoking buku with one or more string arguments before any parameters implies --sany
  • --exclude can be used to remove entries from search results (e.g. --exclude foo will cause any bookmarks matching foo to be skipped)
  • --stag can be used to additionally search for tags (e.g. --stag "foo + bar - baz, qux" will include tags foo & bar but exclude tags baz & qux)
    • --stag can also be used as a standalone search command
  • --deep enables “deep mode” (matching parts of words)
  • --markers enables “markers mode”

Thus, the following command can be used to search using markers and print out the results in the desired order:

buku --order +tags,-url --markers --sall 'global substring' \
  '.title substring'  ':url substring' :https  '> description substring' \
  '#partial,tags:' '#,exact,tags'  '*another global substring'
# in Windows, '^' can be used instead of '\' for multiline commands

Search commands can be combined with --order (as shown above), --random, and --export (for exporting search results).

Interactive shell

In the interactive mode, d and m commands are used for toggling “deep mode” and “markers mode” respectively (both are off by default).

r, S and s are used to search in regex, all-keywords or any-keywords mode respectively. Unless the “markers mode” is on, the command parameters are split by whitespace. (Additionally, the t command does tag search.) In all these cases, previously selected ordering (v) will be applied.

Thus, the CLI example above can be reproduced like this:

v +tags,-url
m
S global substring  .title substring  :url substring :https  > description substring  #partial,tags: #,exact,tags  *another global substring

(Set ordering; enable markers; search for all keywords.)

WebUI

In Bukuserver, the Bookmarks list has a “buku” filter matching the CLI filtering functionality:
[Screenshot: buku filter]

It defaults to all-keywords + markers mode; the value is used as a single keyword. Specifying multiple keywords can be done by providing multiple “buku” filters (note that all of them must be specified with the same mode). Here's the equivalent of the CLI example from above:
[Screenshot: marker filters]

The Bukuserver homepage features a search form that opens Bookmarks list with the “buku” filter matching the specified keyword. (When the “With markers” checkbox is set, the search string is split into keywords as explained in the “markers mode” section.)
[Screenshot: homepage search]

Also, on both the Home and Statistic pages, the navbar contains a quick search field with the same functionality (in the default search mode).
[Screenshot: quick search]

Additionally, the Statistic page contains search links for the most common sites (“netlocs”), tags and (duplicate) titles in the DB.

Library API

The BukuDb class contains a method named .search_keywords_and_filter_by_tags(), which is used as the entry point for most search operations described in this section. It wraps the following calls:

  • .searchdb(), which implements the base search command
  • .search_by_tag(), which applies the stag filter
  • .exclude_results_from_search(), which invokes .searchdb() again and removes new results from the previously obtained list

results = bukudb.search_keywords_and_filter_by_tags(keywords,
    all_keywords=True, deep=True, markers=True, regex=False,
    stag=tags, without=except_keywords, order=ordering)
# all arguments except the 1st one are optional

Additionally, there are two helper functions used when dealing with search:

  • filter_from(xs, ys, exclude=False) returns set intersection/difference of the two provided lists
  • split_by_marker(s) splits the provided string by markers (as described in the “markers mode” section)
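As an illustration of the intersection/difference behaviour described for filter_from(), here is a stand-alone sketch showing how the three search steps could be chained. The filter_from_sketch function is a hypothetical re-creation, not buku's code; it assumes hashable records and keeps the order of the first list, which the real helper may or may not do.

```python
def filter_from_sketch(xs, ys, exclude=False):
    """Keep the items of xs that are (or, with exclude=True, are not) present in ys."""
    ys = set(ys)
    return [x for x in xs if (x in ys) != exclude]


# made-up result lists standing in for the three search steps
found = [(1, 'a'), (2, 'b'), (3, 'c')]     # base keyword search
tagged = [(2, 'b'), (3, 'c'), (4, 'd')]    # tag search
unwanted = [(3, 'c')]                      # matches of the excluded keywords

results = filter_from_sketch(found, tagged)                       # intersection
results = filter_from_sketch(results, unwanted, exclude=True)     # difference
print(results)
# [(2, 'b')]
```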