Skip to content

Latest commit

 

History

History
executable file
·
303 lines (208 loc) · 29.1 KB

denote-explore.org

File metadata and controls

executable file
·
303 lines (208 loc) · 29.1 KB

Denote Explore: Explore your Denote digital garden

The Denote package by Protesilaos (Prot) Stavrou provides extensive functionality for creating, retrieving, managing, and linking files in plain text, Markdown, and Org Mode. The most redeeming qualities of this package are its filename convention and modular simplicity. You can also use the package to access other file types, such as PDFs or multimedia files (which we call Denote attachments). In this way, Denote becomes a fully-featured personal knowledge management system.

The Denote-Explore package came into existence as my collection of Denote files grew. I created some auxiliary functions to manage and explore my burgeoning Denote files. This package provides four types of commands:

  1. Summary statistics: Count notes, attachments and keywords.
  2. Random walks: Generate new ideas using serendipity.
  3. Janitor: Maintain your denote collection.
  4. Visualisations: Visualise your Denote files as bar charts or a network.

Summary Statistics

The Denote-Explore package distinguishes between Denote files (notes) and attachments. Denote files are either Org Mode, Markdown, or Plain Text. All other files, such as photographs, PDFs, media files, LaTeX, and HTML, are attachments.

After a day of working hard on your digital knowledge garden, you can count the notes and attachments in your collection. Two functions provide some basic statistics of your Denote files:

  1. denote-explore-count-notes: Count the number of notes and attachments.
  2. denote-explore-count-keywords: Count the number of distinct Denote keywords.

These functions are informative, but a graph says more than a thousand numbers. The built-in chart.el package by Eric M. Ludlam is a quaint tool for creating bar charts in a plain text buffer. Two commands are available in Denote-Explore to visualise basic statistics:

  1. denote-explore-barchart-keywords: Visualise the top n keywords.
  2. denote-explore-barchart-filetypes: Visualise used file extensions.

Random Walks

Creativity springs from a medley of experiences, emotions, subconscious musings, and connecting random ideas. Introducing random elements into the creative process generates avenues of thought you might not have travelled otherwise. Random walks through your notes can be beneficial when you’re stuck in a rut or just like to walk through your files randomly.

A random walk is an arbitrary sequence of events without a defined relationship between the steps. You take a random walk by jumping to a random note, connected or unconnected to the current buffer.

The Denote-Explore package provides three commands to inject some randomness into your explorations:

  1. denote-explore-random-note: Jump to a random note or attachment.
  2. denote-explore-random-link: Jump to a random linked note (either forward or backward) or attachments (forward only).
  3. denote-explore-random-keyword: Jump to a random note or attachment with the same selected keyword(s).
  4. denote-explore-random-regex: Jump to a random note or attachment that matches a regular expression.

The default state is that these functions jump to any Denote text file (plain text, Markdown or Org-mode). The universal argument (C-u) includes attachments in the sample for a random jump.

Jumping to a randomly linked file naturally only works when the current buffer is a Denote file. A warning appears when the current buffer is an isolated note (no links or backlinks available).

When jumping to a random file with one or more matching keywords, you can choose keywords from the current buffer or override the completion options with free text. The asterisk symbol * selects all keywords in the completion list. The section process is skipped when the current buffer only has one keyword. When the current buffer is not a Denote file, you can choose any available keyword(s) in your Denote collection.

Jumping to a random note matching multiple keywords only works when the denote-sort-keywords is enabled or when the selected keywords are in the same order as the target file. You can alphabetise keywords in your Denote files with denote-explore-sort-keywords.

Jumping to a note that matches a regular expression lets you find random notes matching a search string. For example, to find a note you wrote in May 2022, use ^202205 and using 202305.*_journal jumps to a random journal entry in May 2023.

Denote-Explore Janitor

After using Denote for a while, you may need a janitor to help keep your collection organised. A janitor is a member of the maintenance and cleaning staff for buildings. Their primary responsibility is to ensure cleanliness, orderliness, and sanitation, so this role is also perfect for applying to your Denote files. The Denote-Explore package provides a series of commands to assist with cleaning, organising, and sanitising your files.

Duplicate notes

The Denote identifier is a unique string constructed of the note’s creation date and time in ISO 8601 format (e.g., 2024035T203312). Denote either uses the current date and time when generating a new note or the date and time the file was created on the file system. The Denote package prevents duplicate identifiers when creating a new note.

An attachment might have manually generated identifiers. The file’s creation date and time are irrelevant when using historical documents as attachments. For example, when adding scanned historical records, the identifier might be centuries ago, so it must be manually added.

Adding the Denote identifier manually introduces a risk of duplication. Duplicates can also arise when exporting Denote Org files, as the exported files have the same file name but a different extension.

The denote-explore-identify-duplicate-notes command lists all duplicate identifiers in a popup buffer, which includes links to the suspected duplicate notes and attachments.

Additionally, the denote-explore-identify-duplicate-notes-dired command shows them in a Dired buffer. You can directly change filenames in the Dired buffer with dired-toggle-read-only (C-x C-q) or any other preferred method.

Be careful when changing the identifier of a Denote file, as it can destroy the integrity of your links. Please ensure that the file you rename does not have any links pointing to it. You can use the denote-find-link and denote-find-backlink commands to check a file for links.

With the universal argument (C-u), this command looks for duplicated filenames without extensions instead of identifiers. Thus, this option ignores any duplicated identifiers created when exporting Denote Org mode files.

Managing Keywords

Denote keywords connect notes with similar content. Keywords should not exist in solitude because a category with only one member is not informative. Single keywords can arise because topics need to be fully developed or due to a typo. The denote-explore-single-keywords command provides a list of file tags that are only used once. The list of single keywords is presented in the minibuffer, from where you can open the relevant note or attachment.

You can also find notes or attachments without keywords with the denote-explore-zero-keywords command. This command presents all notes and attachments without keywords in the minibuffer, so you can open them and consider adding a keyword or leaving them as is.

You can remove or rename keywords with denote-explore-rename-keyword. Select one or more existing keywords from the completion list and enter the new name of the keyword(s). This function renames all chosen keywords to their latest version or removes the original keyword from all existing notes when you enter an empty string as the new keyword. This function cycles through all notes and attachments containing one or more selected keywords and asks for confirmation before making any changes. The new keyword list is stored alphabetically, and the front matter is synchronised with the file name.

Ordering keywords alphabetically makes searching for files more predictable. If you rename files manually, the keywords might sometimes be in different order. The denote-explore-sort-keywords function checks all notes. It notifies the user if there are any notes where keywords are not alphabetised. The function warns the user before renaming any files. Denote sorts keywords alphabetically for new notes when the denote-sort-keywords variable is enabled.

Synchronising Meta Data

Denote stores the metadata for each note in the filename using its ingenious format. Some of this metadata is copied to the front matter of a note, which can lead to discrepancies between the two metadata sources.

The denote-explore-sync-metadata function checks all notes and asks the user to rename any file where these two data sets are mismatched. The front matter data is the source of truth. This function also enforces the alphabetisation of keywords, which assists with finding notes.

Create Personal Knowledge Graphs

Denote implements a linking mechanism that connects notes (either Org, Markdown, or plain text files) to other notes or attachments. This mechanism allows the user to visualise all notes as a network.

Emacs is a text processor with limited graphical capabilities. However, committing your ideas to text requires a linear way of thinking since you can only process one word at a time. Visual thinking through tools such as mind maps or network diagrams is another way to approach your ideas. One of the most common methods to visualise interlinked documents is in a network or a personal knowledge graph.

Network visualisation in Denote is not just a feature but a powerful tool that visualises how notes are linked, helping you discover previously unseen connections between your thoughts and enhancing your creative process.

It’s important to note that Denote does not offer live previews of your note collection. This deliberate choice is to prevent ‘dopamine traps’ because seeing your thoughts develop in real-time is a distraction. Instead, Denote-Explore provides a focused tool for the surgical analysis of your notes, while the main user interface remains text-based.

A network diagram has nodes (vertices) and edges. Each node represents a file in your Denote system, and each edge represents a link between notes. Denote-Explore provides three types of network diagrams to explore the relationships between your thoughts. The package exports and displays these in one of three formats, with SVG files viewed in the browser as the default. The diagram below shows the basic principle of a knowledge graph. In the actual output, nodes are circles.

You create a network with the denote-explore-network command. This command will ask the user to select the type of network to create. Each network type requires additional inputs to analyse to a defined part of your Denote files. You can visualise all your notes, but if your collection because large it is not an informative exercise.

Community of Notes

A community consists of notes that share (part of) an ID, name, signature or keyword. The software asks to enter a search term or regular expression. For example, all notes with Emacs as their keyword (_emacs), or all notes with a certain signature, e.g. ==ews. A community graph displays all notes matching the search term and their connections. The example below indicates the _emacs community with the dashed line. The algorithm prunes any links to non-matching notes, which in the example below is the note with the _vim keyword.

┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─┐        
   _emacs community        
│ ┌──────┐  ┌──────┐ │  ┌────┐        
  │_emacs│  │_emacs│───►│_vim│       
│ └──┬───┘  └──────┘ │  └────┘        
     │                       
│    ▼               │        
  ┌──────┐              
│ │_emacs│           │
  └──────┘            
└ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─┘        

To generate a community graph, use denote-explore-network, choose ‘Community’ and enter a regular expression. The resulting diagram will pop up in your default system browser as an SVG file.

The denote-explore-network-regenerate command recreates the current graph with the same parameters, which is useful when changing the structure of your notes and you like to see the result visualised without having to repeat the parameters.

The denote-explore-network-regex-ignore variable lets you define a regular expression of notes to ignore in your visualisations. For example, if you create meta notes with long lists of dynamic links and they have the _meta keyword, then you could exclude this set of nodes by customising this variable to the relevant regular expression.

Note Neighbourhood

The neighbourhood of a note consists of all files linked to it at one or more steps deep. The algorithm selects members of the graph from linked and back-linked notes. This visualisation effectively creates the possible paths you can follow with the denote-explore-random-link function discussed in the Random Walks section above.

The illustration below shows the principle of the linking depth. Notes B and C are at linking depth 1 from A and notes D and E are at depth 2 from A.

Depth 1      2
     ┌─┐    ┌─┐ 
  ┌─►│B│◄───┤D│ 
  │  └─┘    └─┘ 
 ┌┴┐            
 │A│            
 └─┘            
  ▲  ┌─┐    ┌─┐ 
  └──┤C├───►│E│ 
     └─┘    └─┘

To generate a neighbourhood graph from the current Denote note buffer, use denote-explore-network and enter the graph’s depth. The user enters the required depth, and the software searches all notes linked to the current buffer at that depth. When building this graph from a buffer that is not a Denote note, the system also asks to select a source file (A in the diagram). The system issues a warning when you select a note without links or backlinks. You can also identify Denote files without any links with the denote-explore-isolated-notes function describe above.

The denote-explore-network-regenerate command recreates the current graph with the same parameters, which is useful when you want to change the structure of your notes after viewing the first version of the graph.

The complete set of your Denote files is most likely a disconnected Graph, meaning that there is no one path that connects all nodes. Firstly, there will be isolated notes. There will also exist isolated neighbourhoods of notes that connect to each other but not to other files.

A depth of more than three links is usually not informative because the network can become to large to read, or you hit the edges of your island of connected notes.

The denote-explore-network-regex-ignore variable lets you define a regular expression of notes to ignore in your visualisations. Lets assume you create meta notes with long lists of dynamic links and they have the _meta keyword, then you could exclude this set of nodes by setting this variable.

Keyword Network

The last available method to visualise your Denote collection is to develop a network of keywords. Two keywords are connected when used in the same note. The keywords in a note create a complete network. The union of all complete networks from all files in your Denote collection defines the keywords network. The relationship between two keywords can exist in multiple notes, so the links between keywords are weighted. The line thickness between two keywords indicates the frequency (weight) of their relationship.

While the first two graph types are directed (arrows indicate the direction of links), the keyword network is undirected as these are bidirectional associations between keywords. The diagram below shows a situation with two nodes and three possible keywords and how they combine into a keyword network.

In this example there are three notes, two with two keywords and one with three keywords. Each notes forms a small complete network that links all keywords.

┌─────┐ ┌─────┐   ┌─────┐ ┌─────┐   ┌─────┐ ┌─────┐
│_kwd1├─┤_kwd2│   │_kwd1├─┤_kwd2│   │_kwd3├─┤_kwd4│
└─────┘ └─────┘   └─┬───┘ └───┬─┘   └─────┘ └─────┘
                    │ ┌─────┐ │  
                    └─┤_kwd3├─┘  
                      └─────┘    

The union of these three networks forms the keyword network for this collection of notes. The example generates the following keyword network.

┌─────┐ ┌─────┐                                
│_kwd1├─┤_kwd2│                                
└─┬───┘ └───┬─┘                                
  │         │                                  
  │ ┌─────┐ │  ┌─────┐                         
  └─┤_kwd3├─┴──┤_kwd4│                         
    └─────┘    └─────┘                         

When generating this graph type, you will need to enter a minimum edge weight (n). The graph then will only show those keywords that are at least n times associated with each other. The default is one, which can generate a rather large graph.

The denote-explore-network-regenerate command recreates the current graph with the same parameters, which is useful when you are changing notes.

Some keywords might have to be excluded from this graph because they skew the results. For example, when using the Citar-Denote package, you might like to exclude the bib keyword from the diagram because it is only used to minimise the search space for bibliographic notes and has no further semantic value. The denote-explore-network-keywords-ignore variable lists keywords ignored in this visualisation.

Network Layout and Presentation

Emacs cannot independently generate graphics and thus relies on external software. This package can use three external mechanisms to create graphs (configurable with denote-explore-network-format), set to GraphViz SVG output by default. Other available formats are JSON and GEXF.

The Denote-Explore network algorithm consists of four steps:

  1. Use the denote-explore-network function to enter the network type and pass on to another function to enter the required parameters.
  2. The code generates a nested association list that holds all relevant metadata for the selected graph:
    • Metadata e.g.: (meta (directed . t) (type . "Neighbourhood '20210104T194405' (depth: 2)"))
    • Association list of nodes and their degrees, e.g., (((id . "20210104T194405") (name . "Platonic Solids") (keywords "geometry" "esotericism") (type . "org") (degree . 4)) ...). In the context of Denote, the degree of a network node is the unweighted sum of links and backlinks in a note.
    • Association list of edges and their weights: (((source . "20220529T190246") (target . "20201229T143000") (weight . 1)) ...). The weight of an edge indicates the number of times it occurs, which is the number of time the two files are linked or the number of times two keywords appear in the same note.
  3. The package encodes the association list to a GraphViz DOT file, JSON file, or GEXF file. The location and name of this file is configurable with the denote-explore-network-directory and denote-explore-network-filename variables.
  4. Relevant external software displays the result.

The denote-explore-network-graph-formats variable defines the file extension and the relevant functions for encoding and visualisation for each graph format.

GraphViz

GraphViz is an open-source graph visualisation software toolkit, ideal for this task. The Denote-Explore software saves the graph in the DOT language as a .gv file. The GraphViz software converts the DOT code to an SVG file.

You will need to install the GraphViz software to enable this functionality. Denote-Explore will raise an error when trying to create a GraphViz graph without the required external software available.

The configurable denote-explore-network-graphviz-header variable defines the basic settings for GraphViz graphs, such as the layout method and default node and edge settings.

The denote-explore-network-graphviz-filetype variable defines the GraphViz output format. SVG (the default) or PDF provide the best results. View the SVG file in a web browser to enable tooltips of nodes to show their name and other meta data, and to follow hyperlinks. Emacs can display SVG files, but is unable to follow links or show tootltips.

The diameter of nodes are sized relative to their degree. Thus, the most referenced note in your system will be the most visible. For nodes with a degree greater than two, the name is displayed outside the node (top left). When generating a neighbourhood, the source node is marked in a contrasting colour. In keyword graphs, the thickness of the edges indicates the number of times two keywords are associated with each other.

Hovering the mouse cursor over a node provides its name and other meta data. You can open the relevant file by clicking on the node, which works best when using Emacs as a server, if you configure your browser to open Org mode, Markdown and text files with the Emacs client. Links only work in neighbourhood and community graphs. These interactive functions are only available when viewing SVG files in a web browser.

The layout of the graph uses the Neato spring model in GraphViz.

This method is ideal for viewing small parts of your network. The network will be hard to read when the number of notes becomes too large.

D3.js

D3.js is a JavaScript library for data visualisation. This method provides an aesthetically pleasing and interactive view of your note collection. The Denote-Explore package stores the desired network as a JSON file.

The JavaScript file is generated with the R language as I have not yet mastered JavaScript to write it myself from scratch. R saves the network as an HTML file in the designated folder with the networkD3 R package. Hover over any node to reveal its name.

The colours indicate a statistical grouping based on the connections between nodes. This grouping is calculated with the Walktrap community detection algorithm, which finds communities of nodes by assessing which ones are more connected to each other than to nodes outside the community.

To enable this view, you must install the R language on your system. R will install some required libraries when you run this code for the first time. Any JavaScript developers interested in writing a better solution are cordially invited to submit improvements.

Graph Exchange XML Format

The first two formats provide some analysis of your knowledge network, but there is a lot more you can do with this type of information. While GraphViz and D3 are suitable for analysing parts of your network, this third option is ideal for storing the complete Denote network for further analysis. To do this, use the Community option and enter an empty search string to include all files.

Graph Exchange XML Format (GEXF) is a language for describing complex network structures. This option saves the network as a GEXF file without opening it in external software.

You can analyse the exported file with Gephi Lite, a free online network analysis tool. The GEXF file only contains the IDs, names and degree of the nodes, and the edges and their weights.

Analysing the Denote Network

A well-trodden trope in network analysis is that all people are linked within six degrees of separation. This may also be the case for your notes, but digging more than three layers deep is not very informative as the network can become large and difficult to review.

It might seem that adding more connections between your notes improves them, but this is not necessarily the case. The extreme case is a complete network where every file links to every other file. This situation lacks any interesting structure and wouldn’t be informative. So, be mindful of your approach to linking notes and attachments.

Your Denote network is unlikely to be a fully connected graph. In a connected graph, there is a path from any point to any other point. Within the context of Denote, this means that all files have at least one link or backlink. Your network will most likely have isolated nodes (files without any (back)links) and islands of connected notes. The previously discussed denote-explore-isolated-notes function lists all files without any links and backlinks to and from the note in the minibuffer.

The number of links and backlinks in a file (in mathematical terms, edges connected to a node) is the total degree of a node. The degree distribution of a network is the probability distribution of these degrees over the whole network. The denote-explore-degree-barchart function uses the built-in chart package to display a simple bar chart of the frequency of the total degree. This function might take a moment to run, depending on the number of notes in your system. Evaluating this function with the universal argument C-u excludes attachments from the analysis.

The importance of a note is directly related to the number of notes that link to it, the number of backlinks. The denote-explore-backlinks-barchart function visualises the number of backlinks in the top-n files in a horizontal barchart, ordered by the number of backlinks. This function asks for the number of nodes to visualise and analyses the complete network of Denote files and attachments, which can take a brief moment.

Lastly, some notes don’t have any links or backlinks. Depending on your note-taking strategy, you might want all your notes linked to another note. The denote-explore-isolated-notes function provides a list in the minibuffer of all notes without links or backlinks for you to peruse. You can select any note and add any links. Calling this function with the universal argument C-u includes attachments in the list of lonely files.

Installation and Package Configuration

This package is available through MELPA. The configuration below customises all available variables and binds the command on top of the C-c e prefix. You should modify this configuration to suit your needs, as one person’s sensible defaults are another person’s nightmare.

(use-package denote-explore
  :custom
  ;; Where to store network data and in which format
  (denote-explore-network-directory "<folder>")
  (denote-explore-network-filename "<filename?")
  (denote-explore-network-format 'graphviz)
  (denote-explore-network-graphviz-filetype 'svg)
  :bind
  (;; Statistics
   ("C-c e s n" . denote-explore-count-notes)
   ("C-c e s k" . denote-explore-count-keywords)
   ("C-c e s K" . denote-explore-keywords-barchart)
   ("C-c e s e" . denote-explore-extensions-barchart)
   ;; Random walks
   ("C-c e w n" . denote-explore-random-note)
   ("C-c e w l" . denote-explore-random-link)
   ("C-c e w k" . denote-explore-random-keyword)
   ("C-c e w r" . denote-explore-random-regex)
   ;; Denote Janitor
   ("C-c e j d" . denote-explore-identify-duplicate-notes)
   ("C-c e j D" . denote-explore-identify-duplicate-notes-dired)
   ("C-c e j z" . denote-explore-zero-keywords)
   ("C-c e j s" . denote-explore-single-keywords)
   ("C-c e j o" . denote-explore-sort-keywords)
   ("C-c e j r" . denote-explore-rename-keywords)
   ;; Visualise denote
   ("C-c e n" . denote-explore-network)
   ("C-c e r" . denote-explore-network-regenerate)
   ("C-c e d" . denote-explore-degree-barchart)
   ("C-c e b" . denote-explore-backlinks-barchart)))

You can use the most recent development version directly from GitHub (Emacs 29.1 or higher):

(unless (package-installed-p 'denote-explore)
  (package-vc-install
   '(denote-explore
     :url "https://github.com/pprevos/denote-explore/")))

Acknowledgements

This code would only have existed with the help of Protesilaos Stavrou, developer of Denote.

In addition, Jakub Szczerbowski, Samuel W. Flint, Ad (skissue), Vedang Manerikar, and Jousimies made significant contributions and suggestions.

Feel free to raise an issue here on GitHub if you have any questions or find bugs or suggestions for enhanced functionality.

License

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License or (at your option) any later version.

This program is distributed in the hope that it will be useful but WITHOUT ANY WARRANTY, INCLUDING THE IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

For a full copy of the GNU General Public License, see https://www.gnu.org/licenses/.