Archived

This package is archived. Please use open-language/wordnets instead.

En-Wordnet is a Node.js module that makes Princeton University's WordNet available as a package.

About

WordNet® is a large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations. The resulting network of meaningfully related words and concepts can be navigated with the browser. WordNet is also freely and publicly available for download. WordNet's structure makes it a useful tool for computational linguistics and natural language processing.

Getting started

We use bun for this project.

curl -fsSL https://bun.sh/install | bash
bun install
bun test

Where did you find this?

The latest version of WordNet can be found on the WordNet website, which links to both the 3.0 and the 3.1 versions. This package uses the data and index files from the 3.1 database version.

Is this credible?

WordNet is probably one of the most credible sources of English lexical data on the internet right now. If you would like to try it out for yourself, please go here.

Are there things that are missing?

Yes, there are other files that I have completely ignored so far: the Standoff Files and the Old Versions have been skipped. I will not add them until I find a use case for them. If you have one, please share.

How do I actually use this data?

The parser for the WordNet DB files will live in a separate repository. I did this because other significant projects follow the specifications set by WordNet (such as the Open Multilingual Wordnet), and a standalone parser can work with all of them to provide multilingual support.

How do I understand the data structures?

The data structures are clearly defined in these two documents:

  • wndb, which describes the index.* and data.* files (such as index.noun and data.noun)
  • wninput, which describes the lexicographer file format and the word syntax
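
To give a feel for what wndb describes, here is a minimal sketch of reading one line of a data.* file. It is not the parser mentioned above and not part of this package's API: the names Pointer, Synset and parseDataLine are hypothetical, and the sample line is illustrative (its offsets are not guaranteed to match the shipped 3.1 files). It assumes the field order given in wndb: synset_offset, lex_filenum, ss_type, w_cnt (hexadecimal), word/lex_id pairs, p_cnt (decimal), four-field pointers, then a gloss after the "|". Verb frames and adjective syntactic markers are ignored.

```typescript
// Hypothetical types for illustration only; not this package's API.
interface Pointer {
  symbol: string;       // relation symbol, e.g. "@" (hypernym) or "~" (hyponym)
  targetOffset: number; // byte offset of the target synset in its data file
  pos: string;          // part of speech of the target: n, v, a, s or r
  sourceTarget: string; // four hex digits; "0000" marks a synset-level relation
}

interface Synset {
  offset: number;
  lexFilenum: number;
  ssType: string;
  words: string[];
  pointers: Pointer[];
  gloss: string;
}

// Parse a single synset line, assuming the wndb field order described above.
function parseDataLine(line: string): Synset {
  const bar = line.indexOf("|");
  const head = bar === -1 ? line : line.slice(0, bar);
  const gloss = bar === -1 ? "" : line.slice(bar + 1).trim();

  const fields = head.trim().split(/\s+/);
  let i = 0;
  const offset = parseInt(fields[i++], 10);
  const lexFilenum = parseInt(fields[i++], 10);
  const ssType = fields[i++];

  // w_cnt is written as a two-digit hexadecimal number.
  const wordCount = parseInt(fields[i++], 16);
  const words: string[] = [];
  for (let w = 0; w < wordCount; w++) {
    words.push(fields[i].replace(/_/g, " ")); // underscores stand for spaces
    i += 2; // skip the lex_id that follows each word
  }

  // p_cnt is written as a three-digit decimal number.
  const pointerCount = parseInt(fields[i++], 10);
  const pointers: Pointer[] = [];
  for (let p = 0; p < pointerCount; p++) {
    pointers.push({
      symbol: fields[i],
      targetOffset: parseInt(fields[i + 1], 10),
      pos: fields[i + 2],
      sourceTarget: fields[i + 3],
    });
    i += 4;
  }

  return { offset, lexFilenum, ssType, words, pointers, gloss };
}

// Illustrative line in the wndb layout (not copied from the shipped files).
const sample =
  "00001740 03 n 01 entity 0 001 ~ 00001930 n 0000 | that which is perceived to have its own distinct existence";
console.log(parseDataLine(sample));
```

The license header at the top of each data.* file starts with two spaces, so a caller would need to skip those lines before handing the rest to a function like this.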

This is released under which license?

The complete license for WordNet can be found on their website and in this repo.

Credits