0. LINUX SUPPORT:
Recently I started porting this code to GNU/Linux. I have successfully
built it on Ubuntu 12.04 and 12.10, and the code is available in the
linux branch of this repository.
1. INTRODUCTION:
Unix systems have a long-standing culture and philosophy, and adhering
to it over the years is one of the main reasons behind the lasting
success of Unix. Part of this culture is to provide documentation for
each component of the operating system, whether it is a command line
utility, a system call, a library function, a configuration file or
anything else that should be documented to make the life of the end
user easier. This documentation has primarily been shipped in the form
of manual pages (man pages for short), which can be easily accessed
using the 'man' command. A couple of utilities are also provided to
search the documentation. apropos(1) can be used to search for man
pages, and the way it works is very simple: the NAME section of each
man page is indexed in a file (typically named whatis.db), and
apropos(1) searches this file for the keywords specified by the user.
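To make the classical behaviour concrete, here is a minimal sketch in C.
It is not the actual apropos(1) source. It assumes whatis.db-style entries
of the form "name (section) - description" held in an in-memory array, and
the function names and sample entries are made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive substring test; returns 1 if needle occurs in haystack. */
int contains_nocase(const char *haystack, const char *needle)
{
    size_t n = strlen(needle);

    if (n == 0)
        return 1;
    for (; *haystack != '\0'; haystack++) {
        size_t i = 0;
        while (i < n && haystack[i] != '\0' &&
            tolower((unsigned char)haystack[i]) ==
            tolower((unsigned char)needle[i]))
            i++;
        if (i == n)
            return 1;
    }
    return 0;
}

/*
 * Hypothetical helper: store the indexes of the entries that contain the
 * keyword in hits[] (at most max_hits of them) and return how many were
 * stored.
 */
size_t whatis_search(const char *const *entries, size_t n_entries,
    const char *keyword, size_t *hits, size_t max_hits)
{
    size_t found = 0;

    assert(entries != NULL && keyword != NULL && hits != NULL);
    for (size_t i = 0; i < n_entries && found < max_hits; i++)
        if (contains_nocase(entries[i], keyword))
            hits[found++] = i;
    return found;
}

/* Made-up sample of NAME-section lines, for illustration only. */
const char *const sample_whatis[] = {
    "useradd (8) - add a user to the system",
    "ls (1) - list directory contents",
    "passwd (1) - modify a user's password",
};
```

Searching the sample for "user" selects the first and third entries. A
keyword that appears only in the body of a page, and not in its NAME line,
can never match in this scheme.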
apropos(1) was designed with the resources (both hardware and software)
of the early days in mind, but things have changed drastically since
then. Today far more resources are available, and in the Google era it
behooves us to rethink the design and implementation of apropos(1). It
is now possible to implement apropos(1) in a way that allows more
extensive and flexible searches over the complete content of the man
pages, rather than only the NAME section. More often than not we are
not sure of the exact keywords to search for, and apropos(1) gives us
the wrong results (or no results at all), in which case we turn to
Google.

The idea behind this project is to fix this problem by reimplementing
apropos(1) with full text search capabilities and, in the process,
enhancing and modifying other man utilities as required. We have
decided to use the FTS engine of SQLite [1] for this purpose.
2. REQUIREMENTS FOR BUILDING & RUNNING:
The project has been developed on NetBSD, so I can only claim that it
works well on a NetBSD system (specifically, the current development
snapshot). It should, however, be possible to build and run it on other
BSD systems with little or no change.

Following are the requirements for building and running it on NetBSD:
2.1 -CURRENT version of NetBSD (or at least the man pages from -CURRENT)
2.2 mdocml

For other BSDs such as FreeBSD or DragonFlyBSD the requirements should
be about the same, with one caveat: I patched man.c to add a new option
'p' to man(1), which prints the search path for man pages on stdout in
a newline-separated format. makemandb requires this to build the Full
Text Search index. And of course the Makefile might need some
modifications.

GNU/Linux: Please check out the linux branch to use this code on Linux.
It is still a work in progress.
3. USING:
There are two command line utilities: 'makemandb' and 'apropos'. You
first need to build the Full Text Search (FTS) index using makemandb(1),
and then you can use apropos(1) (the one provided by this project) to
perform searches.

3.1 makemandb: Simply running makemandb will build the FTS index and
report the number of pages indexed. Some pages might fail to be indexed
along the way; this is indicated by error messages on the screen, but
it is nothing to worry about.

NOTE: By default makemandb updates the index incrementally. That is, it
adds only those pages that are not already in the index, and it removes
from the index those pages that no longer exist on the file system. Of
course, if there is no existing index it builds one from scratch.
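The incremental update described in the note boils down to two set
differences. The following is a minimal sketch in C under simplifying
assumptions, not makemandb's actual code: pages are identified by path
alone, both lists are already sorted, and the names plan_update and
update_plan are hypothetical. How makemandb detects a page whose contents
have changed is not described here, so the sketch ignores that case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: which pages to add and which to drop. */
struct update_plan {
    const char *to_add[16];     /* on disk, missing from the index */
    size_t n_add;
    const char *to_remove[16];  /* in the index, gone from disk */
    size_t n_remove;
};

/*
 * Walk two lists sorted with strcmp() in step and record the differences.
 * on_disk: man page paths currently on the file system.
 * indexed: man page paths already present in the index.
 * Both lists are assumed to hold at most 16 entries in this sketch.
 */
void plan_update(const char *const *on_disk, size_t n_disk,
    const char *const *indexed, size_t n_indexed, struct update_plan *plan)
{
    size_t d = 0, i = 0;

    assert(plan != NULL && n_disk <= 16 && n_indexed <= 16);
    plan->n_add = plan->n_remove = 0;
    while (d < n_disk || i < n_indexed) {
        int cmp;

        if (d == n_disk)
            cmp = 1;            /* only index entries remain */
        else if (i == n_indexed)
            cmp = -1;           /* only disk entries remain */
        else
            cmp = strcmp(on_disk[d], indexed[i]);

        if (cmp < 0)
            plan->to_add[plan->n_add++] = on_disk[d++];
        else if (cmp > 0)
            plan->to_remove[plan->n_remove++] = indexed[i++];
        else {
            d++;                /* present in both: nothing to do */
            i++;
        }
    }
}
```

With an empty index every page on disk lands in to_add, which corresponds
to building the index from scratch.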
makemandb supports the following options:
[-f]: The option 'f' tells makemandb(1) to prune the existing index
(if there is one) and rebuild the database from scratch.
[-l]: The option 'l' tells makemandb(1) to limit indexing to the NAME
section of the man pages. This option can be used to mimic the behavior
of the "classical apropos", although with improved search capabilities.
It might be useful if you want to save a few MB of disk space.
[-o]: The option 'o' optimizes the index. makemandb(1) will try to
optimize the FTS index for faster searches and to reduce the disk space
the data occupies.
3.2 apropos: Once you have built the database you can run apropos(1)
and pass it a query to search for. For example:
    $ apropos "add a new user"
apropos supports the following options:
[-1234569]: You can pass section numbers as options to apropos, which
restricts the search to the specified set of sections.
[-p]: By default apropos(1) displays the top 10 ranked results on
stdout. If you would like to see more results, use 'p'. It makes
apropos(1) display all the results and pipes them to a pager (more(1)).
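How the two options interact can be pictured with a small sketch in C.
This is an illustration, not code from apropos.c: the result type, the
function name select_results and the rank field are hypothetical, and it
is an assumption of the sketch that the section filter is applied before
the limit of 10 is counted.

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_LIMIT 10    /* default number of results shown */

/* Hypothetical result row: a page name, its section and a rank. */
struct result {
    const char *name;
    int section;    /* 1..9 */
    double rank;    /* assumed: higher means more relevant */
};

/*
 * Copy into out[] the rows whose section is enabled in wanted[1..9].
 * If no section is enabled, every section is accepted. Unless show_all
 * is set (the -p case), stop after DEFAULT_LIMIT rows. rows[] is assumed
 * to be sorted by rank already, best first, and out[] must have room for
 * n_rows entries. Returns the number of rows copied.
 */
size_t select_results(const struct result *rows, size_t n_rows,
    const int wanted[10], int show_all, struct result *out)
{
    size_t kept = 0;
    int any_section = 0;

    assert(rows != NULL && wanted != NULL && out != NULL);
    for (int s = 1; s <= 9; s++)
        any_section |= wanted[s];

    for (size_t i = 0; i < n_rows; i++) {
        int s = rows[i].section;

        if (!show_all && kept == DEFAULT_LIMIT)
            break;
        if (any_section && (s < 1 || s > 9 || !wanted[s]))
            continue;
        out[kept++] = rows[i];
    }
    return kept;
}
```

In this sketch, setting wanted[8] models a search restricted to section 8,
and show_all models the -p option; piping the output to the pager is left
out.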
4. OTHER DELIVERABLES:
Besides the two command line tools, I have also developed a very small
library that lets you build a search application on top of the FTS
index built by makemandb. It has the following public functions:
4.1 init_db(): Initializes a connection to the database.
4.2 run_query(): Runs a query as entered by the user and processes the
rows obtained in a callback function (apropos.c uses it).
4.3 run_query_html(): Similar to run_query(), but it formats the
results as an HTML fragment. This can be used to build a CGI
application for searching from a browser.
4.4 run_query_pager(): Similar to run_query_html(), but it formats the
results so that the matching text appears highlighted when piped to a
pager. apropos.c uses it when the -p option is specified.
4.5 close_db(): Closes the database connection and releases any
resources.

For more detailed documentation, read the man pages of the individual
components.