Omega 1.3.1 (2013-05-03):
This release includes all changes from 1.2.10-1.2.15 which are relevant.
documentation:
* INSTALL,configure: Provide hints as to what package to install for magic.h.
indexers:
* The HTML parser now explicitly handles <APPLET>, <OBJECT> and <TR>.
* Use a generated compact and efficient table to convert HTML tag names
to enum codes - this is both faster and smaller than the approach we were
using, with the benefit that the table is auto-generated.
* Always use our built-in conversion code for the character sets it can handle
(previously we'd use iconv if available; now we only use iconv for other
character sets). This gives us more consistent results, and in particular
means we now handle BOMs better (at least when using GNU iconv).
* A lot of data labelled as "iso-8859-1" is actually "windows-1252". The two
only differ in characters which are control characters in iso-8859-1, so
assume the latter when we see the former.
* omindex:
+ Extend --filter to handle commands which produce HTML on stdout.
+ By default, don't report an error if a file is deleted (or renamed) between
us reading the directory entry for it and trying to read the file itself.
In --verbose mode, the situation is still reported, but now with a
specific message.
+ If omindex receives any of the signals SIGHUP, SIGINT, SIGQUIT or SIGTERM,
then kill any active external filter child process, then handle the signal
as we did before. If setpgid() is available, put each external filter in
its own process group and kill the whole process group when we get a
signal.
+ Use magic_descriptor() if the version of libmagic we're building against
is new enough to have it. This eliminates an extra opening of a file
being indexed in certain cases.
+ Use rst2html to handle .rst and .rest files.
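The "iso-8859-1" versus "windows-1252" entry in the indexers list above can be illustrated with a short sketch. Python is used here purely for illustration (Omega itself is C++), and decode_labelled is a hypothetical helper, not Omega's converter. It shows that the two encodings only disagree on bytes 0x80-0x9f.

```python
def decode_labelled(data: bytes, charset: str) -> str:
    """Decode bytes, treating an "iso-8859-1" label as windows-1252."""
    # Hypothetical helper mirroring the policy in the entry above;
    # this is not Omega's actual conversion code.
    if charset.lower() == "iso-8859-1":
        charset = "windows-1252"
    return data.decode(charset)

raw = b"\x93quoted\x94 \x96 caf\xe9"   # made-up sample bytes

# Strict iso-8859-1: 0x93, 0x94 and 0x96 decode to C1 control characters.
strict = raw.decode("iso-8859-1")

# windows-1252: the same bytes are curly quotes and an en dash.
lenient = decode_labelled(raw, "iso-8859-1")

print(repr(strict))
print(lenient)
```

Python's windows-1252 codec rejects a few unassigned bytes (0x81, for example); the sketch does not handle those.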
omega:
* Add new $json and $jsonarray OmegaScript commands to support producing JSON
output.
* Add $truncate command which truncates a string after a word.
* Add support for $set{weighting,tfidf} to allow the new TfIdfWeight weighting
scheme to be used.
build system:
* configure: Now looks for libmagic in MAGIC_PREFIX, to allow building with
libmagic installed in a non-standard location.
* Remove support for 'configure --enable-quiet', 'make QUIET=' and 'make
QUIET=y' - automake now supports 'configure --enable-silent-rules', 'make
V=1' and 'make V=0' which are broadly equivalent and more standard.
portability:
* tmpdir.cc: Add safeunistd.h for rmdir, required by GCC 4.7 (reported by
Gaurav Arora).
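The omindex signal-handling entry in this release (each external filter in its own process group, with the whole group killed on a signal) can be sketched as follows. This is a minimal POSIX-only illustration in Python, not omindex's code; run_filter and kill_filter_group are hypothetical names, and a sleeping Python child stands in for a real filter.

```python
import os
import signal
import subprocess
import sys

def run_filter(argv):
    """Start a child process as the leader of a new process group."""
    # os.setpgid(0, 0) runs in the child before exec, so the child's
    # process group id becomes its own pid.
    return subprocess.Popen(argv, preexec_fn=lambda: os.setpgid(0, 0))

def kill_filter_group(proc, signum=signal.SIGTERM):
    """Send signum to every process in the group led by proc."""
    os.killpg(proc.pid, signum)

# A sleeping child stands in for an external filter.
child = run_filter([sys.executable, "-c", "import time; time.sleep(60)"])
kill_filter_group(child)
status = child.wait()
print(status)   # negative signal number on POSIX
```

Signalling the group rather than the single child also reaches any helpers the filter itself started, provided they have not changed their own process group.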
Omega 1.3.0 (2012-03-14):
general:
* Make libmagic a required dependency.
documentation:
* docs/termprefixes.html: Document how to map a user prefix to multiple term
prefixes.
* docs/overview.html: Improve documentation of htdig_noindex.
indexers:
* omindex:
+ Index title with an 'S' prefix rather than no prefix.
+ If the document with the highest existing docid before the run was updated,
we were reporting it as "added", but now we correctly report it as
"updated".
+ Catch and report std::exception explicitly, so failing to allocate memory
is no longer reported as "Unknown exception".
* scriptindex:
+ Remove special error handling case noting that index=nopos was replaced
with indexnopos - this was removed in 1.1.0 so there's been enough time to
upgrade.
omega:
* DEFAULTOP now defaults to AND rather than OR, since that matches what pretty
much every search engine does these days. Closes ticket#512.
* Allow mapping a query string prefix to more than one term prefix (which
xapian-core has supported since 1.0.4).
* Add support for search inputs for multiple probabilistic prefixes, with
support for per-prefix stemmers.
* Drop legacy support for handling '.' separated terms in xP - that changed in
Omega 0.9.7, more than 5 years ago now.
* Remove support for OLDP CGI parameter which was superseded by xP
approximately a decade ago, and isn't even documented!
* Drop special handling for R-prefixed terms in $prettyterm - we stopped
generating these in Xapian 1.0.
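The practical effect of the DEFAULTOP change listed above can be shown with a toy inverted index. This is only an illustration in Python with made-up data; it does not use Xapian's QueryParser.

```python
# Toy inverted index: term -> set of document ids (made-up data).
index = {
    "xapian": {1, 2, 3},
    "omega": {2, 3, 4},
}

def search(terms, default_op="AND"):
    """Combine the posting sets of the terms with the default operator."""
    postings = [index.get(term, set()) for term in terms]
    if not postings:
        return set()
    if default_op == "AND":
        return set.intersection(*postings)
    return set.union(*postings)

print(sorted(search(["xapian", "omega"])))         # [2, 3]
print(sorted(search(["xapian", "omega"], "OR")))   # [1, 2, 3, 4]
```

With AND as the default, adding a word to a query narrows the result set instead of widening it.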
templates:
* templates/query:
+ We now map unprefixed queries to include S-prefixed terms to match the
change in omindex to prefixing terms from the title with S. You may want
to make the same update to your own templates.
+ Set up prefixes for 'author:' and 'title:'.
packaging:
* xapian-omega.spec: We're ABI compatible within a release series so make
dependency on xapian-core-libs >= rather than =.
Omega 1.2.15 (2013-04-16):
omega:
* Don't pointlessly link utf8convert.o into the omega CGI.
Omega 1.2.14 (2013-03-14):
indexers:
* omindex:
+ Correct "max" -> "min" when reserving space for shared strings in .xlsx
files. This just means we now reserve a more appropriate amount of space
to start with.
+ Ignore .com files by default.
Omega 1.2.13 (2013-01-09):
indexers:
* omindex:
+ Extracting text using external filters now works for filenames containing a
newline character - previously the newline got lost during escaping for the
shell.
+ Fix segfault when -F option without a ':' is passed.
+ Skip a file if we get a read error while calculating the MD5 checksum (used
for duplicate detection) - previously we used a checksum of the file up to
that point.
+ Avoid rereading SVG and Atom files when we calculate their MD5 checksums.
+ Improve --help output and man page, most notably:
- Say explicitly that --sample-size accepts the same formats as --max-size.
- Note default size limit on files to index is unlimited.
+ When generating a sample for a CSV file, limit the size we pre-allocate to
the CSV file size if that's smaller than the requested sample size, in case
the user sets that limit very high.
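The first omindex entry above concerns filenames containing a newline being passed through the shell. The sketch below shows one way to quote such a name safely, using Python's shlex.quote; it is not omindex's escaping code, and the filename is made up.

```python
import shlex
import subprocess

filename = "report\n2013.txt"   # made-up name containing a newline

# shlex.quote wraps the name in single quotes, so the shell passes the
# newline through as part of a single argument.
command = "printf %s " + shlex.quote(filename)
result = subprocess.run(command, shell=True, capture_output=True)

print(result.stdout == filename.encode())
```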
omega:
* Fix to decode %-encoded character at the end of the query string.
build system:
* INCLUDES is now deprecated in automake, so use AM_CPPFLAGS instead.
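The omega entry in this release fixes decoding of a %-encoded character at the very end of the query string. The entry does not say what the bug was; the Python sketch below only shows a decoder whose bounds check accepts an escape occupying the last three characters. url_decode is a hypothetical name.

```python
HEX = "0123456789abcdefABCDEF"

def url_decode(s: str) -> str:
    """Decode %XX escapes and '+' in a query-string component."""
    out = bytearray()
    i = 0
    while i < len(s):
        ch = s[i]
        # An escape needs two hex digits after '%'. The index i + 2 may
        # be the last valid index, hence "<" rather than "< len(s) - 1".
        if ch == "%" and i + 2 < len(s) and s[i + 1] in HEX and s[i + 2] in HEX:
            out.append(int(s[i + 1:i + 3], 16))
            i += 3
        elif ch == "+":
            out.append(0x20)
            i += 1
        else:
            out.extend(ch.encode("utf-8"))
            i += 1
    return out.decode("utf-8", errors="replace")

print(url_decode("a+b%21"))   # a b!
print(url_decode("100%"))     # 100%
```

Incomplete escapes such as a trailing "%" or "%4" are passed through unchanged in this sketch.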
Omega 1.2.12 (2012-06-27):
No changes since 1.2.11 except to bump the version - this release was made to
fix an incorrect library version information update in xapian-core 1.2.11.
Omega 1.2.11 (2012-06-26):
indexers:
* Change HTML parser's handling of multiple <body> tags and of text outside of
<body> to match the behaviour of modern web browsers. (ticket#599)
* omindex:
+ Add command line option to control the size of the document sample stored.
Patch from Mihai Bivol.
+ Rework .xlsx parsing to substitute the shared strings into the positions
they are used in, so that the sample actually matches what appears in the
spreadsheet, and to index calculated cell contents.
+ Improve handling of headers and footers in OpenDocument documents.
+ pdftotext outputs a formfeed between each page, which messes up our "empty
body" check, so trim any trailing formfeeds before this check.
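The pdftotext entry above trims trailing formfeeds before the "empty body" check. A minimal Python sketch of that idea, with a hypothetical is_empty_body helper (the real check in omindex may differ):

```python
def is_empty_body(text: str) -> bool:
    """Return True if nothing remains once trailing formfeeds are trimmed."""
    return text.rstrip("\f") == ""

print(is_empty_body("\f\f\f"))        # True: only page separators
print(is_empty_body("Page one\f"))    # False
```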
build system:
* Don't explicitly link indirect shared library dependencies on FreeBSD,
OpenBSD, and Solaris.
Omega 1.2.10 (2012-05-09):
indexers:
* Add support for CDATA to HTML/XML parser.
* omindex:
+ Add --max-size option, based on patch from ndaley in ticket#587.
+ Add support for Atom feed files, patch from Mihai Bivol in ticket#595.
+ If the document with the highest existing docid before the run was updated,
we were reporting it as "added", but now we correctly report it as
"updated". (Backported from 1.3.0).
+ Catch and report std::exception explicitly, so failing to allocate memory
is no longer reported as "Unknown exception". (Backported from 1.3.0).
* scriptindex:
portability:
* Fix to build with GCC 4.7 by adding cast to rlim_t to fix error about C++11
compatibility (reported by Gaurav Arora).
Omega 1.2.9 (2012-03-08):
documentation:
* docs/overview.html:
+ Document that libmagic is used to determine the MIME type if the extension
isn't known. Partly addresses ticket#569.
+ We now limit time as well as CPU and memory for external filters.
indexers:
* Our HTML parser now ignores sections bracketed by <!--UdmComment--> and
<!--/UdmComment-->, like we already do for <!--htdig_noindex-->.
* omindex: Add more extensions to the default ignore list: bin dat db fon jar
lnk pyc pyd pyo sqlite sqlite3 sqlite-journal tmp ttf
Omega 1.2.8 (2011-12-13):
documentation:
* scriptindex.cc: Add link to http://xapian.org/docs/omega/scriptindex.html to
--help output (and so also to the man page which is generated from this).
* omegascript.html: Add note to discourage use of percentage scores.
indexers:
* omindex:
+ If we don't get any data from an external filter for 5 minutes, give up -
it has probably ended up blocked indefinitely.
+ Improve --help output (and man page which is generated from it). Closes
bug#572.
* scriptindex:
+ If no rules are found in the index script, report an error and give up -
this is inevitably the result of a mistake, and adding empty documents to
the database isn't helpful.
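The omindex entry above gives up on an external filter that produces no data for 5 minutes. One common way to implement such an idle timeout is select() on the filter's output pipe; the Python sketch below assumes that approach, which the entry does not confirm. read_all_with_idle_timeout is a hypothetical name, and the timeout is a parameter (300 seconds would correspond to the entry).

```python
import os
import select

def read_all_with_idle_timeout(fd: int, idle_timeout: float) -> bytes:
    """Read fd until EOF; raise TimeoutError if it stays silent too long."""
    chunks = []
    while True:
        # Wait until fd is readable or idle_timeout seconds pass.
        readable, _, _ = select.select([fd], [], [], idle_timeout)
        if not readable:
            raise TimeoutError("no data from filter within idle timeout")
        chunk = os.read(fd, 4096)
        if not chunk:            # EOF: the writer closed its end
            return b"".join(chunks)
        chunks.append(chunk)

# A pipe stands in for the filter's stdout.
r, w = os.pipe()
os.write(w, b"extracted text")
os.close(w)
output = read_all_with_idle_timeout(r, 1.0)
os.close(r)
print(output)
```

The timer restarts after every chunk, so a slow but steadily producing filter is not cut off.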
omega:
* Add new $prettyurl{} command which undoes RFC3986 URL escaping which
doesn't affect semantics in practice. Partly addresses ticket#550.
* Replace URL decoder with new implementation which handles various corner
cases better. Fixes bug#578.
* If CGI parameter P has trailing spaces, we now remove them all rather than
leaving one.
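The idea behind $prettyurl{} above, undoing escapes that do not change a URL's meaning, can be sketched conservatively by decoding only escapes of RFC 3986 "unreserved" characters. This Python sketch is an assumption about the general technique; the set of characters the real $prettyurl{} decodes is not stated here and may well be larger. pretty_url is a hypothetical name.

```python
import re
import string

# RFC 3986 unreserved characters: ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

def pretty_url(url: str) -> str:
    """Decode %XX escapes only when they encode an unreserved character."""
    def unescape(match):
        ch = chr(int(match.group(1), 16))
        return ch if ch in UNRESERVED else match.group(0)
    return re.sub(r"%([0-9A-Fa-f]{2})", unescape, url)

print(pretty_url("/docs/read%2Dme%2Etxt"))   # /docs/read-me.txt
print(pretty_url("/a%2Fb%20c"))              # /a%2Fb%20c (left alone)
```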
templates:
* templates/query: HTML escape topterms.
* templates/godmode: HTML escape the contents of document values.
* templates/query: Don't show the percentage score in the default template.
testsuite:
* Add new urlenctest unit test of URL encoding and decoding.
portability:
* configure: Sync changes from xapian-core: Don't pass -Wshadow for GCC < 4.1;
don't pass -Wstrict-null-sentinel for GCC 4.0.x; only enable symbol
visibility on platforms where it is supported.
packaging:
* xapian-omega.spec: Package outlookmsg2html helper.
Omega 1.2.7 (2011-08-10):
documentation:
* docs/termprefixes.html: Document how to map a user prefix to multiple term
prefixes.
* docs/overview.html: Improve documentation of htdig_noindex.
omega:
* Improve $version output from "Xapian - xapian-omega 1.2.7" to "xapian-omega
1.2.7".
packaging:
* xapian-omega.spec: We're ABI compatible within a release series so make
dependency on xapian-core-libs >= rather than =.
Omega 1.2.6 (2011-06-12):
documentation:
* docs/omegascript.html: Correct the documentation of the colours used by
$highlight{}.
* docs/overview.html: Add using unoconv as more complex example of using
--filter (ticket#324).
templates:
* templates/query:
+ Make search query input type=search.
+ Autofocus the search query input (using HTML autofocus attribute with
JavaScript fallback for older browsers). (ticket#544)
portability:
* Fix a compiler warning.
Omega 1.2.5 (2011-04-04):
documentation:
* Add index page which links to all the other documentation pages.
* INSTALL: Copy new Multi-Arch section from xapian-core/INSTALL. Replace VPATH
section with better equivalent from xapian-core/INSTALL.
* docs/omegascript.html: Minor improvements.
indexers:
* The HTML parser no longer uses an exception to signify it has finished in
the normal case as exceptions are typically costly to handle. In tests,
this made omindex ~0.23% faster when indexing a lot of HTML files.
* omindex:
+ Add --ignore-exclusions option, which will index HTML files despite meta
robots tags, etc - omindex is often used in environments where such
exclusions aren't relevant.
+ Fix to compile with older versions of libmagic which don't have
MAGIC_MIME_TYPE (e.g. on Ubuntu hardy).
+ Tell xls2csv to separate fields with spaces rather than commas, and not to
quote them. Fixes indexing of numeric fields, and means we don't need to
use our CSV parser to get a sample.
+ Add whitespace between chunks of text extracted from Microsoft Office 2007
formats to prevent words in adjacent chunks from being run together.
+ Encode reserved characters in URLs - links to files with names containing
'#' and '?' now work.
+ Handle .xlr extension the same way as .xls (later Microsoft Works versions
apparently produce such files which are really the same format).
+ Index filename extension with new standard prefix E.
+ Just report the mimetype as unknown instead of saying "unknown Office 2007
MIME subtype".
+ Ignore *.css and *.js by default too.
+ Messages reporting skipping files are now more consistent and always report
the filename.
+ New --empty-docs option to allow documents we extract no body text from to
be indexed (existing behaviour), skipped, or reported and then indexed.
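The entry above about encoding reserved characters in URLs matters because an unescaped '#' starts a fragment and an unescaped '?' starts a query string. A short Python illustration using the standard urllib.parse.quote (the path is made up, and this is not omindex's encoder):

```python
from urllib.parse import quote

path = "/docs/notes #1/what?.txt"   # made-up path

# quote() leaves "/" alone by default and percent-encodes the rest of
# the characters that are not unreserved.
encoded = quote(path)
print(encoded)   # /docs/notes%20%231/what%3F.txt
```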
omega:
* Fix double Content-Type header in some error reporting situations (regression
introduced in 1.2.4).
* Update $url's URL encoding to follow RFC3986.
* Allow QueryParser flags to be set from OmegaScript (ticket#418). The
FLAG_SPELLING_CORRECTION flag can now be set using
$opt{flag_spelling_correction,1} - the old $opt{spelling,true} way to
enable this flag still works, but is now deprecated.
templates:
* templates/emptydocs,templates/godmode,templates/opensearch,templates/query,
templates/xml: Add missing escaping. Some of these instances may allow
cross-site scripting, so upgrading your templates is recommended, especially
if you have any sensitive cookies set on the domain Omega is running on.
* templates/xml:
+ Try $field{caption} (which is what omindex sets) before $field{title} when
getting a value for the hit tag's title attribute - this is consistent with
how the query template gets the title.
+ Add new 'type' attribute which gives $field{type}.
+ Add 'DBSize' attribute to <result> element.
+ Fix double escaping of matching terms. This is only likely to affect cases
where a matching term contains '&'.
+ Remove support for undocumented HILITECLASS CGI variable. There's no
evidence I can find using Google code search or web search that this has
been used anywhere, and it's difficult to handle escaping it properly in
the face of all the ways it could reasonably be used.
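The missing-escaping fix in the templates entries above guards against cross-site scripting. The Python sketch below shows the general principle with the standard html.escape; render_hit is a hypothetical stand-in for a template fragment, not OmegaScript.

```python
import html

def render_hit(title: str) -> str:
    """Embed an untrusted value in an attribute, escaping it first."""
    return '<a title="' + html.escape(title, quote=True) + '">hit</a>'

# An attacker-controlled value trying to break out of the attribute.
print(render_hit('"><script>alert(1)</script>'))
```

After escaping, the quote and angle brackets arrive as character references, so the browser treats the value as text rather than markup.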
portability:
* Fix to compile on Microsoft Windows (ticket#350).
Omega 1.2.4 (2010-12-19):
documentation:
* Minor documentation improvements.
indexers:
* Some iconv implementations (such as that on Mac OS X) don't handle many of
the commonly seen mis-punctuated charset names (e.g. UTF16, UTF_16). We now
check for this if iconv fails, fix up the charset name, and retry.
* The built-in character encoding converter now handles spaces in charset
names.
* Use O_NOATIME if available and either the file is owned by the current euid,
or the current euid is 0 (i.e. we're running as root). This avoids updating
the access time of files we index which saves time. Fixes ticket#222.
* Report get_description() for Xapian exceptions, which provides additional
information above get_msg().
* Add boolean terms with add_boolean_term() so they get wdf of 0 and don't
contribute to document length.
* omindex:
+ Escape wildcard patterns being passed to unzip - in the unlikely event that
one of these matched files in or under the current directory, we might fail
to extract all the files we wanted to.
+ Add explicit support for indexing CSV files (better samples than from
using '-Mcsv:text/plain').
+ Add support for indexing .msg files from Microsoft Outlook (using the Perl
module Email::Outlook::Message). (ticket#334)
+ Improve --help for --mime-type option.
+ Optionally use libmagic to detect MIME types for files for which we have no
extension mapping, which allows us to handle files with a misleading
extension, or no extension at all. (ticket#114)
+ Add new --filter option which allows the user to specify new filters
provided they return UTF-8 text on stdout.
+ If a filter command isn't installed, previously we wouldn't try it again
for the same file extension - now we won't try it again for the same
mime-type.
+ Index the leafname of the file (without any extension) as extra keywords.
+ Extract author from HTML, OpenDocument, and PDF files. Index it with an A
prefix, and add it as a field.
+ Add support for indexing text and metadata from SVG files.
+ Extract metadata from Microsoft Office 2007 file formats.
+ Index text in headers and footers for .odt and .docx files.
+ Use the CSV parser to generate a nicer sample for files of type
application/vnd.ms-excel.
+ Add support for indexing Debian and RPM package files (ticket#493).
+ Make the memory limit for filter processes the size of physical memory,
which is a little less arbitrary than 7/8 of this value (ticket#424).
+ Under --duplicate=ignore, fix so that old documents which aren't seen get
deleted, which wasn't implemented before (to suppress this deletion, pass
-p as well).
+ Rename the short option for --version from -v to -V for consistency with
scriptindex and many other packages, and to free up -v as the short option
for --verbose. For backward compatibility, "omindex -v" is handled
specially and still reports the version.
+ Add --verbose option, and disable the less interesting output unless it is
specified.
+ Deprecate "--preserve-nonduplicates" in favour of new long option
"--no-delete" which does the same thing, but has a clearer name.
+ The deletion of documents pass at the end of indexing is now more
efficient. We track how many documents in the database we haven't seen so
we can stop once we've found them all (a particularly big improvement if
there are no documents to delete), and we now use a PostingIterator over
all documents which avoids needing to catch an exception for every gap in
the used document ids.
+ Quietly ignore files with mimetype set to "ignore". The initial list of
extensions set to ignore is: .a .dll .dylib .exe .lib .o .obj .so
+ Index file owner and read permissions, to allow finding documents with a
particular owner, and so searches can be restricted to documents a user is
able to read.
+ Add file size as a document value, so you can sort on it and filter by it.
* scriptindex:
+ Fix file descriptor leak if the LOADFILE action is used on something which
isn't a file.
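The O_NOATIME entry in the indexers list above describes when the flag is used: if it is available, and the file is owned by the current euid or the euid is 0. A Python sketch of that rule follows; open_for_indexing is a hypothetical name, and the fallback on PermissionError is an addition of this sketch rather than something the entry describes.

```python
import os
import tempfile

def open_for_indexing(path: str) -> int:
    """Open path read-only, adding O_NOATIME when available and permitted."""
    flags = os.O_RDONLY
    noatime = getattr(os, "O_NOATIME", 0)   # Linux-specific; 0 elsewhere
    if noatime:
        euid = os.geteuid()
        if euid == 0 or os.stat(path).st_uid == euid:
            flags |= noatime
    try:
        return os.open(path, flags)
    except PermissionError:
        # Assumption of this sketch: retry without the flag if refused.
        return os.open(path, os.O_RDONLY)

with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"hello")
    tmp.flush()
    fd = open_for_indexing(tmp.name)
    data = os.read(fd, 5)
    os.close(fd)
print(data)
```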
omega:
* Make sure we write out HTTP headers when reporting an error early on.
* Extend $field to take an optional DOCID argument, rather than always using
the context from $hitlist.
* Add new $emptydocs command which returns a list of documents with doclength
zero.
* Add support for size: range filtering. Currently the end points of the range
have to be specified in bytes (e.g. size:102400..204800 for 100-200KB).
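The size: range filter above takes its end points in bytes. The Python sketch below parses the example from the entry and applies it to some made-up file sizes. parse_size_range and in_range are hypothetical names, and treating both end points as inclusive is an assumption.

```python
def parse_size_range(spec: str):
    """Parse 'size:LOW..HIGH' (both in bytes) into a (low, high) tuple."""
    prefix = "size:"
    if not spec.startswith(prefix):
        raise ValueError("not a size: filter")
    low, sep, high = spec[len(prefix):].partition("..")
    if not sep:
        raise ValueError("expected LOW..HIGH")
    return int(low), int(high)

def in_range(size: int, bounds) -> bool:
    """Assumed semantics: both end points are inclusive."""
    low, high = bounds
    return low <= size <= high

bounds = parse_size_range("size:102400..204800")   # 100-200KB
sizes = [50_000, 102_400, 150_000, 300_000]        # made-up file sizes
print([s for s in sizes if in_range(s, bounds)])   # [102400, 150000]
```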
templates:
* templates/emptydocs: New template which lists documents with doclength zero.
build system:
* configure: Probe for any options needed to enable large file support.
Handling files >= 2GB isn't especially useful, but more importantly this is
needed to allow omindex to index files on filing systems with 64 bit inodes
on some platforms (e.g. 32-bit Linux).
* Use -no-undefined on platforms which need it to dynamically link such as
cygwin (the need for this was taken from ticket#282).
portability:
* Fix to compile with Sun C++.
Omega 1.2.3 (2010-08-24):
documentation:
* docs/termprefixes.html: Update "flint and quartz" to "flint and chert" as
quartz is no longer supported. Give exact term length limit for flint and
chert.
packaging:
* xapian-omega.spec: Don't run autoreconf - it's no longer required.
Omega 1.2.2 (2010-06-27):
portability:
* Apply getopt portability fixes from xapian-core 1.2.0, fixing build failures
on Mac OS X (and probably some other platforms with non-GNU getopt
implementations). (ticket#469)
Omega 1.2.1 (2010-06-22):
This release includes all changes from 1.0.21 which are relevant.
Omega 1.2.0 (2010-04-28):
This release includes all changes from 1.0.20 which are relevant.
build system:
* configure: Tell libtool not to link in deplibs on platforms where we know
they aren't needed.
* configure: On Linux, extract the library search path from ldconfig which
gives us the default entries reliably.
Omega 1.1.5 (2010-04-15):
This release includes all changes from 1.0.19 which are relevant.
Omega 1.1.4 (2010-02-15):
This release includes all changes from 1.0.18 which are relevant.
omega:
* Use the optimised integer to string conversion routines from xapian-core.
Omega 1.1.3 (2009-11-18):
This release includes all changes from 1.0.15-1.0.17 which are relevant.
templates:
* templates/query: If JavaScript is available, convert $field{modtime} to a
string on the client-side so that the timezone is correct. If JavaScript
isn't available, fall back to the existing behaviour of using UTC.
(ticket#314)
build system:
* configure: Default to looking for xapian-config-1.1 unless XAPIAN_CONFIG is
specified.
Omega 1.1.2 (2009-07-23):
This release includes all changes from 1.0.14 which are relevant.
indexers:
* omindex:
+ Handle the "macroenabled" versions of MS Office 2007 files too
(ticket#290).
+ Extract pptx notesSlides and comments, if present. (ticket#290).
Omega 1.1.1 (2009-06-09):
This release includes all changes from 1.0.13 which are relevant.
indexers:
* omindex:
+ Check the last modification time of files before reindexing (ticket#342).
+ Add "--spelling" option to index spelling correction data.
* scriptindex:
+ Add new "spell" action for indexing spelling correction data (ticket#296).
omega:
* Add $suggestion and $opt{spelling} to provide access to spelling correction
(ticket#296).
* Add $opt{weighting} to allow the weighting scheme and parameters to be
specified (ticket#298).
* If SERVER_PROTOCOL in the environment is set to INCLUDED, then our output is
being included in another page (e.g. using SSI) so suppress the output of any
HTTP headers.
templates:
* templates/query: Offer any spelling correction QueryParser gives.
build system:
* configure: Sync warning flags used with GCC with xapian-core apart from
-Woverloaded-virtual which fires for MyHtmlParser::parse_html(). That
probably should be tidied up at some point, but not right now.
Omega 1.1.0 (2009-04-23):
indexers:
* scriptindex:
+ Make deprecated "index=nopos" an error.
omega:
* New OmegaScript command $transform{} which performs regular expression
substitutions using the PCRE library (which is now required to build Omega).
(ticket#231)
build system:
* The build system is now bootstrapped with newer versions of autoconf and
libtool which should produce smaller files and speed up configure and
make.
Omega 1.0.23 (2011-01-14):
indexers:
* omindex:
+ Escape wildcard patterns being passed to unzip - in the unlikely event that
one of these matched files in or under the current directory, we might fail
to extract all the files we wanted to when indexing document formats like
OpenDocument which use a zip file container.
+ The parser for OpenDocument metadata wasn't initialising its "state" field.
Often you'd be lucky and it would be initialised to zero, but this could
have caused misparsing of metadata in some cases.
* scriptindex: Fix file descriptor leak if the LOADFILE action is used on
something that isn't a file.
* If fstat() fails when trying to load a file, preserve the errno value from
the fstat call to report to the user.
portability:
* configure: Probe for any options needed to enable large file support.
Handling files >= 2GB isn't especially useful, but more importantly this is
needed to allow omindex to index files on filing systems with 64 bit inodes
on some platforms (e.g. 32-bit Linux).
* Add -no-undefined to AM_LDFLAGS on platforms which need it to dynamically
link such as cygwin (the need for this was taken from ticket#282).
Omega 1.0.22 (2010-10-03):
portability:
* Fix to compile with Sun C++.
Omega 1.0.21 (2010-05-18):
portability:
* Fix build failure in freemem.cc on Microsoft Windows.
Omega 1.0.20 (2010-04-27):
portability:
* Fix build failure on Mac OS X and possibly some other platforms (regression
caused by fix for getopt-related warnings on Cygwin in 1.0.19).
Omega 1.0.19 (2010-04-15):
portability:
* Fix getopt-related warning on Cygwin.
Omega 1.0.18 (2010-02-14):
indexers:
* Make the default charset "utf-8" not "UTF-8" as we lower case explicitly
specified character sets to compare to see if we need to reparse. Previously
XML documents which explicitly specified their character set as UTF-8 would
cause a needless restart of the parser.
* omindex:
+ Increase the wdf boost for the document title from 2 to 5, since 2 isn't
really enough.
* scriptindex:
+ Don't abort with "Unknown Exception" if indexing is disallowed or we hit
</body> for a document which had an overridden character set. Fixes
ticket#410.
Omega 1.0.17 (2009-11-18):
indexers:
* omindex:
+ On Linux, change the memory limit on external filters to use _SC_PHYS_PAGES
since _SC_AVPHYS_PAGES excludes pages used by the OS cache and so will
often report a really low value. Fixes Debian bug#548987 and ticket#358.
+ Fix likely crash when reading output from external filter program if read()
is interrupted by a signal.
+ Fix potential crash when indexing PostScript files (fixed by using delete[]
(not delete) for array allocated by new[]).
testsuite:
* utf8converttest: Charset "8859_1" isn't understood by Solaris libiconv, and
isn't a standard charset name, so just test it when using our built-in
converter and GNU libc.
portability:
* Fix build failure on Mac OS X 10.6.
* Also check for socketpair() in -lxnet if it isn't found without, which
enables resource limits on external filter programs called by omindex on
Solaris, and possibly some other platforms. Fixes ticket#412.
Omega 1.0.16 (2009-09-10):
* omega: Fix cross-site scripting vulnerability in reporting of exceptions
(CVE-2009-2947).
Omega 1.0.15 (2009-08-26):
general:
* omegascript.vim: The list of OmegaScript commands in the vim mode was rather
out of date, and a few commands were misclassified. Fix both problems and
avoid future recurrences by automatically generating those lists from the
command list in query.cc.
documentation:
* omegascript.html: Document that $date uses UTC. (ticket#314)
templates:
* query: Link to "xapian.org" rather than "www.xapian.org".
* inc/toptermsjs: Use double-quotes rather than single quotes for parameter
values on the <script> tag.
portability:
* omindex: Implement correct handling of paths when calling external filter
programs on Microsoft Windows.
Omega 1.0.14 (2009-07-21):
indexers:
* omindex: Make sure that output is flushed after every message, not just after
some of them.
portability:
* Avoid infinite loop in omindex and scriptindex when reading files under
Cygwin with automatic end of line translation enabled. This same bug can
also manifest on Unix platforms if the file is truncated by another process
while being read.
Omega 1.0.13 (2009-05-23):
indexers:
* omindex:
+ If the filter program needed for a file format isn't installed, report this
explicitly when skipping subsequent files with the extension instead of
misleadingly reporting "Unknown extension".
+ Make -s actually work as a short-form for --stemmer (as documented by
"omindex --help" and "man omindex").
+ Drop the copyright info from the output of --version as it's perennially
out of date and we don't report it for any other Xapian programs.
* scriptindex:
+ Add new "valuenumeric" action to add a document value using
Xapian::sortable_serialise() to allow numeric sorting (ticket#260).
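The "valuenumeric" entry above exists because document values are compared as strings, so numbers need an encoding whose byte order matches numeric order. The Python sketch below illustrates that idea with a well-known bit trick on IEEE 754 doubles. It is explicitly not the encoding Xapian::sortable_serialise() uses; sortable_key is a hypothetical stand-in.

```python
import struct

def sortable_key(value: float) -> bytes:
    """Encode a float so byte-wise comparison matches numeric order."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    if bits & (1 << 63):
        bits ^= 0xFFFFFFFFFFFFFFFF   # negative: flip every bit
    else:
        bits |= 1 << 63              # non-negative: set the sign bit
    return struct.pack(">Q", bits)

values = [10, 9, -1.5, 100.25, 0, -20]   # made-up values
print(sorted(values, key=str))            # string order, not numeric
print(sorted(values, key=sortable_key))   # numeric order
```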
build system:
* configure: Enable more GCC warnings - "-Wstrict-null-sentinel" for 4.0+,
"-Wlogical-op -Wmissing-declarations" for 4.3+.
Omega 1.0.12 (2009-04-19):
omega:
* $log now retries a partial write, or one interrupted by a system call.
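The $log entry above is an instance of the usual "write all" loop: a single write call may accept fewer bytes than requested, so the remainder is retried. A minimal Python sketch follows (write_all is a hypothetical name). Python 3.5 and later retries interrupted system calls itself, so only the partial-write half is visible here.

```python
import os

def write_all(fd: int, data: bytes) -> None:
    """Write all of data to fd, continuing after partial writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)   # may write fewer bytes than asked
        view = view[written:]

# A pipe stands in for the log file.
r, w = os.pipe()
write_all(w, b"log line\n")
os.close(w)
logged = os.read(r, 100)
os.close(r)
print(logged)
```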
build system:
* configure: Fix iconv parameter type probe not to implicitly cast a string
literal to char* - this is a warning under GCC currently, but the user could
pass -Werror explicitly in CXXFLAGS, and this could be promoted to an error
in future GCC versions, and may already be so for some other compilers.
* Overriding CXXFLAGS at make-time (e.g. "make CXXFLAGS=-Os") no longer
overrides any flags required for building with Xapian.
* We now actually use the compiler warning flags which configure detects.
Omega 1.0.11 (2009-03-15):
documentation:
* cgiparams.html: Note the technique of using a stub database file to allow a
default of searching over multiple databases.
indexers:
* omindex:
+ Add support for indexing Microsoft Office 2007 formats and XPS files
(bug#290).
+ Fix the extraction of metadata from OpenDocument formats.
+ Fix "-l" which would previously always cause a segmentation fault if used
("--depth-limit" wasn't affected).
build system:
* configure: The output of g++ --version changed format (again) with GCC 4.3
which meant configure got "g++" for the version. Instead use the (hopefully)
more robust technique of using g++ -E to pull out __GNUC__ and
__GNUC_MINOR__.
* configure: Turn on _FORTIFY_SOURCE where available (as we do in xapian-core).
portability:
* Fix to compile when RLIMIT_AS isn't available (as on NetBSD and OpenBSD).
Instead use RLIMIT_VMEM or RLIMIT_DATA if either is available, else don't try
to limit the memory the filter process can use.
Omega 1.0.10 (2008-12-23):
build system:
* This release now uses newer versions of the autotools (autoconf 2.62 ->
2.63; automake 1.10.1 -> 1.10.2). The newer autoconf fixes a regression
in autoconf 2.62 (and so Omega 1.0.7) with detecting the endian-ness of some
platforms.
Omega 1.0.9 (2008-10-31):
documentation:
* docs/overview.html: Document HTML parsing a bit, including robots
meta and htdig_noindex.
omega:
* omega: Catch std::exception and report what its what() method returns.
* omega: Remove undocumented and non-functional support for numeric sorting
via CGI parameter SORT=#<slot> (SORT=<slot> works as before).
build system:
* configure: Sync warning flag handling changes from xapian-core to eliminate
many warnings from GCC 4.3.
Omega 1.0.8 (2008-09-04):
documentation:
* Fix a few typos and improve wording in a few places.
indexers:
* omindex:
+ If the character encoding is specified using <meta http-equiv=...> in an
HTML document then reparse the document if it isn't the encoding we're
already using so that any preceding <title> is converted correctly