This repository has been archived by the owner on May 3, 2024. It is now read-only.
master
-------------------------

v1.3 (2023-05-02)
-------------------------
* Add WAND support to `by_term`.
* Allow storing WAND-specific data in the skip list.
* Add ability to use an external tick in `IndexWriter::Commit`.
* Refactor index writer to maintain its own index snapshot.
* Add ability to optionally cache columnstore data in memory.
* Switch to a 3-way comparator for primary sort.
* Make primary sort stable.
* Add ability to specify custom callback for debug assertions.
* Extend `term_reader` interface with the following methods:
  - `term_meta term_meta(bytes_view term)`
    Provides fast and efficient access to the term's metadata.
  - `size_t read_documents(bytes_view term, std::span<doc_id_t> docs)`
    Efficiently reads the first K documents associated with the given term
    into the specified span.
* Avoid writing empty sort entries.
* Add `CachingFSDirectory` and `CachingMMapDirectory`, which reduce
  the number of syscalls.
* Speed up checksum computation.
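
The 3-way comparator and stable primary sort entries above can be illustrated with a small standalone sketch. It does not use the IResearch API: `Doc`, `compare_keys` and `sort_by_primary_key` are hypothetical names chosen for illustration, and the key is assumed to be a plain string.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical document record: a primary sort key plus its insertion order.
struct Doc {
  std::string key;
  int insertion_order;
};

// 3-way comparator: negative if lhs < rhs, zero if equal, positive if lhs > rhs.
int compare_keys(std::string_view lhs, std::string_view rhs) {
  return lhs.compare(rhs);
}

// A stable sort keeps documents with equal keys in their original order.
void sort_by_primary_key(std::vector<Doc>& docs) {
  std::stable_sort(docs.begin(), docs.end(),
                   [](const Doc& a, const Doc& b) {
                     return compare_keys(a.key, b.key) < 0;
                   });
}
```

A single 3-way call answers "less", "equal" and "greater" at once, so a caller that needs to detect ties does not have to invoke a less-than comparator twice.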

v1.2 (2022-10-06)
-------------------------
* Allow specifying a progress report callback for
  `index_writer::commit`/`index_writer::start`.
* Allow getting offsets from phrase and ngram similarity queries.
* Allow specifying minimum number of matches for `by_terms` filter.
* Add `MinHashAnalyzer` capable of producing tokens composing a MinHash
  signature.
* Add `ByNestedFilter` capable of matching documents by nested query.
* Add ability to access previous document for columnstore2 iterators.
* Remove outdated `DECLARE_FACTORY_INLINE` macros.
* Add `MergeType::kMin` allowing to evaluate minimum sub-iterator score.
* Fix issue with losing an already flushed segment in case of failure during
  document insertion into the same segment.
* Force scorers to return scores as floating point numbers.
* Deprecate and remove `iql`.
* Use abseil as a submodule.
* Add `proxy_filter` for caching search results.
* Add ARM support.
* Speed up BM25 scorer.
* Move to C++20.
* Add `classification_stream` capable of classifying input data based on FastText NN model.
* Add `nearest_neighbors_stream` capable of generating synonyms based on FastText NN model.
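
For readers unfamiliar with the MinHash signature mentioned in the `MinHashAnalyzer` entry above, here is a minimal sketch of one common variant (bottom-k: keep the k smallest distinct token hashes). Whether `MinHashAnalyzer` uses this variant or this hash function is an assumption; `min_hash_signature` is a hypothetical helper, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Bottom-k MinHash sketch: returns the k smallest distinct hash values
// of the input tokens, in ascending order.
std::vector<std::size_t> min_hash_signature(
    const std::vector<std::string>& tokens, std::size_t k) {
  std::set<std::size_t> smallest;  // ordered and free of duplicates
  std::hash<std::string> hasher;   // assumption: any stable hash would do
  for (const auto& token : tokens) {
    smallest.insert(hasher(token));
    if (smallest.size() > k) {
      // Drop the largest value so that at most k values are retained.
      smallest.erase(std::prev(smallest.end()));
    }
  }
  return {smallest.begin(), smallest.end()};
}
```

The signature depends only on the set of tokens, not on their order or multiplicity, which is what makes signatures of similar documents comparable.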

v1.1 (2022-01-05)
-------------------------
* Add support for column headers to columnstore.
* Make column name and id a part of columnstore API.
* Deprecate `column_meta`, `column_meta_writer`, `column_meta_reader`.
* Fix possible race between file creation and directory cleaner.
* Fix invalid sorting order of stored features in presence of primary sort.
* Enhance troubleshooting experience for analyzers that require a locale.
* Eliminate dependency on Boost.Locale.
* Get rid of internal conversions during analysis. All text analyzers now expect UTF-8 encoded input.
* Fix threading related issues reported by TSAN.
* Add "object" locale parsing for the `collation_token_stream` to support defining locale variant
  and keywords.
* Get rid of `utf8_path` in favor of `std::filesystem::path`.
* Fix sporadic "error while reading compact" failures.
* Rework Compression API to return `std::unique_ptr` instead of `std::shared_ptr`.
* Rework Analyzer API to return `std::unique_ptr` instead of `std::shared_ptr`.
* Derive `null_token_stream`, `string_token_stream` and `numeric_token_stream`
  from `analysis::analyzer`.
* Rework iterators API to reduce number of heap allocations.
* Add new analyzer `collation` capable of producing tokens honoring language
  specific sorting.
* Add new feature `iresearch::Norm2` representing fixed length norm value.
* Split field features into built-in set of index features and pluggable field features.
* Add ability to specify target prefix for `by_edit_distance` query.
* Fix possible crash in `disjunction` inside `visit` of the already exhausted iterator.
* Fix block boundaries evaluation in `block_iterator` for `memory_directory` and `fs_directory`.
* Reduce number of heap allocations in `numeric_token_stream`.
* Replace RapidJSON with Velocypack for analyzers and scorers serialization and deserialization.
* Add new `1_4` segment format utilizing new columnstore and term dictionary index format.
* Add new columnstore implementation based on sparse bitset format.
* Add random access functionality to `data_input` for both regular reads and
  direct buffer access.
* Add `sparse_bitset_writer`/`sparse_bitset_iterator`, a fast and efficient on-disk
  format for storing sparse bit sets.
* Add a set of SIMD-based utils for encoding.
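
The move from `std::shared_ptr` to `std::unique_ptr` in the Compression and Analyzer API entries above follows a general C++ pattern, sketched below. The types and factories (`Analyzer`, `WhitespaceAnalyzer`, `make_analyzer`, `make_shared_analyzer`) are hypothetical stand-ins, not the library's actual classes or signatures.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical analyzer interface; not the IResearch `analysis::analyzer` class.
struct Analyzer {
  virtual ~Analyzer() = default;
  virtual std::string name() const = 0;
};

struct WhitespaceAnalyzer final : Analyzer {
  std::string name() const override { return "whitespace"; }
};

// Factory returning unique ownership: no reference-count control block is
// allocated unless the caller asks for shared ownership.
std::unique_ptr<Analyzer> make_analyzer() {
  return std::make_unique<WhitespaceAnalyzer>();
}

// Callers that do need shared ownership can convert explicitly.
std::shared_ptr<Analyzer> make_shared_analyzer() {
  return std::shared_ptr<Analyzer>(make_analyzer());
}
```

Returning `std::unique_ptr` is the less restrictive choice for a factory: it converts to `std::shared_ptr` when needed, while the reverse conversion is not possible.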

v1.0 (2021-06-14)
-------------------------
Initial release of the IResearch library.