Releases: tenzir/tenzir

VAST 2021.03.25-rc2

22 Mar 08:57
9a0b874
Pre-release

This is the second release candidate for the VAST release scheduled for 2021-03-25. Please take a look at our CHANGELOG for a detailed list of changes since the last release.

The following two bugs were fixed since the last release candidate:

  • Insufficient permissions for one of the paths in the schema-dirs option would lead to a crash in vast start. #1472
  • A race condition during server shutdown could lead to an invariant violation, resulting in a firing assertion. Streamlining the shutdown logic resolved the issue. #1473

VAST 2021.03.25-rc1

18 Mar 11:06
6f3565a
Pre-release

This is the first release candidate for the VAST release scheduled for 2021-03-25. Please take a look at our CHANGELOG for a detailed list of changes since the last release.

VAST 2021.02.24

24 Feb 09:15
a46945a

We’re happy to announce the monthly release 2021.02.24 of VAST. We’ve added experimental Sigma query support, and made enhancements to the type system for an improved user experience.

Type System and Schema Enhancements

Schema parsing now uses a 2-pass loading phase so that type aliases can reference other types that are later defined in the same directory. Additionally, type definitions from already parsed schema directories can be referenced from types that are parsed later. For example, this is now possible:

type foo = bar // previously an error because "bar" is not known up to this point
type bar = string

At startup, VAST sequentially reads all schema files in the schema directories and adds the type definitions cumulatively to one global namespace. Types may now be overridden in later directories, but a type must not be defined twice in the same directory. This removes the need to modify the schema files bundled with VAST. Rather than editing the bundled schema files in share/vast/schema, we strongly recommend creating copies in etc/vast/schema or ~/.config/vast/schema. Types defined there always override bundled types of the same name, so your changes do not break when a VAST update modifies the bundled schema files.
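The loading rules above can be illustrated with a small Python sketch. It is not VAST's implementation: the function names, the set of builtin types, and the representation of a directory as a list of (name, target) pairs are assumptions made for illustration.

```python
# Hypothetical model of the schema loading rules; not VAST's actual code.
BUILTINS = {"string", "count", "time", "addr", "subnet"}  # assumed subset


def load_schema_dirs(directories):
    """Load directories in order; each is a list of (alias, target) pairs."""
    types = {}  # one global namespace, filled cumulatively
    for declarations in directories:
        # Pass 1: collect every name declared in this directory.
        local = {}
        for name, target in declarations:
            if name in local:
                raise ValueError(f"type {name!r} defined twice in one directory")
            local[name] = target
        # Pass 2: references may point at builtins, at types from earlier
        # directories, or at types defined anywhere in this directory.
        visible = BUILTINS | set(types) | set(local)
        for name, target in local.items():
            if target not in visible:
                raise ValueError(f"type {name!r} references unknown {target!r}")
        # Later directories override earlier definitions of the same name.
        types.update(local)
    return types


def resolve(types, name):
    """Follow aliases until a builtin type is reached."""
    seen = set()
    while name not in BUILTINS:
        if name in seen:
            raise ValueError(f"cyclic alias involving {name!r}")
        seen.add(name)
        name = types[name]
    return name
```

With this sketch, `load_schema_dirs([[("foo", "bar"), ("bar", "string")]])` succeeds even though `foo` refers to `bar` before `bar` is defined, while declaring `bar` twice in the same list raises an error.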

The #timestamp attribute no longer has a special meaning in VAST. From this point forward, we provide the timestamp type alias to distinguish a timestamp from other time fields. Query expressions can replace the #timestamp extractor with a :timestamp type extractor to consider the same data, but a query for :time no longer matches timestamp fields. The #timestamp predicate is still operational but deprecated, and will be removed with the next release. For preexisting events that still contain #timestamp attributes, a mitigation ensures that :timestamp queries consider all relevant fields.

Internally, we have a type alias of this form in our base schema that ships with VAST: type timestamp = time. This is the first step towards more semantic types, where the underlying representation of a value doesn't change but where the type has a more specific meaning that is expressed by the name of the type alias. This is enabled by the type extractor in the expression language now working for user-defined types. Similar to timestamp, the type port is defined as type port = count in the base schema to enable queries like :port == 80.

Finally, a quality of life change makes it easier to search for IP addresses belonging to a specific subnet. Given a subnet S, the parser expands this to the expression :subnet == S || :addr in S.
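To make the semantics of the expanded expression concrete, here is a rough Python sketch that evaluates :subnet == S || :addr in S against a single event. The event representation (a dict of field names to Python ipaddress objects) and the function name are assumptions for illustration and do not reflect how VAST evaluates queries internally.

```python
import ipaddress

NETWORK_TYPES = (ipaddress.IPv4Network, ipaddress.IPv6Network)
ADDRESS_TYPES = (ipaddress.IPv4Address, ipaddress.IPv6Address)


def matches_subnet_predicate(event, subnet):
    """Return True if any field satisfies `:subnet == S || :addr in S`."""
    s = ipaddress.ip_network(subnet)
    for value in event.values():
        # Left-hand side of the disjunction: a subnet-typed field equal to S.
        if isinstance(value, NETWORK_TYPES) and value == s:
            return True
        # Right-hand side: an address-typed field contained in S.
        if isinstance(value, ADDRESS_TYPES) and value in s:
            return True
    return False
```

For example, an event with a field holding `ipaddress.ip_address("10.0.1.5")` matches the subnet `10.0.0.0/8` through the second branch, even though it has no subnet-typed field at all.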

Fast Parsing via simdjson Now Stable

The previously experimental simdjson-based JSON import is now considered stable. We thank Nicolai Grodzitski again for his outstanding work on the feature. The option vast.import.simdjson no longer works, as the feature is now always enabled.

Initial tests and customer feedback show a great increase in performance. We are working on a blog post with a detailed performance comparison between the old and new JSON readers—stay tuned!

Logger Improvements

Thanks to Harald Achitz, VAST’s logger now uses the feature-rich spdlog library for logging. This adds two new required dependencies to VAST: spdlog >= 1.5.0 and {fmt} >= 5.2.1.

The new logging framework rotates files by default, and comes with less overhead than the framework integrated with CAF. Check out the all new documentation page for configuring logging.

Sigma Support

VAST now has experimental support for executing Sigma rules. Instead of writing a query using the VAST expression language, you can now provide a YAML rule:

vast export json < sigma/rule.yaml

Under the hood, VAST automatically detects the different expression language and converts the detection attribute of the Sigma rule into a native expression. This works surprisingly well because the expressive power of Sigma is close to VAST’s language. As of today, there are still a few differences. For example, Sigma has no range inequalities (<, <=, >=, >) and no type system to differentiate between IPs and strings, whereas VAST has no support for case-insensitive search, regular expressions, and complex aggregations. Please consult our documentation for a more detailed discussion.

By supporting Sigma natively in VAST, we commit to the open-source spirit in the security domain and embrace the endpoint detection community.

Acknowledgements

We want to thank our open-source community for numerous contributions to VAST. This month, we received external contributions from @ngrodzitski, @a4z, and @JoeLoser.

Changelog Highlights

As always, you can find the full scoop in our changelog.

⚡️ Breaking Changes

  • The previously deprecated options vast.spawn.importer.ids and vast.schema-paths no longer work. Furthermore, queries spread over multiple arguments are now disallowed instead of triggering a deprecation warning. #1374
  • The special meaning of the #timestamp attribute has been removed from the schema language. Timestamps can from now on be marked as such by using the timestamp type instead. Queries of the form #timestamp <op> value remain operational but are deprecated in favor of :timestamp. Note that this change also affects :time queries, which aren't supersets of #timestamp queries any longer. #1388
  • All options in vast.metrics.* had underscores in their names replaced with dashes to align with other options. For example, vast.metrics.file_sink is now vast.metrics.file-sink. The old options no longer work. #1368
  • User-supplied schema files are now picked up from <SYSCONFDIR>/vast/schema and <XDG_CONFIG_HOME>/vast/schema instead of <XDG_DATA_HOME>/vast/schema. #1372
  • VAST now requires {fmt} >= 5.2.1 to be installed. #1330
  • VAST switched to spdlog >= 1.5.0 for logging. For users, this means: The vast.console-format and vast.file-format options must now be specified using the spdlog pattern syntax as described here. All settings under caf.logger.* are now ignored by VAST, and only the vast.* counterparts are used for logger configuration. #1223 #1328 #1334 #1390 @a4z

⚠️ Changes

  • The query normalizer interprets value predicates of type subnet more broadly: given a subnet S, the parser expands this to the expression :subnet == S || :addr in S. This change makes it easier to search for IP addresses belonging to a specific subnet. #1373
  • The options listen, read, schema, schema-file, type, and uds can from now on be supplied to the import command directly. Similarly, the options write and uds can be supplied to the export command. All options can still be used after the format subcommand, but that usage is deprecated. #1354
  • Schema parsing now uses a 2-pass loading phase so that type aliases can reference other types that are later defined in the same directory. Additionally, type definitions from already parsed schema dirs can be referenced from schema types that are parsed later. Types can also be redefined in later directories, but a type cannot be defined twice in the same directory. #1331

🧬 Experimental Features

  • Sigma rules are now a valid format to represent query expressions. VAST parses the detection attribute of a rule and translates it into a native query expression. To run a query using a Sigma rule, pass it on standard input, e.g., vast export json < rule.yaml. #1379

🎁 Features

  • The type extractor in the expression language now works with user-defined types. For example, the type port is defined as type port = count in the base schema. This type can now be queried with an expression like :port == 80. #1382
  • The new options vast.metrics.file-sink.real-time and vast.metrics.uds-sink.real-time enable real-time metrics reporting for the file sink and UDS sink respectively. #1368
  • The meta index now stores partition synopses in separate files. This will decrease restart times for systems with large databases, slow disks and aggressive readahead settings. A new config setting vast.meta-index-dir allows storing the meta index information in a separate directory. #1330 #1376
  • The JSON import now always relies upon [simdjson](https...

VAST 2021.01.28

28 Jan 09:01
9db6f36

We’re happy to announce the monthly release 2021.01.28. This year begins with some exciting changes. As an open-source telemetry engine, VAST now doubles down on the open platform character with a new plugin framework with multiple customization points.

VAST’s new plugin framework makes it possible to ship third-party extensions—in source or binary—along with an existing VAST deployment. We use this new functionality as well for our closed-source add-ons, e.g., live threat intel matching and NetFlow parsing. We believe having a fully open platform with extension points for custom functionality is the sweet-spot for an open-core business model.

We were also able to increase JSON parsing performance by 5x by switching to a SIMD-based implementation. Our Docker build now relies on BuildKit to support optional layers, and images are based on Debian Buster. Multiple bug fixes and robustness improvements also made it into this release. Enjoy.

Plugin Framework

VAST now offers an experimental plugin framework to support efficient customization points at various places of the data processing pipeline. There exist several base classes that define an interface, such as adding a new command or spawning a new actor that processes the incoming stream of data.

The documentation page gives an overview of the plugin framework, which is still under active development.

JSON improvements

VAST now natively supports Zeek logs as line-delimited JSON objects as produced by the json-streaming-logs package via the vast import zeek-json command.

Thanks to @ngrodzitski, VAST now has experimental support for relying on simdjson for parsing JSON objects. This brings substantial gains in throughput, and shifts the bottleneck of the ingest path from parsing input to indexing at the node. To use the feature, add the --simdjson flag to the following import commands: json, suricata, and zeek-json. We will stabilize this feature in the near future and make it the default option, replacing our legacy NDJSON parser entirely.

Additionally, VAST no longer flattens imported data that contains nested records on ingestion. This is most noticeable with imports in the JSON format, but actually applies to all formats under the hood. This means that VAST now fully preserves nested JSON objects and exports them in the same structure as they were ingested. To restore the old export behavior, use vast export json --flatten.
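The difference between the nested and the flattened representation can be sketched in a few lines of Python. This only illustrates the idea; the function name and the choice of "." as the separator for joined field names are assumptions, not a description of what vast export json --flatten does internally.

```python
def flatten(record, prefix="", sep="."):
    """Collapse nested records into one level with joined field names."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested records, carrying the joined name along.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat
```

Applied to `{"id": {"orig_h": "1.2.3.4", "resp_p": 53}, "proto": "udp"}`, the sketch yields a single-level record whose keys are `id.orig_h`, `id.resp_p`, and `proto`.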

Changelog Highlights

As always, you can find the full technical scoop in our changelog.

⚡️ Breaking Changes

  • The GitHub CI changed to Debian Buster and produces Debian artifacts instead of Ubuntu artifacts. Similarly, the Docker images we provide on dockerhub use Debian Buster as base image. To build Docker images locally, users must set DOCKER_BUILDKIT=1 in the build environment. #1294

  • The new short options -v, -vv, -vvv, -q, -qq, and -qqq map onto the existing verbosity levels. The existing short syntax, e.g., -v debug, no longer works. #1244

⚠️ Changes

  • The option vast.schema-paths is renamed to vast.schema-dirs. The old option is deprecated and will be removed in a future release. #1287

  • VAST preserves nested JSON objects in events instead of formatting them in a flattened form when exporting data with vast export json. The old behavior can be enabled with vast export json --flatten. #1257 #1289

🧬 Experimental Features

  • VAST relies on simdjson for JSON parsing. The substantial gains in throughput shift the bottleneck of the ingest path from parsing input to indexing at the node. To use the (still experimental) feature, use vast import json|suricata|zeek-json --simdjson. #1230 #1246 #1281 #1314 #1315 @ngrodzitski

  • VAST features a new plugin framework to support efficient customization points at various places of the data processing pipeline. There exist several base classes that define an interface, e.g., for adding new commands or spawning a new actor that processes the incoming stream of data. The directory examples/plugins/example contains an example plugin. #1208 #1264 #1275 #1282 #1285 #1287 #1302 #1307 #1316

🎁 Features

  • The output of vast status contains detailed memory usage information about active and cached partitions. #1297

  • The new import zeek-json command allows for importing line-delimited Zeek JSON logs as produced by the json-streaming-logs package. Unlike stock Zeek JSON logs, where one file contains exactly one log type, the streaming format contains different log event types in a single stream and uses an additional _path field to disambiguate the log type. For stock Zeek JSON logs, use the existing import json with the -t flag to specify the log type. #1259
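As a rough illustration of how a mixed stream can be disambiguated through the _path field, consider the following Python sketch. It is not the reader that VAST uses; the function name and the "zeek." prefix for the resulting type names are assumptions.

```python
import json


def split_by_path(lines):
    """Group line-delimited Zeek JSON events by the log type in `_path`."""
    by_type = {}
    for line in lines:
        event = json.loads(line)
        # The streaming format tags every event with its log type.
        log_type = event.pop("_path")
        by_type.setdefault(f"zeek.{log_type}", []).append(event)
    return by_type
```

Feeding it one conn event and one dns event from the same stream produces two groups, `zeek.conn` and `zeek.dns`, each holding the event without its `_path` field.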

🐞 Bug Fixes

  • Disk monitor quota settings not ending in a 'B' are no longer silently discarded. #1278

  • Values in JSON fields that can't be converted to the type that is specified in the schema won't cause the containing event to be dropped any longer. #1250

  • Manually specified configuration files may reside in the default location directories. Configuration files can be symlinked. #1248

VAST 2020.12.16

16 Dec 20:54
1380a1b

We're happy to announce the monthly release 2020.12.16 of VAST. To ship a release right before Christmas, we skipped the VAST release in November in favor of a now 50% larger release, packed with a bunch of features and fixes.

Big thanks go out to Andreas Herz and Sascha Steinbiss from DCSO for providing invaluable feature feedback, performance numbers, and moving VAST closer towards general availability in Debian buster.

Learnings from Production-grade Deployments

For the months of November and December, we focused on bringing VAST into a production-ready state. The bulk of the changes covers performance improvements, stability improvements, and deployment streamlining.

FlatBuffers Table Slices

Table slices, VAST's internal representation of batches of events, have received a major overhaul. We previously refactored the definition of persistent state as FlatBuffers, and in this release we continue to push the "builder pattern": this means that we create the data in a well-defined binary layout via FlatBuffers to later enjoy the benefits of direct memory-mapping at query time. VAST's store currently defines a Feather-like on-disk format, concatenating table slices into segments. At query time, we memory-map them and have random access to the data, thanks to Arrow and FlatBuffers. Additionally, this release enables versioning of table slice encodings: we can now update existing or add new encodings without introducing breaking changes.

Data Model Streamlining

The port type is no longer a first-class type. The new way to represent transport-layer ports relies on the basic type count instead. In the schema, VAST ships with a new alias type port = count to keep existing schema definitions intact. This marks the first step in an effort towards adding more semantics to types via composition and aliasing. In the medium term, we plan to roll out more domain-specific aliases to improve domain-specific reasoning.

However, this is a breaking change because the on-disk format and Arrow data representation changed. Queries with :port type extractors no longer work. Similarly, the syntax 53/udp no longer exists; use count syntax 53 instead. Since most port occurrences do not carry a known transport-layer type, and the type information exists typically in a separate field, removing port as native type streamlines the data model. You will be able to query all fields of type port in the future again once type aliases can be queried using the :T syntax.

The type registry now correctly handles changes in schemas that are not backwards compatible, i.e., renamed fields or changed types of existing fields, and warns when detecting such a change.

Import processes now always use the most recent version of a type that is available, and no longer require the server process to restart so that new versions of types are picked up. This makes for a much smoother experience in the presence of schema evolution.

Index Stability and Performance

For this release, we focused on ironing out issues that came to light during our tests in preparation for upcoming large-scale deployments. In these tests, we ran VAST for several days or weeks, importing tens of thousands of events per second of Suricata data. As expected, we discovered several issues after pushing past the limits we can test continuously during development.

We observed excessive memory usage, growing up to hundreds of gigabytes of RAM for databases in the multi-terabyte range. Not only was providing a machine with this amount of memory a challenge, but this also caused near hour-long restart times, since the meta index has to be rebuilt on every restart.

We reduced the overall memory usage of bloom filters in meta index synopses by introducing an additional buffering step and rewriting our bloom filters with optimal parameters when finalizing partitions. As it turns out, most of our bloom filters were very pessimistically sized, so this reduced the startup time, and cut overall memory usage by up to 95%!
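To see why pessimistic sizing is so costly, it helps to look at the textbook formulas for the size of a bloom filter: for n elements and a target false-positive probability p, the optimal number of bits is m = -n ln(p) / (ln 2)^2 and the optimal number of hash functions is k = (m / n) ln 2. The Python sketch below applies these standard formulas. The function name is hypothetical, and the source does not state how VAST parameterizes its filters, so treat this as background rather than as VAST's implementation.

```python
import math


def optimal_bloom_parameters(n, p):
    """Return (bits, hash_functions) for n elements at false-positive rate p."""
    if n <= 0 or not 0 < p < 1:
        raise ValueError("need n > 0 and 0 < p < 1")
    # Standard result: m = -n * ln(p) / (ln 2)^2
    m = math.ceil(-n * math.log(p) / (math.log(2) ** 2))
    # Standard result: k = (m / n) * ln 2, at least one hash function
    k = max(1, round(m / n * math.log(2)))
    return m, k
```

Because m grows linearly with n, a filter that was sized up front for far more elements than a partition ends up holding wastes memory in the same proportion. Re-deriving the parameters from the actual element count once a partition is finalized avoids that waste.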

On the export side, the index again proved to be a source of trouble: Running too many parallel queries could crash the server process. False positives from the meta index bloom filter caused index workers to stop working on queries, which deadlocked the index. Queries for a limited number of events did not always correctly drop further results when the query finished early, leaving around zombie index workers that slowed down the whole system.

In addition to fixing all of the above, we also introduced a new string synopsis in the meta index and reworked the logic to pre-select the number of partitions, making string (in)equality queries up to 30x faster.

Taxonomies Update

The past release introduced concepts. This release rounds off the taxonomy specification with models. Taxonomies offer a unified access layer to represent domain knowledge. Concepts abstract away the naming differences of different data formats, and models now make it possible to define domain-specific entities that are tuples, such as a network connection. For example, you can now query for network connections like this:

net.connection == <1.2.3.4, _, 4.3.2.1, _, _>

The model net.connection defines a 5-tuple of a given format. The query translates into a product of concepts, each of which resolves to the format-specific fields in a recursive process.

To implement this change, we needed a new “meta query” capability: we added a new attribute extractor called #field that matches on the name of a field. For example, #field == "src_ip" returns all events whose layout contains a record field named src_ip. The model resolution process uses this extractor internally, but it is now available for general use as well.
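The changelog entry for this feature notes that #field uses suffix matching for field names at the layout level. A rough Python sketch of such a match follows. The function names are hypothetical, and the decision to match only at "." boundaries (so that src_ip does not match my_src_ip) is an assumption about the intended semantics.

```python
def field_matches(field_name, query, sep="."):
    """Suffix-match a query against a fully qualified field name."""
    # Assumption: a match must start at a component boundary.
    return field_name == query or field_name.endswith(sep + query)


def layout_matches(layout, query):
    """Return True if any field in the layout matches the query."""
    return any(field_matches(name, query) for name in layout)
```

Under these assumptions, a layout containing `suricata.flow.src_ip` matches the query `src_ip` as well as the longer query `flow.src_ip`.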

To simplify taxonomy introspection of a running VAST instance, the new vast dump [concepts|models] command prints a list of registered concepts and models. See vast dump help for more information.

Changelog Highlights

As always, you can find the full technical scoop in our changelog.

⚡️ Breaking Changes

  • The on-disk format for table slices now supports versioning of table slice encodings. This breaking change makes it so that adding further encodings or adding new versions of existing encodings is possible without breaking again in the future. #1143 #1157 #1160 #1165
  • CAF-encoded table slices no longer exist. As such, the option vast.import.batch-encoding now only supports arrow and msgpack as arguments. #1142
  • The port type is no longer a first-class type. The new way to represent transport-layer ports relies on count instead. In the schema, VAST ships with a new alias type port = count to keep existing schema definitions intact. However, this is a breaking change because the on-disk format and Arrow data representation changed. Queries with :port type extractors no longer work. Similarly, the syntax 53/udp no longer exists; use count syntax 53 instead. Since most port occurrences do not carry a known transport-layer type, and the type information exists typically in a separate field, removing port as native type streamlines the data model. #1187
  • The build configuration of VAST received a major overhaul. Inclusion of libvast in other projects via add_subdirectory(path/to/vast) is now easily possible. The names of all build options were aligned, and the new build summary shows all available options. #1175

⚠️ Changes

  • Installed schema definitions now reside in <datadir>/vast/schema/types, taxonomy definitions in <datadir>/vast/schema/taxonomy, and concept definitions in <datadir>/vast/schema/concepts, as opposed to them all being in the schema directory directly. When overriding an existing installation, you may have to delete the old schema definitions by hand. #1194
  • The Suricata schemas received an overhaul: there now exist vlan and in_iface fields in all types. In addition, VAST ships with new types for ikev2, nfs, snmp, tftp, rdp, sip and dcerpc. The tls type gets support for the additional sni and session_resumed fields. #1237 #1176 #1180 #1186 @satta
  • VAST does not produce metrics by default any more. The option --disable-metrics has been renamed to --enable-metrics accordingly. #1137
  • VAST now listens on port 42000 instead of letting the operating system choose the port if the option vast.endpoint specifies an endpoint without a port. To restore the old behavior, set the port to 0 explicitly. #1170

🧬 Experimental Features

  • The expression language gained support for the #field meta extractor. It is the complement for #type and uses suffix matching for field names at the layout level. #1228
  • The query language now supports models. Models combine a list of concepts into a semantic unit that can be fulfilled by an event. If the type of an event contains a field for every concept in a mode...

VAST 2020.10.29

29 Oct 09:02
2020.10.29
b1f7367

We're happy to announce the monthly release 2020.10.29 of VAST.

Taxonomies

This release includes an exciting experimental feature to deliver a scalable user experience as the number of different data formats in VAST keeps increasing: taxonomies. With this feature, you can now define your own unified access layer to consolidate syntactic differences of the various data formats at play. By using a taxonomy, you establish a semantic frame over the domain you analyze. Thereafter you can write queries "in your own bubble" without having to juggle the various naming schemes of each individual data source.

Consider the scenario of having two types of network security logs: Sysmon from the endpoint and Suricata on the network side. When you want to query all flow events to a particular destination in both formats, you would combine two predicates as follows:

suricata.flow.src_ip == 6.6.6.6 || sysmon.NetworkConnection.SourceIp == 6.6.6.6

But what you actually want to write is:

source_ip == 6.6.6.6

Thanks to taxonomies, this is now possible. While the expert knows the semantics of the format-specific field names, memorizing the mapping from the meaning of a field to its name is not something we want to burden the user with. It’s inefficient, error-prone, and does not scale with dozens of different data formats.

We developed two building blocks: this release introduces concepts and models will follow next. Together, they enable abstraction and composition of data semantics. Please consult our documentation of taxonomies to learn more.
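Conceptually, a concept maps one name to a set of format-specific fields, and a predicate on the concept expands into a disjunction over those fields. The following Python sketch mimics this with the source_ip example from above. The dictionary layout and the function name are assumptions made for illustration; the actual taxonomy format is described in the documentation.

```python
# Hypothetical in-memory concept table; real definitions live in schema files.
CONCEPTS = {
    "source_ip": [
        "suricata.flow.src_ip",
        "sysmon.NetworkConnection.SourceIp",
    ],
}


def resolve_predicate(lhs, op, rhs, concepts=CONCEPTS):
    """Expand a predicate on a concept into a disjunction over its fields."""
    # Names that are not concepts pass through unchanged.
    fields = concepts.get(lhs, [lhs])
    return " || ".join(f"{field} {op} {rhs}" for field in fields)
```

Calling `resolve_predicate("source_ip", "==", "6.6.6.6")` reproduces the longer, format-specific query shown earlier in this section.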

Experimental Feature: Age Rotation of Old Data

As more and more data is ingested into VAST, more and more disk space is required to store it. Because VAST is a telemetry database, new data typically arrives continuously, leading to a linear increase in disk space usage over time.

To help control this tide of data, we introduced a new experimental age rotation feature to VAST: Operators can now specify a disk budget, and when the size of the database exceeds the budget, old data will be deleted.

With this, operators are able to decide on the desired retention period for their data and allocate the appropriate amount of disk space once, without having to clean up their disks by hand over and over again.

The main user interface for the new age rotation is the pair of config options vast.start.disk-budget-high and -low, which can also be specified as command-line flags to vast start. These define a corridor for the amount of disk space to be used for the database directory.
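One plausible reading of such a corridor is a hysteresis: nothing happens until usage exceeds the high mark, and then the oldest data is removed until usage drops to the low mark. The Python sketch below models that reading. It is an assumption about the behavior, not VAST's disk monitor, and the function name and the (timestamp, size) representation of stored data are hypothetical.

```python
def rotate(partitions, high, low):
    """Return (kept, deleted) for a list of (timestamp, size_in_bytes) pairs."""
    total = sum(size for _, size in partitions)
    if total <= high:
        # Within budget: leave everything untouched.
        return list(partitions), []
    kept = sorted(partitions)  # oldest timestamp first
    deleted = []
    # Assumed behavior: erase oldest data until usage is back at the low mark.
    while kept and total > low:
        oldest = kept.pop(0)
        deleted.append(oldest)
        total -= oldest[1]
    return kept, deleted
```

The gap between the two marks determines how often a cleanup runs: a wide corridor means rare but large deletions, a narrow one means frequent small ones.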

Increased Source Responsiveness

VAST's ingest path has to process up to hundreds of thousands of log lines per second. To cope with this volume, we use both batching and backpressure. However, the sheer number of messages that VAST has to process could result in overload, which manifested simply as an unresponsive component. For example, when an input source under high load needs to reply to a vast status request, the source often fails to come back with a reply in the given timeout (10s by default). Interestingly, the other extreme could also cause a timeout: when a source did not receive enough events, it was unable to yield back to the scheduler and thereby "got stuck" on the inbound path, unable to handle any other form of interaction.

We fixed the issue by rewriting the control logic that handles CAF actor streams. The input sources can now handle large message volumes just fine while also remaining responsive and resource-efficient when the data path idles. To control the new behavior, the new option import.read-timeout can set an input timeout for low-volume sources. Reaching the timeout forwards the current batch immediately. Previously, the option import.batch-timeout controlled this behavior, which now only controls the maximum buffer time before the source forwards batches to the server.

Changelog Highlights

As always, you can find the full technical scoop of what changed in our changelog.

🧬 Experimental Features

  • The query language now comes with support for concepts, the first part of taxonomies. Concepts are a mechanism to unify the various naming schemes of different data formats into a single, coherent nomenclature. #1102
  • A new disk monitor component can now monitor the database size and delete data that exceeds a specified threshold. Once VAST reaches the maximum amount of disk space, the disk monitor deletes the oldest data. The command-line options --disk-quota-high, --disk-quota-low, and --disk-quota-check-interval control the rotation behavior. #1103

🎁 Features

  • The new options vast.segments and vast.max-segment-size control how the archive generates segments. #1103
  • When running VAST under systemd supervision, it is now possible to use the Type=notify directive in the unit file to let VAST notify the service manager when it becomes ready. #1091
  • The new script splunk-to-vast converts a Splunk CIM model file in JSON to a VAST taxonomy. For example, splunk-to-vast < Network_Traffic.json renders the concept definitions for the Network Traffic datamodel. The generated taxonomy does not include field definitions, which users should add separately according to their data formats. #1121

⚠️ Changes

  • The new option import.read-timeout allows for setting an input timeout for low volume sources. Reaching the timeout causes the current batch to be forwarded immediately. This behavior was previously controlled by import.batch-timeout, which now only controls the maximum buffer time before the source forwards batches to the server. #1096
  • VAST will now warn if a client command connects to a server that runs on a different version of the vast binary. #1098
  • The default database directory moved to /var/lib/vast for Linux deployments. #1116
  • Log files are now less verbose because class and function names are not printed on every line. #1107

🐞 Bug Fixes

  • The vast status --detailed command now correctly shows the status of all sources, i.e., vast import or vast spawn source commands. #1109
  • Sources that receive no or very little input do not block vast status any longer. #1096
  • VAST no longer opens a random public port, which used to be enabled in the experimental VAST cluster mode in order to transparently establish a full mesh. #1110
  • The lookup for schema directories now happens in a fixed order. #1086
  • The lsvast tool failed to print FlatBuffers schemas correctly. The output now renders correctly. #1123

VAST 2020.09.30

30 Sep 13:17
17fc510

We’re happy to announce the monthly release 2020.09.30 of VAST.

YAML Config

The VAST configuration file received a makeover: it now uses YAML syntax, the ops-friendly and industry standard. We ensured that the configuration and command line behave exactly the same by aligning the CLI hierarchy with the config file structure. VAST now looks for a vast.yaml configuration file instead of vast.conf. Every installation of VAST ships with a vast.yaml.example file that illustrates the new layout and serves as a reference for documentation options.

During startup, VAST looks for configuration files in the following places, and merges their content with the more specific files taking a higher precedence:

  • <sysconfdir>/vast/vast.yaml for system-wide configuration, where <sysconfdir> is the platform-specific directory for configuration files, e.g., /etc/vast.
  • ~/.config/vast/vast.yaml for user-specific configuration. VAST respects the XDG base directory specification and its environment variables.
  • The command line option --config=path/to/vast.yaml.
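The precedence rule can be pictured as a recursive merge in which later, more specific sources win. The Python sketch below shows one way to model this with plain dictories standing in for parsed YAML files. Merging at the level of individual keys is an assumption about the behavior, the function name is hypothetical, and the option values are made up for the example.

```python
def merge(base, override):
    """Recursively merge two config dicts; values in `override` win."""
    result = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are sections: merge them key by key.
            result[key] = merge(result[key], value)
        else:
            # Scalars and new keys: the more specific source takes over.
            result[key] = value
    return result


# Stand-ins for the parsed system-wide and user-specific files.
system_wide = {"vast": {"endpoint": "localhost:42000",
                        "import": {"batch-encoding": "arrow"}}}
user_specific = {"vast": {"import": {"batch-encoding": "msgpack"}}}

effective = merge(system_wide, user_specific)
```

In this model, `effective` keeps the endpoint from the system-wide file and takes the batch encoding from the user-specific one. A file passed via --config would be merged last in the same way.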

The top-level configuration file section vast bundles all options affecting VAST. Similarly, the top-level section caf contains all options that affect the underlying actor system framework CAF directly, allowing for more complex and sophisticated configurations.

Adding YAML support resulted in a new dependency for VAST: yaml-cpp (≥0.6.2). This robust library provides a YAML 1.2 spec-compliant parser and printer, plus it enjoys wide availability on most platforms and package managers.

Index Optimizations

The layout of the on-disk data structures used for the index has changed. VAST divides the index state into horizontal partitions (aka. shards). Instead of creating one file per record field per partition, the index now creates only a single file per partition and dynamically maps the required parts into memory. Additionally, VAST no longer relies on the binary serialization protocol of CAF. Instead, a new FlatBuffers framing with better state versioning enables a reliable upgrade path when the on-disk format changes.

Moreover, VAST used to periodically re-write the whole state of the meta index to disk into a separate file. The rationale was that the contents of the meta index are much smaller than the contents of the index. However, for large databases even the much smaller meta index can grow to a size where this can disrupt disk I/O and slow down the indexing process. To prevent that, we’ve split up the information contained in the meta index and distributed it over all partitions, so every write is now limited to the incremental state since the previous partition.

Because I/O is such a delicate topic in data-intensive applications that must keep up with high-volume data sources, we also added a new asynchronous I/O abstraction to avoid blocking threads when they don’t have to. We’ve added a new filesystem actor that centralizes I/O operations, such as reads and writes. A nice side effect is that it makes it dead simple to support new filesystems in the future, e.g., HDFS or S3, by merely adding a new actor implementation that adheres to the same type-safe messaging API.
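
The idea of one interface with interchangeable backends can be illustrated with a small Python analogy. This sketch uses asyncio rather than CAF actors, so it is not how VAST implements its filesystem actor. The names Filesystem, LocalFilesystem, and roundtrip are hypothetical, and asyncio.to_thread requires Python 3.9 or newer.

```python
import asyncio
import tempfile
from pathlib import Path
from typing import Protocol


class Filesystem(Protocol):
    """Hypothetical interface that every filesystem backend would implement."""

    async def write(self, path: str, data: bytes) -> None: ...

    async def read(self, path: str) -> bytes: ...


class LocalFilesystem:
    """Backend for the local disk; blocking calls run in a worker thread."""

    def __init__(self, root: str) -> None:
        self._root = Path(root)

    async def write(self, path: str, data: bytes) -> None:
        await asyncio.to_thread((self._root / path).write_bytes, data)

    async def read(self, path: str) -> bytes:
        return await asyncio.to_thread((self._root / path).read_bytes)


async def roundtrip(fs: Filesystem, path: str, data: bytes) -> bytes:
    """Callers depend only on the interface, not on the backend."""
    await fs.write(path, data)
    return await fs.read(path)


with tempfile.TemporaryDirectory() as root:
    result = asyncio.run(roundtrip(LocalFilesystem(root), "partition.bin", b"state"))
print(result)
```

A backend for another storage system would implement the same two methods, and roundtrip would work with it unchanged.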

Better Introspection

We re-designed the output of the vast status command in a push for a better user experience. vast status now shows information about the system, grouped by its major components. By adding more flags, the command shows more details: vast status --detailed offers slightly more context, and --debug exposes a lot of internal state that is well-suited for developers.

Smaller Things

  • The new vast get <id> [ids...] command enables direct queries to the archive.
  • The JSON export format now renders the VAST duration and port as strings instead of numbers.
  • A new utility lsvast now ships with every VAST installation. It allows for inspecting the contents of the VAST database without running VAST.

Changelog Highlights

As always, you can find the full technical scoop of what changed in our changelog.

🎁 Features

  • The output of the status command was restructured with a strong focus on usability. The new flags --detailed and --debug add additional content to the output. #995
  • VAST now merges the contents of all used configuration files instead of using only the most user-specific file. The file specified using --config takes the highest precedence, followed by the user-specific path ${XDG_CONFIG_HOME:-${HOME}/.config}/vast/vast.yaml, and the compile-time path <sysconfdir>/vast/vast.yaml. #1040
  • VAST now ships with a new tool lsvast to display information about the contents of a VAST database directory. See lsvast --help for usage instructions. #863
  • VAST now supports the XDG base directory specification: The vast.yaml is now found at ${XDG_CONFIG_HOME:-${HOME}/.config}/vast/vast.yaml, and schema files at ${XDG_DATA_HOME:-${HOME}/.local/share}/vast/schema/. The user-specific configuration file takes precedence over the global configuration file in <sysconfdir>/vast/vast.yaml. #1036
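
The lookup paths in the list above follow the shell pattern ${VAR:-default}, which falls back to the default when the variable is unset or empty. A minimal Python sketch of that resolution follows. The function name config_candidates is hypothetical, and the default of /etc for <sysconfdir> is an assumption, since the real value is fixed at compile time.

```python
from pathlib import PurePosixPath


def config_candidates(env, sysconfdir="/etc", cli_config=None):
    """Return config file paths ordered from lowest to highest precedence."""
    home = env.get("HOME", "")
    # Mirrors ${XDG_CONFIG_HOME:-${HOME}/.config}: unset or empty uses the default.
    xdg_config_home = env.get("XDG_CONFIG_HOME") or f"{home}/.config"
    paths = [
        PurePosixPath(sysconfdir) / "vast" / "vast.yaml",
        PurePosixPath(xdg_config_home) / "vast" / "vast.yaml",
    ]
    if cli_config is not None:
        paths.append(PurePosixPath(cli_config))
    return [str(path) for path in paths]


candidates = config_candidates({"HOME": "/home/alice"})
print(candidates)
```

Passing os.environ as env resolves the paths for the current user. The last entry in the returned list is the one that wins on conflicting options.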

🧬 Experimental Features

  • The vast get command has been added. It retrieves events from the database directly by their IDs. #938

⚠️ Changes

  • All configuration options are now grouped into vast and caf sections, depending on whether they affect VAST itself or are handed through to the underlying actor framework CAF directly. Take a look at the bundled vast.yaml.example file for an explanation of the new layout. #1073
  • Data exported in the Apache Arrow format now contains the name of the payload record type in the metadata section of the schema. #1072
  • The JSON export format now renders duration and port fields using strings as opposed to numbers. This avoids a possible loss of information and enables users to re-use the output in follow-up queries directly. #1034
  • The delay between the periodic log messages for reporting the current event rates has been increased to 10 seconds. #1035
  • The global VAST configuration now always resides in <sysconfdir>/vast/vast.yaml, and bundled schemas always in <datadir>/vast/schema/. VAST no longer supports reading a configuration file in the current working directory. #1036
  • The options that affect batches in the import command received new, more user-facing names: import.table-slice-type, import.table-slice-size, and import.read-timeout are now called import.batch-encoding, import.batch-size, and import.batch-timeout respectively. #1058
  • The persistent storage format of the index now uses FlatBuffers. #863
  • The proprietary VAST configuration file format has changed to the more ops-friendly industry standard YAML. This change also introduced a new dependency: yaml-cpp version 0.6.2 or greater. The top-level vast.yaml.example illustrates what the new YAML config looks like. Please rename existing configuration files from vast.conf to vast.yaml. VAST still reads vast.conf but will soon only look for vast.yaml or vast.yml files in available configuration file paths. #1045 #1055 #1059 #1062
  • We refactored the index architecture to improve stability and responsiveness. This includes fixes for several shutdown issues. #863
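
The import option renames listed above can be applied mechanically to existing settings. The following Python sketch assumes the options are available as a flat dictionary keyed by dotted option names, which is a simplification for illustration. The helper rename_options is hypothetical, and the example values are made up.

```python
# Old option name -> new option name, as listed in the changes above.
RENAMED_OPTIONS = {
    "import.table-slice-type": "import.batch-encoding",
    "import.table-slice-size": "import.batch-size",
    "import.read-timeout": "import.batch-timeout",
}


def rename_options(options):
    """Return a copy of options with the old import option names replaced."""
    return {RENAMED_OPTIONS.get(name, name): value for name, value in options.items()}


old_options = {"import.table-slice-size": 1000, "import.read-timeout": "10s"}
new_options = rename_options(old_options)
print(new_options)
```

Options that are not part of the rename pass through unchanged.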

🐞 Bug Fixes

  • Stalled sources that were unable to generate new events no longer stop import processes from shutting down under rare circumstances. #1058

VAST 2020.08.28

28 Aug 09:07
b605059

We’re happy to announce the monthly release 2020.08.28 of our stack.

Robustness and State Recovery

We found several bugs during the shutdown process of a VAST server process, which could have caused an unresponsive process and potential loss of state. VAST now uses a multi-stage procedure to terminate itself: it first attempts to shut down all components cleanly, then falls back to a hard kill, and if that fails after another timeout, the process calls abort(3).

In stress testing, we identified and fixed issues with operating VAST under high load: For large database directories, a partial read during startup corrupted the index state. We fixed both the reading behavior that led to partial reads and the possible corruption. An overflow in CAF's stream slot identifiers could deadlock the system. We deployed a workaround and have proposed a proper fix upstream.

To avoid multiple VAST processes accessing the same database directory, VAST now atomically creates a PID lock file in the database directory on startup. This ensures that at most one VAST server process can operate the persistent state.
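
Atomic creation of a lock file is commonly done with the O_CREAT and O_EXCL flags, which make the open call fail if the file already exists. The Python sketch below illustrates that general technique. It is an assumption that VAST works this way internally; the source only states that the file is created atomically and is named pid.lock. The function name acquire_pid_lock is hypothetical, and handling of stale lock files is omitted.

```python
import os
import tempfile


def acquire_pid_lock(db_dir):
    """Create <db_dir>/pid.lock atomically; return False if it already exists."""
    lock_path = os.path.join(db_dir, "pid.lock")
    try:
        # O_CREAT | O_EXCL fails when the file is already present,
        # so two processes cannot both succeed.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as lock_file:
        lock_file.write(f"{os.getpid()}\n")
    return True


with tempfile.TemporaryDirectory() as db_dir:
    first = acquire_pid_lock(db_dir)   # no lock file yet
    second = acquire_pid_lock(db_dir)  # lock file already exists
print(first, second)  # prints "True False"
```

Because the existence check and the creation happen in a single system call, there is no window in which a second process could slip in between them.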

Straightening the Data Model

The vector type has been renamed to list. In an effort to streamline the type system vocabulary, we favor list over vector because it’s closer to terminology in the ecosystem (e.g., Apache Arrow). This change requires updating existing schemas by changing vector<T> to list<T>.

Additionally, the set type has been removed. Experience with the data model showed that there is no strong use case to separate sets from lists in the VAST core. While a set data type proves useful in programming languages, VAST deals with immutable data where set constraints have been enforced upon generating the data. This change requires updating existing schemas by changing set<T> to list<T>. In the query language, the symbol for the empty map changed from {-} to {}, as it now unambiguously identifies map instances.
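
Updating existing schemas for both changes amounts to replacing the vector and set type constructors with list. A minimal Python sketch of such a rewrite follows. The helper migrate_schema is hypothetical and not a tool shipped with VAST, and the type names in the example are made up. The regular expression also rewrites matches inside comments, so review the result before using it.

```python
import re

# Matches the old container keywords only when followed by "<",
# so identifiers that merely contain "set" or "vector" stay untouched.
_OLD_CONTAINER = re.compile(r"\b(?:vector|set)\s*<")


def migrate_schema(text):
    """Rewrite vector<T> and set<T> to list<T>, including nested uses."""
    return _OLD_CONTAINER.sub("list<", text)


old_schema = "type tags = set<string>\ntype groups = vector<vector<string>>"
new_schema = migrate_schema(old_schema)
print(new_schema)
```

For the input above this prints the two definitions as list<string> and list<list<string>>.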

Changelog Highlights

As always, you can find the full technical scoop of what changed in our changelog.

🎁 Features

  • The default schema for Suricata has been updated to support the suricata.ftp and suricata.ftp_data event types. #1009
  • VAST now writes a PID lock file on startup to prevent multiple server processes from accessing the same persistent state. The pid.lock file resides in the vast.db directory. #1001

⚠️ Changes

  • The vector type has been renamed to list. In an effort to streamline the type system vocabulary, we favor list over vector because it's closer to existing terminology (e.g., Apache Arrow). This change requires updating existing schemas by changing vector<T> to list<T>. #1016
  • The set type has been removed. Experience with the data model showed that there is no strong use case to separate sets from lists in the core. While a set type may be useful in programming languages, VAST deals with immutable data where set constraints have been enforced upstream. This change requires updating existing schemas by changing set<T> to list<T>. In the query language, the symbol for the empty map changed from {-} to {}, as it now unambiguously identifies map instances. #1010

🐞 Bug Fixes

  • VAST did not terminate when a critical component failed during startup. VAST now binds the lifetime of the node to all critical components. #1028
  • VAST would overwrite existing on-disk state data when encountering a partial read during startup. This state-corrupting behavior no longer exists. #1026
  • Incomplete reads were not handled properly, which manifested for files larger than 2GB. On macOS, writing files larger than 2GB may have failed previously. VAST now respects OS-specific constraints on the maximum block size. #1025
  • The shutdown process of the server process could potentially hang forever. VAST now uses a multi-stage procedure that first attempts to terminate all components cleanly. If that fails, it will attempt a hard kill afterwards, and if that fails after another timeout, the process will call abort(3). #1005
  • When running VAST under heavy load, CAF stream slot ids could wrap around after a few days and deadlock the system. As a workaround, we extended the slot id bit width to make the time until this happens unrealistically large. #1020
  • Some file descriptors remained open when they weren't needed any more. This descriptor leak has been fixed. #1018
  • Importing JSON no longer fails for JSON fields containing null when the corresponding VAST type in the schema is a non-trivial type like vector<string>. #1009

VAST 2020.07.28

28 Jul 11:35
464f82b

We’re happy to announce the monthly VAST release 2020.07.28! 🎉

Our official community chat is now tenzir.element.io. We are looking forward to engaging with our users and everybody else who is interested in our open-source projects. We chose Matrix because we want to promote an open communication platform that allows users to choose their preferred client.

FlatBuffers

We are continuing to transform the persistent state of VAST into a vendor-neutral format that supports clear versioning to simplify updates. This release adds a new dependency for this purpose: FlatBuffers. We completed the first of the three migration steps, and the team is working heavily on the two remaining steps. In this release, the archive already uses the new FlatBuffers state.

MessagePack

We have also worked on performance: our MessagePack encoding for table slices is now open-source and the new default when Apache Arrow support is unavailable. MessagePack table slices represent events in row-major format, which is better suited for dense binary formats with little metadata, such as PCAP. Cache-friendly access patterns and a dense representation make MessagePack a good alternative to Apache Arrow for high-volume non-log data. You can enable MessagePack by setting import.table-slice-type = 'msgpack' in the configuration.

Static Binaries

To make trying out VAST easier than ever before, we now offer a statically linked binary on Linux for every commit to master, as well as for every release. Our installation instructions contain the details.

Changelog Highlights

As always, you can find the full technical scoop of what changed in our changelog. Here are the highlights:

🎁 Features

  • We open-sourced our MessagePack-based table slice implementation, which provides a compact row-oriented encoding of data. This encoding works well for binary formats (e.g., PCAP) and access patterns that involve materializing entire rows. The MessagePack table slice is the new default when Apache Arrow is unavailable. To enable parsing into MessagePack, you can pass --table-slice-type=msgpack to the import command, or set the configuration option import.table-slice-type to 'msgpack'. #975
  • Starting with this release, installing VAST on any Linux becomes significantly easier: A static binary will be provided with each release on the GitHub releases page. #966

⚠️ Changes

  • VAST now recognizes /etc/vast/schema as an additional default directory for schema files. #980
  • FlatBuffers is now a required dependency for VAST. The archive and the segment store use FlatBuffers to store and version their on-disk persistent state. #972
  • A type definition for the stats, krb5, smb, and ssh events was added to the suricata schema file. #954 #986

🐞 Bug Fixes

  • The PCAP reader now correctly shows the number of generated events. #954

VAST 2020.06.25

25 Jun 09:24
9e78297

We're happy to announce the monthly release 2020.06.25 of VAST. This month we wound up with a good balance between improving robustness and adding new features. Please see the CHANGELOG for a complete list of changes.

Aging Data

The aging feature is now open-source. Aging is the periodic removal of existing data. This helps in situations when you have a disk budget or when there exist data retention policies. We marked this feature as experimental because the deletion currently affects the archive only. Even though the data is no longer materializable that way, the corresponding index entries still exist. Since an index is lossy, cleaning out the data structures in there is actually not trivial. But before marking this feature as stable, we will come up with a solution.

Faster IP Address Queries

We also added an optimization to improve the query latency for IP address point queries. When you query for a specific address, say 6.6.6.6, you now get an instant answer when there is no reference to that IP address in the database. This helps when you have multiple queries of the form “did this thing hit us in the past 12 months?” If the address exists, VAST is now much smarter in selecting the relevant index partitions. Internally, we achieved this by adding a new Bloom filter synopsis to the index.
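
A Bloom filter can answer "definitely not present" without scanning any data, which is what makes the negative case fast. The Python sketch below shows the general data structure only. The class name, the sizing parameters, and the use of SHA-256 are illustrative assumptions and do not reflect VAST's actual synopsis implementation.

```python
import hashlib


class BloomFilter:
    """Set membership with possible false positives but no false negatives."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self._num_bits = num_bits
        self._num_hashes = num_hashes
        self._bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with a seed.
        for seed in range(self._num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self._num_bits

    def add(self, item):
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(
            self._bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


partition = BloomFilter()
partition.add("10.0.0.1")
seen = partition.might_contain("10.0.0.1")            # always True once added
empty_hit = BloomFilter().might_contain("6.6.6.6")    # an empty filter rejects everything
print(seen, empty_hit)  # prints "True False"
```

With one such filter per partition, a query can skip every partition whose filter returns False. Partitions that return True still need a real lookup because of possible false positives.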

🎁 Features

  • The meta index now uses Bloom filters for equality queries involving IP addresses. This especially accelerates queries where the user wants to know whether a certain IP address exists in the entire database. #931

  • The import command gained a new --read-timeout option that forces data to be forwarded to the importer regardless of the internal batching parameters and table slices being unfinished. This allows for reducing the latency between the import command and the node. The default timeout is 10 seconds. #916

  • VAST now has options to limit the number of results produced by an invocation of vast explore. #882

  • The import json command's type restrictions are more relaxed now, and can additionally convert from JSON strings to VAST internal data types. #891

  • VAST now supports /etc/vast/vast.conf as an additional fallback for the configuration file. The following file locations are looked at in order: Path specified on the command line via --config=path/to/vast.conf, vast.conf in current working directory, ${INSTALL_PREFIX}/etc/vast/vast.conf, and /etc/vast/vast.conf. #898
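
The --read-timeout behavior from the list above, where data is forwarded even though a batch is unfinished, can be sketched as a batcher that flushes on size or on timeout, whichever comes first. This Python class is a hypothetical illustration and not VAST's import code. The clock is injected so that the example runs deterministically.

```python
class Batcher:
    """Collects events and flushes on size or timeout, whichever comes first."""

    def __init__(self, batch_size, timeout_seconds, clock):
        self._batch_size = batch_size
        self._timeout = timeout_seconds
        self._clock = clock
        self._events = []
        self._started = None

    def add(self, event):
        """Add an event; return a full batch or None."""
        if not self._events:
            self._started = self._clock()  # the timeout starts with the first event
        self._events.append(event)
        return self._flush_if(len(self._events) >= self._batch_size)

    def tick(self):
        """Call periodically; return an unfinished batch once the timeout expired."""
        expired = bool(self._events) and self._clock() - self._started >= self._timeout
        return self._flush_if(expired)

    def _flush_if(self, condition):
        if not condition:
            return None
        batch, self._events = self._events, []
        return batch


now = [0.0]  # fake clock for the example
batcher = Batcher(batch_size=3, timeout_seconds=10.0, clock=lambda: now[0])
batcher.add("a")          # batch not full yet, nothing is forwarded
now[0] = 10.0             # ten seconds pass without further input
partial = batcher.tick()  # the timeout forces the unfinished batch out
print(partial)  # prints "['a']"
```

A shorter timeout lowers the latency between reading an event and forwarding it, at the cost of smaller batches.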

🧬 Experimental Features

  • VAST now supports aging out existing data. This feature currently only concerns data in the archive. The options system.aging-frequency and system.aging-query configure a query that runs on a regular schedule to determine which events to delete. It is also possible to trigger an aging cycle manually. #929

⚠️ Changes

  • The options system.table-slice-type and system.table-slice-size have been removed, as they duplicated import.table-slice-type and import.table-slice-size respectively. #908 #951

  • The default table slice type has been renamed to caf. It has not been the default when built with Apache Arrow support for a while now, and the new name more accurately reflects what it is doing. #948

  • The JSON export format now renders timestamps using strings instead of numbers in order to avoid possible loss of precision. #909
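
The loss of precision mentioned in the last item arises when a consumer parses large JSON numbers as IEEE 754 doubles, as JavaScript does. The Python sketch below demonstrates the effect with an illustrative nanosecond timestamp. The plain digit string is a stand-in, since the exact string format VAST emits is not shown here.

```python
import json

timestamp_ns = 1593075840123456789  # illustrative nanoseconds since the Unix epoch

# A double has a 53-bit significand, so it cannot represent this integer exactly.
as_double = float(timestamp_ns)
lossy = int(as_double) != timestamp_ns

# Rendered as a string, the value survives a JSON round trip unchanged.
as_string = json.loads(json.dumps({"ts": str(timestamp_ns)}))["ts"]
lossless = int(as_string) == timestamp_ns

print(lossy, lossless)  # prints "True True"
```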

🐞 Bug Fixes

  • A bogus import process that assembled table slices with a greater number of events than expected by the node was able to lead to wrong query results. #908

  • A use-after-free bug would sometimes crash the node while it was shutting down. #896

  • The export json command now correctly unescapes its output. #910

  • VAST now correctly checks for control characters in inputs. #910