From 005ac1773abbc9516a020d9f9e879b1817d6ef73 Mon Sep 17 00:00:00 2001
From: Jordan Frazier
Date: Tue, 7 Nov 2023 11:12:25 -0800
Subject: [PATCH] revert docs-src changes (old site) and chrome tracing

---
 .../src/chrome_tracing/trace_event.rs         |  2 +-
 .../modules/developing/pages/queries.adoc     | 26 +++++++-------
 .../getting-started/examples/unique-code.fenl |  2 +-
 .../pages/hello-world-cli.adoc                | 16 ++++-----
 .../pages/hello-world-jupyter.adoc            | 16 ++++-----
 .../modules/getting-started/pages/index.adoc  | 34 +++++++++----------
 docs-src/modules/overview/pages/faq.adoc      | 24 ++++++-------
 .../overview/pages/what-is-kaskada.adoc       | 30 ++++++++--------
 .../pages/training-realtime-ml-models.adoc    | 10 +++---
 9 files changed, 80 insertions(+), 80 deletions(-)

diff --git a/crates/sparrow-qfr-tool/src/chrome_tracing/trace_event.rs b/crates/sparrow-qfr-tool/src/chrome_tracing/trace_event.rs
index 785cb4db6..7dee23c5b 100644
--- a/crates/sparrow-qfr-tool/src/chrome_tracing/trace_event.rs
+++ b/crates/sparrow-qfr-tool/src/chrome_tracing/trace_event.rs
@@ -124,7 +124,7 @@ pub(crate) enum Event {
 #[derive(Serialize, Eq, PartialEq, Debug, Default)]
 #[allow(dead_code)]
 pub(crate) enum EventScope {
-    /// The event will be drawn from the top to bottom of the timestream.
+    /// The event will be drawn from the top to bottom of the timeline.
     #[serde(rename = "g")]
     Global,
     /// The event will be drawn through all threads of a given process.
diff --git a/docs-src/modules/developing/pages/queries.adoc b/docs-src/modules/developing/pages/queries.adoc
index 5d4201ce5..e7bdee8f4 100644
--- a/docs-src/modules/developing/pages/queries.adoc
+++ b/docs-src/modules/developing/pages/queries.adoc
@@ -20,19 +20,19 @@ The following is a quick overview of the query language's main features and synt
 === Viewing and filtering the contents of a table
 
 Kaskada queries are built by composing simple expressions.
-Every expression returns a timestream.
+Every expression returns a timeline.
 
[source,Fenl] ---- Purchase | when(Purchase.amount > 10) ---- -In this example we start with the expression `Purchase` (the timestream of all purchase events) then filter it using `xref:fenl:catalog.adoc#when[when()]`. -The result is a timestream of purchase events whose amount is greater than 10. +In this example we start with the expression `Purchase` (the timeline of all purchase events) then filter it using `xref:fenl:catalog.adoc#when[when()]`. +The result is a timeline of purchase events whose amount is greater than 10. === Stateful aggregations -Aggregate events to produce a continuous timestream whose value can be observed at arbitrary points in time. +Aggregate events to produce a continuous timeline whose value can be observed at arbitrary points in time. [source,Fenl] ---- @@ -47,8 +47,8 @@ Records allow one or more values to be grouped into a single row. You can create a record using the syntax `{key: value, key2: value2}`. ==== -In this example we first filter the timestream of `Review` events to only include verified reviews, then aggregate the filtered results using the `xref:fenl:catalog.adoc#max[max()]` aggregation. -The resulting timestream describes the maximum number of stars as-of every point in time. +In this example we first filter the timeline of `Review` events to only include verified reviews, then aggregate the filtered results using the `xref:fenl:catalog.adoc#max[max()]` aggregation. +The resulting timeline describes the maximum number of stars as-of every point in time. === Automatic joins @@ -60,7 +60,7 @@ Every expression is associated with an xref:fenl:entities.adoc[entity], allowing ---- Here we've used the `xref:fenl:catalog.adoc#count[count()]` aggregation to divide the number of purchases up to each point in time by the number of pageviews up to the same point in time. -The result is a timestream describing how each user's purchase-per-pageview changes over time. 
+The result is a timeline describing how each user's purchase-per-pageview changes over time. Since both the `Purchase` and `Pageview` tables have the same entity, we can easily combine them. === Event-based windowing @@ -80,7 +80,7 @@ By default, aggregations are applied from the beginning of time, but here we've === Pipelined operations -Pipe syntax allows multiple operations to be chained together. Write your operations in the same order you think about them. It's timestreams all the way down, making it easy to aggregate the results of aggregations. +Pipe syntax allows multiple operations to be chained together. Write your operations in the same order you think about them. It's timelines all the way down, making it easy to aggregate the results of aggregations. [source,Fenl] ---- @@ -107,7 +107,7 @@ Pivot from events to time-series. Unlike grouped aggregates, xref:fenl:catalog.a === Continuous expressions -Observe the value of aggregations at arbitrary points in time. Timestreams are either “xref:fenl:continuity.adoc#discrete-expressions[discrete]” (instantaneous values or events) or “xref:fenl:continuity.adoc#continuous-expressions[continuous]” (values produced by a stateful aggregations). Continuous timestreams let you combine aggregates computed from different event sources. +Observe the value of aggregations at arbitrary points in time. Timelines are either “xref:fenl:continuity.adoc#discrete-expressions[discrete]” (instantaneous values or events) or “xref:fenl:continuity.adoc#continuous-expressions[continuous]” (values produced by a stateful aggregations). Continuous timelines let you combine aggregates computed from different event sources. [source,Fenl] ---- @@ -135,7 +135,7 @@ let purchases_yesterday = in { purchases_in_last_day: purchases_now - purchases_yesterday } ---- -In this example we take the timestream produced by `purchases_now` and move it forward in time by one day using the `xref:fenl:catalog.adoc#shift-by[shift_by()]` function. 
+In this example we take the timeline produced by `purchases_now` and move it forward in time by one day using the `xref:fenl:catalog.adoc#shift-by[shift_by()]` function. We then subtract the shifted value from the original, unshifted value === Simple, composable syntax @@ -167,11 +167,11 @@ in {hourly_big_purchases} A given query can be computed in different ways. -=== Configuring how timestreams are converted into tables +=== Configuring how timelines are converted into tables -You can either return a table describing each change in the timestream, or a table describing the "final" value of the timestream. +You can either return a table describing each change in the timeline, or a table describing the "final" value of the timeline. -Every query produces a timestream which may be returned in two different ways -- the final results (at a specific time) or all historic results. +Every query produces a timeline which may be returned in two different ways -- the final results (at a specific time) or all historic results. The "result behavior" configures which results are produced. Queries for historic results return the full history of how the values changed over time for each entity. Queries for final results return the latest result for each entity at the specified time (default is after all events have been processed). 
diff --git a/docs-src/modules/getting-started/examples/unique-code.fenl b/docs-src/modules/getting-started/examples/unique-code.fenl index d56e7660f..0b127ad32 100644 --- a/docs-src/modules/getting-started/examples/unique-code.fenl +++ b/docs-src/modules/getting-started/examples/unique-code.fenl @@ -9,7 +9,7 @@ let hourly_big_purchases = Purchase # Aggregate anything | when(hourly()) -# Shift timestreams relative to each other +# Shift timelines relative to each other let purchases_now = count(Purchase) let purchases_yesterday = purchases_now | shift_by(days(1)) diff --git a/docs-src/modules/getting-started/pages/hello-world-cli.adoc b/docs-src/modules/getting-started/pages/hello-world-cli.adoc index 060984964..dc85d25a7 100644 --- a/docs-src/modules/getting-started/pages/hello-world-cli.adoc +++ b/docs-src/modules/getting-started/pages/hello-world-cli.adoc @@ -90,7 +90,7 @@ You should see output similar to the following: Kaskada stores data in _tables_. Tables consist of multiple rows, and each row is a value of the same type. -When querying Kaskada, the contents of a table are interpreted as a xref:fenl:continuity.adoc[discrete timestream]: the value associated with each event corresponds to a value in the timestream. +When querying Kaskada, the contents of a table are interpreted as a xref:fenl:continuity.adoc[discrete timeline]: the value associated with each event corresponds to a value in the timeline. === Creating a Table @@ -264,7 +264,7 @@ _time,_subsort,_key_hash,_key,id,purchase_time,customer_id,vendor_id,amount,subs === Complex Examples with Fenl functions In this example, we build a pipeline of functions using the `|` character. -We begin with the timestream produced by the table `Purchase`, then filter it to the set of times where the purchase's customer is `"patrick"` using the `xref:fenl:catalog.adoc#when[when()]` function. 
+We begin with the timeline produced by the table `Purchase`, then filter it to the set of times where the purchase's customer is `"patrick"` using the `xref:fenl:catalog.adoc#when[when()]` function. Kaskada's query language provides a rich set of xref:fenl:catalog.adoc[operations] for reasoning about time. Here's a more sophisticated example that touches on many of the unique features of Kaskada queries: @@ -281,15 +281,15 @@ include::partial$cli-unique.adoc[] A given query can be computed in different ways. You can configure how a query is executed by providing arguments to the CLI command. -==== Changing how the result timestream is output +==== Changing how the result timeline is output -When you make a query, the resulting timestream is interpreted in one of two ways: as a history or as a snapshot. +When you make a query, the resulting timeline is interpreted in one of two ways: as a history or as a snapshot. -* A timestream *History* generates a value each time there is a change in the value for the entity, and each row is associated with a different entity and point in time. -* A timestream *Snapshot* generates a value for each entity at the same point in time; each row is associated with a different entity, but all rows are associated with the same time. +* A timeline *History* generates a value each time there is a change in the value for the entity, and each row is associated with a different entity and point in time. +* A timeline *Snapshot* generates a value for each entity at the same point in time; each row is associated with a different entity, but all rows are associated with the same time. -By default, timestreams are output as histories. -You can output a timestream as a snapshot by setting the `--result-behavior` argument to `final-results`. +By default, timelines are output as histories. +You can output a timeline as a snapshot by setting the `--result-behavior` argument to `final-results`. 
[source,Fenl] ---- diff --git a/docs-src/modules/getting-started/pages/hello-world-jupyter.adoc b/docs-src/modules/getting-started/pages/hello-world-jupyter.adoc index 52dd353be..9d2dd0022 100644 --- a/docs-src/modules/getting-started/pages/hello-world-jupyter.adoc +++ b/docs-src/modules/getting-started/pages/hello-world-jupyter.adoc @@ -140,7 +140,7 @@ Congratulations, you now have Kaskada locally installed and you can start loadin Kaskada stores data in _tables_. Tables consist of multiple rows, and each row is a value of the same type. -When querying Kaskada, the contents of a table are interpreted as a xref:fenl:continuity.adoc[discrete timestream]: the value associated with each event corresponds to a value in the timestream. +When querying Kaskada, the contents of a table are interpreted as a xref:fenl:continuity.adoc[discrete timeline]: the value associated with each event corresponds to a value in the timeline. === Creating a Table @@ -308,7 +308,7 @@ This makes it easier to see how a single entity changes over time. include::partial$nb-filter-patrick.adoc[] In this example, we build a pipeline of functions using the `|` character. -We begin with the timestream produced by the table `Purchase`, then filter it to the set of times where the purchase's customer is `"patrick"` using the `xref:fenl:catalog.adoc#when[when()]` function. +We begin with the timeline produced by the table `Purchase`, then filter it to the set of times where the purchase's customer is `"patrick"` using the `xref:fenl:catalog.adoc#when[when()]` function. Kaskada's query language provides a rich set of operations for reasoning about time. Here's a more sophisticated example that touches on many of the unique features of Kaskada queries: @@ -324,15 +324,15 @@ include::partial$nb-unique.adoc[] A given query can be computed in different ways. You can configure how a query is executed by providing flags to the `%%fenl` block. 
-==== Changing how the result timestream is output +==== Changing how the result timeline is output -When you make a query, the resulting timestream is interpreted in one of two ways: as a history or as a snapshot. +When you make a query, the resulting timeline is interpreted in one of two ways: as a history or as a snapshot. -* A timestream *History* generates a value each time the timestream changes, and each row is associated with a different entity and point in time. -* A timestream *Snapshot* generates a value for each entity at the same point in time; each row is associated with a different entity, but all rows are associated with the same time. +* A timeline *History* generates a value each time the timeline changes, and each row is associated with a different entity and point in time. +* A timeline *Snapshot* generates a value for each entity at the same point in time; each row is associated with a different entity, but all rows are associated with the same time. -By default, timestreams are output as histories. -You can output a timestream as a snapshot by setting the `--result-behavior` fenlmagic argument to `final-results`. +By default, timelines are output as histories. +You can output a timeline as a snapshot by setting the `--result-behavior` fenlmagic argument to `final-results`. include::partial$nb-filter-patrick-final.adoc[] diff --git a/docs-src/modules/getting-started/pages/index.adoc b/docs-src/modules/getting-started/pages/index.adoc index f1ed4f2e3..f10e7cd97 100644 --- a/docs-src/modules/getting-started/pages/index.adoc +++ b/docs-src/modules/getting-started/pages/index.adoc @@ -7,17 +7,17 @@ This document will show you how to quickly get started using Kaskada. [TIP] ==== Before jumping in to writing queries in Kaskada, it's a good idea to take a minute understanding "how to think with Kaskada". -Kaskada is built on timestreams - a simple, powerful abstraction that may be different than what you're accustomed to. 
+Kaskada is built on timelines - a simple, powerful abstraction that may be different than what you're accustomed to. If you'd prefer to jump right in, scroll down to the xref:#quick-starts[] section ==== -== Thinking with Kaskada's timestreams +== Thinking with Kaskada's timelines -Kaskada is built on the idea of a _timestream_ - the history of how a value changes over time for a specific entity or group. +Kaskada is built on the idea of a _timeline_ - the history of how a value changes over time for a specific entity or group. [stream_viz,name=basic-sum] -.Aggregating events as a timestream +.Aggregating events as a timeline .... [ { @@ -47,22 +47,22 @@ Kaskada is built on the idea of a _timestream_ - the history of how a value chan ] .... -timestreams allow you to reason about temporal context, time travel, sequencing, time-series, and more. +Timelines allow you to reason about temporal context, time travel, sequencing, time-series, and more. They allow simple, composable, declarative queries over events. -Transforming and combining timestreams allows you to intuitively express computations over events. +Transforming and combining timelines allows you to intuitively express computations over events. **** -⭢ Read more about timestreams in xref:overview:what-is-kaskada.adoc[] +⭢ Read more about timelines in xref:overview:what-is-kaskada.adoc[] **** -With Kaskada, it's timestreams all the way down - every operation's inputs and outputs are timestreams. -The timestream is a flexible abstraction that can be used in different ways depending on your needs. -In some cases, you may want to know what the timestream's value is at a specific point in time, in other cases you may want to know how the timestream's value changes over time. +With Kaskada, it's timelines all the way down - every operation's inputs and outputs are timelines. +The timeline is a flexible abstraction that can be used in different ways depending on your needs. 
+In some cases, you may want to know what the timeline's value is at a specific point in time, in other cases you may want to know how the timeline's value changes over time. -When making a query, you configure how to use the query's output timestream: you can use the timestream as either a _history_ or as a _snapshot_. +When making a query, you configure how to use the query's output timeline: you can use the timeline as either a _history_ or as a _snapshot_. -* A timestream *History* contains a value each time the timestream changes, and each row describes a different entity at a different point in time. -* A timestream *Snapshot* contains a value for each entity at the _same_ point in time; each row is associated with a different entity, but all rows reflect the same point in time. +* A timeline *History* contains a value each time the timeline changes, and each row describes a different entity at a different point in time. +* A timeline *Snapshot* contains a value for each entity at the _same_ point in time; each row is associated with a different entity, but all rows reflect the same point in time. This output may be written in different ways -- for example, they may be written to a Parquet file or sent as events to a stream. @@ -88,7 +88,7 @@ Kaskada can be configured to run as a remote service or as a local process. == Kaskada's data model -Timestreams begin with xref:developing:tables.adoc[tables]. +Timelines begin with xref:developing:tables.adoc[tables]. Tables are how Kaskada stores input events. Tables consist of multiple events, and each event is a value of the same xref:fenl:data-model.adoc[type]. @@ -102,10 +102,10 @@ Tables consist of multiple events, and each event is a value of the same xref:fe | 8:52 | Alice | `{amount: 4}` |=== -When querying Kaskada, the contents of a table are interpreted as a xref:fenl:continuity.adoc[discrete timestream]: the value associated with each event corresponds to a value in the timestream. 
+When querying Kaskada, the contents of a table are interpreted as a xref:fenl:continuity.adoc[discrete timeline]: the value associated with each event corresponds to a value in the timeline. -[stream_viz,name=purchase-timestream] -.Discrete timestream describing the `Purchase` table +[stream_viz,name=purchase-timeline] +.Discrete timeline describing the `Purchase` table .... [ { diff --git a/docs-src/modules/overview/pages/faq.adoc b/docs-src/modules/overview/pages/faq.adoc index 140eba303..e0781995f 100644 --- a/docs-src/modules/overview/pages/faq.adoc +++ b/docs-src/modules/overview/pages/faq.adoc @@ -10,12 +10,12 @@ Using SQL as the query language forces a lossy conversion from the natural repre There's a reason that https://tinkerpop.apache.org/gremlin.html[Gremlin] is popular for graphs and https://prometheus.io/docs/prometheus/latest/querying/basics/[PromQL] for timeseries. These query languages use fundamental abstractions that are aligned with the data being queried. -We believe that most appropriate abstraction for reasoning about event data is the _timestream_, and we built our query language around this idea. +We believe that most appropriate abstraction for reasoning about event data is the _timeline_, and we built our query language around this idea. -* Timestreams capture the richness of a raw event feed without leaking implementation details such as "bulk vs streaming". -* Timestreams are more general than timeseries but are compatible with timeseries operations. -* Timestreams are less general than tables because they model time explicitly. While this limits the kinds of data you can work with, it allows for much more natural expressions of sequential and temporal relationships. -* Timestreams have a familiar and useful http://worrydream.com/refs/Brooks-NoSilverBullet.pdf["geometric abstraction"] that helps you reason about time visually. 
+* Timelines capture the richness of a raw event feed without leaking implementation details such as "bulk vs streaming". +* Timelines are more general than timeseries but are compatible with timeseries operations. +* Timelines are less general than tables because they model time explicitly. While this limits the kinds of data you can work with, it allows for much more natural expressions of sequential and temporal relationships. +* Timelines have a familiar and useful http://worrydream.com/refs/Brooks-NoSilverBullet.pdf["geometric abstraction"] that helps you reason about time visually. == How hard is it to learn the query language? @@ -57,7 +57,7 @@ Data loading is expensive becuse it involves sorting events chronologically. Kaskada currently executes each query in a single process. Fully-distributed execution is on our roadmap, however we find that "vertical scaling" is sufficient for the vast majority of use cases. -== How do Timestreams relate to Timeseries? +== How do Timelines relate to Timeseries? Timeseries databases are a popular way to work with temporal data. A timeseries captures a series of values, each associated with a different time. @@ -67,12 +67,12 @@ Having a pre-defined series is useful for some operations, for example it is eas The downside to starting with a standard interval is that in some cases your source data doesn't conform to the timeseries format - timeseries are often generated by counting event occurrences in each time interval. Information is often lost in the transformation from instantaneous events into windowed aggregations. -Timestreams are similar to timeseries - both capture values associated with different times. -The difference is that a timestream describes an arbitrary number of values and doesn't depend on a standard interval. -In this sense, a timeseries is a special-case of a timestream. +Timelines are similar to timeseries - both capture values associated with different times. 
+The difference is that a timeline describes an arbitrary number of values and doesn't depend on a standard interval. +In this sense, a timeseries is a special-case of a timeline. -Kaskada provides operations for transforming a timestream into a time series. -For example, to transform an event timestream `Purchase` into a daily event-count timeseries: +Kaskada provides operations for transforming a timeline into a time series. +For example, to transform an event timeline `Purchase` into a daily event-count timeseries: [source,Fenl] ---- @@ -118,7 +118,7 @@ Kaskada is designed to allow practitioners to describe the full set of cleaning -// == How can I implement point-in-time lookups using Timestreams? +// == How can I implement point-in-time lookups using Timelines? // == What data sources can Kaskada integrate with? diff --git a/docs-src/modules/overview/pages/what-is-kaskada.adoc b/docs-src/modules/overview/pages/what-is-kaskada.adoc index d716fb2e3..000ff4f66 100644 --- a/docs-src/modules/overview/pages/what-is-kaskada.adoc +++ b/docs-src/modules/overview/pages/what-is-kaskada.adoc @@ -5,11 +5,11 @@ You need the ability to understand if what just happened is unusual, how it rela Getting to this type of contextual real-time insight has historically been difficult, as it required bringing together incompatible tools designed for either bulk or streaming applications. Recent stream-processing frameworks make it easier to work across streams and bulk data sources, but force you to pick one: either you get the power of a low-level API or the convenience of a high-level query language. -Kaskada provides a single, high-level, declarative query language. The power and convenience of Kaskada's query language come from the fact that it's built from a new abstraction: the timestream. timestreams give you the declarative transformations and aggregations of SQL without losing the ability to reason about temporal context, time travel, sequencing, timeseries, etc. 
Any query can be used, unchanged, in either batch or streaming mode. +Kaskada provides a single, high-level, declarative query language. The power and convenience of Kaskada's query language come from the fact that it's built from a new abstraction: the timeline. Timelines give you the declarative transformations and aggregations of SQL without losing the ability to reason about temporal context, time travel, sequencing, timeseries, etc. Any query can be used, unchanged, in either batch or streaming mode. -== What are Timestreams? +== What are Timelines? -Having the right tool makes every job easier: different data-processing jobs benefit from different ways of thinking. Tables are useful for inter-related records, graphs are useful for thinking about networks - Kaskada was designed for thinking about changes over time, and is built on the idea of a timestream. +Having the right tool makes every job easier: different data-processing jobs benefit from different ways of thinking. Tables are useful for inter-related records, graphs are useful for thinking about networks - Kaskada was designed for thinking about changes over time, and is built on the idea of a timeline. .Where I was at various times [stream_viz,name=my-location] @@ -31,9 +31,9 @@ Having the right tool makes every job easier: different data-processing jobs ben ] .... -A timestream describes how a value changes over time. In the same way that SQL queries transform tables and graph queries transform nodes and edges, Kaskada queries transforms timestreams. In comparison to a timeseries which is defined at fixed, periodic times (i.e., every minute), a timestream is defined at arbitrary times. +A timeline describes how a value changes over time. In the same way that SQL queries transform tables and graph queries transform nodes and edges, Kaskada queries transforms timelines. 
In comparison to a timeseries which is defined at fixed, periodic times (i.e., every minute), a timeline is defined at arbitrary times. -Timestreams simplify reasoning about time, change, and behavior. Timestreams support SQL’s aggregations but extend them with sequential operations typically provided by complex event processing (CEP) systems. +Timelines simplify reasoning about time, change, and behavior. Timelines support SQL’s aggregations but extend them with sequential operations typically provided by complex event processing (CEP) systems. == Current Challenges @@ -93,11 +93,11 @@ GROUP BY user_id If you wanted to know how that value has changed over time you'd need to re-write the query from scratch, and the result would be too long to show in this quick introduction. -Many time and sequence related questions end up being surprisingly hard to answer with SQL. This is where the notion of timestreams can make your life much easier. +Many time and sequence related questions end up being surprisingly hard to answer with SQL. This is where the notion of timelines can make your life much easier. -== The solution offered by timestreams +== The solution offered by timelines -Rather than thinking of each event as a row in a table, we can think of it as a point along a timestream. +Rather than thinking of each event as a row in a table, we can think of it as a point along a timeline. [stream_viz,name=purchase] .... @@ -118,7 +118,7 @@ Rather than thinking of each event as a row in a table, we can think of it as a ] .... -Kaskada provides many ways of transforming timestreams, for example we can compute the simple sum we saw earlier: +Kaskada provides many ways of transforming timelines, for example we can compute the simple sum we saw earlier: [source,fenl] ---- @@ -155,9 +155,9 @@ Purchase.amount | sum() ] .... 
-Aggregating a timestream produces a _new_ timestream - rather than computing a single answer, the timestream describes how the result of the aggregation changes over time. +Aggregating a timeline produces a _new_ timeline - rather than computing a single answer, the timeline describes how the result of the aggregation changes over time. -Since the value of a timestream is specific to a point in time, we can easily describe aggregations in a temporal context. +Since the value of a timeline is specific to a point in time, we can easily describe aggregations in a temporal context. See how easy it is to describe the earlier example of counting page views since the last purchase: [source,fenl] @@ -204,7 +204,7 @@ Pageview ] .... -This timestream describes the result of a query at every point in time, so we can easily observe its value at specific points in time without making any changes to the query: +This timeline describes the result of a query at every point in time, so we can easily observe its value at specific points in time without making any changes to the query: [source,fenl] ---- @@ -287,7 +287,7 @@ Pageview Finally, we're not limited to only thinking about a single point in time. -By shifting timestreams relative to each other we can easily describe how values change over time, for example how the previous result has changed hour-over-hour: +By shifting timelines relative to each other we can easily describe how values change over time, for example how the previous result has changed hour-over-hour: [source,fenl] ---- @@ -342,7 +342,7 @@ in daily_average - (daily_average | shift_by(hours(1))) ] .... -Writing these simple-seeming queries over timestreams with SQL queries over tables would have been _much_ harder, more verbose, and less maintainable due to the lack of alignment between the problem and the abstractions used to solve the problem. 
+Writing these simple-seeming queries over timelines with SQL queries over tables would have been _much_ harder, more verbose, and less maintainable due to the lack of alignment between the problem and the abstractions used to solve the problem. Aligning our mental model with the problem being solved makes reasoning about time and behavior much easier. == The shift away from technology-specific solutions @@ -355,7 +355,7 @@ SQL queries written against OLAP offline data stores often aren't supported by s While some real-time systems support "streaming SQL", streams and tables are very different things and much of the power of stream processing is lost in translation. How a computation is described shouldn't depend on where events are stored - streaming vs batch is an implementation detail. -By building Kaskada's query language on timestreams, it brings the abstractions of streaming to bulk storage, rather than the other way around. +By building Kaskada's query language on timelines, it brings the abstractions of streaming to bulk storage, rather than the other way around. Kaskada allows developers to focus on solving problems with event data by raising the abstraction level used to describe queries. diff --git a/docs-src/modules/tools-and-resources/pages/training-realtime-ml-models.adoc b/docs-src/modules/tools-and-resources/pages/training-realtime-ml-models.adoc index ab58b7531..2660cdd17 100644 --- a/docs-src/modules/tools-and-resources/pages/training-realtime-ml-models.adoc +++ b/docs-src/modules/tools-and-resources/pages/training-realtime-ml-models.adoc @@ -35,12 +35,12 @@ Visualizing events chronologically allows us to understand the context of each e * The third player pays for upgrades when they get frustrated. We'd like to capture this type of time-based insight as feature values we can use to train a model. -We can do this by drawing the result of feature computations as a timestream showing how the feature's value changes as each event is observed. 
-This timestream allows us to “observe” the value of the feature at any point in time, giving us a framework for training real-time ML models: +We can do this by drawing the result of feature computations as a timeline showing how the feature's value changes as each event is observed. +This timeline allows us to “observe” the value of the feature at any point in time, giving us a framework for training real-time ML models: image::framework.png[Real-time ML framework] -1. Start with raw events and compute feature timestreams +1. Start with raw events and compute feature timelines 2. Observe features at the points in time a prediction would be made to build a training example 3. Move each example forward in time until the predicted outcome can be observed 4. Compute the correct target value and append it to the example @@ -140,7 +140,7 @@ let features = { <1> | { loss_dur: 106s } |=== -Notice that the result is a timestream describing the step function of how this feature has changed over time. We can “observe” the value of this step function at any time, regardless of the times at which the original events occurred. +Notice that the result is a timeline describing the step function of how this feature has changed over time. We can “observe” the value of this step function at any time, regardless of the times at which the original events occurred. Another thing to notice is that these results are automatically grouped by user. We didn't have to explicitly group by user because tables in Kaskada specify an "entity" associated with each row. 
@@ -192,7 +192,7 @@ let examples = features | shift_by(hours(1)) <2> ---- <1> The examples we created previously -<2> Shift the results of the last step forward in time by one hour - visually you could imagine dragging the examples forward in the timestream by one hour +<2> Shift the results of the last step forward in time by one hour - visually you could imagine dragging the examples forward in the timeline by one hour [cols="1m,2m,4m"] |===