
[Reporting] Abstract reports storage #106821

Merged: 7 commits into elastic:master on Aug 4, 2021

Conversation

@dokmic (Contributor) commented Jul 27, 2021

Summary

This pull request encapsulates the reporting storage logic behind a Node.js stream.

Resolves #98726.
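The idea can be sketched with a minimal, self-contained Duplex stream that buffers chunks in memory. This is illustrative only: the real ContentStream in this PR persists chunks to Elasticsearch, and the class and method names below (`InMemoryContentStream`, `collect`) are stand-ins, not the actual Kibana code.

```typescript
import { Duplex } from 'stream';

// Minimal sketch: a Duplex stream whose writable side buffers report chunks
// and whose readable side replays them. The real implementation reads from
// and writes to an Elasticsearch document instead of an in-memory array.
class InMemoryContentStream extends Duplex {
  private chunks: Buffer[] = [];

  _write(chunk: Buffer | string, _encoding: string, callback: (error?: Error | null) => void) {
    // Normalize to a Buffer so both string and binary payloads are supported.
    this.chunks.push(typeof chunk === 'string' ? Buffer.from(chunk) : chunk);
    callback();
  }

  _read() {
    // Flush everything written so far, then signal end-of-stream.
    for (const chunk of this.chunks) {
      this.push(chunk);
    }
    this.chunks = [];
    this.push(null);
  }

  // Convenience helper mirroring the PR's approach of collecting the whole
  // content via async iteration of the readable side.
  async collect(): Promise<string> {
    let result = '';
    for await (const chunk of this) {
      result += chunk.toString();
    }
    return result;
  }
}
```

Consumers can then treat report storage like any other Node.js stream: pipe into it while generating a report, and iterate it when serving a download.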


@dokmic dokmic added review (Deprecated) Feature:Reporting Use Reporting:Screenshot, Reporting:CSV, or Reporting:Framework instead v8.0.0 Team:AppServices release_note:skip Skip the PR/issue when compiling release notes v7.15.0 labels Jul 27, 2021
@dokmic dokmic force-pushed the feature/98726 branch 4 times, most recently from ee1bccf to 729ba1f on July 28, 2021 20:34
@dokmic dokmic marked this pull request as ready for review July 29, 2021 07:22
@elasticmachine (Contributor):

Pinging @elastic/kibana-app-services (Team:AppServices)

@dokmic dokmic requested a review from tsullivan July 29, 2021 08:08
@tsullivan tsullivan requested review from a team July 29, 2021 16:58
}
}

_write(chunk: Buffer | string, encoding: string, callback: Callback) {
Review comment (Member):

Let's prefix unused variables with _ to stop warnings in the editor:

Suggested change:
-  _write(chunk: Buffer | string, encoding: string, callback: Callback) {
+  _write(chunk: Buffer | string, _encoding: string, callback: Callback) {

async toString(): Promise<string> {
let result = '';

for await (const chunk of this) {
Review comment (Member):

unnecessary await?


constructor(
private reporting: ReportingCore,
private config: ReportingConfigType,
logger: LevelLogger
) {
this.logger = logger.clone(['runTask']);
this.getContentStream = getContentStreamFactory(reporting);
@tsullivan (Member) commented Jul 29, 2021:

getContentStream always seems to be called only once per file, so I am not seeing a benefit from having it returned from a factory function.

We could remove the factory wrapper by having getContentStream take reporting as the first argument.

There are a lot of factory functions in Reporting, but it is a leftover of the old platform. The existing code that's like that should all be cleaned up in this PR: #106940
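The suggested refactor can be sketched as follows, using simplified stand-in types rather than Kibana's real interfaces: `getContentStream` takes `reporting` as its first argument instead of being produced by `getContentStreamFactory(reporting)`.

```typescript
// Stand-in types for illustration; not the actual Kibana definitions.
interface EsClient {
  get(params: { id: string; index: string }): Promise<unknown>;
}

interface ReportingCore {
  getEsClient(): Promise<EsClient>;
}

class ContentStream {
  constructor(readonly client: EsClient, readonly document: { id: string; index: string }) {}
}

// The dependency is passed directly; no factory indirection needed when the
// function is only ever called once per file.
async function getContentStream(
  reporting: ReportingCore,
  document: { id: string; index: string }
): Promise<ContentStream> {
  const client = await reporting.getEsClient();
  return new ContentStream(client, document);
}
```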

@tsullivan (Member) left a review:

This is looking great! I'm excited to get this in.

I left a few comments, but the main concern is that we need to make sure that when a report document is queried from Elasticsearch, it was created by the authenticated user.

constant_score: {
filter: {
bool: {
must: [{ term: { _id: id } }],
Review comment (Member):

We need to add the filter that was in jobsQueries.getContent, which was to match the user of the document with the authenticated user.

This would preserve the requirement that users can not download reports created by other users.
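A hedged sketch of the filter being described: constrain the lookup by both the document id and the report's creator. The `created_by` field name here is an assumption for illustration, not necessarily the exact field used by jobsQueries.getContent.

```typescript
// Build the content query with an extra term on the creating user, so a user
// cannot fetch another user's report. Field name `created_by` is assumed.
const buildContentQuery = (id: string, username: string) => ({
  constant_score: {
    filter: {
      bool: {
        must: [{ term: { _id: id } }, { term: { created_by: username } }],
      },
    },
  },
});
```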

@dokmic (Contributor, author):

I added that at first but then decided against it, because I don't think it belongs there. The stream itself should be as simple as possible and responsible only for reading and writing the data. Apart from that, we already have this check in the store and perform it here before reading from the stream.

Review comment (Member):

++ After I wrote my previous comment, I realized that .get is checking the username for us.

index: report._index!,
if_primary_term: report._primary_term,
if_seq_no: report._seq_no,
});
@tsullivan (Member) commented Jul 29, 2021:

I had in mind that the abstracted file storage mechanism would be used inside the task runner functions (aka execute_job functions), so they can take control of streaming their output content to storage as it becomes available.

Internally, the file storage mechanism could chunk up the data into multiple documents as it is available, which lends itself towards solving #18322

It might be good to create a stream variable in _performJob, and pass it to the task runner functions on line 247. We can have the task runners return the stream back again because the _seq_no and _primary_term need to be updated. Then we could remove the parts from execute_report.ts that handle the entire output content. Something like that would be the only way to allow the csv.maxSizeBytes setting to be unlimited.
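The proposed flow can be sketched like this; the names (`performJob`, `TaskRunner`) are hypothetical and the real code would also carry the `_seq_no`/`_primary_term` bookkeeping mentioned above.

```typescript
import { Writable } from 'stream';

// Sketch of the proposal: the job wrapper opens a content stream and passes
// it to the task runner, which writes output as it becomes available instead
// of returning the entire payload at once.
type TaskRunner = (stream: Writable) => Promise<void>;

async function performJob(runTask: TaskRunner, stream: Writable): Promise<void> {
  // The runner streams its own output; we only wait for the final flush.
  await runTask(stream);
  await new Promise<void>((resolve, reject) => {
    stream.on('error', reject);
    stream.end(resolve);
  });
}
```

Because output never accumulates in memory as a single string, a setting like csv.maxSizeBytes could in principle become unlimited, as the comment above notes.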

Review comment (Member):

I agree we should be writing inside execute_job as we are getting data, but IMO it's OK to do that in a follow-up PR, keeping this PR as small as possible and just abstracting our storage without any other changes.

@dokmic (Contributor, author):

I have addressed that in the latest commit. The task runner functions no longer return content but write it to the stream.

@tsullivan tsullivan requested a review from ppisljar July 29, 2021 22:19
@tsullivan (Member):

Let's also get @ppisljar to review

@ppisljar (Member) left a review:

LGTM after Tim's concerns are addressed.

@tsullivan (Member) left a review:

LGTM

This tackles the issue very well. GREAT WORK!

@kibanamachine (Contributor):

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

  • 💔 Build #142817 failed dfdae11a8389fff5dec2d8ed7731906aa4344c63
  • 💚 Build #142084 succeeded 431a2e3748c9c57e1936e3a3c1e55e06d67d8d5c
  • 💚 Build #141487 succeeded 729ba1f17eb7ae4159130925c241c3291d20750d
  • 💔 Build #141459 failed ee1bccfa847c57e7e24717057f757df0b024a68c
  • 💔 Build #141113 failed eecc01d1b50524adf63ba97b58ed70b5e074d733

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@dokmic dokmic merged commit 1e8bc92 into elastic:master Aug 4, 2021
@dokmic dokmic deleted the feature/98726 branch August 4, 2021 17:52
dokmic added a commit that referenced this pull request Aug 4, 2021
* Add duplex content stream
* Add content stream factory
* Move report contents gathering and writing to the content stream
* Update jobs executors to use content stream instead of returning report contents
# Conflicts:
#	x-pack/plugins/reporting/server/export_types/printable_pdf/execute_job/index.test.ts
streamich pushed a commit to vadimkibana/kibana that referenced this pull request Aug 8, 2021
* Add duplex content stream
* Add content stream factory
* Move report contents gathering and writing to the content stream
* Update jobs executors to use content stream instead of returning report contents
if_seq_no?: number;
}

export class ContentStream extends Duplex {
Review comment (Contributor):

What is the expected behaviour of writing to a stream? Will we only ever allow writing to non-existing document IDs?


Successfully merging this pull request may close these issues.

abstract reporting file storage as duplex stream
6 participants