[docs] document LIR attribution #30899

Merged: 13 commits, Jan 9, 2025

@mgree mgree commented Dec 23, 2024

Documents the LIR mapping introspection source (#29848).

Preview at https://preview.materialize.com/materialize/30899/transform-data/troubleshooting/.

Motivation

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@mgree mgree added A-docs Area: documentation A-optimization Area: query optimization and transformation T-observability labels Dec 23, 2024
@mgree mgree requested a review from ala2134 December 23, 2024 20:35
@mgree mgree requested a review from a team as a code owner December 23, 2024 20:35
@kay-kim kay-kim left a comment

Thank you for this! Left some trivial suggestions (feel free to ignore).
I can pick up after vacation w.r.t. table rendering.

doc/user/content/transform-data/troubleshooting.md
```sql
SELECT mo.name AS name, global_id, lir_id, parent_lir_id, REPEAT(' ', nesting * 2) || operator AS operator,
SUM(duration_ns)/1000 * '1 microsecond'::INTERVAL AS duration, SUM(count) AS count
FROM mz_introspection.mz_lir_mapping mlm
Contributor

Trivial nit (feel free to disregard). Do we want the FROM to either left-align with SELECT or right-align with the 'LEFT JOIN'/'JOIN'?

Have zero opinion as I've seen various alignments when using JOINS and I don't think we have a company style yet. But, this one seems to differ from the others.

Contributor Author

Fixed it so FROM aligns with the SELECT, like all the other queries I wrote. I'm happy to have these reformatted, no strong feelings.

doc/user/content/transform-data/troubleshooting.md
Running this query on an auction generator will produce results that look something like the following (though the specific numbers will vary, of course):


| name | global_id | lir_id | parent_lir_id | operator | duration | count |
Contributor

So, a markdown table like this won't preserve the spacing (I know, with all your nice indentation logic ... come on, markdown) :shakes-fist:
https://preview.materialize.com/materialize/30899/transform-data/troubleshooting/#attributing-computation-time

When I get back, I can separate these into a data file and a table,
where the data file can use the ```mzsql annotation to maintain spacing.

Contributor Author

😭 Thanks!

I could also probably just force `&nbsp;` or a unicode non-breaking space in there or something, though that's a hell of a kludge. Awkward either way.

Contributor

Added a patch to use data files, as well as some tweaks to improve skimmability. Basically, I added bullet points, since bulleted lists are easier to skip over than paragraphs: people might have to read a whole paragraph before deciding whether to skip it.

Contributor Author

Thank you!!!


You can [`EXPLAIN`](/sql/explain-plan/) a query to see how it will be
run as a dataflow. In particular, `EXPLAIN PHYSICAL PLAN` will show
the concrete, fully optimized plan that Materialize will run. (That

Do we typically use (...) when we're making an aside? I guess I'm curious why that wouldn't just be its own sentence.

Contributor Author

Common in academic writing, but that ivory-tower tone is hardly worth emulating; I dropped the parens.

and other internal views to discover which parts of your query are
computationally expensive (e.g.,
[`mz_introspection.mz_compute_operator_durations_histogram`](/sql/system-catalog/mz_introspection/#mz_compute_operator_durations_histogram), [`mz_introspection.mz_scheduling_elapsed`](/sql/system-catalog/mz_introspection/#mz_scheduling_elapsed))
or consuming excessive memory (e.g., ).

Is this intentionally blank after the (e.g. )?

Contributor Author

nope lol


### Attributing computation time

One way to understand which parts of your query are 'expensive' is to

When you say 'expensive', I presume you mean computationally expensive, and thus resource/$$ expensive, right?

Contributor Author

Yes; revised.


[Worker skew](/transform-data/dataflow-troubleshooting/#is-work-distributed-equally-across-workers) occurs when your data do not end up getting evenly
partitioned between workers. Worker skew can only happen when your
cluster has more than one worker. (You can query

Similar nit as my first comment about () instead of its own standalone sentence. 😅

Contributor Author

Revised as well.

@mgree mgree force-pushed the docs-lir-troubleshooting branch from 169cd80 to 1925ce3 Compare January 9, 2025 16:40

```sql
SELECT mo.name AS name, mlm.global_id AS global_id, lir_id, parent_lir_id, REPEAT(' ', nesting * 2) || operator AS operator,
levels, to_cut, hint, pg_size_pretty(savings)
Contributor

I moved the hint before pg_size_pretty ... so that the hint shows in the page. I could also rename columns so that they all fit in ... but, eh.

Contributor Author

Honestly, there's something to be said for not even selecting out levels or to_cut, since they feel pretty inward looking.

- The `duration` column shows that the `TopK` operator is where we spend the
bulk of the query's computation time.

- Creating an index on a view executes the underlying view query. As such, the
Contributor

Wasn't sure if this is why we have the 2 global_ids, but ... wanted some explanation since we just state the index has the 2 ... and people might think it's just because of our WHERE clause.

Contributor Author

I like the changes; I replaced 'executed' with 'started', since 'execution' feels like it implies termination.

What else do we need here to be ready to go?

Contributor

lgtm.

@kay-kim kay-kim left a comment

LGTM! Thank you!

@kay-kim kay-kim enabled auto-merge (squash) January 9, 2025 21:53
@kay-kim kay-kim merged commit 960265c into MaterializeInc:main Jan 9, 2025
11 checks passed
@mgree mgree mentioned this pull request Jan 15, 2025
5 tasks
3 participants