Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix multidimensional data pages baking logic #4478

Merged
merged 1 commit into from
Jan 29, 2025
Merged

Conversation

rakyi
Copy link
Contributor

@rakyi rakyi commented Jan 23, 2025

  • Don't delete mdims in GrapherBaker
  • Delete old mdims in MultiDimBaker
  • Improve the console output
  • Document issue with TSConfig and getting ESNext types

@rakyi rakyi requested a review from mlbrgl January 23, 2025 10:55
@@ -308,7 +309,6 @@ const bakeGrapherPageAndVariablesPngAndSVGIfChanged = async (
imageMetadataDictionary
)
)
console.log(outPath)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We output the slug in the progress bar already, this doesn't add anything useful on top of that.

@@ -432,7 +431,7 @@ export const bakeAllChangedGrapherPagesVariablesPngSvgAndDeleteRemovedGraphers =
)

const progressBar = new ProgressBar(
"bake grapher page [:bar] :current/:total :elapseds :rate/s :etas :name\n",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear this is an ETA in the output, which is already cluttered, and doesn't seem useful.

@@ -451,11 +450,18 @@ export const bakeAllChangedGrapherPagesVariablesPngSvgAndDeleteRemovedGraphers =
async (knex) => await bakeSingleGrapherChart(job, knex),
db.TransactionCloseMode.KeepOpen
)
progressBar.tick({ name: `slug ${job.slug}` })
progressBar.tick({ name: job.slug })
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repeating slug on every line doesn't add anything useful.

@rakyi rakyi force-pushed the fix-mdim-baking-logic branch from 6e05aa2 to 8385cc3 Compare January 23, 2025 11:29
@owidbot
Copy link
Contributor

owidbot commented Jan 23, 2025

Quick links (staging server):

Site Dev Site Preview Admin Wizard Docs

Login: ssh owid@staging-site-fix-mdim-baking-logic

SVG tester:

Number of differences (default views): 0 ✅
Number of differences (all views): 0 ✅

Edited: 2025-01-23 11:44:17 UTC
Execution time: 1.23 seconds

@rakyi rakyi force-pushed the fix-mdim-baking-logic branch from 8385cc3 to 2a6a3d2 Compare January 23, 2025 12:31
@rakyi rakyi linked an issue Jan 23, 2025 that may be closed by this pull request
@rakyi rakyi force-pushed the fix-mdim-baking-logic branch 4 times, most recently from af22b53 to 2d02649 Compare January 24, 2025 10:01
Copy link
Member

@mlbrgl mlbrgl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

My main concern is about the repetition of deleteOldGrapher / deleteOldMultiDimPages:
1- code redundancy: they largely do the same thing
2- execution redundancy: the deletion of obsolete /grapher pages happens twice (once at the grapher step, and another time at the mdim step)

2 is more of an optimzation for full bakes, and doesn't probably count for much so I wouldn't necessarily complexify here. Also it's nice that each step is self-contained when we do partial bakes.

The near duplication of deleteOldGrapher in 1 is a bit more cumbersome, and it would be more straightforward to have a single function that deals with the overall deletion of /grapher pages (even if it called twice, as per 2).

It would be also a good opportunity to deal with the confusion around redirects to explorers.

On the one hand, we don't delete grapher pages that have been redirected to explorers:

const toRemove = without(oldSlugs, ...newSlugs)
// do not delete grapher slugs redirected to explorers
.filter((slug) => !isPathRedirectedToExplorer(`/grapher/${slug}`))

On the other hand, we don't bake grapher pages that have been redirected to explorers:

// Avoid baking paths that have an Explorer redirect.
// Redirects take precedence.
if (isPathRedirectedToExplorer(`/grapher/${grapher.slug}`)) {
console.log(`⏩ ${grapher.slug} redirects to explorer`)
return
}

Meanwhile, deleteOldMultiDimPages does delete grapher pages that have been redirected to explorers, and effectively overrides deleteOldGrapher's safeguard. If this safeguard is no longer relevant, I'd remove it rather than having it in one place and not the other, which is confusing.

console.log(`DELETING ${slug}`)
const path = `${bakedSiteDir}/grapher/${slug}.html`
await fs.unlink(path)
console.log(path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

console.log(path) necessary? We have the same console.log in deleteOldGrapher().

Also, instead of:

DELETING coronavirus-cfr
localBake/grapher/coronavirus-cfr.html

we could maybe simply output:

DELETING localBake/grapher/coronavirus-cfr.html

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it that way to be consistent with deleteOldGraphers, but that has since been changed in master, so I'm simplifying it.

}
const publishedSlugs = await getAllPublishedMultiDimDataPageSlugs(knex)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could get the list of published slugs from multiDimsBySlug potentially, to avoid the extra db query?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. I started with that, then changed the logic a few times and then forgot that was an option. 😄

@rakyi
Copy link
Contributor Author

rakyi commented Jan 27, 2025

Good point, I moved the deletion of old graphers to a single helper function called in both places. After the removal of image exports, there is no longer a distinction between normal graphers and mdims. I kept the safeguard for redirects to explorers just in case.

@rakyi rakyi force-pushed the fix-mdim-baking-logic branch 2 times, most recently from 5d083fe to 92fdd3e Compare January 27, 2025 10:49
* Don't delete mdims in GrapherBaker
* Delete old mdims in MultiDimBaker
* Improve the console output
* Document issue with TSConfig and getting ESNext types
Copy link
Member

@mlbrgl mlbrgl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks for the updates!

@rakyi rakyi merged commit d3760e4 into master Jan 29, 2025
27 checks passed
@rakyi rakyi deleted the fix-mdim-baking-logic branch January 29, 2025 11:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Correctly delete unpublished multidimensional data pages when baking
3 participants