✨ update docs / remove references to old gdrive images flow
ikesau committed Nov 26, 2024
1 parent 267c426 commit b55f2f4
Showing 5 changed files with 6 additions and 66 deletions.
2 changes: 1 addition & 1 deletion db/model/Gdoc/GdocBase.ts
@@ -663,7 +663,7 @@ export class GdocBase implements OwidGdocBaseInterface {
}

/**
* Load image metadata from the database. Does not check Google Drive or sync to S3
* Load image metadata from the database.
*/
async loadImageMetadataFromDB(
knex: db.KnexReadonlyTransaction,
2 changes: 0 additions & 2 deletions ops/buildkite/deploy-content
@@ -75,8 +75,6 @@ sync_to_r2_aws() {
sync_baked_data_to_r2() {
echo '--- Sync baked data to R2'
# Cloudflare Pages has limit of 20000 files
# NOTE: There's also images/published, which are the gdocs images synced from GDrive.
# There's currently a small-enough amount of them, but we need to sync them to R2 or Cloudflare Images at some point.
# NOTE: aws is about 3x faster than rclone
sync_to_r2_aws grapher/exports # 9203 files
sync_to_r2_aws exports # 3314 files
12 changes: 0 additions & 12 deletions packages/@ourworldindata/types/src/gdocTypes/Image.ts
@@ -1,17 +1,5 @@
import { DbEnrichedImage } from "../dbTypes/Images.js"

// This is the JSON we get from Google's API before remapping the keys to be consistent with the rest of our interfaces
export interface GDriveImageMetadata {
name: string // -> filename
modifiedTime: string // -> updatedAt e.g. "2023-01-11T19:45:27.000Z"
id: string // -> googleId e.g. "1dfArzg3JrAJupVl4YyJpb2FOnBn4irPX"
description?: string // -> defaultAlt
imageMediaMetadata?: {
width?: number // -> originalWidth
height?: number // -> originalHeight
}
}

// All the data we use in the client to render images
// everything except the ID, effectively
export type ImageMetadata = Pick<
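For reference, the arrow comments in the removed interface describe a key remapping from Google's API shape to our own field names. A minimal sketch of that remapping — every name below is a hypothetical stand-in, not the actual sync code:

```
// Hypothetical sketch of the remapping described by the removed arrow comments.
// None of these names are the actual implementation.
interface GDriveImageMetadataSketch {
    name: string
    modifiedTime: string
    id: string
    description?: string
    imageMediaMetadata?: { width?: number; height?: number }
}

function remapGDriveMetadata(gdrive: GDriveImageMetadataSketch) {
    return {
        filename: gdrive.name,
        updatedAt: new Date(gdrive.modifiedTime).getTime(), // assumption: stored as epoch milliseconds
        googleId: gdrive.id,
        defaultAlt: gdrive.description ?? "",
        originalWidth: gdrive.imageMediaMetadata?.width,
        originalHeight: gdrive.imageMediaMetadata?.height,
    }
}
```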
2 changes: 1 addition & 1 deletion packages/@ourworldindata/types/src/index.ts
@@ -349,7 +349,7 @@ export {
type UnformattedSpan,
} from "./gdocTypes/Spans.js"

export type { GDriveImageMetadata, ImageMetadata } from "./gdocTypes/Image.js"
export type { ImageMetadata } from "./gdocTypes/Image.js"
export {
ALL_CHARTS_ID,
LICENSE_ID,
54 changes: 4 additions & 50 deletions site/README.md
@@ -22,65 +22,19 @@ A Google Doc can be written and registered via the `/admin/gdocs` view in the ad

This content is only updated in an environment's database when someone presses "publish" from the Google Doc preview (`/admin/gdocs/google_doc_id/preview`)

## Images in Google Docs
## Images

To match Google Docs' "one document, many environments" paradigm, the source of images for all environments is a Shared Drive. An image is referenced in Archie by filename, which we use to find the file via Google Drive's API.

e.g.
Image blocks can be added to gdocs via the following Archie syntax:

```
{.image}
filename: my_image.png
{}
```

This means that the filenames of images uploaded to the Shared Drive **must be unique**.

We chose to do it this way instead of via Google Drive File ID because it's easier to read and sanity check. We also considered inline images, but Google Docs doesn't support inline SVGs and downsizes images wider than 1600px.

We mirror these images to Cloudflare's R2 to allow environments to have some amount of independence from one another. For OWID developers, the env variables needed for this functionality are stored in our password manager.

It is recommended to use a unique folder in R2 for each environment. By convention, this is `dev-$NAME` for your local development server (when you run `make create-if-missing.env.full` for the first time, it will be generated from your unix `$USER` variable by default), plus one folder for your staging server (e.g. `neurath`).

### Baking images

During the baking process (`bakeDriveImages`) we do the following (see the sketch after this list):

1. Find the filenames of all the images that are currently referenced in published Google documents (the `posts_gdocs_x_images` table stores this data)
2. See if we've already uploaded them to R2 (by checking the `images` table, which is only updated after we've successfully mirrored the image to R2)
3. Mirror them to R2 if not
4. Pull all the images from R2
5. Create optimized WEBP versions of each image at multiple resolutions
6. Save them, and the original file, into the assets folder
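A rough sketch of those six steps, assuming hypothetical helper names rather than the actual implementation:

```
// Rough sketch of the old bakeDriveImages flow described in the list above.
// Every name and signature here is a hypothetical stand-in, not the real code.
interface ImageBakingDeps {
    getReferencedImageFilenames: () => Promise<string[]> // step 1: posts_gdocs_x_images
    isAlreadyMirrored: (filename: string) => Promise<boolean> // step 2: check the images table
    mirrorFromDriveToR2: (filename: string) => Promise<void> // step 3
    downloadFromR2: (filename: string) => Promise<Uint8Array> // step 4
    makeWebpVariants: (image: Uint8Array, widths: number[]) => Promise<Uint8Array[]> // step 5
    writeToAssets: (filename: string, original: Uint8Array, variants: Uint8Array[]) => Promise<void> // step 6
}

async function bakeDriveImagesSketch(deps: ImageBakingDeps): Promise<void> {
    const filenames = await deps.getReferencedImageFilenames()
    for (const filename of filenames) {
        if (!(await deps.isAlreadyMirrored(filename))) {
            await deps.mirrorFromDriveToR2(filename)
        }
        const original = await deps.downloadFromR2(filename)
        const variants = await deps.makeWebpVariants(original, [350, 850, 1350]) // widths are illustrative
        await deps.writeToAssets(filename, original, variants)
    }
}
```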

### Previewing images

The preview flow is slightly different. If a document with an image in it is previewed, we also fetch the image from the Shared Drive and upload it to R2, but we don't do the resizing - we just display the source image via R2's CDN. This logic is all contained in the [Image component](gdocs/Image.tsx).
where `my_image.png` is an image that has been uploaded via the `/admin/images` view in the admin client, and thus exists in Cloudflare Images.

### Gotchas

#### Updating images

If an image has changed since we last uploaded it to R2 (e.g. a new version has been uploaded, or its description has changed), we'll re-upload the file. This happens even if you're only previewing a document that references the image, regardless of whether or not you re-publish it.

This means that any other documents that reference the image will use the updated version during the next bake, even if they haven't been republished. This seemed preferable to tracking version state and having to manually update every article whenever you update an image.
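A minimal sketch of that "has the image changed?" check, with hypothetical field names:

```
// Hypothetical sketch of the re-upload check described above; field names are assumptions.
interface MirroredImageRecord {
    filename: string
    updatedAt: number // when we last mirrored the file to R2, as epoch milliseconds
    defaultAlt: string
}

function needsReupload(
    stored: MirroredImageRecord | undefined,
    driveModifiedTime: string,
    driveDescription: string
): boolean {
    if (!stored) return true // never mirrored before
    const driveUpdatedAt = new Date(driveModifiedTime).getTime()
    // a newer file or a changed description (used as alt text) triggers a re-upload
    return driveUpdatedAt > stored.updatedAt || driveDescription !== stored.defaultAlt
}
```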

#### Refreshing a database

If you are refreshing your environment's database by importing a database dump from prod, the prod `images` table may make claims about the existence of files in your environment's S3 folder that aren't true, which will lead to 403 errors when trying to bake.

In this project's root Makefile, we have a make command (`make sync-images`) that runs `rclone sync` from prod to your environment to solve this problem. Make sure your `~/.config/rclone/rclone.conf` is configured correctly and contains

```
[owid-r2]
type = s3
provider = Cloudflare
env_auth = true
access_key_id = xxx
secret_access_key = xxx
region = auto
endpoint = https://078fcdfed9955087315dd86792e71a7e.r2.cloudflarestorage.com
```
We store each image's dimensions and alt text in the database; this metadata is shared via React context with any component that needs to render the image. See `Image.tsx` for the (many) implementation details.
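As a minimal sketch of how a component might consume that metadata (the context shape, prop names, and image URL below are assumptions, not the actual `Image.tsx` API):

```
// Minimal sketch only — the context shape, prop names, and image URL are assumptions.
import React, { createContext, useContext } from "react"

interface ImageMetadataSketch {
    filename: string
    defaultAlt: string
    originalWidth: number
    originalHeight: number
}

const ImageMetadataContext = createContext<Record<string, ImageMetadataSketch>>({})

export function SketchImage({ filename }: { filename: string }) {
    const metadata = useContext(ImageMetadataContext)[filename]
    if (!metadata) return null // unknown image: render nothing rather than a broken tag
    return (
        <img
            src={`/images/${filename}`} // assumption: a static path; the real URL comes from Cloudflare Images
            alt={metadata.defaultAlt}
            width={metadata.originalWidth}
            height={metadata.originalHeight}
        />
    )
}
```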

## Data Catalog

