-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MC Product Statuses: Replace WP functions with WC functions #2257
MC Product Statuses: Replace WP functions with WC functions #2257
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## feature/refactor-product-stats #2257 +/- ##
===================================================================
- Coverage 59.0% 54.0% -5.0%
===================================================================
Files 457 293 -164
Lines 17702 3715 -13987
Branches 0 532 +532
===================================================================
- Hits 10443 2006 -8437
+ Misses 7259 1258 -6001
- Partials 0 451 +451
Flags with carried forward coverage won't be shown. Click here to find out more. |
I understand this is still a WIP, but just wanted to ask the question early. One of the side effects of switching to CRUD classes is that update hooks will be called if we set/update product meta. Are we going to put something in place to short circuit the syncer hooks (eventually will be notification hooks), to prevent them from being triggered? |
Thanks @mikkamp for the heads up! I was just in the middle of implementing something to short-circuit, but then I realized that when you use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the changes here, it feels cleaner to be working with CRUD classes. I left some comments as we go through the code, I think we need to do better on how we handle the set of products we are working with to make them accessible.
I also left some comments below, but these aren't fully related to this PR so might be better handled in a followup PR.
Not directly related to this PR but I was getting the following errors in the Action Scheduler: action failed via Async Request: Call to a member function getId() on null
Cause was proxied requests returning invalid data, but my question is how is the API request supposed to handle the failure scenario? In my case it stayed with loading
set to true, but had actually completed with a failure.
Again probably not related to this PR, but I was wondering about the function update_intermediate_product_statistics
and how it handles variable products. For example if batch 1 has 1 variation (APPROVED) and batch 2 has 1 variation (DISSAPPROVED) for the same variable product. Both statuses will get a +1 count for the same variable product.
src/API/Google/MerchantReport.php
Outdated
@@ -97,7 +97,7 @@ public function get_product_view_report( $next_page_token = null ): array { | |||
continue; | |||
} | |||
|
|||
$product_view_data['statuses'][] = [ | |||
$product_view_data['statuses'][ (int) $wc_product_id ] = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the typecast here necessary? As far as I can see the function get_wc_product_id
is required to return an int
. And if it is necessary then my next question would be why we aren't typecasting it for all usages of this variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's correct, there is not need for typecasting. Removed here: 7b9944a
|
||
ksort( $new_product_statuses ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I can see we only update product meta with this. Why does that need to be done in a particular order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line was left from the previous code:
ksort( $new_product_statuses ); |
So I removed it here: 21e97e2
'name' => $wc_product->get_name(), | ||
'visibility' => $wc_product->get_meta( $visibility_meta_key ), | ||
'parent_id' => $wc_product->get_parent_id(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit unclear how we are expecting to use this product_data_lookup
. The name and visibility are used when we handle product issues. However since we populated the statuses in batches this information isn't going to be pre-loaded when we loop through the product issues.
Would it be more useful to save an instance of the product objects in the class:
$this->process_product_statuses( $statuses );
$this->update_products_meta_with_mc_status();
$this->update_intermediate_product_statistics();
So when we call these three functions they all work with the same product set? I can see the update meta function reloads them from cache but that's not faster then just keeping a previous copy around. Or if we don't want to save a copy in the class, can we have process_product_statuses
call the other two functions internally so we don't need to call the three in a row? That way it can also pass the set of product objects by reference so we don't need to store it in the class instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had the same feeling, the main reason I kept the $product_data_lookup
as it was is because I initially intended to populate the product issues using the ProductView Report query. My plan was to reuse most of the code. However, after further investigation of the product issues, I realized that the ProductView doesn't return the details from the description or the documentation. So, I don't believe I'll be able to use the same query for that purpose.
google-listings-and-ads/src/MerchantCenter/MerchantStatuses.php
Lines 485 to 487 in e9bc80d
'issue' => $item_level_issue->getDescription(), | |
'action' => $item_level_issue->getDetail(), | |
'action_url' => $item_level_issue->getDocumentation(), |
One idea to retrieve the product issues could be to store the Google IDs when we retrieve the product status. Then, the product issues section could continue to function similarly to how it's currently implemented.
Or if we don't want to save a copy in the class, can we have process_product_statuses call the other two functions internally so we don't need to call the three in a row? That way it can also pass the set of product objects by reference so we don't need to store it in the class instance.
Thanks for the suggestion! I've updated the code so that now those functions accept an array of products. While I was making these updates, I noticed that parent_id wasn't being fetched previously. So, I've adjusted the code to fetch all the parent IDs at once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One idea to retrieve the product issues could be to store the Google IDs when we retrieve the product status. Then, the product issues section could continue to function similarly to how it's currently implemented.
Even though we can't use the ProductView Report query, we can still use the productstatuses/list endpoint, which supports pagination. That will allow us to pull the products in batches without needing to know a list of product/Google ID's up front. We can just act on whatever products the MC presents us with, and if we can map to a WC ID then we store the list of issues.
Since product issues are stored in the DB locally (until they expire), and we have the product statistics job run in batches of products. We can also consider doing something like this for each batch of UpdateMerchantProductStatuses:
- Query ProductView report to get a list of products
- Map products to WC ID's
- Add up intermediate statuses
- Fetch product statuses for the same batch of products using productstatuses/custombatch
- Store issues in local DB
We might still need something to pull in account issues / add local issues to the table. And possibly decrease batch sizes if it takes more resources to pull in statuses and issues for that batch.
I think something like that will take the most advantage of running a background process, otherwise we'd have to create two to loop through the same batch of MC products. Let me know what you think about that approach.
…ts-functions-with-wc-objects
Thanks @mikkamp for your comments and suggestions.
Sorry, I've updated the proxy to avoid those errors. e15c368 .
One option could be to store the error in a transient let's say
That's a tricky situation, but I think we can resolve it by storing the parent statuses in the intermediate transient. Then, we can compare if the new status has a higher priority than the one saved in the previous batch. WDYT? @mikkamp, when you have a chance, can you do another round? Thanks! |
Either a new transient, or storing the error in the transient we are already saving. So next time it can detect that the transient contains an error and can return a correct response. For UI I think it makes sense to present the user with the error message and a small refresh icon.
I agree, I was thinking about something similar, and just looping through the parent statuses at the end. |
@@ -617,32 +617,28 @@ protected function refresh_presync_product_issues(): void { | |||
* @see MerchantReport::get_product_view_report | |||
*/ | |||
public function process_product_statuses( array $statuses ): void { | |||
$product_repository = $this->container->get( ProductRepository::class ); | |||
$products = $product_repository->find_by_ids_as_associative_array( array_column( $statuses, 'product_id' ) ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that the mapping part is moved to the ProductRepository. I was wondering why we didn't already have that and it didn't seem correctly placed handling it in the MerchantStatuses class.
src/Product/ProductRepository.php
Outdated
* | ||
* @return WC_Product[] Array of WooCommerce product objects | ||
*/ | ||
public function find_by_ids_as_associative_array( array $ids, array $args = [], int $limit = -1, int $offset = 0 ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we force this function to have a return type of array? The function already always returns an array. I just noticed you have the code $products + $parent_products
so I wanted to confirm that there is no way that those variables are not arrays.
if ( ! $product ) { | ||
do_action( | ||
'woocommerce_gla_debug_message', | ||
sprintf( 'Merchant Center product %s not found in this WooCommerce store.', $product_id ), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error message seems to implicate that we are outputting the MC ID, but at this time we have already mapped it to a WC ID (potentially with a filter and custom mapping). Should we reword it to something like:
sprintf( 'Merchant Center product %s not found in this WooCommerce store.', $product_id ), | |
sprintf( 'Merchant Center product with WooCommerce ID %d is not found in this store.', $product_id ), |
'name' => $wc_product->get_name(), | ||
'visibility' => $wc_product->get_meta( $visibility_meta_key ), | ||
'parent_id' => $wc_product->get_parent_id(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One idea to retrieve the product issues could be to store the Google IDs when we retrieve the product status. Then, the product issues section could continue to function similarly to how it's currently implemented.
Even though we can't use the ProductView Report query, we can still use the productstatuses/list endpoint, which supports pagination. That will allow us to pull the products in batches without needing to know a list of product/Google ID's up front. We can just act on whatever products the MC presents us with, and if we can map to a WC ID then we store the list of issues.
Since product issues are stored in the DB locally (until they expire), and we have the product statistics job run in batches of products. We can also consider doing something like this for each batch of UpdateMerchantProductStatuses:
- Query ProductView report to get a list of products
- Map products to WC ID's
- Add up intermediate statuses
- Fetch product statuses for the same batch of products using productstatuses/custombatch
- Store issues in local DB
We might still need something to pull in account issues / add local issues to the table. And possibly decrease batch sizes if it takes more resources to pull in statuses and issues for that batch.
I think something like that will take the most advantage of running a background process, otherwise we'd have to create two to loop through the same batch of MC products. Let me know what you think about that approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes, it's working as expected. I left a few minor comments, but it seems to have submitted them early. Since those are minor and the rest is reliant on follow up PR's I'll go ahead and approve this PR.
Changes proposed in this Pull Request:
Part of #2146 & #2250
In #2255, we've divided the background job for fetching product statuses into multiple tasks to distribute resource usage. Until now, we've been relying on
WP_Post
and bulk updates via raw queries to handle updating numerous products in a single request. While it sped things up, it could became less reliable as WC might change how product data is stored. For better future-proofing, it's more convenient to use WooCommerce objects to retrieve and update their data.This PR replaces the bulk queries and the use of get_post with
wc_get_products
. I've also updated metadata using methods available in their class. The metadata is saved using thesave_meta_data
method, so this way, we bypass calling thewoocommerce_update_product
hook and avoid triggering the syncer hooks.At first, I experimented with batches of 1,000 products. However, in some instances, it was hitting the maximum execution time of 30 seconds. So, I opted to decrease it to 500 products. I did these tests with a memory limit of 128MB.
Detailed test instructions:
mc_statuses
so you can check that themc_status
is added. You can run a query like this one:DELETE FROM wp_postmeta WHERE meta_key = '_wc_gla_mc_status';
GET /gla/mc/product-statistics/refresh
update_merchant_product_statuses
runs._wc_gla_mc_status
is set.Additional details:
ProductMetaQueryHelper::batch_update_values
andProductMetaQueryHelper::batch_insert_values
are not longer used in our codebase, so they could be deleted however they are public, so not sure if it is better just to deprecate them for now.Changelog entry