Examine on load balanced environment #379
Comments
Hi, yes, this is all a known issue with using Lucene-based indexes in Azure with load balancing. This is why I created ExamineX: so that you can have centrally managed, hosted indexes instead of local Lucene indexes per node. I've talked about this in great detail in a couple of talks:
There is no silver bullet for using Lucene-based indexes on Azure, especially if you are load balancing. Just to make Lucene indexes work in Azure, even without load balancing, a bunch of trickery needs to happen behind the scenes (i.e. %Temp% storage is required, etc.). When you add load balancing it gets even trickier because there is no central index: there is an index per node and, as you say, they could get out of sync for all sorts of reasons. They only stay in sync in Umbraco based on Umbraco's cache refreshers. Now if you add slot swapping to the mix, things probably get even more complex.
What is the answer? Well, ExamineX with Azure or Elastic search is the best answer since it solves all of these issues. However, if you choose to keep trying to use Lucene-based indexes in Azure with load balancing, then there might be some options, but they will require custom implementations.
For the most part, indexes will stay in sync with the CM, but with slot swapping, here's what happens: when you swap your staging for your live, your staging site will have a local index based on the information from your staging CM site, since it has only been kept in sync with your staging CM. This means that the local index on this node will need to be rebuilt so that it is in sync with your live database data. Similarly, the nucache file will also only be in sync with your staging CM, not your live CM, so I'm not sure how you are currently working around this.
Indexes (and nucache) will be rebuilt automatically by Umbraco based on whether it is a cold boot ... This would be the ideal way to deal with this scenario. If your staging site (which is in sync with your staging DB) becomes live, then it will not be in sync and a cold boot should be executed.
I'm not sure why this isn't documented anywhere on the Umbraco docs site, but to force a cold boot, you can clear out the ... Perhaps when a site is swapped, it is not restarted, which would mean that a cold boot doesn't take place since there is no reboot? That is something you would need to investigate, and it would also depend on how you are doing the swap. If it is done programmatically, then you could probably force-delete that folder and then do the swap.
Actually, looking at the swap docs, the source site is restarted, but a cold boot will probably not occur because it has its last synced file. If you could swap the slots programmatically, then you could first clear that folder and then initiate the swap. This should cause a cold boot during the restart, while the site is now pointing to your production database.
Alternatively, you could use a custom warm-up https://learn.microsoft.com/en-us/azure/app-service/deploy-staging-slots?tabs=portal#Warm-up and initiate index rebuilds. FYI, this is how Umbraco rebuilds indexes on startup, so you probably don't want to conflict with its own operations: https://github.com/umbraco/Umbraco-CMS/blob/contrib/src/Umbraco.Infrastructure/Examine/RebuildOnStartupHandler.cs
This handler waits for one minute after the first HTTP request is made before initiating the rebuilds (so that it doesn't interfere with site bootup/loading). If it is a cold boot, it will rebuild all of them, else only the empty ones. You could potentially copy the code in this handler, remove the default one, and add your own with custom logic to force cold-boot re-indexing if you know the site has just been swapped. These are only ideas I'm coming up with, but essentially this is all based on Umbraco logic, not Examine.
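To make the decision described above concrete, here is a minimal, language-neutral sketch in Python (the real handler is C# inside Umbraco). Everything in it is an assumption for illustration: the function name, the `just_swapped` flag and the dictionary of document counts are hypothetical and are not Umbraco or Examine APIs. It only shows the rule "cold boot rebuilds everything, otherwise only empty indexes", extended with a forced full rebuild after a swap.

```python
from typing import Dict, List


def indexes_to_rebuild(doc_counts: Dict[str, int],
                       is_cold_boot: bool,
                       just_swapped: bool = False) -> List[str]:
    """Return the names of the indexes that should be rebuilt.

    doc_counts maps an index name to its current document count.
    just_swapped is a hypothetical flag that a custom handler would set
    when it knows the slot has just been swapped.
    """
    if is_cold_boot or just_swapped:
        # Cold boot (or a forced one after a swap): rebuild every index.
        return sorted(doc_counts)
    # Warm boot: only rebuild the indexes that are empty.
    return sorted(name for name, count in doc_counts.items() if count == 0)
```

For example, calling `indexes_to_rebuild({"ExternalIndex": 10, "InternalIndex": 0}, is_cold_boot=False)` selects only `InternalIndex`, while passing `just_swapped=True` selects both. How a node would actually detect that it has just been swapped is left open here.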
Hi @Shazwazza, first of all, thank you for the clear explanation! This health check is based on this piece of Umbraco code.
We're also looking to upgrade to Umbraco v13 and we're wondering if this bugfix would help us at all since they mention the following:
Thanks!
@IOSven yes that change will help because of how the DistCache files are named along with the naming conventions for the index folders. This is probably why nucache works for you today with slot swaps but not Umbraco.
Please be aware of over-rebuilding indexes. Rebuilding should only be done when necessary. There is a heavy database penalty for the queries it executes, and because those queries take so long, they can cause DB lock timeouts for editors who are actively trying to edit content.
But how does this check whether the index is in sync with the CM database?
Hi @Shazwazza, we are indeed only rebuilding our indexes when we really have to, i.e. when the document count is 0. We're running this Examine health check on both our CD and CM environments. Is there a better workaround that you could think of? Thanks!
It's just that your original question relates directly to your indexes getting out of sync. It sounds like the health checks you have implemented don't actually check whether the indexes are in sync, only whether they are empty. Please note that Umbraco will automatically rebuild them on startup if they are empty, so you shouldn't have to handle that yourself either. Determining whether your indexes are in sync would require some custom logic that doesn't currently exist. There would be a few ways to try to do that, but the most ideal way would be for the node to simply query the local index for a specific record with a specific value; if it didn't match, it would mean the index is out of sync. How you would do that, or other alternatives, I'll have to leave up to you. Again, the most ideal way to deal with indexes, load balancing and Azure is to use a hosted search service like Azure/Elastic search with ExamineX; then there's nothing to worry about.
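The "query the local index for a specific record with a specific value" idea could look roughly like the following sketch. It is written in Python purely as a language-neutral illustration; `lookup`, `sentinel_id` and `expected_value` are hypothetical names, not Examine APIs. It assumes the expected value is read from the source of truth (for example the CM database) and compared against what the local index returns for the same record.

```python
from typing import Callable, Optional


def index_in_sync(lookup: Callable[[str], Optional[str]],
                  sentinel_id: str,
                  expected_value: str) -> bool:
    """Return True when the local index holds the expected sentinel value.

    lookup is a hypothetical callable that queries the local index for a
    record id and returns the stored field value, or None if not found.
    expected_value would come from the source of truth, e.g. the database.
    """
    return lookup(sentinel_id) == expected_value


# Usage with an in-memory stand-in for a local index (illustration only).
local_index = {"sentinel": "2024-01-01T00:00:00Z"}
stale = not index_in_sync(local_index.get, "sentinel", "2024-01-02T00:00:00Z")
```

Here `stale` is `True` because the stand-in index still holds the older value. A missing record also counts as out of sync, since `lookup` returns `None`. Whether a mismatch should trigger a rebuild, and how often the check runs, would need to be weighed against the rebuild cost mentioned earlier in this thread.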
Hi @Shazwazza, we've been experimenting with ExamineX and have also bought a paid license. The implementation of ExamineX in our project is currently on hold pending further investigation.
Thanks @IOSven for the info. Happy to assist on the ExamineX tracker regarding any performance investigations. The only performance concerns with ExamineX would simply be latency due to HTTP requests when searching or indexing, but there is far less overhead on the local CPU than with Examine since there is no underlying Lucene engine. It would be interesting to see where your bottlenecks are/were.
Which Umbraco version are you using? (Please write the exact version, example: 10.1.0)
v10.8.2
Bug summary
Hi, we keep struggling with the use of Lucene indexes on our load-balanced environment.
For each environment (test, acceptance, production) we've got 2 separate web apps,
one for content delivery (= CD = FE Only) and one for content management (= CM = Backoffice only).
We've added the recommended configurations for maindomlock, localTempstorageLocation & luceneDirectoryFactory
as described on this page:
https://docs.umbraco.com/umbraco-cms/fundamentals/setup/server-setup/load-balancing/azure-web-apps
We've also configured an explicit schedulingPublisher & subscriber as mentioned in:
https://docs.umbraco.com/umbraco-cms/fundamentals/setup/server-setup/load-balancing/flexible-advanced
We noticed that the Examine indexes on our CD web apps are not in sync with the Examine indexes on our CM web app, even though
we're only using the default internal & externalIndex that we've only extended by using the TransformingIndexValues
eventHandler.
Example:
How do we notice that our CM/CD webapp examine indexes are not in sync?
After some more investigation, we notice (or at least think) that the problem has to do with slot swapping in combination with load balancing.
We've already created an Umbraco support ticket to escalate this issue, since our client is becoming impatient: they encounter this problem every day on their production environment.
Umbraco support gave us the following update:
Umbraco support:
My answer:
Umbraco support:
Unfortunately, Umbraco support cannot provide us with a specific example of this code implementation, since they say they do not have any documentation on it.
Do you have an idea if this would fix our problem and if so, how we could implement this?
Also - Currently we're not using swapping on our production environment, but this environment is also affected by the same issue.
Thanks in advance!
Sven