Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean orphaned page article link records #4175

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 6 additions & 15 deletions app/models/article.rb
Original file line number Diff line number Diff line change
@@ -1,11 +1,3 @@
# create_table :articles do |t|
# # t.column :name, :string
# t.column :title, :string
# t.column :source_text, :text
# # automated stuff
# t.column :created_on, :datetime
# t.column :lock_version, :integer, :default => 0
# end
# == Schema Information
#
# Table name: articles
Expand Down Expand Up @@ -41,21 +33,20 @@ class Article < ApplicationRecord
validates :latitude, allow_blank: true, numericality: { less_than_or_equal_to: 90, greater_than_or_equal_to: -90}
validates :longitude, allow_blank: true, numericality: { less_than_or_equal_to: 180, greater_than_or_equal_to: -180}


has_and_belongs_to_many :categories, -> { distinct }
belongs_to :collection, optional: true
has_many(:target_article_links, :foreign_key => "target_article_id", :class_name => 'ArticleArticleLink')
scope :target_article_links, -> { include 'source_article' }
scope :target_article_links, -> { order "articles.title ASC" }

has_many(:source_article_links, :foreign_key => "source_article_id", :class_name => 'ArticleArticleLink')
has_many(:page_article_links)
has_many :page_article_links, dependent: :destroy
scope :page_article_links, -> { includes(:page) }
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This callback is in page.rb, but seems missing on article side.

scope :page_article_links, -> { order("pages.work_id, pages.position ASC") }

scope :pages_for_this_article, -> { order("pages.work_id, pages.position ASC").includes(:pages)}
scope :pages_for_this_article, -> { order("pages.work_id, pages.position ASC").includes(:pages) }

has_many :pages, through: :page_article_links, counter_cache: true
has_many :pages, through: :page_article_links

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting we're missing the counter_cache here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bug in current rails version. This ticket provides more discussions on that

#4005

has_many :article_versions, -> { order 'version DESC' }, dependent: :destroy

Expand All @@ -78,10 +69,10 @@ def source_text
self[:source_text] || ''
end


def self.delete_orphan_articles
# don't delete orphan articles with contents
Article.delete_all("source_text IS NULL AND id NOT IN (select article_id from page_article_links)")
Article.where(provenance: nil).
where('source_text IS NULL AND id NOT IN (SELECT article_id FROM page_article_links)').destroy_all
end
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this was not working the way we expected it to, it could account for some of what we're seeing in #4269


#######################
Expand Down Expand Up @@ -146,7 +137,7 @@ def possible_duplicates
def clear_links(type='does_not_apply')
# clear out the existing links to this page
if self.id
ArticleArticleLink.where("source_article_id = #{self.id}").delete_all
ArticleArticleLink.where("source_article_id = #{self.id}").destroy_all
end
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

end

Expand Down
7 changes: 3 additions & 4 deletions app/models/page.rb
Original file line number Diff line number Diff line change
Expand Up @@ -56,9 +56,8 @@ class Page < ApplicationRecord
acts_as_list :scope => :work
belongs_to :last_editor, :class_name => 'User', :foreign_key => 'last_editor_user_id', optional: true


has_many :page_article_links, :dependent => :destroy
has_many :articles, :through => :page_article_links
has_many :page_article_links, dependent: :destroy
has_many :articles, through: :page_article_links
has_many :page_versions, -> { order 'page_version DESC' }, :dependent => :destroy

belongs_to :current_version, :class_name => 'PageVersion', :foreign_key => 'page_version_id', optional: true
Expand Down Expand Up @@ -529,7 +528,7 @@ def clear_links(text_type)
if self.page_article_links.present?
self.clear_article_graphs
# clear out the existing links to this page
PageArticleLink.where("page_id = #{self.id} and text_type = '#{text_type}'").delete_all
PageArticleLink.where("page_id = #{self.id} and text_type = '#{text_type}'").destroy_all
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing delete_all does not trigger active_record callbacks , therefore no validations.

Moving forward, unless we specifically want to avoid callbacks, we should use destroy instead of delete in order to avoid unvalidated changes.

end
end

Expand Down
9 changes: 1 addition & 8 deletions app/models/page_article_link.rb
Original file line number Diff line number Diff line change
@@ -1,10 +1,3 @@
# # foreign keys
# t.column :page_id, :integer
# t.column :article_id, :integer
# # text
# t.column :display_text, :string
# # automated stuff
# t.column :created_on, :datetime
# == Schema Information
#
# Table name: page_article_links
Expand All @@ -23,5 +16,5 @@
#
class PageArticleLink < ApplicationRecord
belongs_to :page, optional: true
belongs_to :article, optional: true
belongs_to :article, counter_cache: :pages_count, optional: true
end
9 changes: 9 additions & 0 deletions db/migrate/20240625185153_clean_orphan_page_link_records.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
class CleanOrphanPageLinkRecords < ActiveRecord::Migration[6.1]

def up
page_article_links = PageArticleLink.where(article_id: nil).or(PageArticleLink.where(page_id: nil))

page_article_links.in_batches(of: 1000).destroy_all
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

migration to clean existing bad records

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same caveat here.

end

end
Loading