Skip to content
This repository has been archived by the owner on Jul 13, 2023. It is now read-only.

Instead of copying files in adapters, create hard links where possible #2120

Merged
merged 1 commit into from
Aug 19, 2016

Conversation

cbeckr
Copy link
Contributor

@cbeckr cbeckr commented Mar 1, 2016

Paperclip duplicates the original files quite a lot as part of its validation process. (#1642, #1326).
When uploading large files (several hundred megabytes to gigabyte range), this becomes a problem: The web server will be busy creating 3 - 4 duplicates on disk, while the app (and potentially the user) are waiting for the upload operation to complete.
This pull request introduces hard links instead of FileUtil.cp where possible to keep the logic as-is but save time and disk space.

destination
end

def link_or_copy_file(src, dest)
begin

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant begin block detected.

FileUtils.mv(src, dest)
end
end

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra empty line detected at module body end.

@tute tute merged commit 0ce6bbc into thoughtbot:v4.3 Aug 19, 2016
@tute
Copy link
Contributor

tute commented Aug 19, 2016

@cbeckr you sent the PR to v4.3 but I'm not planning on adding new features to that version. Can you reopen your PR to master please?

In master I got the following failures with your patch:

Failures:

  1) Attachment Processing using validates_attachment processes attachments given a valid assignment
     Failure/Error: attachment.expects(:post_process_styles)

     Mocha::ExpectationError:
       not all expectations were satisfied
       unsatisfied expectations:
       - expected exactly once, not yet invoked: #<Paperclip::Attachment:0x7f8f652314e0>.post_process_styles(any_parameters)
       satisfied expectations:
       - allowed any number of times, not yet invoked: Paperclip::Deprecations.aws_sdk_version(any_parameters)
       - allowed any number of times, not yet invoked: Paperclip::Deprecations.active_model_version(any_parameters)
     # ./spec/paperclip/attachment_processing_spec.rb:57:in `block (3 levels) in <top (required)>'

  2) Missing Attachment Styles is able to calculate differences between registered styles and current styles
     Failure/Error:
       File.open(Paperclip.registered_attachments_styles_path, 'w') do |f|
         YAML.dump(current_attachments_styles, f)
       end

     Errno::ENOENT:
       No such file or directory @ rb_sysopen - /Users/tutec/Sites/thoughtbot/paperclip/paperclip/tmp/public/system/paperclip_attachments.yml
     # ./lib/paperclip/missing_attachment_styles.rb:21:in `initialize'
     # ./lib/paperclip/missing_attachment_styles.rb:21:in `open'
     # ./lib/paperclip/missing_attachment_styles.rb:21:in `save_current_attachments_styles!'
     # ./spec/paperclip/paperclip_missing_attachment_styles_spec.rb:43:in `block (2 levels) in <top (required)>'

  3) Missing Attachment Styles is able to save current attachment styles for further comparison
     Failure/Error:
       File.open(Paperclip.registered_attachments_styles_path, 'w') do |f|
         YAML.dump(current_attachments_styles, f)
       end

     Errno::ENOENT:
       No such file or directory @ rb_sysopen - /Users/tutec/Sites/thoughtbot/paperclip/paperclip/tmp/public/system/paperclip_attachments.yml
     # ./lib/paperclip/missing_attachment_styles.rb:21:in `initialize'
     # ./lib/paperclip/missing_attachment_styles.rb:21:in `open'
     # ./lib/paperclip/missing_attachment_styles.rb:21:in `save_current_attachments_styles!'
     # ./spec/paperclip/paperclip_missing_attachment_styles_spec.rb:29:in `block (2 levels) in <top (required)>'

  4) Missing Attachment Styles is able to read registered attachment styles from file
     Failure/Error:
       File.open(Paperclip.registered_attachments_styles_path, 'w') do |f|
         YAML.dump(current_attachments_styles, f)
       end

     Errno::ENOENT:
       No such file or directory @ rb_sysopen - /Users/tutec/Sites/thoughtbot/paperclip/paperclip/tmp/public/system/paperclip_attachments.yml
     # ./lib/paperclip/missing_attachment_styles.rb:21:in `initialize'
     # ./lib/paperclip/missing_attachment_styles.rb:21:in `open'
     # ./lib/paperclip/missing_attachment_styles.rb:21:in `save_current_attachments_styles!'
     # ./spec/paperclip/paperclip_missing_attachment_styles_spec.rb:36:in `block (2 levels) in <top (required)>'

  5) Missing Attachment Styles is able to calculate differences when a new attachment is added to a model
     Failure/Error:
       File.open(Paperclip.registered_attachments_styles_path, 'w') do |f|
         YAML.dump(current_attachments_styles, f)
       end

     Errno::ENOENT:
       No such file or directory @ rb_sysopen - /Users/tutec/Sites/thoughtbot/paperclip/paperclip/tmp/public/system/paperclip_attachments.yml
     # ./lib/paperclip/missing_attachment_styles.rb:21:in `initialize'
     # ./lib/paperclip/missing_attachment_styles.rb:21:in `open'
     # ./lib/paperclip/missing_attachment_styles.rb:21:in `save_current_attachments_styles!'
     # ./spec/paperclip/paperclip_missing_attachment_styles_spec.rb:65:in `block (2 levels) in <top (required)>'

  6) Paperclip Many models at once does not exceed the open file limit
     Failure/Error: FileUtils.chmod( resolved_chmod, path(style_name) )

     Errno::ENOENT:
       No such file or directory @ chmod_internal - /Users/tutec/Sites/thoughtbot/paperclip/paperclip/tmp/public/system/dummies/avatars/000/000/210/original/5k.png
     # ./lib/paperclip/storage/filesystem.rb:50:in `block in flush_writes'
     # ./lib/paperclip/storage/filesystem.rb:37:in `each'
     # ./lib/paperclip/storage/filesystem.rb:37:in `flush_writes'
     # ./lib/paperclip/attachment.rb:239:in `save'
     # ./lib/paperclip/has_attached_file.rb:92:in `block in add_active_record_callbacks'
     # ./spec/paperclip/integration_spec.rb:11:in `block (4 levels) in <top (required)>'
     # ./spec/paperclip/integration_spec.rb:10:in `times'
     # ./spec/paperclip/integration_spec.rb:10:in `block (3 levels) in <top (required)>'

Finished in 32.88 seconds (files took 1.17 seconds to load)
1072 examples, 6 failures

Failed examples:

rspec ./spec/paperclip/attachment_processing_spec.rb:52 # Attachment Processing using validates_attachment processes attachments given a valid assignment
rspec ./spec/paperclip/paperclip_missing_attachment_styles_spec.rb:41 # Missing Attachment Styles is able to calculate differences between registered styles and current styles
rspec ./spec/paperclip/paperclip_missing_attachment_styles_spec.rb:27 # Missing Attachment Styles is able to save current attachment styles for further comparison
rspec ./spec/paperclip/paperclip_missing_attachment_styles_spec.rb:34 # Missing Attachment Styles is able to read registered attachment styles from file
rspec ./spec/paperclip/paperclip_missing_attachment_styles_spec.rb:63 # Missing Attachment Styles is able to calculate differences when a new attachment is added to a model
rspec ./spec/paperclip/integration_spec.rb:17 # Paperclip Many models at once does not exceed the open file limit

After we tame them, we'll merge. Thank you very much for sharing your work.

@cbeckr
Copy link
Contributor Author

cbeckr commented Aug 25, 2016

Sounds great - rebased and resubmitted as #2290. I wasn't able to reproduce the errors following the guidelines in CONTRIBUTING.md, even when testing with Appraisal. Could these have been transient?

tute pushed a commit that referenced this pull request Aug 28, 2016
#2290)

Rebased #2120 to master.
Paperclip duplicates the original files quite a lot as part of its validation process. (#1642, #1326).
When uploading large files (several hundred megabytes to gigabyte range), this becomes a problem: The web server will be busy creating 3 - 4 duplicates on disk, while the app (and potentially the user) are waiting for the upload operation to complete.
This pull request introduces hard links instead of ```FileUtil.cp``` where possible to keep the logic as-is but save time and disk space.
@jirihradil
Copy link

jirihradil commented Feb 5, 2018

There is a problem with rights on Mac (Paperclip 5.2.1):

[paperclip] Trying to link /tmp/1517831457-2f6f5305efc9a98e23eba431b5f0a573.zip to /var/folders/vy/zl7w58rd38j02td1s14tvcgh0000gn/T/a9e1d488324109cb7e946b0933ac897120180205-2161-9ey2b1.zip
Errno::EACCES: Permission denied @ rb_sysopen - /var/folders/vy/zl7w58rd38j02td1s14tvcgh0000gn/T/a9e1d488324109cb7e946b0933ac897120180205-2161-9ey2b1.zip

@jirihradil
Copy link

Just FYI - the problem is now resolved by setting read+write right on the created file. Because there is a hardlink, write right is required.

HoneyryderChuck pushed a commit to onfido/paperclip that referenced this pull request Jul 15, 2021
thoughtbot#2290)

Rebased thoughtbot#2120 to master.
Paperclip duplicates the original files quite a lot as part of its validation process. (thoughtbot#1642, thoughtbot#1326).
When uploading large files (several hundred megabytes to gigabyte range), this becomes a problem: The web server will be busy creating 3 - 4 duplicates on disk, while the app (and potentially the user) are waiting for the upload operation to complete.
This pull request introduces hard links instead of ```FileUtil.cp``` where possible to keep the logic as-is but save time and disk space.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants