Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make sure that InterProScan installation directory is writable before creating symlink to data in InterProScan_data easyconfig #16169

Closed
wants to merge 1 commit into from
Closed

Conversation

Ghepardo
Copy link
Contributor

(created using eb --new-pr)

…teable before attempting to create symbolic link in it, and to return it to read-only afterwards. Installation fails otherwise.
@boegel boegel added the bug fix label Aug 31, 2022
@boegel boegel added this to the 4.x milestone Aug 31, 2022
@@ -30,8 +30,10 @@ extract_sources = False
# NOTE The regular InterProScan data package does not contain the lookup_service data way too large!
install_cmd = 'tar xfz interproscan-data-%(version)s.tar.gz --directory %(installdir)s/ --strip-components=1 && '
install_cmd += 'rm -f interproscan-data-%(version)s.tar.gz && '
install_cmd += 'chmod u+w "$EBROOTINTERPROSCAN" && '
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a big fan of this, since it effectively circumvents making install directories read-only.

From the discussion in #15717 I understand that this is basically the only option since InterProScan doesn't give a better alternative, but still...

It's pretty icky to have software A fiddle in the install directory of software B (even though A & B are clearly closely related here)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@boegel I completely agree with you there. Surely the answer to this situation is to have two alternative builds for the same InterProScan version, one with the data and one without? Then the one with the data would put the data in as part of the build. Then we could get rid of that horrible separate InterProScan_data module completely. Would that be reasonable, and would you like me to set the wheels in motion for that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We briefly discussed this during the EasyBuild conf call today (@smoretti was there, he did the split between InterProScan and InterProScan_data originally).

Going forward with a "non-data" and a "data" InterProScan easyconfig seems like a good approach forward.

@smoretti Can you summarize the discussion we had, and what your take away is?

Copy link
Contributor Author

@Ghepardo Ghepardo Sep 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smoretti I would be interested to hear your thoughts. I do apologise if I seemed disparaging about the work on this you did previously. I'd be happy to produce "no-data" and "with-data" versions of the InterProScan easyconfig. So that I can get this right, would you let me know the reasons for the following changes you made in the easyconfig?

  • Using the Binary easyblock instead of the Tarball one.
  • Going to the trouble of removing from the installation certain files you appear to regard as superflous, saving only around 10M in disk space.
  • Removing execute permissions from the JAR file.

I'm inclined to do a new build for InterProScan 5.57/InterPro 90, and to use the latest available support packages in EB that will form a consistent set - Java 11 [I believe no later version is supported by InterProScan], Perl 5.34.0, libgd 2.3.3 and Python 3.9.6. Does that seem reasonable to you?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our intended users of InterProScan say that using SignalP with it is pretty essential. That is one of the applications that is not bundled with InterProScan, but it is available within EasyBuild. To use it we'd need to patch the InterProScan config file, because SignalP will be in a separate location (and will be a different version). Therefore I propose to produce a third easyconfig flavour of InterProScan 5.57-90.0, one that loads SignalP as a dependency and uses it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smoretti Can you summarize the discussion we had, and what your take away is?

The discussion was mainly to avoid an eb to change the permissions of an already installed eb.

Before doing the split between code and data, I asked the best way to do it on the EasyBuild Slack: https://easybuild.slack.com/archives/C34UA1HT7/p1652962818739379
But did not think about the permission issue.

The best is maybe - like suggested during the discussion - to have a eb file interproscan full (code + data) as before, and an interproscan code only.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smoretti I would be interested to hear your thoughts. I do apologise if I seemed disparaging about the work on this you did previously. I'd be happy to produce "no-data" and "with-data" versions of the InterProScan easyconfig. So that I can get this right, would you let me know the reasons for the following changes you made in the easyconfig?

* Using the Binary easyblock instead of the Tarball one.

I don't remember exactly why I did it. Maybe during some trials I used the full interproscan tarball and extract manually only the code from it.

* Going to the trouble of removing from the installation certain files you appear to regard as superflous, saving only around 10M in disk space.

I install interproscan for years. Usually I remove some 32bits binaries (they are also provided as 64bits), some tests and source files.

* Removing execute permissions from the JAR file.

This is cleaner

I'm inclined to do a new build for InterProScan 5.57/InterPro 90, and to use the latest available support packages in EB that will form a consistent set - Java 11 [I believe no later version is supported by InterProScan], Perl 5.34.0, libgd 2.3.3 and Python 3.9.6. Does that seem reasonable to you?

It seems reasonable to me.

I split code and data because my development machine has not a lot of disk space and the data indexing at the end of the installation fulfils my disk.

Also one of the user at my institute needs the extra lookup_service data (1.2TB compressed) not provided in the full tarball. So I always extract interproscan data by hands, and put it on a different disk.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our intended users of InterProScan say that using SignalP with it is pretty essential. That is one of the applications that is not bundled with InterProScan, but it is available within EasyBuild. To use it we'd need to patch the InterProScan config file, because SignalP will be in a separate location (and will be a different version). Therefore I propose to produce a third easyconfig flavour of InterProScan 5.57-90.0, one that loads SignalP as a dependency and uses it.

Not sure about this.

SignalP, as well as TMHMM and Phobius, are extra tools that InterProScan can use. But they are license protected, not directly downloadable.

An installation with the --robot option will fail. What do you think about that @boegel ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @smoretti . It's useful and interesting to know your experiences with this package.

I'm inclined to change the default installation as little as possible, unless there's a clear benefit from changes or a compelling reason for them. This makes it more maintainable for the future. Therefore I think I'll not put those customisations into the next build. Also I'm not sure that there's any case for using the Binary easyblock here, so I'll revert to the Tarball one.

The next easyconfig version will definitely have a "no-data" flavour available.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smoretti & @boegel On reflection, I think it may complicate things too much to create a third, SignalP flavour of the the easyconfig. Given that InterProScan uses several other tools, this could open the door to wanting a new InterProScan flavour for each of those - not to mention flavours for different combinations of them. That way lies madness.

I shall therefore simply amend our own InterProScan config manually after each installation, to point it to the correct SignalP.

@boegel boegel changed the title Resolve being unable to write to InterProScan application directory make sure that InterProScan installation directory is writable before creating symlink to data in InterProScan_data easyconfig Aug 31, 2022
@Ghepardo Ghepardo closed this by deleting the head repository Apr 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants