Working container for slurm 22.5 on Ubuntu 22.04 #233

Closed
Caian opened this issue Jan 11, 2023 · 8 comments
Labels
deprecated version The issue concerns a deprecated version of the software.

Comments

@Caian

Caian commented Jan 11, 2023

Hello all,

Following daverona's work porting the Docker container to a newer Slurm/system (#222), we decided to use their repo, bump the version of Slurm to 22.5, apply patch #219, and include a docker-compose example.

It builds libslurm and pyslurm from source using Ubuntu 22.04 with the debs generated on Ubuntu 20.

This requires building through build.sh instead of calling docker build directly.

https://github.com/hpg-cepetro/slurm-web-docker-22.5-ubuntu-22.04

Patch #219 fixes several issues; unfortunately, the 2D rack view is still not working.

Best regards,

@carlos-encs

Hello @Caian

I installed your slurm-web-docker, but when I try to open http://cluster.local:8081/slurm/, I get a blank page showing only a blue arrow at the top left and a spinning circle in the middle.

I used the Firefox console to debug the page and found the following:

HTTP/1.1 404 NOT FOUND on all these entries:

http://cluster.local:8081/slurm-web-conf/config.json
http://cluster.local:8081/slurm-web-conf/clusters.config.json
http://cluster.local:8081/slurm-web-conf/2d.colors.config.json
http://cluster.local:8081/slurm-web-conf/2d.config.json

Source map error: Error: request failed with status 404
Resource URL: http://cluster.local:8081/javascript/bootstrap/js/bootstrap-tagsinput.min.js

Could you help me fix these issues?

Regards

@Caian
Author

Caian commented Mar 26, 2023

Hi @carlos-encs, how did you start the container? Are you binding /etc/slurm-web to a directory in the host machine?

Before binding volumes, you should make a copy of the default configuration files that are inside the container. You could use the ones from https://github.com/edf-hpc/slurm-web/tree/master/conf, but I'm not sure if they are compatible.
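
One way to make that copy is sketched below. It assumes the image is tagged `slurm-web` and that the host directory is `./slurm-web2/conf`, as used elsewhere in this thread; the function name `seed_slurm_web_conf` is hypothetical. It relies only on `docker create`, `docker cp`, and `docker rm`.

```shell
#!/bin/sh
# Sketch: copy the default /etc/slurm-web out of an image so the host
# directory can later be bind-mounted over it without hiding the defaults.
# The function name is made up for illustration.

seed_slurm_web_conf() {
    image="$1"   # e.g. slurm-web
    dest="$2"    # e.g. ./slurm-web2/conf

    # Refuse to overwrite a directory that already has files in it.
    if [ -d "$dest" ] && [ -n "$(ls -A "$dest")" ]; then
        echo "refusing to overwrite non-empty $dest" >&2
        return 1
    fi

    mkdir -p "$dest" || return 1

    # Create a stopped container; its filesystem is readable without running it.
    cid=$(docker create "$image") || return 1

    # The trailing "/." copies the directory's contents into the existing $dest.
    docker cp "$cid:/etc/slurm-web/." "$dest"
    status=$?

    # Always clean up the temporary container.
    docker rm "$cid" >/dev/null
    return "$status"
}

# Usage:
#   seed_slurm_web_conf slurm-web ./slurm-web2/conf
```

After this runs, the host directory holds the container's default files and can be edited and bind-mounted to `/etc/slurm-web`.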

@carlos-encs

Hi @Caian

This is how I start the container:
docker run -d --name slurm-web \
  -v ./slurm-web2/conf:/etc/slurm-web \
  -v /nfs/appdata/serv_slurm/munge/etc/:/etc/munge:ro \
  -v /nfs/appdata/serv_slurm/slurm-23.02.0/root/bin/:/etc/slurm-llnl:ro \
  -p 8081:80 \
  slurm-web

The munge and Slurm files are on an NFS share.
munge authentication is working fine.

Sorry for the beginner question, but I have to ask: should slurmrestd be configured and running?
Could you send me your racks.xml and restapi.conf? I don't understand how to configure those files properly.

Thanks
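
A bind mount on `/etc/slurm-web` hides whatever the image shipped in that directory, so the 404s reported earlier may simply mean the host directory lacks those files. The sketch below checks a directory for the four file names taken from the 404 list. The function name `missing_dashboard_conf` is hypothetical, and which subdirectory of `/etc/slurm-web` is served as `/slurm-web-conf/` is an assumption, so the directory is passed as an argument.

```shell
#!/bin/sh
# Sketch: report which dashboard config files are missing from a directory.
# The file names come from the 404 responses quoted in this thread; the
# function name is made up for illustration.

missing_dashboard_conf() {
    dir="$1"
    missing=0
    for f in config.json clusters.config.json 2d.colors.config.json 2d.config.json; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f"
            missing=$((missing + 1))
        fi
    done
    # Succeed only when every file was found.
    [ "$missing" -eq 0 ]
}

# Usage (the "dashboard" subdirectory is a guess, adjust to your layout):
#   missing_dashboard_conf ./slurm-web2/conf/dashboard
```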

@Caian
Author

Caian commented Apr 1, 2023

@carlos-encs I removed sensitive information from our environment.

restapi.conf:

[cors]
authorized_origins = http://localhost,http://XXXXXXXX,http://XXXXXXXX

[config]
authentication = disable
cache = enable

[roles]
guests = enabled
user = @chimistes
admin = @admin
restricted_fields_for_all = command
restricted_fields_for_user = command
restricted_fields_for_admin =

[ldap]
uri = ldap://XXXXXXXX
base_people = XXXXXXXX
base_group = XXXXXXXX
reader_dn = XXXXXXXX
reader_password = XXXXXXXX
expiration = 1296000
resolve_job_users = true
cache_job_users = true

[cache]
redis_host = redis
redis_port = 6379
jobs_expiration =
global_expiration =

racks.xml:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rackmap SYSTEM "/usr/share/slurm-web/restapi/schema/dtd/racks.dtd">
<rackmap>

  <nodetypes>
    <nodetype id="XXXXX" model="XXXXXXXXX" height="2" width="1" />
    <nodetype id="YYYYY" model="YYYYYYYYY" height="1" width="1" />
  </nodetypes>

  <racks posx="0" posy="0" width="2" depth="2">
    <racksrow posx="0">
      <rack id="rack1" posy="0">
        <nodes>
          <node id="AA" type="XXXXX" posx="0" posy="0" />
          <node id="BB" type="XXXXX" posx="0" posy="1" />
          <nodeset id="CC" type="YYYYY" posy="5" />
        </nodes>
      </rack>
    </racksrow>
  </racks>
</rackmap>

No, you don't need slurmrestd working.

@nekokani

Have you solved your problem?
I ran into the same issue. :(

@BigDataHealthcare

BigDataHealthcare commented Jan 20, 2024

I am running the container and slurmctld on the same host, which is a virtual machine. I have set clusters.config.json to []. However, I am getting the following error.

image

I found that http://localhost:8080/slurm-restapi returns an "Internal Server Error" (status 500). I am unable to find the root cause of the issue.
Could you please share any information that would help me resolve it?
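
To narrow down a failure like this, it can help to record the status code of each endpoint separately. The sketch below uses only standard `curl` options; the function name `http_status` is hypothetical, and the URLs in the usage comment are the ones mentioned in this thread, with port 8080 assumed from the comment above.

```shell
#!/bin/sh
# Sketch: print "<status code> <url>" for each URL given.
# curl reports 000 when no HTTP response was received at all.
# The function name is made up for illustration.

http_status() {
    for url in "$@"; do
        code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$url")
        echo "$code $url"
    done
}

# Usage:
#   http_status http://localhost:8080/slurm-restapi \
#               http://localhost:8080/slurm-web-conf/clusters.config.json
```

A 500 from the REST API alongside a 200 from the config file would point at the backend rather than the dashboard configuration; the web server's error log inside the container is the next place to look.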

@rezib rezib added the deprecated version The issue concerns a deprecated version of the software. label May 13, 2024
@rezib rezib self-assigned this May 13, 2024
@rezib
Contributor

rezib commented May 15, 2024

This issue concerns Slurm-web v2, which is no longer maintained. You are highly encouraged to test the new version, v3.0.0, for which a quick start guide is available online: https://docs.rackslab.io/slurm-web/install/quickstart.html

Note that Slurm-web v3.0.0 is officially supported on Ubuntu 24.04 LTS with deb packages. For older versions, we plan to distribute containers and this effort is tracked in #266.

Unless someone is motivated to maintain the old version of Slurm-web or you have a justified reason to keep this issue open, it will be closed in a few weeks.

@rezib rezib added the pending closure Unless justified reason is provided, it will be closed soon without solution. label May 15, 2024
@rezib rezib removed their assignment May 15, 2024
@rezib
Contributor

rezib commented Jun 19, 2024

For the reasons explained in the previous comment, I am now closing this issue.

@rezib rezib closed this as completed Jun 19, 2024
@rezib rezib removed the pending closure Unless justified reason is provided, it will be closed soon without solution. label Jun 19, 2024