Platform Resource Management #54

Closed
6 of 10 tasks
drniiken opened this issue May 2, 2018 · 4 comments
Assignees
Labels
a:director issue related with the director service a:infra+ops maintenance of infrastructure or operations (discussed in retro) a:sidecar issue related with the sidecar worker service a:webserver issue related to the webserver service PO issue Created by Product owners [PLEASE use osparc-issue repo]

Comments

@drniiken
Member

drniiken commented May 2, 2018

GOAL
As DevOps, I want to have full control of hardware resources consumed by the platform. I also need good monitoring/logging/debugging tools to keep track of issues and problems.

Lifetime of Services

  • if a user logs out of the platform, resources should automatically be freed
  • after a long time of inactivity, resources should automatically be freed

Protocol for Problem Handling

  • what to do if a node fails
  • what to do if a service fails

Monitoring Tools

  • access to swarm logs
  • access to node logs
  • access to service logs
  • look into available tools, e.g., Docker EE

Hardware Allocation

  • automation of cloud resource management
  • integration into scheduler
@oetiker
Member

oetiker commented Sep 27, 2018

resources should probably be logged 'per user' so that we know who is causing the damage
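
A minimal sketch of what per-user accounting could look like, assuming usage samples are already collected somewhere. The `UsageRecord` fields and the `usage_per_user` helper are hypothetical names chosen for illustration, not part of the platform:

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class UsageRecord(NamedTuple):
    """One usage sample for a service (hypothetical schema)."""
    user_id: str
    service_id: str
    cpu_seconds: float
    memory_mb: float


def usage_per_user(records: Iterable[UsageRecord]) -> Dict[str, Dict[str, float]]:
    """Sum CPU time and track peak memory for each user."""
    totals: Dict[str, Dict[str, float]] = defaultdict(
        lambda: {"cpu_seconds": 0.0, "peak_memory_mb": 0.0}
    )
    for record in records:
        entry = totals[record.user_id]
        entry["cpu_seconds"] += record.cpu_seconds
        entry["peak_memory_mb"] = max(entry["peak_memory_mb"], record.memory_mb)
    return dict(totals)
```

Sorting the result by `cpu_seconds` would then show which users consume the most.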

@mguidon mguidon removed the backlog label Nov 1, 2018
@mguidon mguidon added PO issue Created by Product owners [PLEASE use osparc-issue repo] and removed Area: devops labels Jan 10, 2019
@sanderegg sanderegg added the Epic label May 16, 2019
@pcrespov pcrespov self-assigned this Jul 5, 2019
@pcrespov pcrespov added this to the Brindisi milestone Aug 26, 2019
@pcrespov pcrespov modified the milestones: Brindisi, Borogravia Sep 20, 2019
@pcrespov pcrespov added Epic and removed t:epic labels Sep 20, 2019
@esraneufeld
Member

high priority (to be addressed for now):

  • Lifetime of Services

@sanderegg sanderegg added a:director issue related with the director service a:infra+ops maintenance of infrastructure or operations (discussed in retro) a:sidecar issue related with the sidecar worker service a:webserver issue related to the webserver service labels Oct 20, 2019
@mguidon mguidon modified the milestones: Fourecks or XXXX, Überwald Nov 15, 2019
@sanderegg
Member

State:

  • book-keeping of services started by user/study for each open browser tab:

    1. by monitoring websocket connection/disconnection (1 websocket per browser tab)
    2. by storing resources in an external in-memory data store (Redis), so that the webserver can be scaled under heavy load
  • allows automatic de-allocation of services when:

    1. browser tab is closed
    2. user logs out
    3. user disconnects for X amount of time
    4. user is inactive for Y amount of time
    5. anonymous user closes tabs or is inactive
  • deleting a project deletes all project resources, e.g. services, internal data and database entries

@sanderegg
Member

sanderegg commented Jan 15, 2020

State:

  • book-keeping of services started by user/study for each open browser tab:
    • by monitoring websocket connection/disconnection (1 websocket per browser tab)
    • by storing resources in an external in-memory data store (Redis), so that the webserver can be scaled under heavy load
  • dynamic services are now auto-deallocated when:
    • browser tab is closed
    • user actively logs out (this also logs the same user out of any other open browser tab)
    • user is disconnected for more than 15 minutes (e.g. network down); the services are kept up during that time
    • user closes a project by going to the dashboard (unless another tab is still open)
  • deleting a project deletes the project's resources
  • deleting a node inside a project deletes the node's resources
  • refreshing the browser page while a project is open now re-opens the same project instead of bringing the user to the dashboard
  • added a service called redis-commander that gives an overview of the Redis database state
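
The book-keeping described above can be sketched roughly as follows. This is only an illustration under stated assumptions: a plain Python dict stands in for Redis, and `TabRegistry` with its methods is a hypothetical name, not the webserver's actual code. The thread does not say how a closed tab is told apart from a network drop, so the sketch models only the disconnect grace period and logout.

```python
import time
from typing import Dict, List, Optional, Set, Tuple

GRACE_PERIOD_S = 15 * 60  # from the thread: services survive a 15-minute disconnect

Key = Tuple[str, str]  # (user_id, tab_id); one websocket per browser tab


class TabRegistry:
    """In-memory stand-in for the Redis-backed registry (hypothetical layout)."""

    def __init__(self) -> None:
        self.services: Dict[Key, Set[str]] = {}      # services started from each tab
        self.disconnected_at: Dict[Key, float] = {}  # set only while the socket is down

    def connect(self, user_id: str, tab_id: str) -> None:
        key = (user_id, tab_id)
        self.services.setdefault(key, set())
        self.disconnected_at.pop(key, None)  # a reconnect cancels the pending cleanup

    def add_service(self, user_id: str, tab_id: str, service_id: str) -> None:
        self.services.setdefault((user_id, tab_id), set()).add(service_id)

    def disconnect(self, user_id: str, tab_id: str, now: Optional[float] = None) -> None:
        self.disconnected_at[(user_id, tab_id)] = time.monotonic() if now is None else now

    def logout(self, user_id: str) -> List[str]:
        """Free the services of every tab the user has open."""
        freed: List[str] = []
        for key in [k for k in self.services if k[0] == user_id]:
            freed.extend(sorted(self.services.pop(key)))
            self.disconnected_at.pop(key, None)
        return freed

    def collect_garbage(self, now: Optional[float] = None) -> List[str]:
        """Free the services of tabs disconnected for longer than the grace period."""
        now = time.monotonic() if now is None else now
        freed: List[str] = []
        for key, since in list(self.disconnected_at.items()):
            if now - since > GRACE_PERIOD_S:
                freed.extend(sorted(self.services.pop(key, set())))
                del self.disconnected_at[key]
        return freed
```

A periodic task would call `collect_garbage()` and ask the service that runs the dynamic services to stop whatever is returned. Keeping the state in an external store rather than in the process is what lets several webserver replicas share it.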

@KZzizzle KZzizzle modified the milestones: Überwald, Dim Sum Mar 23, 2020
@KZzizzle KZzizzle removed this from the Dim Sum milestone Apr 16, 2020
@drniiken drniiken closed this as completed Oct 2, 2020
7 participants