
KT Sessions

Shailesh Mahajan edited this page Aug 12, 2024 · 16 revisions

Knowledge Transfer meeting recordings

Repositories (Some of them)

AWS Console

Atlas Mongo DB

DEV Cluster

  • Log in at https://account.mongodb.com/account/login (ask an existing team member to invite you).
  • Once logged in to the cloud console, create a DB user for yourself with an appropriate role. These credentials are used to access the actual MongoDB databases and collections (if you don't have the privilege, reach out to an existing team member).
  • For the export and curator services there are a few deployment user accounts, such as curatorservice and exportservice. Their passwords are stored as env vars in the EKS pods.
  • Use MongoDB Compass or the shell to connect using mongodb+srv://username:password@cluster0.vwhx6.mongodb.net/ (example for the covid19 DB)
  • NOTE: Other DBs on this cluster each need their own credentials
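
Passwords that contain characters such as @ or / break the connection string unless they are percent-encoded. The following is a minimal sketch of building the URI safely; the host comes from the example above, while the function name, the sample credentials, and the use of covid19 as the database path are illustrative assumptions.

```python
from urllib.parse import quote_plus

# Host taken from the connection-string example in these notes.
CLUSTER_HOST = "cluster0.vwhx6.mongodb.net"


def build_mongo_uri(username: str, password: str, database: str = "") -> str:
    """Return a mongodb+srv URI with percent-encoded credentials.

    build_mongo_uri is a hypothetical helper, not part of any repo.
    """
    user = quote_plus(username)
    pwd = quote_plus(password)
    return f"mongodb+srv://{user}:{pwd}@{CLUSTER_HOST}/{database}"


# Sample credentials are made up; use your own Atlas DB user.
print(build_mongo_uri("my_user", "p@ss/word", "covid19"))
```

The resulting string can be pasted into MongoDB Compass or passed to the shell.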

PROD Cluster

Database and collections creation

  • Process for outbreaks not related to the curator portal, like Marburg and Ebola

    • Log in to Mongo Cloud
    • Create the database manually
    • To create a collection, run an insert against it (for example db.<collection>.insertOne()); MongoDB creates the collection on the first insert
    • No automated scripts as yet
  • Curator portal related outbreaks like Covid19

    • Log in to Mongo Cloud
    • Create the database manually
    • Collections are probably (?) created when the BE server receives its first request
    • Scripts in this folder are used to run migrations: data-serving/scripts/setup-db

Local Setup

Curator UI server, Curator API server, Data Server, Geocoding Service, Mongo Database

  • Follow the readme at https://github.com/globaldothealth/turnkey-curator-portal/blob/main/dev/README.md
  • Reach out to a current team member to get a valid .env file
  • The Docker setup brings up all services: Curator UI server, Curator API server, Data Server, Mongo Database.
  • Note: To retain data in the Docker container's MongoDB, remove the --force-recreate option from ./dev/run_stack.sh. With the option in place, every stop and start of docker compose creates a fresh MongoDB (which is needed for running E2E tests)
  • To make yourself admin with all roles
    • Run ./dev/make_superuser.sh mpox <YOUR_EMAIL_ADDRESS>. Note: Our .env has MONGO_DB_NAME and DISEASE_NAME set to mpox
    • Before running it, update ./dev/make_superuser.sh and add junior curator to the roles array (if you want that role)
  • Note:
    • Currently the Curator UI server and Curator API server are not used. They are still work-in-progress, intended to ease case creation
    • For now Google Sheets are used; they are manually converted to xls and uploaded to an S3 bucket for visualization. (This is true only for the recent influenza cases, as it was not feasible to set up the entire AWS process)

Lens (For monitoring EKS Clusters)

  • Add following to ~/.aws/credentials file

    [GLOBAL-HEALTH]
    aws_access_key_id=<YOUR_KEY>
    aws_secret_access_key=<YOUR_SECRET>
    region = eu-central-1
    
  • Run following commands to add DEV cluster to lens

    export AWS_PROFILE=GLOBAL-HEALTH
    aws eks update-kubeconfig --name gh-dev --alias gh-dev
    
  • Run following commands to add QA cluster to lens

    export AWS_PROFILE=GLOBAL-HEALTH
    aws eks update-kubeconfig --name gh-qa --alias gh-qa
    
  • Run following commands to add PROD cluster to lens

    export AWS_PROFILE=GLOBAL-HEALTH
    aws eks update-kubeconfig --name gh-prod --alias gh-prod
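
The three command pairs above differ only in the cluster name, so they can be generated in a loop. This is a sketch under the assumption that the AWS CLI is installed and the GLOBAL-HEALTH profile exists; the helper names are hypothetical, and the script defaults to a dry run that only prints the commands.

```python
import os
import subprocess

AWS_PROFILE = "GLOBAL-HEALTH"
CLUSTERS = ("gh-dev", "gh-qa", "gh-prod")  # cluster names from the steps above


def kubeconfig_command(cluster: str) -> list[str]:
    """Build the update-kubeconfig command for one cluster."""
    return ["aws", "eks", "update-kubeconfig", "--name", cluster, "--alias", cluster]


def register_clusters(dry_run: bool = True) -> list[list[str]]:
    """Print (dry run) or execute the command for every cluster."""
    env = {**os.environ, "AWS_PROFILE": AWS_PROFILE}
    commands = [kubeconfig_command(cluster) for cluster in CLUSTERS]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # Requires the aws CLI on PATH and valid credentials.
            subprocess.run(command, env=env, check=True)
    return commands


register_clusters(dry_run=True)
```

Calling register_clusters(dry_run=False) runs the commands for real; Lens then picks up the new contexts from the kubeconfig file.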
    

Visualization Frontend (for example, the monkeypox map)

General Product Information

Existing documentation

Deployment Architecture

Following is the link to update the deploy architecture diagram

Turnkey Curator Portal

[Image: Turnkey Curator Portal deployment architecture diagram]

Main WordPress site and Visualizations

[Image: Main WordPress site and Visualizations deployment architecture diagram]

Main WordPress site

Visualization Frontend (For map display)

Briefing Report

  • For the Briefing Report, each outbreak has its own frontend UI hosted on CloudFront and backed by a static site in an S3 bucket
  • The briefing report is an entire HTML site generated by a job.
  • Not every outbreak has a briefing report.
  • The Marburg briefing report has a folder called s3_ui

Curator UI/API/Geocoding/Data service

NOTE:

  • Each disease outbreak has its own set of Curator UI/API/Data server/Mongo DB
  • Not every outbreak has a curator portal set up (for example, mpox, Ebola, and Influenza don't have one)
    • For these outbreaks, Google spreadsheet data is ingested into MongoDB and a nightly process creates map data in an S3 bucket
  • Curator user credentials are managed as a collection in MongoDB
  • Steps for new outbreak infra creation
    • Repo - Global-Health/terraform/NEW-PATHOGEN.md
  • For outbreaks like covid19, we have the following curator URLs

turnkey-curator-portal/data-serving

  • The nodeJS/Mongoose data-service is used. The python/pymongo reusable-data-service is NOT used.

turnkey-curator-portal/geocoding

turnkey-curator-portal/verification/curator-service/api

  • Is a nodeJS(18.18.0)/Express app

  • Runs on port 3001; Docker maps 3001:3001

  • Accessible at, for example, http://localhost:3001/api/cases?page=1&limit=10&count_limit=10000 and http://localhost:3001/api/cases/1

  • Used as the BE for the Curator UI

  • Any APIs with /geocode are proxied to the "geocoding" Python app above, running on port 8080; the Docker file has LOCATION_SERVICE_URL: "http://geocoding:8080"

  • Any CRUD operations are forwarded to the "data-serving" layer via DATASERVER_URL: "http://data:3000"

  • turnkey-curator-portal/verification/curator-service/api/src/model - Some of these models were used for automated ingestion during Covid. A ticket has been created to research which of them are still used.

  • AwsBatchClient - Links a created source to the automated ingestion process (more info from 26:00 to 30:00 in https://drive.google.com/file/d/1H77Eya-MWvdEzP-KgcIZmEUFL5r00hBj/view?usp=sharing)

  • To run any API

    • (1) If you don't have a user on the system yet
      POST (using Postman) to http://localhost:3001/auth/register (Note: no /api here)
      {
        "allOf": {
          "roles": [
            "admin"
          ]
        },
        "name": "John Doe",
        "email": "foo@bar.com",
        "googleID": "string"
      }
    

    This creates a user. Take the "apikey" value from the API response, then run any API (using Postman or curl) by passing that value in the X-API-Key header.

    • (2) If you already have a user on the system, log in to the admin UI and note the cookie (connect.sid) under Browser -> Application -> Storage. Then run any API by setting that cookie in Postman.
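
For anyone who prefers a script to Postman, option (1) can be sketched with the Python standard library. The URLs, request body, and X-API-Key header come from the notes above; the helper names are hypothetical, and the assumption that "apikey" is a top-level field of the JSON response has not been verified against the API.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "http://localhost:3001"  # local Curator API server


def build_register_request(name, email, roles=("admin",)):
    """Build the POST /auth/register request (note: no /api prefix)."""
    payload = {
        "allOf": {"roles": list(roles)},
        "name": name,
        "email": email,
        "googleID": "string",
    }
    return urllib.request.Request(
        f"{API_BASE}/auth/register",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_cases_request(api_key, page=1, limit=10, count_limit=10000):
    """Build an authenticated GET /api/cases request."""
    query = urllib.parse.urlencode(
        {"page": page, "limit": limit, "count_limit": count_limit}
    )
    return urllib.request.Request(
        f"{API_BASE}/api/cases?{query}", headers={"X-API-Key": api_key}
    )


def send(request):
    """Send a request and decode the JSON response (needs the local stack running)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Only builds the request; nothing is sent until send() is called.
example = build_cases_request("example-key")
print(example.get_method(), example.full_url)
```

With the local stack running, send(build_register_request("John Doe", "foo@bar.com")) should return the new user, and the key read from that response can be passed to build_cases_request.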

turnkey-curator-portal/verification/curator-service/ui

  • Is a nodeJS(18.18.0)/Express app
  • Runs on port 3002; Docker maps 3002:3002
  • The Pivot tables menu in the Curator UI is used to verify that uploaded Google spreadsheet data matches what is imported into MongoDB

turnkey-curator-portal/suggest

  • Contains list of symptoms and occupations that are used to pre-fill UI

  • For example - turnkey-curator-portal/verification/curator-service/ui/src/components/new-case-form-fields/Demographics.tsx uses optionsLocation="https://raw.githubusercontent.com/globaldothealth/list/main/suggest/occupations.txt"

    turnkey-curator-portal/verification/curator-service/ui/src/components/new-case-form-fields/Symptoms.tsx uses optionsLocation="https://raw.githubusercontent.com/globaldothealth/list/main/suggest/symptoms.txt"

  • Currently these suggestions are loaded from the list repo, but a ticket has been created to load them from turnkey-curator-portal/suggest
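
As a rough illustration of how such a suggestion file could be turned into a list of form options, here is a minimal sketch. It assumes the files hold one suggestion per line, which these notes do not confirm; parse_suggestions and the sample text are hypothetical and not taken from the UI code.

```python
def parse_suggestions(text: str) -> list[str]:
    """Return unique, non-empty suggestions in file order.

    Assumes one suggestion per line (unverified assumption).
    """
    seen = set()
    options = []
    for line in text.splitlines():
        option = line.strip()
        if option and option not in seen:
            seen.add(option)
            options.append(option)
    return options


# Made-up sample content, not the real symptoms.txt.
sample = "Cough\nFever\n\nCough\n  Headache  \n"
print(parse_suggestions(sample))  # ['Cough', 'Fever', 'Headache']
```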

turnkey-curator-portal/api(Python/R)

  • Thin wrapper that allows clients to download curator portal data by country using API KEY
  • There's no deployment for this. The code is copied onto the client's machine and the dependencies are installed, after which the code is run
  • This readme has more detailed information - turnkey-curator-portal/api/README.md

Types of Users (for curator admin portal)

  • Admin
  • Curator
  • Junior Curator

DevOps - Terraform / EKS Fargate

  • All the .env files needed for terraform and EKS Fargate are located in the S3 bucket named terraform-secrets

Upgrade the cluster to the desired version via the AWS Management Console (Valentine's notes)

  • Mostly follow readme located at https://github.com/globaldothealth/eks-fargate/blob/main/README.md

  • Clone the eks-fargate repository which contains the Terraform infrastructure code for the clusters

  • Download the environment file from terraform-secrets S3 bucket and place it in the root of the repository

  • Once the upgrade is complete, update the TF_VAR_k8s_version variable to match the new version in the console.

  • Run the plan script to verify the changes:

    ./plan.sh $ENV

  • If the changes only affect the node groups, proceed to apply the changes:

    ./run.sh $ENV

  • If the changes are more extensive, verify them thoroughly before applying.

  • Check the kube-proxy image version:

    kubectl describe daemonset kube-proxy -n kube-system | grep Image

  • If the image version does not match the current Kubernetes version, refer to AWS EKS Documentation for the correct image tag: https://docs.aws.amazon.com/eks/latest/userguide/managing-kube-proxy.html

  • Upgrade the kube-proxy image:

    kubectl set image daemonset.apps/kube-proxy -n kube-system kube-proxy=<AWS_ACCOUNT>.dkr.ecr.eu-central-1.amazonaws.com/eks/kube-proxy:<TAG>

  • Verify the update:

    kubectl describe daemonset kube-proxy -n kube-system | grep Image

  • Run the script to restart deployments:

    ./restart-deployments.sh

  • Ensure all nodes are on the latest EKS version:

    kubectl get nodes
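
The kube-proxy check above can also be scripted. The sketch below compares the major.minor part of the image tag with the cluster version; it assumes tags of the form v1.29.0-eksbuild.1, and the account ID, tag, and helper names are illustrative rather than taken from the eks-fargate repository.

```python
import re


def image_tag(describe_line: str) -> str:
    """Extract the tag from an 'Image: registry/name:tag' line."""
    image = describe_line.split("Image:", 1)[-1].strip()
    return image.rsplit(":", 1)[-1]


def minor_version(version: str) -> str:
    """Return 'major.minor' from strings like 'v1.29.0-eksbuild.1' or '1.29'."""
    match = re.search(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"no version found in {version!r}")
    return f"{match.group(1)}.{match.group(2)}"


def kube_proxy_matches(describe_line: str, cluster_version: str) -> bool:
    """True when the kube-proxy image and the cluster share major.minor."""
    return minor_version(image_tag(describe_line)) == minor_version(cluster_version)


# Hypothetical output line of:
#   kubectl describe daemonset kube-proxy -n kube-system | grep Image
line = "    Image:      123456789012.dkr.ecr.eu-central-1.amazonaws.com/eks/kube-proxy:v1.29.0-eksbuild.1"
print(kube_proxy_matches(line, "1.29"))  # True
```

A False result means the image should be upgraded with the kubectl set image step; the AWS documentation linked above remains the source for the correct tag.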

Next Topics / Discussions

  • Provide access to AWS

  • Provide access to repositories

  • Provide .env file for curator UI/API

  • Provide .env file for Visualization UI (for any outbreak)

  • Share google document with curator portal and Visualization information

  • Add us to the EKS user groups for QA and PROD clusters (just like we did for DEV cluster)

  • Main WordPress site

  • Go over AWS infrastructure, docker file and deploy of

    • Curator UI server
    • Curator API server
    • Data Server
    • Location server
    • Is this the DEV curator UI: https://dev-data.covid-19.global.health/ ? We understand each disease outbreak will have its own set of Curator UI/API/Data server/MongoDB?
    • Do we have documented steps to deploy code changes to EKS?
  • turnkey-curator-portal/data-serving

    • What is the difference between the nodeJS/Mongoose data-service and the python/pymongo reusable-data-service? Are both used?
    • The Docker file appears to use the nodeJS version, but the readme suggests "Any new outbreak that is tracked by Global.health will use the reusable-data-service for CRUD operations"
  • Review Curator UI/API deploy architecture diagram

    • Where are Curator user credentials managed (like in AWS Cognito? Or DB?)
  • Review map/briefing report deploy architecture diagram

    • For the mpox aggregator run.py code, where does it fetch data from Google Sheets?
    • For mpox, where is the briefing report generated?
      • Briefing Report
        • Repo location
        • Does the briefing report have all its required content in www.monkeypox.global.health (for monkeypox), or does it also go to some other aggregate S3 bucket?
        • Deep dive into this, including scripts, S3 buckets, etc.
  • Access

    • Share Dev/Prod/QA Atlas MongoDB credentials.
    • Google Cloud service account (Where Google spreadsheets are stored)
  • Go over current google sheet process - code, AWS s3 bucket, AWS cloud front, batch jobs or manual process to be run

    • local influenza
  • turnkey-curator-portal/geocoding

    • What is the relevance of geocoding/location-service/data/adm1_parsed_data.json, adm2_parsed_data.json, adm3_parsed_data.json?
    • Why do the geocoding APIs not need authentication?
    • Go over the Swagger APIs for geocoding
  • CSV option on the main site

    • Repo location?
    • How does the CSV option function?
  • How is the initial MongoDB created for an outbreak? Or where are the scripts that create the DB?

  • Outbreak visualization template

    • AWS Infrastructure
    • Docker file
    • How are the various CloudFront distributions deployed when code is updated? Manually or via scripts?
    • Go over contents of aggregate bucket
  • What is the s3_ui folder in Marburg?

    • What are all those Lambdas?
  • Go over entire process in case of new outbreak.

  • GitHub Wiki - Add this documentation to GitHub wiki page

  • PR review

  • turnkey-curator-portal/verification/curator-service/api/src/model

    • Are Mongoose data models not used anymore (as we have data-serving)? If so, can we delete those from codebase?
  • turnkey-curator-portal/suggest

    • What is the purpose, usage and deployment?
  • turnkey-curator-portal/api (Python/R)

    • What is the purpose, usage and deployment?
  • Explain more about following and its usage via API calls (verification/curator-service/api/src/index.ts)

    • AwsBatchClient
    • AwsLambdaClient
    • EmailClient
  • Go over screen details for each of the following.

    image

  • Create GitHub tickets for ToDo items

  • Go over other 2 remaining projects in scripts dir - usage, deployment etc.

  • DevOps

    • Why are the Load Balancer target listeners EC2 instances instead of Fargate nodes?
    • Will the predefined t3-large EC2 instances limit auto scaling?

To Do's