ATO-648 - workflowToJSON.sh - porting the code from the TPC gitlab to the O2DPG tools #1851

Open
wants to merge 1 commit into
base: master

miranov25
Contributor

Parser - workflowToJSON.sh

First in a series of scripts to be ported from the TPC GitLab to the O2DPG tools:

  • duPython.py
  • duQuery.sh
  • monalisaQuery.py
  • monalisaQuery.sh
  • workflowToJSON.sh

Described in detail in:
https://indico.cern.ch/event/1423563/contributions/5987247/attachments/2868815/5022209/ATO-648%20-synchronization%20AliEn%20%E2%86%92%20GSI%20and%20ALICE%20monalisa%20queries.pdf

Overview of the Shell Script

  • Purpose: Convert workflow configuration logs into structured JSON format for easier data analysis and better readability.
  • Components:
    • Initialization: Sets up the environment for script execution.
    • Parsing: Transforms verbose workflow logs into a JSON format.
    • Comparison: Compares JSON files to identify differences.
  • Usage:
    • Interactive Analysis: Facilitates easier data manipulation and analysis.
    • Integration: Can be integrated into larger data processing pipelines.
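
To make the parsing step concrete, here is a minimal sketch of how one command line could be turned into the `command`/`switches` JSON shape. It is not the `makeParse` implementation from this PR: `command_to_json` is a hypothetical helper, it assumes bash and jq are available, and it splits on whitespace only, so quoted values containing spaces are not handled.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: command_to_json is a hypothetical helper,
# NOT the makeParse implementation from this PR. Requires bash and jq.
# Limitation: splits on whitespace, so quoted values with spaces are not handled.
command_to_json() {
  local -a words
  read -r -a words <<< "$1"
  local json key next
  local i=1
  # Start with the command name and an empty switches object
  json=$(jq -n --arg c "${words[0]}" '{command: $c, switches: {}}')
  while (( i < ${#words[@]} )); do
    key=${words[i]#--}
    next=${words[i+1]:-}
    if [[ -n $next && $next != --* ]]; then
      # "--key value" pair: store the value as a string
      json=$(jq --arg k "$key" --arg v "$next" '.switches[$k] = $v' <<< "$json")
      i=$(( i + 2 ))
    else
      # bare "--flag" with no value: store it as boolean true
      json=$(jq --arg k "$key" '.switches[$k] = true' <<< "$json")
      i=$(( i + 1 ))
    fi
  done
  jq --sort-keys . <<< "$json"
}
```

For example, `command_to_json "o2-ctf-reader-workflow --severity info --its-digits"` prints an object with `"severity": "info"` and `"its-digits": true` under `switches`.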


REQUEST FOR PRODUCTION RELEASES:
To request your PR to be included in production software, please add the corresponding labels called "async-" to your PR. Add the labels directly (if you have the permissions) or add a comment of the form (note that labels are separated by a ",")

+async-label <label1>, <label2>, !<label3> ...

This will add <label1> and <label2> and remove <label3>.

The following labels are available
async-2023-pbpb-apass4
async-2023-pp-apass4
async-2024-pp-apass1
async-2022-pp-apass7
async-2024-pp-cpass0
async-2024-PbPb-apass1
async-2024-ppRef-apass1

@miranov25
Contributor Author

miranov25 commented Dec 15, 2024

Initialization:

Apptainer> source /scratch/alice/miranov/alicesw2/O2DPG/UTILS/Parsers/workflowToJSON.sh
0001:       Function Overview: init648
0002:
0003:       This function initializes the script environment and provides access to a series of utility commands designed to assist in handling and transforming workflow logs.
0004:
0005:       Available Commands:
0006:       - \[init648]: Initializes the necessary environment settings for script execution. Use this before running other related functions to ensure all configurations are correctly set.
0007:       - \[description]: Provides a comprehensive explanation of the workflow processing, detailing each step and its purpose within the system.
0008:       - \[makeParse]: Executes the log parsing process, transforming verbose workflow logs into a structured JSON format, facilitating easier data manipulation and analysis.
0009:       - \[makeDiffExample]: Demonstrates how to compare two JSON files derived from workflow logs, highlighting differences.
0010:
0011:       Usage:
0012:       To learn more about each command, type the command followed by 'help'. This will display detailed information about the command's function and usage examples.
0013:
0014:       Example:
0015:         $ init648 help   # Displays help information for the init648 command.
0016:
0017:       Note: Tests were conducted in the directory:
0018:


@miranov25
Contributor Author

makeParse - Parsing the Workflow Log File

  • help
  • documentation slide
  • usage

help

Apptainer> makeParse
0001: makeParse: Parse the workflow log and create an output.json file.
0002: Usage:
0003:     makeParse <workflowconfig.log>
0004:
0005: Example usage:
0006:     #makeParse workflowconfig.log  > ~/output.json            # To parse a specific log file.
0007:     makeParse /lustre/alice/tpcdata/Run3/SCDprodTests/fullRec/PbPb_Streamers_Tune_ClusterErrors-merge-streamer/avgCharge_fullTPC_sampling_TimeBins16-Average0_rejectEdgeCl-Seed0-Track0-margin0/LHC23zzh.b5p/544116.38kHz/0110/workflowconfig.log  > workflow.json
0008:     cat workflow.json | jq '.[] | select(.command | test("^o2-dpl"))'   # Filter DPL workflows.
0009:     jq '.[] | select(.command | test("^o2-gpu"))' workflow.json  # Filter GPU related commands.
0010:
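
The two jq filters in the help text differ only in the command prefix, so they can be wrapped in a small function. The sketch below is an assumption-laden convenience, not part of `workflowToJSON.sh`: `filter_by_prefix` and `command_names` are hypothetical names, and the input file is assumed to hold the JSON array printed by `makeParse`.

```shell
#!/usr/bin/env bash
# filter_by_prefix and command_names are hypothetical convenience wrappers
# (not part of workflowToJSON.sh). They assume the file holds the JSON array
# printed by makeParse. Requires jq.
filter_by_prefix() {
  local prefix=$1 file=$2
  # --arg passes the prefix as data, so it is never interpreted as jq code
  jq --arg p "$prefix" '[.[] | select(.command | startswith($p))]' "$file"
}

# List only the names of the matching commands, one per line
command_names() {
  local prefix=$1 file=$2
  filter_by_prefix "$prefix" "$file" | jq -r '.[].command'
}
```

Usage: `command_names o2-gpu workflow.json` lists the GPU-related commands, and `filter_by_prefix o2-dpl workflow.json` returns the matching entries as a JSON array. Using `startswith` instead of `test("^...")` avoids regex quoting issues when the prefix comes from a shell variable.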


usage

Apptainer> makeParse workflowconfig.log  
[
  {
    "command": "o2-ctf-reader-workflow",
    "switches": {
      "session": "default_337_5438",
      "severity": "info",
      "shm-segment-id": "1664",
      "shm-segment-size": "",
      "resources-monitoring": "50",
      "resources-monitoring-dump-interval": "50",
      "early-forward-policy": "noraw",
      "fairmq-rate-logging": "0",
      "shm-mlock-segment-on-creation": "1",
      "timeframes-rate-limit": "35",
      "timeframes-rate-limit-ipcid": "1730516608",
      "ans-version": "compat",
      "delay": "1",
      "loop": "",
      "ctf-input": "list.list",
      "copy-cmd": "file://?dst\"",
      "onlyDet": "ITS,TPC,TOF,FV0,FT0,FDD,MID,MFT,MCH,TRD,EMC,PHS,CPV,HMP,ZDC,CTP",
      "emcal-decoded-subspec": "1",
      "timeframes-shm-limit": "33333333333",
      "pipeline": "",
      "remote-regex": "\"^alien:///alice/data/.+\"",
      "allow-missing-detectors": "",
      "its-digits": true,
      "mft-digits": true,
      "configKeyValues": "|"
    },
    "configKeyValues": {
      "keyval.input_dir": "/workdir",
      "keyval.output_dir": "/dev/null"
    }
  },
  {
    "command": "o2-tfidinfo-writer-workflow",
    "switches": {
      "session": "default_337_5438",
      "severity": "info",
...
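
Given the structure shown above (an array of objects, each with a `command` and a `switches` object), a single switch can be extracted across all workflows. This is a sketch under that assumption only; `switch_per_command` is a hypothetical helper name and is not defined by the script in this PR.

```shell
#!/usr/bin/env bash
# switch_per_command is a hypothetical helper (not part of workflowToJSON.sh).
# It assumes every array entry has a "switches" object, as in the output above.
# Prints "<command><TAB><value>" for every workflow that sets the given switch.
switch_per_command() {
  local switch=$1 file=$2
  jq -r --arg s "$switch" \
    '.[] | select(.switches | has($s)) | [.command, (.switches[$s] | tostring)] | @tsv' \
    "$file"
}
```

Usage: `switch_per_command severity workflow.json` prints one line per workflow that sets `severity`. Values are converted with `tostring`, so boolean flags such as `its-digits` appear as `true`.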

@miranov25
Contributor Author

makeDiffExample - Comparing JSON Files

Description

This function provides examples of parsing workflow logs into JSON format and then comparing these JSON files using diff.


Steps

  1. Parse Logs into JSON

    makeParse /path/to/workflowconfig.log > workflow0.json
    makeParse /path/to/other_workflowconfig.log > workflow1.json
  2. Compare JSON Files

    diff <(jq --sort-keys . workflow1.json) <(jq --sort-keys . workflow0.json)
  3. Side-by-Side Comparison with Color

    diff --side-by-side --left-column --color=always \
      <(jq --sort-keys . workflow1.json) <(jq --sort-keys . workflow0.json) | less -R
  4. Compare Only Specific Workflow Commands

    diff --side-by-side --left-column --color=always \
      <(jq '.[] | select(.command | test("^o2-gpu"))' workflow1.json | jq --sort-keys .) \
      <(jq '.[] | select(.command | test("^o2-gpu"))' workflow0.json | jq --sort-keys .) | less -R

Highlighted Example

  • Functionality: Compare workflow0.json and workflow1.json for differences.
  • Specific Command Filtering: Restricts the comparison to commands whose names match the regular expression ^o2-gpu (names starting with o2-gpu).
  • Enhanced Readability: Side-by-side display with colored differences for easier analysis.
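
For use in scripts rather than in an interactive pager, the filtered comparison can be wrapped in a function that returns the exit status of `diff`. The following is a sketch, not something defined by `makeDiffExample`: `diff_workflows` is a hypothetical name, and it assumes bash (for process substitution) and a jq build with regex support, as the `test(...)` filters above already require.

```shell
#!/usr/bin/env bash
# diff_workflows is a hypothetical helper (not defined in workflowToJSON.sh).
# Usage: diff_workflows <regex> <a.json> <b.json>
# Exit status follows diff: 0 = identical, 1 = differences found.
diff_workflows() {
  local re=$1 a=$2 b=$3
  # Keep only commands matching the regex, sort keys so ordering noise is removed
  diff \
    <(jq --sort-keys --arg re "$re" '[.[] | select(.command | test($re))]' "$a") \
    <(jq --sort-keys --arg re "$re" '[.[] | select(.command | test($re))]' "$b")
}
```

Usage: `diff_workflows '^o2-gpu' workflow1.json workflow0.json || echo "GPU settings differ"`. Unlike the interactive examples, this variant omits `--color=always` and `less -R` so the output stays clean when redirected to a file.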

