Safety Analysis in the Era of Large Language Models: A Case Study of STPA using ChatGPT

Description

Motivated by the recent success stories of ChatGPT in many domains, we first pose the question: “Can safety analysis actually make use of LLMs?” To answer it, we conducted a case study applying ChatGPT to System-Theoretic Process Analysis (STPA) of an Automatic Emergency Brake (AEB) system.

Baseline Examples

We use the STPA results obtained by human safety experts as our baselines, as published in “Comparison of the HAZOP, FMEA, FRAM, and STPA Methods for the Hazard Analysis of Automatic Emergency Brake Systems” and “System-Theoretic Process Analysis (STPA) of Demand-Side Load Management in Smartgrids”. We keep the same application scenarios as the original papers, namely the AEB and DSM systems. For subsequent comparisons, we adopt the same methodology as the original studies.

Control Loop Structure

Control loop structures of three complexity levels for the two baselines, AEB (first row) and DSM (second row) systems.

Research Questions

We first define three levels of abstraction, ranging from coarse to fine-grained, that represent how human experts may interact with ChatGPT: Workflow Level, Semantics Level and Syntax Level. We then frame the following research questions (RQs):

RQ1 (Collaboration Scheme): How do various collaboration schemes of integrating ChatGPT into STPA affect the effectiveness and usability of STPA?

RQ2 (Semantic Complexity): To what extent do variations in semantic complexity of individual input questions to ChatGPT affect the correctness and pertinence of STPA results?

RQ3 (Prompt Guideline): Does the utilisation of syntactic-level prompt guidelines affect the correctness and pertinence of STPA results?

Answer to RQ1

We consider three collaboration schemes for incorporating ChatGPT into the workflow that human safety experts follow when performing STPA: (a) one-off simplex collaboration, (b) recurring simplex collaboration, and (c) recurring duplex collaboration.

Venn diagram of the sets of unsafe control actions (UCAs) for the AEB system and the DSM system. The colours represent the baseline (green), the one-off simplex collaboration case (yellow), the recurring simplex collaboration case (blue) and the recurring duplex collaboration case (orange).
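The comparison that a Venn diagram of UCA sets visualises can be sketched with plain set operations. This is a minimal illustration, not the authors' analysis code: the function name `venn_regions` and the UCA identifiers are hypothetical placeholders, and the real sets come from the case-study results.

```python
def venn_regions(baseline, candidate):
    """Split two UCA sets into the three regions of a two-set Venn diagram."""
    return {
        "shared": baseline & candidate,          # found by both experts and ChatGPT
        "baseline_only": baseline - candidate,   # found only by the human baseline
        "candidate_only": candidate - baseline,  # found only with ChatGPT
    }


# Placeholder identifiers only; these are NOT results from the paper.
baseline = {"uca_a", "uca_b", "uca_c"}
scheme_output = {"uca_b", "uca_c", "uca_d"}

for region, ucas in venn_regions(baseline, scheme_output).items():
    print(region, sorted(ucas))
```

The same function can be applied once per collaboration scheme to compare each scheme's output against the baseline.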

Answer to RQ2

Box and whisker plots of samples for RQ2 (Left: Number of correct UCAs across 3 groups of samples. Right: Proportion of correct UCAs across 3 groups of samples.)
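The two quantities plotted here, the number and the proportion of correct UCAs per sample, can be summarised as follows. This is a sketch under stated assumptions: the helper names are hypothetical, the sample data are placeholders rather than results from the study, and the repository's own notebook (RQ2_RQ3.ipynb) may compute these differently.

```python
from statistics import quantiles


def proportion_correct(n_correct, n_total):
    """Share of generated UCAs judged correct; 0.0 when nothing was generated."""
    return n_correct / n_total if n_total else 0.0


def box_summary(values):
    """Five-number summary of a sample (requires at least two values)."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}


# Placeholder (n_correct, n_total) pairs for one group of samples, not real data.
group = [(1, 2), (2, 4), (3, 4), (4, 5), (5, 5)]
counts = [correct for correct, _ in group]
proportions = [proportion_correct(correct, total) for correct, total in group]

print(box_summary(counts))
print(box_summary(proportions))
```

Plotting libraries often draw whiskers at 1.5 times the interquartile range rather than at the minimum and maximum, so the whisker positions in a figure may differ from the `min` and `max` values returned here.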

Answer to RQ3

Box and whisker plots of samples for RQ3 (Left: Number of correct UCAs across 3 groups of samples. Right: Proportion of correct UCAs across 3 groups of samples.)

Discussion

Four-quadrant classification of risks, with corresponding mitigations.

Independent Review

The RQ2_RQ3_val_pdf folder contains the .pdf files intended for independent review. They contain all the results of RQ2 and RQ3. In each file, a black box marks the UCAs identified as incorrect in that interaction. In this paper, a 'correct' UCA is defined as an answer that is both correct and useful.
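A reviewer working from a local clone could enumerate the files to check with a few lines of Python. This sketch assumes the repository has been cloned and that the script runs from its root; the function name is hypothetical, and the search is recursive because the internal layout of the folder is not described here.

```python
from pathlib import Path


def list_review_pdfs(repo_root):
    """Return sorted relative paths of all PDFs under RQ2_RQ3_val_pdf."""
    folder = Path(repo_root) / "RQ2_RQ3_val_pdf"
    if not folder.is_dir():
        return []  # folder missing, e.g. script run outside the clone
    return sorted(p.relative_to(folder).as_posix() for p in folder.rglob("*.pdf"))


# Assumes the current directory is the repository root.
for name in list_review_pdfs("."):
    print(name)
```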

Note

Part of the original ChatGPT responses for each case (.mhtml files) can be downloaded and opened locally in a web browser.
In this paper, we used the official Plus version of ChatGPT (GPT-4).
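Besides opening the .mhtml files in a browser, they can also be read programmatically, since MHTML is a MIME multipart document. The following is a minimal sketch using only the Python standard library. The function name `extract_html` and the file name in the usage note are hypothetical, and the sketch assumes the saved page stores its main content as a `text/html` part.

```python
from email import message_from_bytes, policy


def extract_html(mhtml_bytes):
    """Return the first text/html part of an MHTML document, or None."""
    msg = message_from_bytes(mhtml_bytes, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            # get_content() decodes the transfer encoding and charset.
            return part.get_content()
    return None
```

For example, `extract_html(Path("response.mhtml").read_bytes())` would return the HTML of a saved response, where `response.mhtml` stands in for one of the downloaded files and `Path` comes from `pathlib`.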

Folders and files

AEB_Case_GPT3.5
AEB_Case_GPT4
DSM_cases
IMG
RQ2_RQ3_Statistics
RQ2_RQ3_val__raw_pdf
RQ2_RQ3_val_pdf
RQ2_results
RQ2_results_highlight
RQ3_results
RQ3_results_highlight
README.md
RQ2_RQ3.ipynb

YiQi0318/ChatGPT-STPA