The construction of graphical models is a fundamental task in probabilistic modeling and Bayesian networks. Traditionally, building these models requires expert knowledge to define variables, their relationships, and conditional dependencies. This process is often manual, time-consuming, and prone to human error. With the advent of Large Language Models (LLMs), there is an opportunity to automate this process by translating natural language descriptions directly into graphical models.
├── README.md # About the repo.
├── mkdocs.yml # MkDocs configuration file.
├── GMG_LLM_powered.pdf # Project presentation.
│
├── code/ # Source code files
│   ├── config.py # Configuration settings.
│   ├── __init__.py # Package initialization.
│   ├── NLI_base_extractor.py # Base extractor for NLI.
│   ├── NLI_node_extractor.py # Node extractor for NLI.
│   ├── NLI_edge_extractor.py # Edge extractor for NLI.
│   ├── NLI_suggest_node_distribution.py # Suggests node distributions for NLI.
│   └── graph_utils.py # Utility functions for graph processing.
│
├── tests/ # Test files
│   ├── test_base_extractor.py # Tests for base extractor functionality.
│   ├── test_extract_nodes.py # Tests for node extraction functionality.
│   ├── test_extract_edges.py # Tests for edge extraction functionality.
│   └── test_suggest_node_distr.py # Tests for node distribution suggestions.
│
├── docs/ # Documentation files
│   ├── index.md # The documentation homepage.
│   ├── installation.md # Installation instructions.
│   ├── usage_examples.md # Usage examples.
│   └── function_descriptions.md # Descriptions of functions.
│
└── utils/ # Utility scripts and dependencies
    ├── requirements.txt # Project dependencies.
    └── badge_generator.py # Generates the coverage badge.
The NaturalLanguageInput class will provide the following algorithms:
- extract_vertices
  - given an English text describing the probabilistic structure, extracts the vertices using NER or natural language understanding (LLMs with prompts)
  - also identifies the type of each vertex: random variable or constant
- extract_vertex_dependencies
  - given the text and two vertex names, suggests the type of dependency between them (no link OR an ordered link)
- construct_graph
  - given vertices and links, constructs a directed graph
  - the graph can store any information about vertices (name, type, etc.) and links
- correct_graph
  - given a constructed directed graph, corrects its structure (loops, unused vertices, etc.)
  - returns a set of graphs derived from the original by applying corrections
- suggest_vertex_distributions
  - given the graph and the text, suggests a distribution type for every vertex
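
The construct_graph and correct_graph steps above can be sketched with NetworkX, which this project lists as its graph library. This is a minimal illustration and not the repository's implementation: the function signatures, the dictionary layout of a vertex, and the correction strategy (drop isolated vertices, then break the first cycle found by removing one edge per candidate) are all assumptions made for the example.

```python
import networkx as nx


def construct_graph(vertices, links):
    """Build a directed graph from vertex records and ordered (parent, child) links.

    Assumed input layout: each vertex is a dict with a "name" key plus
    arbitrary extra fields (type, distribution, ...).
    """
    graph = nx.DiGraph()
    for vertex in vertices:
        # Store every extra field as a node attribute.
        attrs = {key: value for key, value in vertex.items() if key != "name"}
        graph.add_node(vertex["name"], **attrs)
    for parent, child in links:
        graph.add_edge(parent, child)
    return graph


def correct_graph(graph):
    """Return a list of candidate corrected graphs; the input is left unchanged.

    Assumed strategy: remove vertices with no links, then, if a cycle exists,
    produce one candidate per edge of the first cycle found, with that edge removed.
    Graphs with several independent cycles would need repeated passes.
    """
    base = graph.copy()
    base.remove_nodes_from(list(nx.isolates(base)))
    try:
        cycle = nx.find_cycle(base)
    except nx.NetworkXNoCycle:
        return [base]
    candidates = []
    for parent, child in cycle:
        candidate = base.copy()
        candidate.remove_edge(parent, child)
        candidates.append(candidate)
    return candidates


vertices = [
    {"name": "rain", "type": "random variable"},
    {"name": "sprinkler", "type": "random variable"},
    {"name": "wet_grass", "type": "random variable"},
    {"name": "unused", "type": "constant"},
]
links = [("rain", "sprinkler"), ("sprinkler", "wet_grass"), ("wet_grass", "rain")]

graph = construct_graph(vertices, links)
candidates = correct_graph(graph)
```

Returning several candidates rather than a single fixed graph matches the description of correct_graph as producing a set of graphs, and leaves the choice of which link to drop to a later step or to the user.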
The GraphModel class will provide the following algorithms:
- visualize
  - plots an image of the graph, with a caption for every vertex and the specification of its distribution
- create_graph
  - given vertices and links, creates the graph
- change_vertex_info
  - given the number of a vertex, changes its properties (distribution, name, etc.)
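
One possible shape for this class is sketched below using only the standard library. Everything here is an assumption for illustration: the Vertex dataclass, its field names, and the method signatures are hypothetical, and instead of plotting an image the sketch exposes a to_dot helper that emits Graphviz DOT text, which a renderer could turn into the picture that visualize is meant to produce.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Vertex:
    # Hypothetical vertex record; field names are assumptions for this sketch.
    name: str
    kind: str = "random variable"  # or "constant"
    distribution: Optional[str] = None


class GraphModel:
    def __init__(self):
        self.vertices = []  # list of Vertex, addressed by number (index)
        self.links = []     # list of (parent_number, child_number)

    def create_graph(self, vertices, links):
        """Store vertices and convert (parent_name, child_name) links to numbers."""
        self.vertices = list(vertices)
        number_of = {v.name: i for i, v in enumerate(self.vertices)}
        self.links = [(number_of[a], number_of[b]) for a, b in links]
        return self

    def change_vertex_info(self, number, **properties):
        """Update properties (distribution, name, ...) of the vertex with this number."""
        vertex = self.vertices[number]
        for key, value in properties.items():
            if not hasattr(vertex, key):
                raise AttributeError(f"unknown vertex property: {key}")
            setattr(vertex, key, value)

    def to_dot(self):
        """Return Graphviz DOT text with each vertex captioned by name and distribution."""
        lines = ["digraph model {"]
        for i, v in enumerate(self.vertices):
            # "\\n" is a literal backslash-n, which DOT treats as a line break in labels.
            label = v.name if v.distribution is None else f"{v.name}\\n{v.distribution}"
            lines.append(f'  n{i} [label="{label}"];')
        for parent, child in self.links:
            lines.append(f"  n{parent} -> n{child};")
        lines.append("}")
        return "\n".join(lines)


model = GraphModel().create_graph(
    [Vertex("rain"), Vertex("wet_grass")],
    [("rain", "wet_grass")],
)
model.change_vertex_info(1, distribution="Bernoulli")
dot_source = model.to_dot()
```

Addressing vertices by number keeps change_vertex_info stable even when a vertex is renamed, since links refer to positions rather than names.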
Deploying into HF/Gradio
- TODO
- LLM API: GPT-4o, o1, Claude 3.5, Gemini
- Code generation: Llama 2, Llama 3
- LLM tuning: Stable LM 2, Mixtral8B, Alpaca
- Graphs: NetworkX
- Visualization: Graphviz, Matplotlib, Plotly
- Deploy: HF Spaces, Gradio
Documentation is available in the docs/ directory.
Documentation and test coverage badges can be updated automatically using GitHub Actions.
Initially, both workflows are disabled, but they can be run manually from the "Actions" page.
To run them automatically on every push to the master branch, edit the corresponding YAML workflow files.