- A visual analytics system for the materials informatics project
You can install this either by setting up Python and the dependencies directly, or through Docker, which may be simpler.
Click on a dot in the lower left "Graph" section to start exploring.
- Have Python and pip installed
- Install the dependencies:
pip install -r requirements.txt
- Start the server:
python runServer.py
- The default dataset is 35k.json. To load a different dataset, run
python runServer.py --input FILE.json
replacing FILE.json with the path to that dataset. See the "Visualization Data" section for the other input files.
- Then open a browser at http://localhost:5010/ (in Chrome for the fastest performance)
- Install Docker
- On OSX/Linux, run "./docker_build.sh" (without the quotes). On Windows, run the last line of docker_build.sh.
- Similarly, on OSX/Linux, run "./docker_run.sh" (without the quotes). On Windows, run the last line of docker_run.sh.
- Run "./docker_run.sh FILE.json" to choose a dataset other than the default 35k.json. See the regular Setup section for more details.
- Then open a browser at http://localhost:5010/ (in Chrome for the fastest performance)
- 35k.json is data from our pipeline applied to the 35k papers downloaded from Elsevier.
- 99.json is data from our pipeline applied to the 99 gold standard papers. The experimental sentence extraction step was replaced with the experimental sentences annotated by subject matter experts (SMEs).
- gold_normalized.json contains the chemical annotations from SMEs, normalized with our pipeline.
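Before passing one of these files (or your own) to the server with --input, it can help to confirm that it parses as JSON. The sketch below is not part of this repository: the function name `summarize_dataset` is hypothetical, and it deliberately assumes nothing about the schema that runServer.py expects. It only reports the top-level type and the number of top-level entries.

```python
import json
from pathlib import Path


def summarize_dataset(path):
    """Load a JSON file and report its top-level type and size.

    Hypothetical helper: it does not validate the schema used by
    runServer.py, only that the file is well-formed JSON.
    """
    text = Path(path).read_text(encoding="utf-8")
    data = json.loads(text)  # raises json.JSONDecodeError on malformed input
    # Only lists and dicts have a meaningful count of top-level entries.
    entries = len(data) if isinstance(data, (list, dict)) else None
    return {
        "file": Path(path).name,
        "type": type(data).__name__,
        "entries": entries,
    }
```

For example, calling `summarize_dataset("99.json")` from the repository root would either return a small summary dictionary or raise an error pointing at the position where parsing failed.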
Gold data is contained in the gold_data folder. The materials_informatics_gold_es_and_chems folder contains .txt files with the experimental sections of the 99 gold standard papers, and .ann files with Brat-formatted annotations of chemical entities. The gold_standard.json file contains the information from the materials_informatics_gold_es_and_chems files, as well as the morphology and composition annotations of the papers. These annotations are from subject matter experts.
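As a rough illustration of how the .ann files could be read, the sketch below parses text-bound annotations (lines whose ID starts with "T") in the Brat standoff format, where each line has the form `ID<TAB>TYPE START END<TAB>TEXT`. It is not code from this repository: the names `Entity` and `parse_brat_entities` are hypothetical, and the entity label in the usage example ("Chemical") is an assumption, since the actual label names in the gold files are not stated here.

```python
from typing import List, NamedTuple, Tuple


class Entity(NamedTuple):
    ann_id: str                    # e.g. "T1"
    label: str                     # entity type as written in the .ann file
    spans: List[Tuple[int, int]]   # character offsets into the paired .txt file
    text: str                      # the annotated text


def parse_brat_entities(ann_text: str) -> List[Entity]:
    """Collect text-bound ("T") annotations from Brat standoff content."""
    entities = []
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue  # skip relations, attributes, notes, and blank lines
        ann_id, type_and_offsets, covered = line.split("\t", 2)
        label, offsets = type_and_offsets.split(" ", 1)
        spans = []
        # Discontinuous annotations list several "start end" pairs joined by ";".
        for part in offsets.split(";"):
            start, end = part.split()
            spans.append((int(start), int(end)))
        entities.append(Entity(ann_id, label, spans, covered))
    return entities
```

A short usage example with made-up annotation content:

```python
sample = "T1\tChemical 0 4\tTiO2\n#1\tAnnotatorNotes T1\tcheck spelling\n"
for entity in parse_brat_entities(sample):
    print(entity.ann_id, entity.label, entity.spans, entity.text)
```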
The experimental sentences in 99.json, gold_standard.json, and the .txt files in the gold_data folder, and the abstracts in 35k.json, which are publicly available, are used and provided for scholarly purposes only, to enable others to reproduce our work and to compare the performance of natural language processing tools on a common dataset. DOIs for every article are included in the data files, and the copyright is owned by the respective publishers.
Release number LLNL-CODE-780105. ChemVis is distributed under the MIT license.