ChemNLP: A Natural Language Processing based Library for Materials Chemistry Text Data

usnistgov/chemnlp

ChemNLP

Introduction

ChemNLP is a software package for processing chemical information from the scientific literature.

Installation

First, install Miniconda from https://conda.io/miniconda.html. Choose the installer that matches your system; the file will be named something like 'Miniconda3-latest-XYZ'.

Then run the installer:

bash Miniconda3-latest-Linux-x86_64.sh    # Linux
bash Miniconda3-latest-MacOSX-x86_64.sh   # macOS

On Windows, download the 32- or 64-bit Python 3.8 Miniconda .exe installer and run it.

Next, create a conda environment. The commands below name it "chemnlp", but you can choose any name:

conda create --name chemnlp python=3.9
source activate chemnlp

Method 1 (using setup.py):

Now, let's install the package:

git clone https://github.com/usnistgov/chemnlp.git
cd chemnlp
python setup.py develop
cde data download

Method 2 (using PyPI):

Alternatively, ChemNLP can be installed with pip:

pip install chemnlp
cde data download
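
After either installation method, a quick sanity check can confirm that the environment is set up. The sketch below is only an illustration and is not part of ChemNLP. It assumes that the importable module is named `chemnlp` (matching the pip package name) and that the `cde` command used above should be on the PATH. The helper `check_install` is a hypothetical name.

```python
import importlib.util
import shutil


def check_install(module_name="chemnlp", cli_name="cde"):
    """Report whether a module is importable and a CLI tool is on PATH.

    Both default names are assumptions taken from the install steps above.
    """
    return {
        "module_found": importlib.util.find_spec(module_name) is not None,
        "cli_found": shutil.which(cli_name) is not None,
    }


# Prints a dict with two booleans; both should be True after a successful install.
print(check_install())
```

If either value is False, check that the "chemnlp" conda environment is activated before rerunning the install commands.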

Examples

Parse chemical formula

run_chemnlp.py --file_path="chemnlp/tests/XYZ"
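
To illustrate what parsing a chemical formula involves, here is a minimal standalone sketch that splits a flat formula into element counts with a regular expression. It is not ChemNLP's implementation and does not call the library. The function `parse_formula` is a hypothetical helper, and the sketch assumes formulas with integer counts and no brackets. It does not check that symbols are real elements.

```python
import re
from collections import Counter

# One element symbol (capital letter plus optional lowercase) and an optional integer count.
ELEMENT_COUNT = re.compile(r"([A-Z][a-z]?)(\d*)")
# A whole formula made only of such symbol/count pairs (no brackets, no charges).
FLAT_FORMULA = re.compile(r"(?:[A-Z][a-z]?\d*)+")


def parse_formula(formula):
    """Return a dict of element counts for a flat formula such as 'MgB2'."""
    if not FLAT_FORMULA.fullmatch(formula):
        raise ValueError(f"unsupported formula: {formula!r}")
    counts = Counter()
    for symbol, number in ELEMENT_COUNT.findall(formula):
        # A missing count means a single atom of that element.
        counts[symbol] += int(number) if number else 1
    return dict(counts)


print(parse_formula("MgB2"))       # {'Mg': 1, 'B': 2}
print(parse_formula("YBa2Cu3O7"))  # {'Y': 1, 'Ba': 2, 'Cu': 3, 'O': 7}
```

Bracketed formulas such as "Ca(OH)2" raise a ValueError in this sketch; handling them would need a small recursive or stack-based parser.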

Text classification example

python chemnlp/classification/scikit_class.py --csv_path chemnlp/sample_data/cond_mat_small.csv
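
For readers new to text classification, the idea can be sketched without any dependencies. The example below is a bag-of-words multinomial naive Bayes classifier written with the standard library only. It is a swapped-in illustration, not the pipeline in `scikit_class.py`. The toy texts and the labels "superconductivity" and "materials" are made up for this sketch and are not taken from `cond_mat_small.csv`.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def train(samples):
    """Count words per label from an iterable of (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in samples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior under naive Bayes."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    tokens = tokenize(text)
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for token in tokens:
            # Laplace (add-one) smoothing avoids log(0) for unseen words.
            score += math.log((word_counts[label][token] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data, for illustration only.
TOY_DATA = [
    ("superconducting transition temperature in cuprate films", "superconductivity"),
    ("pairing symmetry of iron based superconductors", "superconductivity"),
    ("band gap of a layered semiconductor", "materials"),
    ("elastic constants of titanium alloys", "materials"),
]

model = train(TOY_DATA)
print(predict(model, "superconducting transition in cuprate"))
```

A real workflow would typically swap the hand-written counting for a vectorizer and classifier from a library such as scikit-learn, and would train on far more than four examples.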

Google Colab example for installation and text classification

Text generation example

Google Colab example for Text Generation with HuggingFace

Using the webapp

The webapp is available at: https://jarvis.nist.gov/jarvischemnlp

Reference

  1. ChemNLP: A Natural Language Processing based Library for Materials Chemistry Text Data
  2. AtomGPT: Atomistic Generative Pretrained Transformer for Forward and Inverse Materials Design
  3. JARVIS-Leaderboard
  4. NIST-JARVIS Infrastructure