IUPAC-InChI/InChI

License: MIT. Changelog: v1.07.2.

InChI - The IUPAC International Chemical Identifier

InChI is a structure-based chemical identifier, developed by IUPAC and the InChI Trust. It is a standard identifier for chemical databases that facilitates effective information management across chemistry.

InChI and InChIKey are open standards. They use unique machine-readable strings to represent, store and search chemical structures. All the software and algorithms related to them are open source.

What is InChI

InChI is a structure-based textual identifier, strictly unique, non-proprietary, open source, and freely accessible.

InChI identifiers describe chemical substances in terms of layers of information – the atoms and their bond connectivity, tautomeric information, isotope information, stereochemistry, and electronic charge.
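To make the idea of layers concrete, here is a minimal shell sketch that splits an InChI string into its "/"-separated segments. The function name inchi_layers is hypothetical (it is not part of the InChI software), and the example string is the standard InChI of ethanol, used here only for illustration.

```shell
# Print each "/"-separated segment of an InChI string on its own line.
# inchi_layers is a hypothetical helper, not part of the InChI software.
inchi_layers() {
    inchi=$1
    case $inchi in
        InChI=*) ;;
        *) echo "not an InChI string: $inchi" >&2; return 1 ;;
    esac
    # Strip the "InChI=" prefix, then turn every "/" into a newline.
    printf '%s\n' "${inchi#InChI=}" | tr '/' '\n'
}

# Ethanol: version segment, formula, connectivity (c), hydrogens (h).
inchi_layers 'InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3'
```

The first segment is the version, the second the chemical formula, and the remaining ones start with a letter that names the layer (here c for connectivity and h for hydrogen atoms).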

With its fixed length of 27 characters, the InChIKey (the hashed version of the InChI) allows for compact representation and use in databases and search engines.
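Because the length is fixed, a cheap shape check can reject obviously malformed keys before a database lookup. The sketch below assumes the usual InChIKey layout of three hyphen-separated blocks of 14, 10 and 1 uppercase letters; that layout comes from general InChIKey documentation rather than from this readme, and the function name is hypothetical. The check only tests the shape, it does not verify that the key was actually derived from a structure.

```shell
# Return success if the argument has the shape of an InChIKey:
# 27 characters, laid out as 14 + 10 + 1 uppercase letters joined by hyphens.
# is_inchikey_shaped is a hypothetical helper, not part of the InChI software.
is_inchikey_shaped() {
    key=$1
    [ "${#key}" -eq 27 ] || return 1
    printf '%s\n' "$key" | LC_ALL=C grep -Eq '^[A-Z]{14}-[A-Z]{10}-[A-Z]$'
}

# InChIKey of ethanol, used as an illustrative input.
if is_inchikey_shaped 'LFQSCWFLJHTTHZ-UHFFFAOYSA-N'; then
    echo "looks like an InChIKey"
fi
```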

InChI is used by most of the large chemical databases and software applications handling many millions of chemical structures.

InChI enables the linking and interlinking of chemistry and chemical structures on the web and computer platforms. By enhancing the discoverability of chemical structures, InChI advances the FAIR Guiding Principles for scientific data management and stewardship. FAIR was published in 2016 to provide guidelines to improve the Findability, Accessibility, Interoperability, and Reuse of digital assets. InChI provides ‘Findability’ for chemical structures and extends Interoperability between platforms, both of which foster Accessibility and Reuse.

InChI Trust

The InChI Trust is a charity that supports the development and promotion of the InChI standard. It works in partnership with IUPAC to update and release new extensions to and applications of InChI. The Trust is a membership organisation, governed by its Board of Trustees which includes representation from IUPAC.

The scientific design of the various tools and capabilities that comprise the InChI code is defined by the InChI Working Groups, which are made up of volunteers from the InChI community with IUPAC oversight. Each of these voluntary groups focuses on a specific area of chemistry or on specific tools within the InChI code. See Working Groups for details on each group and its membership.

The development of the code is coordinated by the Technical Director of the InChI Trust, together with the working groups, IUPAC and our development partners. Our development partners currently include RWTH Aachen (as part of NFDI4Chem, acknowledging funding from Volkswagen Stiftung and the Data Literacy Alliance – DALIA), and the Beilstein Institut.

How to contribute

If you have any questions, comments, or suggestions, please post them on the GitHub discussion page.

If you encounter a bug, please create an issue.

You are welcome to contribute to this project. To do so, you may create a pull request.

Contents of this repository

INCHI-1-BIN

The INCHI-1-BIN subfolder contains binaries of the command line InChI executable (inchi-1) and the InChI API library (libinchi).

INCHI-1-DOC

The INCHI-1-DOC subfolder contains documentation related to the InChI Software.

INCHI-1-SRC

The INCHI-1-SRC subfolder contains the InChI source code. It also contains examples of InChI API usage, for C (inchi_main, mol2inchi, test_ixa), as well as the InChI API library source code and related projects/makefiles.

INCHI-1-TEST

The INCHI-1-TEST subfolder contains the test scripts and resources.

Images

The Images subfolder contains the images used in this readme.

Using precompiled binaries

64-bit and 32-bit precompiled binaries (executable, .dll/.so and ELF files) are located in the following folders:

Microsoft® Windows (files are given in compressed .zip format)

Files                                        Architecture  Location                       Compiler
inchi-1.exe                                  64-bit        INCHI-1-BIN/windows/64bit      Microsoft® Visual Studio C++ (MSVC)
inchi-1.exe                                  32-bit        INCHI-1-BIN/windows/32bit      MinGW-w64/Clang(1)
libinchi.dll + corresponding inchi_main.exe  64-bit        INCHI-1-BIN/windows/64bit/dll  Microsoft® Visual Studio C++ (MSVC)
libinchi.dll + corresponding inchi_main.exe  32-bit        INCHI-1-BIN/windows/32bit/dll  MinGW-w64/Clang(1)

UNIX-based OSs (except MacOS®) (files are given in compressed .zip format)

Files                                                      Architecture  Location                   Compiler
inchi-1 (ELF file)                                         64-bit        INCHI-1-BIN/linux/64bit    GCC
inchi-1 (ELF file)                                         32-bit        INCHI-1-BIN/linux/32bit    Clang/LLVM(2)
libinchi.so.1.07 + corresponding inchi_main (ELF file)     64-bit        INCHI-1-BIN/linux/64bit/so GCC
libinchi.so.1.07 + corresponding inchi_main (ELF file)     32-bit        INCHI-1-BIN/linux/32bit/so Clang/LLVM(2)

(1) IMPORTANT NOTE: The 32-bit binaries for the Microsoft® Windows operating system are compiled using MinGW-w64, and it has been reported that in certain environments the dynamic link library libgcc_s_dw2-1.dll has to be placed in the same folder as the executables. Therefore, libgcc_s_dw2-1.dll has been added to the INCHI-1-BIN/windows/32bit and INCHI-1-BIN/windows/32bit/dll folders (we would like to thank nbehrnd for his assistance with this matter).
(2) In order to make makefile32s more consistent on all operating systems (see the note (1) above), and for easier change of the default compiler, the default compiler on 32-bit UNIX-based OSs has been set to Clang/LLVM.
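
The folder layout listed above is regular enough to be computed in a script. The following sketch maps an operating system, a bitness and a file kind to the corresponding folder; the function name and its argument convention are hypothetical, and only the four combinations listed above are accepted.

```shell
# Map (os, bits, kind) to the folder that holds the precompiled files.
# inchi_bin_dir is a hypothetical helper; os is "windows" or "linux",
# bits is 64 or 32, kind is "exe" (command line) or "lib" (API library).
inchi_bin_dir() {
    os=$1
    bits=$2
    kind=${3:-exe}
    case "$os/$bits" in
        windows/64|windows/32) lib_subdir=dll ;;
        linux/64|linux/32)     lib_subdir=so ;;
        *) echo "no precompiled binaries listed for $os $bits-bit" >&2; return 1 ;;
    esac
    case $kind in
        exe) printf 'INCHI-1-BIN/%s/%sbit\n' "$os" "$bits" ;;
        lib) printf 'INCHI-1-BIN/%s/%sbit/%s\n' "$os" "$bits" "$lib_subdir" ;;
        *) echo "kind must be exe or lib" >&2; return 1 ;;
    esac
}

inchi_bin_dir linux 64 lib
```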


Precompiled binaries for MacOS® (i.e. .app executables and .dylib libraries) will be provided very soon. Until then, please note that InChI can now be compiled from source on MacOS® using native/default Clang or GCC (if installed).

Compiling from source

Microsoft® Windows: Solution/project files for Microsoft® Visual C++ (MSVC)/Clang/LLVM and Intel® oneAPI DPC++/C++ Compiler are provided for both command line and API versions of InChI v.1.07. The solution/project files are located in the following folders:

  • INCHI-1-SRC/INCHI_EXE/inchi-1/vc14 (command line version)
  • INCHI-1-SRC/INCHI_API/demos/inchi_main/vc14 (API version consisting of libinchi.dll and its corresponding executable inchi_main.exe)
  • INCHI-1-SRC/INCHI_API/libinchi/vc14 (API version consisting only of libinchi.dll).

UNIX-based OSs/MacOS®/Microsoft® Windows: For GCC and Clang/LLVM compilers, InChI v.1.07 can be compiled from the source using Make software. makefile/makefile32 files are provided in the following folders:

  • INCHI-1-SRC/INCHI_EXE/inchi-1/gcc (command line version)
  • INCHI-1-SRC/INCHI_API/demos/inchi_main/gcc (API version consisting of libinchi.dll/libinchi.so.1.07/libinchi.1.07.dylib and its corresponding executable/ELF inchi_main.exe/inchi_main)
  • INCHI-1-SRC/INCHI_API/libinchi/gcc (API version consisting only of libinchi.dll/libinchi.so.1.07/libinchi.1.07.dylib).
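
As a rough sketch of a command line build with Make, the helper below changes into the gcc folder of the command line version and runs make with the chosen makefile. It assumes that the files are literally named makefile and makefile32, as the text above suggests, and that make is on the PATH; the function name and the repository path argument are hypothetical.

```shell
# Build the command line version from a checkout of the repository.
# build_inchi_cli is a hypothetical helper; $1 is the repository root,
# $2 is the makefile name ("makefile" by default, "makefile32" for 32-bit).
build_inchi_cli() {
    repo_root=$1
    makefile_name=${2:-makefile}
    build_dir=$repo_root/INCHI-1-SRC/INCHI_EXE/inchi-1/gcc
    if [ ! -f "$build_dir/$makefile_name" ]; then
        echo "cannot find $build_dir/$makefile_name" >&2
        return 1
    fi
    # Run make in a subshell so the caller's working directory is unchanged.
    (cd "$build_dir" && make -f "$makefile_name")
}

# Example (not run here): build_inchi_cli /path/to/InChI makefile32
```

The same pattern should apply to the two API folders by swapping the directory.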

New features in makefile/makefile32:

  • makefile/makefile32 files are configured to detect the OS automatically, so it is no longer necessary to specify the OS explicitly or to run batch/bash script(s) before compiling.
  • GCC and Clang/LLVM compilers are also automatically detected by makefile/makefile32 files with:
    • GCC set as a default compiler on 64-bit platforms
    • Clang/LLVM set as a default compiler on 32-bit platforms (please refer to these notes for more details).
  • If both GCC and Clang/LLVM compilers are installed, the default compiler can be set simply by changing the CCN parameter in makefile/makefile32, where:
    • CCN = 1 corresponds to GCC
    • CCN = 2 corresponds to Clang/LLVM.
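
Changing CCN can be done in any text editor. For scripted builds, a small helper along the following lines may be convenient. It assumes that the makefile contains a line of the literal form "CCN = <number>"; that format is an assumption taken from the wording above and should be checked against the actual makefile before use. The function name is hypothetical.

```shell
# Set the CCN parameter in a makefile: 1 selects GCC, 2 selects Clang/LLVM.
# set_ccn is a hypothetical helper; it assumes a line of the form "CCN = <number>".
set_ccn() {
    makefile_path=$1
    ccn_value=$2
    case $ccn_value in
        1|2) ;;
        *) echo "CCN must be 1 (GCC) or 2 (Clang/LLVM)" >&2; return 1 ;;
    esac
    tmp_file=$(mktemp) || return 1
    # Rewrite only lines of the form "CCN = <number>"; leave everything else alone.
    if sed "s/^CCN[[:space:]]*=[[:space:]]*[0-9][0-9]*[[:space:]]*\$/CCN = $ccn_value/" \
        "$makefile_path" > "$tmp_file"; then
        cat "$tmp_file" > "$makefile_path"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp_file"
    return $status
}

# Example (not run here): set_ccn INCHI-1-SRC/INCHI_EXE/inchi-1/gcc/makefile 2
```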

Support for native/default MacOS® Clang compiler is now provided with 64-bit versions of makefile files (we would like to thank John Mayfield for his assistance with this matter).

If makefile/makefile32 is used for compiling libinchi on Microsoft® Windows, libinchi.dll is now generated instead of libinchi.so.1.07. Also, please make sure to read the notes regarding the libgcc_s_dw2-1.dll that is required for running 32-bit executables on the Microsoft® Windows operating system in certain environments.

Additional notes:

Known issues

If the API version (i.e. libinchi.so.1.07 and the inchi_main ELF file) is compiled using Clang/LLVM on Linux and inchi_main cannot find libinchi.so.1.07, LD_LIBRARY_PATH should be set, either temporarily or permanently, before the inchi_main ELF file is used. It might be worth trying to change the value of LINKER_CWD_PATH to -Wl,-R,"",-rpath,$(LIB_DIR) (i.e. replacing = with ,) in the corresponding makefile/makefile32; however, please note that during our tests this option failed to generate libinchi.so.1.07 with Clang/LLVM on Linux. More reliably, LD_LIBRARY_PATH can be set in several ways:

  • Temporarily:

    • by running a shell script ldlp_fix.sh (located in /INCHI_API/bin/Linux) with either of these two commands:

      • . ldlp_fix.sh

      • source ldlp_fix.sh;

        path to libinchi.so.1.07 can be edited in ldlp_fix.sh

    • using command line interface:

      export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/libinchi.so.1.07
      
  • Permanently:

    • by adding the following line to ~/.bashrc (export makes the variable visible to programs started from the shell):

       export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/path/to/libinchi.so.1.07"
      
    • by adding the libinchi.so.1.07 path to ld.so.conf, which means adding a file /etc/ld.so.conf.d/local.conf containing just one line:

      <path_to/>libinchi.so.1.07

      and then running sudo ldconfig. The line must name the directory that contains libinchi.so.1.07, not the library file itself.

    • Open-source utility patchelf can also be of use.

If a similar issue occurs on MacOS®, one of the above solutions should be applied for setting DYLD_LIBRARY_PATH and/or DYLD_FALLBACK_LIBRARY_PATH (which behave like LD_LIBRARY_PATH).
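
When setting LD_LIBRARY_PATH from a script, two details are easy to get wrong: the variable takes directories rather than file names, and appending to an empty variable leaves an empty entry, which the dynamic loader treats as the current working directory. The sketch below avoids both and skips directories that are already listed. The function name and the directory /opt/inchi/lib are hypothetical.

```shell
# Append a directory to LD_LIBRARY_PATH without creating empty or duplicate entries.
# append_lib_dir is a hypothetical helper; pass the directory that holds libinchi.so.1.07.
append_lib_dir() {
    lib_dir=$1
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$lib_dir:"*)
            ;;  # already listed, nothing to do
        *)
            # Add the separating colon only if the variable is non-empty.
            LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$lib_dir
            ;;
    esac
    export LD_LIBRARY_PATH
}

# /opt/inchi/lib is a placeholder for the real library directory.
append_lib_dir /opt/inchi/lib
```

Sourcing a file with this function from ~/.bashrc gives the permanent variant; calling it in the current shell gives the temporary one.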

Optional features

Experimental features under development

Some of the experimental/engineering/hidden options in InChI 1.07 that are known not to be fully functional are:

  • In command line version:

    • 32-bit Microsoft® Visual Studio C++ (MSVC) Win32 and Microsoft® LLVM/Clang compiler-specific issues with the following options:

      • AMI Allow multiple input files (wildcards supported)
      • AMIOutStd Write output to stdout (in AMI mode)
      • AMILogStd Write log to stderr (in AMI mode)
      • AMIPrbNone Suppress creation of problem files (in AMI mode)
  • In API/.dll/.so version:

    • KET Consider keto-enol tautomerism (experimental)
    • 15T Consider 1,5-tautomerism (experimental)
    • PT_06_00 Consider 1,3 heteroatom shift (experimental)
    • PT_13_00 Consider keten-ynol exchange (experimental)
    • PT_16_00 Consider nitroso-oxime tautomerism (experimental)
    • PT_18_00 Consider cyanic/iso-cyanic acids (experimental)
    • PT_22_00 Consider imine/amine tautomerism (experimental)
    • PT_39_00 Consider nitrone/azoxy or Behrend rearrangement (experimental)
    • Polymers105 Allow processing of polymers (experimental, legacy mode of v. 1.05)
    • NoEdits Disable polymer CRU frame shift and folding
    • NPZz Allow non-polymer-related Zz atoms (pseudo element placeholders)
    • SAtZz Allow stereo at atoms connected to Zz (default: disabled)
    • InChI2StructTest mode: Mol/SDfile -> InChI -> Structure -> (InChI+AuxInfo) -- produces Fatal Error (2)3 just like in InChI v.1.06
    • InChI2InChI Convert InChI string(s) into InChI string(s) -- produces Fatal Error(2)3 just like in InChI v.1.06

Please refrain from using the above-mentioned options, as they might not function properly or might not be recognized. Regular updates on their functionality will be posted on this page.