Supported tags and respective Dockerfile links

AmbiverseNLU Knowledge Graph Neo4j Database Dockerfile

This Dockerfile extends the official neo4j:3.5.0 Docker image. It creates a neo4j graph database version of YAGO, with the database name specified by an environment variable, by downloading the database dump and restoring the database from it. With this image you are ready to use the AmbiverseNLU Knowledge Graph service with a neo4j graph database.

Environment Variables

This image has several environment variables that need to be set up. The environment variables from the original image can also be set and are optional; the following environment variables are mandatory.
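The variables inherited from the original image follow the naming scheme of the official neo4j Docker image: the neo4j.conf setting name is prefixed with NEO4J_, every underscore is doubled, and every period becomes an underscore. A minimal sketch of that mapping is shown here; neo4j_env_name is a hypothetical helper introduced for illustration and is not shipped with the image.

```shell
#!/bin/sh
# Convert a neo4j.conf setting name into the environment variable name
# expected by the official neo4j 3.x Docker image.
# neo4j_env_name is a hypothetical helper, not part of the image.
neo4j_env_name() {
    # Double every underscore first, then turn periods into underscores.
    printf 'NEO4J_%s\n' "$(printf '%s' "$1" | sed -e 's/_/__/g' -e 's/\./_/g')"
}

neo4j_env_name dbms.active_database        # NEO4J_dbms_active__database
neo4j_env_name dbms.memory.heap.max_size   # NEO4J_dbms_memory_heap_max__size
```

The order of the two substitutions matters: doubling underscores after converting periods would also double the underscores that stand for periods.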

DUMP_NAME

This environment variable is used to define a name for the database that is created when the image is first started.

This environment variable is also used to define the name of the database dump.
The name must be chosen from the following list of dumps:

NEO4J_dbms_active__database

This environment variable comes from the original image and sets the database named by DUMP_NAME as the active database. In the run examples in this README, its value is the DUMP_NAME followed by .db.
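The run examples in this README pair each DUMP_NAME with an active database of the same name plus a .db suffix. Assuming that pairing holds for every dump, the second value can be derived from the first instead of being typed twice. This is only a sketch; active_db_for_dump is a hypothetical helper name.

```shell
#!/bin/sh
# Derive the active database name from the dump name by appending ".db",
# mirroring the pairing used in the run examples of this README.
# active_db_for_dump is a hypothetical helper name.
active_db_for_dump() {
    printf '%s.db\n' "$1"
}

DUMP_NAME=yago_aida_en20180120_cs20180120_de20180120_es20180120_ru20180120_zh20180120
NEO4J_dbms_active__database=$(active_db_for_dump "$DUMP_NAME")
echo "$NEO4J_dbms_active__database"
```

Both variables can then be passed to docker run with -e, which keeps the two names from drifting apart when the dump is changed.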

The next section shows how to run neo4j with docker run and with docker-compose. The environment variables used there are recommended and work for the size of the database dump. Please note that if you change NEO4J_AUTH, you have to adapt the neo4j.properties file in the AmbiverseNLU KG accordingly, or link an external file to the KG container.
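NEO4J_AUTH has the form user/password. When adapting neo4j.properties, it can help to split the value so that each part can be compared with what the KG service is configured to use. The sketch below only extracts the two parts; the property keys in neo4j.properties are not shown in this README, so they are left out, and auth_user and auth_password are hypothetical helper names.

```shell
#!/bin/sh
# Split a NEO4J_AUTH value of the form user/password into its two parts.
# auth_user and auth_password are hypothetical helper names.
auth_user()     { printf '%s\n' "${1%%/*}"; }   # text before the first slash
auth_password() { printf '%s\n' "${1#*/}"; }    # text after the first slash

NEO4J_AUTH=neo4j/neo4j_pass
echo "user:     $(auth_user "$NEO4J_AUTH")"
echo "password: $(auth_password "$NEO4J_AUTH")"
```

Splitting at the first slash keeps passwords that themselves contain a slash intact.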

Running the AmbiverseNLU container

To run the image and connect it directly to the AmbiverseNLU KG service, you can proceed in either of the following two ways:

docker run -d --restart=always --name kg-db-neo4j \
	-p 7474:7474 -p 7687:7687 \
	-e NEO4J_dbms_active__database=yago_aida_en20180120_cs20180120_de20180120_es20180120_ru20180120_zh20180120.db \
	-e NEO4J_dbms_memory_pagecache_size=8G \
	-e NEO4J_dbms_memory_heap_initial__size=8G \
	-e NEO4J_dbms_memory_heap_max__size=12G \
	-e NEO4J_dbms_connectors_default__listen__address=0.0.0.0 \
	-e NEO4J_dbms_security_procedures_unrestricted='apoc.*' \
	-e NEO4J_AUTH=neo4j/neo4j_pass \
	-e DUMP_NAME=yago_aida_en20180120_cs20180120_de20180120_es20180120_ru20180120_zh20180120 \
	--ulimit=nofile=40000:40000 \
	-v "$HOME/neo4j/data:/data" \
	ambiverse/kg-db-neo4j

 

If you want to connect it to the AmbiverseNLU KG container, use the command below. This links the kg-db-neo4j container. The link alias, which is the part after the colon (kg-db in the command below), is important, since it must match the host name defined in the neo4j.properties file in the AmbiverseNLU KG project.

docker run -d --restart=always --name ambiverse-kg \
 -p 8080:8080 \
 --link kg-db-neo4j:kg-db \
 ambiverse/ambiverse-kg

 

... or via docker-stack deploy or docker-compose

Example service-kg.yml for AmbiverseNLU KG:

version: '3.6'

services:

  kg-db:
    image: ambiverse/kg-db-neo4j
    restart: always
    deploy:
      replicas: 1
    environment:
      DUMP_NAME: yago_aida_en20180120_cs20180120_de20180120_es20180120_ru20180120_zh20180120
      NEO4J_dbms_active__database: yago_aida_en20180120_cs20180120_de20180120_es20180120_ru20180120_zh20180120.db
      NEO4J_dbms_memory_pagecache_size: 8G
      NEO4J_dbms_memory_heap_initial__size: 8G
      NEO4J_dbms_memory_heap_max__size: 12G
      NEO4J_dbms_connectors_default__listen__address: 0.0.0.0
      NEO4J_dbms_security_procedures_unrestricted: apoc.*
      NEO4J_AUTH: neo4j/neo4j_pass
    ulimits:
      nofile: 40000
    volumes:
      - type: volume
        source: dbdata
        target: /data
      - type: tmpfs
        target: /var/tmp/data
        tmpfs:
          size: 107374182400
    healthcheck:
      test: curl -sS http://127.0.0.1:7474/browser/ || exit 1
      interval: 1m
      timeout: 60s
      retries: 15
      start_period: 60m
    networks:
      kgnet:
        aliases:
          - kg-db

  kg:
    image: ambiverse/ambiverse-kg
    restart: always
    deploy:
      replicas: 1 # Increase the number of replicas if you want to scale horizontally
      resources:
        limits:
          #cpus: "1"
          memory: 16G
      restart_policy:
        condition: on-failure
    depends_on:
      - kg-db
    ports:
      - 8080:8080
    networks:
      - kgnet
    healthcheck:
      test: curl -sS http://127.0.0.1:8080/v2/knowledgegraph/entities/Q567 || exit 1
      interval: 1m
      timeout: 60s
      retries: 10
      start_period: 10s

volumes:
  dbdata:

networks:
  kgnet:

Run docker stack deploy -c service-kg.yml ambiverse-kg (or docker-compose -f service-kg.yml up) and wait for the services to initialize completely. The kg-db healthcheck allows a start period of 60 minutes for this.
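Waiting for initialization can be scripted by polling the same URL that the kg healthcheck uses. The following is a minimal sketch, not part of the project: wait_until is a hypothetical helper, the attempt count and delay are arbitrary choices, and the -f flag (fail on HTTP errors) is added to the curl call as an assumption about what counts as "ready".

```shell
#!/bin/sh
# Retry a command until it succeeds or the attempts run out.
# wait_until is a hypothetical helper:
#   wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Run the command quietly; stop as soon as it succeeds.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only when another attempt follows.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example (commented out so that sourcing this file has no side effects):
# wait_until 60 60 curl -sSf http://127.0.0.1:8080/v2/knowledgegraph/entities/Q567 \
#     && echo "KG service is up"
```

The function returns 0 on the first successful attempt and 1 if every attempt fails, so it can be chained with && or used in an if statement.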