DeepPhe NLP extracts information from patient cancer reports and stores the data in a Neo4j graph database. The DeepPhe-Viz tool presents the extracted information to end users in an organized workflow, enabling exploration and discovery of patient data.
You must have the following tools installed:
- Node.js 12.13.0 (includes npm 6.12.0) or the latest LTS version, which the DeepPhe-Viz tool is built upon
- Neo4j 3.5.x Server, which is used to store the graph output from DeepPhe NLP
If you need to manage multiple versions of Node.js, we have been successfully using the nvm tool to configure and manage our Node.js environment. nvm lets a user associate a particular Node.js and npm version with their Unix shell, allowing for easy switching between Node.js versions across different projects.
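To confirm which Node.js version is active in the current shell (for example after switching versions with nvm), a small check like the following can help. It is only a sketch: it treats 12.13.0 as a minimum, which is an assumption based on the prerequisites list, and the helper names `compareVersions` and `meetsMinimum` are illustrative rather than part of DeepPhe-Viz.

```javascript
// Version named in the prerequisites; treating it as a minimum is an assumption.
const MIN_NODE_VERSION = '12.13.0';

// Compare two dotted versions such as "12.13.0". Returns a negative number,
// zero, or a positive number, like a sort comparator.
function compareVersions(a, b) {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// True when the given version is at least the minimum.
function meetsMinimum(version, minimum = MIN_NODE_VERSION) {
  return compareVersions(version, minimum) >= 0;
}

// process.versions.node is the running interpreter's version without the "v" prefix.
console.log(`Node ${process.versions.node} meets minimum: ${meetsMinimum(process.versions.node)}`);
```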
For Neo4j server installation, we have tested "Neo4j Community Edition 3.5.1" with this DeepPhe release. You can download it from the Neo4j Releases page by choosing the correct download for your platform, then follow their installation instructions to configure and start the server.
Next, download or clone the DeepPhe-Viz repo and enter the project directory. Installing this package and all its dependencies can be done with a single command with no arguments:
npm install
There are two configuration files under the configs/ directory:
- neo4j.json is where you specify the neo4j database connection username and password
- server.json is where you can define the DeepPhe-Viz HTTP server host and port number
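Before starting the server, it can be useful to check that both configuration files parse as JSON and contain the values you expect. The sketch below is not part of DeepPhe-Viz, and the key names in the commented usage (`username`, `password`, `host`, `port`) are assumptions; check the shipped files for the actual key names.

```javascript
const fs = require('fs');

// Return the names of required keys that are missing or empty in a config object.
function missingKeys(config, requiredKeys) {
  return requiredKeys.filter(
    (key) => config[key] === undefined || config[key] === null || config[key] === ''
  );
}

// Read a JSON config file and report which required keys are missing.
// Throws if the file does not exist or is not valid JSON.
function checkConfigFile(filePath, requiredKeys) {
  const config = JSON.parse(fs.readFileSync(filePath, 'utf8'));
  return missingKeys(config, requiredKeys);
}

// Hypothetical usage; the key names are assumptions, not the documented schema:
// checkConfigFile('configs/neo4j.json', ['username', 'password']);
// checkConfigFile('configs/server.json', ['host', 'port']);
```

An empty array means every listed key is present and non-empty.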
NOTE: the top-level directory of the Neo4j installation is referred to as NEO4J_HOME; this is where you see the bin and plugins directories.
After building the DeepPhe system, you will have a deepphe.db folder generated in the output folder named output_graph. Put the generated deepphe.db under your <NEO4J_HOME>/data/databases/ directory and configure <NEO4J_HOME>/conf/neo4j.conf to point to this database:
dbms.active_database=deepphe.db
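To double-check that the setting is active and not commented out, the configuration text can be parsed with a few lines of Node.js. This is a minimal sketch that assumes neo4j.conf uses plain `key=value` lines with `#` comments; `parseConf` and `activeDatabase` are illustrative names.

```javascript
// Parse "key=value" lines, skipping blank lines and lines that start with "#".
function parseConf(text) {
  const settings = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue;
    const eq = line.indexOf('=');
    if (eq === -1) continue; // not a key=value line
    settings[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return settings;
}

// Return the configured active database, or undefined if the setting is absent.
function activeDatabase(confText) {
  return parseConf(confText)['dbms.active_database'];
}

// Usage, with <NEO4J_HOME> replaced by your installation path:
// const fs = require('fs');
// console.log(activeDatabase(fs.readFileSync('<NEO4J_HOME>/conf/neo4j.conf', 'utf8')));
```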
You'll also have a file named deepphe-viz-0.3.0-plugin.zip in the directory deepphe-viz-neo4j/target after building the DeepPhe system. This compressed file contains a directory named plugins. All the jar files in that plugins directory must be copied to the <NEO4J_HOME>/plugins directory. DeepPhe-Viz uses these libraries to interact with the customized DeepPhe system database.
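Once the zip file has been extracted, the copy step can be scripted. The following is a sketch under the assumption that the jar files sit directly inside the extracted plugins directory; `copyPluginJars` is a hypothetical helper, and the paths in the usage comment are placeholders.

```javascript
const fs = require('fs');
const path = require('path');

// Copy every .jar file from sourceDir into targetDir and return the copied file names.
function copyPluginJars(sourceDir, targetDir) {
  fs.mkdirSync(targetDir, { recursive: true }); // no error if it already exists
  const jars = fs.readdirSync(sourceDir).filter((name) => name.endsWith('.jar'));
  for (const name of jars) {
    fs.copyFileSync(path.join(sourceDir, name), path.join(targetDir, name));
  }
  return jars;
}

// Usage, with placeholder paths:
// copyPluginJars('path/to/extracted/plugins', '<NEO4J_HOME>/plugins');
```

Neo4j typically needs to be restarted before it picks up newly copied plugin jars.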
A copy of the plugin file is also available on the DeepPhe system Releases page.
To run Neo4j as a console application, use:
./<NEO4J_HOME>/bin/neo4j console
To run Neo4j in a background process, use:
./<NEO4J_HOME>/bin/neo4j start
Once you create a new password for the 'neo4j' user on your first visit to the Neo4j Browser at http://localhost:7474, you'll have full access to the Neo4j database. The same username and password also need to be set in the DeepPhe-Viz configuration file configs/neo4j.json so that DeepPhe-Viz can talk to the Neo4j server.
Now you can start the DeepPhe-Viz HTTP server with:
node server.js
This will start the web server on port 8383 by default. You can go to http://localhost:8383/cohortAnalysis to see the result. We'll describe the usage and workflow later.
Note: you can type lsof -i :8383 to see if port 8383 is being used. If you need to use a different port for the DeepPhe-Viz HTTP server, specify the port number in the DeepPhe-Viz configuration file configs/server.json, then restart the DeepPhe-Viz HTTP server.
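On systems where lsof is not available, a similar check can be done from Node.js itself by briefly trying to listen on the port. This is a sketch using the built-in `net` module; `isPortFree` is an illustrative name, and probing only 127.0.0.1 is an assumption about how the server is bound.

```javascript
const net = require('net');

// Resolve to true if nothing is listening on the port, false if it is in use.
// Other errors (for example, insufficient permissions) reject the promise.
function isPortFree(port, host = '127.0.0.1') {
  return new Promise((resolve, reject) => {
    const probe = net.createServer();
    probe.once('error', (err) => {
      if (err.code === 'EADDRINUSE') resolve(false);
      else reject(err);
    });
    probe.once('listening', () => {
      // The port was free; release it again before reporting.
      probe.close(() => resolve(true));
    });
    probe.listen(port, host);
  });
}

// Usage:
// isPortFree(8383).then((free) => console.log(free ? 'port is free' : 'port is in use'));
```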
The Viz tool consists of two major components: cohort analysis and individual patient profiles. Please note that DeepPhe NLP may extract multiple cancers for the same patient if present. Currently the Viz tool can render multi-cancer patients in the individual patient profile, showing all the cancer and tumor summaries. However, the cohort analysis page doesn't take multiple cancers into consideration at this point. We are working on a comprehensive solution for the multi-cancer cases.
When you first load DeepPhe-Viz in the web browser, you'll see the cohort analysis page. The system queries Neo4j to get all the patients of all cancer stages, and the results are represented in a series of charts. The two charts (A) and (B) in the top section can be used as filters to narrow down the target patients (C) and the resulting charts (D), (E), and (F).
A. Patient Count Per Stage
This chart shows the number of patients in each cancer stage. When users click one of the stage bars, the Viz tool updates the other charts to show only the patients from that stage. Users can also click the top-level stage label text to show or hide all its sub-stages. The top-level stage stays unchanged. Note that some patients may have more than one stage, so the total number across all stages might be larger than the total number of patients in the cohort.
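The effect of patients with more than one stage can be seen in a small worked example. The cohort below is made up for illustration, and the data shape (an `id` and a `stages` array per patient) is an assumption rather than the structure DeepPhe-Viz uses internally.

```javascript
// Hypothetical cohort: each patient lists every stage extracted for them.
const patients = [
  { id: 'patient01', stages: ['Stage I'] },
  { id: 'patient02', stages: ['Stage II', 'Stage III'] },
  { id: 'patient03', stages: ['Stage II'] },
];

// Count how many patients fall under each stage.
function countPatientsPerStage(cohort) {
  const counts = {};
  for (const patient of cohort) {
    // A Set guards against the same stage being listed twice for one patient.
    for (const stage of new Set(patient.stages)) {
      counts[stage] = (counts[stage] || 0) + 1;
    }
  }
  return counts;
}

const perStage = countPatientsPerStage(patients);
const totalAcrossStages = Object.values(perStage).reduce((sum, n) => sum + n, 0);
console.log(perStage, totalAcrossStages, patients.length);
```

Here patient02 is counted under both Stage II and Stage III, so the per-stage counts add up to 4 even though the cohort has only 3 patients.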
B. Patient Age of First Encounter Per Stage
Box-and-whisker plots summarize the distribution of patient age at first encounter across all cancer stages. The age range sliders on both sides can be used to specify the target range of first-encounter age.
C. Target Patients
The target patients list is grouped by their age of first encounter and serves as the entry point to the individual patient profile. The highlighted patients are the ones being displayed in the Diagnosis chart (D).
D. Diagnosis
The diagnosis chart is a summary of all the grouped diagnoses across all the target patients based on the filters. Moving the bottom slider scrolls through the patients along the X axis.
E. Biomarkers Overview
The biomarkers overview chart is a simple distribution that shows the percentage of patients with biomarkers and patients without biomarkers among the target patients. The split is shown because biomarkers don't apply to patients with certain diagnoses.
F. Patients With Biomarkers
The patients with biomarkers chart is a stacked bar chart that shows the percentage of patients who are positive, negative, and unknown for major biomarkers.
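As an illustration of the numbers behind each stacked bar, the sketch below turns positive, negative, and unknown patient counts for one biomarker into percentages. The counts and the `toPercentages` helper are hypothetical, and rounding to one decimal place is an assumption, not necessarily what the Viz tool does.

```javascript
// Convert positive/negative/unknown patient counts into percentages of the total.
function toPercentages(counts) {
  const total = counts.positive + counts.negative + counts.unknown;
  if (total === 0) return { positive: 0, negative: 0, unknown: 0 };
  // Round to one decimal place.
  const pct = (n) => Math.round((n / total) * 1000) / 10;
  return {
    positive: pct(counts.positive),
    negative: pct(counts.negative),
    unknown: pct(counts.unknown),
  };
}

// Hypothetical counts for a single biomarker among 10 patients.
const exampleBar = toPercentages({ positive: 5, negative: 3, unknown: 2 });
console.log(exampleBar);
```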
Clicking a target patient in the Cohort Analysis patient table displays the individual patient page.
The patient view starts with personal information on the upper left (A), followed by the cancer and tumor summary (B). The default tumor summary list view stacks all the tumors, and the table view shows comparable items side by side. Similar concepts are grouped and share background colors. Concepts are ordered by importance, and each can be clicked to display its original sources in clinical notes on the right.
The timeline view (C) provides a temporal view of all of the reports for this patient. Currently we have the following report types:
- Progress Note
- Radiology Report
- Surgical Pathology Report
- Discharge Summary
- Clinical Note
On top of the timeline is an interactive episode legend. You can toggle the visibility of an episode type by clicking its episode circle, and you can zoom in or out on the reports of an episode by clicking its legend text. Currently we have the following episode types:
- Pre-diagnostic
- Diagnostic
- Medical Decision-making
- Treatment
- Follow-up
- Unknown
The double-thumb slider below the timeline can also be used to zoom and scroll through the timeline in more detail.
All information shown in the cohort graphs, cancer and tumor summaries, and patient timeline is extracted from clinical notes or inferred via domain rules.
When you click one of the report dots, the report text is shown underneath the timeline (D) with all of the concepts extracted from the report. Clicking on one of these terms causes the document text to scroll to the relevant span.
Note that the text in this example is obscured to protect the privacy of the patient.
All of the summary items from the full cancer and tumor summaries can also be clicked to show their source report in the timeline.
We've created a set of API endpoints, documented with Swagger UI, for advanced users who want to explore potential use cases for their additional needs. The API documentation can be accessed at http://localhost:8383/documentation once you have the DeepPhe-Viz server running. This documentation allows users to visualize and interact with the API's resources without having any of the implementation logic in place.
If you would like to poke around the Viz tool and make changes to the source code, you probably don't want to restart the server with node server.js after every code change. Nodemon is a utility that monitors your source for changes and automatically restarts your server, which makes it well suited for development. To install it:
npm install -g nodemon
Then just use nodemon instead of node to start the server, and your process will automatically restart when your code changes:
nodemon server.js