ReCiter Publication Manager is a powerful web application that streamlines the process of updating and reporting on the publications of an institution's scholars. Publication Manager is the front end user interface within the ReCiter suite of applications. In addition to the requirements for the application itself, you will need to minimally install the following:
- ReCiter - a machine learning-based publication recommendation engine which provides high quality suggestions for individuals of interest
- ReCiterDB - the back end data store for Publication Manager; in addition to the schema and stored procedures, this repository contains a set of scripts that retrieve data from ReCiter and import it into this MySQL database
- ReCiter PubMed Retrieval Tool - An application which provides an API which sits on top of PubMed's eFetch web service. Generally speaking, this API provides some basic inferences and makes the PubMed data easier to work with.
Publication Manager optionally integrates with ReCiter PubNotifier, which uses AWS Lambda to automatically send emails to faculty members about publications that are newly accepted or under review. It is designed to work in conjunction with both ReCiter Publication Manager and ReCiterDB for efficient notification management.
See the Functionality section to see screencaps and animations of ReCiter Publication Manager in action.
Requirements for Publication Manager itself:
- Install Docker
  - Go to the Docker website
  - If you don't already have an account, you will need to create one.
  - Use the Docker image to download and install Docker
  - To verify Docker is installed, execute `docker ps` at the command line
- Install Node
  - Enter `brew install node`
  - For more, see here.
Publication Manager is part of the ReCiter suite of applications. In addition to the above, you will need to install and set up the following:
- ReCiter - a machine learning-based publication recommendation engine
- ReCiterDB - the back end data store for Publication Manager; this repository also includes a set of scripts that retrieve data from ReCiter and import it into this MySQL database
- ReCiter PubMed Retrieval Tool - An API on top of PubMed's eFetch web service. Generally speaking, this API makes the PubMed data easier to work with.
- Next.js - an open-source web development framework created by Vercel enabling React-based web applications with server-side rendering and generating static websites.
- ReactJS - a front-end JavaScript library for building user interfaces based on components
- NodeJS - a cross-platform, open-source server environment
- Sequelize - a modern TypeScript and Node.js object-relational mapper (ORM) for MySQL and MariaDB
This guide details the steps for setting up the application locally, through AWS ECS, and as a development box for making and testing changes.
- Clone the repository: Clone the repository to your local machine using the command:
git clone https://github.com/wcmc-its/ReCiter-Publication-Manager.git
- Build the Docker image: Navigate to the cloned directory and build the Docker image: sudo docker build -t reciter-pub-manager .
Confirm the image creation by listing all Docker images:
sudo docker images
- Check and stop containers: Verify no containers are running on port 3000 and stop them if necessary:
sudo docker ps -q --filter "publish=3000"
sudo docker stop <container ID>
- Environment variables setup: Configure the environment variables using the values in the environmental variables wiki. Use "env.local" for local setups and AWS Secrets Manager for AWS ECS setups.
NEXT_PUBLIC_RECITER_API_KEY=<<value>>
RECITER_API_BASE_URL=<<value>>
etc.
- Run the Docker container: Start the Docker container, mapping the desired port (e.g., 5001 to 3000):
sudo docker run -d --name reciter-pub-manager -p 5001:3000 --env-file env.local reciter-pub-manager:latest
To use native authentication for initial testing:
- Create a New User: Utilize the ReCiter API to create a new user with the necessary details, which creates an entry in the `ApplicationUser` table.
- Login: Access the ReCiter Publication Manager login page and log in to create a record in the `admin_users` table.
- Update Roles: Modify roles by adding a new row in the `admin_users_roles` table with the userID and roleID as "1" for superuser privileges.
- Access the Application: Navigate to the application by logging in through the specified port on your local machine.
- View Docker logs: To check the logs for troubleshooting:
sudo docker logs <<Container ID/Container Name>>
docker logs -f -t reciter-pub-manager
- Stopping and removing instances: If needed, stop and remove Docker instances using:
docker stop reciter-pub-manager
docker rm reciter-pub-manager
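Before starting the container, it can help to confirm that the environment file defines the variables the application expects. The following is a minimal sketch, not part of Publication Manager: the helper names `parseEnvFile` and `missingVars` are hypothetical, and the required list contains only the two variable names shown above, so extend it from the environment variables wiki.

```typescript
// Hypothetical pre-flight check for an env file passed to `docker run --env-file`.
// Only these two names appear in this README; add the rest from the wiki.
const REQUIRED_VARS: string[] = [
  "NEXT_PUBLIC_RECITER_API_KEY",
  "RECITER_API_BASE_URL",
];

// Parse KEY=value lines, skipping blank lines and lines that start with "#".
function parseEnvFile(contents: string): Map<string, string> {
  const vars = new Map<string, string>();
  for (const rawLine of contents.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // not a KEY=value line
    vars.set(line.slice(0, eq).trim(), line.slice(eq + 1).trim());
  }
  return vars;
}

// Return the names of required variables that are absent or empty.
function missingVars(
  contents: string,
  required: string[] = REQUIRED_VARS
): string[] {
  const vars = parseEnvFile(contents);
  return required.filter((name) => (vars.get(name) ?? "") === "");
}
```

Calling `missingVars` on the contents of `env.local` returns an empty array when every listed variable has a value.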
To set up a development environment for making and testing changes:
- Install Dependencies: Install application and client dependencies, including NodeJS and React Redux.
- Run Publication Manager: Use `npm run dev` to start both the NodeJS express server and the React server concurrently. Address any port conflicts as necessary.
- Access the Application: The application should automatically open in your default browser. Log in through the `/login` page.
For real-time changes and testing:
- Modify CSS: For example, change the `Header.css` background-color to `#ff6600` and observe instant updates in the browser.
To start the development server:
npm run dev
# or
yarn dev
Access the app at http://localhost:5001 and start editing. API routes can be explored and modified as described.
Application users are stored in the `admin_users` table in ReCiterDB. There are two options for adding and updating users:
- Manual - To add or update users manually, superusers can go to the web interface and navigate to `Manage module > Manage users`. From there, they can add or update users as needed.
- Programmatic - If you have configured ReCiterDB and its associated scripts, users should automatically be populated in the `person` table and added to the `admin_users` table. To configure this option, you will need to follow the setup instructions for ReCiterDB and ensure that the associated scripts are running as intended. Once this is set up, users should be added or updated in the `admin_users` table automatically.
ReCiter Publication Manager supports two options for authentication:
Option #1: Local login - This option allows users to log in to Publication Manager using a local username and password.
Option #2: SAML-based login - This option is designed to meet institutional security requirements by allowing Publication Manager to work with SAML and an institution's identity provider. During SAML authentication, the identity provider returns a payload that always contains a `personIdentifier` attribute and sometimes contains a `user.email` attribute. Publication Manager first attempts to match the `user.email` attribute against the email address recorded in the `admin_users` table. If this fails, it attempts to match against the `personIdentifier`.
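The email-then-identifier fallback can be sketched as follows. This is an illustration under stated assumptions, not the application's actual code: the `AdminUser` field names and the `matchAdminUser` function are hypothetical, and the case-insensitive email comparison is an assumption rather than documented behavior.

```typescript
// Hypothetical shape of a row in admin_users; real column names may differ.
interface AdminUser {
  userID: number;
  personIdentifier: string;
  email: string | null;
}

// Attributes delivered by the identity provider. personIdentifier is always
// present; user.email is optional.
interface SamlAttributes {
  personIdentifier: string;
  "user.email"?: string;
}

// Try to match on email first, then fall back to personIdentifier.
function matchAdminUser(
  attrs: SamlAttributes,
  users: AdminUser[]
): AdminUser | undefined {
  const email = attrs["user.email"]?.trim().toLowerCase();
  if (email) {
    // Assumption: emails are compared case-insensitively.
    const byEmail = users.find(
      (u) => u.email !== null && u.email.trim().toLowerCase() === email
    );
    if (byEmail) return byEmail;
  }
  return users.find((u) => u.personIdentifier === attrs.personIdentifier);
}
```

Returning `undefined` when neither attribute matches leaves it to the caller to decide whether to reject the login or create a new record.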
To utilize the Publication Manager, users must have specific access roles assigned to them. The access roles available and their corresponding privileges are as follows:
Role | Privileges | Relevant to | How assigned |
---|---|---|---|
Curator (Individual) | Update publication lists for oneself | Faculty and individual authors | Automatically assigned |
Curator (Department) | Update publication lists for everyone in an organization unit | Departmental administrators and staff | Manually assigned by superuser |
Curator (All) | Update publication lists for everyone | Librarians | Manually assigned by superuser |
Reporter | Generate reports about everyone | Departmental administrators and staff | Manually assigned by superuser |
Superuser | Do all of the above and update roles of others | System administrators | Manually assigned by superuser |
Role assignments are stored in the `admin_users_roles` table.
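One way to read the table above is as a pair of permission checks. The sketch below is illustrative only: `curator_self`, `curator_all`, and `reporter_all` are role names used elsewhere in this document, while `curator_department`, `superuser`, the `Actor` and `Target` shapes, and both functions are hypothetical names chosen for the example.

```typescript
// Role labels; curator_department and superuser are assumed names.
type Role =
  | "curator_self"
  | "curator_department"
  | "curator_all"
  | "reporter_all"
  | "superuser";

// Hypothetical view of the logged-in user and the profile being curated.
interface Actor {
  personIdentifier: string;
  orgUnits: string[]; // units this user may curate as a departmental curator
  roles: Role[];
}

interface Target {
  personIdentifier: string;
  orgUnit: string;
}

// May this actor update the target person's publication list?
function canCurate(actor: Actor, target: Target): boolean {
  const has = (role: Role): boolean => actor.roles.includes(role);
  if (has("superuser") || has("curator_all")) return true;
  if (has("curator_department") && actor.orgUnits.includes(target.orgUnit)) {
    return true;
  }
  return has("curator_self") && actor.personIdentifier === target.personIdentifier;
}

// May this actor generate reports about everyone?
function canReport(actor: Actor): boolean {
  return actor.roles.includes("reporter_all") || actor.roles.includes("superuser");
}
```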
The publication lists for the roughly 6,000 key people of interest (e.g., full-time faculty) are curated by WCM librarians every business day. Other types of users such as NYP residents and PhD alumni have profiles that are curated less frequently.
In the event of an error or omission, faculty and departmental users may wish to update these lists through curation.
- If you are a faculty or other user with a profile:
  - You will be directed to your profile upon login.
- If you do not have a profile:
  - Click on the "Find people" tab.
  - Enter the name or person identifier (e.g., NetID, CWID, etc.) of the person you are searching for.
  - Click on the "Curate publications" button.
- Publications are divided into three categories:
  - Suggested - publications for which authorship has neither been accepted nor rejected; these are generally publications suggested by the machine learning algorithm behind ReCiter
  - Accepted - publications by a user for which they have affirmed authorship
  - Rejected - publications by a user for which they have denied authorship
- Reviewing accepted publications:
  - The "Matching Score" gives a general idea of how likely the ReCiter algorithm believes a publication was written by our person of interest.
  - If you are uncertain, expand the "Show evidence behind the suggestion." This will show you why the score is what it is. For example, sometimes a person is on a grant and that person's name has shown up as a co-author on a suggested article.
  - You can also judge if the "Inferred keywords" at the top of the page are consistent with the topic of the article of interest.
  - After you accept or reject a publication, you can choose to "Refresh suggestions." It may take up to a minute to use the feedback you provided to re-score all the candidate records.
Searching PubMed
- Sometimes the ReCiter engine behind Publication Manager fails to find the correct candidate publications. In that case, you can search for them using the "Add New Record: PubMed" link on the right of the page.
- In this screen, you can input any search term you would in PubMed. This includes topics, author names, and a list of PMIDs (PubMed identifiers).
- The application will show a maximum of 100 results. It won't show any records that have already been accepted or rejected.
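The display rules in the last item can be sketched as a small filter. This is a hedged illustration, not the application's implementation: `visibleSearchResults` is a hypothetical name, and applying the 100-record cap after removing already-curated records (rather than before) is an assumption.

```typescript
// Cap stated in this README.
const MAX_RESULTS = 100;

// Hide PMIDs that were already accepted or rejected, then apply the cap.
function visibleSearchResults(
  pmids: number[],
  accepted: ReadonlySet<number>,
  rejected: ReadonlySet<number>
): number[] {
  return pmids
    .filter((pmid) => !accepted.has(pmid) && !rejected.has(pmid))
    .slice(0, MAX_RESULTS);
}
```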
Here's how you curate the publication lists for a group of individuals:
- Enter a list of person identifiers in the "Name or NetID(s)" input box. Or, filter by Organizational Unit, Institution, and/or Person Type(s). You should be able to enter up to 200 person identifiers. Note that a search for "Medicine" won't get results for someone whose primary organizational unit is "Medicine (Cardiology)." You would have to select the parent department and all the subsuming divisions.
- Click on the "Curate publications" button.
- The user interface will display a sequence of individuals who have pending publications to be reviewed.
Data for reporting is refreshed on a nightly basis. Any curation work performed during the day won't appear until early the next morning.
If you are reporting on the publication output of one or more people, you can do it in three ways:
- One by one
  - Click the "Create Reports" tab.
  - In the "Author" filter, limit by name or NetID.
  - Click the "Search" button.
  - Check the corresponding box next to the user's name.
- For people that satisfy certain criteria
  - Click the "Create Reports" tab.
  - In the "Authors" filter, limit by name, organizational units, institution, person type, and/or author position.
  - Click the "Search" button.
- For a known group of NetIDs (see animation below)
  - Click on the "Find People" tab.
  - Paste in a list of NetIDs. (This has been tested to reliably work for up to 200 NetIDs.)
  - Click the "Search" button.
  - Click the "Create Reports" button above the table of results.
  - You will be directed to the "Create Reports" page with all the authors with those NetIDs checked.
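If you have more NetIDs than the tested limit of 200, one approach is to clean the list and split it into batches before pasting. The sketch below is a standalone helper, not part of Publication Manager; the function names are hypothetical, and the delimiters it accepts (whitespace, commas, semicolons) are an assumption about how such lists are typically formatted.

```typescript
// Limit stated in this README as reliably tested.
const MAX_IDS_PER_SEARCH = 200;

// Split pasted text into unique identifiers, preserving first-seen order.
function parseIdentifiers(pasted: string): string[] {
  const ids = pasted
    .split(/[\s,;]+/)
    .map((id) => id.trim())
    .filter((id) => id.length > 0);
  return Array.from(new Set(ids));
}

// Break a list into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Convenience wrapper: pasted text in, search-sized batches out.
function toSearchBatches(pasted: string): string[][] {
  return chunk(parseIdentifiers(pasted), MAX_IDS_PER_SEARCH);
}
```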
The Filters section of the Publication Manager allows users to narrow down the article results displayed. The following filters are available for people:
- Author - Filter by name or NetID.
- Organization - Refers to a given person's primary organizational unit. Publication Manager uses parentheses to indicate that a person is a member of both a unit and a sub-unit. For example, a search for "Medicine" will also return results for someone whose primary organizational unit is "Medicine (Cardiology)."
- Institution - Refers to a person's primary institutional affiliation (e.g., `Cornell University`).
- Person Type(s) - Refers to an individual's designation (e.g., `academic-faculty`, `student-phd`).
- Author Position - Indicates whether any of the selected people were first and/or last author on a given publication. In some cases, certain authors may be co-first or co-last. These cases are not tracked automatically but can be added to an override table called `analysis_override_author_position`. Such authorships will get credit for being first or last author, both here and in the bibliometric report described below.
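The interaction between automatic author positions and the override table can be sketched as follows. This is an assumption-laden illustration: the `Authorship` and `Override` shapes are hypothetical and do not reflect the actual columns of `analysis_override_author_position`.

```typescript
type Position = "first" | "last";

// Hypothetical authorship record: rank is the 1-based position in the byline.
interface Authorship {
  pmid: number;
  personIdentifier: string;
  rank: number;
  totalAuthors: number;
}

// Hypothetical override row granting co-first or co-last credit.
interface Override {
  pmid: number;
  personIdentifier: string;
  position: Position;
}

// Combine the position implied by byline rank with any manual overrides.
function authorPositions(a: Authorship, overrides: Override[]): Set<Position> {
  const positions = new Set<Position>();
  if (a.rank === 1) positions.add("first");
  if (a.rank === a.totalAuthors) positions.add("last");
  for (const o of overrides) {
    if (o.pmid === a.pmid && o.personIdentifier === a.personIdentifier) {
      positions.add(o.position);
    }
  }
  return positions;
}
```

In this sketch a second-listed author earns first-author credit only when an override row exists for that PMID and person.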
The following filters are available for articles:
- Date - Refers to the date an article was added to PubMed, which can differ from the publication date by several months. By default, the last 30 days are selected.
- Type - Refers to the type of article, such as Case Report, Editorial, Review, etc. Articles of type "Academic Article" generally describe original research.
- Journal - Refers to the verbose journal name (e.g., "Annual Review of Cell Biology").
- Journal Rank - Refers to the Scimago journal ranking, which is not the same as the journal impact factor but correlates highly with that metric. Journal Impact Factor is not published on the website due to copyright reasons. However, it is available when a user downloads a CSV file.
- Export to CSV
  - Option #1: Export authorships. (See below.) An authorship is a case where a NetID has been assigned by a human to an author on a given publication. One publication may have up to dozens of known authorships.
  - Option #2: Export articles. An article-level report.
- Export to RTF
  - RTF is a Word-compatible document. In cases where one or more authors have been selected, those names are bolded.
Here's how to generate a narrative bibliometric summary (see sample) for a full-time faculty member at Weill Cornell Medicine. The summary includes h-index, h5-index, NIH-provided article-level bibliometrics, a ranking of an individual's most impactful publications, and a summary statement.
- Go to "Create reports"
- Search for an author (e.g., "Lyden, David")
- In the results, click on the author's name.
- A modal window appears.
- Click on the "Generate bibliometric report" button.
- This will download a bibliometric summary (see below) which you can open up in Word.
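For readers unfamiliar with the metrics in the summary, here is a minimal sketch of the h-index: the largest number h such that h of a person's articles each have at least h citations. The h5 variant below assumes the common definition (h-index restricted to articles from the five complete calendar years before the current one); how Publication Manager defines its window is not stated here, so treat that function as an assumption.

```typescript
// h-index: largest h such that h articles have at least h citations each.
function hIndex(citations: number[]): number {
  const sorted = [...citations].sort((a, b) => b - a); // descending
  let h = 0;
  while (h < sorted.length && sorted[h] >= h + 1) {
    h++;
  }
  return h;
}

// Hypothetical article shape used only for this example.
interface Article {
  year: number;
  citations: number;
}

// Assumed h5 definition: h-index over the five complete years before currentYear.
function h5Index(articles: Article[], currentYear: number): number {
  const recent = articles.filter(
    (a) => a.year >= currentYear - 5 && a.year < currentYear
  );
  return hIndex(recent.map((a) => a.citations));
}
```

For example, citation counts of 10, 8, 5, 4, and 3 give an h-index of 4: four articles have at least four citations each, but there are not five articles with at least five.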
Users assigned the "curator_self" role can opt in to email notifications regarding pending and newly accepted publications. This feature ensures that curators are promptly informed about updates that require their attention.
ReCiter intelligently suggests ORCID identifiers for individuals based on their record of accepted publications and inferred author positions. Users with the "curator_self" role have the flexibility to either accept these suggested ORCID values or manually enter an ORCID identifier of their choice. This allows for accurate association of publications to the correct individual.
The Publication Manager's configuration settings can be accessed and modified by superusers through the user interface by visiting `Configuration`. All changes made in the web interface will automatically update for all users.
The available settings are:
- Labels for terms - The value to be displayed in the application for all users (e.g. NetID vs. CWID vs. UserID...)
- Help text for users - The contents of an "on hover" event that provides in-page documentation to users
- Inclusion of attributes - The decision to include attributes such as citation count in the output of an article CSV, authorship CSV, on the web page itself, or as a sortable attribute
- Order of output attributes - Allows admins to decide the order in which attributes are displayed, including in the CSV outputs, the web interface, and sort function
- Maximum records output - The maximum number of records that can be output to the CSV files
- Headshot - The full URL for a third party headshot API
- Email notifications - Ability to email out notifications to scholars when they have newly accepted or suggested publications.
- Automatic role assignment - Users who log in can automatically be assigned a reporter_all or curator_all role.
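To illustrate how the inclusion, ordering, and maximum-records settings could shape a CSV export, here is a minimal sketch. The `AttributeSetting` shape and the `toCsv` function are hypothetical and are not the application's configuration schema.

```typescript
// Hypothetical per-attribute setting: whether to output it and in what order.
interface AttributeSetting {
  name: string;
  include: boolean;
  order: number;
}

type ExportRecord = Record<string, string | number>;

// Quote a CSV field when it contains a comma, double quote, or newline.
function escapeCsv(value: string | number | undefined): string {
  const text = String(value ?? "");
  return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Build CSV text using only included attributes, in configured order,
// limited to maxRecords rows.
function toCsv(
  records: ExportRecord[],
  attributes: AttributeSetting[],
  maxRecords: number
): string {
  const columns = attributes
    .filter((a) => a.include)
    .sort((a, b) => a.order - b.order)
    .map((a) => a.name);
  const rows = records
    .slice(0, maxRecords)
    .map((r) => columns.map((c) => escapeCsv(r[c])).join(","));
  return [columns.join(","), ...rows].join("\n");
}
```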
Publication Manager has been funded by:
- Lyrasis through its Catalyst fund
- National Library of Medicine, National Institutes of Health under a cooperative agreement with Region 7
The ReCiter suite of applications has been funded by the following:
- The National Institutes of Health National Center for Advancing Translational Sciences through grant number UL1TR002384
Please submit any questions to Paul Albert or publications@med.cornell.edu. You may expect a response within one to two business days.
We use GitHub issues to track bugs and feature requests. If you find a bug, please feel free to open an issue.
Contributions welcome!