Noytext

A web-based platform for annotating short-text documents to be used in applied text-mining based research.

Key Features

  • Fully customizable UI - Adapt Noytext to your needs
    • Show your own project description HTML page
    • Introduce your team with your own HTML file
    • Configure your own help page
    • Personalize your navigation bar as you wish
    • Choose the overall appearance of the web-app using shinythemes
  • Connect the web-app to your own Mongo database
  • Inter-annotator agreement - Define the number of times a text should be annotated by different users
  • Get information about your annotators
  • Emoji support in text visualization 👌
  • Cross platform

Installation

  • To clone this app, you'll need Git
  • To use this app, you'll need both R and MongoDB installed on your machine
  • If you are going to use it in a local environment, I recommend using RStudio
  • If you want to allow other people to use the app, you should install Shiny Server on your own server
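
Before continuing, it can help to confirm that the prerequisites above are available on the PATH. The following is a minimal sketch and not part of Noytext: the function name `check_tools` is hypothetical, and the command names `git`, `R` and `mongod` assume default installations.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Noytext): report which of the
# given commands are missing from the PATH.
check_tools() {
  local missing=0
  local tool
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Assumed command names for Git, R and the MongoDB server.
check_tools git R mongod || echo "Install the missing tools before continuing."
```

The function returns a non-zero status when at least one tool is missing, so it can also be used as a guard at the top of an installation script.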

The following steps run the app on a cloud server running Ubuntu 16.04:

1. Server
  1. The first thing you should do is add a non-root user.
    sudo adduser yourname
    sudo gpasswd -a yourname sudo
  2. Switch to "yourname"
    su - yourname
2. Install R
  1. Add the R xenial repository to our sources.list:
    sudo sh -c 'echo "deb http://cran.rstudio.com/bin/linux/ubuntu xenial/" >> /etc/apt/sources.list'
  2. Add the public keys:
    gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9
    gpg -a --export E084DAB9 | sudo apt-key add -
  3. Install R
    sudo apt-get update
    sudo apt-get -y install r-base
  4. Check that R is working (use the command quit() to exit)
    R
  5. Install the system dependencies needed to build R libraries
    sudo apt-get -y install libcurl4-gnutls-dev libxml2-dev libssl-dev libsasl2-dev
  6. Install devtools
    sudo su - -c "R -e \"install.packages('devtools', repos='http://cran.rstudio.com/')\""
3. Install Shiny Server
  1. Install some dependencies
    sudo apt-get -y install gdebi-core
  2. Install packages you will need
    sudo su - -c "R -e \"install.packages('shiny', repos='http://cran.rstudio.com/')\""
    sudo su - -c "R -e \"install.packages('rmarkdown', repos='http://cran.rstudio.com/')\""
    sudo su - -c "R -e \"install.packages('packrat', repos='http://cran.rstudio.com/')\""
  3. Get Shiny installation files
    wget https://download3.rstudio.org/ubuntu-14.04/x86_64/shiny-server-1.5.9.923-amd64.deb
  4. Install Shiny
    sudo gdebi shiny-server-1.5.9.923-amd64.deb
  5. Check that shiny server is working on port 3838: http://YOUR_IP:3838
4. Install Git
  sudo apt-get update
  sudo apt-get install git
5. Install MongoDB
  1. Import the public key used by the package management system for MongoDB
    sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 9DA31620334BD75D9DCB49F368818C72E52529D4
  2. Create a list file for MongoDB for Ubuntu 16.04
    echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu xenial/mongodb-org/4.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.0.list
  3. Install MongoDB
    sudo apt-get update
    sudo apt-get install mongodb-org
  4. Start MongoDB as an Ubuntu service
    sudo service mongod start
6. Install App
  1. Clone the repository from GitHub
    git clone https://github.com/luisgasco/noytext
  2. Move noytext to srv folder (the standard path for shiny apps)
    sudo mv noytext /srv/shiny-server
  3. Go to that path
    cd /srv/shiny-server/noytext
  4. Start R as superuser
    sudo R
  5. Enter these commands in R
    # Activate packrat (library to manage R libraries)
    packrat::on() 
    # Install libraries on the noytext private library
    packrat::restore(overwrite.dirty = TRUE)
    # Init packrat:
    packrat::init()
7. Database creation and configuration

Before using the application, you have to create a MongoDB database with two collections: one for your texts and annotations, and one for your annotators' data. I recommend using a new database for this purpose. The following steps show the process:

  1. Open the MongoDB shell
    mongo
  2. Create the database "db_name". This is just an example name; you can use any name you want.
    use db_name
  3. Create the "text_collection" and "user_collection" collections, which will contain your texts and annotations and your annotators' data, respectively.
    db.createCollection("text_collection")
    db.createCollection("user_collection")
    // exit
    quit()
  4. Import your texts. As an example, we import the file in the "testing_data" folder of the app. Run these commands in the Ubuntu shell, not in the MongoDB shell. Note that the column containing the texts must be named "text"; otherwise the application will crash.
    cd /srv/shiny-server/noytext/testing_data/
    mongoimport --db db_name --collection text_collection --type CSV --headerline --file "test_text.csv"

The general command to import a CSV file is:

mongoimport --db db_name --collection "your_text_collection_name" --type CSV --headerline --file "path_to_your_csv_file"

Again, the column containing the texts must be named "text"; otherwise the application will not work.
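
Because a missing "text" column only shows up once the app is running, it may be worth checking the CSV header before importing. This is a minimal sketch, not part of Noytext: the function name `has_text_column` is hypothetical, and it assumes a comma-separated header whose column names contain no commas.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the first line of a CSV file
# contains a column named exactly "text".
has_text_column() {
  # Strip carriage returns and double quotes, put one column name per
  # line, then look for a whole-line match.
  head -n 1 "$1" | tr -d '\r"' | tr ',' '\n' | grep -qx 'text'
}

# Demo on a throwaway file; replace it with the path to your own CSV.
demo_csv="$(mktemp)"
printf 'id,text\n1,"A very noisy street"\n' > "$demo_csv"
if has_text_column "$demo_csv"; then
  echo "ok: 'text' column found, safe to run mongoimport"
else
  echo "error: no column named 'text'" >&2
fi
rm -f "$demo_csv"
```

The check matches the whole column name, so a header such as "context" or "texts" is rejected.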

The following option is not fully debugged and may fail:

You can use the file noytext_install.sh, located in /noytext/testing_data, to install the app on your server. This script automatically replicates steps 2 to 6:

    wget https://raw.githubusercontent.com/luisgasco/noytext/master/testing_data/noytext_install.sh
    chmod 777 noytext_install.sh
    sudo ./noytext_install.sh

If you have problems using this file, they may be caused by a failed R installation. Try installing R manually using the commands in step 2, and then re-run sudo ./noytext_install.sh . Pay attention to the dialogs that appear while the app is installing.

Configuration

To configure the graphical interface of Noytext, modify the .txt files in the path noytext/config_files/. These files use the ":" symbol as a separator, so you cannot use that symbol in your texts. You should also avoid quotation marks in these files, because they can cause problems when the lines are read.
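
The two rules above can be checked mechanically before starting the app. The following is a minimal sketch, not part of Noytext: the function name `lint_conf` is hypothetical, and treating both single and double quotes as "quotation marks" is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical linter: print each offending line and return non-zero if the
# file has a non-empty line without ":" or a line containing a quotation mark.
lint_conf() {
  local file="$1"
  local rc=0
  local n=0
  local line
  while IFS= read -r line || [ -n "$line" ]; do
    n=$((n + 1))
    [ -z "$line" ] && continue
    case "$line" in
      # Assumption: both ' and " count as quotation marks.
      *\"*|*\'*) echo "$file:$n: contains a quotation mark"; rc=1 ;;
    esac
    case "$line" in
      *:*) ;;
      *) echo "$file:$n: missing the ':' separator"; rc=1 ;;
    esac
  done < "$file"
  return "$rc"
}

# Demo on a throwaway file with one valid and one invalid line.
demo_conf="$(mktemp)"
printf 'Title:Annotation for soundscapes\nLabel:"Help us"\n' > "$demo_conf"
lint_conf "$demo_conf" || echo "fix the lines above before starting the app"
rm -f "$demo_conf"
```

The sketch cannot detect an extra ":" inside a text, because the number of fields differs from line to line.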

File name             Configure...
GeneralUI_conf.txt    ...the elements of the graphical interface
HelpTexts_conf.txt    ...the helpers from the tab "help"
MongoDB_conf.txt      ...your MongoDB connection parameters
Survey_conf.txt       ...your questionnaire

GeneralUI_conf.txt parameters

Noytext currently has 4 tabs (information, help, label and about). You can hide any of them except the one used to annotate texts (label). The first element of each line names the element you are going to modify. This file consists of 6 configuration lines:

1. Title:

It allows you to specify the name of your project/app, e.g.:

Title:Annotation for soundscapes

2. Information:

It allows you to define the name of the introduction tab of your app, whether to show it, and the file name of the HTML page used for it. The file must be placed at the root of the project, e.g.:

a. To hide this tab:

  Information:Introduction to soundscapes:FALSE:intro.html

b. To show this tab:

  Information:Introduction to soundscapes:TRUE:intro.html

3. Help:

It allows you to define the name of the help tab and whether to show it, e.g.:

a. To hide this tab:

Help:Introduction to soundscapes:FALSE

b. To show this tab:

Help:Introduction to soundscapes:TRUE

4. Label:

It allows you to change the title of the label tab, e.g.:

Label:Help us to annotate

5. About:

It allows you to define the name of the about tab of your app, whether to show it, and the file name of the HTML page used for it. The file must be placed at the root of the project, e.g.:

a. To hide this tab:

About:Our team:FALSE:index.html

b. To show this tab:

About:Our team:TRUE:index.html

6. Shinydasboad appearance:

You can use shinythemes constants to change the color and style of the navigation bar. The available values can be found at https://rstudio.github.io/shinythemes/ , e.g.:

Shinydasboard_appearance:sandstone
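
To see how a four-field line such as the Information or About entry splits on the ":" separator, the sketch below reads the fields in the shell. It is an illustration only and not how Noytext itself parses the file; the function name `describe_tab_line` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split one "Information:" or "About:" line into its
# four ":"-separated fields and describe the result.
describe_tab_line() {
  local key name show html
  IFS=':' read -r key name show html <<EOF
$1
EOF
  if [ "$show" = "TRUE" ]; then
    echo "$key tab '$name' is shown using $html"
  else
    echo "$key tab '$name' is hidden"
  fi
}

describe_tab_line "Information:Introduction to soundscapes:TRUE:intro.html"
# prints: Information tab 'Introduction to soundscapes' is shown using intro.html
describe_tab_line "About:Our team:FALSE:index.html"
# prints: About tab 'Our team' is hidden
```

An extra ":" inside the tab name would shift every later field by one position, which is why the separator cannot appear in your texts.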

HelpTexts_conf.txt parameters

This file allows you to change the helper texts shown in the help tab. You can use most HTML tags in the definitions (I did not check all of them). Currently, you can only define the 4 sentences shown in the steps of this tab.

The first element of each line names the element you are going to modify, e.g.:

HELP1:<h3>This is the first helper with a h3 tag</h3>
HELP2:<b> You can use bold tag </b>
HELP3:You can write plain text
HELP4:<h3>This is the last helper, you cannot add more</h3>

MongoDB_conf.txt parameters

Here you can define your connection URL, database port, database name, and collections. If you want to register user data, you need a user collection in addition to the text collection (it was created in the installation guide). By default, the file configures the URL localhost and the port 27017, which is the standard configuration for accessing MongoDB from the app on an Ubuntu instance.

ConnectionURL:localhost
ConnectionPORT:27017
DatabaseNAME:db_name
CollectionTextNAME:text_collection
CollectionUsersNAME:user_collection

You can also set in this file the number of times each text must be annotated by different users:

num_annotations_text:3
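
If you want to reuse these settings outside the app, for example to test the connection with another tool, the values can be read back from the file. The following is a minimal sketch, not part of Noytext: the function name `conf_value` is hypothetical, and the demo uses a throwaway copy of the default values rather than your real MongoDB_conf.txt.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value stored under a key in a
# "key:value" configuration file (first match only).
conf_value() {
  local key="$1"
  local file="$2"
  sed -n "s/^${key}://p" "$file" | head -n 1
}

# Demo with a throwaway copy of the default settings.
demo_conf="$(mktemp)"
cat > "$demo_conf" <<'EOF'
ConnectionURL:localhost
ConnectionPORT:27017
DatabaseNAME:db_name
CollectionTextNAME:text_collection
CollectionUsersNAME:user_collection
num_annotations_text:3
EOF

host="$(conf_value ConnectionURL "$demo_conf")"
port="$(conf_value ConnectionPORT "$demo_conf")"
db="$(conf_value DatabaseNAME "$demo_conf")"
# Build a standard MongoDB connection string from the three values.
echo "mongodb://${host}:${port}/${db}"
# prints: mongodb://localhost:27017/db_name
rm -f "$demo_conf"
```

The same function works for the other keys, such as `num_annotations_text`.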

Survey_conf.txt parameters

  1. This file allows you to define whether to show a survey to annotators the first time they log in, e.g.:

    a. To show the survey and user login:

     SurveyNeeded:TRUE

    b. To hide the survey and user login:

     SurveyNeeded:FALSE

  2. If you decide to show the survey to your users, you have to define the number of questions the survey will have:

     NumberQuestions:10

  3. Define the questions you need: UNDER CONSTRUCTION

Credits

This app uses the following open source programs:

And the following R libraries:

Cite

To cite Noytext, please use the following reference:

About

This tool was developed while Luis Gascó was a PhD student at the Instrumentation and Applied Acoustics Research Group (I2A2 Group) of Universidad Politécnica de Madrid. Part of the code was developed during a research stay at Télécom ParisTech.

Authors

Luis Gascó, César Asensio, Guillermo de Arcas (Universidad Politécnica de Madrid)

Chloé Clavel (Télécom ParisTech)

License

AGPL-3.0

You may also like...

  • openskyr - R library to get data from OpenSky Network API.

luisgasco.es  ·  GitHub @luisgasco  ·  Twitter @luisgasco Facebook Luis Gascó Sánchez page