This repo, https://github.com/EbookFoundation/regluit, will eventually be the place for collaborative development of Unglue.it. Add issues and submit pull requests here. As of September 1, 2019, https://github.com/Gluejar/regluit is still being used for production builds.
The first version of the unglue.it codebase was a services-oriented project named "unglu".
We decided that "unglu" was too complicated, so we started over and named the new project "regluit".
regluit is a Django project that contains four main applications: core, frontend, api, and payment. It can be deployed and configured on as many EC2 instances as are needed to support traffic.
The partitioning between these modules is not as clean as would be ideal. payment is particularly messy because we had to retool it twice, switching from Paypal to Amazon Payments and then to Stripe.
regluit was originally developed on Django 1.3 (python 2.7) and currently runs on Django 1.8.
The recommended method for local development is to create a virtual machine with Vagrant and Virtualbox.
With this method, the only requirements on the host machine are virtualbox and vagrant. Vagrant will use the ansible-local provisioner, so installing python and ansible on the host machine is not necessary.
Instructions for Ubuntu 16:
- Install virtualbox:
sudo apt-get install virtualbox
- Install vagrant:
sudo apt-get install vagrant
- Clone the EbookFoundation/regluit repository.
- Navigate to the base directory of the cloned repo (where Vagrantfile is located).
- Create the VM, install dependencies, and start necessary services:
vagrant up
- Note: This step may take up to 15 minutes to complete.
- Once the VM has been created, log in to the virtual machine you just created:
vagrant ssh
- If provisioning was successful, you should see a success message upon login.
- If virtualenv doesn't activate upon login, you can do it manually by running:
cd /opt/regluit && source venv/bin/activate
- Within the VM, start the Django development server:
./manage.py runserver 0.0.0.0:8000
- On your host machine, open your web browser of choice and navigate to http://127.0.0.1:8000
Instructions for other platforms (Windows/OSX):
- Steps are essentially the same, except for the installation of Vagrant and Virtualbox. Refer to each package's documentation for specific installation instructions.
NOTE: If running Windows on your host machine, ensure you are running vagrant up from an elevated command prompt, e.g. right click on Command Prompt -> Run As Administrator.
Here are some instructions for setting up regluit for development on an Ubuntu system. If you are on OS X see notes below to install python-setuptools in step 1:
- Ensure MySQL and Redis are installed & running on your system.
- Create a MySQL database and user for unglueit.
sudo apt-get upgrade gcc
sudo apt-get install python-setuptools git python-lxml build-essential libssl-dev libffi-dev python2.7-dev libxml2-dev libxslt-dev libmysqlclient-dev
sudo easy_install virtualenv virtualenvwrapper
git clone git@github.com:Gluejar/regluit.git
cd regluit
mkvirtualenv regluit
pip install -r requirements_versioned.pip
add2virtualenv ..
cp settings/dev.py settings/me.py
mkdir settings/keys/
cp settings/dummy/* settings/keys/
- Edit settings/me.py with proper mysql and redis configurations.
- Edit settings/keys/common.py and settings/keys/host.py with account and key information OR, if you have the ansible vault password, run ansible-playbook create_keys.yml inside the vagrant directory.
echo 'export DJANGO_SETTINGS_MODULE=regluit.settings.me' >> ~/.virtualenvs/regluit/bin/postactivate
deactivate ; workon regluit
django-admin.py migrate --noinput
django-admin.py loaddata core/fixtures/initial_data.json core/fixtures/bookloader.json
- This populates the database with the test data it needs to run properly.
django-admin.py celeryd --loglevel=INFO
- This starts the celery daemon to perform asynchronous tasks like adding related editions, and displays logging information in the foreground.
django-admin.py celerybeat -l INFO
- This starts the celerybeat daemon to handle scheduled tasks.
django-admin.py runserver 0.0.0.0:8000
- You can change the port number from the default value of 8000.
- Make sure a redis server is running.
- Point your browser to http://localhost:8000/
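For the settings/me.py step above, the following is a minimal sketch of what the MySQL and Redis entries might look like. DATABASES and the django.db.backends.mysql engine are standard Django; the user name, password, host values, and the use of BROKER_URL for Redis are assumptions for illustration, so check the names already present in your copy of settings/dev.py before changing anything.

```python
# Sketch of local overrides for settings/me.py. Every value is a placeholder.

# Standard Django database setting; ENGINE is Django's built-in MySQL backend.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "unglueit",       # the database created in the earlier step
        "USER": "unglueit",       # assumption: use whichever user you created
        "PASSWORD": "change-me",  # placeholder, never commit a real password
        "HOST": "127.0.0.1",
        "PORT": "3306",           # MySQL default port
    }
}

# Assumption: Celery reaches Redis through a broker URL. The project may use
# different setting names; settings/dev.py is the authority.
BROKER_URL = "redis://127.0.0.1:6379/0"  # Redis default port, database 0
```

If the values copied from settings/dev.py already point at a local MySQL and Redis, only the credentials should need to change.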
CSS development
- We used Less version 2.8 for CSS. http://incident57.com/less/. We use minified CSS.
- New CSS development is using SCSS. Install libsass and django-compressor.
OBSOLETE: Below are the steps for getting regluit running on EC2 with Apache and mod_wsgi, talking to an Amazon Relational Database Service (RDS) instance. Instructions for setting please are slightly different.
- create an ubuntu ec2 instance (e.g., go to http://alestic.com/ to find various ubuntu images)
sudo aptitude update
sudo aptitude upgrade
sudo aptitude install git-core apache libapache2-mod-wsgi mysql-client python-virtualenv python-mysqldb redis-server python-lxml postfix python-dev libmysqlclient-dev
sudo mkdir /opt/regluit
sudo chown ubuntu:ubuntu /opt/regluit
cd /opt
git config --global user.name "Raymond Yee"
git config --global user.email "rdhyee@gluejar.com"
ssh-keygen
- add ~/.ssh/id_rsa.pub as a deploy key on github https://github.com/Gluejar/regluit/admin/keys
git clone git@github.com:Gluejar/regluit.git
cd /opt/regluit
- create an Amazon RDS instance
- connect to it, e.g.
mysql -u root -h gluejardb.cboagmr25pjs.us-east-1.rds.amazonaws.com -p
CREATE DATABASE unglueit CHARSET utf8;
GRANT ALL ON unglueit.* TO 'unglueit'@'ip-10-244-250-168.ec2.internal' IDENTIFIED BY 'unglueit' REQUIRE SSL;
- update settings/prod.py with database credentials
virtualenv ENV
source ENV/bin/activate
pip install -r requirements_versioned.pip
echo "/opt/" > ENV/lib/python2.7/site-packages/regluit.pth
django-admin.py syncdb --migrate --settings regluit.settings.prod
sudo mkdir /var/www/static
sudo chown ubuntu:ubuntu /var/www/static
django-admin.py collectstatic --settings regluit.settings.prod
sudo ln -s /opt/regluit/deploy/regluit.conf /etc/apache2/sites-available/regluit
sudo a2ensite regluit
sudo a2enmod ssl rewrite
cd /home/ubuntu
- copy SSL server key to /etc/ssl/private/server.key
- copy SSL certificate to /etc/ssl/certs/server.crt
sudo /etc/init.d/apache2 restart
sudo adduser --no-create-home celery --disabled-password --disabled-login
- (just enter return for all?)
sudo cp deploy/celeryd /etc/init.d/celeryd
sudo chmod 755 /etc/init.d/celeryd
sudo cp deploy/celeryd.conf /etc/default/celeryd
sudo mkdir /var/log/celery
sudo mkdir /var/run/celery
sudo chown celery:celery /var/log/celery /var/run/celery
sudo /etc/init.d/celeryd start
sudo cp deploy/celerybeat /etc/init.d/celerybeat
sudo chmod 755 /etc/init.d/celerybeat
sudo cp deploy/celerybeat.conf /etc/default/celerybeat
sudo mkdir /var/log/celerybeat
sudo chown celery:celery /var/log/celerybeat
sudo /etc/init.d/celerybeat start
mkdir /var/www/static/media/
sudo chown ubuntu:www-data /var/www/static/media/
- Study the latest changes in the master branch, keeping in mind especially how it differs from what's in production.
- Update the production branch accordingly. If everything in master is ready to be moved into production, you can just merge master into production. Otherwise, you can grab specific parts. (How to do so is something that should probably be described in greater detail.)
- Log in to unglue.it and run
/opt/regluit/deploy/update-prod
To run regluit on OS X you should have XCode installed
Install virtualenvwrapper according to the process at http://blog.praveengollakota.com/47430655:
sudo easy_install pip
sudo pip install virtualenv
pip install virtualenvwrapper
Edit or create .bashrc in ~ to enable virtualenvwrapper commands:
- mkdir ~/.virtualenvs
- Edit .bashrc to include the following lines:
export WORKON_HOME=$HOME/.virtualenvs
source your_path_to_virtualenvwrapper.sh_here
In the above web site, the path to virtualenvwrapper.sh was /Library/Frameworks/Python.framework/Versions/2.7/bin/virtualenvwrapper.sh. In Snow Leopard, this may be /usr/local/bin/virtualenvwrapper.sh.
Configure Terminal to automatically notice this at startup:
Terminal -> Preferences -> Settings -> Shell
Click "run command"; add source ~/.bashrc
If you get 'EnvironmentError: mysql_config not found', edit ~/.virtualenvs/regluit/build/MySQL-python/setup_posix.py and change the line
mysql_config.path = "mysql_config"
to the following, using a path that exists on your system:
mysql_config.path = "/usr/local/mysql-5.5.20-osx10.6-x86_64/bin/mysql_config"
You may need to set utf8 in /etc/my.cnf:
collation-server = utf8_unicode_ci
init-connect='SET NAMES utf8'
character-set-server = utf8
Download the selenium server: http://selenium.googlecode.com/files/selenium-server-standalone-2.5.0.jar
Start the selenium server: 'java -jar selenium-server-standalone-2.5.0.jar'
- Get the MARCXML record for the print edition from the Library of Congress.
  - Find the book in catalog.loc.gov
  - Click on the permalink in its record (will look something like lccn.loc.gov/2009009516)
  - Download MARCXML
- At /marc/ungluify/, enter the unglued edition in the Edition field, upload the file, and choose a license.
- The XML record will be automatically...
  - converted to suitable MARCXML and .mrc records, with both direct and via-unglue.it download links
  - written to S3
  - added to a new instance of MARCRecord
  - provided to ungluers at /marc/
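Before uploading a downloaded file at /marc/ungluify/, it can help to confirm that it really is a MARCXML record for the intended book. The sketch below is a standalone check and not part of regluit; the function name summarize_marcxml is hypothetical. It relies only on the standard MARC 21 slim XML namespace and on two standard MARC fields: 010 $a (LCCN) and 245 $a (title).

```python
import xml.etree.ElementTree as ET

# Namespace used by Library of Congress MARCXML downloads.
MARC_NS = "{http://www.loc.gov/MARC21/slim}"


def summarize_marcxml(xml_text):
    """Return {"010": lccn, "245": title} for the first record found.

    Hypothetical helper for a manual sanity check; a key is omitted
    when the corresponding field is missing from the record.
    """
    root = ET.fromstring(xml_text)
    # The record may be the root element or wrapped in a <collection>.
    if root.tag == MARC_NS + "record":
        record = root
    else:
        record = root.find(".//" + MARC_NS + "record")
    if record is None:
        raise ValueError("no MARCXML <record> element found")

    summary = {}
    for field in record.findall(MARC_NS + "datafield"):
        tag = field.get("tag")
        if tag in ("010", "245") and tag not in summary:
            sub = field.find(MARC_NS + "subfield[@code='a']")
            if sub is not None and sub.text:
                summary[tag] = sub.text.strip()
    return summary
```

Comparing the returned LCCN with the permalink you followed (for example 2009009516) is a quick way to catch a wrong download.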
- Use /admin to create a new MARC record instance
- Upload the MARC records to s3 (or wherever)
- Add the URLs of the .xml and/or .mrc record(s) to the appropriate field(s)
- Select the relevant edition
- Select an appropriate marc_format:
  - use DIRECT if it links directly to the ebook file
  - use UNGLUE if it links to the unglue.it download page
  - if you have records with both DIRECT and UNGLUE links, put them in two separate MARCRecord instances, as marc_format can only take one value
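The marc_format rule above can be sketched as a small grouping step. This is an illustration only and not code from regluit: the function names are hypothetical, and treating any link whose host is unglue.it as an UNGLUE link is an assumed heuristic for "links to the unglue.it download page".

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2.7


def marc_format_for(link):
    """Guess the marc_format value for one download link (assumed heuristic)."""
    host = urlparse(link).hostname or ""
    if host == "unglue.it" or host.endswith(".unglue.it"):
        return "UNGLUE"  # goes through the unglue.it download page
    return "DIRECT"      # points straight at the ebook file


def group_links_by_format(links):
    """Group links by marc_format; each key needs its own MARCRecord instance."""
    grouped = {}
    for link in links:
        grouped.setdefault(marc_format_for(link), []).append(link)
    return grouped
```

The number of keys in the result is the number of MARCRecord instances to create in /admin.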
ungluify_record.py should only be used to modify records of print editions of unglued ebooks. It will not produce appropriate results for CC/PD ebooks.
- Get a contract cataloger to produce quality records (.xml and .mrc formats)
  - we are using ung[x] as the format for our accession numbers, where [x] is the id of the MARCRecord instance, plus leading zeroes
- Upload those records to s3 (or wherever)
- Create a MARCRecord instance in /admin
- Add the URLs of the .xml and .mrc records to the appropriate fields
- Select the relevant edition
- Select an appropriate marc_format:
  - use DIRECT if it links directly to the ebook file
  - use UNGLUE if it links to the unglue.it download page
  - if you have records with both DIRECT and UNGLUE links, put them in two separate MARCRecord instances, as marc_format can only take one value
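The ung[x] accession number convention mentioned above can be sketched as follows. The text does not say how many digits the padded id has, so the width parameter and its default of 8 are assumptions, as is the function name; adjust them to match the records already in use.

```python
def accession_number(record_id, width=8):
    """Format a MARCRecord id as an ung[x] accession number.

    width is an assumed padding length; the source only says
    "plus leading zeroes" without giving the digit count.
    """
    if record_id < 0:
        raise ValueError("record_id must be non-negative")
    return "ung{:0{width}d}".format(record_id, width=width)
```

For example, accession_number(42) gives "ung00000042" with the assumed width of 8.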