This directory contains the code of the Wikipedia 1.0 supporting software. More information about the Wikipedia 1.0 project can be found on English Wikipedia.
The wp1
subdirectory includes code for updating the enwp10
database, specifically the ratings
table (but also other
tables). The library code itself isn't directly runnable, but instead
is loaded and run in various docker images that are maintained in the
docker
directory.
requirements.txt
is a list of python dependencies in pip format that
need to be installed in a virtual env in order to run the library code.
Both the web
and workers
docker images use the same requirements,
though Flask and its
dependencies are not utilized by the worker code.
The cron
directory contains wrapper scripts for cron jobs that are
run inside the workers image.
The setup
directory contains a historical record of the database
schema used by the tool for what is referred to in code as the wp10
database. The schema file has been heavily edited, but should still be
usable to re-create the enwp10
database if necessary.
wp1-frontend
contains the code for the Vue-CLI based frontend,
which is encapsulated and served from the frontend
docker image.
See that directory for instructions on how to set up a development
environment for the frontend.
conf.json
is a configuration file that is used by the wp1
library code.
docker-compose.yml
is a file read by the docker-compose
command in order to generate the
graph of required docker images that represent the production environment.
docker-compose-dev.yml
is a similar file which sets up a dev environment,
with Redis and a MariaDB server for the enwp10
database. Use it like so:
docker-compose -f docker-compose-dev.yml up -d
The *.dockerfile
symlinks allow for each docker image in this repository
to be more easily built on Docker Hub. See:
- https://hub.docker.com/repository/docker/openzim/wp1bot-frontend
- https://hub.docker.com/repository/docker/openzim/wp1bot-workers
- https://hub.docker.com/repository/docker/openzim/wp1bot-web
openapi.yml
is a YAML file that describes the API of the web
image
in OpenAPI format. If you visit
the index of the API server you will
get a swagger-ui documentation frontend that utilizes this file. It
is symlinked into the wp1/web
directory.
The wp10_test.*.sql
and wiki_test.*.sql
files are rough
approximations of the schemas of the two databases that the library
interfaces with. They are used for unit testing.
This code targets and is tested on Python 3.9.4.
As of Python 3.3, creating a virtualenv is a single easy command. It is recommended that you run this command in the top level directory:
python3 -m venv venv
To activate your virtualenv, run:
source venv/bin/activate
You should see your prompt change, with a (venv)
prepended to it.
To install the requirements, make sure you are in your virtualenv as explained above, then use the following command:
pip3 install -r requirements.txt
To install the requirements for the frontend server, cd into wp1-frontend
and use:
yarn install
You will also need to have Docker on your system in order to run the development server.
The tests expect a localhost MariaDB or MySQL instance on the default
port, with a user of 'root' and no password. You also need two databases:
enwp10_test and enwikip_test. They can use default settings and be
empty.
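You can create these with the mysql command line client, or, from your virtualenv, with a short pymysql snippet. A minimal sketch, assuming the default root user with no password described above:

import pymysql

# Connect to the local server with the same defaults the tests expect.
conn = pymysql.connect(host='localhost', user='root')
with conn.cursor() as cursor:
  # Create the two empty databases used by the unit tests.
  cursor.execute('CREATE DATABASE IF NOT EXISTS enwp10_test')
  cursor.execute('CREATE DATABASE IF NOT EXISTS enwikip_test')
conn.commit()
conn.close()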
If you have that, and you've already installed the requirements above, you should be able to simply run the following command from this directory to run the tests:
nosetests
If you'd like to use a different MySQL user or non-default password for
the tests, you must edit _setup_wp_one_db (and the corresponding wiki
database setup method) in base_db_test.py.
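The setup methods open their database connections through pymysql, so the change amounts to swapping connection arguments. A rough sketch with purely illustrative values (the exact code lives in base_db_test.py):

import pymysql

# Hypothetical example: connect to the test database as a non-root user.
wp10db = pymysql.connect(
    host='localhost',
    user='my_test_user',     # instead of 'root'
    password='my_password',  # instead of the empty default
    db='enwp10_test')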
The script needs access to the enwiki_p replica database (referred to
in the code as wikidb
), as well as its own toolsdb application database
(referred to in the code as wp10db
). If you are a part of the toolforge
enwp10
project, you can
find the credentials for these on toolforge in the replica.my.cnf file in
the tool's home directory. They need to be formatted in a way that is
consumable by the library and pymysql. Look at credentials.py.example
and create a copy called credentials.py
with the relevant information
filled in. The production version of this code also requires English Wikipedia
API credentials for automatically editing and updating
on-wiki tables.
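As a rough, hypothetical sketch of the "consumable by the library and pymysql" formatting mentioned above (the real keys and layout are defined in credentials.py.example, so defer to that file), each database entry boils down to a dict of keyword arguments that can be passed to pymysql.connect():

import pymysql

# Hypothetical shape only; see credentials.py.example for the real structure.
WIKIDB = {
    'host': 'replica-host-from-toolforge',
    'user': 'user-from-replica.my.cnf',
    'password': 'password-from-replica.my.cnf',
    'db': 'enwiki_p',
}

# The library can then open a connection with something like:
conn = pymysql.connect(**WIKIDB)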
Currently, if your environment is DEVELOPMENT, jobs that utilize the API
to edit Wikipedia are disabled. There is no development wiki that gets edited
at this time.
For development, you will need to have Docker installed as explained above.
There is a Docker setup for a development database. It lives in
docker-compose-dev.yml.
Before you run the docker-compose command below, you must copy the file
wp1/credentials.py.dev.example
to wp1/credentials.py.dev
and fill out the
STORAGE section if you wish to properly materialize builder lists into
backend selections.
After that is done, use the following command to run the dev environment:
docker-compose -f docker-compose-dev.yml up -d
The dev database will need to be migrated in the following circumstances:
- In a clean checkout, the first time you run the docker-compose command above.
- Anytime you remove/recreate the docker image.
- Anytime you or a team member adds a new migration.
To migrate, cd to the db/dev
directory and run the following command:
yoyo apply
The YoYo Migrations application will read the data in db/dev/yoyo.ini
and attempt
to apply any necessary migrations to your database. If there are migrations to apply,
you will be prompted to confirm. If there are none, there will be no output.
More information is available in the YoYo Migrations documentation.
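For reference, a minimal sketch of what a yoyo migration file looks like (the real migrations live alongside db/dev/yoyo.ini; the table change below is purely illustrative):

from yoyo import step

steps = [
    step(
        # Forward migration, run by `yoyo apply`.
        'ALTER TABLE ratings ADD COLUMN r_example VARCHAR(255)',
        # Rollback, run by `yoyo rollback`.
        'ALTER TABLE ratings DROP COLUMN r_example',
    )
]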
Assuming you are in your Python virtualenv (described above) you can start the API server with:
FLASK_DEBUG=1 FLASK_APP=wp1.web.app flask run
The web frontend can be started with the following command in the wp1-frontend
directory:
yarn serve
The DEVELOPMENT section of credentials.py.example is already filled out with the proper values for the servers listed in docker-compose-dev.yml. You should be able to simply copy it to credentials.py.
If you wish to connect to a wiki replica database on toolforge, you will need to fill out your credentials in the WIKIDB section. This is not required for developing the frontend.
The API server has a built-in development overlay, currently used for manual
update endpoints. What this means is that the endpoints defined in
wp1.web.dev.projects
take priority over the production endpoints,
but only if ENV == Environment.DEVELOPMENT in credentials.py. This is to allow
for easier manual and CI testing of the manual update page.
If you wish to test the manual update job with a real Wikipedia replica database and RQ jobs, you will have to disable this overlay. The easiest way would be to change the following line in wp1.web.app:
if ENV == environment.Environment.DEVELOPMENT:
# In development, override some project endpoints, mostly manual
# update, to provide an easier env for developing the frontend.
print('DEVELOPMENT: overlaying dev_projects blueprint. '
'Some endpoints will be replaced with development versions')
app.register_blueprint(dev_projects, url_prefix='/v1/projects')
to something like:
if False:  # False while manually testing
# In development, override some project endpoints, mostly manual
...
- Since Docker Hub no longer auto-builds images, you must build the images yourself.
From the wp1 directory, run the following commands to build and push the images:
git checkout main
git pull origin main
./build_production_images.sh
- Log in to the box that contains the production docker images. It is called wp1.
cd /data/code/wp1/
sudo git pull origin main
- Pull the docker images from docker hub:
docker pull openzim/wp1bot-workers
docker pull openzim/wp1bot-web
docker pull openzim/wp1bot-frontend
- Run docker-compose to bring the production images online.
docker-compose up -d
- Run the production database migrations in the worker container:
docker exec -ti wp1bot-workers yoyo -c /usr/src/app/db/production/yoyo.ini apply
This project is configured to use git pre-commit hooks managed by the
Python program pre-commit. Pre-commit checks let us ensure that the code
is properly formatted with yapf, amongst other things.
If you've installed the requirements for this repository, the pre-commit binary should be available to you. To install the hooks, use:
pre-commit install
Then, when you try to commit a change that would fail pre-commit, you get:
(venv) host:wikimedia_wp1_bot audiodude$ git commit -am 'Test commit'
Trim Trailing Whitespace.................................................Passed
Fix End of Files.........................................................Passed
yapf.....................................................................Failed
hookid: yapf
From there, the pre-commit hook will have modified and thus unstaged some or all of the files you were trying to commit. Look through the changes to make sure they are sane, then re-add them with git add, before trying your commit again.
GPLv2 or later, see LICENSE for more details.