Some doc improvements and additional simplifications
miha42-github committed Apr 16, 2024
1 parent 7ceb413 commit 32d643d
Showing 3 changed files with 17 additions and 61 deletions.
36 changes: 15 additions & 21 deletions README.md
@@ -1,9 +1,18 @@
# Introduction to the company_dns
To enable a more automated approach to gathering information about companies, `company_dns` was created. This release enables the synthesis of data from the [SEC EDGAR repository](https://www.sec.gov/edgar/searchedgar/companysearch.html) and [Wikipedia](https://wikipedia.org). A [Medium](https://medium.com) article entitled "[A case for API based open company firmographics](https://medium.com/@michaelhay_90395/a-case-for-api-based-open-company-firmographics-145e4baf121b)" discusses the process and motivation behind the creation of this service.

# Introducing V3.0.0
The V3.0.0 release of the `company_dns` is a significant update to the service. The primary changes are:
1. Shift from Flask to Starlette with Uvicorn
2. Automated monthly container builds, from the main branch of the repository, using GitHub Actions
3. Simplification of all aspects of the service including code structure, shift towards simpler Docker, and a more streamlined service control script
4. Vastly improved embedded help with a query console to test queries
We made these changes to make the service easier to improve, maintain, and use.


# Installation & Setup
The following basic steps are provided for the purposes of getting the tool running.
## Get the code
## For developers Get the code
Assuming you have set up access to GitHub, you'll need to clone the repository. Here we assume you're on a Linux box of some kind and will follow the steps below.

1. If you're performing development create a directory that will contain the code: `mkdir ~/dev`
@@ -15,9 +24,6 @@ Before you get started it is important to install all prequisites and then creat

1. Enter the directory with the service bits (assuming you're using ~/dev): `cd ~/dev/company_dns/company_dns`
2. Install all prerequisites: `pip3 install -r ./requirements.txt`
3. Change the `USER_AGENT` setting in `~/dev/company_dns/company_dns/app/pyedgar.conf` to your own user agent definition. If you don't, the SEC downloads will fail.
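
The `USER_AGENT` change in step 3 can also be scripted. The following is a minimal sketch, not part of the repository: `set_user_agent` is a hypothetical helper name, and it assumes the configuration file contains a single line starting with `USER_AGENT=`, as the `pyedgar.conf` in this commit does.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of company_dns): rewrite the USER_AGENT line
# of a pyedgar.conf-style file.
# Usage: set_user_agent <path-to-conf> <new user agent string>
set_user_agent () {
    local conf="$1" agent="$2" tmp
    tmp="$(mktemp)" || return 1
    # Replace the whole USER_AGENT= line. '|' is the sed delimiter, so the
    # agent string must not contain '|', '&', or backslashes.
    if sed "s|^USER_AGENT=.*|USER_AGENT=${agent}|" "$conf" > "$tmp"; then
        # Copy the contents back so the original file keeps its permissions.
        cat "$tmp" > "$conf"
    fi
    rm -f "$tmp"
}
```

For example, `set_user_agent ~/dev/company_dns/company_dns/app/pyedgar.conf "Example Corp admin@example.com"` would apply a placeholder user agent; substitute your own organization name and contact address.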

The utility `dbcontrol.py` will download EDGAR data, process it, and then create a database for the `company_dns`. Note that you do not need to run this utility directly, as the service control script will handle it for you. For more information on the database control utility, please check out its [readme](company_dns/app/README.md).

## Service Control Script
A service control script, `svc_ctl.sh`, is provided to wrap build, run, and log tailing functions as of V2.3.0. Compared to past versions, this script significantly simplifies working with the `company_dns`, removing many manual steps to getting it running. As a result, there is only one step needed to get the service running: `cd ~/dev/company_dns; ./svc_ctl.sh up`. This script will:
@@ -36,15 +42,11 @@ DESCRIPTION:
Control functions to run the company_dns
COMMANDS:
help up down start stop create_db build delete_db foreground tail
help start stop build foreground tail
help - call up this help text
up - bring up the service including building and pulling the docker image
down - bring down the service and remove the docker image
start - start the service using docker-compose
stop - stop the docker service
create_db - create a new database cache for the company_dns
delete_db - delete the database cache for the company_dns
build - build the docker images for the server
foreground - run the server in the foreground to watch for output
tail - tail the logs for a server running in the background
@@ -53,7 +55,6 @@ COMMANDS:
## Verify that the service is working
Regardless of the approach you've taken to run the `company_dns`, checking that it is operating is important. To do so, point a browser at the server running the service. If you're running on localhost then the following link should work: [http://localhost:6868/V2.0/help](http://localhost:6868/V2.0/help). If you're on another server, change the server name to the one you're using. If this is successful you will see the embedded help, which describes the available set of endpoints and provides an example query to the service. A screenshot of the help screen can be found below.
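
The same check can be made from a terminal. The sketch below is an illustration only: `help_url` and `check_service` are hypothetical helper names, and the port `6868` and path `/V2.0/help` are taken from the link above. It assumes `curl` is installed.

```shell
#!/usr/bin/env bash
# Build the help URL for a given host and port (defaults from the README link).
help_url () {
    printf 'http://%s:%s/V2.0/help\n' "${1:-localhost}" "${2:-6868}"
}

# Return 0 if the help endpoint answers with a successful HTTP status,
# non-zero if the request fails or times out after 10 seconds.
check_service () {
    curl --silent --fail --output /dev/null --max-time 10 "$(help_url "$@")"
}
```

Running `check_service && echo "company_dns is up"` tests a local instance; pass a host name (and optionally a port) as arguments to test another server.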

![Screen Shot 2022-10-16 at 8 18 57 PM](https://user-images.githubusercontent.com/10818650/196084425-6fd9d724-1f59-4eed-9548-c553168bf387.png)

## Check out a live system
We're hosting an instance of the `company_dns` on our website for our usage and your exploration. Below are several example queries and access to embedded help to get you a better view of the system.
@@ -71,20 +72,14 @@ We try to keep high level Todos and Improvements in a list contained in a sectio

### Future work/Todos
Here are the things that are likely to be worked but without any strict deadline:
1. ~~Create a simple wrapping script to operationalize service behaviors~~ [see issue #4](https://github.com/miha42-github/company_dns/issues/4)
2. ~~Incrementally refactor the repository and the code~~
3. ~~Enable TLS on nginx or provide instructions to do so~~, [see issue #10](https://github.com/miha42-github/company_dns/issues/10)
4. Determine if it is feasible to talk to the Companies House API for gathering data from the UK
5. Research other pools of public data which can serve to enrich
6. Evaluate if financial data can be added from EDGAR, Wikipedia and Companies House
7. ~~Clean up stale EDGAR URLs~~
8. Provide instructions/details for running on a Pi or Arm based system, see Lagniappe below
9. ~~Update README.md with the appropriate language, etc.~~, [see issue #9](https://github.com/miha42-github/company_dns/issues/9)
10. ~~Add additional URLs for news, stock, patents, etc. to the merged response~~, [see issue #11](https://github.com/miha42-github/company_dns/issues/11)
11. ~~Add ticker information from Wikipedia into the response~~, [see issue #7](https://github.com/miha42-github/company_dns/issues/7)


### The Lagniappe
If you would like to run this on a Raspberry Pi, I'll be adding a couple of configuration files and appropriate instructions later. Until then, I suggest you check out [Matt's](https://www.raspberrypi-spy.co.uk/author/matt/) guide to [getting Nginx, UWSGI and Flask running on a Pi](https://www.raspberrypi-spy.co.uk/2018/12/running-flask-under-nginx-raspberry-pi/). At some point, if someone would like to create a docker image for these elements running on the Pi, that would be great.
Run on a Raspberry Pi: To be reauthored


# License
@@ -93,8 +88,7 @@ Since this code falls under a liberal Apache-V2 license it is provided as is, wi
# Key Dependencies
- [PyEdgar](https://github.com/gaulinmp/pyedgar) - used to interface with the SEC's EDGAR repository
- [SQLite](https://www.sqlite.org/index.html) - helps all utilities and the RESTful service quickly and expressively respond to interactions with the other elements to find appropriate company data
- [Flask](https://www.palletsprojects.com/p/flask/) and associated utilities - used to realize the RESTful service
- [nginx](http://nginx.org) - enables hosting of the RESTful service
- Docker & Docker Compose - Container and server framework
- [Starlette](https://www.starlette.io) - used to create the RESTful service
- [Uvicorn](https://www.uvicorn.org) - used to run the RESTful service
- [GeoPy with ArcGIS](https://github.com/geopy/geopy) - Enables proper address formatting and reporting of lat-long pairs for companies
- [wptools](https://github.com/siznax/wptools/) - provides access to MediaWiki data for company search
2 changes: 1 addition & 1 deletion pyedgar.conf
@@ -56,7 +56,7 @@ INDEX_CACHE_PATH_FORMAT=full_index_{year}_Q{quarter}.gz
KEEP_ALL=True
KEEP_REGEX=
; User Agent for downloading, to keep the SEC happy
USER_AGENT=Mediumroast, Inc. hello@mediumroast.io
USER_AGENT=company_dns hello@mediumroast.io
[Index]
; Index file settings
INDEX_DELIMITER=\t
40 changes: 1 addition & 39 deletions svc_ctl.sh
@@ -82,26 +82,6 @@ function bring_down_server () {
print_footer $FUNC
}

function bring_up_server () {
FUNC="Bring up service"
STEP="bring_up_server"
print_header $FUNC

print_step "Create cache db"
create_db

print_step "Build docker images"
docker-compose build

print_step "Pull docker images"
docker-compose pull

print_step "Bring up ${SERVICE}"
docker-compose up -d

print_footer $FUNC
}

function stop_server () {
FUNC="Stop ${SERVICE}"
STEP="stop_server"
@@ -152,10 +132,6 @@ function tail_backend () {
###################################


function create_db () {
python3 ./makedb.py
}

function print_help () {
clear
echo "NAME:"
@@ -165,14 +141,11 @@ function print_help () {
echo " Control functions to run the ${SERVICE}"
echo ""
echo "COMMANDS:"
echo " help up down start stop create_db build delete_db foreground tail"
echo " help start stop build foreground tail"
echo ""
echo " help - call up this help text"
echo " up - bring up the service including building and pulling the docker image"
echo " down - bring down the service and remove the docker image"
echo " start - start the service using docker-compose "
echo " stop - stop the docker service"
echo " create_db - create a new database cache for the ${SERVICE}"
echo " build - build the docker images for the server"
echo " foreground - run the server in the foreground to watch for output"
echo " tail - tail the logs for a server running in the background"
@@ -190,21 +163,13 @@ function print_help () {
if [ ! $1 ] || [ $1 == "help" ]; then
print_help

elif [ $1 == "up" ]; then
create_db
bring_up_server

elif [ $1 == "down" ]; then
bring_down_server

elif [ $1 == "start" ]; then
start_server

elif [ $1 == "stop" ]; then
stop_server

elif [ $1 == "build" ]; then
create_db
build_server

elif [ $1 == "foreground" ]; then
@@ -213,9 +178,6 @@ elif [ $1 == "foreground" ]; then
elif [ $1 == "tail" ]; then
tail_backend

elif [ $1 == "create_db" ]; then
create_db

fi

exit 0
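
The `if`/`elif` chain above compares an unquoted `$1`, which only works when the argument is a single word. As a possible follow-up, not something this commit does, the dispatch could be written as a `case` statement. The sketch below covers the six commands in the shortened help text; the handler bodies are stand-ins that only echo their name, and `run_foreground` is a hypothetical name because the real foreground function is not shown in this diff.

```shell
#!/usr/bin/env bash
# Stand-in handlers for illustration only; the real functions live in svc_ctl.sh.
print_help ()     { echo "help"; }
start_server ()   { echo "start"; }
stop_server ()    { echo "stop"; }
build_server ()   { echo "build"; }
run_foreground () { echo "foreground"; }   # hypothetical name
tail_backend ()   { echo "tail"; }

# Dispatch on the first argument, defaulting to help when none is given.
dispatch () {
    case "${1:-help}" in
        help)       print_help ;;
        start)      start_server ;;
        stop)       stop_server ;;
        build)      build_server ;;
        foreground) run_foreground ;;
        tail)       tail_backend ;;
        *)          echo "Unknown command: $1" >&2; return 1 ;;
    esac
}
```

In the script itself this would be invoked as `dispatch "$@"`. Unlike the current chain, an unrecognized command reports an error and returns a non-zero status rather than exiting silently with 0.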
