- Edit the configuration in `config/default.json` and the custom environment variable names in `config/custom-environment-variables.json`.
- Application constants can be configured in `./constants.js`.
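
As a rough illustration of how these files are typically consumed, here is a minimal sketch assuming the standard `config` npm package; the key names (`db`, `downloadDir`) are illustrative and may not match the app's actual schema:

```js
// Sketch only: the key names below are illustrative, not this app's real config schema.
const config = require('config');

// Values come from config/default.json unless overridden via the environment
// variables mapped in config/custom-environment-variables.json.
const dbSettings = config.get('db');           // e.g. host, port, user, password, database
const downloadDir = config.get('downloadDir'); // directory where the datasets are stored

console.log(`Datasets directory: ${downloadDir}`);
console.log(`PostgreSQL host: ${dbSettings.host}`);
```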
- Since the data we need to download and process is huge, it is better (and safer) to use two different tools instead of one single script, so that if something goes wrong during processing, the damage is minimised.
- Run `npm run download-data` to download all available datasets.
  - The datasets will be stored in the configured directory.
  - Old data will be replaced.
  - This operation does not affect the database.
- Run `npm run import-data` to import all data from the files downloaded in the previous step (a rough sketch of the import approach is shown after this list).
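
The real import logic lives in the repo's own script; the snippet below is only a hedged sketch of the general approach (streaming a downloaded CSV and inserting records as they arrive), assuming the `csv-parse` and `pg` packages and illustrative table/column names:

```js
// Rough sketch only: library choice, config keys and table/column names are assumptions.
const fs = require('fs');
const { parse } = require('csv-parse');
const { Pool } = require('pg');
const config = require('config');

async function importCsv(filePath) {
  const pool = new Pool(config.get('db')); // illustrative config key
  const parser = fs.createReadStream(filePath).pipe(parse({ columns: true }));

  // Stream record by record so the whole file is never held in memory.
  for await (const record of parser) {
    await pool.query(
      'INSERT INTO addresses (street, city) VALUES ($1, $2)', // illustrative table/columns
      [record.street, record.city]
    );
  }
  await pool.end();
}

importCsv('data/addresses.csv').catch(console.error); // illustrative path
```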
Before starting the application, make sure that PostgreSQL is running and that everything is configured correctly in `config/default.json`.
- Install dependencies: `npm i`
- Run the lint check: `npm run lint`
- Start the app: `npm start`. This will run all tools in the following sequence: `npm run download-data` => `npm run import-data`.
The application will print progress information and the results in the terminal.
- To verify that the data has been imported, you can use the pgAdmin tool and browse the database.
- The total size of all datasets is > 1.5 GB, so the operation will take quite some time to finish, depending on your internet connection.
- `max_old_space_size` has been set to 4096 MB so that such huge data files can be parsed and processed without any issues. The app cleans up the memory right after using the data to prevent memory/heap leaks. (A quick way to verify the heap limit is shown after this list.)
- The dataset for `FOREIGN ADDRESSES` doesn't have a header row in its CSV file and has a slightly different format (an extra column). The app handles all datasets without any issue.
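
As a rough illustration of how a headerless CSV with an extra column can be parsed, here is a minimal sketch assuming the `csv-parse` package; the column names are illustrative, not the dataset's real schema:

```js
// Sketch only: the column names are placeholders for the dataset's real fields.
const fs = require('fs');
const { parse } = require('csv-parse');

const parser = fs.createReadStream('data/foreign-addresses.csv').pipe( // illustrative path
  parse({
    // No header row in the file, so column names are supplied explicitly,
    // including one extra column compared to the other datasets.
    columns: ['id', 'street', 'city', 'country', 'extraField'],
  })
);

(async () => {
  for await (const record of parser) {
    console.log(record); // each record is an object keyed by the names above
  }
})();
```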
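
To confirm that the increased heap limit is actually in effect for a given Node process, you can print it from Node itself; this is just a convenience check, not part of the app:

```js
// Prints the V8 heap size limit in MB; it should be in the ballpark of 4096 MB
// (plus some overhead) when max_old_space_size is set to 4096.
const v8 = require('v8');
console.log(Math.round(v8.getHeapStatistics().heap_size_limit / 1024 / 1024));
```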