Version:
1.0.1
A powerful CLI scraper for AniDB, built with Node.js and TypeScript, designed to extract references and detailed information about anime. This project is designed to be scalable, configurable, and easy to integrate.
✅ Scraping data from AniDB, including:
- Anime details
- Anime references
- Assets (images, tags, etc.)
✅ Support for decorators and data validation.
✅ Structured logging with Winston.
✅ Environment configuration via.env
file.
✅ SQLite database with Sequelize and TypeScript.
✅ Development tools like ESLint, Typedoc, and TypeScript.
Make sure you have the following installed:
- Node.js:
>= 20.18.0
- Yarn:
>= 1.x
-
Clone the repository:
git clone https://github.com/5h1ngy/cli-node-scraper-anidb.git cd cli-node-scraper-anidb
-
Install dependencies:
yarn install
-
Configure environment variables: Create a
.env
file in the root directory:NODE_ENV=development STORAGE_FILE=db_dump.db PROGRESS_FILE=progress.json LOGGING_DB=true LOG_DEFAULT_LEVEL=info
You can install the scraper globally using npm
:
-
Install globally:
npm install -g .
-
Run the scraper:
anime-scraper
Run the scraper in development mode:
yarn dev
Build and run the scraper in production mode:
yarn build
yarn start
After global installation, you can use the anime-scraper
command:
anime-scraper
By default, the scraper will:
- Connect to the AniDB database.
- Retrieve anime references and details.
- Save the data in a local SQLite database.
You can customize the behavior using .env
configuration.
src/
├── config/ # Database configuration and environment variables
├── handlers/ # Error handling and helper utilities
├── models/ # Sequelize models for database entities
├── operations/ # Scraping and data collection operations
├── services/ # Services for downloading and parsing data
├── utils/ # Utility functions for formatting, extraction, and validation
├── index.ts # Main entry point
Logging is managed via Winston, with support for daily file rotation. Logs are saved in the logs/
directory.
[2024-11-19 12:34:56] info: Scraper starting...
[2024-11-19 12:34:57] verbose: Database connection established.
[2024-11-19 12:35:00] info: Processing anime ID 12345.
Documentation is automatically generated using Typedoc. You can generate it by running:
yarn build
The documentation will be generated in the docs/
directory.
The project uses SQLite via Sequelize for data management.
- Database Path: Configurable via the
STORAGE_FILE
environment variable. - Synchronization: Automatically synchronized at startup.
- Node.js:
>= 20.18.0
- Yarn:
>= 4.5.1
- SQLite: Installed on your system.
This project is licensed under the MIT License. See the LICENSE file for full details.
This project uses data from AniDB (https://anidb.net), which is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
- You may not use the data for commercial purposes.
- Any derivative works created using AniDB data must also be distributed under the same license.
For full details, visit the official license page: https://creativecommons.org/licenses/by-nc-sa/4.0/.
AniDB is the original source of the data. For more information about AniDB, visit https://anidb.net.
Project created by 5h1ngy.
This project is distributed under the MIT license. See the LICENSE file for more details.