VISEE is a system that combines full-text search with visual (image-based) search. It focuses on Vietnamese e-commerce products collected from Tiki, Lazada, Shopee, and Sendo. VISEE is fully dockerized.
- docker, docker-compose, nvidia-docker
To run all containers and services:
./dev.sh up
Stop all services:
./dev.sh down
Alternatively, you can use the docker-compose command directly. While a service is running, its code is mounted directly from the host machine into the docker container, so you only need to edit your code and restart the container to see your changes.
The following environment variables can be used to configure VISEE. All variables are defined in .env.
Variable | Description | Default value |
---|---|---|
API_KEY | Authorization key for REST API | h$+wt&%3BtH*6rA^KfPzMKDm**GdH_wQaQebd&X9!h=nNVjrt+pn8GNB5%-_ug-U |
API_HOST | REST API host binding (docker internal network) | 0.0.0.0 |
API_PORT | REST API port binding (docker internal network) | 7070 |
KAFKA_HOSTS | Kafka hosts | [visee_kafka:9092] |
KAFKA_USER | Kafka user | None |
KAFKA_PASSWORD | Kafka password | None |
KAFKA_NUM_PARTITION | Kafka number of partitions | 10 |
KAFKA_LINK_TOPIC | Kafka topic for the link scraper | Link item |
KAFKA_CONSUMER_GROUP | Kafka consumer group | default |
REDIS_HOST | Redis host (docker internal network) | visee_redis |
REDIS_PASSWORD | Redis password | None |
REDIS_CATEGORIES_DB | Redis database for website categories | 0 |
REDIS_LINK2SCRAPE_DB | Redis database for links to scrape | 1 |
REDIS_DB_IDX_FIRST | Redis first database for DualRedisConnector | 2 |
REDIS_DB_IDX_SECOND | Redis second database for DualRedisConnector | 3 |
MILVUS_HOST | Milvus host (docker internal network) | visee_milvus |
MILVUS_PORT | Milvus port | 19530 |
MILVUS_TABLE_NAME | Milvus table name | visee |
ELASTIC_HOSTS | Elasticsearch hosts (docker internal network) | [visee_elasticsearch] |
ELASTIC_PORT | Elasticsearch port | 9200 |
ELASTIC_USER | Elasticsearch username | elastic |
ELASTIC_PASSWORD | Elasticsearch password | changeme |
ELASTIC_INDEX | Elasticsearch index | visee |
EFFNET_WEIGHT | EfficientNet weights path (in container) | /visee/static/eff_b7.pth |
CHROME_DRIVER_PATH | Path to ChromeDriver (in container) | /visee/static/chromedriver |
IMAGE_SIZE | Downloaded image size | 1000 |
DOWNLOAD_IMAGE | Whether to download images | True |
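
As an illustration of how a service might consume the variables listed above, here is a minimal Python sketch that reads a few of them from the environment and falls back to the documented defaults. It is not taken from the VISEE code base: the `Settings` class, `load_settings`, and the `parse_*` helpers are hypothetical names, and the assumption that list values are written in the form `[host1, host2]` is inferred only from the default values shown in the table.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


def parse_list(raw: str) -> List[str]:
    """Parse "[a, b]" (assumed format) or "a, b" into a list of strings."""
    inner = raw.strip().strip("[]")
    return [item.strip() for item in inner.split(",") if item.strip()]


def parse_bool(raw: str) -> bool:
    """Interpret common truthy spellings ("True", "1", "yes", "on") as True."""
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def parse_optional(raw: str) -> Optional[str]:
    """Treat an empty value or the literal "None" as 'not set'."""
    return None if raw.strip() in {"", "None"} else raw


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; not part of VISEE itself."""
    api_host: str
    api_port: int
    kafka_hosts: List[str]
    redis_host: str
    redis_password: Optional[str]
    milvus_host: str
    milvus_port: int
    elastic_hosts: List[str]
    elastic_port: int
    download_image: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from env, using the documented defaults when a key is absent."""
    return Settings(
        api_host=env.get("API_HOST", "0.0.0.0"),
        api_port=int(env.get("API_PORT", "7070")),
        kafka_hosts=parse_list(env.get("KAFKA_HOSTS", "[visee_kafka:9092]")),
        redis_host=env.get("REDIS_HOST", "visee_redis"),
        redis_password=parse_optional(env.get("REDIS_PASSWORD", "None")),
        milvus_host=env.get("MILVUS_HOST", "visee_milvus"),
        milvus_port=int(env.get("MILVUS_PORT", "19530")),
        elastic_hosts=parse_list(env.get("ELASTIC_HOSTS", "[visee_elasticsearch]")),
        elastic_port=int(env.get("ELASTIC_PORT", "9200")),
        download_image=parse_bool(env.get("DOWNLOAD_IMAGE", "True")),
    )
```

Calling `load_settings()` with no arguments reads from the process environment, while passing a plain dictionary (for example `load_settings({"API_PORT": "8080"})`) makes the function easy to exercise in tests without touching the real environment.
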
- Crawler: Selenium, BeautifulSoup, Apache Kafka, Redis.
- Indexer: PyTorch, Apache Kafka, Redis.
- Search Engine: Elasticsearch, Milvus.
- RESTful Services: Flask, Nginx, Gunicorn.
- User Interface: NodeJS, Nginx, HTML + CSS + JS.
- Logging System: ELK+ Stack (Elasticsearch, Logstash, Kibana, Beats).
System Architecture and Technical Stack
Developers: Duy V. Huynh, Hoang N. Truong, Linh Q. Tran