王竣樺 梁致銓 曾繁斌
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
We currently use 10 proxies provided by a free Webshare account.
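As a rough illustration of how a small proxy pool can be rotated between requests, here is a minimal sketch. It is not the crawler's actual code: the `PROXY_URLS` values are placeholder endpoints and `proxy_rotation` is a hypothetical helper name.

```python
from itertools import cycle

# Placeholder endpoints (assumption); substitute the proxies from your own account.
PROXY_URLS = [
    "http://user:pass@proxy1.example.com:8000",
    "http://user:pass@proxy2.example.com:8000",
]


def proxy_rotation(urls):
    """Yield requests-style proxy dicts, cycling through `urls` forever."""
    for url in cycle(urls):
        yield {"http": url, "https": url}


rotation = proxy_rotation(PROXY_URLS)
first = next(rotation)
second = next(rotation)
third = next(rotation)  # wraps around to the first proxy again
```

Each yielded dict has the shape that `requests.get(url, proxies=...)` accepts, so one proxy can be drawn per request.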
Warning
Remember to delete the page-saving part of get_page(), or your computer will fill up with HTML files.
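An alternative to deleting the code is to guard the saving step behind an opt-in argument. The sketch below is not the repository's get_page(); the helper name `maybe_save_page` and its `save_dir` parameter are hypothetical and only show the idea.

```python
from pathlib import Path
from typing import Optional


def maybe_save_page(html: str, name: str, save_dir: Optional[str] = None) -> Optional[Path]:
    """Write `html` to `<save_dir>/<name>.html` only when `save_dir` is given.

    Hypothetical helper: with the default `save_dir=None` nothing is written,
    so a long crawl does not fill the disk with HTML files.
    """
    if save_dir is None:
        return None
    target = Path(save_dir)
    target.mkdir(parents=True, exist_ok=True)
    path = target / f"{name}.html"
    path.write_text(html, encoding="utf-8")
    return path
```

With this shape, saving pages becomes a debugging option that is off by default instead of code that has to be removed by hand.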
cd Patent_Search_Crawler
# crawling
python main.py
# merging files
python merger.py
The complete raw data is available at this link: link
- We use merge_data.sqlite for training.
Tip
merge_data.sqlite should be placed in the SMM folder.
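To check that the file is in place and non-empty, its tables can be listed with Python's standard `sqlite3` module. This is a minimal sketch; the table names inside merge_data.sqlite are not documented here, so the helper (a hypothetical `list_tables`) simply reports whatever it finds.

```python
import sqlite3


def list_tables(db_path):
    """Return {table_name: row_count} for every table in the SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
            )
        ]
        # Quote each table name so unusual names do not break the query.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()
```

For example, `list_tables("SMM/merge_data.sqlite")` should return a non-empty dict once the file is in the SMM folder. Note that `sqlite3.connect` creates an empty database when the path does not exist, so an empty result usually means the file is in the wrong place.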
Explanation of the preprocessed data (in /SMM/EDGPAT/data):
Tip
The datasets below are ready for training; no preprocessing is needed.
- 2-1-level.csv: IPC Level mapping from 1 to 2.
- 3-2-level.csv: IPC Level mapping from 2 to 3.
- 4-3-level.csv: IPC Level mapping from 3 to 4.
- 5-4-level.csv: IPC Level mapping from 4 to 5.
- real-data.json: Lists each company's patents within the current year.
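To make the level mappings above more concrete, the sketch below splits one IPC symbol into the five standard levels of the IPC hierarchy (section, class, subclass, main group, subgroup). It assumes that levels 1 to 5 in the CSV files correspond to these five levels; the actual column layout and encoding of the CSV files is not shown here, and `ipc_levels` is a hypothetical helper.

```python
import re

# Matches symbols such as "H01L 21/02": section, class, subclass, group/subgroup.
IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{1,6})$")


def ipc_levels(code):
    """Split an IPC symbol into its five hierarchy levels, coarse to fine."""
    match = IPC_PATTERN.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a full IPC symbol: {code!r}")
    section, cls, subclass, group, subgroup = match.groups()
    level1 = section                      # section, e.g. "H"
    level2 = level1 + cls                 # class, e.g. "H01"
    level3 = level2 + subclass            # subclass, e.g. "H01L"
    level4 = f"{level3} {int(group)}"     # main group, e.g. "H01L 21"
    level5 = f"{level4}/{subgroup}"       # subgroup, e.g. "H01L 21/02"
    return [level1, level2, level3, level4, level5]


levels = ipc_levels("H01L 21/02")
# Child-to-parent pairs, analogous in spirit to the 2-1 ... 5-4 mapping files.
child_to_parent = {levels[i + 1]: levels[i] for i in range(4)}
```

Here `levels` is `["H", "H01", "H01L", "H01L 21", "H01L 21/02"]`, and `child_to_parent` links each level to the one directly above it.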
Our project focuses on developing a patent prediction model specifically for forecasting Taiwan's future patent trends. Utilizing Event-based Graph techniques, this model analyzes historical patent data to identify emerging trends and patterns.
- Data-Driven: Uses real-world patent data (Taiwan) to identify trends.
- Dynamic: Adapts to changes in technology and innovation.
- Predictive: Forecasts areas likely to see growth in patent filings.
Framework of the proposed model. For simplicity, only the calculations for the patent classification codes and one of the related companies are shown.
We use the code from EDGPAT.
Warning
The Python environment must be Python 3.6!
Required packages:
Just run build_input.ipynb.
We split the data into three parts: training, validation and testing by year.
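A split by year can be sketched as follows. This is only an illustration, not the logic in build_input.ipynb: the record layout, the `year` key, and the cutoff years are all assumptions chosen for the example.

```python
def split_by_year(records, val_year, test_year):
    """Split records chronologically.

    Records before `val_year` go to training, records from `val_year` up to
    (but not including) `test_year` go to validation, and the rest to testing.
    """
    train, val, test = [], [], []
    for record in records:
        year = record["year"]  # hypothetical field name
        if year < val_year:
            train.append(record)
        elif year < test_year:
            val.append(record)
        else:
            test.append(record)
    return train, val, test


# Toy records with made-up years and IPC codes, for illustration only.
sample = [
    {"year": 2018, "ipc": "H01L"},
    {"year": 2019, "ipc": "G06F"},
    {"year": 2020, "ipc": "H04W"},
    {"year": 2021, "ipc": "G06N"},
]
train, val, test = split_by_year(sample, val_year=2020, test_year=2021)
```

Splitting chronologically, rather than at random, keeps future years out of the training set, which matters when the goal is forecasting.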
Run the code:
sh EDGPAT/train.sh
Note
This command writes the training results to EDGPAT/out.txt.
[Figure: Original Paper Results (left) and Our Results (right)]
[Figure: Dropout 0.5 (left) and Dropout 0 (right)]