Welcome to our GitHub page! We are four students from Leipzig University.
As part of the seminar "Big Data and Language Technologies 2022", we decided to examine perceptions of the future of AI. To this end, we analyzed statements about the future related to several AI topics. We used the web archive of the Webis Group (https://webis.de/), from which we extracted AI statements with the WARC-DL pipeline (https://github.com/webis-de/WARC-DL).
In the following, we describe our approach and explain how to run our code.
The following chart shows the workflow of our project. First, AI statements are extracted from the WARC archive. Afterwards, the Model Pipeline is executed. Since the topic model initially contains only dummy topics, the output of the Model Pipeline is used for topic selection. After this step, the selected topics are passed to the topic assignment model, and the Model Pipeline is ready to use. Subsequently, we use the output of further executions for analysis and visualization.
All scripts in this step serve as preparation for the Model Pipeline; a combined shell sketch of these steps follows the list below.
- Navigate to the dataset directory: `the-future-tense/stage_2_1_models/future_model/dataset`
- Extract the dataset used to train the future model: `./extract.py`
- Navigate to the model training directory: `the-future-tense/stage_2_1_models/future_model/training/future_model_ft`
- Run the Jupyter notebook: `future_model_ft.ipynb`
- Navigate to the sentiment model directory: `the-future-tense/stage_2_1_models/sentiment_model`
- Run the sentiment model test: `./test_sentiment_model.py`
- Navigate to the topic model directory: `the-future-tense/stage_2_1_models/topic_model`
- Run the topic evaluation notebook: `topic_eval.ipynb`
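Taken together, the preparation steps can be run from a shell roughly as follows. This is a minimal sketch, assuming you start in the directory that contains the `the-future-tense` checkout and have Jupyter with `nbconvert` installed; the notebooks can equally be opened and run interactively.

```bash
# Preparation steps for the Model Pipeline (sketch; see the list above).
cd the-future-tense/stage_2_1_models

# 1. Extract the dataset used to train the future model
(cd future_model/dataset && ./extract.py)

# 2. Fine-tune the future model by executing its notebook headlessly
(cd future_model/training/future_model_ft && \
    jupyter nbconvert --to notebook --execute --inplace future_model_ft.ipynb)

# 3. Test the sentiment model
(cd sentiment_model && ./test_sentiment_model.py)

# 4. Evaluate the topic model
(cd topic_model && \
    jupyter nbconvert --to notebook --execute --inplace topic_eval.ipynb)
```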
The Model Pipeline can now be executed in order to create the final dataset.
- Navigate to the Model Pipeline directory: `the-future-tense/stage_2_2_model_pipeline`
- Execute the Model Pipeline: `sbatch run_main.job`
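Since the pipeline is started with `sbatch`, a Slurm cluster is assumed. A minimal sketch for submitting and monitoring the job (the name of the log file written by the job depends on the settings in `run_main.job`):

```bash
# Submit the Model Pipeline job to Slurm and check its status.
cd the-future-tense/stage_2_2_model_pipeline
sbatch run_main.job     # prints: Submitted batch job <jobid>
squeue -u "$USER"       # is the job still pending or running?
```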
The visualization for the analysis is generated at this stage.
- Navigate to the visualization directory: `the-future-tense/stage_3_visualization`
- Deposit your OpenAI API key in your `.env` file as `OPENAI_API_KEY`
- Execute the Jupyter notebook: `visualize.ipynb`
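A minimal sketch of this stage, assuming the notebook reads `OPENAI_API_KEY` from a `.env` file in this directory; the key value shown is a placeholder.

```bash
# Store the OpenAI API key and execute the visualization notebook.
cd the-future-tense/stage_3_visualization
echo 'OPENAI_API_KEY=<your-key>' >> .env   # placeholder; insert your own key
jupyter nbconvert --to notebook --execute --inplace visualize.ipynb
```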