This repository contains supplemental material associated with my research master's thesis. Specifically, the online appendix describes the methods used in the observational study and the experiment. On a conceptual level, we investigate the effect of hiding like counts on Instagram on users' posting frequency (H1), variety of visual content (H2), like behavior (H3), and self-esteem (H4).
The online appendix is split into two parts. The first part, Data Collection & Preparation, describes the seeding strategy (A), consumer account selection and screening (B), scraping process (C), computer vision API usage (D), and image similarity computations (E). The second part, Data Analysis, creates a matched sample without outliers (F & G), fits various difference-in-differences models (H) on this data set, and analyzes the results of the experiment (I).
We build on an observational data set scraped from Instagram (July 2020) and an experimental data set obtained through a questionnaire on Prolific (N=600). Preprocessing of the raw data is documented step by step in the appendix. Definitions and descriptions are outlined here. All data can be accessed through a relational database service. The online appendix refers to several environment variables in order to connect to the database (see instructions below). Credentials can be acquired by contacting one of the authors.
Just like you sign in to Google Drive using your email and password credentials, we need to log in to our relational database using a database URL (`INSTAGRAM_DB_URL`) and database key/password (`INSTAGRAM_DB_KEY`) (available upon request). The idea is that we store these variables on our local machine without hardcoding them into our notebook. Below we briefly describe how to configure these environment variables. Alternatively, watch one of these tutorials (Mac/Linux & Windows).
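As a minimal sketch of what "not hardcoding" looks like in practice, the snippet below reads both values with Python's standard `os.environ`. Only the variable names `INSTAGRAM_DB_URL` and `INSTAGRAM_DB_KEY` come from this README; the helper name `load_db_credentials` is hypothetical, and the notebooks may read the variables differently.

```python
import os

# Variable names as used in this README.
REQUIRED_VARS = ("INSTAGRAM_DB_URL", "INSTAGRAM_DB_KEY")


def load_db_credentials(environ=os.environ):
    """Return (url, key) read from environment variables.

    Hypothetical helper for illustration; not taken from the notebooks.
    """
    # Treat unset and empty variables alike as missing.
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variable(s): " + ", ".join(missing)
        )
    return environ["INSTAGRAM_DB_URL"], environ["INSTAGRAM_DB_KEY"]
```

Calling `db_url, db_key = load_db_credentials()` at the top of a notebook would then fail early, with a readable message, if either variable has not been configured.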
Mac / Linux
- Go to the terminal and type `printenv` to list all environment variables stored on your machine.
- Assuming that `INSTAGRAM_DB_URL` and `INSTAGRAM_DB_KEY` are not listed there yet, we're going to define these two new variables. Open the terminal, go to your user directory (shortcut: `cd ~`), and type `nano .bash_profile` to open a text editor in the terminal.
- Within this window you can create new variables as follows: `export [VARIABLE_NAME]="the string value you want to store";` (e.g. `export INSTAGRAM_DB_URL="http://anotherurl.com"`). Note that there are no spaces around the equals sign and that the string is enclosed in double quotes. Using this approach, create an `INSTAGRAM_DB_URL` and an `INSTAGRAM_DB_KEY` variable (you can list them below one another in the same file).
- Exit the editor by pressing Ctrl + X, choose `Y` (to save changes), and finally press `Enter`.
- You can check whether everything worked out correctly by restarting your terminal and typing `printenv` (`INSTAGRAM_DB_URL` and `INSTAGRAM_DB_KEY` should be listed there now!). If the new environment variables didn't show up, you may need to use `nano .zshrc` instead of `nano .bash_profile` (see step 2).
Windows
- Open up "Control Panel" > "System and Security" > "System".
- In the left sidebar click on "Advanced system settings".
- Click on "Environment Variables" in the bottom right.
- Create a new "User Variable" (top list) and fill out the "Variable name" and "Variable value" (`INSTAGRAM_DB_URL` and `[DB_URL]`, respectively).
- Repeat the same for the secret `INSTAGRAM_DB_KEY` and click "OK" twice.
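On either platform you can also confirm from a Python prompt that both variables are visible, without echoing the secret to the screen (for example once the installation steps that follow are complete). This is a minimal sketch: the `describe_variable` helper is hypothetical, and reporting only the length of each value is an illustrative choice. Restart the terminal or Jupyter first, since processes that are already running do not see newly defined variables.

```python
import os


def describe_variable(name, environ=os.environ):
    """Report whether `name` is set, showing only its length, never its value."""
    value = environ.get(name)
    if not value:
        return f"{name}: NOT SET"
    return f"{name}: set ({len(value)} characters)"


if __name__ == "__main__":
    # Variable names as used in this README.
    for name in ("INSTAGRAM_DB_URL", "INSTAGRAM_DB_KEY"):
        print(describe_variable(name))
```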
- Install Anaconda (Python distribution - including Jupyter Notebook).
- Open the terminal (Mac) or Anaconda Prompt (Windows), `cd` into the main directory (`Hiding-Instagram-Likes`), and type `conda create -n instagram --file requirements.txt`, followed by `y`. This creates a virtual environment in which all packages necessary to run the Jupyter notebooks are installed.
- Within the terminal type `conda activate instagram` followed by `conda install -c r r-essentials` to enable R support within Jupyter notebooks (if you're asked about the Java JDK, install it from here).
- Open Anaconda Navigator, switch to the newly created `instagram` virtual environment, and launch Jupyter Notebook (you may first need to click on the green "Install" button before the blue "Launch" button appears).
- In the window that now opens, navigate to the `Hiding-Instagram-Likes` directory and open either the Data Collection & Preparation (Python) or Data Analysis (R) notebook. Make sure to pick the right kernel for each notebook.
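If you are unsure whether the Python notebook is actually running inside the `instagram` environment, a quick check along the following lines may help. It is a best-effort sketch, not part of the notebooks: the function name `active_conda_env` is hypothetical, and it assumes that conda exposes the active environment through the `CONDA_DEFAULT_ENV` variable, falling back to the last component of the interpreter path otherwise. It does not apply to the R kernel.

```python
import os
import sys


def active_conda_env(environ=os.environ, prefix=sys.prefix):
    """Best-effort guess of the active conda environment name.

    Prefers CONDA_DEFAULT_ENV (set by `conda activate`) and falls back to
    the last path component of the interpreter prefix.
    """
    name = environ.get("CONDA_DEFAULT_ENV")
    if name:
        return name
    return os.path.basename(os.path.normpath(prefix))


if __name__ == "__main__":
    env = active_conda_env()
    if env == "instagram":
        print("Running inside the 'instagram' environment.")
    else:
        print(f"Active environment looks like '{env}', expected 'instagram'.")
```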
Note: you can freely run the notebooks from top to bottom. All lines that affect database records have been commented out by default.
Over the past half year I was supervised by Hannes Datta (TiSEM) and Niels van de Ven (TiSEM), whom I would like to thank for their comments on my thesis. Furthermore, I am grateful to Microsoft for providing Azure credits, which we used to study image similarity.
If you have any questions or suggestions, feel free to contact me: hoi@royklaassebos.nl