table_format.txt

﻿Fields you need to scrape from Scopus in order to answer your questions about inﬂuence and networks: Scraping is important because you need author names of the people who cited him. Because every question centers on that information -- where were they, who used him as a reference.

Article title Year
Doi
Co-author
outlet /publication
Institution of person who cites (make sure it's clear how institutional afﬁliation is deﬁne)
 
Eventual dataset will consist of four tables. IDAH will help you turn your csv output from scraping Scopus into this relational format.
 
TABLE_1:
uses information from the Scopus scrape, e.g. citation for the articles that cite Isaac.
article
discipline (if you want to assign it by journal)
-----
TABLE 2:
Uses information scraped from Scopus author list & some manually-entered information.
person
type (type = student y/n)
discipline (if you want to assign it by author. assumes it stays the same their whole career)
----
TABLE_3
institution_name (make sure these are normalized, e.g. that U.C. Berkeley and University of California, Berkeley don't both appear in spreadsheet)
geolocation
----
TABLE_4
Table for article info using primary keys (IDs) from other tables and adding type, and year.
article_ID person_ID institution_ID article_type
discipline (this is the hardest way to show discipline) year