PySpark examples that run on Azure Databricks to analyze sample Microsoft Academic Graph (MAG) data in Azure Storage.
Before running these examples, complete the following setup steps:
- Set up provisioning of Microsoft Academic Graph to an Azure blob storage account. See Get Microsoft Academic Graph on Azure storage.
- Set up the Azure Databricks service. See Set up Azure Databricks.
Before you begin, have the following information ready:
✔️ The name of your Azure Storage (AS) account containing the MAG dataset, from Get Microsoft Academic Graph on Azure storage.
✔️ The access key of that Azure Storage (AS) account, from Get Microsoft Academic Graph on Azure storage.
✔️ The name of the container in your Azure Storage (AS) account that contains the MAG dataset.
✔️ The name of the output container in your Azure Storage (AS) account.
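As a rough illustration of how these four values are typically used from a Databricks notebook, the sketch below builds a `wasbs://` URL and sets the storage account key on the Spark session before reading one file. It is not taken from the sample scripts in this repository. The helper names (`wasbs_url`, `account_key_conf`, `read_mag_file`), the placeholder values, and the example path `mag/Affiliations.txt` are assumptions made for illustration, as is the premise that MAG files are tab-separated text without a header row.

```python
# Minimal sketch, assuming access through the wasbs:// scheme with an account key.
# All helper names and placeholder values below are hypothetical.


def wasbs_url(container, account, path=""):
    """Build a wasbs:// URL for a blob path inside an Azure Storage container."""
    return "wasbs://{}@{}.blob.core.windows.net/{}".format(
        container, account, path.lstrip("/")
    )


def account_key_conf(account):
    """Return the Spark configuration key that holds the storage account key."""
    return "fs.azure.account.key.{}.blob.core.windows.net".format(account)


def read_mag_file(spark, account, key, container, path):
    """Register the account key on the session, then load one tab-separated file.

    `spark` is an existing SparkSession (Databricks notebooks provide one).
    The columns come back unnamed because no schema is supplied here.
    """
    spark.conf.set(account_key_conf(account), key)
    return (
        spark.read.format("csv")
        .option("sep", "\t")        # assumption: MAG files are tab-separated
        .option("header", "false")  # assumption: no header row
        .load(wasbs_url(container, account, path))
    )


# Placeholders for the items of information listed above.
ACCOUNT = "<AzureStorageAccount>"
ACCESS_KEY = "<AzureStorageAccessKey>"
MAG_CONTAINER = "<MagContainer>"
OUTPUT_CONTAINER = "<OutputContainer>"

# Example use inside a notebook (the file path is an assumed example):
# df = read_mag_file(spark, ACCOUNT, ACCESS_KEY, MAG_CONTAINER, "mag/Affiliations.txt")
# df.write.csv(wasbs_url(OUTPUT_CONTAINER, ACCOUNT, "example_output"))
```

Keeping the URL and configuration-key construction in small pure functions makes them easy to check without a cluster; only `read_mag_file` needs a live Spark session.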
- git clone https://github.com/Azure-Samples/microsoft-academic-graph-pyspark-samples.git
- Follow the instructions in PySpark analytics samples for Microsoft Academic Graph to run the PySpark scripts in this repository.