[Community Sprint] Documentation Tutorials 📚 #7892
Comments
Hello, the link you added in the first point of the "Guide to Contributing" section of the contribution guidelines is not working. It redirects to an .md file that has not been generated, so could you please resolve that issue? I was unable to view the guidelines.
Fixed :)
Part of #7892 documentation sprint. I have added the tutorial to the documentation and started writing a bit. Will continue to fill out! 💪 --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: rusty1s <matthias.fey@tu-dortmund.de>
#7892 Neighbor Sampling: This tutorial should go more in-depth into our [NeighborLoader](https://pytorch-geometric.readthedocs.io/en/latest/modules/loader.html#torch_geometric.loader.NeighborLoader), explain its usage and reference corresponding examples. It should outline the general computation flow of GNNs with neighborhood sampling, and things to look out for (e.g., ensuring to only make use of the first batch_size many nodes for loss/metric computation). It should also cross-link to our ["Hierarchical Neighborhood Sampling"](https://pytorch-geometric.readthedocs.io/en/latest/advanced/hgam.html) tutorial as a simple extension to improve its efficiency. --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Matthias Fey <matthias.fey@tu-dortmund.de>
We are kicking off another community sprint!
This community sprint revolves around improving our documentation to make PyG more easily accessible and to expose various PyG features more clearly. Each tutorial is categorized into one of three levels of expertise [EASY, MEDIUM, HARD], and should be picked depending on your expertise with PyG.
The sprint begins Thursday, August 16th, and will last 3 weeks. If you are interested in helping out, please also join our PyG Slack channel #documentation-sprint for more information, guidance and help. You can assign yourself to the tutorial you are planning to work on here (choose the "documentation" tab at the bottom if you get directed to a wrong tab).
Documentation Tutorials 📚
We want to improve and enhance the "Tutorials" section in our documentation. At a high level, we plan to add various tutorials regarding GNN design, applications and use-cases, dataset handling, sampling, and multi-GPU training.
GNN Design
- [MEDIUM] Best-Practices on GNN Design: This tutorial should outline common building blocks in GNN modules (e.g., GNN layers, normalization layers, skip-connections (e.g., via `JumpingKnowledge`)), and explain the various options of GNN layers we have in PyG (e.g., homogeneous GNN layers, bipartite GNN layers, GNN layers that expect edge features and edge weights, GNN layers that expect `edge_type` information, GNN layers designed for point clouds, etc.) by cross-referencing our GNN Cheatsheet.
- [EASY] Customizing Aggregations within Message Passing #7901: Most of the tutorial can be directly copied from this blog post. It should introduce our `Aggregation` package and how you can leverage it to build more powerful aggregations (a minimal sketch follows after this list).
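As a rough idea of the kind of snippet the aggregation tutorial could include, here is a minimal sketch, assuming PyG ≥ 2.2 (where the `torch_geometric.nn.aggr` package and the `aggr` argument on convolution layers are available); the toy tensors and sizes below are made up:

```python
import torch
from torch_geometric.nn import SAGEConv
from torch_geometric.nn.aggr import SoftmaxAggregation

# Made-up toy graph: 4 nodes with 16 features each, 3 directed edges.
x = torch.randn(4, 16)
edge_index = torch.tensor([[0, 1, 2], [1, 2, 3]])

# Swap the default "mean" aggregation for a fixed "max" aggregation:
conv = SAGEConv(16, 32, aggr='max')
out = conv(x, edge_index)  # shape [4, 32]

# ...or for a learnable softmax aggregation from the Aggregation package:
conv = SAGEConv(16, 32, aggr=SoftmaxAggregation(learn=True))
out = conv(x, edge_index)  # shape [4, 32]
```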
Applications
- [MEDIUM] Application Overview: This tutorial should introduce the various tasks you can tackle with PyG, including but not limited to node prediction, link prediction and graph classification. It should present the general idea of training pipelines and loss functions for these different tasks (e.g., global pooling in graph classification, link-level decoders in link prediction tasks), and ideally reference examples for this from our `examples/` folder (a minimal graph classification sketch follows after this list).
- [MEDIUM] Explainability: This tutorial needs to be extended by information stemming from our blog post. In addition, it should go over benchmark datasets and explainability metrics, and reference corresponding examples from our `examples/explain` folder.
- [EASY] `Node2Vec`/`MetaPath2Vec` Tutorial: This tutorial should introduce the `Node2Vec` and `MetaPath2Vec` methods and their corresponding modules in PyG. It should outline the general training flow of these modules, and how to perform downstream tasks given the embeddings generated by these modules.
- [HARD] Graph Transformer Tutorial: This tutorial should cover the general idea of Graph Transformers (e.g., attention, positional encodings). It should explain the underlying framework of the `GPSConv` module in PyG and how to use it to train Transformer modules on graph-structured data.
- [EASY] Point Cloud Classification/Segmentation: This tutorial should explain how we can leverage GNNs to learn on point clouds, and introduce the various layers in PyG suitable for this task. As a reference, take a look at our Google Colab Notebook. It should also explain the training pipelines of classification and segmentation tasks and reference their corresponding examples in PyG.
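As a rough sketch of the graph classification pipeline mentioned in the Application Overview item (global mean pooling followed by a graph-level classifier), something along these lines could serve as a starting point; the dataset choice (MUTAG), layer widths, and hyperparameters are illustrative placeholders:

```python
import torch
import torch.nn.functional as F
from torch_geometric.datasets import TUDataset
from torch_geometric.loader import DataLoader
from torch_geometric.nn import GCNConv, global_mean_pool

dataset = TUDataset(root='data/TUDataset', name='MUTAG')
loader = DataLoader(dataset, batch_size=32, shuffle=True)

class GraphClassifier(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, num_classes):
        super().__init__()
        self.conv1 = GCNConv(in_channels, hidden_channels)
        self.conv2 = GCNConv(hidden_channels, hidden_channels)
        self.lin = torch.nn.Linear(hidden_channels, num_classes)

    def forward(self, x, edge_index, batch):
        x = self.conv1(x, edge_index).relu()
        x = self.conv2(x, edge_index).relu()
        x = global_mean_pool(x, batch)  # Readout: one embedding per graph.
        return self.lin(x)

model = GraphClassifier(dataset.num_features, 64, dataset.num_classes)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

for data in loader:  # Each `data` object holds a mini-batch of graphs.
    optimizer.zero_grad()
    out = model(data.x, data.edge_index, data.batch)
    loss = F.cross_entropy(out, data.y)
    loss.backward()
    optimizer.step()
```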
Datasets
- Dataset Splitting: This tutorial should introduce the `RandomNodeSplit` and `RandomLinkSplit` transformations, but also cover how you can create custom splits outside of randomly generated ones (a short splitting sketch follows below).
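A rough sketch of what the random-split examples could look like; the toy graph and split ratios are placeholders:

```python
import torch
from torch_geometric.data import Data
from torch_geometric.transforms import RandomLinkSplit, RandomNodeSplit

# Made-up toy graph: 100 nodes, 16 features, 500 random directed edges.
edge_index = torch.randint(0, 100, (2, 500))
data = Data(x=torch.randn(100, 16), edge_index=edge_index,
            y=torch.randint(0, 4, (100,)))

# Node-level split: adds boolean train/val/test masks to the data object.
node_split = RandomNodeSplit(num_val=0.1, num_test=0.2)
data = node_split(data)
print(data.train_mask.sum(), data.val_mask.sum(), data.test_mask.sum())

# Link-level split: returns three data objects with disjoint supervision edges.
link_split = RandomLinkSplit(num_val=0.1, num_test=0.2, is_undirected=False)
train_data, val_data, test_data = link_split(data)
print(train_data.edge_label_index.size(), val_data.edge_label_index.size())
```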
Sampling
- [MEDIUM] Available Sampling Techniques in PyG: This tutorial should explain the basic concepts of mini-batch sampling for learning on large-scale graphs. It should cover the different options in PyG to do this, e.g., `NeighborLoader`, `ClusterLoader`, `GraphSAINT`, `ShaDowKHop`, explain their strengths and weaknesses, and which sampler/loader to pick for which task (and link to their example if available).
- [MEDIUM] Neighbor Sampling: This tutorial should go more in-depth into our `NeighborLoader`, explain its usage and reference corresponding examples. It should outline the general computation flow of GNNs with neighborhood sampling, and things to look out for (e.g., ensuring to only make use of the first `batch_size` many nodes for loss/metric computation; see the sketch after this list). It should also cross-link to our "Hierarchical Neighborhood Sampling" tutorial as a simple extension to improve its efficiency.
- [HARD] Link-level Neighbor Sampling: This tutorial should go more in-depth on how you can perform mini-batching for link prediction tasks on large-scale graphs. It should cover the basics of `LinkNeighborLoader` and how it works under the hood, explain the differences between `edge_index` and `edge_label_index`, and cover basic training pipelines. In addition, we can showcase how to leverage `KNNIndex` to perform fast querying of nearest neighbors during inference, based on the embeddings obtained from the trained GNN.
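A rough sketch of the neighbor-sampling flow described above, including the `batch_size`-slicing caveat; the dataset (Cora), fan-out values, and model choice are placeholders:

```python
import torch
import torch.nn.functional as F
from torch_geometric.datasets import Planetoid
from torch_geometric.loader import NeighborLoader
from torch_geometric.nn import GraphSAGE

dataset = Planetoid(root='data/Planetoid', name='Cora')
data = dataset[0]

# Sample 10 neighbors per node for each of 2 hops, seeded at training nodes.
loader = NeighborLoader(
    data,
    num_neighbors=[10, 10],
    batch_size=128,
    input_nodes=data.train_mask,
    shuffle=True,
)

model = GraphSAGE(dataset.num_features, hidden_channels=64, num_layers=2,
                  out_channels=dataset.num_classes)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

for batch in loader:
    optimizer.zero_grad()
    out = model(batch.x, batch.edge_index)
    # Only the first `batch_size` nodes are seed nodes; the remaining nodes are
    # sampled neighbors and must be excluded from the loss computation:
    loss = F.cross_entropy(out[:batch.batch_size], batch.y[:batch.batch_size])
    loss.backward()
    optimizer.step()
```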
Multi-GPU Training
- [EASY] Multi-GPU Training in Vanilla PyTorch Tutorial #7893: This tutorial should cover the basics of how we can leverage `torch.nn.parallel.DistributedDataParallel` for multi-GPU training in PyG. It should briefly go over the corresponding examples in PyG for distributed batching and distributed sampling (a minimal sketch follows after this list).
- [MEDIUM] PyTorch Lightning: This tutorial should explain how one can leverage PyTorch Lightning within PyG for multi-GPU training. It should go over our PyTorch Lightning wrappers in PyG to easily convert PyG datasets into a `LightningDataModule` instance, and go over and reference our PyTorch Lightning examples.
- [MEDIUM] `cugraph` and `cugraph-ops`: (@pyg-team/nvidia-team) This tutorial should introduce and explain the usage of `CuGraphConv` modules in PyG. It would be great if more information can be shared on what makes these layers more efficient than their PyG counterparts. This tutorial should also capture how one can use them for multi-GPU training within `cugraph`.
- [HARD] `torch_geometric.distributed`: (@pyg-team/intel-team) This tutorial should explain the usage and internals of our `torch_geometric.distributed` package (still WIP). More information will be added once it is ready.
- [HARD] GraphLearn for PyTorch (GLT): This tutorial should cover how one can leverage GraphLearn for PyTorch for multi-GPU training within PyG. It should shed some light on the internals and explain how to use it, similar to what is already present in the `README`.
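A rough sketch of the vanilla `DistributedDataParallel` pattern the first item refers to, loosely following the distributed-sampling style of our multi-GPU examples; the master address/port, dataset, fan-outs, and model sizes are placeholders:

```python
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
import torch.nn.functional as F
from torch.nn.parallel import DistributedDataParallel
from torch_geometric.datasets import Planetoid
from torch_geometric.loader import NeighborLoader
from torch_geometric.nn import GraphSAGE


def run(rank, world_size):
    os.environ['MASTER_ADDR'] = 'localhost'  # Placeholder rendezvous settings.
    os.environ['MASTER_PORT'] = '12355'
    dist.init_process_group('nccl', rank=rank, world_size=world_size)

    dataset = Planetoid(root='data/Planetoid', name='Cora')
    data = dataset[0]

    # Each process samples from a disjoint chunk of the training seed nodes:
    train_idx = data.train_mask.nonzero(as_tuple=False).view(-1)
    train_idx = train_idx.split(train_idx.size(0) // world_size)[rank]

    loader = NeighborLoader(data, num_neighbors=[10, 10], batch_size=64,
                            input_nodes=train_idx, shuffle=True)

    model = GraphSAGE(dataset.num_features, hidden_channels=64, num_layers=2,
                      out_channels=dataset.num_classes).to(rank)
    model = DistributedDataParallel(model, device_ids=[rank])
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

    for batch in loader:
        batch = batch.to(rank)
        optimizer.zero_grad()
        out = model(batch.x, batch.edge_index)[:batch.batch_size]
        loss = F.cross_entropy(out, batch.y[:batch.batch_size])
        loss.backward()  # Gradients are synchronized across processes here.
        optimizer.step()

    dist.destroy_process_group()


if __name__ == '__main__':
    world_size = torch.cuda.device_count()
    mp.spawn(run, args=(world_size,), nprocs=world_size, join=True)
```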
Guide to Contributing
1. Add your tutorial as a new `*.rst` file in the `docs/source/tutorial/` folder. You can browse other files in this folder to get a sense for how tutorials are written and formatted.
2. Open a pull request with the title "{tutorial_name}". Afterwards, create a respective entry in `CHANGELOG.md` to document your change/feature.