Welcome to CogDL’s Documentation!¶

CogDL is a graph representation learning toolkit that allows researchers and developers to easily train and compare baseline or custom models for node classification, link prediction and other tasks on graphs. It provides implementations of many popular models, including non-GNN baselines like DeepWalk, LINE and NetMF, and GNN baselines like GCN, GAT and GraphSAGE.
CogDL provides these features:
Task-Oriented: CogDL focuses on tasks on graphs and provides corresponding models, datasets, and leaderboards.
Easy-Running: CogDL supports running multiple experiments simultaneously on multiple models and datasets under a specific task using multiple GPUs.
Multiple Tasks: CogDL supports node classification and link prediction tasks on homogeneous/heterogeneous networks, as well as graph classification.
Extensibility: You can easily add new datasets, models and tasks and conduct experiments for them!
Supported tasks:
Node classification
Link prediction
Graph classification
Community detection (testing)
Social influence prediction (testing)
Graph reasoning (todo)
Graph pre-training (todo)
Combinatorial optimization on graphs (todo)
Install¶
PyTorch version >= 1.0.0
Python version >= 3.6
PyTorch Geometric (optional)
Please follow the instructions here to install PyTorch: https://github.com/pytorch/pytorch#installation.
Please follow the instructions here to install PyTorch Geometric: https://github.com/rusty1s/pytorch_geometric/#installation.
Install other dependencies:
>>> pip install -e .
Tutorial¶
This guide can help you start working with CogDL.
Create a model¶
Here, we will create a spectral clustering model, which is a very simple graph embedding algorithm. We name it spectral.py and put it in the cogdl/models/emb directory.
First, we import the necessary libraries such as numpy, scipy, networkx and sklearn, and we also import the ‘BaseModel’ class and the ‘register_model’ decorator from cogdl/models/ to build our new model:
import numpy as np
import networkx as nx
import scipy.sparse as sp
from sklearn import preprocessing
from .. import BaseModel, register_model
Then we use the function decorator to declare the new model for CogDL:
@register_model('spectral')
class Spectral(BaseModel):
(...)
We have to implement the method ‘build_model_from_args’ in spectral.py. If the model needs more parameters for training, we can use ‘add_args’ to add model-specific arguments.
@staticmethod
def add_args(parser):
"""Add model-specific arguments to the parser."""
pass
@classmethod
def build_model_from_args(cls, args):
return cls(args.hidden_size)
def __init__(self, dimension):
super(Spectral, self).__init__()
self.dimension = dimension
Each new model should provide a ‘train’ method to obtain the node representations.
def train(self, G):
matrix = nx.normalized_laplacian_matrix(G).todense()
matrix = np.eye(matrix.shape[0]) - np.asarray(matrix)
ut, s, _ = sp.linalg.svds(matrix, self.dimension)
emb_matrix = ut * np.sqrt(s)
emb_matrix = preprocessing.normalize(emb_matrix, "l2")
return emb_matrix
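To sanity-check the model, it can also be used on its own. Here is a minimal sketch, assuming the file above is saved as cogdl/models/emb/spectral.py:
import networkx as nx
from cogdl.models.emb.spectral import Spectral

G = nx.karate_club_graph()       # toy graph with 34 nodes
model = Spectral(dimension=16)   # the embedding dimension must be smaller than the number of nodes
emb = model.train(G)             # returns a (34, 16) array of L2-normalized node embeddings
print(emb.shape)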
Create a dataset¶
In order to add a dataset into CogDL, you should know your dataset’s format. We have provided several graph formats like edgelist, matlab_matrix and pyg. If your dataset is in the same format as the ‘ppi’ dataset, which contains two matrices: ‘network’ and ‘group’, you can register your dataset directly with the code below.
@register_dataset("ppi")
class PPIDataset(MatlabMatrix):
def __init__(self):
dataset, filename = "ppi", "Homo_sapiens"
url = "http://snap.stanford.edu/node2vec/"
path = osp.join(osp.dirname(osp.realpath(__file__)), "../..", "data", dataset)
super(PPIDataset, self).__init__(path, filename, url)
You should declare the name of the dataset, the name of the file, and the URL from which our script can download the resource.
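For example, another dataset in the same Matlab matrix format could be registered the same way. The names and URL below are placeholders, not a real dataset:
@register_dataset("my_network")
class MyNetworkDataset(MatlabMatrix):
    def __init__(self):
        dataset, filename = "my_network", "my_network"  # placeholder dataset and file names
        url = "https://example.com/data/"               # placeholder URL hosting my_network.mat
        path = osp.join(osp.dirname(osp.realpath(__file__)), "../..", "data", dataset)
        super(MyNetworkDataset, self).__init__(path, filename, url)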
Create a task¶
In order to evaluate some methods on several datasets, we can build a task and evaluate the learned representations. The BaseTask class is:
class BaseTask(object):
@staticmethod
def add_args(parser):
"""Add task-specific arguments to the parser."""
pass
def __init__(self, args):
pass
def train(self, num_epoch):
raise NotImplementedError
We can create a subclass that implements the ‘train’ method, like CommunityDetection, which gets the representation of each node and applies a clustering algorithm (K-means) for evaluation.
@register_task("community_detection")
class CommunityDetection(BaseTask):
"""Community Detection task."""
@staticmethod
def add_args(parser):
"""Add task-specific arguments to the parser."""
parser.add_argument("--hidden-size", type=int, default=128)
parser.add_argument("--num-shuffle", type=int, default=5)
def __init__(self, args):
super(CommunityDetection, self).__init__(args)
dataset = build_dataset(args)
self.data = dataset[0]
self.num_nodes, self.num_classes = self.data.y.shape
self.label = np.argmax(self.data.y, axis=1)
self.model = build_model(args)
self.hidden_size = args.hidden_size
self.num_shuffle = args.num_shuffle
def train(self):
G = nx.Graph()
G.add_edges_from(self.data.edge_index.t().tolist())
embeddings = self.model.train(G)
clusters = [30, 50, 70]
all_results = defaultdict(list)
for num_cluster in clusters:
for _ in range(self.num_shuffle):
model = KMeans(n_clusters=num_cluster).fit(embeddings)
nmi_score = normalized_mutual_info_score(self.label, model.labels_)
all_results[num_cluster].append(nmi_score)
return dict(
(
f"normalized_mutual_info_score {num_cluster}",
sum(all_results[num_cluster]) / len(all_results[num_cluster]),
)
for num_cluster in sorted(all_results.keys())
)
Combine model, dataset and task¶
After creating your model, dataset and task, we can combine them together to learn representations from a model on a dataset and evaluate its performance according to a task. We use the ‘build_model’, ‘build_dataset’ and ‘build_task’ methods to build them with the corresponding parameters.
from cogdl.tasks import build_task
from cogdl.datasets import build_dataset
from cogdl.models import build_model
from cogdl.utils import build_args_from_dict
def test_deepwalk_ppi():
default_dict = {'hidden_size': 64, 'num_shuffle': 1, 'cpu': True}
args = build_args_from_dict(default_dict)
# model, dataset and task parameters
args.model = 'spectral'
args.dataset = 'ppi'
args.task = 'community_detection'
# build model, dataset and task
dataset = build_dataset(args)
model = build_model(args)
task = build_task(args)
# train model and get evaluate results
ret = task.train()
print(ret)
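If this snippet is saved as a standalone script, it can be invoked directly; a minimal sketch:
if __name__ == "__main__":
    test_deepwalk_ppi()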
Tasks¶
Node Classification¶
In this tutorial, we will introduce an important task, node classification. In this task, we train a GNN model with partial node labels and use accuracy to measure the performance.
First we define the NodeClassification class.
@register_task("node_classification")
class NodeClassification(BaseTask):
"""Node classification task."""
@staticmethod
def add_args(parser):
"""Add task-specific arguments to the parser."""
def __init__(self, args):
super(NodeClassification, self).__init__(args)
Then we can build the dataset according to args.
self.device = torch.device('cpu' if args.cpu else 'cuda')
dataset = build_dataset(args)
self.data = dataset.data
self.data.apply(lambda x: x.to(self.device))
args.num_features = dataset.num_features
args.num_classes = dataset.num_classes
After that, we can build the model and use Adam to optimize it.
model = build_model(args)
self.model = model.to(self.device)
self.patience = args.patience
self.max_epoch = args.max_epoch
self.optimizer = torch.optim.Adam(
self.model.parameters(), lr=args.lr, weight_decay=args.weight_decay
)
We provide a training loop for the node classification task. For each epoch, we first call _train_step to optimize our model and then call _test_step to compute the accuracy and loss.
def train(self):
epoch_iter = tqdm(range(self.max_epoch))
patience = 0
best_score = 0
best_loss = np.inf
max_score = 0
min_loss = np.inf
for epoch in epoch_iter:
self._train_step()
train_acc, _ = self._test_step(split="train")
val_acc, val_loss = self._test_step(split="val")
epoch_iter.set_description(
f"Epoch: {epoch:03d}, Train: {train_acc:.4f}, Val: {val_acc:.4f}"
)
if val_loss <= min_loss or val_acc >= max_score:
if val_loss <= best_loss: # and val_acc >= best_score:
best_loss = val_loss
best_score = val_acc
best_model = copy.deepcopy(self.model)
min_loss = np.min((min_loss, val_loss))
max_score = np.max((max_score, val_acc))
patience = 0
else:
patience += 1
if patience == self.patience:
self.model = best_model
epoch_iter.close()
break
def _train_step(self):
self.model.train()
self.optimizer.zero_grad()
self.model.loss(self.data).backward()
self.optimizer.step()
def _test_step(self, split="val"):
self.model.eval()
logits = self.model.predict(self.data)
_, mask = list(self.data(f"{split}_mask"))[0]
loss = F.nll_loss(logits[mask], self.data.y[mask])
pred = logits[mask].max(1)[1]
acc = pred.eq(self.data.y[mask]).sum().item() / mask.sum().item()
return acc, loss
Finally, we compute the accuracy scores on the test set for the trained model.
test_acc, _ = self._test_step(split="test")
print(f"Test accuracy = {test_acc}")
return dict(Acc=test_acc)
The overall implementation of NodeClassification is at (https://github.com/THUDM/cogdl/blob/master/cogdl/tasks/node_classification.py).
To run NodeClassification, we can use the following command:
python scripts/train.py --task node_classification --dataset cora citeseer --model pyg_gcn pyg_gat --seed 0 1 --max-epoch 500
Then we get experimental results like this:
Variant | Acc
---|---
(‘cora’, ‘pyg_gcn’) | 0.7785±0.0165
(‘cora’, ‘pyg_gat’) | 0.7925±0.0045
(‘citeseer’, ‘pyg_gcn’) | 0.6535±0.0195
(‘citeseer’, ‘pyg_gat’) | 0.6675±0.0025
Unsupervised Node Classification¶
In this tutorial, we will introduce an important task, unsupervised node classification. In this task, we usually apply L2-normalized logistic regression to train a classifier and use the F1-score to measure the performance.
First we define the UnsupervisedNodeClassification class, which has two parameters, hidden-size and num-shuffle. hidden-size represents the dimension of the node representation, while num-shuffle means the number of shuffles used in the classifier.
@register_task("unsupervised_node_classification")
class UnsupervisedNodeClassification(BaseTask):
"""Node classification task."""
@staticmethod
def add_args(parser):
"""Add task-specific arguments to the parser."""
# fmt: off
parser.add_argument("--hidden-size", type=int, default=128)
parser.add_argument("--num-shuffle", type=int, default=5)
# fmt: on
def __init__(self, args):
super(UnsupervisedNodeClassification, self).__init__(args)
Then we can build the dataset according to the input graph’s type and get self.label_matrix.
dataset = build_dataset(args)
self.data = dataset[0]
if issubclass(dataset.__class__.__bases__[0], InMemoryDataset):
self.num_nodes = self.data.y.shape[0]
self.num_classes = dataset.num_classes
self.label_matrix = np.zeros((self.num_nodes, self.num_classes), dtype=int)
self.label_matrix[range(self.num_nodes), self.data.y] = 1
self.data.edge_attr = self.data.edge_attr.t()
else:
self.label_matrix = self.data.y
self.num_nodes, self.num_classes = self.data.y.shape
After that, we can build the model and run model.train(G) to obtain the node representations.
self.model = build_model(args)
self.model_name = args.model
self.hidden_size = args.hidden_size
self.num_shuffle = args.num_shuffle
self.save_dir = args.save_dir
self.enhance = args.enhance
self.args = args
self.is_weighted = self.data.edge_attr is not None
def train(self):
G = nx.Graph()
if self.is_weighted:
edges, weight = (
self.data.edge_index.t().tolist(),
self.data.edge_attr.tolist(),
)
G.add_weighted_edges_from(
[(edges[i][0], edges[i][1], weight[0][i]) for i in range(len(edges))]
)
else:
G.add_edges_from(self.data.edge_index.t().tolist())
embeddings = self.model.train(G)
The spectral propagation in ProNE can improve the quality of representation learned from other methods, so we can use enhance_emb to enhance performance.
if self.enhance is True:
embeddings = self.enhance_emb(G, embeddings)
def enhance_emb(self, G, embs):
A = sp.csr_matrix(nx.adjacency_matrix(G))
self.args.model = 'prone'
self.args.step, self.args.theta, self.args.mu = 5, 0.5, 0.2
model = build_model(self.args)
embs = model._chebyshev_gaussian(A, embs)
return embs
When the embeddings are obtained, we can save them at self.save_dir.
# Map node2id
features_matrix = np.zeros((self.num_nodes, self.hidden_size))
for vid, node in enumerate(G.nodes()):
features_matrix[node] = embeddings[vid]
self.save_emb(features_matrix)
def save_emb(self, embs):
name = os.path.join(self.save_dir, self.model_name + '_emb.npy')
np.save(name, embs)
At last, we evaluate the embeddings by running the classification num_shuffle times under different training ratios with features_matrix and label_matrix.
return self._evaluate(features_matrix, label_matrix, self.num_shuffle)
def _evaluate(self, features_matrix, label_matrix, num_shuffle):
# shuffle, to create train/test groups
shuffles = []
for _ in range(num_shuffle):
shuffles.append(skshuffle(features_matrix, label_matrix))
# score each train/test group
all_results = defaultdict(list)
training_percents = [0.1, 0.3, 0.5, 0.7, 0.9]
for train_percent in training_percents:
for shuf in shuffles:
In each shuffle, we split the data into two parts (training and testing) and use LogisticRegression to evaluate.
X, y = shuf
training_size = int(train_percent * self.num_nodes)
X_train = X[:training_size, :]
y_train = y[:training_size, :]
X_test = X[training_size:, :]
y_test = y[training_size:, :]
clf = TopKRanker(LogisticRegression())
clf.fit(X_train, y_train)
# find out how many labels should be predicted
top_k_list = list(map(int, y_test.sum(axis=1).T.tolist()[0]))
preds = clf.predict(X_test, top_k_list)
result = f1_score(y_test, preds, average="micro")
all_results[train_percent].append(result)
Nodes in a graph may have multiple labels, so we conduct multilabel classification built on TopKRanker.
from sklearn.multiclass import OneVsRestClassifier
class TopKRanker(OneVsRestClassifier):
def predict(self, X, top_k_list):
assert X.shape[0] == len(top_k_list)
probs = np.asarray(super(TopKRanker, self).predict_proba(X))
all_labels = sp.lil_matrix(probs.shape)
for i, k in enumerate(top_k_list):
probs_ = probs[i, :]
labels = self.classes_[probs_.argsort()[-k:]].tolist()
for label in labels:
all_labels[i, label] = 1
return all_labels
Finally, we get the results of the Micro-F1 score under different training ratios for different models on the datasets.
return dict(
(
f"Micro-F1 {train_percent}",
sum(all_results[train_percent]) / len(all_results[train_percent]),
)
for train_percent in sorted(all_results.keys())
)
The overall implementation of UnsupervisedNodeClassification is at (https://github.com/THUDM/cogdl/blob/master/cogdl/tasks/unsupervised_node_classification.py).
To run UnsupervisedNodeClassification, we can use the following command:
python scripts/train.py --task unsupervised_node_classification --dataset ppi wikipedia --model deepwalk prone --seed 0 1
Then we get experimental results like this:
Variant | Micro-F1 0.1 | Micro-F1 0.3 | Micro-F1 0.5 | Micro-F1 0.7 | Micro-F1 0.9
---|---|---|---|---|---
(‘ppi’, ‘deepwalk’) | 0.1547±0.0002 | 0.1846±0.0002 | 0.2033±0.0015 | 0.2161±0.0009 | 0.2243±0.0018
(‘ppi’, ‘prone’) | 0.1777±0.0016 | 0.2214±0.0020 | 0.2397±0.0015 | 0.2486±0.0022 | 0.2607±0.0096
(‘wikipedia’, ‘deepwalk’) | 0.4255±0.0027 | 0.4712±0.0005 | 0.4916±0.0011 | 0.5011±0.0017 | 0.5166±0.0043
(‘wikipedia’, ‘prone’) | 0.4834±0.0009 | 0.5320±0.0020 | 0.5504±0.0045 | 0.5586±0.0022 | 0.5686±0.0072
Supervised Graph Classification¶
In this section, we will introduce the implementation of the graph classification task.
Task Design
Set up the “SupervisedGraphClassification” class, which has several specific parameters.
degree-feature: Use one-hot node degree as the node feature, for datasets such as imdb-binary and imdb-multi, which don’t have node features.
gamma: Multiplicative factor of learning rate decay.
lr: Learning rate.
Build the dataset and convert it to a list of Data defined in CogDL. Specifically, we reformat the data according to the input format of specific models. generate_data is implemented to convert the dataset.
dataset = build_dataset(args)
self.data = self.generate_data(dataset, args)
def generate_data(self, dataset, args):
if "ModelNet" in str(type(dataset).__name__):
train_set, test_set = dataset.get_all()
args.num_features = 3
return {"train": train_set, "test": test_set}
else:
datalist = []
if isinstance(dataset[0], Data):
return dataset
for idata in dataset:
data = Data()
for key in idata.keys:
data[key] = idata[key]
datalist.append(data)
if args.degree_feature:
datalist = node_degree_as_feature(datalist)
args.num_features = datalist[0].num_features
return datalist
Then we build the model and run train to train it.
def train(self):
for epoch in epoch_iter:
self._train_step()
val_acc, val_loss = self._test_step(split="valid")
# ...
return dict(Acc=test_acc)
def _train_step(self):
self.model.train()
loss_n = 0
for batch in self.train_loader:
batch = batch.to(self.device)
self.optimizer.zero_grad()
output, loss = self.model(batch)
loss_n += loss.item()
loss.backward()
self.optimizer.step()
def _test_step(self, split):
"""split in ['train', 'test', 'valid']"""
# ...
return acc, loss
The overall implementation of GraphClassification is at (https://github.com/THUDM/cogdl/blob/master/cogdl/tasks/graph_classification.py).
Create a model
To create a model for the graph classification task, the following functions have to be implemented.
add_args(parser): add necessary hyper-parameters used in model.
@staticmethod
def add_args(parser):
parser.add_argument("--hidden-size", type=int, default=128)
parser.add_argument("--num-layers", type=int, default=2)
parser.add_argument("--lr", type=float, default=0.001)
# ...
build_model_from_args(cls, args): this function is called in the ‘task’ to build the model; a minimal sketch is given below.
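The sketch follows the same pattern as the spectral model earlier; the constructor arguments are illustrative (they mirror the hyper-parameters added in add_args above) and will differ from model to model.
@classmethod
def build_model_from_args(cls, args):
    # illustrative only: pass whatever hyper-parameters the model's __init__ expects
    return cls(args.hidden_size, args.num_layers)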
split_dataset(cls, dataset, args): split the dataset into train/validation/test sets and return the corresponding dataloaders according to the requirements of the model.
def split_dataset(cls, dataset, args):
random.shuffle(dataset)
train_size = int(len(dataset) * args.train_ratio)
test_size = int(len(dataset) * args.test_ratio)
bs = args.batch_size
train_loader = DataLoader(dataset[:train_size], batch_size=bs)
test_loader = DataLoader(dataset[-test_size:], batch_size=bs)
if args.train_ratio + args.test_ratio < 1:
valid_loader = DataLoader(dataset[train_size:-test_size], batch_size=bs)
else:
valid_loader = test_loader
return train_loader, valid_loader, test_loader
forward: the forward propagation; the return should be (prediction, loss) for training and (prediction, None) for testing. The input parameter of forward is an instance of class Batch.
def forward(self, batch):
h = batch.x
layer_rep = [h]
for i in range(self.num_layers-1):
h = self.gin_layers[i](h, batch.edge_index)
h = self.batch_norm[i](h)
h = F.relu(h)
layer_rep.append(h)
final_score = 0
for i in range(self.num_layers):
pooled = scatter_add(layer_rep[i], batch.batch, dim=0)
final_score += self.dropout(self.linear_prediction[i](pooled))
final_score = F.softmax(final_score, dim=-1)
if batch.y is not None:
loss = self.loss(final_score, batch.y)
return final_score, loss
return final_score, None
Run
To run GraphClassification, we can use the following command:
python scripts/train.py --task graph_classification --dataset proteins --model gin diffpool sortpool dgcnn --seed 0 1
Then we get experimental results like this:
Variants | Acc
---|---
(‘proteins’, ‘gin’) | 0.7286±0.0598
(‘proteins’, ‘diffpool’) | 0.7530±0.0589
(‘proteins’, ‘sortpool’) | 0.7411±0.0269
(‘proteins’, ‘dgcnn’) | 0.6677±0.0355
(‘proteins’, ‘patchy_san’) | 0.7550±0.0812
Unsupervised Graph Classification¶
In this section, we will introduce the implementation of the unsupervised graph classification task.
Task Design
Set up the “UnsupervisedGraphClassification” class, which has several specific parameters.
num-shuffle: Shuffle times in the classifier.
degree-feature: Use one-hot node degree as the node feature, for datasets such as imdb-binary and imdb-multi, which don’t have node features.
lr: Learning rate.
@register_task("unsupervised_graph_classification")
class UnsupervisedGraphClassification(BaseTask):
r"""Unsupervised graph classification"""
@staticmethod
def add_args(parser):
"""Add task-specific arguments to the parser."""
# fmt: off
parser.add_argument("--num-shuffle", type=int, default=10)
parser.add_argument("--degree-feature", dest="degree_feature", action="store_true")
parser.add_argument("--lr", type=float, default=0.001)
# fmt: on
def __init__(self, args):
# ...
Build the dataset and convert it to a list of Data defined in CogDL.
dataset = build_dataset(args)
self.label = np.array([data.y for data in dataset])
self.data = [
Data(x=data.x, y=data.y, edge_index=data.edge_index, edge_attr=data.edge_attr,
pos=data.pos).apply(lambda x:x.to(self.device))
for data in dataset
]
Then we build the model and run train to train it and obtain the graph representations. In this part, the training processes of shallow models and deep neural network models are implemented separately.
self.model = build_model(args)
self.model = self.model.to(self.device)
def train(self):
if self.use_nn:
# deep neural network models
epoch_iter = tqdm(range(self.epoch))
for epoch in epoch_iter:
loss_n = 0
for batch in self.data_loader:
batch = batch.to(self.device)
predict, loss = self.model(batch.x, batch.edge_index, batch.batch)
self.optimizer.zero_grad()
loss.backward()
self.optimizer.step()
loss_n += loss.item()
# ...
else:
# shallow models
prediction, loss = self.model(self.data)
label = self.label
When the graph representations are obtained, we evaluate the embeddings with an SVM by running the classification num_shuffle times under different training ratios. You can also call save_emb to save the embeddings.
return self._evaluate(prediction, label)
def _evaluate(self, embedding, labels):
# ...
for training_percent in training_percents:
for shuf in shuffles:
# ...
clf = SVC()
clf.fit(X_train, y_train)
preds = clf.predict(X_test)
# ...
The overall implementation of UnsupervisedGraphClassification is at (https://github.com/THUDM/cogdl/blob/master/cogdl/tasks/unsupervised_graph_classification.py).
Create a model
To create a model for the unsupervised graph classification task, the following functions have to be implemented.
add_args(parser): add necessary hyper-parameters used in model.
@staticmethod
def add_args(parser):
parser.add_argument("--hidden-size", type=int, default=128)
parser.add_argument("--nn", type=bool, default=False)
parser.add_argument("--lr", type=float, default=0.001)
# ...
build_model_from_args(cls, args): this function is called in the ‘task’ to build the model.
forward: For shallow models, this function runs the whole training process of the model and will be called only once; for deep neural network models, it is the actual forward propagation and will be called many times.
# shallow model
def forward(self, graphs):
# ...
self.model = Doc2Vec(
self.doc_collections,
...
)
vectors = np.array([self.model["g_"+str(i)] for i in range(len(graphs))])
return vectors, None
Run
To run UnsupervisedGraphClassification, we can use the following command:
python scripts/train.py --task unsupervised_graph_classification --dataset proteins --model dgk graph2vec
Then we get experimental results like this:
Variant | Acc
---|---
(‘proteins’, ‘dgk’) | 0.7259±0.0118
(‘proteins’, ‘graph2vec’) | 0.7330±0.0043
(‘proteins’, ‘infograph’) | 0.7393±0.0070
License¶
MIT License
Copyright (c) 2020
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Citing¶
[Perozzi et al. (2014): Deepwalk: Online learning of social representations](http://arxiv.org/abs/1403.6652)
[Tang et al. (2015): LINE: Large-scale information network embedding](http://arxiv.org/abs/1503.03578)
[Grover and Leskovec (2016): node2vec: Scalable feature learning for networks](http://dl.acm.org/citation.cfm?doid=2939672.2939754)
[Cao et al. (2015): GraRep: Learning graph representations with global structural information](http://dl.acm.org/citation.cfm?doid=2806416.2806512)
[Ou et al. (2016): Asymmetric transitivity preserving graph embedding](http://dl.acm.org/citation.cfm?doid=2939672.2939751)
[Qiu et al. (2017): Network Embedding as Matrix Factorization: Unifying DeepWalk, LINE, PTE, and node2vec](http://arxiv.org/abs/1710.02971)
[Zhang et al. (2019): Spectral Network Embedding: A Fast and Scalable Method via Sparsity](http://arxiv.org/abs/1806.02623)
[Kipf and Welling (2016): Semi-Supervised Classification with Graph Convolutional Networks](https://arxiv.org/abs/1609.02907)
[Hamilton et al. (2017): Inductive Representation Learning on Large Graphs](https://arxiv.org/abs/1706.02216)
[Veličković et al. (2017): Graph Attention Networks](https://arxiv.org/abs/1710.10903)
[Ding et al. (2018): Semi-supervised Learning on Graphs with Generative Adversarial Nets](https://arxiv.org/abs/1809.00130)
[Han et al. (2019): GroupRep: Unsupervised Structural Representation Learning for Groups in Networks](https://www.overleaf.com/read/nqxjtkmmgmff)
[Zhang et al. (2019): Revisiting Graph Convolutional Networks: Neighborhood Aggregation and Network Sampling](https://www.overleaf.com/read/xzykmvhxjmxy)
[Zhang et al. (2019): Co-training Graph Convolutional Networks with Network Redundancy](https://www.overleaf.com/read/fbhqqgzqgmyn)
[Qiu et al. (2019): NetSMF: Large-Scale Network Embedding as Sparse Matrix Factorization](http://keg.cs.tsinghua.edu.cn/jietang/publications/www19-Qiu-et-al-NetSMF-Large-Scale-Network-Embedding.pdf)
[Zhang et al. (2019): ProNE: Fast and Scalable Network Representation Learning](https://www.overleaf.com/read/dhgpkmyfdhnj)
[Cen et al. (2019): Representation Learning for Attributed Multiplex Heterogeneous Network](https://arxiv.org/abs/1905.01669)
API Reference¶
This page contains auto-generated API reference documentation.
options¶
Module Contents¶
Functions¶
The parser doesn’t know about model-specific args, so we parse twice.
utils¶
Module Contents¶
Functions¶
utils.spmm(indices, values, b)[source]¶
Args: indices: Tensor, shape=(2, E); values: Tensor, shape=(E,); shape: tuple(int, int); b: Tensor, shape=(N, )

utils.edge_softmax(indices, values, shape)[source]¶
Args: indices: Tensor, shape=(2, E); values: Tensor, shape=(N,); shape: tuple(int, int)
Returns: Softmax values of edge values for nodes
layers¶
Submodules¶
layers.gcc_module¶
Module Contents¶
class layers.gcc_module.SELayer(in_channels, se_channels)[source]¶
Bases: torch.nn.Module
Squeeze-and-excitation networks

class layers.gcc_module.ApplyNodeFunc(mlp, use_selayer)[source]¶
Bases: torch.nn.Module
Update the node feature hv with MLP, BN and ReLU.

class layers.gcc_module.MLP(num_layers, input_dim, hidden_dim, output_dim, use_selayer)[source]¶
Bases: torch.nn.Module
MLP with linear output

class layers.gcc_module.UnsupervisedGAT(node_input_dim, node_hidden_dim, edge_input_dim, num_layers, num_heads)[source]¶
Bases: torch.nn.Module

class layers.gcc_module.UnsupervisedMPNN(output_dim=32, node_input_dim=32, node_hidden_dim=32, edge_input_dim=32, edge_hidden_dim=32, num_step_message_passing=6, lstm_as_gate=False)[source]¶
Bases: torch.nn.Module
MPNN from Neural Message Passing for Quantum Chemistry
node_input_dim (int): Dimension of input node feature, default to be 15.
edge_input_dim (int): Dimension of input edge feature, default to be 15.
output_dim (int): Dimension of prediction, default to be 12.
node_hidden_dim (int): Dimension of node feature in hidden layers, default to be 64.
edge_hidden_dim (int): Dimension of edge feature in hidden layers, default to be 128.
num_step_message_passing (int): Number of message passing steps, default to be 6.
num_step_set2set (int): Number of set2set steps
num_layer_set2set (int): Number of set2set layers
forward(self, g, n_feat, e_feat)[source]¶ Predict molecule labels
g (DGLGraph): Input DGLGraph for molecule(s)
n_feat (tensor of dtype float32 and shape (B1, D1)): Node features. B1 for number of nodes and D1 for the node feature size.
e_feat (tensor of dtype float32 and shape (B2, D2)): Edge features. B2 for number of edges and D2 for the edge feature size.
res: Predicted labels
class layers.gcc_module.UnsupervisedGIN(num_layers, num_mlp_layers, input_dim, hidden_dim, output_dim, final_dropout, learn_eps, graph_pooling_type, neighbor_pooling_type, use_selayer)[source]¶
Bases: torch.nn.Module
GIN model

class layers.gcc_module.GraphEncoder(positional_embedding_size=32, max_node_freq=8, max_edge_freq=8, max_degree=128, freq_embedding_size=32, degree_embedding_size=32, output_dim=32, node_hidden_dim=32, edge_hidden_dim=32, num_layers=6, num_heads=4, num_step_set2set=6, num_layer_set2set=3, norm=False, gnn_model='mpnn', degree_input=False, lstm_as_gate=False)[source]¶
Bases: torch.nn.Module
MPNN from Neural Message Passing for Quantum Chemistry
node_input_dim (int): Dimension of input node feature, default to be 15.
edge_input_dim (int): Dimension of input edge feature, default to be 15.
output_dim (int): Dimension of prediction, default to be 12.
node_hidden_dim (int): Dimension of node feature in hidden layers, default to be 64.
edge_hidden_dim (int): Dimension of edge feature in hidden layers, default to be 128.
num_step_message_passing (int): Number of message passing steps, default to be 6.
num_step_set2set (int): Number of set2set steps
num_layer_set2set (int): Number of set2set layers
forward(self, g, return_all_outputs=False)[source]¶ Predict molecule labels
g (DGLGraph): Input DGLGraph for molecule(s)
n_feat (tensor of dtype float32 and shape (B1, D1)): Node features. B1 for number of nodes and D1 for the node feature size.
e_feat (tensor of dtype float32 and shape (B2, D2)): Edge features. B2 for number of edges and D2 for the edge feature size.
res: Predicted labels
layers.link_prediction_module¶
Module Contents¶
layers.link_prediction_module.cal_mrr(embedding, rel_embedding, edge_index, edge_type, scoring, protocol='raw', batch_size=1000, hits=[])[source]¶

class layers.link_prediction_module.ConvELayer(dim, num_filter=20, kernel_size=7, k_w=10, dropout=0.3)[source]¶
Bases: torch.nn.Module

layers.link_prediction_module.sampling_edge_uniform(edge_index, edge_types, edge_set, sampling_rate, num_rels, label_smoothing=0.0, num_entities=1)[source]¶
Args: edge_index: edge index of graph; edge_types; edge_set: set of all edges of the graph, (h, t, r); sampling_rate; num_rels; label_smoothing (Optional); num_entities (Optional)
Returns: sampled_edges: sampled existing edges; rels: types of sampled existing edges; sampled_edges_all: existing edges with corrupted edges; sampled_types_all: types of existing and corrupted edges; labels: 0/1
layers.maggregator¶
layers.prone_module¶
Module Contents¶
class layers.prone_module.PPR(alpha=0.5, k=10)[source]¶
Bases: object
Applying sparsification to accelerate computation

class layers.prone_module.SignalRescaling[source]¶
Bases: object
Rescale the signal of each node according to the degree of the node: sigmoid(degree), sigmoid(1/degree)
layers.srgcn_module¶
Module Contents¶
Package Contents¶
Classes¶
class layers.MeanAggregator(in_channels, out_channels, improved=False, cached=False, bias=True)[source]¶
Bases: torch.nn.Module
static norm(x, edge_index)¶
forward(self, x, edge_index, edge_weight=None, bias=True)¶
update(self, aggr_out)¶
__repr__(self)¶
data¶
Submodules¶
data.batch¶
Module Contents¶
class data.batch.Batch(batch=None, **kwargs)[source]¶
Bases: cogdl.data.Data
A plain old python object modeling a batch of graphs as one big (disconnected) graph. With cogdl.data.Data being the base class, all its methods can also be used here. In addition, single graphs can be reconstructed via the assignment vector batch, which maps each node to its respective graph identifier.
static from_data_list(data_list, follow_batch=[])[source]¶ Constructs a batch object from a python list holding torch_geometric.data.Data objects. The assignment vector batch is created on the fly. Additionally, creates assignment batch vectors for each key in follow_batch.
cumsum(self, key, item)[source]¶ If True, the attribute key with content item should be added up cumulatively before concatenated together. Note: This method is for internal use only, and should only be overridden if the batch concatenation process is corrupted for a specific data attribute.
to_data_list(self)[source]¶ Reconstructs the list of torch_geometric.data.Data objects from the batch object. The batch object must have been created via from_data_list() in order to be able to reconstruct the initial objects.
data.data¶
Module Contents¶
class data.data.Data(x=None, edge_index=None, edge_attr=None, y=None, pos=None)[source]¶
Bases: object
A plain old python object modeling a single graph with various (optional) attributes:
Args:
x (Tensor, optional): Node feature matrix with shape [num_nodes, num_node_features]. (default: None)
edge_index (LongTensor, optional): Graph connectivity in COO format with shape [2, num_edges]. (default: None)
edge_attr (Tensor, optional): Edge feature matrix with shape [num_edges, num_edge_features]. (default: None)
y (Tensor, optional): Graph or node targets with arbitrary shape. (default: None)
pos (Tensor, optional): Node position matrix with shape [num_nodes, num_dimensions]. (default: None)
The data object is not restricted to these attributes and can be extended by any other additional data.
__iter__(self)[source]¶ Iterates over all present attributes in the data, yielding their attribute names and content.
__call__(self, *keys)[source]¶ Iterates over all attributes *keys in the data, yielding their attribute names and content. If *keys is not given this method will iterate over all present attributes.
cat_dim(self, key, value)[source]¶ Returns the dimension in which the attribute key with content value gets concatenated when creating batches. Note: This method is for internal use only, and should only be overridden if the batch concatenation process is corrupted for a specific data attribute.
__inc__(self, key, value)[source]¶ Returns the incremental count to cumulatively increase the value of the next attribute of key when creating batches. Note: This method is for internal use only, and should only be overridden if the batch concatenation process is corrupted for a specific data attribute.
is_coalesced(self)[source]¶ Returns True, if edge indices are ordered and do not contain duplicate entries.
apply(self, func, *keys)[source]¶ Applies the function func to all attributes *keys. If *keys is not given, func is applied to all present attributes.
contiguous(self, *keys)[source]¶ Ensures a contiguous memory layout for all attributes *keys. If *keys is not given, all present attributes are ensured to have a contiguous memory layout.
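A minimal sketch of constructing a Data object with the constructor above and attaching an extra attribute (the tensors here are made-up toy values):
import torch
from cogdl.data import Data

edge_index = torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]])  # COO connectivity, shape [2, num_edges]
x = torch.randn(3, 8)                                     # 3 nodes with 8 features each
data = Data(x=x, edge_index=edge_index)
data.train_mask = torch.tensor([True, True, False])       # any additional attribute can be attached
print(data.num_nodes, data.num_edges, data.num_features)  # expected: 3 nodes, 4 edges, 8 features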
data.dataloader¶
Module Contents¶
class data.dataloader.DataLoader(dataset, batch_size=1, shuffle=True, **kwargs)[source]¶
Bases: torch.utils.data.DataLoader
Data loader which merges data objects from a cogdl.data.dataset to a mini-batch.

class data.dataloader.DataListLoader(dataset, batch_size=1, shuffle=True, **kwargs)[source]¶
Bases: torch.utils.data.DataLoader
Data loader which merges data objects from a cogdl.data.dataset to a python list.
Note: This data loader should be used for multi-gpu support via cogdl.nn.DataParallel.

class data.dataloader.DenseDataLoader(dataset, batch_size=1, shuffle=True, **kwargs)[source]¶
Bases: torch.utils.data.DataLoader
Data loader which merges data objects from a cogdl.data.dataset to a mini-batch.
Note: To make use of this data loader, all graphs in the dataset need to have the same shape for each of their attributes. Therefore, this data loader should only be used when working with dense adjacency matrices.
data.dataset¶
Module Contents¶
class data.dataset.Dataset(root, transform=None, pre_transform=None, pre_filter=None)[source]¶
Bases: torch.utils.data.Dataset
Dataset base class for creating graph datasets. See here for the accompanying tutorial.
Args:
root (string): Root directory where the dataset should be saved.
transform (callable, optional): A function/transform that takes in a cogdl.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional): A function/transform that takes in a cogdl.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
pre_filter (callable, optional): A function that takes in a cogdl.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
property raw_file_names(self)[source]¶ The name of the files to find in the self.raw_dir folder in order to skip the download.
property processed_file_names(self)[source]¶ The name of the files to find in the self.processed_dir folder in order to skip the processing.
property processed_paths(self)[source]¶ The filepaths to find in the self.processed_dir folder in order to skip the processing.
data.download¶
Module Contents¶
data.download.download_url(url, folder, name=None, log=True)[source]¶ Downloads the content of an URL to a specific folder.

data.extract¶
Module Contents¶
data.extract.extract_tar(path, folder, mode='r:gz', log=True)[source]¶ Extracts a tar archive to a specific folder.
Package Contents¶
Classes¶
class data.Data(x=None, edge_index=None, edge_attr=None, y=None, pos=None)[source]¶
Bases: object
A plain old python object modeling a single graph with various (optional) attributes:
Args:
x (Tensor, optional): Node feature matrix with shape [num_nodes, num_node_features]. (default: None)
edge_index (LongTensor, optional): Graph connectivity in COO format with shape [2, num_edges]. (default: None)
edge_attr (Tensor, optional): Edge feature matrix with shape [num_edges, num_edge_features]. (default: None)
y (Tensor, optional): Graph or node targets with arbitrary shape. (default: None)
pos (Tensor, optional): Node position matrix with shape [num_nodes, num_dimensions]. (default: None)
The data object is not restricted to these attributes and can be extended by any other additional data.
static from_dict(dictionary)¶ Creates a data object from a python dictionary.
__getitem__(self, key)¶ Gets the data of the attribute key.
__setitem__(self, key, value)¶ Sets the attribute key to value.
property keys(self)¶ Returns all names of graph attributes.
__len__(self)¶ Returns the number of all present attributes.
__iter__(self)¶ Iterates over all present attributes in the data, yielding their attribute names and content.
__call__(self, *keys)¶ Iterates over all attributes *keys in the data, yielding their attribute names and content. If *keys is not given this method will iterate over all present attributes.
cat_dim(self, key, value)¶ Returns the dimension in which the attribute key with content value gets concatenated when creating batches. Note: This method is for internal use only, and should only be overridden if the batch concatenation process is corrupted for a specific data attribute.
__inc__(self, key, value)¶ Returns the incremental count to cumulatively increase the value of the next attribute of key when creating batches. Note: This method is for internal use only, and should only be overridden if the batch concatenation process is corrupted for a specific data attribute.
property num_edges(self)¶ Returns the number of edges in the graph.
property num_features(self)¶ Returns the number of features per node in the graph.
property num_nodes(self)¶
apply(self, func, *keys)¶ Applies the function func to all attributes *keys. If *keys is not given, func is applied to all present attributes.
contiguous(self, *keys)¶ Ensures a contiguous memory layout for all attributes *keys. If *keys is not given, all present attributes are ensured to have a contiguous memory layout.
to(self, device, *keys)¶ Performs tensor dtype and/or device conversion to all attributes *keys. If *keys is not given, the conversion is applied to all present attributes.
cuda(self, *keys)¶
clone(self)¶
__repr__(self)¶ Return repr(self).
class data.Batch(batch=None, **kwargs)[source]¶
Bases: cogdl.data.Data
A plain old python object modeling a batch of graphs as one big (disconnected) graph. With cogdl.data.Data being the base class, all its methods can also be used here. In addition, single graphs can be reconstructed via the assignment vector batch, which maps each node to its respective graph identifier.
static from_data_list(data_list, follow_batch=[])¶ Constructs a batch object from a python list holding torch_geometric.data.Data objects. The assignment vector batch is created on the fly. Additionally, creates assignment batch vectors for each key in follow_batch.
cumsum(self, key, item)¶ If True, the attribute key with content item should be added up cumulatively before concatenated together. Note: This method is for internal use only, and should only be overridden if the batch concatenation process is corrupted for a specific data attribute.
to_data_list(self)¶ Reconstructs the list of torch_geometric.data.Data objects from the batch object. The batch object must have been created via from_data_list() in order to be able to reconstruct the initial objects.
property num_graphs(self)¶ Returns the number of graphs in the batch.
class data.Dataset(root, transform=None, pre_transform=None, pre_filter=None)[source]¶
Bases: torch.utils.data.Dataset
Dataset base class for creating graph datasets. See here for the accompanying tutorial.
Args:
root (string): Root directory where the dataset should be saved.
transform (callable, optional): A function/transform that takes in a cogdl.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional): A function/transform that takes in a cogdl.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
pre_filter (callable, optional): A function that takes in a cogdl.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
property raw_file_names(self)¶ The name of the files to find in the self.raw_dir folder in order to skip the download.
property processed_file_names(self)¶ The name of the files to find in the self.processed_dir folder in order to skip the processing.
abstract download(self)¶ Downloads the dataset to the self.raw_dir folder.
abstract process(self)¶ Processes the dataset to the self.processed_dir folder.
abstract __len__(self)¶ The number of examples in the dataset.
abstract get(self, idx)¶ Gets the data object at index idx.
property num_features(self)¶ Returns the number of features per node in the graph.
property raw_paths(self)¶ The filepaths to find in order to skip the download.
property processed_paths(self)¶ The filepaths to find in the self.processed_dir folder in order to skip the processing.
_download(self)¶
_process(self)¶
__getitem__(self, idx)¶ Gets the data object at index idx and transforms it (in case a self.transform is given).
__repr__(self)¶
class data.DataLoader(dataset, batch_size=1, shuffle=True, **kwargs)[source]¶
Bases: torch.utils.data.DataLoader
Data loader which merges data objects from a cogdl.data.dataset to a mini-batch.

class data.DataListLoader(dataset, batch_size=1, shuffle=True, **kwargs)[source]¶
Bases: torch.utils.data.DataLoader
Data loader which merges data objects from a cogdl.data.dataset to a python list.
Note: This data loader should be used for multi-gpu support via cogdl.nn.DataParallel.

class data.DenseDataLoader(dataset, batch_size=1, shuffle=True, **kwargs)[source]¶
Bases: torch.utils.data.DataLoader
Data loader which merges data objects from a cogdl.data.dataset to a mini-batch.
Note: To make use of this data loader, all graphs in the dataset need to have the same shape for each of their attributes. Therefore, this data loader should only be used when working with dense adjacency matrices.

Functions¶
data.download_url(url, folder, name=None, log=True)[source]¶ Downloads the content of an URL to a specific folder.
data.extract_tar(path, folder, mode='r:gz', log=True)[source]¶ Extracts a tar archive to a specific folder.
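A rough usage sketch of the DataLoader above; the dataset name ‘mutag’ is only an example and assumes such a graph classification dataset is registered in CogDL:
from cogdl.data import DataLoader
from cogdl.datasets import build_dataset
from cogdl.utils import build_args_from_dict

args = build_args_from_dict({"dataset": "mutag"})   # any registered graph classification dataset
dataset = build_dataset(args)
loader = DataLoader(dataset, batch_size=32, shuffle=True)
for batch in loader:                                # each batch is a cogdl.data.Batch object
    print(batch.num_graphs)
    break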
tasks¶
Submodules¶
tasks.graph_classification¶
Module Contents¶
tasks.graph_classification.node_degree_as_feature(data)[source]¶
Set each node feature as one-hot encoding of degree. Args: data: a list of class Data. Returns: a list of class Data.

class tasks.graph_classification.GraphClassification(args)[source]¶
Bases: tasks.BaseTask
Supervised graph classification task.
tasks.heterogeneous_node_classification¶
Module Contents¶
class tasks.heterogeneous_node_classification.HeterogeneousNodeClassification(args)[source]¶
Bases: tasks.BaseTask
Heterogeneous node classification task.
tasks.link_prediction¶
Module Contents¶
tasks.multiplex_link_prediction¶
tasks.multiplex_node_classification¶
tasks.node_classification¶
Module Contents¶
class tasks.node_classification.NodeClassification(args, dataset=None, model=None)[source]¶
Bases: tasks.BaseTask
Node classification task.
tasks.node_classification_sampling¶
Module Contents¶
tasks.node_classification_sampling.get_batches(train_nodes, train_labels, batch_size=64, shuffle=True)[source]¶

class tasks.node_classification_sampling.NodeClassificationSampling(args)[source]¶
Bases: tasks.BaseTask
Node classification task with sampling.

tasks.unsupervised_graph_classification¶
Module Contents¶
class tasks.unsupervised_graph_classification.UnsupervisedGraphClassification(args)[source]¶
Bases: tasks.BaseTask
Unsupervised graph classification

tasks.unsupervised_node_classification¶
Module Contents¶
class tasks.unsupervised_node_classification.UnsupervisedNodeClassification(args)[source]¶
Bases: tasks.BaseTask
Node classification task.
Package Contents¶
Functions¶
class tasks.BaseTask(args)[source]¶
Bases: object
static add_args(parser)¶ Add task-specific arguments to the parser.
abstract train(self, num_epoch)¶

tasks.register_task(name)[source]¶
New task types can be added to cogdl with the register_task() function decorator. For example:
@register_task('node_classification')
class NodeClassification(BaseTask):
    (...)
Args: name (str): the name of the task
datasets¶
Submodules¶
datasets.dgl_data¶
datasets.gatne¶
Module Contents¶
class datasets.gatne.GatneDataset(root, name)[source]¶
Bases: cogdl.data.Dataset
The network datasets “Amazon”, “Twitter” and “YouTube” from the “Representation Learning for Attributed Multiplex Heterogeneous Network” paper.
Args: root (string): Root directory where the dataset should be saved. name (string): The name of the dataset ("Amazon", "Twitter", "YouTube").

class datasets.gatne.AmazonDataset[source]¶
Bases: datasets.gatne.GatneDataset
The network datasets “Amazon”, “Twitter” and “YouTube” from the “Representation Learning for Attributed Multiplex Heterogeneous Network” paper.
Args: root (string): Root directory where the dataset should be saved. name (string): The name of the dataset ("Amazon", "Twitter", "YouTube").

class datasets.gatne.TwitterDataset[source]¶
Bases: datasets.gatne.GatneDataset
The network datasets “Amazon”, “Twitter” and “YouTube” from the “Representation Learning for Attributed Multiplex Heterogeneous Network” paper.
Args: root (string): Root directory where the dataset should be saved. name (string): The name of the dataset ("Amazon", "Twitter", "YouTube").

class datasets.gatne.YouTubeDataset[source]¶
Bases: datasets.gatne.GatneDataset
The network datasets “Amazon”, “Twitter” and “YouTube” from the “Representation Learning for Attributed Multiplex Heterogeneous Network” paper.
Args: root (string): Root directory where the dataset should be saved. name (string): The name of the dataset ("Amazon", "Twitter", "YouTube").
datasets.gcc_data¶
datasets.gtn_data¶
Module Contents¶
datasets.gtn_data.untar(path, fname, deleteTar=True)[source]¶
Unpacks the given archive file to the same directory, then (by default) deletes the archive file.

class datasets.gtn_data.GTNDataset(root, name)[source]¶
Bases: cogdl.data.Dataset
The network datasets “ACM”, “DBLP” and “IMDB” from the “Graph Transformer Networks” paper.
Args: root (string): Root directory where the dataset should be saved. name (string): The name of the dataset ("gtn-acm", "gtn-dblp", "gtn-imdb").

class datasets.gtn_data.ACM_GTNDataset[source]¶
Bases: datasets.gtn_data.GTNDataset
The network datasets “ACM”, “DBLP” and “IMDB” from the “Graph Transformer Networks” paper.
Args: root (string): Root directory where the dataset should be saved. name (string): The name of the dataset ("gtn-acm", "gtn-dblp", "gtn-imdb").

class datasets.gtn_data.DBLP_GTNDataset[source]¶
Bases: datasets.gtn_data.GTNDataset
The network datasets “ACM”, “DBLP” and “IMDB” from the “Graph Transformer Networks” paper.
Args: root (string): Root directory where the dataset should be saved. name (string): The name of the dataset ("gtn-acm", "gtn-dblp", "gtn-imdb").

class datasets.gtn_data.IMDB_GTNDataset[source]¶
Bases: datasets.gtn_data.GTNDataset
The network datasets “ACM”, “DBLP” and “IMDB” from the “Graph Transformer Networks” paper.
Args: root (string): Root directory where the dataset should be saved. name (string): The name of the dataset ("gtn-acm", "gtn-dblp", "gtn-imdb").
datasets.han_data¶
Module Contents¶
datasets.han_data.untar(path, fname, deleteTar=True)[source]¶
Unpacks the given archive file to the same directory, then (by default) deletes the archive file.

class datasets.han_data.HANDataset(root, name)[source]¶
Bases: cogdl.data.Dataset
The network datasets “ACM”, “DBLP” and “IMDB” from the “Heterogeneous Graph Attention Network” paper.
Args: root (string): Root directory where the dataset should be saved. name (string): The name of the dataset ("han-acm", "han-dblp", "han-imdb").

class datasets.han_data.ACM_HANDataset[source]¶
Bases: datasets.han_data.HANDataset
The network datasets “ACM”, “DBLP” and “IMDB” from the “Heterogeneous Graph Attention Network” paper.
Args: root (string): Root directory where the dataset should be saved. name (string): The name of the dataset ("han-acm", "han-dblp", "han-imdb").

class datasets.han_data.DBLP_HANDataset[source]¶
Bases: datasets.han_data.HANDataset
The network datasets “ACM”, “DBLP” and “IMDB” from the “Heterogeneous Graph Attention Network” paper.
Args: root (string): Root directory where the dataset should be saved. name (string): The name of the dataset ("han-acm", "han-dblp", "han-imdb").

class datasets.han_data.IMDB_HANDataset[source]¶
Bases: datasets.han_data.HANDataset
The network datasets “ACM”, “DBLP” and “IMDB” from the “Heterogeneous Graph Attention Network” paper.
Args: root (string): Root directory where the dataset should be saved. name (string): The name of the dataset ("han-acm", "han-dblp", "han-imdb").
datasets.kg_data
¶
Module Contents¶
|
datasets.matlab_matrix
¶
Module Contents¶
networks from the http://leitang.net/code/social-dimension/data/ or http://snap.stanford.edu/node2vec/ |
|
networks from the http://leitang.net/code/social-dimension/data/ or http://snap.stanford.edu/node2vec/ |
|
networks from the http://leitang.net/code/social-dimension/data/ or http://snap.stanford.edu/node2vec/ |
|
networks from the http://leitang.net/code/social-dimension/data/ or http://snap.stanford.edu/node2vec/ |
|
networks from the http://leitang.net/code/social-dimension/data/ or http://snap.stanford.edu/node2vec/ |
-
class
datasets.matlab_matrix.
MatlabMatrix
(root, name, url)[source]¶ Bases:
cogdl.data.Dataset
networks from the http://leitang.net/code/social-dimension/data/ or http://snap.stanford.edu/node2vec/
- Args:
root (string): Root directory where the dataset should be saved. name (string): The name of the dataset (
"Blogcatalog"
).
-
class
datasets.matlab_matrix.
BlogcatalogDataset
[source]¶ Bases:
datasets.matlab_matrix.MatlabMatrix
networks from the http://leitang.net/code/social-dimension/data/ or http://snap.stanford.edu/node2vec/
- Args:
root (string): Root directory where the dataset should be saved. name (string): The name of the dataset (
"Blogcatalog"
).
-
class
datasets.matlab_matrix.
FlickrDataset
[source]¶ Bases:
datasets.matlab_matrix.MatlabMatrix
networks from the http://leitang.net/code/social-dimension/data/ or http://snap.stanford.edu/node2vec/
- Args:
root (string): Root directory where the dataset should be saved. name (string): The name of the dataset (
"Blogcatalog"
).
-
class
datasets.matlab_matrix.
WikipediaDataset
[source]¶ Bases:
datasets.matlab_matrix.MatlabMatrix
networks from the http://leitang.net/code/social-dimension/data/ or http://snap.stanford.edu/node2vec/
- Args:
root (string): Root directory where the dataset should be saved. name (string): The name of the dataset (
"Blogcatalog"
).
-
class
datasets.matlab_matrix.
PPIDataset
[source]¶ Bases:
datasets.matlab_matrix.MatlabMatrix
networks from the http://leitang.net/code/social-dimension/data/ or http://snap.stanford.edu/node2vec/
- Args:
root (string): Root directory where the dataset should be saved. name (string): The name of the dataset ("Blogcatalog").
datasets.pyg
¶
Package Contents¶
Functions¶
-
datasets.
register_dataset
(name)[source]¶ New dataset types can be added to cogdl with the
register_dataset()
function decorator. For example:

@register_dataset('my_dataset')
class MyDataset():
    (...)
- Args:
name (str): the name of the dataset
models
¶
Subpackages¶
models.emb
¶
Submodules¶
models.emb.deepwalk
¶ The DeepWalk model from the “DeepWalk: Online Learning of Social Representations” paper.
-
class
models.emb.deepwalk.
DeepWalk
(dimension, walk_length, walk_num, window_size, worker, iteration)[source]¶ Bases:
models.BaseModel
The DeepWalk model from the “DeepWalk: Online Learning of Social Representations” paper
- Args:
hidden_size (int) : The dimension of node representation. walk_length (int) : The walk length. walk_num (int) : The number of walks to sample for each node. window_size (int) : The actual context size which is considered in the language model. worker (int) : The number of workers for word2vec. iteration (int) : The number of training iterations in word2vec.
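For example, a minimal sketch of computing DeepWalk embeddings for a NetworkX graph (the import path and the train(G) interface returning one embedding row per node are assumptions here, not guaranteed by this reference):

import networkx as nx
from cogdl.models.emb.deepwalk import DeepWalk  # assumed import path

G = nx.karate_club_graph()
model = DeepWalk(dimension=64, walk_length=40, walk_num=10,
                 window_size=5, worker=4, iteration=10)
embeddings = model.train(G)  # assumed to return a (num_nodes, dimension) matrix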
models.emb.dgk
¶ The DeepGraphKernel model from the “Deep Graph Kernels” paper.
-
class
models.emb.dgk.
DeepGraphKernel
(hidden_dim, min_count, window_size, sampling_rate, rounds, epoch, alpha, n_workers=4)[source]¶ Bases:
models.BaseModel
The DeepGraphKernel model from the “Deep Graph Kernels” paper.
- Args:
hidden_size (int) : The dimension of node representation. min_count (int) : Parameter in word2vec. window (int) : The actual context size which is considered in the language model. sampling_rate (float) : Parameter in word2vec. iteration (int) : The number of iterations in the WL method. epoch (int) : The number of training epochs. alpha (float) : The learning rate of word2vec.
models.emb.dngr
¶ The DNGR model from the “Deep Neural Networks for Learning Graph Representations” paper.
-
class
models.emb.dngr.
DNGR_layer
(num_node, hidden_size1, hidden_size2)[source]¶ Bases:
torch.nn.Module
-
class
models.emb.dngr.
DNGR
(hidden_size1, hidden_size2, noise, alpha, step, max_epoch, lr, cpu)[source]¶ Bases:
models.BaseModel
The DNGR model from the “Deep Neural Networks for Learning Graph Representations” paper
- Args:
hidden_size1 (int) : The size of the first hidden layer. hidden_size2 (int) : The size of the second hidden layer. noise (float) : Denoising rate of the DAE. alpha (float) : Parameter in DNGR. step (int) : The max step in random surfing. max_epoch (int) : The max number of epochs in the training step. lr (float) : Learning rate in DNGR.
models.emb.gatne
¶ The GATNE model from the “Representation Learning for Attributed Multiplex Heterogeneous Network” paper.
-
class
models.emb.gatne.
GATNE
(dimension, walk_length, walk_num, window_size, worker, epoch, batch_size, edge_dim, att_dim, negative_samples, neighbor_samples, schema)[source]¶ Bases:
models.BaseModel
The GATNE model from the “Representation Learning for Attributed Multiplex Heterogeneous Network” paper
- Args:
walk_length (int) : The walk length. walk_num (int) : The number of walks to sample for each node. window_size (int) : The actual context size which is considered in the language model. worker (int) : The number of workers for word2vec. epoch (int) : The number of training epochs. batch_size (int) : The size of each training batch. edge_dim (int) : Number of edge embedding dimensions. att_dim (int) : Number of attention dimensions. negative_samples (int) : Negative samples for optimization. neighbor_samples (int) : Neighbor samples for aggregation. schema (str) : The metapath schema used in the model. Metapaths are separated with ",", and node types within each metapath are connected with "-". For example: "0-1-0,0-1-2-1-0".
-
class
models.emb.gatne.
GATNEModel
(num_nodes, embedding_size, embedding_u_size, edge_type_count, dim_a)[source]¶ Bases:
torch.nn.Module
-
class
models.emb.gatne.
NSLoss
(num_nodes, num_sampled, embedding_size)[source]¶ Bases:
torch.nn.Module
models.emb.graph2vec
¶ The Graph2Vec model from the “graph2vec: Learning Distributed Representations of Graphs” paper.
-
class
models.emb.graph2vec.
Graph2Vec
(dimension, min_count, window_size, dm, sampling_rate, rounds, epoch, lr, worker=4)[source]¶ Bases:
models.BaseModel
The Graph2Vec model from the “graph2vec: Learning Distributed Representations of Graphs” paper
- Args:
hidden_size (int) : The dimension of node representation. min_count (int) : Parameter in doc2vec. window_size (int) : The actual context size which is considered in the language model. sampling_rate (float) : Parameter in doc2vec. dm (int) : Parameter in doc2vec. iteration (int) : The number of iterations in the WL method. epoch (int) : The max number of epochs in the training step. lr (float) : Learning rate in doc2vec.
models.emb.grarep
¶ The GraRep model from the “Grarep: Learning graph representations with global structural information” paper.
-
class
models.emb.grarep.
GraRep
(dimension, step)[source]¶ Bases:
models.BaseModel
The GraRep model from the “Grarep: Learning graph representations with global structural information” paper.
- Args:
hidden_size (int) : The dimension of node representation. step (int) : The maximum order of the transition probability.
models.emb.hin2vec
¶ The Hin2vec model from the “HIN2Vec: Explore Meta-paths in Heterogeneous Information Networks for Representation Learning” paper.
-
class
models.emb.hin2vec.
Hin2vec_layer
(num_node, num_relation, hidden_size, cpu)[source]¶ Bases:
torch.nn.Module
-
class
models.emb.hin2vec.
Hin2vec
(hidden_dim, walk_length, walk_num, batch_size, hop, negative, epoches, lr, cpu=True)[source]¶ Bases:
models.BaseModel
The Hin2vec model from the “HIN2Vec: Explore Meta-paths in Heterogeneous Information Networks for Representation Learning” paper.
- Args:
hidden_size (int) : The dimension of node representation. walk_length (int) : The walk length. walk_num (int) : The number of walks to sample for each node. batch_size (int) : The batch size of training in Hin2vec. hop (int) : The number of hops to construct training samples in Hin2vec. negative (int) : The number of negative samples for each meta-path pair. epoches (int) : The number of training epochs. lr (float) : The initial learning rate of SGD. cpu (bool) : Whether to use CPU or GPU to train Hin2vec.
models.emb.hope
¶ The HOPE model from the “Asymmetric Transitivity Preserving Graph Embedding” paper.
-
class
models.emb.hope.
HOPE
(dimension, beta)[source]¶ Bases:
models.BaseModel
The HOPE model from the “Asymmetric Transitivity Preserving Graph Embedding” paper.
- Args:
hidden_size (int) : The dimension of node representation. beta (float) : Parameter in Katz decomposition.
models.emb.line
¶ The LINE model from the “Line: Large-scale information network embedding” paper.
-
class
models.emb.line.
LINE
(dimension, walk_length, walk_num, negative, batch_size, alpha, order)[source]¶ Bases:
models.BaseModel
The LINE model from the “Line: Large-scale information network embedding” paper.
- Args:
hidden_size (int) : The dimension of node representation. walk_length (int) : The walk length. walk_num (int) : The number of walks to sample for each node. negative (int) : The number of negative samples for each edge. batch_size (int) : The batch size of training in LINE. alpha (float) : The initial learning rate of SGD. order (int) : 1 preserves 1-st order proximity, 2 preserves 2-nd order proximity, and 3 preserves both (each taking dimension/2 of the node representation).
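For example, order=3 concatenates first- and second-order embeddings, each taking dimension/2 of the representation. A sketch under the assumption that LINE is importable from cogdl.models.emb.line and exposes a train(G) method:

from cogdl.models.emb.line import LINE  # assumed import path

model = LINE(dimension=128, walk_length=80, walk_num=20, negative=5,
             batch_size=1000, alpha=0.025, order=3)
# emb = model.train(G)  # assumed: first 64 dims encode 1st-order, last 64 dims 2nd-order proximity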
models.emb.metapath2vec
¶ The Metapath2vec model from the “metapath2vec: Scalable Representation Learning for Heterogeneous Networks” paper.
-
class
models.emb.metapath2vec.
Metapath2vec
(dimension, walk_length, walk_num, window_size, worker, iteration, schema)[source]¶ Bases:
models.BaseModel
The Metapath2vec model from the “metapath2vec: Scalable Representation Learning for Heterogeneous Networks” paper
- Args:
hidden_size (int) : The dimension of node representation. walk_length (int) : The walk length. walk_num (int) : The number of walks to sample for each node. window_size (int) : The actual context size which is considered in the language model. worker (int) : The number of workers for word2vec. iteration (int) : The number of training iterations in word2vec. schema (str) : The metapath schema used in the model. Metapaths are separated with ",", and node types within each metapath are connected with "-". For example: "0-1-0,0-2-0,1-0-2-0-1".
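For example, a sketch of the schema format, where node types within a metapath are joined with "-" and metapaths are separated with "," (the import path is an assumption):

from cogdl.models.emb.metapath2vec import Metapath2vec  # assumed import path

schema = "0-1-0,0-2-0,1-0-2-0-1"  # three metapaths over node types 0, 1 and 2
model = Metapath2vec(dimension=128, walk_length=50, walk_num=10,
                     window_size=5, worker=4, iteration=5, schema=schema)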
models.emb.netmf
¶ The NetMF model from the “Network Embedding as Matrix Factorization: Unifying DeepWalk, LINE, PTE, and node2vec” paper.
-
class
models.emb.netmf.
NetMF
(dimension, window_size, rank, negative, is_large=False)[source]¶ Bases:
models.BaseModel
The NetMF model from the “Network Embedding as Matrix Factorization: Unifying DeepWalk, LINE, PTE, and node2vec” paper.
- Args:
hidden_size (int) : The dimension of node representation. window_size (int) : The actual context size which is considered in the language model. rank (int) : The rank of the approximated normalized Laplacian. negative (int) : The number of negative samples in negative sampling. is_large (bool) : When the window size is large, use the approximated DeepWalk matrix for decomposition.
models.emb.netsmf
¶ The NetSMF model from the “NetSMF: Large-Scale Network Embedding as Sparse Matrix Factorization” paper.
-
class
models.emb.netsmf.
NetSMF
(dimension, window_size, negative, num_round, worker)[source]¶ Bases:
models.BaseModel
The NetSMF model from the “NetSMF: Large-Scale Network Embedding as Sparse Matrix Factorization” paper.
- Args:
hidden_size (int) : The dimension of node representation. window_size (int) : The actual context size which is considered in the language model. negative (int) : The number of negative samples in negative sampling. num_round (int) : The number of rounds in NetSMF. worker (int) : The number of workers for NetSMF.
models.emb.node2vec
¶ The node2vec model from the “node2vec: Scalable feature learning for networks” paper.
-
class
models.emb.node2vec.
Node2vec
(dimension, walk_length, walk_num, window_size, worker, iteration, p, q)[source]¶ Bases:
models.BaseModel
The node2vec model from the “node2vec: Scalable feature learning for networks” paper
- Args:
hidden_size (int) : The dimension of node representation. walk_length (int) : The walk length. walk_num (int) : The number of walks to sample for each node. window_size (int) : The actual context size which is considered in the language model. worker (int) : The number of workers for word2vec. iteration (int) : The number of training iterations in word2vec. p (float) : Parameter in node2vec. q (float) : Parameter in node2vec.
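For example, a sketch of how the return parameter p and in-out parameter q bias the random walks (import path and train(G) interface assumed): smaller p makes walks more likely to revisit the previous node, larger q keeps them close to the start node.

from cogdl.models.emb.node2vec import Node2vec  # assumed import path

model = Node2vec(dimension=64, walk_length=40, walk_num=10, window_size=5,
                 worker=4, iteration=10, p=0.25, q=4.0)  # BFS-like, locality-biased walks
# emb = model.train(G)  # assumed interface, as with the other embedding models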
models.emb.prone
¶ The ProNE model from the “ProNE: Fast and Scalable Network Representation Learning” paper.
-
class
models.emb.prone.
ProNE
(dimension, step, mu, theta)[source]¶ Bases:
models.BaseModel
The ProNE model from the “ProNE: Fast and Scalable Network Representation Learning” paper.
- Args:
hidden_size (int) : The dimension of node representation. step (int) : The number of items in the Chebyshev expansion. mu (float) : Parameter in ProNE. theta (float) : Parameter in ProNE.
models.emb.pte
¶ The PTE model from the “PTE: Predictive Text Embedding through Large-scale Heterogeneous Text Networks” paper.
-
class
models.emb.pte.
PTE
(dimension, walk_length, walk_num, negative, batch_size, alpha)[source]¶ Bases:
models.BaseModel
The PTE model from the “PTE: Predictive Text Embedding through Large-scale Heterogeneous Text Networks” paper.
- Args:
hidden_size (int) : The dimension of node representation. walk_length (int) : The walk length. walk_num (int) : The number of walks to sample for each node. negative (int) : The number of negative samples for each edge. batch_size (int) : The batch size of training in PTE. alpha (float) : The initial learning rate of SGD.
models.emb.sdne
¶ The SDNE model from the “Structural Deep Network Embedding” paper.
-
class
models.emb.sdne.
SDNE_layer
(num_node, hidden_size1, hidden_size2, droput, alpha, beta, nu1, nu2)[source]¶ Bases:
torch.nn.Module
-
class
models.emb.sdne.
SDNE
(hidden_size1, hidden_size2, droput, alpha, beta, nu1, nu2, max_epoch, lr, cpu)[source]¶ Bases:
models.BaseModel
The SDNE model from the “Structural Deep Network Embedding” paper
- Args:
hidden_size1 (int) : The size of the first hidden layer. hidden_size2 (int) : The size of the second hidden layer. droput (float) : Dropout rate. alpha (float) : Trade-off parameter between the 1-st and 2-nd order objective functions in SDNE. beta (float) : Parameter of the 2-nd order objective function in SDNE. nu1 (float) : Parameter of l1 normalization in SDNE. nu2 (float) : Parameter of l2 normalization in SDNE. max_epoch (int) : The max number of epochs in the training step. lr (float) : Learning rate in SDNE. cpu (bool) : Whether to use CPU or GPU to train SDNE.
models.emb.spectral
¶ The spectral clustering model from the “Leveraging social media networks for classification” paper.
-
class
models.emb.spectral.
Spectral
(dimension)[source]¶ Bases:
models.BaseModel
The Spectral clustering model from the “Leveraging social media networks for classification” paper
- Args:
hidden_size (int) : The dimension of node representation.
models.nn
¶
Submodules¶
models.nn.asgcn
¶ Simple GCN layer, similar to https://arxiv.org/abs/1609.02907.
-
class
models.nn.asgcn.
GraphConvolution
(in_features, out_features, bias=True)[source]¶ Bases:
torch.nn.Module
Simple GCN layer, similar to https://arxiv.org/abs/1609.02907
-
class
models.nn.asgcn.
ASGCN
(num_features, num_classes, hidden_size, num_layers, dropout, sample_size)[source]¶ Bases:
models.BaseModel
models.nn.compgcn
¶
-
models.nn.compgcn.
com_mult
(a, b)[source]¶ Borrowed from https://github.com/malllabiisc/CompGCN
-
models.nn.compgcn.
conj
(a)[source]¶ Borrowed from https://github.com/malllabiisc/CompGCN
-
models.nn.compgcn.
ccorr
(a, b)[source]¶ Borrowed from https://github.com/malllabiisc/CompGCN
-
class
models.nn.compgcn.
BasesRelEmbLayer
(num_bases, num_rels, in_feats)[source]¶ Bases:
torch.nn.Module
-
class
models.nn.compgcn.
CompGCNLayer
(in_feats, out_feats, num_rels, opn='mult', num_bases=None, activation=lambda x: ..., dropout=0.0, bias=True)[source]¶ Bases:
torch.nn.Module
-
class
models.nn.compgcn.
CompGCN
(num_entities, num_rels, num_bases, in_feats, hidden_size, out_feats, layers, dropout, activation)[source]¶ Bases:
torch.nn.Module
-
class
models.nn.compgcn.
LinkPredictCompGCN
(num_entities, num_rels, hidden_size, num_bases=0, layers=1, sampling_rate=0.01, score_func='conve', penalty=0.001, dropout=0.0, lbl_smooth=0.1)[source]¶ Bases:
cogdl.layers.link_prediction_module.GNNLinkPredict
,models.BaseModel
models.nn.dgi
¶
- Row-normalize feature matrix and convert to tuple representation.
- Symmetrically normalize adjacency matrix.
- Convert a scipy sparse matrix to a torch sparse tensor.
-
models.nn.dgi.
preprocess_features
(features)[source]¶ Row-normalize feature matrix and convert to tuple representation
models.nn.dgl_gcc
¶
- One epoch training for MoCo.
-
models.nn.dgl_gcc.
_rwr_trace_to_dgl_graph
(g, seed, trace, positional_embedding_size, entire_graph=False)[source]¶
-
class
models.nn.dgl_gcc.
NodeClassificationDataset
(data, rw_hops=64, subgraph_size=64, restart_prob=0.8, positional_embedding_size=32, step_dist=[1.0, 0.0, 0.0])[source]¶ Bases:
object
models.nn.disengcn
¶ Implementation of “Disentangled Graph Convolutional Networks” <http://proceedings.mlr.press/v97/ma19a.html>.
-
class
models.nn.disengcn.
DisenGCNLayer
(in_feats, out_feats, K, iterations, tau=1.0, activation='leaky_relu')[source]¶ Bases:
torch.nn.Module
Implementation of “Disentangled Graph Convolutional Networks” <http://proceedings.mlr.press/v97/ma19a.html>.
-
class
models.nn.disengcn.
DisenGCN
(in_feats, hidden_size, num_classes, K, iterations, tau, dropout, activation)[source]¶ Bases:
models.BaseModel
models.nn.fastgcn
¶ Simple GCN layer, similar to https://arxiv.org/abs/1609.02907.
-
class
models.nn.fastgcn.
GraphConvolution
(in_features, out_features, bias=True)[source]¶ Bases:
torch.nn.Module
Simple GCN layer, similar to https://arxiv.org/abs/1609.02907
-
class
models.nn.fastgcn.
FastGCN
(num_features, num_classes, hidden_size, num_layers, dropout, sample_size)[source]¶ Bases:
models.BaseModel
models.nn.gat
¶ Simple GAT layer, similar to https://arxiv.org/abs/1710.10903.
-
class
models.nn.gat.
GraphAttentionLayer
(in_features, out_features, dropout, alpha, concat=True)[source]¶ Bases:
torch.nn.Module
Simple GAT layer, similar to https://arxiv.org/abs/1710.10903
-
class
models.nn.gat.
SpecialSpmmFunction
[source]¶ Bases:
torch.autograd.Function
Special function for only sparse region backpropagation layer.
-
class
models.nn.gat.
SpGraphAttentionLayer
(in_features, out_features, dropout, alpha, concat=True)[source]¶ Bases:
torch.nn.Module
Sparse version GAT layer, similar to https://arxiv.org/abs/1710.10903
-
class
models.nn.gat.
PetarVGAT
(nfeat, nhid, nclass, dropout, alpha, nheads)[source]¶ Bases:
models.BaseModel
-
class
models.nn.gat.
PetarVSpGAT
(nfeat, nhid, nclass, dropout, alpha, nheads)[source]¶ Bases:
models.nn.gat.PetarVGAT
The GAT model from the “Graph Attention Networks” paper
- Args:
num_features (int) : Number of input features. num_classes (int) : Number of classes. hidden_size (int) : The dimension of node representation. dropout (float) : Dropout rate for model training. alpha (float) : Coefficient of leaky_relu. nheads (int) : Number of attention heads.
models.nn.gcn
¶ Simple GCN layer, similar to https://arxiv.org/abs/1609.02907.
-
class
models.nn.gcn.
GraphConvolution
(in_features, out_features, bias=True)[source]¶ Bases:
torch.nn.Module
Simple GCN layer, similar to https://arxiv.org/abs/1609.02907
-
class
models.nn.gcn.
TKipfGCN
(nfeat, nhid, nclass, dropout)[source]¶ Bases:
models.BaseModel
The GCN model from the “Semi-Supervised Classification with Graph Convolutional Networks” paper
- Args:
num_features (int) : Number of input features. num_classes (int) : Number of classes. hidden_size (int) : The dimension of node representation. dropout (float) : Dropout rate for model training.
models.nn.gcnmix
¶
-
models.nn.gcnmix.
get_current_consistency_weight
(final_consistency_weight, rampup_starts, rampup_ends, epoch)[source]¶
-
class
models.nn.gcnmix.
BaseGNNMix
(in_feat, hidden_size, num_classes, k, temperature, alpha, dropout)[source]¶ Bases:
models.BaseModel
-
class
models.nn.gcnmix.
GCNMix
(in_feat, hidden_size, num_classes, k, temperature, alpha, rampup_starts, rampup_ends, final_consistency_weight, ema_decay, dropout)[source]¶ Bases:
models.BaseModel
models.nn.grand
¶-
class
models.nn.grand.
MLPLayer
(in_features, out_features, bias=True)[source]¶ Bases:
torch.nn.Module
-
class
models.nn.grand.
Grand
(nfeat, nhid, nclass, input_droprate, hidden_droprate, use_bn, dropnode_rate, tem, lam, order, sample, alpha)[source]¶ Bases:
models.BaseModel
models.nn.graphsage
¶
-
class
models.nn.graphsage.
Graphsage
(num_features, num_classes, hidden_size, num_layers, sample_size, dropout)[source]¶ Bases:
models.BaseModel
models.nn.mixhop
¶-
class
models.nn.mixhop.
MixHop
(num_features, num_classes, hidden_size, num_layers, dropout)[source]¶ Bases:
models.BaseModel
models.nn.mlp
¶-
class
models.nn.mlp.
MLP
(num_features, num_classes, hidden_size, num_layers, dropout)[source]¶ Bases:
models.BaseModel
models.nn.mvgrl
¶
- Row-normalize feature matrix and convert to tuple representation.
- Symmetrically normalize adjacency matrix.
- Convert a scipy sparse matrix to a torch sparse tensor.
-
models.nn.mvgrl.
preprocess_features
(features)[source]¶ Row-normalize feature matrix and convert to tuple representation
models.nn.patchy_san
¶
- Assemble neighbors for a node with a BFS strategy.
- One-dimensional WL method used for node normalization for all the subgraphs.
- Construct features for CNN.
- Construct features.
-
class
models.nn.patchy_san.
PatchySAN
(batch_size, num_features, num_classes, num_sample, stride, num_neighbor, iteration)[source]¶ Bases:
models.BaseModel
The Patchy-SAN model from the “Learning Convolutional Neural Networks for Graphs” paper.
- Args:
batch_size (int) : The batch size of training. sample (int) : Number of chosen vertices. stride (int) : Node selection stride. neighbor (int) : The number of neighbors for each node. iteration (int) : The number of training iterations.
-
models.nn.patchy_san.
assemble_neighbor
(G, node, num_neighbor, sorted_nodes)[source]¶ assemble neighbors for node with BFS strategy
-
models.nn.patchy_san.
one_dim_wl
(graph_list, init_labels, iteration=5)[source]¶ One-dimensional WL method used for node normalization for all the subgraphs.
models.nn.pyg_cheb
¶-
class
models.nn.pyg_cheb.
Chebyshev
(num_features, num_classes, hidden_size, num_layers, dropout, filter_size)[source]¶ Bases:
models.BaseModel
models.nn.pyg_dgcnn
¶ EdgeConv and DynamicGraph in paper “Dynamic Graph CNN for Learning on Point Clouds”.
-
class
models.nn.pyg_dgcnn.
DGCNN
(in_feats, hidden_dim, out_feats, k=20, dropout=0.5)[source]¶ Bases:
models.BaseModel
EdgeConv and DynamicGraph in paper “Dynamic Graph CNN for Learning on Point Clouds” <https://arxiv.org/pdf/1801.07829.pdf>__ .
- in_feats (int) : Size of each input sample.
- out_feats (int) : Size of each output sample.
- hidden_dim (int) : Dimension of hidden layer embedding.
- k (int) : Number of nearest neighbors.
models.nn.pyg_diffpool
¶ GraphSAGE from “Inductive Representation Learning on Large Graphs”.
-
class
models.nn.pyg_diffpool.
GraphSAGE
(in_feats, hidden_dim, out_feats, num_layers, dropout=0.5, normalize=False, concat=False, use_bn=False)[source]¶ Bases:
torch.nn.Module
GraphSAGE from “Inductive Representation Learning on Large Graphs”.
\[h^{(k+1)}_{\mathcal{N}(v)} = \mathrm{AGGREGATE}_{k}\left(\left\{h_{u}^{(k)}, \forall u \in \mathcal{N}(v)\right\}\right)\]
\[h^{(k+1)}_{v} = \sigma\left(\mathbf{W}^{k} \cdot \mathrm{CONCAT}\left(h_{v}^{(k)}, h_{\mathcal{N}(v)}^{(k+1)}\right)\right)\]
- Args:
in_feats (int) : Size of each input sample. hidden_dim (int) : Size of hidden layer dimension. out_feats (int) : Size of each output sample. num_layers (int) : Number of GraphSAGE layers. dropout (float, optional) : Dropout rate, default: 0.5. normalize (bool, optional) : Normalize features after each layer if True, default: False.
-
class
models.nn.pyg_diffpool.
BatchedGraphSAGE
(in_feats, out_feats, use_bn=True, self_loop=True)[source]¶ Bases:
torch.nn.Module
GraphSAGE with mini-batch
- Args:
in_feats (int) : Size of each input sample. out_feats (int) : Size of each output sample. use_bn (bool) : Apply batch normalization if True, default: True. self_loop (bool) : Add self loop if True, default: True.
-
class
models.nn.pyg_diffpool.
BatchedDiffPoolLayer
(in_feats, out_feats, assign_dim, batch_size, dropout=0.5, link_pred_loss=True, entropy_loss=True)[source]¶ Bases:
torch.nn.Module
DIFFPOOL from paper “Hierarchical Graph Representation Learning with Differentiable Pooling”.
\[X^{(l+1)} = {S^{(l)}}^{T} Z^{(l)}\]
\[A^{(l+1)} = {S^{(l)}}^{T} A^{(l)} S^{(l)}\]
\[Z^{(l)} = \mathrm{GNN}_{l,embed}(A^{(l)}, X^{(l)})\]
\[S^{(l)} = \mathrm{softmax}(\mathrm{GNN}_{l,pool}(A^{(l)}, X^{(l)}))\]
(A tensor-level sketch of these equations follows the parameter list below.)
- in_feats (int) : Size of each input sample.
- out_feats (int) : Size of each output sample.
- assign_dim (int) : Size of next adjacency matrix.
- batch_size (int) : Size of each mini-batch.
- dropout (float, optional) : Dropout rate, default: 0.5.
- link_pred_loss (bool, optional) : Use link prediction loss if True, default: True.
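A minimal tensor-level sketch of the DIFFPOOL equations above with dense matrices (illustrative only; mini-batching and the auxiliary link-prediction and entropy losses of the actual layer are omitted):

import torch

N, C, D = 100, 10, 32                    # nodes, clusters, feature dimension
A = torch.rand(N, N)                     # adjacency A^(l)
Z = torch.rand(N, D)                     # embeddings Z^(l) = GNN_embed(A, X)
S = torch.softmax(torch.rand(N, C), -1)  # assignments S^(l) = softmax(GNN_pool(A, X))

X_next = S.t() @ Z                       # X^(l+1): (C, D) pooled node features
A_next = S.t() @ A @ S                   # A^(l+1): (C, C) coarsened adjacency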
-
class
models.nn.pyg_diffpool.
BatchedDiffPool
(in_feats, next_size, emb_size, use_bn=True, self_loop=True, use_link_loss=False, use_entropy=True)[source]¶ Bases:
torch.nn.Module
DIFFPOOL layer with batch forward
- in_feats (int) : Size of each input sample.
- next_size (int) : Size of next adjacency matrix.
- emb_size (int) : Dimension of next node feature matrix.
- use_bn (bool, optional) : Apply batch normalization if True, default: True.
- self_loop (bool, optional) : Add self loop if True, default: True.
- use_link_loss (bool, optional) : Use link prediction loss if True, default: False.
- use_entropy (bool, optional) : Use entropy prediction loss if True, default: True.
-
class
models.nn.pyg_diffpool.
DiffPool
(in_feats, hidden_dim, embed_dim, num_classes, num_layers, num_pool_layers, assign_dim, pooling_ratio, batch_size, dropout=0.5, no_link_pred=True, concat=False, use_bn=False)[source]¶ Bases:
models.BaseModel
DIFFPOOL from paper Hierarchical Graph Representation Learning with Differentiable Pooling.
- in_feats (int) : Size of each input sample.
- hidden_dim (int) : Size of hidden layer dimension of GNN.
- embed_dim (int) : Size of embedded node features, output size of GNN.
- num_classes (int) : Number of target classes.
- num_layers (int) : Number of GNN layers.
- num_pool_layers (int) : Number of pooling layers.
- assign_dim (int) : Embedding size after the first pooling.
- pooling_ratio (float) : Pooling ratio of each pooling layer.
- batch_size (int) : Size of each mini-batch.
- dropout (float, optional) : Dropout rate, default: 0.5.
- no_link_pred (bool, optional) : If True, disable the link prediction loss, default: True.
models.nn.pyg_drgat
¶-
class
models.nn.pyg_drgat.
DrGAT
(num_features, num_classes, hidden_size, num_heads, dropout)[source]¶ Bases:
models.BaseModel
models.nn.pyg_drgcn
¶-
class
models.nn.pyg_drgcn.
DrGCN
(num_features, num_classes, hidden_size, num_layers, dropout)[source]¶ Bases:
models.BaseModel
models.nn.pyg_gat
¶-
class
models.nn.pyg_gat.
GAT
(num_features, num_classes, hidden_size, num_heads, dropout)[source]¶ Bases:
models.BaseModel
models.nn.pyg_gcn
¶-
class
models.nn.pyg_gcn.
GCN
(num_features, num_classes, hidden_size, num_layers, dropout)[source]¶ Bases:
models.BaseModel
models.nn.pyg_gin
¶ Graph Isomorphism Network layer from paper “How Powerful are Graph Neural Networks?”.
-
class
models.nn.pyg_gin.
GINLayer
(apply_func=None, eps=0, train_eps=True)[source]¶ Bases:
torch.nn.Module
Graph Isomorphism Network layer from paper “How Powerful are Graph Neural Networks?”.
\[h_i^{(l+1)} = f_\Theta \left((1 + \epsilon) h_i^{l} + \mathrm{sum}\left(\left\{h_j^{l}, j\in\mathcal{N}(i) \right\}\right)\right)\]
(A minimal dense-matrix sketch of this update follows the parameter list below.)
- apply_func (callable, optional) : Layer or function applied to update node features.
- eps (float32, optional) : Initial epsilon value.
- train_eps (bool, optional) : If True, epsilon will be a learnable parameter.
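A minimal dense-matrix sketch of this update rule, with an ordinary MLP standing in for f_Theta (illustrative only; the actual layer operates on sparse edge indices):

import torch
import torch.nn as nn

N, D = 50, 16
A = (torch.rand(N, N) > 0.9).float() * (1 - torch.eye(N))  # dense adjacency, no self-loops
H = torch.rand(N, D)                                       # node features h^(l)
eps = 0.0
f_theta = nn.Sequential(nn.Linear(D, D), nn.ReLU(), nn.Linear(D, D))

H_next = f_theta((1 + eps) * H + A @ H)  # sum over neighbors, then apply f_Theta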
-
class
models.nn.pyg_gin.
GINMLP
(in_feats, out_feats, hidden_dim, num_layers, use_bn=True, activation=None)[source]¶ Bases:
torch.nn.Module
Multilayer perceptron with batch normalization
\[x^{(i+1)} = \sigma(W^{i}x^{(i)})\]
- in_feats (int) : Size of each input sample.
- out_feats (int) : Size of each output sample.
- hidden_dim (int) : Size of hidden layer dimension.
- use_bn (bool, optional) : Apply batch normalization if True, default: True.
-
class
models.nn.pyg_gin.
GIN
(num_layers, in_feats, out_feats, hidden_dim, num_mlp_layers, eps=0, pooling='sum', train_eps=False, dropout=0.5)[source]¶ Bases:
models.BaseModel
Graph Isomorphism Network from paper “How Powerful are Graph Neural Networks?”.
- Args:
- num_layers (int) : Number of GIN layers.
- in_feats (int) : Size of each input sample.
- out_feats (int) : Size of each output sample.
- hidden_dim (int) : Size of each hidden layer dimension.
- num_mlp_layers (int) : Number of MLP layers.
- eps (float32, optional) : Initial epsilon value, default: 0.
- pooling (str, optional) : Aggregator type to use, default: sum.
- train_eps (bool, optional) : If True, epsilon will be a learnable parameter, default: False.
models.nn.pyg_gtn
¶-
class
models.nn.pyg_gtn.
GTConv
(in_channels, out_channels, num_nodes)[source]¶ Bases:
torch.nn.Module
-
class
models.nn.pyg_gtn.
GTLayer
(in_channels, out_channels, num_nodes, first=True)[source]¶ Bases:
torch.nn.Module
-
class
models.nn.pyg_gtn.
GTN
(num_edge, num_channels, w_in, w_out, num_class, num_nodes, num_layers)[source]¶ Bases:
models.BaseModel
models.nn.pyg_han
¶
models.nn.pyg_infograph
¶ Encoder used in supervised model with Set2set in paper “Order Matters: Sequence to sequence for sets”.
-
class
models.nn.pyg_infograph.
SUPEncoder
(num_features, dim, num_layers=1)[source]¶ Bases:
torch.nn.Module
Encoder used in supervised model with Set2set in paper “Order Matters: Sequence to sequence for sets” <https://arxiv.org/abs/1511.06391> and NNConv in paper “Dynamic Edge-Conditioned Filters in Convolutional Neural Networks on Graphs” <https://arxiv.org/abs/1704.02901>
-
class
models.nn.pyg_infograph.
Encoder
(in_feats, hidden_dim, num_layers=3, num_mlp_layers=2, pooling='sum')[source]¶ Bases:
torch.nn.Module
Encoder stacked with GIN layers
- in_feats (int) : Size of each input sample.
- hidden_dim (int) : Size of output embedding.
- num_layers (int, optional) : Number of GIN layers, default: 3.
- num_mlp_layers (int, optional) : Number of MLP layers for each GIN layer, default: 2.
- pooling (str, optional) : Aggregation type, default: sum.
-
class
models.nn.pyg_infograph.
FF
(in_feats, out_feats)[source]¶ Bases:
torch.nn.Module
Residual MLP layers.
\[out = \mathbf{MLP}(x) + \mathbf{Linear}(x)\]
- in_feats (int) : Size of each input sample.
- out_feats (int) : Size of each output sample.
-
class
models.nn.pyg_infograph.
InfoGraph
(in_feats, hidden_dim, out_feats, num_layers=3, unsup=True)[source]¶ Bases:
models.BaseModel
- Implementation of InfoGraph in paper “InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization” <https://openreview.net/forum?id=r1lfF2NYvH>__.
- in_feats (int) : Size of each input sample.
- out_feats (int) : Size of each output sample.
- num_layers (int, optional) : Number of MLP layers in encoder, default: 3.
- unsup (bool, optional) : Use unsupervised model if True, default: True.
models.nn.pyg_infomax
¶
-
class
models.nn.pyg_infomax.
Infomax
(num_features, num_classes, hidden_size)[source]¶ Bases:
models.BaseModel
models.nn.pyg_sortpool
¶
-
class
models.nn.pyg_sortpool.
SortPool
(in_feats, hidden_dim, num_classes, num_layers, out_channel, kernel_size, k=30, dropout=0.5)[source]¶ Bases:
models.BaseModel
Implementation of SortPool in paper “An End-to-End Deep Learning Architecture for Graph Classification” <https://www.cse.wustl.edu/~muhan/papers/AAAI_2018_DGCNN.pdf>__.
- in_feats (int) : Size of each input sample.
- out_feats (int) : Size of each output sample.
- hidden_dim (int) : Dimension of hidden layer embedding.
- num_classes (int) : Number of target classes.
- num_layers (int) : Number of graph neural network layers before pooling.
- k (int, optional) : Number of selected features to sort, default: 30.
- out_channel (int) : Number of the first convolution’s output channels.
- kernel_size (int) : Size of the first convolution’s kernel.
- dropout (float, optional) : Dropout rate, default: 0.5.
models.nn.pyg_srgcn
¶-
class
models.nn.pyg_srgcn.
SrgcnHead
(num_features, out_feats, attention, activation, normalization, nhop, subheads=2, dropout=0.5, node_dropout=0.5, alpha=0.2, concat=True)[source]¶ Bases:
nn.Module
-
class
models.nn.pyg_srgcn.
SrgcnSoftmaxHead
(num_features, out_feats, attention, activation, nhop, normalization, dropout=0.5, node_dropout=0.5, alpha=0.2)[source]¶ Bases:
nn.Module
-
class
models.nn.pyg_srgcn.
SRGCN
(num_features, hidden_size, num_classes, attention, activation, nhop, normalization, dropout, node_dropout, alpha, nhead, subheads)[source]¶ Bases:
models.BaseModel
models.nn.pyg_unet
¶-
class
models.nn.pyg_unet.
UNet
(num_features, num_classes, hidden_size, num_layers, dropout)[source]¶ Bases:
models.BaseModel
models.nn.rgcn
¶-
class
models.nn.rgcn.
RGCNLayer
(in_feats, out_feats, num_edge_types, regularizer='basis', num_bases=None, self_loop=True, dropout=0.0, self_dropout=0.0, layer_norm=True, bias=True)[source]¶ Bases:
torch.nn.Module
-
class
models.nn.rgcn.
RGCN
(in_feats, out_feats, num_layers, num_rels, regularizer='basis', num_bases=None, self_loop=True, dropout=0.0, self_dropout=0.0)[source]¶ Bases:
torch.nn.Module
-
class
models.nn.rgcn.
LinkPredictRGCN
(num_entities, num_rels, hidden_size, num_layers, regularizer='basis', num_bases=None, self_loop=True, sampling_rate=0.01, penalty=0, dropout=0.0, self_dropout=0.0)[source]¶ Bases:
cogdl.layers.link_prediction_module.GNNLinkPredict
,models.BaseModel
models.nn.unsup_graphsage
¶-
class
models.nn.unsup_graphsage.
SAGE
(num_features, hidden_size, num_layers, sample_size, dropout, walk_length, negative_samples)[source]¶ Bases:
torch.nn.Module
-
class
models.nn.unsup_graphsage.
Graphsage
(num_features, hidden_size, num_classes, num_layers, sample_size, dropout, walk_length, negative_samples, lr, epochs)[source]¶ Bases:
models.BaseModel
Submodules¶
Package Contents¶
Functions¶
- New model types can be added to cogdl with the register_model() function decorator.
- Compute utility lists for non-uniform sampling from discrete distributions.
- Draw sample from a non-uniform discrete distribution using alias sampling.
-
class
models.
BaseModel
[source]¶ Bases:
torch.nn.Module
-
static
add_args
(parser)¶ Add model-specific arguments to the parser.
-
abstract classmethod
build_model_from_args
(cls, args)¶ Build a new model instance.
-
models.
register_model
(name)[source]¶ New model types can be added to cogdl with the
register_model()
function decorator. For example:

@register_model('gat')
class GAT(BaseModel):
    (...)
- Args:
name (str): the name of the model
-
models.
alias_setup
(probs)[source]¶ Compute utility lists for non-uniform sampling from discrete distributions. Refer to https://hips.seas.harvard.edu/blog/2013/03/03/the-alias-method-efficient-sampling-with-many-discrete-outcomes/ for details
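For example, a sketch of precomputing the alias tables for a discrete distribution (the returned pair of tables and the companion alias_draw routine are assumptions here):

from cogdl.models import alias_setup  # assumed import path

probs = [0.1, 0.2, 0.7]    # discrete distribution
J, q = alias_setup(probs)  # assumed return: alias table J and probability table q
# idx = alias_draw(J, q)   # assumed companion routine: draws an index in O(1)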
Created with sphinx-autoapi