datasets
GATNE dataset
- class cogdl.datasets.gatne.AmazonDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.gatne.GatneDataset(root, name)[source]
Bases: Generic[torch.utils.data.dataset.T_co]
The network datasets “Amazon”, “Twitter” and “YouTube” from the “Representation Learning for Attributed Multiplex Heterogeneous Network” paper.
- Parameters
root (string) – Root directory where the dataset should be saved.
name (string) – The name of the dataset ("Amazon", "Twitter", "YouTube").
- property processed_file_names
The names of the files to find in the self.processed_dir folder in order to skip the processing.
- property raw_file_names
The names of the files to find in the self.raw_dir folder in order to skip the download.
- url = 'https://github.com/THUDM/GATNE/raw/master/data'
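The "skip the download" behaviour described for raw_file_names can be pictured with a minimal sketch. It is not cogdl's implementation: the helpers files_exist, maybe_download and fake_download, and the file name example.txt, are hypothetical and only illustrate the idea of checking for expected files before fetching them.

```python
import os
import tempfile


def files_exist(directory, file_names):
    """Return True only if every expected file is present in `directory`."""
    return len(file_names) > 0 and all(
        os.path.exists(os.path.join(directory, name)) for name in file_names
    )


def maybe_download(raw_dir, raw_file_names, download):
    """Call `download` only when some expected raw file is missing.

    Returns True if the download ran, False if it was skipped.
    """
    if files_exist(raw_dir, raw_file_names):
        return False
    os.makedirs(raw_dir, exist_ok=True)
    download(raw_dir)
    return True


def fake_download(raw_dir):
    # Hypothetical stand-in for a real download: create an empty file.
    with open(os.path.join(raw_dir, "example.txt"), "w"):
        pass


with tempfile.TemporaryDirectory() as root:
    raw_dir = os.path.join(root, "raw")
    first = maybe_download(raw_dir, ["example.txt"], fake_download)   # runs
    second = maybe_download(raw_dir, ["example.txt"], fake_download)  # skipped
```

The same check, applied to processed_file_names and the processed directory, would let the processing step be skipped in the same way.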
- class cogdl.datasets.gatne.TwitterDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
GCC dataset
- class cogdl.datasets.gcc_data.Academic_GCCDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.gcc_data.DBLPNetrep_GCCDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.gcc_data.DBLPSnap_GCCDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.gcc_data.Edgelist(root, name)[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- property num_classes
The number of classes in the dataset.
- property processed_file_names
The names of the files to find in the self.processed_dir folder in order to skip the processing.
- property raw_file_names
The names of the files to find in the self.raw_dir folder in order to skip the download.
- url = 'https://github.com/cenyk1230/gcc-data/raw/master'
- class cogdl.datasets.gcc_data.Facebook_GCCDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.gcc_data.GCCDataset(root, name)[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- property processed_file_names
The names of the files to find in the self.processed_dir folder in order to skip the processing.
- property raw_file_names
The names of the files to find in the self.raw_dir folder in order to skip the download.
- url = 'https://github.com/cenyk1230/gcc-data/raw/master'
- class cogdl.datasets.gcc_data.HIndexDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.gcc_data.IMDB_GCCDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.gcc_data.KDD_ICDM_GCCDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.gcc_data.Livejournal_GCCDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.gcc_data.PretrainDataset(name, data)[source]
Bases: object
- property num_features
- class cogdl.datasets.gcc_data.SIGIR_CIKM_GCCDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
GTN dataset
- class cogdl.datasets.gtn_data.ACM_GTNDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.gtn_data.DBLP_GTNDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.gtn_data.GTNDataset(root, name)[source]
Bases: Generic[torch.utils.data.dataset.T_co]
The network datasets “ACM”, “DBLP” and “IMDB” from the “Graph Transformer Networks” paper.
- Parameters
root (string) – Root directory where the dataset should be saved.
name (string) – The name of the dataset ("gtn-acm", "gtn-dblp", "gtn-imdb").
- property num_classes
The number of classes in the dataset.
- property processed_file_names
The names of the files to find in the self.processed_dir folder in order to skip the processing.
- property raw_file_names
The names of the files to find in the self.raw_dir folder in order to skip the download.
HAN dataset
- class cogdl.datasets.han_data.ACM_HANDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.han_data.DBLP_HANDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.han_data.HANDataset(root, name)[source]
Bases: Generic[torch.utils.data.dataset.T_co]
The network datasets “ACM”, “DBLP” and “IMDB” from the “Heterogeneous Graph Attention Network” paper.
- Parameters
root (string) – Root directory where the dataset should be saved.
name (string) – The name of the dataset ("han-acm", "han-dblp", "han-imdb").
- property num_classes
The number of classes in the dataset.
- property processed_file_names
The names of the files to find in the self.processed_dir folder in order to skip the processing.
- property raw_file_names
The names of the files to find in the self.raw_dir folder in order to skip the download.
KG dataset
- class cogdl.datasets.kg_data.BidirectionalOneShotIterator(dataloader_head, dataloader_tail)[source]
Bases: object
- class cogdl.datasets.kg_data.FB13Datset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.kg_data.FB13SDatset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.kg_data.FB15k237Datset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.kg_data.FB15kDatset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.kg_data.KnowledgeGraphDataset(root, name)[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- property num_entities
- property num_relations
- property processed_file_names
The names of the files to find in the self.processed_dir folder in order to skip the processing.
- property raw_file_names
The names of the files to find in the self.raw_dir folder in order to skip the download.
- property test_start_idx
- property train_start_idx
- url = 'https://cloud.tsinghua.edu.cn/d/d1c733373b014efab986/files/?p=%2F{}%2F{}&dl=1'
- property valid_start_idx
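The url attribute of KnowledgeGraphDataset contains two {} placeholders. A plausible reading, stated here as an assumption rather than taken from the source, is that the first is filled with the dataset name and the second with a raw file name. The sketch below shows that substitution with str.format; build_raw_url is a hypothetical helper, and "FB15k" and "train2id.txt" are illustrative values only.

```python
# Template copied from the `url` attribute; %2F is a URL-encoded "/".
URL_TEMPLATE = (
    "https://cloud.tsinghua.edu.cn/d/d1c733373b014efab986/files/"
    "?p=%2F{}%2F{}&dl=1"
)


def build_raw_url(dataset_name, file_name):
    """Fill the two positional placeholders (assumed: dataset, then file)."""
    return URL_TEMPLATE.format(dataset_name, file_name)


# Illustrative values, not confirmed file names from the library.
example = build_raw_url("FB15k", "train2id.txt")
```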
- class cogdl.datasets.kg_data.TestDataset(triples, all_true_triples, nentity, nrelation, mode)[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.kg_data.TrainDataset(triples, nentity, nrelation, negative_sample_size, mode)[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.kg_data.WN18Datset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
Matlab matrix dataset
- class cogdl.datasets.matlab_matrix.BlogcatalogDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.matlab_matrix.DblpNEDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.matlab_matrix.FlickrDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.matlab_matrix.MatlabMatrix(root, name, url)[source]
Bases: Generic[torch.utils.data.dataset.T_co]
Networks from http://leitang.net/code/social-dimension/data/ or http://snap.stanford.edu/node2vec/
- Parameters
root (string) – Root directory where the dataset should be saved.
name (string) – The name of the dataset ("Blogcatalog").
- property num_classes
The number of classes in the dataset.
- property num_nodes
- property processed_file_names
The names of the files to find in the self.processed_dir folder in order to skip the processing.
- property raw_file_names
The names of the files to find in the self.raw_dir folder in order to skip the download.
- class cogdl.datasets.matlab_matrix.NetworkEmbeddingCMTYDataset(root, name, url)[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- property num_classes
The number of classes in the dataset.
- property num_nodes
- property processed_file_names
The names of the files to find in the self.processed_dir folder in order to skip the processing.
- property raw_file_names
The names of the files to find in the self.raw_dir folder in order to skip the download.
- class cogdl.datasets.matlab_matrix.PPIDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
OGB dataset
- class cogdl.datasets.ogb.OGBArxivDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.ogb.OGBCodeDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.ogb.OGBGDataset(root, name)[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- property num_classes
The number of classes in the dataset.
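The source does not say how num_classes is derived. The sketch below shows one common convention for labelled graph datasets, offered as an assumption and not as cogdl's code: for a flat list of integer class ids the count is the largest id plus one, and for multi-label indicator rows it is the row length. count_classes is a hypothetical helper.

```python
def count_classes(labels):
    """Infer a class count from labels (illustrative convention only).

    - flat sequence of integer class ids -> max id + 1
    - sequence of indicator rows (multi-label) -> length of one row
    """
    if not labels:
        return 0
    first = labels[0]
    if isinstance(first, (list, tuple)):
        return len(first)
    return max(labels) + 1


single = count_classes([0, 2, 1, 2])                 # ids 0, 1, 2
multi = count_classes([[1, 0, 0, 1], [0, 1, 0, 0]])  # four label columns
```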
- class cogdl.datasets.ogb.OGBLCitation2Dataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.ogb.OGBLCollabDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.ogb.OGBLDataset(root, name)[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- property processed_file_names
The names of the files to find in the self.processed_dir folder in order to skip the processing.
- class cogdl.datasets.ogb.OGBLDdiDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.ogb.OGBLPpaDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.ogb.OGBMolbaceDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.ogb.OGBMolhivDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.ogb.OGBMolpcbaDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.ogb.OGBNDataset(root, name, transform=None)[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- property processed_file_names
The names of the files to find in the self.processed_dir folder in order to skip the processing.
- class cogdl.datasets.ogb.OGBPapers100MDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
TU dataset
- class cogdl.datasets.tu_data.CollabDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.tu_data.ENZYMES(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.tu_data.ImdbBinaryDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.tu_data.ImdbMultiDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.tu_data.MUTAGDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.tu_data.NCI109Dataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.tu_data.NCI1Dataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.tu_data.PTCMRDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.tu_data.ProteinsDataset(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.tu_data.RedditBinary(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.tu_data.RedditMulti12K(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.tu_data.RedditMulti5K(data_path='data')[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- class cogdl.datasets.tu_data.TUDataset(root, name)[source]
Bases: Generic[torch.utils.data.dataset.T_co]
- property num_classes
The number of classes in the dataset.
- property processed_file_names
The names of the files to find in the self.processed_dir folder in order to skip the processing.
- property raw_file_names
The names of the files to find in the self.raw_dir folder in order to skip the download.
- url = 'https://www.chrsmrrs.com/graphkerneldatasets'
- cogdl.datasets.tu_data.parse_txt_array(src, sep=None, start=0, end=None, dtype=None, device=None)[source]
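Only the signature of parse_txt_array is given here. Judging from the parameter names, it likely splits each text line on sep, keeps the columns from start to end, and converts them to numbers. The pure-Python sketch below illustrates that reading under those assumptions. parse_txt_rows is a hypothetical stand-in that returns nested lists instead of a tensor, so the device parameter has no counterpart.

```python
def parse_txt_rows(lines, sep=None, start=0, end=None, dtype=float):
    """Split each line on `sep`, keep columns [start:end], convert with `dtype`.

    Illustrative only: returns a list of lists, not a tensor.
    """
    return [
        [dtype(token) for token in line.split(sep)[start:end]]
        for line in lines
    ]


# Skip the first column of each comma-separated line and parse the rest as int.
rows = parse_txt_rows(["1, 2, 3", "4, 5, 6"], sep=",", start=1, dtype=int)
```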
Module contents
- cogdl.datasets.register_dataset(name)[source]
New dataset types can be added to cogdl with the register_dataset() function decorator.
For example:
    @register_dataset('my_dataset')
    class MyDataset():
        (...)
- Parameters
name (str) – the name of the dataset
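A decorator of this kind is usually backed by a name-to-class registry. The following is a minimal sketch of that general pattern, not cogdl's actual implementation: DATASET_REGISTRY, the duplicate-name check and the make_dataset helper are assumptions introduced for illustration.

```python
# Hypothetical registry mapping dataset names to classes.
DATASET_REGISTRY = {}


def register_dataset(name):
    """Return a class decorator that records the class under `name`."""

    def decorator(cls):
        if name in DATASET_REGISTRY:
            raise ValueError(f"Cannot register duplicate dataset ({name})")
        DATASET_REGISTRY[name] = cls
        return cls  # the class itself is returned unchanged

    return decorator


@register_dataset("my_dataset")
class MyDataset:
    pass


def make_dataset(name):
    """Hypothetical lookup helper: instantiate the class registered as `name`."""
    return DATASET_REGISTRY[name]()


dataset = make_dataset("my_dataset")
```

Returning the class unchanged from the decorator keeps MyDataset usable directly, while the registry allows lookup by its string name.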