**The website is now located at [[http://graphlearning.io]]**
===== Benchmark Data Sets for Graph Kernels =====
This page collects benchmark data sets for the evaluation of graph kernels. The data sets were compiled by [[http://www.ml.informatik.tu-darmstadt.de/|Kristian Kersting]], [[staff:kriege|Nils M. Kriege]], [[staff:morris|Christopher Morris]], [[staff:mutzel|Petra Mutzel]], and [[http://sites.wustl.edu/neumann|Marion Neumann]] with partial support of the
[[http://www.dfg.de/en/|German Science Foundation]] (DFG) within the [[http://sfb876.tu-dortmund.de/index.html?p-selected=%27news%27|Collaborative Research Center
SFB 876]] "//Providing Information by Resource-Constrained Data Analysis//", [[http://sfb876.tu-dortmund.de/SPP/sfb876-a6.html|project A6]] "//Resource-efficient Graph Mining//".
* **02.03.2020:** Added three new data sets from [29].
* **14.01.2020:** Added twenty-four new data sets from [24].
* **28.08.2019:** Added twenty-two new data sets from [28].
* **09.07.2019:** Added two new data sets from [27].
* **23.10.2018:** Added five new data sets from [26].
* **13.02.2018:** Added Cuneiform data set from [25].
* **11.05.2017:** Added twelve new data sets from [24].
* **17.06.2016:** Added Synthie data set from [21].
* **10.05.2016:** Added eight new data sets from [16].
* **19.04.2016:** Added FRANKENSTEIN data set from [15].
* **13.04.2016:** Added SYNTHETICnew data set from [3,10].
* **08.04.2016:** Added six new data sets from [14].
^**Name**^**Source**^**Statistics**|||^**Labels/Attributes**|||^**Download (ZIP)**|
^ | |//Num. of Graphs//|//Num. of Classes//|//Avg. Number of Nodes//|//Avg. Number of Edges//|//Node Labels//|//Edge Labels//|//Node Attr. (Dim.)//|//Edge Attr. (Dim.)//|
^AIDS|[16,17]| 2000 |2|15.69|16.20|+|+|+ (4)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/AIDS.zip|AIDS]]|
^alchemy_dev|[29]| 99776 |R (12)|9.71|10.02|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/alchemy_dev.zip|alchemy_dev]]|
^alchemy_test|[29]| 15760 |--|11.25|11.76|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/alchemy_test.zip|alchemy_test]]|
^alchemy_valid|[29]| 3951 |R (12)|11.25|11.77|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/alchemy_valid.zip|alchemy_valid]]|
^BZR|[7]| 405 |2|35.75|38.36|+|--|+ (3)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/BZR.zip|BZR]]|
^BZR_MD|[7,23]| 306 |2|21.30|225.06|+|+|--|+ (1)|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/BZR_MD.zip|BZR_MD]]|
^COIL-DEL|[16,18]| 3900 |100| 21.54 | 54.24 |--|+|+ (2)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/COIL-DEL.zip|COIL-DEL]]|
^COIL-RAG|[16,18]| 3900 |100| 3.01 | 3.02 |--|--|+ (64)|+ (1)|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/COIL-RAG.zip|COIL-RAG]]|
^COLLAB|[14]| 5000 |3|74.49 | 2457.78|--|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/COLLAB.zip|COLLAB]]|
^COLORS-3|[27]|10500|11|61.31|91.03|--|--|+ (4)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/COLORS-3.zip|COLORS-3]]|
^COX2|[7]| 467 |2|41.22 |43.45|+|--|+ (3)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/COX2.zip|COX2]]|
^COX2_MD|[7,23]| 303 |2|26.28|335.12|+|+|--|+ (1)|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/COX2_MD.zip|COX2_MD]]|
^Cuneiform|[25]| 267 |30|21.27|44.80|+|+|+ (3)|+ (2)|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Cuneiform.zip|Cuneiform]]|
^DBLP_v1|[26]|19456|2 |10.48|19.65|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/DBLP_v1.zip|DBLP_v1]]|
^DHFR|[7]| 467 |2|42.43|44.54|+|--|+ (3)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/DHFR.zip|DHFR]]|
^DHFR_MD|[7,23]| 393 |2|23.87| 283.01|+|+|--|+ (1)|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/DHFR_MD.zip|DHFR_MD]]|
^ER_MD|[7,23]| 446 |2| 21.33| 234.85 | +|+|--|+ (1)|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/ER_MD.zip|ER_MD]]|
^DD|[6,22]| 1178 |2|284.32| 715.66|+|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/DD.zip|DD]]|
^ENZYMES|[4,5]| 600 |6|32.63 | 62.14|+|--|+ (18)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/ENZYMES.zip|ENZYMES]]|
^Fingerprint|[16,19]| 2800 |4|5.42 | 4.42|--|--|+ (2)|+ (2)|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Fingerprint.zip|Fingerprint]]|
^FIRSTMM_DB|[11,12,13]| 41 |11|1377.27| 3074.10|+|--|+ (1) |+ (2)|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/FIRSTMM_DB.zip|FIRSTMM_DB]]|
^FRANKENSTEIN|[15]| 4337 | 2 |16.90| 17.88 |--|--|+ (780) |--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/FRANKENSTEIN.zip|FRANKENSTEIN]]|
^IMDB-BINARY|[14]| 1000 |2| 19.77 | 96.53 |--|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/IMDB-BINARY.zip|IMDB-BINARY]]|
^IMDB-MULTI|[14]| 1500 |3| 13.00 | 65.94 |--|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/IMDB-MULTI.zip|IMDB-MULTI]]|
^KKI|[26]|83|2 |26.96|48.42|+|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/KKI.zip|KKI]]|
^Letter-high|[16]| 2250 |15| 4.67 |4.50 |--|--|+ (2)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Letter-high.zip|Letter-high]]|
^Letter-low|[16]| 2250 |15| 4.68 |3.13 |--|--|+ (2)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Letter-low.zip|Letter-low]]|
^Letter-med|[16]| 2250 |15| 4.67 |4.50 |--|--|+ (2)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Letter-med.zip|Letter-med]]|
^MCF-7|[28]| 27770 |2|26.39| 28.52 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/MCF-7.zip|MCF-7]]|
^MCF-7H|[28]| 27770 |2|47.30| 49.43 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/MCF-7H.zip|MCF-7H]]|
^MOLT-4|[28]| 39765 |2|26.09| 28.13 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/MOLT-4.zip|MOLT-4]]|
^MOLT-4H|[28]| 39765 |2|46.70| 48.73 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/MOLT-4H.zip|MOLT-4H]]|
^Mutagenicity|[16,20]| 4337 |2| 30.32 | 30.77 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Mutagenicity.zip|Mutagenicity]]|
^MSRC_9|[13]| 221 |8|40.58| 97.94 |+|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/MSRC_9.zip|MSRC_9]]|
^MSRC_21|[13]| 563 |20|77.52|198.32|+|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/MSRC_21.zip|MSRC_21]]|
^MSRC_21C|[13]| 209 |20|40.28 | 96.60|+|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/MSRC_21C.zip|MSRC_21C]]|
^MUTAG|[1,23]| 188 |2|17.93|19.79|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/MUTAG.zip|MUTAG]]|
^NCI1|[8,9,22]| 4110 |2|29.87|32.30|+|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/NCI1.zip|NCI1]]|
^NCI109|[8,9,22]| 4127 |2|29.68| 32.13 |+|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/NCI109.zip|NCI109]]|
^NCI-H23|[28]| 40353 |2|26.07| 28.10 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/NCI-H23.zip|NCI-H23]]|
^NCI-H23H|[28]| 40353 |2|46.67| 48.69 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/NCI-H23H.zip|NCI-H23H]]|
^OHSU|[26]|79|2 |82.01|199.66|+|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/OHSU.zip|OHSU]]|
^OVCAR-8|[28]| 40516 |2|26.07| 28.10 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/OVCAR-8.zip|OVCAR-8]]|
^OVCAR-8H|[28]| 40516 |2|46.67| 48.70 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/OVCAR-8H.zip|OVCAR-8H]]|
^P388|[28]| 41472 |2|22.11| 23.55 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/P388.zip|P388]]|
^P388H|[28]| 41472 |2|40.44| 41.88 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/P388H.zip|P388H]]|
^PC-3|[28]| 27509 |2|26.35| 28.49 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/PC-3.zip|PC-3]]|
^PC-3H|[28]| 27509 |2|47.19| 49.32 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/PC-3H.zip|PC-3H]]|
^Peking_1|[26]|85|2 |39.31|77.35|+|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Peking_1.zip|Peking_1]]|
^PTC_FM|[2,23]| 349 |2|14.11|14.48|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/PTC_FM.zip|PTC_FM]]|
^PTC_FR|[2,23]| 351 |2|14.56| 15.00|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/PTC_FR.zip|PTC_FR]]|
^PTC_MM|[2,23]| 336 |2|13.97 | 14.32|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/PTC_MM.zip|PTC_MM]]|
^PTC_MR|[2,23]| 344 |2|14.29| 14.69|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/PTC_MR.zip|PTC_MR]]|
^PROTEINS|[4,6]| 1113 |2|39.06|72.82|+|--|+ (1)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/PROTEINS.zip|PROTEINS]]|
^PROTEINS_full|[4,6]| 1113 |2|39.06|72.82|+|--|+ (29)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/PROTEINS_full.zip|PROTEINS_full]]|
^REDDIT-BINARY|[14]| 2000 |2| 429.63| 497.75 |--|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/REDDIT-BINARY.zip|REDDIT-BINARY]]|
^REDDIT-MULTI-5K|[14]| 4999 | 5 |508.52 | 594.87 |--|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/REDDIT-MULTI-5K.zip|REDDIT-MULTI-5K]]|
^REDDIT-MULTI-12K|[14]| 11929 | 11 | 391.41 | 456.89 |--|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/REDDIT-MULTI-12K.zip|REDDIT-MULTI-12K]]|
^SF-295|[28]| 40271 |2|26.06| 28.08 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/SF-295.zip|SF-295]]|
^SF-295H|[28]| 40271 |2|46.65| 48.68 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/SF-295H.zip|SF-295H]]|
^SN12C|[28]| 40004 |2|26.08| 28.11 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/SN12C.zip|SN12C]]|
^SN12CH|[28]| 40004 |2|46.69| 48.71 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/SN12CH.zip|SN12CH]]|
^SW-620|[28]| 40532 |2|26.05| 28.08 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/SW-620.zip|SW-620]]|
^SW-620H|[28]| 40532 |2|46.62| 48.65 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/SW-620H.zip|SW-620H]]|
^SYNTHETIC|[3]| 300 |2|100.00| 196.00|--|--|+ (1)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/SYNTHETIC.zip|SYNTHETIC]]|
^SYNTHETICnew|[3,10]| 300 |2|100.00| 196.25|--|--|+ (1)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/SYNTHETICnew.zip|SYNTHETICnew]]|
^Synthie|[21]| 400 |4|95.00| 172.93|--|--|+ (15)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Synthie.zip|Synthie]]|
^Tox21_AhR_training|[24]|8169|2 |18.09|18.50|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_AhR_training.zip|Tox21_AhR_training]]|
^Tox21_AhR_testing|[24]|272|2 |22.13|23.05|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_AhR_testing.zip|Tox21_AhR_testing]]|
^Tox21_AhR_evaluation|[24]|607|2 |17.64|18.06|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_AhR_evaluation.zip|Tox21_AhR_evaluation]]|
^Tox21_AR_training|[24]|9362|2 |18.39|18.84|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_AR_training.zip|Tox21_AR_training]]|
^Tox21_AR_testing|[24]|292|2 |22.35|23.32|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_AR_testing.zip|Tox21_AR_testing]]|
^Tox21_AR_evaluation|[24]|585|2 |17.99|18.45|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_AR_evaluation.zip|Tox21_AR_evaluation]]|
^Tox21_AR-LBD_training|[24]|8599|2 |17.77|18.16|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_AR-LBD_training.zip|Tox21_AR-LBD_training]]|
^Tox21_AR-LBD_testing|[24]|253|2 |21.85|22.73|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_AR-LBD_testing.zip|Tox21_AR-LBD_testing]]|
^Tox21_AR-LBD_evaluation|[24]|580|2 |17.09|17.42|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_AR-LBD_evaluation.zip|Tox21_AR-LBD_evaluation]]|
^Tox21_ARE_training|[24]|7167|2 |16.28|16.52|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ARE_training.zip|Tox21_ARE_training]]|
^Tox21_ARE_testing|[24]|234|2 |21.99|22.91|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ARE_testing.zip|Tox21_ARE_testing]]|
^Tox21_ARE_evaluation|[24]|552|2 |17.01|17.33|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ARE_evaluation.zip|Tox21_ARE_evaluation]]|
^Tox21_aromatase_training|[24]|7226|2 |17.50|17.79|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_aromatase_training.zip|Tox21_aromatase_training]]|
^Tox21_aromatase_testing|[24]|214|2 |21.65|22.36|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_aromatase_testing.zip|Tox21_aromatase_testing]]|
^Tox21_aromatase_evaluation|[24]|528|2 |16.74|16.99|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_aromatase_evaluation.zip|Tox21_aromatase_evaluation]]|
^Tox21_ATAD5_training|[24]|9091|2 |17.89|18.30|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ATAD5_training.zip|Tox21_ATAD5_training]]|
^Tox21_ATAD5_testing|[24]|272|2 |21.99|22.89|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ATAD5_testing.zip|Tox21_ATAD5_testing]]|
^Tox21_ATAD5_evaluation|[24]|619|2 |17.68|18.11|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ATAD5_evaluation.zip|Tox21_ATAD5_evaluation]]|
^Tox21_ER_training|[24]|7697|2 |17.58|17.94|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ER_training.zip|Tox21_ER_training]]|
^Tox21_ER_testing|[24]|265|2 |22.16|23.13|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ER_testing.zip|Tox21_ER_testing]]|
^Tox21_ER_evaluation|[24]|515|2 |17.66|18.10|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ER_evaluation.zip|Tox21_ER_evaluation]]|
^Tox21_ER-LBD_training|[24]|8753|2 |18.06|18.47|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ER-LBD_training.zip|Tox21_ER-LBD_training]]|
^Tox21_ER-LBD_testing|[24]|287|2 |22.28|23.23|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ER-LBD_testing.zip|Tox21_ER-LBD_testing]]|
^Tox21_ER-LBD_evaluation|[24]|599|2 |17.75|18.17|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ER-LBD_evaluation.zip|Tox21_ER-LBD_evaluation]]|
^Tox21_HSE_training|[24]|8150|2 |16.72|17.04|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_HSE_training.zip|Tox21_HSE_training]]|
^Tox21_HSE_testing|[24]|267|2 |22.07|23.00|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_HSE_testing.zip|Tox21_HSE_testing]]|
^Tox21_HSE_evaluation|[24]|607|2 |17.61|18.01|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_HSE_evaluation.zip|Tox21_HSE_evaluation]]|
^Tox21_MMP_training|[24]|7320|2 |17.49|17.83|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_MMP_training.zip|Tox21_MMP_training]]|
^Tox21_MMP_testing|[24]|238|2 |21.68|22.55|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_MMP_testing.zip|Tox21_MMP_testing]]|
^Tox21_MMP_evaluation|[24]|541|2 |16.67|16.88|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_MMP_evaluation.zip|Tox21_MMP_evaluation]]|
^Tox21_p53_training|[24]|8634|2 |17.79|18.19|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_p53_training.zip|Tox21_p53_training]]|
^Tox21_p53_testing|[24]|269|2 |22.14|23.04|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_p53_testing.zip|Tox21_p53_testing]]|
^Tox21_p53_evaluation|[24]|613|2 |17.34|17.72|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_p53_evaluation.zip|Tox21_p53_evaluation]]|
^Tox21_PPAR-gamma_training|[24]|8184|2 |17.23|17.55|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_PPAR-gamma_training.zip|Tox21_PPAR-gamma_training]]|
^Tox21_PPAR-gamma_testing|[24]|267|2 |22.04|22.93|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_PPAR-gamma_testing.zip|Tox21_PPAR-gamma_testing]]|
^Tox21_PPAR-gamma_evaluation|[24]|602|2 |17.38|17.77|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_PPAR-gamma_evaluation.zip|Tox21_PPAR-gamma_evaluation]]|
^TRIANGLES|[27]|45000|10|20.85|32.74|--|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/TRIANGLES.zip|TRIANGLES]]|
^TWITTER-Real-Graph-Partial|[26]|144033|2 |4.03|4.98|+|--|--|+ (1)|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/TWITTER-Real-Graph-Partial.zip|TWITTER-Real-Graph-Partial]]|
^UACC257|[28]| 39988 |2|26.09| 28.12 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/UACC257.zip|UACC257]]|
^UACC257H|[28]| 39988 |2|46.68| 48.71 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/UACC257H.zip|UACC257H]]|
^Yeast|[28]| 79601 |2|21.54| 22.84 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Yeast.zip|Yeast]]|
^YeastH|[28]| 79601 |2|39.44| 40.74 |+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/YeastH.zip|YeastH]]|
^ | ||||||||| |
^//All Data Sets// | |||||||||[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/DS_all.zip|DS_all]]|
In the //Num. of Classes// column, R (N) marks a regression data set with N target values (tasks) per graph.
==== File Format ====
The data sets have the following //format// (replace **DS** by the name of the data set):
Let
* n = total number of nodes
* m = total number of adjacency entries, i.e. twice the number of undirected edges (each edge is stored once per direction)
* N = number of graphs
- **DS_A.txt (m lines):** sparse (block diagonal) adjacency matrix for all graphs; each line holds one entry (row, col), i.e. a pair (node_id, node_id). //All graphs are undirected. Hence, DS_A.txt contains two entries for each edge.//
- **DS_graph_indicator.txt (n lines):** column vector of graph identifiers for all nodes of all graphs, the value in the i-th line is the graph_id of the node with node_id i
- **DS_graph_labels.txt (N lines):** class labels for all graphs in the data set, the value in the i-th line is the class label of the graph with graph_id i
- **DS_node_labels.txt (n lines):** column vector of node labels, the value in the i-th line corresponds to the node with node_id i
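The four mandatory files can be read with a few lines of standard-library Python. The following is only a minimal sketch, not an official loader: the function name ''load_dataset'', the toy data set ''TOY'', and the returned dictionary layout are made up for illustration, and it assumes that the two ids on each line of DS_A.txt are separated by a comma and that node labels and class labels are integers.

```python
from collections import defaultdict
from pathlib import Path
import tempfile


def load_dataset(folder, ds):
    """Read the four mandatory files of data set `ds` (hypothetical helper)."""
    folder = Path(folder)

    def read_lines(suffix):
        text = (folder / f"{ds}_{suffix}.txt").read_text()
        return [line.strip() for line in text.splitlines() if line.strip()]

    # graph_of[i - 1] is the graph_id of the node with node_id i (ids start at 1).
    graph_of = [int(x) for x in read_lines("graph_indicator")]
    node_labels = [int(x) for x in read_lines("node_labels")]
    graph_labels = [int(x) for x in read_lines("graph_labels")]

    nodes = defaultdict(list)  # graph_id -> list of node_ids
    for node_id, graph_id in enumerate(graph_of, start=1):
        nodes[graph_id].append(node_id)

    # Each undirected edge appears twice in DS_A.txt; store it once as (small, large).
    edges = defaultdict(set)  # graph_id -> set of undirected edges
    for line in read_lines("A"):
        u, v = (int(x) for x in line.split(","))  # assumption: comma-separated ids
        edges[graph_of[u - 1]].add((min(u, v), max(u, v)))

    return [
        {
            "label": label,
            "nodes": nodes[g],
            "node_labels": [node_labels[i - 1] for i in nodes[g]],
            "edges": sorted(edges[g]),
        }
        for g, label in enumerate(graph_labels, start=1)
    ]


# Toy data set: graph 1 is a triangle (nodes 1-3), graph 2 is a single edge (nodes 4-5).
toy = {
    "A": "1, 2\n2, 1\n2, 3\n3, 2\n1, 3\n3, 1\n4, 5\n5, 4\n",
    "graph_indicator": "1\n1\n1\n2\n2\n",
    "graph_labels": "1\n-1\n",
    "node_labels": "0\n0\n1\n2\n2\n",
}
with tempfile.TemporaryDirectory() as tmp:
    for suffix, content in toy.items():
        (Path(tmp) / f"TOY_{suffix}.txt").write_text(content)
    graphs = load_dataset(tmp, "TOY")

# Per-graph averages over the toy data set, counting each undirected edge once.
avg_nodes = sum(len(g["nodes"]) for g in graphs) / len(graphs)
avg_edges = sum(len(g["edges"]) for g in graphs) / len(graphs)
print(avg_nodes, avg_edges)
```

Storing each edge once as a sorted pair is a design choice of this sketch; code that needs both directions can iterate over the lines of DS_A.txt directly.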
There are //optional files// if the respective information is available:
* **DS_edge_labels.txt (m lines; same size as DS_A.txt):** labels for the edges in DS_A.txt
* **DS_edge_attributes.txt (m lines; same size as DS_A.txt):** attributes for the edges in DS_A.txt
* **DS_node_attributes.txt (n lines):** matrix of node attributes, the comma-separated values in the i-th line are the attribute vector of the node with node_id i
* **DS_graph_attributes.txt (N lines):** regression values for all graphs in the data set, the value in the i-th line is the attribute of the graph with graph_id i
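The optional files line up with the mandatory ones line by line, so they can be parsed independently. The sketch below is illustrative only: ''read_matrix'', ''read_edge_labels'', and the toy data set ''TOY'' are hypothetical names, and it assumes integer edge labels, comma-separated ids in DS_A.txt, and comma-separated values in multi-column attribute files.

```python
from pathlib import Path
import tempfile


def read_matrix(path):
    """Parse comma-separated numbers into one list of floats per non-empty line."""
    rows = []
    for line in Path(path).read_text().splitlines():
        if line.strip():
            rows.append([float(x) for x in line.split(",")])
    return rows


def read_edge_labels(folder, ds):
    """Map each (row, col) entry of DS_A.txt to the label on the same line number."""
    folder = Path(folder)
    entries = [
        tuple(int(x) for x in line.split(","))  # assumption: comma-separated ids
        for line in (folder / f"{ds}_A.txt").read_text().splitlines()
        if line.strip()
    ]
    labels = [
        int(line)  # assumption: integer edge labels
        for line in (folder / f"{ds}_edge_labels.txt").read_text().splitlines()
        if line.strip()
    ]
    if len(entries) != len(labels):
        raise ValueError("DS_A.txt and DS_edge_labels.txt must have the same length")
    return dict(zip(entries, labels))


# Toy data set: one path graph 1 - 2 - 3 with two attributes per node.
toy = {
    "A": "1, 2\n2, 1\n2, 3\n3, 2\n",
    "edge_labels": "0\n0\n1\n1\n",
    "node_attributes": "0.5, 1.0\n-2.0, 3.25\n0.0, 0.0\n",
    "graph_attributes": "1.5\n",
}
with tempfile.TemporaryDirectory() as tmp:
    for suffix, content in toy.items():
        (Path(tmp) / f"TOY_{suffix}.txt").write_text(content)
    edge_labels = read_edge_labels(tmp, "TOY")
    node_attributes = read_matrix(Path(tmp) / "TOY_node_attributes.txt")
    graph_attributes = read_matrix(Path(tmp) / "TOY_graph_attributes.txt")

print(edge_labels[(2, 3)], node_attributes[1], graph_attributes[0])
```

Row i - 1 of ''node_attributes'' belongs to the node with node_id i, and row i - 1 of ''graph_attributes'' to the graph with graph_id i, mirroring the line order described above.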
==== Deep Learning Libraries ====
The data sets can also be accessed through [[https://pytorch-geometric.readthedocs.io/en/latest/modules/datasets.html|PyTorch Geometric]] and the [[https://docs.dgl.ai/api/python/data.html|Deep Graph Library]].
==== Citing this Website ====
We encourage you to refer to our website at http://graphkernels.cs.tu-dortmund.de if you have used the data sets for your publication. Please use the following BibTeX citation:
@misc{KKMMN2016,
title = {Benchmark Data Sets for Graph Kernels},
author = {Kristian Kersting and Nils M. Kriege and Christopher Morris and Petra Mutzel and Marion Neumann},
year = {2016},
url = {http://graphkernels.cs.tu-dortmund.de}
}
If your bibliography style does not support the url field, you may use this alternative:
@misc{KKMMN2016,
title = {Benchmark Data Sets for Graph Kernels},
author = {Kristian Kersting and Nils M. Kriege and Christopher Morris and Petra Mutzel and Marion Neumann},
year = {2016},
note = {\url{http://graphkernels.cs.tu-dortmund.de}}
}
==== Bibliography ====
[1] Debnath, A.K., Lopez de Compadre, R.L., Debnath, G., Shusterman, A.J., and Hansch, C.
[[http://www.ncbi.nlm.nih.gov/pubmed/1995902|Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds.
Correlation with molecular orbital energies and hydrophobicity]]. J. Med. Chem. 34(2):786-797 (1991).
[2] Helma, C., King, R. D., Kramer, S., and Srinivasan, A. [[https://doi.org/10.1093/bioinformatics/17.1.107|The Predictive Toxicology Challenge 2000–2001]]. Bioinformatics, 2001, 17, 107-108. URL: [[http://www.predictive-toxicology.org/ptc|www.predictive-toxicology.org/ptc]]
[3] Feragen, A., Kasenburg, N., Petersen, J., de Bruijne, M., Borgwardt, K.M.: [[http://papers.nips.cc/paper/5155-scalable-kernels-for-graphs-with-continuous-attributes.pdf|Scalable
kernels for graphs with continuous attributes]]. In: C.J.C. Burges, L. Bottou, Z. Ghahramani, K.Q. Weinberger (eds.) NIPS, pp. 216-224 (2013).
[4] K. M. Borgwardt, C. S. Ong, S. Schoenauer, S. V. N. Vishwanathan, A. J. Smola, and H. P.
Kriegel. [[http://bioinformatics.oxfordjournals.org/content/21/suppl_1/i47.full.pdf+html|Protein function prediction via graph kernels]]. Bioinformatics, 21(Suppl 1):i47–i56,
Jun 2005.
[5] I. Schomburg, A. Chang, C. Ebeling, M. Gremse, C. Heldt, G. Huhn, and D. Schomburg. [[http://www.ncbi.nlm.nih.gov/pubmed/14681450|Brenda,
the enzyme database: updates and major new developments]]. Nucleic Acids Research, 32D:431–433, 2004.
[6] P. D. Dobson and A. J. Doig. [[http://www.ncbi.nlm.nih.gov/pubmed/12850146|Distinguishing enzyme structures from non-enzymes without
alignments]]. J. Mol. Biol., 330(4):771–783, Jul 2003.
[7] Sutherland, J. J.; O'Brien, L. A. & Weaver, D. F. [[http://www.ncbi.nlm.nih.gov/pubmed/14632439|Spline-fitting with a
genetic algorithm: a method for developing classification structure-activity
relationships]]. J. Chem. Inf. Comput. Sci., 2003, 43, 1906-1915.
[8] N. Wale and G. Karypis. [[http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4053093&tag=1|Comparison of descriptor spaces for chemical compound retrieval and
classification]]. In Proc. of ICDM, pages 678–689, Hong Kong, 2006.
[9] [[http://pubchem.ncbi.nlm.nih.gov]]
[10] [[http://image.diku.dk/aasa/papers/graphkernels_nips_erratum.pdf]]
[11] M. Neumann, P. Moreno, L. Antanas, R. Garnett, K. Kersting. [[http://www-kd.iai.uni-bonn.de/pubattachments/716/neumann2013mlg_grasping.pdf|Graph Kernels for
Object Category Prediction in Task-Dependent Robot Grasping]]. Eleventh Workshop
on Mining and Learning with Graphs (MLG-13), Chicago, Illinois, USA, 2013.
[12] [[http://www.first-mm.eu/data.html]]
[13] M. Neumann, R. Garnett, C. Bauckhage, and K. Kersting. [[http://link.springer.com/article/10.1007%2Fs10994-015-5517-9|Propagation kernels: efficient graph kernels from propagated information]]. Machine Learning, 102(2):209–245, 2016.
[14] Pinar Yanardag and S.V.N. Vishwanathan. 2015. [[http://dl.acm.org/citation.cfm?id=2783417|Deep Graph Kernels]]. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, New York, NY, USA, 1365-1374.
[15] Francesco Orsini, Paolo Frasconi, and Luc De Raedt. 2015. [[http://www.ijcai.org/Proceedings/15/Papers/528.pdf|Graph invariant kernels]]. In Proceedings of the 24th International Conference on Artificial Intelligence (IJCAI'15), Qiang Yang and Michael Wooldridge (Eds.). AAAI Press, 3756-3762.
[16] Riesen, K. and Bunke, H.: [[http://link.springer.com/chapter/10.1007%2F978-3-540-89689-0_33|IAM Graph Database Repository for Graph Based Pattern Recognition and Machine Learning]]. In: da Vitoria Lobo, N. et al. (Eds.), SSPR&SPR 2008, LNCS, vol. 5342, pp. 287-297, 2008.
[17] [[https://wiki.nci.nih.gov/display/NCIDTPdata/AIDS+Antiviral+Screen+Data|AIDS Antiviral Screen Data (2004)]]
[18] S. A. Nene, S. K. Nayar and H. Murase. [[http://www.cs.columbia.edu/CAVE/software/softlib/coil-100.php|Columbia Object Image Library (COIL-100)]], Technical Report, Department of Computer Science, Columbia University CUCS-006-96,
Feb. 1996.
[19] [[http://www.nist.gov/srd/nistsd4.cfm|NIST Special Database 4]]
[20] Jeroen Kazius, Ross McGuire, and Roberta Bursi. [[http://pubs.acs.org/doi/abs/10.1021/jm040835a|Derivation and Validation of Toxicophores for Mutagenicity Prediction]], Journal of Medicinal Chemistry 2005 48 (1), 312-320.
[21] Christopher Morris, Nils M. Kriege, Kristian Kersting, Petra Mutzel. Faster Kernels for Graphs with Continuous Attributes via Hashing, IEEE International Conference on Data Mining (ICDM) 2016
[22] Nino Shervashidze, Pascal Schweitzer, Erik Jan van Leeuwen, Kurt Mehlhorn, and Karsten M. Borgwardt. 2011. [[http://www.jmlr.org/papers/volume12/shervashidze11a/shervashidze11a.pdf|Weisfeiler-Lehman Graph Kernels]]. J. Mach. Learn. Res. 12 (November 2011), 2539-2561.
[23] Nils Kriege, Petra Mutzel. 2012. [[http://icml.cc/2012/papers/542.pdf|Subgraph Matching Kernels for Attributed Graphs]]. International Conference on Machine Learning 2012.
[24] [[https://tripod.nih.gov/tox21/challenge/data.jsp|Tox21 Data Challenge 2014]]
[25] Nils M. Kriege, Matthias Fey, Denis Fisseler, Petra Mutzel, Frank Weichert. [[http://proceedings.mlr.press/v88/kriege18a.html|Recognizing Cuneiform Signs Using Graph Based Methods]]. International Workshop on Cost-Sensitive Learning (COST), SIAM International Conference on Data Mining (SDM) 2018, 31-44, ''[[https://arxiv.org/abs/1802.05908|arXiv:1802.05908]]''.
[26] [[https://github.com/shiruipan/graph_datasets|A Repository of Benchmark Graph Datasets for Graph Classification]]
[27] Boris Knyazev, Graham W. Taylor, Mohamed R. Amer. [[https://arxiv.org/abs/1905.02850|Understanding Attention and Generalization in Graph Neural Networks]]
[28] [[https://sites.cs.ucsb.edu/~xyan/dataset.htm|Chemical DataSets]]
[29] [[https://arxiv.org/abs/1906.09427|Alchemy: A Quantum Chemistry Dataset for Benchmarking AI Models]]
==== Contact ====
If you have any questions regarding the data sets or are interested in adding your graph data, please write an email to christopher.morris{{:staff:at.gif|}}tu-dortmund.de.