**The website is now located at [[http://graphlearning.io]]**

===== Benchmark Data Sets for Graph Kernels =====

This page contains collected benchmark data sets for the evaluation of graph kernels. The data sets were collected by [[http://www.ml.informatik.tu-darmstadt.de/|Kristian Kersting]], [[staff:kriege|Nils M. Kriege]], [[staff:morris|Christopher Morris]], [[staff:mutzel|Petra Mutzel]], and [[http://sites.wustl.edu/neumann|Marion Neumann]] with partial support of the [[http://www.dfg.de/en/|German Science Foundation]] (DFG) within the [[http://sfb876.tu-dortmund.de/index.html?p-selected=%27news%27|Collaborative Research Center SFB 876]] "//Providing Information by Resource-Constrained Data Analysis//", [[http://sfb876.tu-dortmund.de/SPP/sfb876-a6.html|project A6]] "//Resource-efficient Graph Mining//".

  * **02.03.2020:** Added three new data sets from [29].
  * **14.01.2020:** Added twenty-four new data sets from [24].
  * **28.08.2019:** Added twenty-two new data sets from [28].
  * **09.07.2019:** Added two new data sets from [27].
  * **23.10.2018:** Added five new data sets from [26].
  * **13.02.2018:** Added Cuneiform data set from [25].
  * **11.05.2017:** Added twelve new data sets from [24].
  * **17.06.2016:** Added Synthie data set from [21].
  * **10.05.2016:** Added eight new data sets from [16].
  * **19.04.2016:** Added FRANKENSTEIN data set from [15].
  * **13.04.2016:** Added SYNTHETICnew data set from [3,10].
  * **08.04.2016:** Added six new data sets from [14].

^**Name**^**Source**^**Statistics**|||^**Labels/Attributes**|||^**Download (ZIP)**|
^ | |//Num. of Graphs//|//Num. of Classes//|//Avg. Number of Nodes//|//Avg. Number of Edges//|//Node Labels//|//Edge Labels//|//Node Attr. (Dim.)//|//Edge Attr. (Dim.)//|
^AIDS|[16,17]|2000|2|15.69|16.20|+|+|+ (4)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/AIDS.zip|AIDS]]|
^alchemy_dev|[29]|99776|R (12)|9.71|10.02|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/alchemy_dev.zip|alchemy_dev]]|
^alchemy_test|[29]|15760|--|11.25|11.76|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/alchemy_test.zip|alchemy_test]]|
^alchemy_valid|[29]|3951|R (12)|11.25|11.77|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/alchemy_valid.zip|alchemy_valid]]|
^BZR|[7]|405|2|35.75|38.36|+|--|+ (3)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/BZR.zip|BZR]]|
^BZR_MD|[7,23]|306|2|21.30|225.06|+|+|--|+ (1)|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/BZR_MD.zip|BZR_MD]]|
^COIL-DEL|[16,18]|3900|100|21.54|54.24|--|+|+ (2)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/COIL-DEL.zip|COIL-DEL]]|
^COIL-RAG|[16,18]|3900|100|3.01|3.02|--|--|+ (64)|+ (1)|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/COIL-RAG.zip|COIL-RAG]]|
^COLLAB|[14]|5000|3|74.49|2457.78|--|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/COLLAB.zip|COLLAB]]|
^COLORS-3|[27]|10500|11|61.31|91.03|--|--|+ (4)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/COLORS-3.zip|COLORS-3]]|
^COX2|[7]|467|2|41.22|43.45|+|--|+ (3)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/COX2.zip|COX2]]|
^COX2_MD|[7,23]|303|2|26.28|335.12|+|+|--|+ (1)|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/COX2_MD.zip|COX2_MD]]|
^Cuneiform|[25]|267|30|21.27|44.80|+|+|+ (3)|+ (2)|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Cuneiform.zip|Cuneiform]]|
^DBLP_v1|[26]|19456|2|10.48|19.65|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/DBLP_v1.zip|DBLP_v1]]|
^DHFR|[7]|467|2|42.43|44.54|+|--|+ (3)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/DHFR.zip|DHFR]]|
^DHFR_MD|[7,23]|393|2|23.87|283.01|+|+|--|+ (1)|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/DHFR_MD.zip|DHFR_MD]]|
^ER_MD|[7,23]|446|2|21.33|234.85|+|+|--|+ (1)|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/ER_MD.zip|ER_MD]]|
^DD|[6,22]|1178|2|284.32|715.66|+|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/DD.zip|DD]]|
^ENZYMES|[4,5]|600|6|32.63|62.14|+|--|+ (18)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/ENZYMES.zip|ENZYMES]]|
^Fingerprint|[16,19]|2800|4|5.42|4.42|--|--|+ (2)|+ (2)|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Fingerprint.zip|Fingerprint]]|
^FIRSTMM_DB|[11,12,13]|41|11|1377.27|3074.10|+|--|+ (1)|+ (2)|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/FIRSTMM_DB.zip|FIRSTMM_DB]]|
^FRANKENSTEIN|[15]|4337|2|16.90|17.88|--|--|+ (780)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/FRANKENSTEIN.zip|FRANKENSTEIN]]|
^IMDB-BINARY|[14]|1000|2|19.77|96.53|--|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/IMDB-BINARY.zip|IMDB-BINARY]]|
^IMDB-MULTI|[14]|1500|3|13.00|65.94|--|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/IMDB-MULTI.zip|IMDB-MULTI]]|
^KKI|[26]|83|2|26.96|48.42|+|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/KKI.zip|KKI]]|
^Letter-high|[16]|2250|15|4.67|4.50|--|--|+ (2)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Letter-high.zip|Letter-high]]|
^Letter-low|[16]|2250|15|4.68|3.13|--|--|+ (2)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Letter-low.zip|Letter-low]]|
^Letter-med|[16]|2250|15|4.67|4.50|--|--|+ (2)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Letter-med.zip|Letter-med]]|
^MCF-7|[28]|27770|2|26.39|28.52|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/MCF-7.zip|MCF-7]]|
^MCF-7H|[28]|27770|2|47.30|49.43|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/MCF-7H.zip|MCF-7H]]|
^MOLT-4|[28]|39765|2|26.09|28.13|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/MOLT-4.zip|MOLT-4]]|
^MOLT-4H|[28]|39765|2|46.70|48.73|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/MOLT-4H.zip|MOLT-4H]]|
^Mutagenicity|[16,20]|4337|2|30.32|30.77|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Mutagenicity.zip|Mutagenicity]]|
^MSRC_9|[13]|221|8|40.58|97.94|+|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/MSRC_9.zip|MSRC_9]]|
^MSRC_21|[13]|563|20|77.52|198.32|+|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/MSRC_21.zip|MSRC_21]]|
^MSRC_21C|[13]|209|20|40.28|96.60|+|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/MSRC_21C.zip|MSRC_21C]]|
^MUTAG|[1,23]|188|2|17.93|19.79|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/MUTAG.zip|MUTAG]]|
^NCI1|[8,9,22]|4110|2|29.87|32.30|+|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/NCI1.zip|NCI1]]|
^NCI109|[8,9,22]|4127|2|29.68|32.13|+|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/NCI109.zip|NCI109]]|
^NCI-H23|[28]|40353|2|26.07|28.10|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/NCI-H23.zip|NCI-H23]]|
^NCI-H23H|[28]|40353|2|46.67|48.69|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/NCI-H23H.zip|NCI-H23H]]|
^OHSU|[26]|79|2|82.01|199.66|+|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/OHSU.zip|OHSU]]|
^OVCAR-8|[28]|40516|2|26.07|28.10|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/OVCAR-8.zip|OVCAR-8]]|
^OVCAR-8H|[28]|40516|2|46.67|48.70|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/OVCAR-8H.zip|OVCAR-8H]]|
^P388|[28]|41472|2|22.11|23.55|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/P388.zip|P388]]|
^P388H|[28]|41472|2|40.44|41.88|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/P388H.zip|P388H]]|
^PC-3|[28]|27509|2|26.35|28.49|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/PC-3.zip|PC-3]]|
^PC-3H|[28]|27509|2|47.19|49.32|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/PC-3H.zip|PC-3H]]|
^Peking_1|[26]|85|2|39.31|77.35|+|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Peking_1.zip|Peking_1]]|
^PTC_FM|[2,23]|349|2|14.11|14.48|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/PTC_FM.zip|PTC_FM]]|
^PTC_FR|[2,23]|351|2|14.56|15.00|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/PTC_FR.zip|PTC_FR]]|
^PTC_MM|[2,23]|336|2|13.97|14.32|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/PTC_MM.zip|PTC_MM]]|
^PTC_MR|[2,23]|344|2|14.29|14.69|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/PTC_MR.zip|PTC_MR]]|
^PROTEINS|[4,6]|1113|2|39.06|72.82|+|--|+ (1)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/PROTEINS.zip|PROTEINS]]|
^PROTEINS_full|[4,6]|1113|2|39.06|72.82|+|--|+ (29)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/PROTEINS_full.zip|PROTEINS_full]]|
^REDDIT-BINARY|[14]|2000|2|429.63|497.75|--|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/REDDIT-BINARY.zip|REDDIT-BINARY]]|
^REDDIT-MULTI-5K|[14]|4999|5|508.52|594.87|--|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/REDDIT-MULTI-5K.zip|REDDIT-MULTI-5K]]|
^REDDIT-MULTI-12K|[14]|11929|11|391.41|456.89|--|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/REDDIT-MULTI-12K.zip|REDDIT-MULTI-12K]]|
^SF-295|[28]|40271|2|26.06|28.08|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/SF-295.zip|SF-295]]|
^SF-295H|[28]|40271|2|46.65|48.68|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/SF-295H.zip|SF-295H]]|
^SN12C|[28]|40004|2|26.08|28.11|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/SN12C.zip|SN12C]]|
^SN12CH|[28]|40004|2|46.69|48.71|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/SN12CH.zip|SN12CH]]|
^SW-620|[28]|40532|2|26.05|28.08|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/SW-620.zip|SW-620]]|
^SW-620H|[28]|40532|2|46.62|48.65|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/SW-620H.zip|SW-620H]]|
^SYNTHETIC|[3]|300|2|100.00|196.00|--|--|+ (1)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/SYNTHETIC.zip|SYNTHETIC]]|
^SYNTHETICnew|[3,10]|300|2|100.00|196.25|--|--|+ (1)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/SYNTHETICnew.zip|SYNTHETICnew]]|
^Synthie|[21]|400|4|95.00|172.93|--|--|+ (15)|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Synthie.zip|Synthie]]|
^Tox21_AhR_training|[24]|8169|2|18.09|18.50|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_AhR_training.zip|Tox21_AhR_training]]|
^Tox21_AhR_testing|[24]|272|2|22.13|23.05|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_AhR_testing.zip|Tox21_AhR_testing]]|
^Tox21_AhR_evaluation|[24]|607|2|17.64|18.06|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_AhR_evaluation.zip|Tox21_AhR_evaluation]]|
^Tox21_AR_training|[24]|9362|2|18.39|18.84|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_AR_training.zip|Tox21_AR_training]]|
^Tox21_AR_testing|[24]|292|2|22.35|23.32|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_AR_testing.zip|Tox21_AR_testing]]|
^Tox21_AR_evaluation|[24]|585|2|17.99|18.45|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_AR_evaluation.zip|Tox21_AR_evaluation]]|
^Tox21_AR-LBD_training|[24]|8599|2|17.77|18.16|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_AR-LBD_training.zip|Tox21_AR-LBD_training]]|
^Tox21_AR-LBD_testing|[24]|253|2|21.85|22.73|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_AR-LBD_testing.zip|Tox21_AR-LBD_testing]]|
^Tox21_AR-LBD_evaluation|[24]|580|2|17.09|17.42|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_AR-LBD_evaluation.zip|Tox21_AR-LBD_evaluation]]|
^Tox21_ARE_training|[24]|7167|2|16.28|16.52|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ARE_training.zip|Tox21_ARE_training]]|
^Tox21_ARE_testing|[24]|234|2|21.99|22.91|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ARE_testing.zip|Tox21_ARE_testing]]|
^Tox21_ARE_evaluation|[24]|552|2|17.01|17.33|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ARE_evaluation.zip|Tox21_ARE_evaluation]]|
^Tox21_aromatase_training|[24]|7226|2|17.50|17.79|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_aromatase_training.zip|Tox21_aromatase_training]]|
^Tox21_aromatase_testing|[24]|214|2|21.65|22.36|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_aromatase_testing.zip|Tox21_aromatase_testing]]|
^Tox21_aromatase_evaluation|[24]|528|2|16.74|16.99|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_aromatase_evaluation.zip|Tox21_aromatase_evaluation]]|
^Tox21_ATAD5_training|[24]|9091|2|17.89|18.30|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ATAD5_training.zip|Tox21_ATAD5_training]]|
^Tox21_ATAD5_testing|[24]|272|2|21.99|22.89|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ATAD5_testing.zip|Tox21_ATAD5_testing]]|
^Tox21_ATAD5_evaluation|[24]|619|2|17.68|18.11|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ATAD5_evaluation.zip|Tox21_ATAD5_evaluation]]|
^Tox21_ER_training|[24]|7697|2|17.58|17.94|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ER_training.zip|Tox21_ER_training]]|
^Tox21_ER_testing|[24]|265|2|22.16|23.13|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ER_testing.zip|Tox21_ER_testing]]|
^Tox21_ER_evaluation|[24]|515|2|17.66|18.10|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ER_evaluation.zip|Tox21_ER_evaluation]]|
^Tox21_ER-LBD_training|[24]|8753|2|18.06|18.47|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ER-LBD_training.zip|Tox21_ER-LBD_training]]|
^Tox21_ER-LBD_testing|[24]|287|2|22.28|23.23|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ER-LBD_testing.zip|Tox21_ER-LBD_testing]]|
^Tox21_ER-LBD_evaluation|[24]|599|2|17.75|18.17|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_ER-LBD_evaluation.zip|Tox21_ER-LBD_evaluation]]|
^Tox21_HSE_training|[24]|8150|2|16.72|17.04|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_HSE_training.zip|Tox21_HSE_training]]|
^Tox21_HSE_testing|[24]|267|2|22.07|23.00|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_HSE_testing.zip|Tox21_HSE_testing]]|
^Tox21_HSE_evaluation|[24]|607|2|17.61|18.01|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_HSE_evaluation.zip|Tox21_HSE_evaluation]]|
^Tox21_MMP_training|[24]|7320|2|17.49|17.83|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_MMP_training.zip|Tox21_MMP_training]]|
^Tox21_MMP_testing|[24]|238|2|21.68|22.55|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_MMP_testing.zip|Tox21_MMP_testing]]|
^Tox21_MMP_evaluation|[24]|541|2|16.67|16.88|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_MMP_evaluation.zip|Tox21_MMP_evaluation]]|
^Tox21_p53_training|[24]|8634|2|17.79|18.19|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_p53_training.zip|Tox21_p53_training]]|
^Tox21_p53_testing|[24]|269|2|22.14|23.04|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_p53_testing.zip|Tox21_p53_testing]]|
^Tox21_p53_evaluation|[24]|613|2|17.34|17.72|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_p53_evaluation.zip|Tox21_p53_evaluation]]|
^Tox21_PPAR-gamma_training|[24]|8184|2|17.23|17.55|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_PPAR-gamma_training.zip|Tox21_PPAR-gamma_training]]|
^Tox21_PPAR-gamma_testing|[24]|267|2|22.04|22.93|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_PPAR-gamma_testing.zip|Tox21_PPAR-gamma_testing]]|
^Tox21_PPAR-gamma_evaluation|[24]|602|2|17.38|17.77|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Tox21_PPAR-gamma_evaluation.zip|Tox21_PPAR-gamma_evaluation]]|
^TRIANGLES|[27]|45000|10|20.85|32.74|--|--|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/TRIANGLES.zip|TRIANGLES]]|
^TWITTER-Real-Graph-Partial|[26]|144033|2|4.03|4.98|+|--|--|+ (1)|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/TWITTER-Real-Graph-Partial.zip|TWITTER-Real-Graph-Partial]]|
^UACC257|[28]|39988|2|26.09|28.12|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/UACC257.zip|UACC257]]|
^UACC257H|[28]|39988|2|46.68|48.71|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/UACC257H.zip|UACC257H]]|
^Yeast|[28]|79601|2|21.54|22.84|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/Yeast.zip|Yeast]]|
^YeastH|[28]|79601|2|39.44|40.74|+|+|--|--|[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/YeastH.zip|YeastH]]|
^ | ||||||||| |
^//All Data Sets// | |||||||||[[https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/DS_all.zip|DS_all]]|

An entry of the form R (N) in the //Num. of Classes// column marks a regression data set with N tasks per graph.

==== File Format ====

The data sets have the following //format// (replace **DS** by the name of the data set):

Let
  * n = total number of nodes
  * m = total number of edges
  * N = number of graphs

  - **DS_A.txt (m lines):** sparse (block diagonal) adjacency matrix for all graphs; each line corresponds to (row, col), i.e., (node_id, node_id). //All graphs are undirected. Hence, DS_A.txt contains two entries for each edge.//
  - **DS_graph_indicator.txt (n lines):** column vector of graph identifiers for all nodes of all graphs; the value in the i-th line is the graph_id of the node with node_id i
  - **DS_graph_labels.txt (N lines):** class labels for all graphs in the data set; the value in the i-th line is the class label of the graph with graph_id i
  - **DS_node_labels.txt (n lines):** column vector of node labels; the value in the i-th line corresponds to the node with node_id i

There are //optional files// if the respective information is available:

  * **DS_edge_labels.txt (m lines; same size as DS_A.txt):** labels for the edges in DS_A.txt
  * **DS_edge_attributes.txt (m lines; same size as DS_A.txt):** attributes for the edges in DS_A.txt
  * **DS_node_attributes.txt (n lines):** matrix of node attributes; the comma-separated values in the i-th line are the attribute vector of the node with node_id i
  * **DS_graph_attributes.txt (N lines):** regression values for all graphs in the data set; the value in the i-th line is the attribute of the graph with graph_id i

==== Deep Learning Libraries ====

The data sets can also be accessed using [[https://pytorch-geometric.readthedocs.io/en/latest/modules/datasets.html|PyTorch Geometric]] and the [[https://docs.dgl.ai/api/python/data.html|Deep Graph Library]].

==== Citing this Website ====

If you use the data sets in a publication, we encourage you to cite our website at http://graphkernels.cs.tu-dortmund.de. Please use the following BibTeX citation:

  @misc{KKMMN2016,
    title  = {Benchmark Data Sets for Graph Kernels},
    author = {Kristian Kersting and Nils M. Kriege and Christopher Morris and Petra Mutzel and Marion Neumann},
    year   = {2016},
    url    = {http://graphkernels.cs.tu-dortmund.de}
  }

If your bibliography style does not support the url field, you may use this alternative:

  @misc{KKMMN2016,
    title  = {Benchmark Data Sets for Graph Kernels},
    author = {Kristian Kersting and Nils M. Kriege and Christopher Morris and Petra Mutzel and Marion Neumann},
    year   = {2016},
    note   = {\url{http://graphkernels.cs.tu-dortmund.de}}
  }

==== Bibliography ====

[1] Debnath, A.K., Lopez de Compadre, R.L., Debnath, G., Shusterman, A.J., and Hansch, C. [[http://www.ncbi.nlm.nih.gov/pubmed/1995902|Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. Correlation with molecular orbital energies and hydrophobicity]]. J. Med. Chem. 34(2):786-797 (1991).

[2] Helma, C., King, R. D., Kramer, S., and Srinivasan, A. [[https://doi.org/10.1093/bioinformatics/17.1.107|The Predictive Toxicology Challenge 2000–2001]]. Bioinformatics, 2001, 17, 107-108. URL: [[http://www.predictive-toxicology.org/ptc|www.predictive-toxicology.org/ptc]]

[3] Feragen, A., Kasenburg, N., Petersen, J., de Bruijne, M., Borgwardt, K.M.: [[http://papers.nips.cc/paper/5155-scalable-kernels-for-graphs-with-continuous-attributes.pdf|Scalable kernels for graphs with continuous attributes]]. In: C.J.C. Burges, L. Bottou, Z. Ghahramani, K.Q. Weinberger (eds.) NIPS, pp. 216-224 (2013).

[4] K. M. Borgwardt, C. S. Ong, S. Schoenauer, S. V. N. Vishwanathan, A. J. Smola, and H. P. Kriegel. [[http://bioinformatics.oxfordjournals.org/content/21/suppl_1/i47.full.pdf+html|Protein function prediction via graph kernels]]. Bioinformatics, 21(Suppl 1):i47–i56, Jun 2005.

[5] I. Schomburg, A. Chang, C. Ebeling, M. Gremse, C. Heldt, G. Huhn, and D. Schomburg. [[http://www.ncbi.nlm.nih.gov/pubmed/14681450|Brenda, the enzyme database: updates and major new developments]]. Nucleic Acids Research, 32D:431–433, 2004.

[6] P. D. Dobson and A. J. Doig. [[http://www.ncbi.nlm.nih.gov/pubmed/12850146|Distinguishing enzyme structures from non-enzymes without alignments]]. J. Mol. Biol., 330(4):771–783, Jul 2003.

[7] Sutherland, J. J.; O'Brien, L. A. & Weaver, D. F. [[http://www.ncbi.nlm.nih.gov/pubmed/14632439|Spline-fitting with a genetic algorithm: a method for developing classification structure-activity relationships]]. J. Chem. Inf. Comput. Sci., 2003, 43, 1906-1915.

[8] N. Wale and G. Karypis. [[http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4053093&tag=1|Comparison of descriptor spaces for chemical compound retrieval and classification]]. In Proc. of ICDM, pages 678–689, Hong Kong, 2006.

[9] [[http://pubchem.ncbi.nlm.nih.gov]]

[10] http://image.diku.dk/aasa/papers/graphkernels_nips_erratum.pdf

[11] M. Neumann, P. Moreno, L. Antanas, R. Garnett, K. Kersting. [[http://www-kd.iai.uni-bonn.de/pubattachments/716/neumann2013mlg_grasping.pdf|Graph Kernels for Object Category Prediction in Task-Dependent Robot Grasping]]. Eleventh Workshop on Mining and Learning with Graphs (MLG-13), Chicago, Illinois, USA, 2013.

[12] [[http://www.first-mm.eu/data.html]]

[13] M. Neumann, R. Garnett, C. Bauckhage, and K. Kersting. [[http://link.springer.com/article/10.1007%2Fs10994-015-5517-9|Propagation kernels: efficient graph kernels from propagated information]]. Machine Learning, 102(2):209–245, 2016.

[14] Pinar Yanardag and S.V.N. Vishwanathan. 2015. [[http://dl.acm.org/citation.cfm?id=2783417|Deep Graph Kernels]]. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, New York, NY, USA, 1365-1374.

[15] Francesco Orsini, Paolo Frasconi, and Luc De Raedt. 2015. [[http://www.ijcai.org/Proceedings/15/Papers/528.pdf|Graph invariant kernels]]. In Proceedings of the 24th International Conference on Artificial Intelligence (IJCAI'15), Qiang Yang and Michael Wooldridge (Eds.). AAAI Press, 3756-3762.

[16] Riesen, K. and Bunke, H.: [[http://link.springer.com/chapter/10.1007%2F978-3-540-89689-0_33|IAM Graph Database Repository for Graph Based Pattern Recognition and Machine Learning]]. In: da Vitoria Lobo, N. et al. (Eds.), SSPR&SPR 2008, LNCS, vol. 5342, pp. 287-297, 2008.

[17] [[https://wiki.nci.nih.gov/display/NCIDTPdata/AIDS+Antiviral+Screen+Data|AIDS Antiviral Screen Data (2004)]]

[18] S. A. Nene, S. K. Nayar and H. Murase. [[http://www.cs.columbia.edu/CAVE/software/softlib/coil-100.php|Columbia Object Image Library (COIL-100)]], Technical Report, Department of Computer Science, Columbia University CUCS-006-96, Feb. 1996.

[19] [[http://www.nist.gov/srd/nistsd4.cfm|NIST Special Database 4]]

[20] Jeroen Kazius, Ross McGuire, and Roberta Bursi. [[http://pubs.acs.org/doi/abs/10.1021/jm040835a|Derivation and Validation of Toxicophores for Mutagenicity Prediction]], Journal of Medicinal Chemistry 2005 48 (1), 312-320.

[21] Christopher Morris, Nils M. Kriege, Kristian Kersting, Petra Mutzel. Faster Kernels for Graphs with Continuous Attributes via Hashing, IEEE International Conference on Data Mining (ICDM) 2016.

[22] Nino Shervashidze, Pascal Schweitzer, Erik Jan van Leeuwen, Kurt Mehlhorn, and Karsten M. Borgwardt. 2011. [[http://www.jmlr.org/papers/volume12/shervashidze11a/shervashidze11a.pdf|Weisfeiler-Lehman Graph Kernels]]. J. Mach. Learn. Res. 12 (November 2011), 2539-2561.

[23] Nils Kriege, Petra Mutzel. 2012. [[http://icml.cc/2012/papers/542.pdf|Subgraph Matching Kernels for Attributed Graphs]]. International Conference on Machine Learning 2012.

[24] [[https://tripod.nih.gov/tox21/challenge/data.jsp|Tox21 Data Challenge 2014]]

[25] Nils M. Kriege, Matthias Fey, Denis Fisseler, Petra Mutzel, Frank Weichert. [[http://proceedings.mlr.press/v88/kriege18a.html|Recognizing Cuneiform Signs Using Graph Based Methods]]. International Workshop on Cost-Sensitive Learning (COST), SIAM International Conference on Data Mining (SDM) 2018, 31-44, ''[[https://arxiv.org/abs/1802.05908|arXiv:1802.05908]]''.

[26] [[https://github.com/shiruipan/graph_datasets|A Repository of Benchmark Graph Datasets for Graph Classification]]

[27] Boris Knyazev, Graham W. Taylor, Mohamed R. Amer. [[https://arxiv.org/abs/1905.02850|Understanding Attention and Generalization in Graph Neural Networks]]

[28] [[https://sites.cs.ucsb.edu/~xyan/dataset.htm|Chemical DataSets]]

[29] [[https://arxiv.org/abs/1906.09427|Alchemy: A Quantum Chemistry Dataset for Benchmarking AI Models]]

==== Contact ====

If you have any questions regarding the data sets or are interested in adding your graph data, please send an email to christopher.morris{{:staff:at.gif|}}tu-dortmund.de.
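
==== Example: Reading the File Format ====

The mandatory files described under //File Format// could be read roughly as follows. This is only a minimal sketch, not an official loader: the function names ''read_lines'' and ''load_dataset'' are hypothetical, the class labels are assumed to be integers, and all files are assumed to sit in one directory with the data set name as prefix. Node labels and the optional files are not read.

```python
import os


def read_lines(path):
    """Return the non-empty lines of a text file, stripped of whitespace."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def load_dataset(directory, name):
    """Read DS_A.txt, DS_graph_indicator.txt and DS_graph_labels.txt.

    Returns a dict mapping graph_id to {"label", "nodes", "edges"}.
    Node and graph ids stay 1-based, as in the files.
    """
    prefix = os.path.join(directory, name)

    # Line i (1-based) holds the graph_id of the node with node_id i.
    graph_of_node = [int(v) for v in read_lines(prefix + "_graph_indicator.txt")]

    # Line i (1-based) holds the class label of the graph with graph_id i.
    # Assumption: labels are integers (not checked against every data set).
    labels = [int(v) for v in read_lines(prefix + "_graph_labels.txt")]

    graphs = {
        graph_id: {"label": label, "nodes": [], "edges": set()}
        for graph_id, label in enumerate(labels, start=1)
    }
    for node_id, graph_id in enumerate(graph_of_node, start=1):
        graphs[graph_id]["nodes"].append(node_id)

    # Each line of DS_A.txt is "row, col". Every undirected edge appears
    # twice, so each edge is stored once as a sorted pair of node ids.
    for line in read_lines(prefix + "_A.txt"):
        u, v = (int(x) for x in line.split(","))
        graphs[graph_of_node[u - 1]]["edges"].add((min(u, v), max(u, v)))

    return graphs
```

For instance, assuming the MUTAG archive has been unpacked into a directory named ''MUTAG'', ''load_dataset("MUTAG", "MUTAG")'' would read ''MUTAG/MUTAG_A.txt'' and the two companion files. Storing each edge as a sorted pair is a design choice of this sketch; code that needs both directions can keep the raw (row, col) entries instead.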