SIFTS database
The SIFTS database [1] contains EC annotations for entries in the Protein Data Bank (PDB). Several models have been evaluated on this database, including IEConv [2]. I downloaded the summary of EC number(s) for each processed PDB chain. In total, there are 268,992 associations between 218,471 protein chains and 3,657 EC numbers.
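The three counts above can be derived from a chain-to-EC table with a few lines of Python. The following is a minimal sketch, not the exact procedure used here: it assumes a tab-separated file with `PDB`, `CHAIN`, and `EC_NUMBER` columns and optional comment lines starting with `#`, and it assumes that duplicate (chain, EC) pairs are counted once. The function name `summarize_ec_associations` and the sample rows are made up for illustration.

```python
import csv
import io


def summarize_ec_associations(handle):
    """Return (n_associations, n_chains, n_ec_numbers) for a chain-to-EC table.

    Assumes tab-separated text with PDB, CHAIN and EC_NUMBER columns;
    lines starting with '#' are treated as comments and skipped.
    """
    data_lines = (line for line in handle if not line.startswith("#"))
    rows = csv.DictReader(data_lines, delimiter="\t")

    # A chain is identified by (PDB id, chain id); an association pairs a
    # chain with one EC number. Using a set drops duplicate pairs.
    associations = set()
    for row in rows:
        chain = (row["PDB"], row["CHAIN"])
        associations.add((chain, row["EC_NUMBER"]))

    chains = {chain for chain, _ in associations}
    ec_numbers = {ec for _, ec in associations}
    return len(associations), len(chains), len(ec_numbers)


# Made-up sample rows (not real SIFTS data), including one duplicate pair.
SAMPLE = (
    "# illustrative comment line\n"
    "PDB\tCHAIN\tACCESSION\tEC_NUMBER\n"
    "1abc\tA\tX00001\t1.1.1.1\n"
    "1abc\tA\tX00001\t2.7.11.1\n"
    "1abc\tB\tX00001\t1.1.1.1\n"
    "2xyz\tA\tX00002\t3.4.21.4\n"
    "2xyz\tA\tX00003\t3.4.21.4\n"
)

n_assoc, n_chains, n_ec = summarize_ec_associations(io.StringIO(SAMPLE))
print(n_assoc, n_chains, n_ec)
```

To run this on a downloaded file, pass an open text handle instead of the `io.StringIO` sample (for a gzip-compressed file, `gzip.open(path, "rt")` yields such a handle). A chain with several EC numbers contributes several associations, which is why the association count exceeds the chain count.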
1. Dana, J. M., Gutmanas, A., Tyagi, N., Qi, G., O’Donovan, C., Martin, M., & Velankar, S. (2019). SIFTS: updated Structure Integration with Function, Taxonomy and Sequences resource allows 40-fold increase in coverage of structure-based annotations for proteins. Nucleic Acids Research, 47(D1), D482-D489.
2. Hermosilla Casajus, P., Schäfer, M., Lang, M., Fackelmann, G., Vázquez Alcocer, P. P., Kozliková, B., … & Ropinski, T. (2021). Intrinsic-extrinsic convolution and pooling for learning on 3D protein structures. In International Conference on Learning Representations, ICLR 2021: Vienna, Austria, May 04 2021 (pp. 1-16). OpenReview.net.