RAISE: Big Data Tools: From Bioinformatics To Materials Genomics

This project is funded by NSF (1743418). The Materials Genome (MG) Initiative for Global Competitiveness, inspired by the Human Genome Project, aims to accelerate the discovery, development, manufacturing, and deployment of new and advanced materials. It is generally accepted that the goals of the Materials Genome Initiative can be achieved only if the efforts are data-driven. This project recognizes that tools developed to process and analyze big data in Biological Genomics (BG) have unique potential for the Materials Science domain. Specifically, we translate approaches and tools first developed in the context of biological genomics to materials genomics, where the combinatorial possibilities of the domain's fundamental "genomic" building blocks approach infinity.

Start Date: August 1, 2017


Sanguthevar Rajasekaran (University of Connecticut)
Rampi Ramprasad (Georgia Institute of Technology)


  • X. Xiao and S. Rajasekaran, PMGAN: A Novel Parallel Mix-Generator Generative Adversarial Network, Proc. The 27th International Conference on Artificial Neural Networks (ICANN18), Rhodes, Greece, October 4-7, 2018.
  • X. Xiao, C. Shang, and S. Rajasekaran, Atom Embedding: A Novel Paradigm for Predicting the Outcomes of Chemical Reactions, submitted for publication, 2018.
  • C. Kim, A. Chandrasekaran, A. Jha, and R. Ramprasad, Active-Learning and Materials Design: The Example of High Glass Transition Temperature Polymers, MRS Communications, 2019, doi:10.1557/mrc.2019.78.
  • A.L. Ferguson, T. Mueller, S. Rajasekaran, and B.J. Reich, Conference report: 2018 materials and data science hackathon (MATDAT18), Mol. Syst. Des. Eng., 2019, 4, 462-468.
  • X. Xiao, Z. Wang, and S. Rajasekaran, CrystalNet - Crystal Property Prediction with Structural Information and Atom Embeddings, manuscript, 2019.
  • Z. Wang and S. Rajasekaran, Efficient Randomized Feature Selection Algorithms, Proc. 21st IEEE International Conference on High Performance Computing and Communications (HPCC), 10-12 August 2019, Zhangjiajie, China.
  • X. Cai, A.-Al Mamun, and S. Rajasekaran, Efficient Algorithms for Finding the Closest l-mers in Biological Data, IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), 2018 Jun 4. doi: 10.1109/TCBB.2018.2843364.
  • X. Cai, S. Rajasekaran, and F. Zhang, Efficient Approximate Algorithms for the Closest Pair Problem in High Dimensional Spaces, Proc. The 22nd Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), June 3rd - 6th, 2018, Melbourne, Australia.
  • A.-Al Mamun, Z. Wang, X. Cai, N. Ravishanker, and S. Rajasekaran, Efficient Sequential and Parallel Algorithms for Estimating Higher Order Spectra, CoRR abs/1805.11775 (2018).
  • P. Xiao and S. Rajasekaran, Efficient exact algorithms for LDD motif search, ICCABS, 2017. An extended version has been accepted for BMC Bioinformatics, 2018.
  • P. Xiao, X. Cai, and S. Rajasekaran, Efficient Algorithms for Finding Edit-distance Based Motifs, Proc. 6th International Conference on Algorithms for Computational Biology, AlCoB, Berkeley, CA, May 28-30, 2019, Springer LNBI 11488, pp. 212-223. doi.org/10.1007/978-3-030-18174-1_16.