A team from the Department of Chemistry took first place among 12 teams in a cross-campus computational science competition at the National Center for Supercomputing Applications.

Three graduate students, Jason Wu, Shruti Iyer, and Seonghwan Kim, and postdoctoral researcher Zheng Yu won the Ashby Prize in Computational Science Hackathon on April 23.

“This is a cross-campus hackathon sponsored by the NCSA,” said Prof. Nick Jackson, who encouraged the four members of his research group to enter the contest. Wu and Kim are co-advised by Prof. Charles Schroeder in Materials Science and Engineering.

“Remarkably, they managed to win first place despite not having any computer scientists on their team,” Jackson said.

The competition is co-organized by the Center for Artificial Intelligence Innovation at the NCSA on the University of Illinois Urbana-Champaign campus. The main goal of the hackathon is to let talented U. of I. students showcase their skills in a friendly competition while working on challenging problems involving computational science and machine learning using state-of-the-art computational systems at NCSA.

The competition takes place at the NCSA over a 48-hour period, during which teams use the center's state-of-the-art systems to carry out their projects. They present their final work two days later during the NCSA Student Research Conference.

Hackathon teams were challenged to build a front-end workflow management system using large language models (LLMs) and related tools to set up and execute computational workflows. Students were given access to the Delta supercomputer along with LLM access and credits.

The competition began with more than 50 participants across 12 teams. Each team was required to have at least one student from the Computer Science Department, but the chemistry team petitioned the organizers for an exception to this rule, which was granted.

Nine of the 12 teams successfully completed the hackathon and presented their results.

The chemistry team created a system called “Mol-Hunter: an artificial intelligence agent for automated molecular discovery and synthesis.” Their workflow included machine learning predictions, molecular dynamics simulations, quantum mechanical calculations, literature search, a database application programming interface, and a retrosynthetic reaction network search.

Seonghwan Kim, a third-year materials science and engineering graduate student in the Jackson and Schroeder groups, said the team entered the competition because they all wanted experience using LLMs for automated molecular discovery in their research. Large language models such as GPT-4, Kim explained, are having a major impact on chemistry, and many researchers are trying to use them to automate lab work.

“That’s what motivated us, and we enjoyed learning lots of large language models during the Hackathon and communicating with a lot of computer science students and professors,” he said.

Iyer, a second-year chemistry graduate student in the Jackson group, said the overall goal was to develop a computational workflow that could predict the physical properties of a particular molecule and then generate a synthesis pathway for that molecule.

Kim said the team had gained experience using machine learning for chemistry research in the Jackson Lab, so they could readily transfer those skills to this project, combining machine learning predictions and molecular simulations with LLMs to try to build an autonomous workflow.

One of the challenges they had to overcome, said Wu, a first-year chemistry graduate student in the Jackson and Schroeder labs, was their limited familiarity with LLMs. The competition became something of an LLM crash course for the team members, who learned a lot in just two days with technical support from NCSA staff.

Yu said it was just a really great learning environment for them.

“This is an area that we don't have high expertise in, but the fact that we were all working on it together and then the environment of only having 48 to 72 hours to complete a huge project like this, all of that together really fostered that good productive sort of working environment,” Yu said. “We went in there the first day knowing nothing about how to program the language models, but by the end, not only did we have this project, but we also learned all the tools that can be helpful to our future research.”

The team said that their project, as well as the knowledge they gained, will factor into research in their lab. In particular, the team members in the Jackson lab are interested in extending the Mol-Hunter workflow to their work on the Open Macromolecular Genome, an open polymer property database to enable generative AI in the chemical sciences.

“We think that the Mol-Hunter project could represent a new paradigm for interfacing non-expert chemical scientists with advanced machine learning architectures via chatbot-like functionality enabled by large language models,” Jackson said.

And Yu said this project could be a springboard for his career.

“I'm trying to get a job in academia, and I think this large language model can be one direction for me to work on in the following 10 years,” he said.

Kim said he really enjoyed the collaboration process with his teammates.

“This Hackathon doesn't have specific individual tasks, so we had to discuss together what to do together,” he said. “That was a fun process.”

Iyer said she also enjoyed collaborating with her teammates and the learning process.

“My part of the project was something I think a computer science student would have been able to do much quicker, because it was very heavy on data structures and algorithms. And that's not something I have studied, but being able to figure that out and implement it to the problem that was satisfying for me personally,” she said.


Researchers from the Jackson Lab, from left, Seonghwan Kim, Shruti Iyer, Jason Wu and Zheng Yu, won the Ashby Prize in Computational Science Hackathon at the National Center for Supercomputing Applications.