For someone who had little idea what computer science was when he entered college, Wright State University’s Matt Piekenbrock has come a long way.
The graduate computer science student just wrote a successful research proposal to Google, winning the financial support of the internet-technology giant in the highly selective Google Summer of Code 2017 competition.
Piekenbrock’s research project is designed to improve the ability of computers to think on their own, enabling them to unlock mysteries hidden in large amounts of data.
“Having a project selected by Google is a big deal. The competition is fierce, proposals are highly scrutinized, and the chances of selection are low,” said Derek Doran, assistant professor in Wright State’s Department of Computer Science and Engineering and leader of the Web and Complex Systems lab in Kno.e.sis, an Ohio Center of Excellence in knowledge-enabled computing.
Google Summer of Code is a global program focused on introducing students to open-source software development. Since its inception in 2005, the program has brought together almost 11,000 student participants and 10,000 mentors from more than 113 countries.
“It means a lot,” Piekenbrock said of his selection and the fact that Google will fund his research. “I am thrilled.”
Piekenbrock grew up in Dayton. His father is a retired Dayton policeman and his mother works for the Community Blood Center.
Throughout middle school and high school at Chaminade Julienne, Piekenbrock was heavily involved in Science Olympiad, a series of science competitions with events covering physics, epidemiology, astronomy, chemistry, meteorology, engineering and other disciplines.
He enrolled at Wright State in 2010 to study environmental science, but in 2012 took a computer programming class and fell in love with computer science.
“A computer can solve any solvable problem in the universe,” he said. “If it can be done by a human, it can be done by a computer.”
At about the same time, Piekenbrock was introduced to statistics, which he believes holds the answers to everything from climate change to the outcome of presidential elections.
“It seems like the only people that are truly equipped to make really informed decisions on these topics are statisticians because they understand the mathematical uncertainty with respect to data,” he said. “Being a statistician to me is a way of being an expert in any field.”
Along the way, Piekenbrock got a research job at the Air Force Institute of Technology. His current work fits the Web and Complex Systems lab’s portfolio of projects that develop new machine learning algorithms and methods to model and understand complexity in logical systems and data.
In 2015, he earned a bachelor’s degree in computer science with a minor in statistics from Wright State, and he is currently working on his master’s in computer science.
“I had been doing research on this one specific topic that I’ve become a little bit obsessed about,” Piekenbrock said. “So I wrote up a 14-page grant proposal and submitted it to Google Summer of Code 2017. It was very last-minute.”
Machine learning is the subfield of computer science that gives computers the ability to learn without being explicitly programmed. One type is supervised learning, in which computers are given data, such as photos of animals, along with “truth labels” identifying the animal in each photo. Machine learning algorithms, grounded in statistics and optimization, enable the computer to gradually learn the patterns that distinguish one animal from another. In unsupervised learning, no truth labels are available and the computer must infer the structure of the data statistically.
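The difference between the two kinds of learning can be illustrated with a small Python sketch. It is only an illustration and not part of Piekenbrock’s project: the animal measurements are made up, the function names are hypothetical, and the two methods shown (a nearest-centroid classifier and a two-group k-means) were chosen for brevity.

```python
# Toy, made-up measurements: (body length in cm, weight in kg).
cats = [(46.0, 4.0), (50.0, 4.5), (48.0, 5.0)]
dogs = [(90.0, 30.0), (100.0, 34.0), (95.0, 32.0)]


def mean_point(points):
    """Average a list of 2-D points coordinate by coordinate."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def fit_supervised(labeled):
    """Supervised: truth labels are given, so learn one centroid per label."""
    return {label: mean_point(pts) for label, pts in labeled.items()}


def predict(centroids, point):
    """Assign the label whose centroid is closest to the point."""
    return min(centroids, key=lambda label: sq_dist(centroids[label], point))


def fit_unsupervised(points, iterations=10):
    """Unsupervised: no labels are given; split the points into two groups."""
    # Start from the points with the smallest and largest first coordinate.
    centers = [min(points), max(points)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            nearest = 0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [mean_point(g) if g else c for g, c in zip(groups, centers)]
    return groups


# Supervised: the computer is told which animals are cats and which are dogs.
centroids = fit_supervised({"cat": cats, "dog": dogs})
label = predict(centroids, (52.0, 5.5))

# Unsupervised: the same measurements with the labels thrown away.
groups = fit_unsupervised(cats + dogs)
```

In the supervised half the computer is handed the answer for every training example; in the unsupervised half it recovers the same two groups from the measurements alone, without ever learning the words “cat” or “dog.”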
“What I pitched to Google is developing an algorithm that makes a computer more adept at learning structure automatically, having no prior knowledge,” Piekenbrock said.
His research project focuses on implementing an efficient R package that codifies recent developments in density-based clustering.
Density-based clusters are statistically significant regions where data points are packed closely together, separated by sparser regions, and they give more certainty about the structure of the underlying data manifold. R is a widely used open-source programming language for data analysis, and a package is a self-contained module that makes it easy for anyone to quickly use Piekenbrock’s work.
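To give a rough sense of what density-based clustering does, here is a minimal Python sketch of DBSCAN, a classic algorithm in this family. It is not the newer method Piekenbrock’s R package implements, which the article does not name; the sample points and the parameter values `eps` and `min_pts` are assumptions picked for illustration.

```python
import math


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Classic DBSCAN. Returns one label per point: 0, 1, ... or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not dense
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Two tight groups of points plus one isolated point (made-up data).
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (5, 5),
]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The algorithm is never told how many clusters to look for. It finds two dense groups on its own and flags the isolated point as noise, which is the kind of structure-finding “with no prior knowledge” that Piekenbrock describes.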
“I found this really cool theoretical paper, but no one could apply it to data because it was only theory,” Piekenbrock said. “I said I can program this, make it scalable, useful, open source and available for people to use. And Google liked it.”
So-called “deep learning” relies on layered artificial neural networks loosely modeled on the human brain. For example, one deep learning algorithm was used to win a world-class poker tournament. However, these models are limited to very narrow tasks.
“The theoretical impact of my project is it gives you a model that is available for many different types of situations,” said Piekenbrock.
He will begin coding (writing instructions in a language that the computer can understand) on May 30 and must submit a report to Google on his findings by Aug. 30.
“I couldn’t have done it without the amazing guidance of my adviser, Dr. Doran, and my two mentors for the project — Dr. Michael Hahsler of Southern Methodist University and Dr. Mikhail Belkin of The Ohio State University,” Piekenbrock said.
Piekenbrock will continue his research after the Google project ends. And following graduation, he would like to pursue a Ph.D. in … computer science with a focus on statistical models.