Programmed by the computer

Computer software written by computers: the idea is fascinating. And it is realistic, says Martin Vechev, professor of computer science. He is one of the founders of a new field of research in which computer scientists aim to largely automate programming. Tools that make software developers' work easier already exist. And soon, thanks to such assistance programs, ordinary developers will be able to program as well as only the best experts can today, says Vechev. “In ten years, automation will be so advanced that computers can autonomously write short programs,” he predicts.

This is possible thanks to machine learning and to the huge, publicly accessible software databases that already exist. Millions of computer programs are stored in public databases, totaling billions of lines of program code. "Big Code" is what Vechev calls this immense pool of program code. A human software developer quickly loses track of it all. But computers can help to evaluate this unimaginably large amount of data and make it usable.

Computers can recognize patterns in existing code and learn which patterns are used in which context. In this way, they capture not only the individual characters and commands, but also their meaning and the rules for their use. Computers learn these rules in much the same way as translation programs such as the well-known Google Translate. "These translation programs also use machine learning to analyze words in their context and draw conclusions about their meaning and use, as well as grammatical rules," explains Vechev.

Learning assistant

Future assistance programs for developers should work much like the completion functions that help us compose text messages on smartphones today. For example, a software developer writes the first hundred lines of code; the assistance program analyzes them and compares them with existing code in databases. Based on this, the computer suggests how to continue, and the developer can accept or reject each suggestion. The computer also uses this feedback to understand the programmer's intentions and to continuously improve its suggestions.
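The suggest, accept or reject cycle described above can be illustrated with a toy sketch. It is an assumption made for illustration, not the method used by Vechev's group: the class name `CompletionAssistant`, its methods and the tiny corpus are all hypothetical, and real systems analyze far more context than a line prefix.

```python
from collections import Counter


class CompletionAssistant:
    """Toy assistant (hypothetical): ranks known lines by frequency plus feedback."""

    def __init__(self, corpus_lines):
        # How often each line occurs in the existing code base.
        self.frequency = Counter(line.strip() for line in corpus_lines)
        # Net developer feedback per suggestion: +1 per accept, -1 per reject.
        self.feedback = Counter()

    def suggest(self, prefix, k=3):
        """Return up to k known lines starting with prefix, best score first."""
        candidates = [line for line in self.frequency if line.startswith(prefix)]
        # Sort by descending score; break ties alphabetically for stable output.
        candidates.sort(
            key=lambda line: (-(self.frequency[line] + self.feedback[line]), line)
        )
        return candidates[:k]

    def record(self, suggestion, accepted):
        """Store the developer's decision so later rankings reflect it."""
        self.feedback[suggestion] += 1 if accepted else -1


corpus = ["return result", "return None", "return result", "return self"]
assistant = CompletionAssistant(corpus)
before = assistant.suggest("return")

# The developer rejects the top suggestion twice and accepts another one.
assistant.record("return result", accepted=False)
assistant.record("return result", accepted=False)
assistant.record("return self", accepted=True)
after = assistant.suggest("return")
```

In this sketch the ranking starts out driven purely by how often a line appears in the corpus, and the developer's decisions then shift it, which mirrors the feedback idea in the text on a very small scale.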

At the core of these assistance programs are so-called probabilistic models. They are built from a large number of available programs and program fragments. The assistance program uses the models to show the user the most likely continuations. Vechev and his team are constantly developing better probabilistic models. Recently his group developed one, called PHOG, which is considered to be the most precise model currently available for evaluating code. The model works not only with programming languages, but also with natural language. In addition, unlike other models, it not only provides answers, but also makes the reasons for choosing these answers understandable to the user. “Anyone can use the PHOG model to develop assistance programs based on it,” says Vechev.
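To make the idea of "most likely continuations" concrete, here is a minimal sketch of a much simpler model than PHOG: a bigram model that counts which token follows which in a small corpus and turns the counts into probabilities. The function names and the three-line corpus are assumptions made for illustration; PHOG itself conditions on far richer program context.

```python
from collections import Counter, defaultdict


def train_bigram_model(programs):
    """Count, for every token, which tokens follow it across the corpus."""
    counts = defaultdict(Counter)
    for tokens in programs:
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def most_likely_continuations(counts, prev_token, k=3):
    """Return up to k (token, probability) pairs, most likely first."""
    followers = counts.get(prev_token)
    if not followers:
        return []  # token never seen in the corpus
    total = sum(followers.values())
    return [(tok, n / total) for tok, n in followers.most_common(k)]


# Hypothetical mini corpus of already tokenized code lines.
corpus = [
    "for i in range ( n ) :".split(),
    "for x in items :".split(),
    "for i in range ( 10 ) :".split(),
]
model = train_bigram_model(corpus)
suggestions = most_likely_continuations(model, "in")
```

Here `range` follows `in` in two of the three lines, so the model ranks it first with an estimated probability of two thirds, ahead of `items` with one third. Because the probabilities come from explicit counts, it is also easy to show the user why a suggestion was made, a modest analogue of the explainability the article attributes to PHOG.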

In action

The ETH professor and his team also develop such assistance programs themselves, for example the two freely available online tools JS Nice and APK Deguard. These are a kind of code correction program. Users can use them to check their programs and get suggestions on how to improve them so that outsiders can understand them better. The tools can also be used to decipher programs that were intentionally written to be difficult to understand, for example to disguise malware. To date, more than 200,000 developers and IT security professionals worldwide have used the JS Nice assistance program.

Last year, Vechev and his former doctoral student Veselin Raychev founded the ETH spin-off Deepcode. The company has set itself the task of creating new assistance programs for developers based on research from Vechev's laboratory. Other applications are also conceivable, such as programs that find and correct programming errors.

“In the long term, we would like to develop software that can solve intellectually difficult programming challenges better than a human being,” says Vechev. “A few years ago we were among the first to set ourselves the goal of learning from 'Big Code'. Today, many colleagues and software companies are showing an interest. The research field is growing rapidly.”

This article appeared in the current issue of “Globe”.