Code comprehension

TL;DR

The neuroscience of code comprehension is the main focus of my PhD study, so I have a lot to say. If you decide to continue reading, welcome, and please hold on!

From an evolutionary perspective, programming and code comprehension should be mission impossible, yet some people do both proficiently. How?

Computer programming is the building block of our information society. The COVID-19 pandemic drove our reliance on information technology to new heights: we took classes and attended meetings via video call services like Zoom; we shopped for groceries on Instacart, ordered dinner on Uber Eats, and had these platforms find drivers to deliver our orders; and, at least during the peak of the pandemic, we checked the COVID-19 Dashboard developed by Johns Hopkins University every once in a while. All of this was made possible by the ingenuity of software engineers.

Given the importance of programming to the modern world, the percentage of people who code is surprisingly low. Anecdotal estimates suggest that less than 0.5% of the world's population knows how to code. Introductory programming courses are popular among undergraduate students, and several interactive programming education websites like Codecademy profit from people’s curiosity about programming. However, arguably only a small portion of these programming students go on to master the skill. This observation may not be so surprising when we take the evolution of the human species into account.

In comparison to the roughly 300,000-year history of Homo sapiens, the history of programming is vanishingly short. In the early 19th century, Joseph-Marie Jacquard invented a loom that wove patterns according to “programs” presented as punched cards. This loom inspired Charles Babbage to conceive of a general-purpose computer, the Analytical Engine. But it was not until the mid-20th century, about a century after Ada Lovelace first considered the idea of algorithms, that such a general-purpose computer was built and the allegedly first programming language, Fortran, was proposed. So, the history of programming spans only a little more than a century, or at most two. From an evolutionary perspective, this is a very short time, so short that our brains could not have had time to evolve a specific mechanism for this task.

We call such evolutionarily new behaviors “cultural inventions”

Other cultural inventions include reading/writing and symbolic math. Reading and math require explicit training to master, and so does programming, code comprehension in particular. To a pair of untrained eyes, a piece of programming code means nothing. After all, our brains did not evolve to support these newly developed behaviors, but we… well, at least some of us… engage in them nonetheless. How does this happen?

Cultural recycling: repurposing existing networks in the brain to accommodate new tricks

Two outstanding cognitive neuroscientists, Dr. Stanislas Dehaene and Dr. Laurent Cohen, proposed “cultural recycling” as the mechanism underlying the apparently impossible feat of learning evolutionarily new abilities. According to their hypothesis, these new abilities developed because they share some superficial similarities with abilities that we humans already possess thanks to evolution. Through explicit training and practice, the brain networks underlying the old abilities are rewired, repurposed, or “recycled” to fully support the newly acquired ability.

For example, we can recognize squiggles, either imprinted on a clay tablet or jotted down on a piece of paper, as letters or characters because such squiggles trigger activity in a part of the brain that evolved for recognizing object contours. This part of the brain is called the “visual word form area” (VWFA); interested readers may check out Dr. Dehaene’s book “Reading in the Brain” to learn more. As another example, our ability to do symbolic math is “recycled” from our more or less innate ability to approximate the number of objects. On cultural recycling for math, interested readers may refer to another of Dr. Dehaene’s books, “The Number Sense”. Another good reference is Dr. Andreas Nieder’s “A Brain for Numbers”.

The cultural recycling of reading and math is supported by rigorous scientific research. It shouldn’t be too much of a stretch, then, to speculate that programming, and code comprehension in particular, also recycles the mechanism of some evolutionarily ancient human ability. If this is the case, what is recycled for code comprehension?

The brain network for logical reasoning is recycled for code comprehension

For my PhD work at Johns Hopkins University, I study code comprehension, the single activity estimated to occupy more than 50% of a programmer's work hours, and one that is crucial to other stages of software development. There are two major candidates for the brain network that gets recycled for code comprehension:

Language and logical reasoning

Programming languages, such as Python, Java, and C++, are similar to natural languages because they are all structured according to some kind of grammar. That’s why these programming languages are called “languages” in the first place. On the other hand, understanding programming code is similar to solving logical puzzles because elements related to logical reasoning are made explicit in programming languages: keywords like IF and FOR, and the nested structures that require readers to keep track of the current state of objects/variables… they all seem to engage logical reasoning more than language comprehension.

In my first experiment, I sought to find out whether the brain network for code comprehension is more similar to the network for language comprehension or to the network for logical reasoning. I recruited expert programmers who were proficient in the programming language Python. I used functional magnetic resonance imaging (fMRI) to record the brain activity of these experts while they were reading very simple Python programs like these:

Left: a program containing a FOR loop that does the same task repeatedly; Right: a program containing an IF conditional whose behavior depends on whether some condition is met.
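To give a feel for the two kinds of programs described in the caption, here is a minimal sketch. These are hypothetical examples written for illustration only; the function names and their behavior are assumptions and are not the actual stimuli shown in the scanner.

```python
# Hypothetical examples only; NOT the actual stimuli from the experiment.

def repeat_with_for(text):
    """FOR loop: do the same task repeatedly (here, double every character)."""
    result = ""
    for ch in text:
        result += ch * 2
    return result


def choose_with_if(text):
    """IF conditional: behavior depends on whether some condition is met."""
    if len(text) > 3:
        return text.upper()
    else:
        return text.lower()


print(repeat_with_for("abc"))  # aabbcc
print(choose_with_if("Code"))  # CODE
print(choose_with_if("Hi"))    # hi
```

Reading either function means mentally stepping through it: tracking how `result` grows on each pass of the loop, or deciding which branch of the conditional applies to a given input.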

In a separate fMRI scan, I recorded the brain activity of the same experts while they were solving language comprehension and logical reasoning puzzles like these:

Each of these activities (code comprehension, logical reasoning, and language comprehension) resulted in a specific map of activation across the brain. I compared the similarities between these activation maps and found that the activation for code comprehension is more similar to that for logical reasoning than to that for language comprehension.
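To make the idea of comparing activation maps concrete, here is a minimal sketch. It assumes cosine similarity as the measure and treats each map as a flat list of voxel values; the function name and the six-voxel toy maps are made up for illustration, and the actual analysis may have used a different measure.

```python
import math


def cosine_similarity(map_a, map_b):
    """Cosine of the angle between two activation maps (lists of voxel values)."""
    dot = sum(a * b for a, b in zip(map_a, map_b))
    norm_a = math.sqrt(sum(a * a for a in map_a))
    norm_b = math.sqrt(sum(b * b for b in map_b))
    return dot / (norm_a * norm_b)


# Toy maps with six "voxels" each. The numbers are invented, not real data.
code_map = [1.0, 0.8, 0.9, 0.1, 0.0, 0.2]
logic_map = [0.9, 0.7, 1.0, 0.2, 0.1, 0.0]
language_map = [0.1, 0.0, 0.2, 0.9, 1.0, 0.8]

sim_logic = cosine_similarity(code_map, logic_map)
sim_language = cosine_similarity(code_map, language_map)

# In this toy case the code map resembles the logic map more closely.
print(sim_logic > sim_language)  # True
```

A value near 1 means two maps have peaks in the same places, while a value near 0 means they have little in common.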

This was the first piece of evidence I found that acquiring the ability to comprehend code may recycle the neural mechanism underlying logical reasoning.

Left to right: the activation maps for code comprehension, logical reasoning, and language comprehension. The more yellow, the stronger (and statistically more significant) the activation. On the logic and language maps, blue gradients indicate the regions also activated by code comprehension. This overlap shows logical reasoning shares a great deal of activation with code comprehension, whereas language comprehension doesn’t. Only the left brain is shown in this figure. In these maps, the brain is facing us with its left surface. So, the front end of the brain points to the left of the figure, while the back end points to the right.

Programs with different categories of algorithms can be distinguished in the regions activated during code comprehension

However, the fact that a region is activated when programming code is presented doesn’t necessarily mean that it is engaged in code comprehension per se. Suppose there is an AI system claimed to recognize faces, but all it does is shout “This is a face!” when it sees one, without being able to tell you whose face it is. You might not want to call this thing a face-recognizing AI. The same goes for our code-comprehension brain regions. Applying another analysis method, I aimed to show that these regions care not only about the presence of computer programs, but also about the types of programs. That is,

if a region is really responsible for code comprehension, it should care about the algorithm contained in the code.

As illustrated earlier on this page, the expert programmers saw two types of Python programs: one type contains a FOR loop that carries out some operation repeatedly, while the other contains an IF conditional that checks whether some statement is true and takes different actions accordingly.

FOR loops and IF conditionals are called “control structures”,

which are essential elements of programming languages. My hypothesis was that the brain regions responding to code respond differently to the two types of control structures. To test it, I used multivariate pattern analysis (MVPA), which goes by a fancier name these days, “machine learning”, to distinguish the two types of control structures based on the fine-grained activation patterns in these brain regions.

As shown in the activation maps above, three regions comprise the brain network engaged during code reading: the lateral prefrontal cortex, the intra-parietal sulcus, and the posterior middle temporal gyrus (they are labelled in the figure below). In each of these regions, I collected the activation pattern evoked by each Python program, so that I had a whole bunch of patterns for FOR loops and another bunch of patterns for IF conditionals. Then, I trained a machine learning classifier on these patterns and asked whether the classifier can distinguish FOR patterns from IF patterns.

And it can! In all three regions, the classifier can correctly identify whether a previously unseen brain activation pattern resulted from a FOR loop or an IF conditional, with an accuracy greater than 60%.

Classification, or “decoding”, accuracy in the three regions active during code reading.

This accuracy was much higher than random guessing, which would yield 50% because there were two categories. As a control, I also trained a classifier in the primary visual region, which shouldn’t care about the algorithms. The classification accuracy there was much lower (55%), only slightly better than random guessing.
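The decoding procedure described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the pipeline used in the study: it swaps in a nearest-centroid classifier with leave-one-out cross-validation, and the "activation patterns" are synthetic Gaussian noise with a small class-specific offset. All function names and parameter values are hypothetical.

```python
import math
import random


def centroid(patterns):
    """Average a list of equal-length activation patterns, voxel by voxel."""
    n = len(patterns)
    return [sum(values) / n for values in zip(*patterns)]


def distance(a, b):
    """Euclidean distance between two patterns."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def leave_one_out_accuracy(patterns, labels):
    """Hold out each pattern in turn, train on the rest, and test on it."""
    correct = 0
    for i, (held_out, true_label) in enumerate(zip(patterns, labels)):
        train = [(p, l) for j, (p, l) in enumerate(zip(patterns, labels)) if j != i]
        # "Training" here just means computing one mean pattern per class.
        centroids = {
            label: centroid([p for p, l in train if l == label])
            for label in sorted({l for _, l in train})
        }
        # Predict the class whose mean pattern is closest to the held-out one.
        predicted = min(centroids, key=lambda lab: distance(held_out, centroids[lab]))
        correct += predicted == true_label
    return correct / len(patterns)


def simulate_patterns(n_per_class, n_voxels, signal, rng):
    """Synthetic data (an assumption): unit Gaussian noise plus a class offset."""
    patterns, labels = [], []
    for label, sign in (("FOR", 1.0), ("IF", -1.0)):
        for _ in range(n_per_class):
            patterns.append([rng.gauss(sign * signal, 1.0) for _ in range(n_voxels)])
            labels.append(label)
    return patterns, labels


rng = random.Random(0)
patterns, labels = simulate_patterns(n_per_class=20, n_voxels=50, signal=0.05, rng=rng)
accuracy = leave_one_out_accuracy(patterns, labels)
print(f"decoding accuracy: {accuracy:.2f} (chance = 0.50)")
```

The key design point is that the classifier is always tested on a pattern it has never seen, so accuracy above 50% can only come from information about the control structure that is actually present in the region's activation patterns.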

This finding suggests that this code-responsive network in the brain actually represents the algorithm contained in the code. In a computer, a program is stored as a script file on the hard drive. When the computer is about to run the program, the script is fetched from the hard drive and temporarily loaded into random access memory (RAM), where the commands in the script are processed. It seems that the code-responsive network I found could be the brain's counterpart of RAM. When programmers read a program, the program is transcribed as a script and temporarily stored in this network so that its commands can be carried out.

Ongoing projects

Knowing where in the brain programming code is stored and processed may have implications for programming education. I will seek to find out whether there is a causal relationship between programming aptitude and the level of activation, and/or the classification accuracy between different kinds of algorithms, in this logical reasoning network.

Specifically, I’m working on some new projects to try to answer the following questions:

  • Does a student’s logical reasoning ability before taking a programming class predict their learning outcome?
  • Although language comprehension doesn’t activate the same network as code comprehension, could it still play a role in this behavior? If it does, what is this role?
  • In people with little or no programming experience, is the logical reasoning network engaged during the comprehension of algorithms written in plain language?
  • Does the classification accuracy of different types of programs in the logical reasoning network increase when a person accumulates more programming experience?

In the future, I may introduce what I find (or fail to find) in separate posts on this website. Please stay tuned!

Related scientific publications

  • Liu, Y.-F., Kim, J., Wilson, C., & Bedny, M. (2020). Computer code comprehension shares neural resources with formal logical inference in the fronto-parietal network. eLife
  • Ivanova, A. A., Srikant, S., Sueoka, Y., Kean, H. H., Dhamala, R., O’Reilly, U.-M., . . . Fedorenko, E. (2020). Comprehension of computer code relies primarily on domain-general executive brain regions. eLife
  • Dehaene, S., & Cohen, L. (2007). Cultural recycling of cortical maps. Neuron
  • Monti, M. M., Parsons, L. M., & Osherson, D. N. (2009). The boundaries of language and thought in deductive inference. Proceedings of the National Academy of Sciences
  • Coetzee, J. P., & Monti, M. M. (2018). At the core of reasoning: Dissociating deductive and non-deductive load. Human Brain Mapping
  • Kanjlia, S., Lane, C., Feigenson, L., & Bedny, M. (2016). Absence of visual experience modifies the neural basis of numerical thinking. Proceedings of the National Academy of Sciences
  • Amalric, M., & Dehaene, S. (2016). Origins of the brain networks for advanced mathematics in expert mathematicians. Proceedings of the National Academy of Sciences
  • Siegmund, J., Kästner, C., Apel, S., Parnin, C., Bethmann, A., Leich, T., . . . Brechmann, A. (2014). Understanding understanding source code with functional magnetic resonance imaging. Proceedings of the 36th International Conference on Software Engineering