Massive genome study informs the biology of reading and language

What is the biological basis of our uniquely human capacity to speak, read and write? A genome-wide analysis of five reading- and language-related skills in tens of thousands of people, published in PNAS, identifies shared biology contributing to these traits. Findings from previous, smaller genetic studies were not replicated. The international team, led by scientists from the Max Planck Institute for Psycholinguistics and the Donders Institute in Nijmegen, the Netherlands, also uncovered genetic links with language-related brain areas.

The use of spoken and written language is a fundamental human capacity. “We have known for many years that individual differences in the relevant skills must be influenced by variations in our genomes,” says first author Else Eising from the Max Planck Institute for Psycholinguistics (MPI) in Nijmegen. “This is the first time that datasets of tens of thousands of participants have been gathered together to really reliably investigate the many DNA variants that contribute.”

The study represents the first output of the GenLang consortium, an international network of leading researchers interested in the genetics of speech and language. The consortium was founded by MPI director Simon Fisher, together with colleagues from several countries.

The scientists combined data from 22 cohorts collected worldwide. While most participants were English speakers, some had other mother tongues (Dutch, Spanish, German, Finnish, French and Hungarian). The large sample sizes, up to 34,000 individuals per trait, are suitable for investigating the contributions of several million common DNA variants, each with a tiny effect size, via methods that have been successfully applied to biomedical traits.
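
The article does not say how results from the individual cohorts were combined. One widely used approach in genetics is a fixed-effect, inverse-variance-weighted meta-analysis, and the Python sketch below illustrates that idea for a single DNA variant. The function name `inverse_variance_meta` and the three cohort estimates are hypothetical, invented for illustration only; they are not values or methods taken from the study.

```python
import math

def inverse_variance_meta(betas, ses):
    """Combine per-cohort effect estimates for one DNA variant.

    betas: per-cohort effect estimates (trait units per allele copy)
    ses:   matching standard errors
    Returns (pooled_beta, pooled_se, z_score).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("need one standard error per effect estimate")
    # Cohorts with small standard errors (usually the large ones) get more weight.
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled_beta, pooled_se, pooled_beta / pooled_se

# Hypothetical estimates for three cohorts, invented purely for illustration.
beta, se, z = inverse_variance_meta([0.02, 0.03, 0.01], [0.01, 0.02, 0.01])
print(f"pooled beta={beta:.4f}, se={se:.4f}, z={z:.2f}")
```

In this toy case the pooled standard error is smaller than that of any single cohort, which is the basic reason that pooling many cohorts helps when each variant has only a tiny effect.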

Reading and language skills

For each cohort, researchers had previously tested participants on a range of reading- and language-related skills. Three of these skills were reading words aloud (horse), reading pronounceable nonwords aloud (chove), and spelling. A fourth skill was phoneme awareness, the ability to distinguish and manipulate speech sounds in words, assessed by asking people to delete sounds (“say stop without s”) or to create spoonerisms (“Paddington Bear” becomes “Baddington Pear”). Finally, in tests of nonword repetition, people were asked to repeat spoken nonwords of varying length and complexity (loddernapish), a task tapping speech perception, verbal short-term memory, and articulation.

DNA data were also available for all the cohorts, enabling the GenLang team to carry out a genome-wide association study (GWAS), which tests DNA variants across the whole genome for association with a trait. The team also used genetic correlation analyses to investigate whether the DNA variants involved in the five skills overlapped with each other, and with those involved in other cognitive and brain imaging traits. “If we can uncover the biological bases of skills involved in speaking and reading, we may learn more about how language evolved in our species,” explains Eising. “In addition, we can better understand why there are individual differences in these skills, even in societies where most people receive similar high quality education towards literacy and language.”
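
To make the idea of a genome-wide association scan more concrete, here is a minimal Python sketch that tests each variant in turn with a simple linear regression of a trait on allele count. Everything in it is simulated and hypothetical: the sample size, the number of variants, the allele frequency, and the deliberately exaggerated effect of one "causal" variant. It is a teaching toy under those assumptions, not the GenLang analysis pipeline. Real studies scan millions of variants whose effects are far smaller, and they adjust for factors such as ancestry.

```python
import math
import random

def association_z(genotypes, trait):
    """Z-score from a simple linear regression of trait on allele count."""
    n = len(trait)
    mean_g = sum(genotypes) / n
    mean_y = sum(trait) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, trait))
    beta = sxy / sxx                      # estimated effect per allele copy
    intercept = mean_y - beta * mean_g
    rss = sum((y - intercept - beta * g) ** 2 for g, y in zip(genotypes, trait))
    se = math.sqrt(rss / (n - 2) / sxx)   # standard error of beta
    return beta / se

def simulate_scan(n_people=2000, n_variants=50, causal=7, effect=0.5, seed=1):
    """Simulate allele counts (0, 1 or 2) and one trait, then scan every variant."""
    rng = random.Random(seed)
    freq = 0.3  # assumed frequency of the counted allele, same for all variants
    genos = [[(rng.random() < freq) + (rng.random() < freq) for _ in range(n_people)]
             for _ in range(n_variants)]
    # Only the `causal` variant influences the trait; the rest is random noise.
    trait = [effect * genos[causal][i] + rng.gauss(0.0, 1.0) for i in range(n_people)]
    return [association_z(g, trait) for g in genos]

z_scores = simulate_scan()
top = max(range(len(z_scores)), key=lambda j: abs(z_scores[j]))
print(f"strongest signal at variant {top}, z = {z_scores[top]:.1f}")
```

With an effect this large, the simulated causal variant stands out clearly from the rest. Shrinking `effect` toward realistic values makes the signal fade into the noise unless `n_people` grows, which mirrors why studies of this kind need tens of thousands of participants.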

Reappraising the field

Results of the GenLang study showed that the five reading- and language-related traits are highly related at the genetic level, suggesting shared biological bases. While there was evidence of genetic overlap with general cognitive ability (both verbal and nonverbal skills), the correlations with nonverbal IQ were low.

The team did not replicate earlier findings from much smaller studies. “We suspect that quite a few of the previously reported candidate gene associations with reading- and language-related traits in studies with small samples reflect false-positive findings,” says Eising.

The researchers identified a genetic link with individual differences in the neuroanatomy of a language-related brain area, the left superior temporal sulcus. This brain region is known to be an important player (together with other areas) in the processing of spoken and written language. There was also a genetic link with parts of the DNA that play a regulatory role in the fetal brain.

Nature intertwined with nurture

“This research shows the considerable value of team science approaches for understanding molecular genetic contributions to complex human traits like language,” concludes Fisher. “The biology of reading- and language-related skills is highly complex. To develop these skills, exposure to language as well as education in reading are essential. Our work illustrates the intertwining of both nature and nurture in the development of language and literacy.”

“In the future, we hope to build on these efforts with genetically informative datasets covering a broader range of traits relevant for language, for instance including abilities related to grammatical processing. To more quickly and easily characterize reading and language skills in large groups of individuals, we will likely need development of tests that can be administered online, and this is a major focus of the GenLang consortium moving forward.”