How difficult is language learning? And to explain it?


Language is a vertiginously complex symbolic system, and yet humans all learn it, effortlessly, within a remarkably short period of time. How do they do this?

We just have language
When I was a PhD student, the question made me shrug my shoulders, and it still provokes some resistance at parties: “Huh, is that what you use our tax money for? Can’t you find out something useful? … An innovative, fun and easy method to teach statistics to law and social science students? … Or what’s going on in criminals’ brains?” Here, I would point to science funders’ strong commitment to knowledge valorization to reassure my fellow partygoers… But I can see their point: the intuition that we just have language is so inevitable that the scientific study of language learning seems like dedicating a whole discipline and half a century of research to the question of how we learn to ride a bike... or to drink a glass of beer.

Who do you think has stolen the banana?
So what excites psycholinguists so much about this question? Here is our answer. Children don’t make mistakes with a sentence like: “Who do you think has stolen the banana?”, even if they have never heard that sentence before. They do not believe, for example, that the speaker is accusing them of having stolen the banana. This immediate and errorless understanding is not trivial, because the word you is closer to the word stolen than the word who, so linking you with stolen would be a straightforward, simple way to make sense of this stream of words. Moreover, children’s capability to parse the sentence cannot be explained by their knowledge and experience with language. The database of language input heard in a few years is much too small to derive the rules of grammar, as Chomsky pointed out 50 years ago.
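
For readers who like to see the distance argument spelled out, here is a minimal sketch. The function names and the "pick the nearest word" strategy are illustrative assumptions, not part of our study; the sketch only shows that a learner relying on linear distance alone would attach the wrong subject to stolen.

```python
# The example sentence, split into words (punctuation and capitals dropped).
SENTENCE = "who do you think has stolen the banana".split()


def linear_distance(words, a, b):
    """Number of word positions separating a and b in the sentence."""
    return abs(words.index(a) - words.index(b))


def nearest_candidate(words, verb, candidates):
    """Hypothetical naive strategy: pick the candidate closest to the verb."""
    return min(candidates, key=lambda w: linear_distance(words, w, verb))


# Distance-based guess for the subject of "stolen".
guess = nearest_candidate(SENTENCE, "stolen", ["who", "you"])
```

In this sketch the naive strategy settles on you, which is exactly the reading children never adopt.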

Compare children to computers: These machines have a virtually infinite database of language at their disposal; we can feed them with the full corpus of Google in hundreds of languages. Yet, anybody who has ever communicated with a speech recognition computer system (like Siri or the Railways Time Schedule Information Line) knows how hopelessly poor they are at understanding. Ask Siri “Who do you think has stolen the banana?” and, depending on your intonation, the answer will be either “I am not sure I understand?”, “I think therefore I am, but let us not put Descartes before the horse”, or “I really cannot say”. Notice that we cannot excuse Siri on the grounds that it does not know the context. The only thing it has to do is apply grammar to this particular sentence, i.e., identify the two subject-verb-object parts. Computers have an infinite knowledge of language, and no understanding; children have very little knowledge of language, and excellent understanding.

Simple sentences
Together with colleagues from Cornell University, Cambridge University in the UK, Georgia State University and Erasmus University Rotterdam, we carried out a series of 6 experiments (including one online study) to unravel what is behind children’s fabulous parsing skills. We did not look at actual children: we created artificial miniature languages that had the same grammatical structure as the ‘banana’ sentence, and exposed hundreds of participants (from the Netherlands and the US) to hundreds of sentences in these languages. Then, we tested their parsing skills. We thus mapped the eight years of children’s language acquisition period onto a one-hour lab session. The main finding was as simple as it was remarkable: those participants who were ‘fed’ with language input that was ordered according to increasing complexity (i.e., starting small and simple, with short one-phrase sentences, and ending up complex, with sentences containing multiple phrases) could learn the system, while those learners who had seen the same sentences in random order could not. We had already demonstrated in 2011 that a training set consisting only of complex sentences was insufficient for inducing any grammar (Lai & Poletiek, 2011). So grammar learners need to hear simple sentences, and they need them at the start of a training that grows gradually in complexity.
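
To make the two training regimes concrete, here is a minimal sketch of how such input could be generated and ordered. Everything in it is an assumption for illustration: the nonsense syllables, the paired-word grammar and the function names are hypothetical and are not the materials or procedure used in the experiments. The only point is that both groups receive exactly the same sentences, and differ in order alone.

```python
import random

# Hypothetical toy vocabulary: each opening word requires its own closing word.
PAIRS = [("ba", "li"), ("de", "mo"), ("gu", "no")]


def make_sentence(depth, rng):
    """Build a nested sentence with `depth` word pairs, e.g. A1 A2 B2 B1."""
    chosen = [rng.choice(PAIRS) for _ in range(depth)]
    opening = [a for a, _ in chosen]
    closing = [b for _, b in reversed(chosen)]
    return opening + closing


def make_training_set(n_per_depth, max_depth, seed=0):
    """Generate n_per_depth sentences for every depth from 1 to max_depth."""
    rng = random.Random(seed)
    return [
        make_sentence(depth, rng)
        for depth in range(1, max_depth + 1)
        for _ in range(n_per_depth)
    ]


def starting_small(sentences):
    """Order the input from short, simple sentences to long, nested ones."""
    return sorted(sentences, key=len)


def random_order(sentences, seed=0):
    """Present the very same sentences in shuffled order."""
    shuffled = list(sentences)
    random.Random(seed).shuffle(shuffled)
    return shuffled


def is_well_formed(sentence):
    """Check that every opening word is closed by its partner, inside out."""
    half = len(sentence) // 2
    if len(sentence) % 2 or half == 0:
        return False
    partner = dict(PAIRS)
    opening, closing = sentence[:half], sentence[half:]
    return all(partner.get(a) == b for a, b in zip(opening, reversed(closing)))
```

Calling `starting_small(make_training_set(4, 3))` gives a one-pair sentence first and a three-pair sentence last, while `random_order` on the same set mixes the levels from the first trial onward.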

What do we learn from this scientifically? When faced with complex sentences, the Starting Small group identified the building blocks (simple sentences) they had been exposed to previously. In the same way, children hearing the banana sentence recognize the separate simple building blocks: “what do you think?” and “who has stolen the [something]?”. Crucially, it is children’s ability to selectively retrieve the parts of speech from memory, as simple events encountered previously, that drives the parse. Luckily, their database of sentences is small. If the database were Google-sized, it would be like looking for a needle in a haystack. The parser might retrieve the wrong strings, like “who do” and “think has stolen the banana…”, as there would just be too many building blocks available in the gigantic memory store. And that is exactly where things go wrong for computers: They cannot make a meaningful choice. So computers fail because they cannot retrieve selectively. We would need to tell them what to retrieve; otherwise, searching the database takes ages and ends in nonsense. Here is the issue: Computers have no consciousness that decides what is useful to retrieve. They are shy of an ‘I’. And… no I, no parsing.

What do we learn from this practically? That watching television may not be the best way for your young child to learn language. Talking with your child and focusing attention work better. Note that long sentences are no problem (as we found in our study), as long as the structure is simple. So: Cuty baby boy has a fluffy softy little sleepy teddy bear. But not: Baby boy that needs a sleep has a cuty teddy bear.

Poletiek, F.H., Conway, C.M., Ellefson, M.E., Lai, J., Bocanegra, B.R., & Christiansen, M.E. (2018). Under What Conditions Can Recursion be Learned? Effects of Starting Small in Artificial Grammar Learning of Recursive Structure. Cognitive Science, doi:10.1111/cogs.12685. Early Online.


Fenna Poletiek

Wonderful that you checked the paper. And I agree that the contribution of the starting small effect in learning linguistic complex structures is hard to overestimate. Also for learning them in Dutch!

Gerard Kempen

I read through your article in Cognitive Science, and in it you indeed seem to deliver the definitive proof that "starting small" is an important, if not crucial, condition for acquiring recursive grammars. Inventive experiments, clear outcomes: this will become a classic! Chapeau!


Interesting topic and very nice article. Being raised bilingually, I have always wondered how it is possible to learn a language in the first place, let alone two. This looks like thorough research, with six experiments as the basis for clear conclusions. Cool to make this available for non-specialist readers. Thank you!

Fenna Poletiek

Good question. Evelyn (just like me originating from Friesland) has done brilliant studies with bilingual Dutch-Frisian children. Check her work on Google Scholar.

Gomez and Gerken (TICS, 2000) make an observation similar to mine, that children do not analyze a sentence simply on the basis of distances between words: "...children never erroneously transform a statement like ‘The man who is tall is Sam’ into the question ‘Is the man who tall is Sam?’ (by moving the subordinate clause verb rather than the main verb to the front of the sentence). The lack of such errors has been taken as evidence that children never consider rules based solely on linear order in sentences, such as ‘move the first verb to the front of the sentence’. The computational and logical difficulties raised by these conflicting pressures have caused many researchers to conclude that language is not learnable by an unspecialized learning device...."

Evelyn Bosma

Nice blog and interesting paper! You write that children are able to understand sentences like “Who do you think has stolen the banana?”. Do you know who investigated this?

Fenna Poletiek

How wonderful that you give all this feedback; most of it from the outside-of-the-university-world! Thank you, also on behalf of my co-authors from Cornell, Cambridge, Rotterdam and Boys Town. This helps our thinking. Continue...
@vdb, it might be good to worry, indeed. But not about your daughter's language processing skills.
@Michael Strange, the best test of science is to see how people experience the things we study in labs. A moving... and useful comment.
@David Bos and Annet Mooij, thanks for your tolerance of slow and not-directly-useful science. I thought: being an old-fashioned scientist with social media phobia does not exempt me from some communication.
@Evy: is my nice colleague next door working with primates (Do you think they are also shy of an I? Or do they have a bit of it?)

Annet Mooij

A great piece of solid research. I enjoyed reading it!

David Bos

Fascinating! And very well written. I particularly liked: "They are shy of an ‘I’."


My 14-year-old daughter A. would not be able to interpret your “who...banana”-sentence in any other way than: “So you$§$|£|¥§£! think I have stolen the f@##€##ng banana!” Has she not yet reached the first step, or has she already reached the next one? Should I worry?

Michael Strange

I have recently been enjoying the language acquisition of my 2-year-old grandson. What a joy that is! And how interesting that adults mainly... generally (by instinct?) "feed" language to a child in an ordered way that increases in complexity over time. A great synergy.