Will they remember this next year? 5 key points for long-term learning

28 February 2018

Author: Niki Kaiser

Thanks to all who attended day 1 of our Long-term Learning course. We are really looking forward to hearing back from you on day 2 about the approaches you’ve begun to embed.

We started the day by considering research by Dunlosky on effective study approaches.


Dr Flavia Belham then introduced us to neuroscience-based ideas about memory and some recent research findings. Delegates can access the research papers she spoke about here.


This post summarises the key points we talked about during day 1, and includes relevant links to follow up on.

1) Memory model

Information is processed in the Working Memory, but unless we pay attention to it (and “rehearse” it), we undergo “catastrophic” loss, and it cannot be retrieved unless we re-encounter it.

We can use a very simple model to explain how memory works. In fact, it’s so simple that we all drew it from memory after lunch (as a retrieval practice exercise), but an example can be found here.


We encounter a huge amount of information all the time via our senses. If we paid attention to all of it, we would very soon be overloaded and unable to deal with any of it. For this reason, much of what we encounter is very quickly forgotten, unless we pay particular attention to it, and make a point of remembering it.

For example, when I asked everyone to recall various pieces of information, we could all remember things we’d deliberately made an effort to remember (like our phone numbers when we were younger) but we couldn’t remember other things, even though they were very recent (like the number of steps we’d just climbed to get to the second floor of the building we were in) or they’d happened multiple times during our lifetime (like the Queen’s birthday).

2) Working memory has a limited capacity, whereas long-term memory is potentially unlimited

Research from as far back as the 1950s suggests that working memory can only process a limited number of items. This might be as low as 4 or as high as 7, but the number is not definitely known.


Although there is a limit to the number of items that can be processed by the Working Memory, the items themselves can be very complex. An example of how we process more complex items is “chunking”.

For example, which of these number patterns can you remember most easily?




I grew up in Hastings, played in an Orchestra and read a dystopian novel by George Orwell for GCSE English, so the second set jumps out at me as: 1066, 1812 and 1984. These three numbers are easy for me to remember, because I have “chunked” the list.
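To make the idea of chunking concrete, here is a minimal sketch in Python. It assumes a twelve-digit string made by running the three dates above together (this is my own illustration, not necessarily the set shown on the day), and the helper name `chunk` is hypothetical. It shows that the same digits can be held as twelve separate items or as three meaningful ones.

```python
def chunk(digits: str, size: int) -> list[str]:
    """Split a digit string into consecutive groups of `size` characters."""
    return [digits[i:i + size] for i in range(0, len(digits), size)]


# Assumption: these twelve digits are simply the three dates from the text
# run together; the number patterns used on the day are not reproduced here.
digits = "106618121984"

# Unchunked, there are twelve separate digits to hold in working memory.
print(len(digits), "separate digits")

# Chunked into groups of four, there are only three items to hold.
print(chunk(digits, 4))
```

The code only does the splitting, of course. What makes the three chunks easy to remember is the prior knowledge that gives each one a meaning.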

We build “schemas” in our long-term memory by using our prior knowledge to make links between items. For example, we know that these are all trees, even though the individual images are all very different. Schemas help us to organize, process and use information.


3) Cognitive Load theory

Working memory is limited, and research suggests that it can’t be “trained” in a way that’s useful or transferable. So if we reduce the cognitive load that we place on Working Memory, this optimises transfer to long-term memory and aids long-term retention (Sweller, 1998).

Cognitive Load Theory takes account of 3 types of load, and we should try to reduce the first two as much as possible.


Extraneous cognitive load is anything that occupies the Working Memory, but doesn’t contribute to long-term retention. It depends on the way things are presented, and it’s increased when we try to multi-task, when we’re distracted or when we’re presented with redundant information. We took a lighthearted look at some of the “crimes” we might commit (as teachers).

We don’t need to avoid these things completely; we just need to think about how, when and why we do them.


Intrinsic cognitive load is the inherent challenge of any material to be learned. It depends on both the material’s complexity and the learner’s background knowledge. For example, learning the meaning of an isolated word has a low intrinsic cognitive load, whereas solving a quadratic equation has a higher one, and it would be easier for someone with a strong maths background than for a less experienced mathematician.

Before the next session, we are going to read Rosenshine’s Principles of Instruction, and use Adam Boxer’s lesson checklist to analyse a lesson we teach. In the next session, we’ll think more about germane load (and why this is not something we seek to reduce) and Bjork’s “desirable difficulties”.

4) Dual Coding

Dual Coding is one of the six effective study strategies identified by the Learning Scientists. You can read more about the research behind these strategies in their recent article.

Vicki Barnett recently wrote a long, informative (and very readable) post about Dual Coding, and in the afternoon of Day 1, she demonstrated how words and visuals can enhance learning. She spoke to the whole group about a complex family tree, but asked half of them to turn their backs on her! Those with the visual prompts and clues followed the story and retained information much better than those without them.


We had previously sketched diagrams representing the key points from Flavia’s talk. We discussed how this approach can be used as a basis for elaboration and retrieval practice.


A key point about Dual Coding is that visual and auditory prompts only take up one “slot” in our Working Memory, as long as they are complementary (otherwise they just add to extraneous load).

5) Retrieval practice and spaced learning

Retrieval practice is the act of bringing previously-learned information to mind, and it has been shown (by a number of studies) to be more effective than simply re-studying material. This review summarises the research, and offers practical applications (as well as summarising current limitations) for the approach.

For example, we know that it is more effective to retrieve 72% of information correctly than it is to re-study 100% of it. Even though the re-studied material will contain 100% correct information, this won’t be remembered as effectively as the 72% of correct information that’s been retrieved.

We can further increase retention by spacing study and retrieval over time. And it appears that leaving time to forget is just as important as allowing time for retrieval and practice.
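As a rough illustration of what spacing might look like in a planner, the sketch below works out a set of retrieval dates from a first study date. The gaps of 1, 3, 7 and 14 days are placeholder values I have chosen for the example, not figures from the research, and the function name `review_dates` is hypothetical.

```python
from datetime import date, timedelta


def review_dates(first_study: date, gaps_in_days: list[int]) -> list[date]:
    """Return one retrieval date per gap, each gap counted from the previous session."""
    dates = []
    current = first_study
    for gap in gaps_in_days:
        current = current + timedelta(days=gap)
        dates.append(current)
    return dates


# Assumption: these gaps are illustrative only; the post does not specify intervals.
ILLUSTRATIVE_GAPS = [1, 3, 7, 14]

for session in review_dates(date(2018, 2, 28), ILLUSTRATIVE_GAPS):
    print(session.isoformat())
```

Each gap is time deliberately left for some forgetting to happen before the next retrieval attempt. The gaps that suit a particular class and topic are a judgement call, so the list is passed in rather than fixed.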


Flavia Belham demonstrated how these principles have been used to increase retention at KS4 using a (free) learning platform from Seneca Learning. You can try the demo for yourself here (must be accessed via Google Chrome) and read about their randomised controlled trial (RCT) with 1,120 Year 9 students in the latest issue of IMPACT, the journal from the Chartered College of Teaching.

Thank you to all who attended and contributed to the day. We look forward to seeing you again in April.
