Recently we have proposed a mechanism of associative information retrieval that explicitly takes into account long-term neuronal representations of memory items (Romani et al., 2013). One of the basic predictions of the model is the existence of “easy” and “difficult” words.
This prediction was verified in our analysis of a large dataset of free recall experiments collected in the lab of Michael Kahana, where we showed that the recall probabilities of individual words are consistent between arbitrarily chosen groups of subjects (Katkov et al., submitted). The natural question posed by these observations is which features predict word difficulty in recall experiments and, in particular, what contribution, if any, is made by word length.
Most previous studies of the word length effect used lists that were specifically composed of either short or long words. In two previous studies that used lists of alternating short and long words, no word length effect was observed (Hulme et al., 2004; Jalbert et al., 2011).
Our current contribution uses the free recall paradigm and is based on a much larger dataset than previous studies. We report that when words are selected randomly, irrespective of their length, long words are recalled better than short ones, in seeming contradiction to the classical word length effect in both serial and free recall (Baddeley et al., 1975; Russo and Grammatopoulou, 2003; Tehan and Tolan, 2007; Bhatarah et al., 2009). We provide a possible resolution of this contradiction in the framework of the associative retrieval model of Romani et al. (2013).
Materials and Methods
Experimental Methods
The data reported in this manuscript were collected in the lab of M. Kahana as part of the Penn Electrophysiology of Encoding and Retrieval Study (see Miller et al., 2012 for details of the experiments).
Here we analyzed the results from the 141 participants (age 17–30) who completed the first phase of the experiment, consisting of seven experimental sessions. Participants gave consent in accordance with the University of Pennsylvania’s IRB protocol and were compensated for their participation. Each session consisted of 16 lists of 16 words presented one at a time on a computer screen and lasted approximately 1.5 h. Each study list was followed by an immediate free recall test.
Words were drawn from a pool of 1638 words. For each list, there was a 1500 ms delay before the first word appeared on the screen. Each item was on the screen for 3000 ms, followed by a jittered 800–1200 ms inter-stimulus interval (uniform distribution).
After the last item in the list, there was a 1200–1400 ms jittered delay, after which the participant was given 75 s to attempt to recall any of the just-presented items. All trials were used; intrusions and repetitions were removed from trials.
The Model
We assume that each word is represented by a randomly chosen population of neurons in a dedicated memory network. We further assume that each retrieved item acts as an internal cue for the next one according to a similarity measure between items, defined as the size of the intersection between the corresponding populations (the number of neurons that represent both items).
Following Romani et al. (2013), we consider a retrieval process that is directly determined by the memory representations of the items, without explicitly simulating network activity.
The dynamics of retrieval is described by a sequence of recalled items. The first item is chosen randomly among the presented ones, and each subsequent item is chosen to be the one with maximal similarity to the currently recalled item, excluding the item that was visited just before it (Romani et al., 2013). Recall terminates when the retrieval process enters a cycle and no more items can be retrieved.
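The retrieval rule described above can be illustrated with a minimal Python sketch. It is not the authors' code. The function name `simulate_recall`, the tie-breaking rule (lowest index wins) and the cycle criterion (stop as soon as a transition between two items repeats) are assumptions made for illustration. The similarity matrix is taken as given.

```python
import numpy as np

def simulate_recall(similarity, presented, rng):
    """Return the items recalled from `presented`, in order of first recall.

    similarity : (W, W) symmetric array of pairwise similarities
    presented  : sequence of item indices forming the study list
    rng        : numpy.random.Generator used to pick the first item
    """
    presented = np.asarray(presented)
    # Restrict the similarity matrix to the presented list.
    S = similarity[np.ix_(presented, presented)].astype(float)
    np.fill_diagonal(S, -np.inf)  # an item never cues itself

    current = int(rng.integers(len(presented)))  # random first item
    previous = None
    recalled = [current]
    seen_transitions = set()

    while True:
        row = S[current].copy()
        if previous is not None:
            row[previous] = -np.inf  # exclude the item visited just before
        # Assumption: ties are broken by the lowest index (np.argmax default).
        nxt = int(np.argmax(row))
        if (current, nxt) in seen_transitions:
            break  # the same transition repeats: the process is in a cycle
        seen_transitions.add((current, nxt))
        if nxt not in recalled:
            recalled.append(nxt)
        previous, current = current, nxt

    return [int(presented[k]) for k in recalled]
```

Because the next item depends only on the current and the previous item, a repeated transition guarantees that no new item can be reached, so the loop always terminates on a finite list.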
To mimic the experimental protocol (see above), we generated $W = 1638$ random binary patterns of length $N$: $\xi_i^w \in \{0, 1\}$, where $w = 1, \dots, W$ indexes the memory items and $i = 1, \dots, N$ indexes the neurons in the network, such that $\xi_i^w = 1$ if neuron $i$ participates in the encoding of memory item $w$.
The similarity between items $w$ and $w'$ is then computed as $S_{ww'} = \sum_{i=1}^{N} \xi_i^w \xi_i^{w'}$. The pattern components for each item were drawn independently, with the probability $p_w$ of $\xi_i^w = 1$ chosen in the following way: each pattern was arbitrarily assigned a syllabic length $l_w = 1, \dots, 4$ such that the distribution of $l_w$ across the patterns matched the corresponding distribution across the words used in the experiment (five words with syllabic length larger than four were combined with those of length four). For patterns with a given $l_w$, the corresponding $p_w$ were equidistantly distributed from $0.02 - 10^{-3} l_w$ to $0.02 + 10^{-3} l_w$.
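The pattern construction and the similarity measure can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. The function names are hypothetical. The network size and the distribution of syllabic lengths used in the example are toy values, since neither is specified in the text above.

```python
import numpy as np

def make_patterns(lengths, n_neurons, rng, p_mean=0.02, p_step=1e-3):
    """Draw one random binary pattern per item.

    lengths   : integer array of syllabic lengths l_w, one per item
    n_neurons : N, the number of neurons in the network
    Returns (patterns, p): patterns has shape (W, N), p holds each p_w.
    """
    lengths = np.asarray(lengths)
    p = np.empty(len(lengths))
    for l in np.unique(lengths):
        idx = np.flatnonzero(lengths == l)
        # Equidistant p_w on [p_mean - p_step*l, p_mean + p_step*l].
        # (A group with a single item would get only the lower end point.)
        p[idx] = np.linspace(p_mean - p_step * l, p_mean + p_step * l, len(idx))
    # Each component equals 1 independently with probability p_w.
    draws = rng.random((len(lengths), n_neurons))
    patterns = (draws < p[:, None]).astype(np.int32)
    return patterns, p

def similarity_matrix(patterns):
    """S[w, w'] = number of neurons that encode both item w and item w'."""
    return patterns @ patterns.T

# Toy example: 200 items, 2000 neurons, lengths drawn uniformly from 1..4.
# These values are assumptions, not the ones used in the study.
rng = np.random.default_rng(0)
toy_lengths = rng.integers(1, 5, size=200)
toy_patterns, toy_p = make_patterns(toy_lengths, n_neurons=2000, rng=rng)
toy_S = similarity_matrix(toy_patterns)
```

The diagonal of the resulting matrix holds the number of active neurons of each item, so it should be masked out before the matrix is used to select the next recalled item.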
With this choice of pattern statistics, the average number of neurons representing a given item does not depend on its syllabic length, whereas its variance increases with syllabic length. The word representations were then fixed throughout the simulated experiment.
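The statement about the mean and the variance can be checked with a short calculation. The symbol $n_w$ for the number of neurons encoding item $w$, and $\overline{p} = 0.02$ for the centre of the interval, are introduced here for illustration. The equidistant grid of $p_w$ values is approximated by a continuous uniform distribution, which is an assumption that holds well when many patterns share the same syllabic length.

```latex
n_w = \sum_{i=1}^{N} \xi_i^w, \qquad
\mathbb{E}[\, n_w \mid p_w \,] = N p_w, \qquad
\mathbb{E}[\, n_w \mid l_w \,] = N \overline{p} = 0.02\, N

\operatorname{Var}[\, n_w \mid l_w \,]
  = N \overline{p}\,(1 - \overline{p})
  + N (N - 1)\, \operatorname{Var}[\, p_w \mid l_w \,],
\qquad
\operatorname{Var}[\, p_w \mid l_w \,] \approx \frac{\left(10^{-3}\, l_w\right)^{2}}{3}
```

The mean is therefore independent of $l_w$. The second expression follows from the law of total variance applied to a binomial count with a random success probability. Its $l_w$-dependent term grows as $l_w^2$.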