Natural Statistics | Elizabeth M Clerkin

Real-world statistics at two timescales and a mechanism for infant learning of object names

Infants learn mappings between heard names and seen things before their first birthday and before they produce spoken language. Two challenges to explaining this early learning are the immaturity of infant memory systems and the infrequency of any individual object name in the heard language input. We quantified the frequency of visual referents, heard names, and the co-occurrences of referents and names in infant everyday experiences. We discovered statistical patterns at two timescales that align with a cortical mechanism of associative memory formation, one that supports the rapid formation of durable associative memories from very few experienced co-occurrences.

The Homeview Project

The Homeview Project is a large corpus of infant-perspective scenes (captured with head cameras) and audio recorded in the home as infants 1 to 24 months of age go about their daily lives. The corpus, with over 500 hours of head camera video, promises new insights into the natural statistics of visual experiences.

The everyday statistics of objects and their names

The frequency properties of visual objects and object names in daily life are fundamentally different, and they do not set up a rich co-occurrence structure. However, both modalities select for early-learned object names. The rarity of co-occurrences of early-learned object-name pairs in the mealtime context suggests that infants may be learning from minimal co-occurrence data as long as half of the pair (the object or its name) is highly frequent.

How everyday visual experience prepares the way for learning object names

A key question in early word learning is how infants learn their first object names despite a natural environment thought to provide messy data for linking object names to their referents. Using head cameras worn by 7- to 11-month-old infants in the home, we document the statistics of visual objects, spoken object names, and their co-occurrence in everyday mealtime events. We show that the extremely right-skewed frequency distribution of visual objects underlies word-referent co-occurrence statistics that set up a clear signal in the noise, a signal on which infants could capitalize to learn their first object names.
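The kind of tally described above can be illustrated with a minimal sketch. It assumes a hypothetical annotation format in which each sampled mealtime moment lists the objects in view and the object names heard; the `moments` data and the `tally` helper are invented for illustration and are not the coding scheme or data of the study.

```python
from collections import Counter

# Hypothetical toy annotations (NOT data from the corpus): each record is one
# sampled moment of a mealtime, with the objects in view and the names heard.
moments = [
    {"objects": {"spoon", "bowl", "table"}, "names": set()},
    {"objects": {"spoon", "cup"}, "names": {"spoon"}},
    {"objects": {"bowl", "table"}, "names": set()},
    {"objects": {"spoon", "bowl"}, "names": {"cup"}},
    {"objects": {"table", "spoon"}, "names": set()},
]


def tally(moments):
    """Count objects seen, names heard, and name-referent co-occurrences."""
    object_counts = Counter()
    name_counts = Counter()
    cooccurrence_counts = Counter()
    for moment in moments:
        object_counts.update(moment["objects"])
        name_counts.update(moment["names"])
        # Assumption: a co-occurrence is a name heard while its referent
        # is in view during the same sampled moment.
        cooccurrence_counts.update(moment["objects"] & moment["names"])
    return object_counts, name_counts, cooccurrence_counts


object_counts, name_counts, cooccurrence_counts = tally(moments)

# List objects from most to least frequent to inspect the shape of the distribution.
for obj, count in object_counts.most_common():
    print(obj, count, cooccurrence_counts[obj])
```

In this toy example one object is in view often but is named only once, so co-occurrences are rare even for a frequent object, which is the pattern the abstract describes in qualitative terms.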

The Developing Infant Creates a Curriculum for Statistical Learning

New efforts are using head cameras and eye trackers worn by infants to capture everyday visual environments from the infant learner’s point of view. From this vantage point, the training sets for statistical learning develop as the infant’s sensory-motor abilities develop, yielding a series of ordered data sets for visual learning that differ in content and structure between time points but are highly selective at each time point. These changing environments may constitute a developmentally ordered curriculum that optimizes learning across many domains.

Real-world visual statistics and infants' first-learned object names

We offer a new solution, based on the visual statistics of everyday infant-perspective scenes, to the unsolved problem of how infants break into word learning. The statistical structure of objects in these infant egocentric scenes differs markedly from that in the training sets used in computational models and in experiments on statistical word-referent learning. Therefore, the results also indicate a need to re-examine current explanations of how infants break into word learning.

How everyday visual experience prepares the way for learning object names

Infants learn their first object names by linking heard names to scenes. A core theoretical problem is how infants select the right referent from cluttered and ambiguous scenes. Here we show how the distributional properties of objects in young infants' visual experiences may help solve this core problem in early word learning. Infant-perspective scenes of mealtimes were collected using head cameras worn by 9-month-old infants (147 mealtimes from 8 infants). The frequency distribution of objects was extremely skewed, with the most frequent visual objects corresponding to the normatively first-learned object names in English.