Ancient scrolls are being ‘read’ by machine learning — with human knowledge to detect language and make sense of them
Using a non-invasive method that harnesses machine learning, an international trio of scholars retrieved 15 columns of ancient Greek text from within a carbonized papyrus from Herculaneum, a seaside Roman town eight kilometres southeast of Naples, Italy.
- The work of reading and analyzing the new Greek and Latin texts recovered from the papyri will fall to human beings.
Buried in ash
- Like Pompeii, Herculaneum was buried by the catastrophic eruption of Mount Vesuvius in 79 CE.
- But in 1752, excavation uncovered hundreds of papyrus scrolls in the library of an elaborate Roman villa.
Carbonized papyri
Starved of oxygen, the papyri were carbonized (but not ignited) by the intense heat of Vesuvius’ pyroclastic flow. Because they resembled lumps of coal, 18th-century excavators did not immediately recognize them as ancient books.
The papyri are so brittle that many were destroyed by early attempts to access their texts. Studying them has therefore always required ingenuity. In 1754, a conservator and priest at the Vatican library devised a machine for slowly unrolling them.
More recently, multispectral photography has dramatically improved their legibility. But until now, a non-invasive method that would leave the scrolls intact remained out of reach. Its development marks a significant breakthrough. McOsker notes there are 659 items in the catalogue listed as “not unrolled,” but some of these are parts of scrolls.
Sparking innovation
- The latter are essential as a reference point (or “control”) for innovative approaches.
- The competition’s design encouraged transparency and collaboration: data published in the pursuit of smaller goals benefited all competitors.
Text mentions music, taste, sight
- A PhD student studying machine learning, an engineer studying computer science and a robotics student were declared the victors.
- According to McOsker, the text they retrieved mentions music twice, as well as the senses of taste and sight.
Hundreds of rolls to be studied
- With hundreds of rolls yet to be studied, the new method of recovering the contents of the Herculaneum papyri is only getting started.
- Scans of sufficiently high resolution cannot be produced with ordinary equipment; they require access to a facility with a particle accelerator.
- With current techniques, which involve a fair bit of manual manipulation, fully segmenting one scroll would cost US$1–5 million.
Critical minds needed
- Their role is to analyze the model’s output of legible ancient Greek — and in so doing determine which approaches are most effective.
- It requires mastery of ancient languages and ideas as well as the puzzle-solver’s ability to fill in the inevitable gaps.
- For the challenge truly to succeed, we’re going to need critical minds as well as whizbang technology.
C. Michael Sampson receives funding from the Social Sciences and Humanities Research Council of Canada for 'the Books of Karanis,' a project that studies fragmentary Greek literature from the Egyptian village Karanis.