Newspaper Navigator: Reimagining Digitized Newspapers with Machine Learning

Fri, May 15, 2020, 11:30 am to 12:00 pm
Location: Virtual
Sponsor(s): Princeton University Library

Presented by Ben Lee, Library of Congress Innovator-in-Residence

The 16 million digitized historic newspaper pages within Chronicling America, a joint initiative by the Library of Congress and the NEH, represent an incredibly rich resource for a wide range of users. Historians, journalists, genealogists, students, and members of the American public explore the collection regularly via keyword search. But how do we navigate the abundant visual content? Newspaper Navigator is a project that Ben is currently carrying out while an Innovator-in-Residence at the Library of Congress, in collaboration with Library of Congress Labs, the National Digital Newspaper Program, and Ben's Ph.D. advisor, Professor Daniel Weld, at the University of Washington. Newspaper Navigator consists of two parts. The first is to extract headlines, images, illustrations, maps, comics, and editorial cartoons from millions of newspaper pages by training an image recognition model on thousands of crowdsourced annotations collected by the Library of Congress’s Beyond Words initiative. The second part of Newspaper Navigator is to reimagine how we can navigate this wealth of visual content through an exploratory search interface, enabling users to define queries for concepts of their own choosing (referred to as “open faceted search”).

In this talk, Ben will share current progress with Newspaper Navigator, including running the visual content recognition pipeline at scale. Ben will also discuss how this project, including the resulting datasets and search interface, can contribute to both computer science research and research within digital humanities.

Co-sponsored by the Center for Digital Humanities. This event is part of the Virtual Research Resources Series.

A Zoom link will be provided after registration.

To request accommodations for this event, please contact pulcomm@princeton.edu at least 3 working days in advance. 



COVID-19 and On-Campus Events

Princeton University is actively monitoring the situation around coronavirus (COVID-19) and the evolving guidance from government and health authorities, in keeping with our commitment to ensure the health and safety of all members of the University community. The latest communications from the Graduate School to graduate students are available here. The latest University guidance for all students, faculty, and staff is available on the University’s website.

Accessibility

To request accommodations for this or any event, please contact the organizer or James M. Van Wyck at jvanwyck@princeton.edu at least 3 working days prior to the event.