Introduction to Digital Libraries
This tutorial offers a thorough introduction to the field of digital libraries (DLs), providing a firm foundation in key concepts and terminology as well as services, systems, technologies, methods, standards, projects, issues, and practices. It introduces and builds upon a theoretical framework (‘5S’: Streams, Structures, Spaces, Scenarios, Societies), explains all the key parts of a ‘minimal digital library’, and expands from that basis to cover key DL issues. Illustrations come from case studies and current projects, including work with webpages, social networks, and long documents. New material will also be added on building digital libraries using container, serverless, and cloud services.
Introduction to and Hands-On Use Cases with HathiTrust Research Center’s Extracted Features 2.0 Dataset
This tutorial introduces attendees to the HathiTrust Research Center’s Extracted Features (EF) Dataset and demonstrates the new data fields and functionality introduced in the latest version, 2.0. Generated from the more than 17 million volumes in the HathiTrust Digital Library, the EF 2.0 Dataset supports text and data mining methods while still adhering to a public domain, restriction-free data model. The tutorial covers the key concepts behind the dataset’s creation and walks through hands-on research use cases using Python notebooks.
Systemic Challenges and Computational Solutions on Bias and Unfairness in Peer Review
Peer review is the backbone of scientific research and determines the composition of scientific digital libraries. Any systemic issues in peer review – such as biases or fraud – can systematically corrupt the resulting scientific digital library as well as any analyses of that library. They also affect billions of dollars in research grants awarded via peer review, as well as the entire careers of researchers. The tutorial will discuss various systemic issues in peer review through insightful experiments, several computational solutions proposed to address these issues, and a number of important open problems.