Love Data Week 2024

My Kind of Data

February 12, 2024 – February 16, 2024

For the fourth year, Johns Hopkins Libraries are hosting Love Data Week (February 12-16, 2024), an international celebration of data. This year’s theme is My Kind of Data.

JHU Love Data Week 2024 is made possible by:

JHU Love Data Week 2024 is organized by the Johns Hopkins Libraries: Sheridan Libraries, Welch Medical Library, Arthur Friedheim Library, APL Library, and SAIS Europe Library.

Love Data Week is presented in partnership with the Institute for Data-Intensive Engineering and Science (IDIES), the Stavros Niarchos Foundation (SNF) Agora Institute, and the Alexander Grass Humanities Institute (AGHI).

Contact

If you require a reasonable accommodation to enjoy and participate in these events, or if you have questions or feedback, please direct your inquiries to dataservices@jhu.edu.

Schedule

Day 1 - Monday, February 12th

Time	Topic	Presenter
10:00am Public	Welcome Remarks and Introduction	Elisabeth Long, Sheridan Dean of University Libraries, Archives, and Museums, Johns Hopkins University
10:10 – 11:00am Public	Keynote Presentation (Slides and Recording) Environment, Health and Justice: The Power of Community and Inter-disciplinary Science This talk will discuss the role of addressing environmental exposures to improve people’s health and advance environmental justice with a focus on communities that are affected by hazardous metal exposures from abandoned mines in the Northern Plains and the Southwest. Different types of data, the synergistic impact of multi-disciplinary approaches and interventions, and the role of traditional knowledge and data sovereignty will be presented.	Ana Navas-Acien, Professor of Environmental Health Sciences, Columbia University
11:00am – 12:00pm Public	Session Talks: Ethics in Data Visualization (Recording) A deep dive into how you can make your data visualizations inclusive and accessible. Do No Harm with Data Visualization (Alice Feng) (Slides) In today’s increasingly diverse world, data communicators must ensure that the way they analyze and present their data upholds the values of diversity, equity, and inclusion (DEI) and avoids perpetuating stereotypes, biases, or other types of harm. Taking a DEI perspective means considering how the specific lived experiences and perspectives of the people and communities we are studying, as well as our readers, will perceive the way we communicate data and information. This talk will offer a set of starting points on how to apply an equity lens to the way we visualize data, including considerations surrounding the use of language, color, icons and imagery, ordering, and more. Participants should walk away from this talk with an understanding of how to create and evaluate data-driven visuals that embody a DEI approach. Data Visualizations for Everybody – A Lesson on Accessibility (Nancy Shin) (Slides) Ever wondered how to make your data visualizations appealing to everybody? Discover and learn the do’s and don’ts of data visualizations as they relate to accessibility. Effective data visualizations should be more than just something “pretty” for the select few; effective visualizations should be functional and accessible to everybody! Learn the basics of accessible data visualizations in this 15-minute talk. No prior knowledge required.	Moderator: Pete Lawson, Data and Visualization Librarian, Johns Hopkins University Data Services Alice Feng, Visual Journalist, Axios Nancy Shin, Scholarly Communications Librarian, Johns Hopkins University Welch Medical Library
12:00 – 2:00pm	Break
2:00 – 3:30pm JH-Only (JHED Required)	Panel Discussion The challenge of balancing biomedical data sharing and patient privacy protection at Johns Hopkins (Recording) As funders and journals continue to expand their policies encouraging the sharing of research data and code, researchers in the biomedical field increasingly need to find ways to share their human participant research without violating any legal or ethical constraints. Balancing the sharing and protection of sensitive data can be challenging, and researchers are often unsure how to do it or whom to turn to for help. This panel brings together JHU administrators and researchers to discuss policies and procedures around the sharing of sensitive data and what resources are available to researchers at Johns Hopkins to help meet this need. Moderator: Anne K. Seymour, Associate Dean, Welch Medical Library, Johns Hopkins Medicine, School of Nursing, Bloomberg School of Public Health Tom Burns, JD, MBA, Associate Dean for Research Affairs, Johns Hopkins University School of Medicine Paul Nagy, PhD, FSIIM, Director of Education, Biomedical Informatics and Data Science, Johns Hopkins University Schools of Medicine, Public Health, and Engineering Stuart C. Ray, M.D., Professor of Medicine and Oncology, Vice Chair of Medicine for Data Integrity and Analytics, Johns Hopkins University School of Medicine, Department of Medicine, Division of Infectious Diseases Megan Kasimatis Singleton, JD, MBE, CIP, Associate Dean for Human Research Protections and Director of the Human Research Protections Program, Johns Hopkins University School of Medicine

Day 2 - Tuesday, February 13th

Time	Topic	Presenter
10:00 – 11:00am Public	Webinar Introduction to Open Scholarship and Overview of Love Data Week (Slides and Recording) What is open scholarship, and why is it important to modern scientists? This webinar will cover how to conduct your research openly and reproducibly, the benefits to you and your research community, how to get started, and showcase examples of good open scholarship. At the end of the webinar, we will provide resources to help you improve the openness of your research. In addition, we will give an overview of this week’s Love Data Week events.	Chen Chiu, Senior Data Management Consultant, Johns Hopkins University Data Services
11:00am – 12:00pm JH-Only (JHED Required)	Workshop Reproducible Research with Quarto and JupyterLab (Recording) This session will overview Quarto and JupyterLab, two popular tools for integrating code, documentation, and text into reproducible research projects. This workshop will teach you the basics of using Quarto and Jupyter Notebooks to improve the legibility of your code in R, Python, or other programming languages, focusing on features that help you create reproducible research and code for technical and non-technical audiences.	Lubov McKone, Data Management Consultant, Johns Hopkins University Data Services Pete Lawson, Data and Visualization Librarian, Johns Hopkins University Data Services
12:00 – 1:00pm	Break
1:00 – 2:00pm Public	Workshop Data for Studying the Quality of Elections and Democracy Scholars of democracy have witnessed a worldwide surge in phenomena undermining it in recent years, sounding the alarm most recently over Trump and his partisans’ disinformation campaign to undermine electoral integrity and foment insurrection. In this session, I will introduce how scholars measure the quality of elections and liberal democracy and the use of several publicly available datasets. In looking at these data, we will see how they reflect (or fail to reflect) democracy as we might conceive of it, and also how they can aid us in grasping changes, fast and slow, born out of sociopolitical, economic, and technological forces.	Bryce Corrigan, Senior Statistician and Lecturer, SNF Agora Institute, Johns Hopkins University
2:00 – 3:30pm JH-Only (JHED Required)	Panel Discussion Open Scholarship at Johns Hopkins: Experiences and Advice from Our Researchers (Recording) This panel brings JHU researchers from various disciplines to discuss open scholarship at JHU. Panelists will share their open science journeys and advice for young scholars who are interested in making research open and reproducible. Moderator: Lubov McKone, Data Management Consultant, Johns Hopkins University Data Services Pete Aceves, Assistant Professor, Management and Organization, Johns Hopkins University Carey Business School Netz Arroyo, Associate Professor of Pharmacology, Johns Hopkins University School of Medicine Hunter Gehlbach, Professor of Education, Johns Hopkins University School of Education John Muschelli, Associate Scientist, Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health Joshua Vogelstein, Associate Professor, Department of Biomedical Engineering, Johns Hopkins University Whiting School of Engineering

Day 3 - Wednesday, February 14th

Time	Topic	Presenter
10:00am – 11:00am Public	Workshop Navigating HathiTrust with Python (Recording) When working with a digital repository, a fundamental research concern is being able to construct a subcorpus. HathiTrust is currently the largest open collaborative repository that houses 18+ million volumes, offering remote and open access to scholars who work with these sources in a variety of disciplines. However, finding specific subcorpora can be really challenging in such a repository, due to labeling errors, OCR errors and more. This workshop covers a research workflow for programmatically interacting with HathiTrust that can be applied to a variety of tasks, including finding subcorpora within HathiTrust that aren’t directly or cleanly labeled by the metadata.	Hale Sirin, Post-Doctoral Researcher, Johns Hopkins Center for Digital Humanities
11:00am – 12:00pm Public	Workshop Exploring Archives with OCR (and LLMs) (Slides and Recording) The development of Large Language Models like ChatGPT has made it increasingly easy to get started with rendering printed historical texts machine readable—the stumbling block for many digital humanities projects. This workshop will introduce participants to the basics of the practice through an easy-to-learn and easy-to-customize workflow, enabling participants to turn pictures into searchable documents. No coding experience is necessary!	Sam Backer, Postdoctoral Fellow, Johns Hopkins Center for Digital Humanities Louis Hyman, Dorothy Ross Professor of History, Johns Hopkins University
12:00 – 1:00pm	Break
1:00 – 2:00pm JH-Only (JHED Required)	Workshop Organizing and Sharing Qualitative Data via the Qualitative Data Repository (QDR) (Slides and Recording) In this webinar, researchers will learn about options for sharing qualitative data, with a focus on the Qualitative Data Repository, of which JHU is an institutional member. QDR’s Senior Curation Specialist will show how the deposit process works and will present examples of nuanced data sharing possibilities, including of sensitive data, securely shared under varying levels of access control. More broadly, researchers will gain a better understanding of qualitative data sharing norms and practices in the social and health sciences, and of simple steps they can take to manage qualitative research materials during the course of projects, which help them stay organized in their own work and enable future archiving and re-use of the materials they so carefully collect. This webinar is designed for anyone who uses qualitative or multi-method techniques in their research (such as faculty, graduate students, and independent researchers, and the staff who support them) and is especially timely for those applying for federal or nonprofit funding who need to prepare a data management / sharing plan (DMP or DMSP) for their qualitative or multi-method research projects. Please note: not all qualitative data are suitable for QDR due to legal and ethical constraints. Always work with your IRB to determine whether data sharing in QDR is appropriate.	Dessi Kirilova, Senior Curation Specialist, Qualitative Data Repository
2:00 – 3:30pm Public	Panel Critical Approaches to Data Sharing (Recording) This panel will highlight justice-focused initiatives that enact data sharing and analysis in ways that disrupt the power dynamic between the researcher and the researched. Topics of discussion will include data sharing for accountability, participatory or community-based research, and ethical approaches to collecting and sharing data that represent marginalized experiences. Moderator: Kevin S. Hawkins, Program Director for the Opioid Industry Documents Archive, Johns Hopkins University Parker E. Foster, Research Associate, New York University Metro Center, Center for Policy, Research, and Evaluation Sofia Locklear, Assistant Professor, University of Toronto Mississauga Lauren Rubin, Director of Development, St. Francis Neighborhood Center

Day 4 - Thursday, February 15th

Time	Topic	Presenter
10:00 – 11:00am JH-Only (JHED Required)	Session Talks: An Introduction to Finding Data across Multiple Disciplines (Recording) We invite experts from Sheridan Libraries and Welch Medical Library to present how to search and access data in various academic disciplines. Business Location Data via the Sheridan Libraries (Jim Gillispie) When we think about places in our local community, what often comes to mind are those anchoring businesses where we shop and connect with our neighbors. In studying communities, researchers recognize that business type and location have the potential to add to or detract from the vitality of the surrounding area. With the availability of library-licensed business location databases, it is now easier than ever to take a closer look at changes in business patterns over time and then visualize that business data in combination with other demographic factors. This presentation will focus on business location sources readily available via the Sheridan Libraries. Data for Your Business Plan or Pitch (Katy Troeschel) Want to see if your business idea or invention makes sense? Learn where to find the stats that demonstrate your market size and value for your idea. Then we’ll explore tools and templates to help you compile the information you find. A Brief Introduction to dbGaP: An Archive of Genotype/Phenotype Data from NCBI (Rob Wright) Using the listing of NIH-supported data repositories as a starting point, I’ll focus on the Database of Genotypes and Phenotypes (dbGaP), which is a core repository for human genotype/phenotype studies. I’ll describe dbGaP, quickly work through a searching example, and highlight the contents of a dbGaP record, including the section on requesting controlled-access genomics data.	Moderator: Bonni Wittstadt, Geospatial Services Librarian, Johns Hopkins University Data Services Jim Gillispie, Social Science Librarian, Johns Hopkins University Sheridan Libraries Katy Troeschel, Technology & Entrepreneurship Librarian, Johns Hopkins University Sheridan Libraries Rob Wright, Basic Science Informationist, Johns Hopkins University Welch Medical Library
11:00am – 12:00pm Public	Workshop Version Control: Using Git and GitHub (Slides and Recording) Do you use different file names to track versions of your files? Are you wondering why everyone else is using GitHub? Do you comment out a chunk of code to test or debug? Do you want a system that can handle version control for you automatically? If you answered yes to any of the questions above, join this session to learn version control with Git, using GitHub as a Git hosting platform. We will use a graphical user interface (GUI) tool, GitHub Desktop, to get you started with Git and GitHub. No knowledge of the command line is needed for this session.	Chen Chiu, Senior Data Management Consultant, Johns Hopkins University Data Services
12:00 – 1:00pm	Break
1:00 – 2:00pm JH-Only (JHED Required)	Session Talks: Open Scholarship Support at JHU (Recording) Representatives from three JHU offices present the open scholarship support each provides to JHU researchers. An Overview of Data Services and the Institutional Repository (Chen Chiu) (Slides) Data Services helps JHU researchers navigate data. We help faculty, researchers, and students find, use, manage, visualize, and share data. In addition, we operate an institutional repository, the Johns Hopkins Research Data Repository, where JHU researchers can deposit and share their research data. Here, we provide an overview of our services and a brief introduction to sharing data in our institutional repository. Introducing the Johns Hopkins Open Source Programs Office (Megan Forbes) In this talk, we’ll delve into the world of Johns Hopkins University’s Open Source Programs Office (OSPO), a hub for faculty, researchers, students, and staff creating, contributing to, and participating in open source communities. We’ll explore the “why” behind open source – how it fosters transparency, reproducibility, and innovation across disciplines. Then, we’ll discuss the OSPO’s mission and share the practical services and support we offer, from licensing guidance to help building sustainable communities. Finally, we’ll zoom out to see how the OSPO connects with the library’s broader open science and open access initiatives. Introduction to the Institute for Data Intensive Engineering and Science, SciServer, and the Scientific Software Engineering Center (Gerard Lemson) An introduction to the Institute for Data Intensive Engineering and Science (IDIES), its role in accelerating research software development through the Scientific Software Engineering Center (SSEC), and SciServer, an open collaborative environment for interactive analysis of extremely large datasets.	Moderator: David Fearon, Senior Data Management Consultant, Johns Hopkins University Data Services Chen Chiu, Senior Data Management Consultant, Johns Hopkins University Data Services Megan Forbes, Program Manager, Open Source Programs Office, Johns Hopkins University Sheridan Libraries Gerard Lemson, Director of Science of the Institute for Data Intensive Engineering and Science and Associate Director of the Scientific Software Engineering Center
2:00 – 3:30pm Public	Workshop Designing Effective Data Visualizations (Slides and Recording) Do you want to make your data visualizations more impactful? It’s said that a picture is worth a thousand words; a well-designed data visualization is essential in concisely and clearly communicating complex ideas to your audience. Join JHU Data Services in this workshop session, where we’ll cover design principles and recommended practices for creating effective data visualizations. There is no required prerequisite for this class, but experience preparing data visualizations would be useful.	Pete Lawson, Data and Visualization Librarian, Johns Hopkins University Data Services

Day 5 - Friday, February 16th

Time	Topic	Presenter
10:00 – 10:05am Public	Closing Remarks	Bill Branan, Senior Manager of Digital Research & Curation Center, Johns Hopkins University Sheridan Libraries
10:05 – 11:00am Public	Keynote Presentation (Slides and Recording) Science in the Era of AI Artificial Intelligence is having a tremendous impact on every aspect of our life, including science. The talk will discuss various aspects of how science is changing very rapidly and contrast trends in academia with those in industry. It is clear that data science and AI will transform universities and the scientific method, but today it is difficult to separate realistic expectations from the extreme hype. The talk will emphasize the importance of fundamental statistical principles, how science must go beyond applying black boxes, and how AI can also help in designing and controlling future experiments.	Alexander S. Szalay, Professor of Physics and Astronomy, Johns Hopkins University
11:00am – 12:00pm Public	Session Talks: Analyzing the Archive: Working with Data from Text, Audio, and Video (Recording) How might a computational approach to archival materials help researchers reconstruct hidden histories? These session talks will feature three projects that have synthesized archival materials such as corporate documents and handwritten tables to illuminate suppressed narratives. Supporting use of computational methods on the Opioid Industry Documents Archive (Kevin S. Hawkins) (Slides) The OIDA team at JHU is developing the OIDA Toolbox – a set of options for accessing and working with the archive’s growing collection of millions of documents arising from the opioid industry. This presentation will give a brief overview of the OIDA Toolbox and these options. Teaching with Historical Data (Casey Marina Lurtz) There’s nothing like a messy handwritten table from 1900 to get people thinking about all the decisions and politics that go into making data. By introducing students to historical data and the processes of transcribing, standardizing, and visualizing it, we can help them understand what’s behind the statistics we so often take as a given. Underwriting Souls: digitizing corporate archives and humanizing data (Alexandre White) Underwriting Souls, launched in the fall of 2023, aims to contextualize our understanding of a particular archive of slavery, one centered on the financial systems that formed and maintained the Trans-Atlantic Slave Trade until its demise in the British Empire in 1807. It is also meant to serve as a critical means to question and confront the narratives that emerge from relatively unexamined documents. The collections of Lloyd’s, the largest insurance and reinsurance market in the world, are critically important historical materials for understanding the histories of the Atlantic Slave Trade. While this collection details only a small period of the Atlantic Slave Trade, roughly from 1794 up to the abolition of the British slave trade in 1807, these materials still provide a complex picture of the role of Lloyd’s during the height of the slave trade. These materials also provide key insights into the practices of the Lloyd’s market and the process of insuring slaving voyages. Such records are scant and limited, and to our knowledge some of these materials are likely the only ones of their kind. Furthermore, these materials shed light on the ways that slavery and the slave trade shaped the City of London.	Moderator: Lubov McKone, Data Management Consultant, Johns Hopkins University Data Services Kevin S. Hawkins, Program Director for the Opioid Industry Documents Archive, Johns Hopkins University Casey Marina Lurtz, Assistant Professor of History, Johns Hopkins University Alexandre White, Assistant Professor of Sociology, Johns Hopkins University
12:00 – 2:00pm	Break
2:00 – 4:00pm Public	Workshop SciServer.org: Easy Online Analysis of Big Data for Research and Education (Recording) Do you use big scientific data in your research or teaching? Learn how to use the SciServer online science platform to search through, visualize, and analyze petabytes of data quickly and easily. Bring the analysis to the data to run quick, effortless cross-comparisons among our many available public datasets – or even your own data. Visualize data, explore patterns, and fit models with powerful computational tools in Python, R, MATLAB, or Julia. Share your data and analysis scripts with colleagues and students. Create demos and assign homework exercises for introductory through graduate courses using SciServer’s educational tools. In this workshop, we will show you how to do all this and more with hands-on experience using the tools of SciServer (www.sciserver.org).	Jordan Raddick, Associate Director for Education at the Institute for Data-Intensive Engineering and Science (IDIES), Johns Hopkins University