XSEDE Science Successes
Using Supercomputers to Illuminate the Renaissance
XSEDE's ECSS, Bridges map 16th and 17th century social networks
Published on December 1, 2016 by Faith Singer-Villalobos
Most of us have heard about the Six Degrees of Kevin Bacon, based on the "six degrees of separation" concept, which posits that any two people on Earth are six or fewer acquaintance links apart.
Now, there's a similar game in town: Who knew whom in Renaissance Britain?
This is the question that the project "Six Degrees of Francis Bacon" seeks to answer. "We're leveraging 21st century computational methods in order to illuminate the past," said Christopher Warren, associate professor of English at Carnegie Mellon University.
The computational methods include machine learning, graph inference, and web development, which the team uses to reconstruct and communicate the social networks of early modern Britain from about 1500 to 1700.
Carnegie Mellon University and Georgetown University researchers, including Warren, Daniel Shore, Jessica Otis, Scott Weingart, Cosma Shalizi, and Raja Sooriamurthi, created this digital humanities project to mine large volumes of historical scholarship, measuring how often names are mentioned together as a way of modelling social networks.
Their work is published in the July 2016 edition of Digital Humanities Quarterly.
"Our website allows scholars, students, and citizen humanists to improve the network — that is, add relationships to validate some of the inferences that we've made, and in many cases to reject some of the statistical inferences. This means that over time we get a more accurate representation of the social networks of the period," Warren said.
In essence, Warren and his colleagues take the history of scholarship, insofar as it has been digitized, and run it through algorithms to see how often any two names are mentioned together. The machine learning component then models past relationships from those counts. The hope is to find a model that accords with what they've learned through years of study and helps extend their knowledge to new networks.
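The counting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `cooccurrence_counts`, the three-sentence corpus, and the rule "two names co-occur if both appear in the same document" are all invented here, and plain substring matching stands in for whatever name recognition and statistical modelling the project actually uses.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents, names):
    """Count, for each unordered pair of names, how many documents mention both.

    Hypothetical helper for illustration; not the project's code.
    """
    counts = Counter()
    for text in documents:
        # Names found in this document, sorted so each pair has one canonical order.
        present = sorted(name for name in names if name in text)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


# Toy corpus invented for this example (placeholder sentences, not real sources).
documents = [
    "This entry mentions Francis Bacon and Ben Jonson.",
    "This entry mentions Ben Jonson and William Shakespeare.",
    "This entry mentions William Shakespeare and Ben Jonson again.",
]
names = ["Francis Bacon", "Ben Jonson", "William Shakespeare"]

counts = cooccurrence_counts(documents, names)
for pair, n in counts.most_common():
    print(pair, n)
```

Raw counts like these would only be the input to a model; turning them into relationship probabilities is the statistical step the article attributes to the machine learning component.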
Trying to understand the historical context of the major literary and artistic works and ideas that emerged in the 16th and 17th centuries is no easy feat. The 200-year period that brought us the Reformation and the scientific method also brought us Hamlet, calculus, and the microscope.
"The only way you can understand any of these things is by understanding the context from which they emerge. If we want to understand how we got something like Paradise Lost or the separation of church and state, it's going to require us to pay attention to who knew whom and how ideas spread, and the ways in which our modern world is in crucial ways a function of historical social networks," Warren said.
Take, for example, the relationship between authors William Shakespeare and Christopher Marlowe. People have long supposed that Shakespeare and Marlowe existed in the same milieu and more than likely knew one another. But what scholars are finding now based on internal analysis of their work is that they were more than likely co-authors. In the new Oxford edition of Shakespeare's complete works, which will be available this month, Marlowe will be credited as such.
This is precisely the kind of finding that can be integrated into Six Degrees of Francis Bacon. As scholars find more examples of relationships, they can go to the website, add them, and see them integrated into the most current picture of scholarly knowledge.
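One way to picture how a scholar's contribution changes the network is a relationship record that carries a confidence value and can be confirmed or rejected. The sketch below is hypothetical: the `Relationship` class, its field names, and the `confirm` and `reject` methods are assumptions made for illustration, not the website's actual data model. The 0.75 starting confidence is the estimate Warren cites for this pair.

```python
from dataclasses import dataclass


@dataclass
class Relationship:
    """Hypothetical record for one edge in the network (illustration only)."""
    person_a: str
    person_b: str
    confidence: float          # estimated probability that the two knew each other
    source: str = "inferred"   # "inferred", "confirmed", or "rejected"

    def confirm(self):
        # A scholar supplies evidence: treat the relationship as established.
        self.confidence = 1.0
        self.source = "confirmed"

    def reject(self):
        # A scholar rejects the statistical inference.
        self.confidence = 0.0
        self.source = "rejected"


# Statistically inferred edge, before any scholar has reviewed it.
edge = Relationship("William Shakespeare", "Christopher Marlowe", confidence=0.75)
edge.confirm()
print(edge)
```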
"In our case, we inferred because Shakespeare and Marlowe's names often appear near one another that they probably knew one another at a 75 percent probability," Warren said. "Recent evidence seems to confirm this, so we can bump that confidence up to 100 percent and have an even better picture of the past."
A project like this generates tons of data.
So, in July 2016, Warren and his colleagues became users of the Extreme Science and Engineering Discovery Environment (XSEDE) to help them analyze the data and to expand their data sources.
"We've primarily been working with the Oxford Dictionary of National Biography (ODNB), which is the gold standard of British lives from the Roman Empire to the present. Much of our initial work is with that corpus," Warren said.
But they needed more sources to verify the validity of the relationships they had found. This expansion, using the Bridges supercomputer at the Pittsburgh Supercomputing Center (PSC), was made possible with help from XSEDE's Extended Collaborative Support Services (ECSS). David Walling, the ECSS expert at TACC, is helping them see if the process they used on the ODNB can be extended to other corpora such as historical journals; ECSS expert David O'Neal at PSC helped to adapt the group's prior work to Bridges. The researchers decided to use Bridges because it brings unique capabilities for nontraditional applications to the XSEDE ecosystem.
"If we look at a large corpus of journal articles and we ask how often names appear near one another do we get a similar result as the ODNB or do we get something different?" Warren asked.
With the help of ECSS, they now have 15,000 people in the database and on the order of 100 million possible relationships.
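The "100 million" figure follows from simple counting, assuming every unordered pair of people is treated as one possible relationship: with n people there are n(n - 1)/2 pairs. A quick check in Python (the variable names are illustrative):

```python
import math

people = 15_000

# Number of unordered pairs: n * (n - 1) / 2
possible_pairs = math.comb(people, 2)

print(f"{possible_pairs:,}")  # 112,492,500
```

That is roughly 1.1 × 10^8 candidate pairs, consistent with "on the order of 100 million."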
"Once you employ computational techniques you can start to assemble relationships at a much greater scale," Warren said. "This is something no human could ever have in their head. By putting this together and making it available for the scholarly community we hope that we're facilitating a new way of doing scholarship that allows for a full appreciation of these historical networks."
"We couldn't develop the project in the direction that is most useful without ECSS," he said. "ECSS allowed us to extend our early work and move forward with it rather than spin our wheels. I can't say enough about the impact that the ECSS program has had for the project."
Although most advanced computing is used for the hard sciences like physics and chemistry, this project is a unique collaboration between computer scientists and humanists. The project couldn't progress without the help of Walling, who is adapting code in R and deploying the website code onto virtual machines. The domain expertise of both computer scientists and humanists was central to the success of the project.
"Right now, part of my role is to take all of the R code which produced the network graphs that are visualized on the website, and make that R code usable with new datasets," Walling said.
R is a programming language widely used among statisticians and data miners for developing statistical software and data analysis. In addition, Walling is working in collaboration with XSEDE Campus Champions Fellow Xinlian Liu of Hood College. Their current focus is on applying the workflow established by Warren's research to a set of more than 450,000 articles from the JSTOR digital data collection.
This was Warren and his collaborators' first foray into XSEDE and ECSS. He has an interesting viewpoint on it.
"I'm not sure we would have become involved in XSEDE if it were not for ECSS," he said. "The collaborative support model was attractive because someone with my background and training was intimidated by the prospect of using supercomputers. Knowing that there was a process to get our team up to speed was incredibly influential in bringing us on board."
Walling is fairly new to this type of deep collaboration as well. "It's relatively early in the project so I'm still learning the specifics of the algorithms, how they work, and why they are particular to this data set," he said. "To me, the interesting parts of the project are the machine learning algorithms and the statistical analysis that goes into building these social networks from text documents. I'm excited to have the chance to dig deeper into the consulting roles of different groups getting to see what people are actually doing with our systems."
Since the website went live in September 2015, it has received more than 50,000 hits, and about 500 active users have created accounts and are contributing to the picture of the past. In addition, this research project is being taught in classrooms across the United States, and it has been the focus of several workshops.
"It's been a successful launch and one that we hope the ECSS and XSEDE program can continue to help support," Warren said. With a grant from the National Endowment for the Humanities, Warren and his colleagues are planning to re-design the website and find a long-term home for the project.
"One challenge that visual projects like this face is preservation," he said. "Libraries are really good at preserving hard-copy books but digital artifacts like websites can fall obsolete very quickly."
The ultimate goal in redesigning the website is releasing the code to the scholarly community so other people can build and create similar networks for other time periods.
"We're doing a lot of documenting of the existing code to make it more user friendly, helping anyone who might be interested in doing something similar," Warren concluded.
Learn more: www.sixdegreesoffrancisbacon.com
Twitter: @6bacon, @6Bacon_Bot


