From Glass to Gigabytes

In Building D at the Harvard College Observatory, there is a record of the universe. This particular version – reasonably complete – is made of glass and it weighs about 300 tons. Less a collection than a coalescence, the Harvard plate stacks contain roughly 525,000 photographs in all. These images of the night sky were taken from observatories as far-flung as New Zealand, Peru, and South Africa.


Storage of astronomical plates at the Harvard College Observatory. Image source.

This data collection represents the congealed labor of hundreds of astronomers working over several decades, and millions of hours of travel and work. The oldest images are daguerreotypes dating to before the American Civil War; the most recent photographs were taken at the end of the Cold War.


Halley’s comet, photographed on April 21, 1910 from Arequipa, Peru with the 8-inch Bache Doublet, Voigtlander. The exposure was 30 minutes. Source.

The photographic emulsion on each of these plates gives information about the brightness and location of tens of thousands of different objects. Additional inspection of the plates provided more information and analysis. For example, consider the image below. This is a photo negative of the Large Magellanic Cloud. It was taken in January 1897 by an astronomer working at a Harvard-operated telescope in Arequipa, Peru. After the plate was developed, it circulated back to Cambridge for analysis. Each of the notations on the plate was made by one of the “women computers” that observatory director Edward C. Pickering employed. Each marking on the plate signals a star or other object of interest, some of which would be explored further.


This image of the Large Magellanic Cloud was taken in January 1897 by a Harvard astronomer working in Arequipa, Peru. Source.

Stars, planets, galaxies, along with the occasional comet or asteroid, were all captured on glass. Occasionally, non-astronomical oddities were recorded too.


Praying mantis recorded on January 10, 1925 in an image made at Bloemfontein, South Africa.

The sum total offers an analog record of the universe unmatched in terms of sky coverage and time span. The exposed plates – most are eight by ten inches in size – were shipped back to the HCO for preservation and storage. Only on rare occasions would one of the fragile plates circulate out again, perhaps traveling from HCO to another observatory. But, most of the time, the plates remained in Cambridge, archived in sturdy olive green cabinets. Astronomers wanting to use the collection had to travel to Cambridge. The collection’s librarian annotated the brown paper envelope each plate was kept in with additional details, creating “metadata” – where a plate was taken and by whom, which telescope was used, and perhaps who had found it of especial scientific value.


Envelope notations for one of HCO’s 525,000+ plates; this indicates when the image was made (1949) and what region of the sky was observed.

Logbooks maintained by observers recorded other important metadata. Here’s an example:


Logbook page from 1888. All of these journals are in the process of being digitized, thanks to the efforts of volunteer George Champine, who passed away in 2013.

Just as assembling the Harvard plate collection was time consuming and laborious, so was working with the items in it. Once a desired plate was located, a researcher would pore over it for hours with a high-magnification eyepiece to extract useful information from the data the plate recorded. As data generated by modern astronomical instrumentation of the mid-1970s onward was increasingly “born digital” (and utilized as such), the analog photographic plates represented a wasting and unwieldy asset to many scientists.

For the Harvard plate collection, however, these issues of access, usefulness, and circulation are changing. Over the last decade, a group of professional and amateur astronomers has constructed and begun operating the Digital Access to a Sky Century @ Harvard (DASCH) project.

Conceived by astronomer Jonathan Grindlay and executed by a team of staff members, students, and volunteers, the goal of DASCH is to distill and condense – via a custom-built scanning machine and automatic data-processing and calibration pipeline – the astronomical information contained in all of those glass plates into digital data.

To get to the heart of DASCH, you descend one of Building D’s tightly wound spiral staircases. Eventually, you get to a small climate-controlled room dominated by a specially designed digital scanning machine.


The DASCH machine; it can scan two 8″x10″ plates at a time.

The entire apparatus rests on a one-ton granite table to minimize vibration errors. A custom camera above the scanner bed stitches overlapping frames made of the photographic plates – some collected just a few decades after Charles Babbage produced a prototype “difference engine” to help process astronomical data – into a composite digital image.

The DASCH project’s final product will be a database, an archive of astronomical information publicly accessible online, containing the brightness and position of all the stars on all the HCO plates. When running at full capacity, the machine can process two plates simultaneously in less than two minutes, generating data equivalent to a DVD containing a typical Hollywood film. Eventually, the astronomical information contained in those 300 tons of glass will be refashioned into about 1500 gigabytes of processed, searchable, and available digital data.
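To get a feel for the scale, here is a rough back-of-the-envelope sketch in Python. It uses only figures quoted in this post (about 525,000 plates, two plates per pass, under two minutes per pass) and assumes, unrealistically, that the scanner runs nonstop with no time lost to plate cleaning, loading, or maintenance. The function and variable names are my own, not anything from the DASCH pipeline.

```python
import math

# Figures quoted in the post; the plate count is approximate.
PLATES = 525_000         # size of the Harvard plate collection
PLATES_PER_PASS = 2      # the scanner handles two plates at a time
MINUTES_PER_PASS = 2.0   # upper bound: "less than two minutes" per pass


def scanning_estimate(plates=PLATES, per_pass=PLATES_PER_PASS,
                      minutes_per_pass=MINUTES_PER_PASS):
    """Return (passes, hours, days) of continuous scanning.

    Assumes ideal, uninterrupted operation, so the result is a lower
    bound on the calendar time a real project would need.
    """
    passes = math.ceil(plates / per_pass)
    hours = passes * minutes_per_pass / 60
    days = hours / 24
    return passes, hours, days


passes, hours, days = scanning_estimate()
print(f"{passes:,} passes, {hours:,.0f} hours, about {days:.0f} days of nonstop scanning")
```

Even under these idealized assumptions, the scan time alone comes to roughly a year of round-the-clock operation, which helps explain why digitizing the collection is a multi-year effort.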

When I visited DASCH this past summer, I was reminded of two things. First, DASCH encourages us to keep in mind that astrophysics – like geology, paleontology and so forth – is a historical as well as observational science. Digital data archives for astronomy, besides rejuvenating “old” data, offer a “Janus-faced perspective” for scientists to look into the past while creating new data for the future.

Second, DASCH highlights the fact that sharing and circulation of data are the central activities in science. Without these, in fact, there is no science. However, sharing and circulation of data demand an increasing fraction of researchers’ time, money, and expertise. There is what Paul Edwards and others call “science friction,” an obstacle to overcome in order for data to move and do useful work. DASCH’s conversion of analog data into a digital format is one example of how this friction is made less sticky.

The importance of sharing data is a concern that transcends specific institutions, individual research questions, and national boundaries. For all astronomers, it is, in both senses of the phrase, a universal concern.
