From Glass to Gigabytes

In Building D at the Harvard College Observatory, there is a record of the universe. This particular version – reasonably complete – is made of glass and it weighs about 300 tons. Less a collection than a coalescence, the Harvard plate stacks contain roughly 525,000 photographs in all. These images of the night sky were taken from observatories as far-flung as New Zealand, Peru, and South Africa.

Storage of astronomical plates at the Harvard College Observatory. Image source.

This data collection represents the congealed labor of hundreds of astronomers working over several decades and millions of hours of travel and work. The oldest images are daguerreotypes dating to before the American Civil War; the most recent photographs were taken at the end of the Cold War.

Halley’s comet taken on April 21, 1910 from Arequipa, Peru with the 8-inch Bache Doublet, Voigtlander. The exposure was 30 minutes. Source.

The photographic emulsion on each of these plates gives information about the brightness and location of tens of thousands of different objects. Additional inspection of the plates provided more information and analysis. For example, consider the image below. This is a photo negative of the Large Magellanic Cloud. It was taken in January 1897 by an astronomer working at a Harvard-operated telescope in Arequipa, Peru. After the plate was developed, it circulated back to Cambridge for analysis. Each of the notations on the plate was made by one of the “women computers” that observatory director Edward C. Pickering employed. The markings on the plate signal a star or other object of interest, some of which would be explored further.

This image of the Large Magellanic Cloud was taken in January 1897 by a Harvard astronomer working in Arequipa, Peru. Source.

Stars, planets, galaxies, along with the occasional comet or asteroid, were all captured on glass. Occasionally, non-astronomical oddities were recorded too.

Praying mantis recorded on January 10, 1925 in an image made at Bloemfontein, South Africa.

The sum total offers an analog record of the universe unmatched in terms of sky coverage and time span. The exposed plates – most are eight by ten inches in size – were shipped back to the HCO for preservation and storage. Only on rare occasions would one of the fragile plates circulate out again, perhaps traveling from HCO to another observatory. But, most of the time, the plates remained in Cambridge, archived in sturdy olive green cabinets. Astronomers wanting to use the collection had to travel to Cambridge. The collection’s librarian annotated the brown paper envelope each plate was kept in with additional details, creating “metadata” – where a plate was taken and by whom, which telescope was used, and perhaps who had found it of especial scientific value.

Envelope notations for one of HCO’s 525,000+ plates; this indicates when the image was made (1949) and what region of the sky was observed.

Logbooks maintained by observers recorded other important metadata. Here’s an example:

Logbook page from 1888. All of these journals are in the process of being digitized, thanks to the efforts of volunteer George Champine, who passed away in 2013.

Just as assembling the Harvard plate collection was time consuming and laborious, so was working with the items in it. Once a desired plate was located, a researcher would pore over it for hours with a high-magnification eyepiece to extract useful information from the data the plate recorded. As data generated by modern astronomical instrumentation of the mid-1970s onward was increasingly “born digital” (and utilized as such), the analog photographic plates represented a wasting and unwieldy asset to many scientists.

For the Harvard plate collection, however, these issues of access, usefulness, and circulation are changing. Over the last decade, a group of professional and amateur astronomers have constructed and begun operating the Digital Access to a Sky Century @ Harvard (DASCH) project.

Conceived by astronomer Jonathan Grindlay and executed by a team of staff members, students, and volunteers, DASCH aims to distill and condense – via a custom-built scanning machine and automatic data-processing and calibration pipeline – the astronomical information contained in all of those glass plates into digital data.

To get to the heart of DASCH, you descend one of Building D’s tightly wound spiral staircases. Eventually, you get to a small climate-controlled room dominated by a specially designed digital scanning machine.


The DASCH machine; it can scan two 8″x10″ plates at a time.

The entire apparatus rests on a one-ton granite table to minimize vibration errors. A custom camera above the scanner bed stitches overlapping frames made of the photographic plate – some collected just a few decades after Charles Babbage produced a prototype “difference engine” to help process astronomical data – into a composite digital image.

The DASCH project’s final product will be a database, an archive of astronomical information publicly accessible on-line, containing the brightness and position of all the stars on all the HCO plates. When running at full capacity, the machine can process two plates simultaneously in less than two minutes, generating data equivalent to a DVD containing a typical Hollywood film. Eventually, the astronomical information contained in those 300 tons of glass will be refashioned into about 1500 gigabytes of processed, searchable, and available digital data.
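
To get a feel for the scale of the job, here is a rough back-of-the-envelope calculation in Python. It uses only the figures quoted above (roughly 525,000 plates, two plates per scan, at most two minutes per scan) and assumes, unrealistically, that the machine runs around the clock with no time lost to cleaning, loading, or maintenance. The variable and function names are my own, not anything from the DASCH software.

```python
# Back-of-the-envelope estimate of pure scanning time for the plate stacks.
# Assumptions (illustrative only): continuous operation at full capacity,
# no time for plate cleaning, loading, or downtime.

PLATES = 525_000          # approximate size of the collection
PLATES_PER_SCAN = 2       # the machine holds two 8x10-inch plates at a time
MINUTES_PER_SCAN = 2.0    # upper bound: "less than two minutes" per pair


def scanning_time_minutes(plates, plates_per_scan, minutes_per_scan):
    """Return the total scanning time in minutes, rounding scans up."""
    scans = -(-plates // plates_per_scan)  # ceiling division
    return scans * minutes_per_scan


total_minutes = scanning_time_minutes(PLATES, PLATES_PER_SCAN, MINUTES_PER_SCAN)
total_hours = total_minutes / 60
total_days = total_hours / 24

print(f"{total_minutes:,.0f} minutes = {total_hours:,.0f} hours = {total_days:,.1f} days")
```

Under these idealized assumptions the scanning alone comes to about a year of nonstop operation, which helps explain why a project like this stretches over many years in practice.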

When I visited DASCH this past summer, I was reminded of two things: First, DASCH encourages us to keep in mind that astrophysics – like geology, paleontology and so forth – is a historical as well as observational science. Digital data archives for astronomy, besides rejuvenating “old” data, offer a “Janus-faced perspective” for scientists to look into the past while creating new data for the future.

Second, DASCH highlights the fact that sharing and circulation of data are the central activities in science. Without these, in fact, there is no science. However, sharing and circulation of data demand an increasing fraction of researchers’ time, money, and expertise. There is what Paul Edwards and others call “science friction,” an obstacle to overcome in order for data to move and do useful work. DASCH’s conversion of analog data into a digital format is one example of how this data friction is made less sticky.

The importance of sharing data is a concern that transcends specific institutions, individual research questions, and national boundaries. For all astronomers, it is, in both senses of the phrase, a universal concern.

Scientists as Customers?

Would Karl Marx smile and nod sagely if he observed how scientists do their work today? What would a business efficiency expert say to a scientist today? I had these thoughts while recently thumbing through a new issue of the pop science magazine Nautilus. Because, right on page 3, there’s this:

Screen Shot 2014-10-06 at 11.15.11 AM

The text at the bottom is hard to read so here’s a detail:

Screen Shot 2014-10-06 at 11.15.20 AM

At first I thought nothing of it and just kept reading. But this announcement kept coming back to me, raising all sorts of questions. For example: at whom is this message aimed? Presumably not many readers of Nautilus will be jetting off to Chile to use the Very Large Telescope or any of the other science facilities the European Southern Observatory operates.

Screen Shot 2014-10-06 at 11.20.48 AM

OK then, so this isn’t an advertisement to drum up visitors to Cerro Paranal or solicit proposals for telescope time.

No, something else is going on here. ESO’s advertisement must be read as a boast – it’s trumpeting the efficiency and effectiveness of its scientific facilities. Its observatories are, ESO claims, the “most productive” in the world. This is not the same as proclaiming that they produce the “best science” which is a much harder claim to make.

This focus on productivity, and its close cousin, efficiency, got me thinking about Frederick Winslow Taylor. In 1911, Taylor published his book The Principles of Scientific Management.

Screen Shot 2014-10-06 at 11.32.04 AM

Although little remembered today, it’s one of the 20th century’s most influential books. In it, Taylor laid out a philosophy of managing workers and work flow with the aim of solving some of that era’s labor problems (and making business more profitable). In short, he wanted to get manual laborers to do more work in the same amount of time. Workers, to put it mildly, objected to Taylor’s intrusion into their workplace. Moreover, in some cases, they proved that Taylor’s methods were anything but scientific. When you read today about managers monitoring the workplace, keeping track of key strokes, and recording service calls – thank Taylor.

Shift from scientific management to managing science. Until the 1990s, telescopes were most often operated in what’s called “classical mode.” You can picture the scene – astronomer at the telescope, late at night, alone, cold, heroically working to unravel the mysteries of the Universe. Something like this, although maybe without the coat and tie:

1936 image by Russell Porter of astronomer using the 200-inch at Palomar.

Fast forward 40 years…astronomers’ nightly work now looked very much like this.  As I’ve written, computers changed everything about how astronomy was done.

Astronomer Caty Pilachowski, c. 1988, using 4-meter telescope at Kitt Peak.

Along with computers came the introduction in the 1990s of what’s known as queue observing. In fact, computers and computer models made this possible. We might think of this new way of doing science as an application of Taylor’s general goal of maximizing efficiency to science. Successful proposals for telescope time are put into an observatory’s queue and executed by staff astronomers when observing conditions are suitable. ESO operates its big facilities in Chile in this fashion, as do many other major observatories.1

Advocates of queue observing stress that it enables science facilities to be used more efficiently. This isn’t trivial when a night of observing time can cost upwards of $1 per second. Opponents of queue scheduling argued that this mode of doing science might produce a generation of researchers who were, as Karl Marx might have said, alienated from the means of production. As one scientist remarked in 1996, “I am really worried about the Nintendo mentality in astronomy.”

Decades earlier, physicists accepted arguments about cost-effective use. At a 1966 meeting at the Stanford Linear Accelerator, for example, Berkeley’s Luis Alvarez encouraged colleagues to think in terms of the number of interesting “events per dollar” produced by ever-more expensive Big Science machines.

By the late 1990s, queue scheduling had prevailed at places like the Very Large Telescope and the international Gemini Observatory. Coincident with this was a shift in language about the effective use of science facilities. Look at the questions posed at a meeting in the mid-1990s to discuss telescope use:

The choice of language here is striking. Astronomers are referred to as “customers” seeking a product. So, what’s the product? As Matt Mountain, currently director of the Space Telescope Science Institute, told me in an interview several years ago, “We produce high quality, corrected beams of light pointed at the right direction at good instruments and detectors and collect the data.”

Queue scheduling allows on-site observers to select observing programs that are best suited for prevailing weather conditions. Moreover, telescopes have been designed to increase the speed with which this “high quality” stream of photons can be switched from one instrument to another. (Observatories typically have several highly complex instruments clustered underneath or near the actual telescope.)
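
For readers who like to see the logic spelled out, here is a toy sketch of that selection step in Python. It is not how Gemini or the VLT actually schedule their nights; the program names, the priority ranking, and the two weather constraints (seeing and a clear sky) are all made up for illustration. The idea is simply to filter the queue down to the programs that tonight's conditions allow, then take the highest-ranked one.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Program:
    """A hypothetical observing program waiting in the queue."""
    name: str
    priority: int          # 1 = most important (illustrative ranking)
    max_seeing: float      # worst acceptable seeing, in arcseconds
    needs_clear_sky: bool  # True if thin cloud would ruin the data


@dataclass(frozen=True)
class Conditions:
    """Tonight's conditions as judged by the on-site observer."""
    seeing: float
    clear_sky: bool


def is_feasible(program: Program, now: Conditions) -> bool:
    """A program can run if the seeing and sky meet its requirements."""
    seeing_ok = now.seeing <= program.max_seeing
    sky_ok = now.clear_sky or not program.needs_clear_sky
    return seeing_ok and sky_ok


def pick_next(queue: Sequence[Program], now: Conditions) -> Optional[Program]:
    """Return the highest-priority feasible program, or None if nothing fits."""
    feasible = [p for p in queue if is_feasible(p, now)]
    return min(feasible, key=lambda p: p.priority, default=None)


queue = [
    Program("faint-galaxy imaging", priority=1, max_seeing=0.6, needs_clear_sky=True),
    Program("bright-star spectroscopy", priority=2, max_seeing=1.5, needs_clear_sky=False),
    Program("quasar monitoring", priority=3, max_seeing=1.0, needs_clear_sky=True),
]

best_on_good_night = pick_next(queue, Conditions(seeing=0.5, clear_sky=True))
best_on_poor_night = pick_next(queue, Conditions(seeing=1.2, clear_sky=False))
```

On the good night the demanding top-ranked program gets the telescope; on the poor night the queue falls through to a program that tolerates mediocre conditions, so the time is not wasted. In classical mode, by contrast, whoever happened to be scheduled that night simply took what the weather gave them.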

Queue scheduling at places like Gemini and the VLT was set up to maximize efficiency and productivity. We might think of this emphasis on flexibility, efficiency, and productivity as resembling the famous “just in time” manufacturing techniques pushed by Japanese car makers in the 1950s (and widely admired by executives in the U.S.).

It’s this shift in telescope use – where efficiency is paramount – that is reflected in the advertisement ESO placed in Nautilus.

Screen Shot 2014-10-06 at 11.15.20 AM

Did the quest for better science drive the shift toward emphasizing productivity and efficiency? Yes, but that’s only part of the story. In the United States, these concerns followed larger trends. In 1993, for example, Congress passed the Government Performance and Results Act requiring each federal agency, including the NSF, to devise yardsticks to measure performance and progress. This was not just an American trend. European astronomers did similar studies evaluating telescope productivity. As ESO’s advertisement indicates, this way of thinking is still very much alive.

The need to demonstrate greater efficiency and productivity encouraged scientists to accept models and metaphors from the business world to describe observatory management and telescope operation. Astronomy in the 1990s, like particle physics in the 1950s and 60s, became a “big business” or, at the least, a very expensive one. The next generation of giant telescopes will drive this trend forward even more. Astronomers started describing observatories as “data factories.” So, perhaps it’s not a surprise that some observatory directors and their staff started to see the researchers who came to their facilities as customers.

None of this addresses the question of what one means by “productive,” though. Is the proper metric of productivity the number of times a publication was cited? Perhaps it could be the number of scientific problems “solved”? Or prizes won by a paper published using data from a particular facility?

Screen Shot 2014-10-06 at 12.48.58 PM

Could a time come when observatories and other science facilities take a cue from the Golden Arches and simply tout the number of customers served? Let’s hope not.


  1. To be fair, I’m talking here largely about ground-based optical astronomers. Radio astronomers had long been accustomed to receiving data collected by others. And, of course, all space-based observations are done in queue mode. If you’re unclear why, watch this.