POSTER P111: Why does every observatory, survey, project, and PI have to build their own incompatible archive from scratch? And can something be done about it?

ADASS posters are displayed all week

When

10:11 p.m., Nov. 6, 2023

Theme: Science with data archives: challenges in multi-wavelength and time domain data analysis


Astronomers generally pride themselves on a tradition of open data access: many large surveys and observatories provide archives of data and catalogs, each with a web interface that often offers relatively sophisticated search functions. However, archives are expensive to build and maintain. They frequently have interfaces customized to their own datasets, are rarely co-located, and don't share consistent APIs. As a result, cross-archive searches are very limited. Your options are typically to use a data aggregator like the NED or HyperLeda databases, to run a cone search against multiple archives with an interface like MAST or astroquery and match up the results yourself, or to download some catalogs and do the matching at home. As datasets get larger, this is unsustainable. Meanwhile, smaller observatories and individual projects often don't have public-facing archives or searchable datasets. This is frequently not because people want to keep the data private, but because they don't have the time, funding, or skills to build and maintain an archive interface. The average astronomer (including me) releases data by putting a catalog file and a tarball of FITS files on a webpage. This is well intentioned but unsearchable. Although the field has invested substantial resources in community software to process data tables and query archives, there are few examples and no infrastructure to help projects create archives. Can we develop frameworks to allow archive interoperability? Can we build simple archives to make data release easier? I will discuss baby steps, possible progress, and obstacles on the way to these goals.

Contacts

Benjamin Weiner, MMTO