The Scientific Data and Computing Center (SDCC) at Brookhaven National Laboratory (BNL), a US Department of Energy (DOE) facility, has announced that it now holds more than 300 PB of data. According to Datacenter Dynamics, it is the third-largest scientific data repository and the largest tape archive in the United States.
The stored data comes from experiments in nuclear and particle physics. In a Brookhaven Laboratory press release, SDCC officials state that the archive holds six times more data than the entire written record of humankind, going back to Sanskrit sources (put at 50 PB). In particular, the repository contains data from experiments at the US Department of Energy’s Relativistic Heavy Ion Collider (RHIC), which has been operating at Brookhaven Laboratory since 2000, and data from the ATLAS experiment at the Large Hadron Collider (LHC).
All of the data is available online on request and is stored in a high-tech robotic tape library. The laboratory has developed its own software and website for monitoring data transfers, and the center also collaborates with other Department of Energy laboratories and IBM on developing a data management system, the High Performance Storage System (HPSS). HPSS allows different storage systems, from tape to disk, to be used effectively in various combinations. Software has also been developed to give physicists access to SDCC data.
SDCC officials say the lab benefits from hybrid storage: data is kept primarily on tape and moved to disk only when needed, which reduces operating costs and makes storage more environmentally friendly. Disks need constant power and cooling, whereas tapes simply sit in the library when they are not in use. The tape libraries themselves are housed in dedicated rooms with optimized energy consumption and cooling.
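The tape-first, disk-on-demand idea described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the class `HybridStore`, its methods, and the least-recently-used eviction policy are hypothetical and are not taken from HPSS or from SDCC's software.

```python
from collections import OrderedDict


class HybridStore:
    """Toy model (hypothetical): tape holds every dataset, disk is a small staging cache."""

    def __init__(self, disk_slots):
        self.tape = {}             # archival copy of every dataset
        self.disk = OrderedDict()  # staged copies, least recently used first
        self.disk_slots = disk_slots
        self.tape_recalls = 0      # how often data had to be fetched from tape

    def archive(self, name, payload):
        """Write a dataset to tape only; nothing is staged to disk yet."""
        self.tape[name] = payload

    def read(self, name):
        """Serve from disk if staged, otherwise recall from tape and stage it."""
        if name in self.disk:
            self.disk.move_to_end(name)  # mark as most recently used
            return self.disk[name]
        payload = self.tape[name]        # raises KeyError if never archived
        self.tape_recalls += 1
        self.disk[name] = payload
        if len(self.disk) > self.disk_slots:
            self.disk.popitem(last=False)  # evict the least recently used copy
        return payload


store = HybridStore(disk_slots=2)
for run in ("run_a", "run_b", "run_c"):
    store.archive(run, f"data for {run}")

store.read("run_a")  # recalled from tape
store.read("run_a")  # served from disk
store.read("run_b")  # recalled from tape
store.read("run_c")  # recalled from tape, evicts run_a from disk
print(store.tape_recalls, list(store.disk))
```

In this model the disk tier stays small because only recently requested datasets occupy it, while the tape copy is never removed.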
There is also spare capacity for the growing “cache” of data collected by the laboratories. As part of the sPHENIX experiment, scientists intend to collect about 565 PB of data, which will be written to both disk and tape simultaneously. Next year, RHIC will be replaced by the Electron-Ion Collider (EIC), which is expected to generate 220 PB per year.
The capacity of tape cartridges usually doubles every 4-5 years, and the media itself is becoming more compact. By periodically migrating data from old media to new, specialists free up a lot of space in the library. The potential capacity of the Laboratory’s libraries is currently about 1.5 EB, but scientists hope to increase it to 3 EB over time. Demand for tape cartridges, meanwhile, keeps growing: in 2023, shipments reached a record 153 EB. That said, the growth has not benefited every vendor.
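The figures above imply a simple back-of-the-envelope estimate. The sketch below assumes, purely for illustration, that the number of cartridge slots in the library stays fixed and that all data is migrated to each new media generation, so library capacity grows only through cartridge density; the function names are hypothetical.

```python
import math


def doublings_needed(current_eb, target_eb):
    """Number of capacity doublings required to grow from current_eb to target_eb."""
    return math.log2(target_eb / current_eb)


def years_to_reach(current_eb, target_eb, years_per_doubling):
    """Years needed if capacity doubles once every years_per_doubling years."""
    return doublings_needed(current_eb, target_eb) * years_per_doubling


# Figures from the article: 1.5 EB now, 3 EB hoped for, doubling every 4-5 years.
low = years_to_reach(1.5, 3.0, 4)
high = years_to_reach(1.5, 3.0, 5)
print(f"{low:.0f} to {high:.0f} years")  # prints "4 to 5 years"
```

Under these assumptions, going from 1.5 EB to 3 EB is exactly one doubling, that is, one generation of denser media.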