
Arquivo.pt preserves online data from European H2020 projects

Arquivo.pt, a service managed by FCCN, FCT's Scientific Computing Unit, recently preserved 197 million web files documenting research and development projects funded by the European Horizon 2020 program. This safeguards around 17 terabytes of information that might otherwise have been lost forever.

After identifying and preserving the websites of research and development projects funded by the European Union under the FP4, FP5, FP6 and FP7 programs (1994 to 2013), Arquivo.pt has now saved valuable online information from projects funded under the Horizon 2020 program (2014 to 2021) that was at risk of disappearing.

In recent years, research projects have increasingly used websites to document their activities. These websites provide relevant scientific information that complements the published literature, such as open datasets, presentations given at events, and the software developed. Once the projects ended, this information was in danger of being irretrievably lost.

Identifying the research projects involved several methodologies and the use of the European Union's open data portal. However, this portal does not provide all the information, and many project records omit the website address. It was therefore necessary to use tools developed by Arquivo.pt to fill in the missing information. For example, the website of the Extended Model of Organic Semiconductors (EXTMOS) project, formerly available at extmos.eu, is no longer active, but its content remains fully accessible via Arquivo.pt.

Arquivo.pt will provide more information about this work and continues to invite all users to suggest sites that could be preserved.

Arquivo.pt is a public service, free of charge and open to all web users. Millions of pages are published on the web every day, but 80% of this information disappears or becomes inaccessible within one year of publication. Arquivo.pt aims to counteract this trend by making it possible to search for and retrieve information from old websites.