"This will modernise how the content is captured and stored, and provide the Welsh Government with a reliable, comprehensive and intuitive search service of Wales’ digital history."
Manchester tech firm MirrorWeb has been awarded a three-year contract by the Welsh Government to digitally archive the Welsh nation’s online presence. The archive will be fully accessible to the Welsh Government’s Information and Archive Services team.
The cloud-native web and social media archiving company will preserve the Welsh Government’s web-published content in both English and Welsh, including llyw.cymru/gov.wales and all websites of historic and national significance.
MirrorWeb has developed robust and highly scalable cloud-based archiving and monitoring tools that enable frequent archiving of web and social media assets for private sector businesses and public sector bodies. The tools allow billions of documents to be indexed at unprecedented speed, making archives fully usable and searchable.
The decision by the Welsh Government to archive the dual-language websites and Twitter accounts will digitally preserve the Internet heritage of the Welsh nation. It will mean that the preserved digital history is fully indexed and searchable for generations to come.
National archives around the world have been collecting data for decades, and are only now beginning to realise that the archives of the future will be born out of web and social media content. Safely capturing and storing this information is the only way to prevent it being lost completely, and modern Big Data tools and the emergence of cloud computing now enable archives to index the data and derive a real return on investment from it.
The project is in its early stages. MirrorWeb has received the Welsh Government’s existing historical archive – around 1.8 million pages, amounting to 20TB of data – which has now been transferred to the cloud seamlessly and cost-efficiently. The pages were harvested over the last three years, but MirrorWeb’s proprietary platform captured and indexed them in a matter of hours.
MirrorWeb will now crawl the sites and social media channels, harvesting and preserving up-to-date data and publishing it using its state-of-the-art technology, providing a comprehensive archive for future generations to access and use.
Philip Clegg, Chief Technical Officer at MirrorWeb, said: “We’re proud to be preserving a future-proof digital record of the Welsh Government’s online activity. Our website and social media archiving and monitoring platform is built on the cloud and provides the essential infrastructure and capacity to meet the size and complexity of the Welsh Government’s archive. This will modernise how the content is captured and stored, and provide the Welsh Government with a reliable, comprehensive and intuitive search service of Wales’ digital history.”
The announcement follows the recent contract award for indexing and archiving the UK Central Government’s online presence to the cloud. The UK’s National Archives engaged MirrorWeb to capture and transfer a 120TB web archive encompassing billions of web pages in less than two weeks.