UK Web Archive blog

Information from the team at the UK Web Archive, the Library's premier resource of archived UK websites


14 June 2012

Crowdsourcing and Web Archiving

There has been a long history of members of the public acting as volunteers to refine, enhance and improve the collections of cultural heritage institutions for the benefit of others. Crowdsourcing can be seen as a continuation of this tradition.

The term ‘crowdsourcing’ can be problematic, as it is not necessarily about massive numbers of people or about outsourcing labour, but rather about inviting participation from interested and engaged members of the public.

A workshop at the IIPC General Assembly in Washington DC in May 2012 addressed issues around applying crowdsourcing to web archiving. A paper by Trevor Owens, entitled The Crowd & the Library – the Agony and the Ecstasy of “Crowdsourcing our Cultural Heritage”, was used as a framework for the workshop, and participants evaluated a number of use case scenarios on the day.

A number of key observations emerged from the overall discussion. Engaging ‘the crowd’ in web archiving has advantages and disadvantages, both for the institutions carrying out the initiatives and for the members of the public involved.

There may be sensitivity around areas where there is already professional expertise within the organisation (e.g. cataloguing). It is important to design the project in such a way that the crowd and the expert each do what they are best at. Advanced users and regular users should be given different tasks, fully utilising the wisdom of the crowd.

Humans are capable of processing information and making judgements in ways that computers cannot. It is a waste of time to ask the public to do tasks that a computer can do.

Putting the right tools in place will magnify the user’s effort by making it easier to accomplish tasks. Trade-offs often emerge between richer functionality on a crowdsourcing website and barriers to participation by users. Requiring users to log in, for example, has the advantage of storing information to enable personalised services, but being able to start immediately without logging in is appealing.

It was pointed out that people feel motivated by doing something that matters to them and get a sense of belonging to something bigger than themselves. Crowdsourcing should be engaging, especially when users are asked to carry out repetitive tasks. It is important to provide feedback to users on how they are doing and how their contribution is furthering the overall progress of the project. This helps to keep users engaged.

Key challenges include devising an appropriate project and attracting an audience sizable enough to participate in the work. It was felt that crowdsourcing within web archiving would suit smaller, discrete projects rather than ongoing, open-ended challenges. Suggested areas for crowd involvement include elements of the web archiving workflow such as identifying websites, quality assurance and cataloguing.


