Skyblog was the emblematic platform for teenage communities on the French web in the 2000s. Following the decision to close the platform in 2023, the BnF launched a project to collect 12 million blogs, amounting to 40 terabytes of data. The Skybox project seeks to develop an epistemology of the web archive, using this collection as its field of study.

Presentation

The Skybox project seeks to develop an epistemology of the skyblog collection as source documents. The 12 million blogs are being collected as part of the regular operation of the legal deposit of the internet. Eventually, the collected files should be accessible through the Archives de l'internet application, using a search by URL. However, without documentary and methodological support, users can hardly be expected to know, or compile in advance, a list of skyblog URLs.

To support research projects, the BnF offers tools for accessing web archives, such as full-text indexing and walkthroughs. Although these tools improve discoverability, they are not always effective for the skyblog archives. Full-text indexing of the collection would be a major operation of uncertain technical feasibility. Moreover, skybloggers' extensive use of SMS language makes a full-text index poorly suited to thematic and lexical research. Creating walkthroughs requires scientific expertise in the history and sociology of media. Furthermore, showcasing the profiles of skybloggers, who were teenagers at the time, could raise ethical issues.

Faced with this challenge, the project aims to make the skyblog archives a field for studying data as a basis for analysing and producing scientific knowledge.

The BnF, the project leader, and the École nationale des chartes - PSL, the scientific lead, have agreed to develop scientific knowledge of skyblogs by creating a collaborative, sandbox-type workspace called ‘Skybox’. The aim is to create sub-corpora, visualisations and maps, and to report on the archive's essential characteristics. The project can draw on technical production data as well as data supplied by the producer.

The project’s ultimate goal is to host research work at the BnF. Each year, a theme will be chosen in consultation with the steering committee. Several topics (gender, health, digital culture) have already been identified that might help broaden the existing bibliography, which is essentially based on a micro-sociological approach to the platform. A proposal for an M2 TNAH (second-year Master's in Technologies numériques appliquées à l'histoire) internship will be published each year, and the archive will be put forward as a research topic in the BnF DataLab calls for projects. In this respect, the work on skyblogs marks the beginning of a history of the French web in the 2000s and continues the work carried out by the Web90 group on the preceding decade. One collection will be dedicated to ‘web technologies of the 2000s’, with a particular focus on preserving older versions of applications (player, browser) in the archives. Finally, a series of conferences will retrace the Skyblog decade and bring the project to a close.

Funding

  • Financing of one M2 TNAH internship per year for 3 years (€6,900).
  • Recruitment of a research officer for 1 year (€48,600).
  • Student contracts (research assistant) over 2 years (€3,150).
  • Overall project budget:
    • BnF four-year plan funding (recruitment, internships, development): €99,700
    • Contributions (HR partners):
      • BnF: €55,750
      • ENC: €64,000
    • Total project cost: €219,450
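
The total project cost is the sum of the BnF four-year plan funding and the two partner contributions. As a minimal sketch, the arithmetic can be checked in a few lines of Python; the figures are copied from the list above, while the variable names and labels are chosen here purely for illustration.

```python
# Amounts in euros, copied from the funding list above.
# The dictionary keys are illustrative labels, not official budget headings.
budget_lines = {
    "BnF four-year plan funding": 99_700,
    "BnF contribution (HR)": 55_750,
    "ENC contribution (HR)": 64_000,
}

# Total project cost is the sum of the three lines.
total = sum(budget_lines.values())

# Share of each line in the total, as a percentage rounded to one decimal.
shares = {
    label: round(100 * amount / total, 1)
    for label, amount in budget_lines.items()
}

print(f"Total project cost: €{total:,}")
for label, share in shares.items():
    print(f"{label}: {share}%")
```

The computed total matches the stated figure of €219,450.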

Research Diary

Research Diary ‘Web Corpora. Explorer les archives de l'internet à la BnF’
 

