Skyblog was the emblematic platform for teenage communities on the French web in the 2000s. After its closure was announced in 2023, the BnF launched a project to collect 12 million blogs, or 40 terabytes of data. The Skybox project seeks to develop an epistemology of the web archive, using this collection as a field of study.

Presentation

The Skybox project seeks to develop an epistemology of the skyblog collection as a source. The 12 million blogs are being collected as part of the regular operations of the Internet legal deposit. Eventually, the collected files will be accessible through the Internet Archives application by URL search. However, it is difficult for users to know or compile a list of skyblog URLs in advance without documentary and methodological support.

To support research projects, the BnF has developed tools for accessing web archives, such as full-text indexing and guided tours. Although these tools can improve discoverability, they are not necessarily effective for skyblog archives. Full-text indexing would be a significant operation, and its technical feasibility is uncertain. Moreover, the extensive use of SMS language by skybloggers undoubtedly limits its usefulness for thematic and lexical research. The creation of guided tours requires scientific expertise in the history and sociology of the media. Furthermore, highlighting the profiles of skybloggers, who were teenagers at the time, raises ethical issues.

Faced with this challenge, the project aims to turn the skyblog archives into a field for studying data as a basis for the analysis and production of scientific knowledge.

The BnF, the project leader, and the École Nationale des Chartes - PSL, the scientific lead, have agreed to develop scientific knowledge of skyblogs by creating a collaborative sandbox-type workspace called ‘Skybox’. The aim is to create sub-corpora, visualisations, and cartographies and to report on the archive's essential characteristics. The project can draw on technical production data and data submitted by the producer.

The final objective of the project is to host research work at the BnF. Each year, a theme will be chosen in consultation with the steering committee. Several topics (gender, health, digital culture) have already been identified that might help to go beyond the existing bibliography, which is essentially based on a micro-sociological approach to the platform. Each year, a proposal for an M2 TNAH internship will be published, and the archive will be proposed as a research subject in the BnF DataLab calls for projects (AAPs). In this respect, the work on skyblogs initiates a history of the French web in the 2000s and continues the work carried out by the Web90 group on the previous decade. A collection will be dedicated to ‘web technologies of the 2000s’, with a particular focus on preserving older versions of applications (player, browser) in archives. Finally, a series of conferences will look back over the Skyblog decade and bring the project to a close.

Financing

  • Financing of one M2 TNAH internship per year for 3 years (€6,900).
  • Recruitment of a research officer for 1 year (€48,600).
  • Student contracts (research assistant) over 2 years (€3,150).
  • Overall project budget:
    • BnF four-year plan funding (recruitment, internships, development): €99,700
    • Contributions (HR partners):
      • BnF: €55,750
      • ENC: €64,000
    • Total project cost: €219,450

Research Diary

Research Diary ‘Web Corpora. Explorer les archives de l'internet à la BnF’
