Open Science Framework for Data and Project Management

Alastair Dunning

One of the most common requests from new research projects at TU Delft is for a tool that can manage all kinds of research data during a project and also deal with the other material a project creates along the way – for example, steering group minutes, presentations and interview permissions.

[Image: Open Science Framework]

Often, projects end up using a mixture of tools (Basecamp, Google Drive, GitHub, SharePoint), each with different advantages and disadvantages.

In this light, I’ve had an introductory look at the science-focussed Open Science Framework (OSF), which provides tools to support the entire research workflow. Some of the advantages are listed below.

•    Very quick start-up time – it’s possible to get a project up and running in a couple of minutes

•    Possible to upload all kinds of data and files and categorise them – for example as ‘methods’, ‘hypotheses’ and ‘communication’

•    Ability to store versions of data – revisions of each file are retained

•    Different files can have different levels of permission. OSF introduces the concept of components to help organise files and data in different ways. Each component can have its own level of access (e.g. admin, read/write, read only). This is very useful for projects involving multiple institutions and data requiring protection.

•    Ability to create public versions of parts of projects, with citations. For fully-fledged projects that wish to share data and ensure appropriate attribution, this could be a strong draw.

Using OSF also raises other questions:

•    How efficiently does OSF deal with big data sets? Individual files can be no more than 5GB. For larger files, linking to add-ons such as Dropbox is possible, but it would be interesting to see whether OSF retains its speed when accessing multiple large data sets.

•    How does it work with third-party tools? Integration with common cloud apps such as Google Drive is already included. But for some research projects, it is the ability to connect to specialist code, tools and instruments that could make OSF much more useful. Such integration is challenging. For example, how could a sensor recording meteorological data on a daily basis automatically transfer data to OSF? Or how could OSF expose data from traffic logs to allow visual analysis of the movement of cars, buses and lorries in a city? OSF has made its API public with such goals in mind, but taking advantage of it requires developer time (a minimal sketch of an automated upload follows this list).

•    If data is being made public and given a DOI for use in citations, OSF will need to work hard to ensure long-term sustainability and the trustworthiness of the data. It will still be useful for research projects to deposit their final published data in a repository that accords with the Data Seal of Approval, for long-term curation.
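To make the integration point a little more concrete, here is a minimal sketch of how a daily sensor export might be pushed to a project’s storage through the public API. It is only an illustration: the project id, token variable and file name are hypothetical, and the endpoint and parameters follow OSF’s public file-storage API as I understand it, so they should be checked against the current documentation at developer.osf.io rather than taken as definitive.

    # Minimal sketch: push a daily sensor export to OSF storage.
    # The node id, token variable and file name below are hypothetical.
    import os
    from datetime import date

    import requests

    OSF_TOKEN = os.environ["OSF_TOKEN"]          # personal access token created in the OSF settings
    NODE_ID = "abc12"                            # hypothetical OSF project or component id
    LOCAL_FILE = f"readings-{date.today()}.csv"  # e.g. the day's sensor output

    # Upload endpoint for the node's default 'osfstorage' provider
    url = f"https://files.osf.io/v1/resources/{NODE_ID}/providers/osfstorage/"

    with open(LOCAL_FILE, "rb") as fh:
        response = requests.put(
            url,
            params={"kind": "file", "name": os.path.basename(LOCAL_FILE)},
            headers={"Authorization": f"Bearer {OSF_TOKEN}"},
            data=fh,
        )

    response.raise_for_status()
    print(f"Uploaded {LOCAL_FILE} to OSF node {NODE_ID}")

A script along these lines could run as a scheduled job on the machine that collects the data – which is exactly the kind of developer effort referred to above.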

Prepare to Share – Data Stewardship at TU Delft

by Alastair Dunning

Coming fast on the heels of the Open Access movement for scholarly articles, the research data movement aims to liberate the data that provides the evidence for scholarly debate and argument.

Take one of the projects currently being carried out at the Faculty of Architecture. Led by Assistant Professor Stefan van der Spek, the Rhythm of the Campus project is creating a huge dataset of wifi usage by staff and students across the entire TU Delft campus.

Such a rich dataset can tell researchers, students and indeed other interested parties a great deal about how teachers, undergraduates, postgraduates and support services all make use of the university wifi.

A whole spectrum of questions can be asked on how individuals interact in different types of groups, and how attitudes and behaviour change in specific places and at specific times.

From the point of view of the Research Data Services team in the Library, there is great interest in what happens to the collected data. Such a catalogue of digital behaviour will be of interest not just to Stefan’s colleagues and students but to many potential re-users around the world.

But before research data like this can be re-used, many issues need to be addressed. How is it documented? How is it anonymised? How is it archived? How is it cited? These are all questions for Data Stewardship (the sketch below gives a flavour of just one of them, anonymisation).
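As an illustration of what the anonymisation question can involve in practice, the sketch below shows one common approach: replacing device identifiers with keyed hashes and coarsening timestamps. The record format, field names and key handling are hypothetical and are not taken from the Rhythm of the Campus project itself.

    # Illustrative pseudonymisation of wifi log records; the record format
    # and secret key handling are hypothetical, not the project's method.
    import hashlib
    import hmac
    from datetime import datetime

    SECRET_KEY = b"keep-this-key-separate-from-the-dataset"  # hypothetical secret

    def pseudonymise_mac(mac: str) -> str:
        # A keyed hash lets records about the same device be linked
        # without exposing the device's actual MAC address.
        return hmac.new(SECRET_KEY, mac.encode(), hashlib.sha256).hexdigest()[:16]

    def coarsen_time(timestamp: str) -> str:
        # Rounding timestamps to the hour reduces the risk of re-identification.
        return datetime.fromisoformat(timestamp).strftime("%Y-%m-%d %H:00")

    record = ("3c:22:fb:aa:bb:cc", "BK-Hall-AP-07", "2017-06-12T10:42:31")
    print((pseudonymise_mac(record[0]), record[1], coarsen_time(record[2])))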

At TU Delft Library, the Research Data Services team has just kicked off the Data Stewardship project. It aims to create mature working practices and policies for research data management across each of the faculties at TU Delft, so that any project can make sure their data is managed well.

Four key values underpin this work:

  1. The safe storage and protection of intellectual capital developed by scientists
  2. Best practice in ensuring scientific arguments are replicable in the long term
  3. Better exposure of work of scientists and improved citation rates
  4. Improved practices for meeting the demands of funders, publishers and others in respect to research data

To implement these values, work has begun on a draft policy framework. This is being discussed over the summer at faculty level, and the faculties’ input will steer and refine the policies and practices throughout the university (for example, on the need for training in data management for PhD candidates). We will continue to report on developments on this blog as the project continues.