Section outline


    • The world contains an unimaginably vast amount of digital information, and that amount is growing ever more rapidly. In principle, this makes it possible to do many things that previously could not be done: spot business trends, prevent diseases, combat crime and so on. Managed well, data can be used to unlock new sources of economic value, provide fresh insights into science and hold governments to account.

    • Changes to how we live, work and socialise

      Advances in digital technology have changed how we communicate (Facebook, Twitter, Instagram, WhatsApp, Skype, etc.), work (videoconferencing, email, Google Drive, Microsoft Teams, etc.), eat (Uber Eats, etc.), travel (Uber, Couchsurfing, Booking.com, Airbnb, etc.) and entertain ourselves at home (Netflix, streaming books/podcasts, etc.).


    • The impact of these technologies on our lives is therefore vast and, according to the 2017 Eurobarometer, rather well accepted: “75% of respondents think the most recent digital technologies have a positive impact on the economy, 67% - on their quality of life, 64% - on the society. 76% who use Internet every day say the impact of these technologies on their quality of life has been positive, compared to 38% who never use the Internet.”

      Source: European Commission, Attitudes towards the impact of digitisation and automation on daily life
    • The European Commission takes this digital context into account in its plans, notably in its data strategy

      The European data strategy aims to make the EU a leader in a data-driven society. Creating a single market for data will allow data to flow freely within the EU and across sectors for the benefit of businesses, researchers and public administrations.

      So data is everywhere, so much so that we talk about Big Data. But what do we really mean by Big Data?


    • Big data

      It is very difficult to agree on a common definition of Big Data, because the term means different things to different people. It designates a loosely bounded body of digital data stored for commercial, administrative and scientific purposes.
      Big Data is usually described by 3 main characteristics, called the 3Vs: Volume - Velocity - Variety
      Regardless of the source of the digital data (books, social media, databases, audio, video), big data exhibits high volume, high velocity (the speed of data in and out) and high variety (the range of data types and sources).


    • This new type of data enriches research prospects and has the potential to advance research in the humanities and social sciences in the following ways:

      • Advanced big data collection tools, such as web scraping, and innovative analytic techniques, such as machine learning, may help establish new research methodologies; 
      • New types of data may reveal new patterns and insights into human society, politics, and economics; 
      • New types of data may lead to new kinds of research questions that are beyond the perspectives of established theories.

    • A few terms related to big data

      Set of concepts and technologies (see below) that exhibit intelligent behaviour based on algorithms (sets of rules to be followed in solving a specific problem).

      Automated analytical systems that learn over time as they acquire more data.

      Algorithms that use neural networks to learn from unstructured data (images, audio, videos, posts on social networks...).

      Self-learning systems that use sets of complex algorithms to mimic processes occurring in the human brain.


    • Open data. Opening of administrative and political data


      Open data are freely accessible micro-data that can be used and reused freely by everyone. The term open data first appeared in 1995 in a document from an American scientific agency; it referred to the dissemination of geophysical and environmental data, but the idea that the empirical basis on which knowledge is built is a public good that should be available to all is much older.


      Image: “Don’t lock it away, do something useful with it.” Credit: notbrucelee, CC BY-SA

    • “the availability of open data creates opportunities for all kinds of organisations, government agencies and not-for-profits to come up with new ways of addressing society’s problems. These include predictive healthcare, and planning and improving London’s public transport system”

      Source: The Conversation, The future will be built on open data – here’s why
    • Example: Open data received a big boost in the late 2000s. In 2008 the OECD invited member country governments to open their data, and in 2009 the United States government launched data.gov, a website designed to provide full access to databases and time series held by the states of the Union and by federal agencies.

      The Open Government Partnership, launched in 2011, is an initiative for openness, transparency and civic participation. It involves 65 governments that have committed to implementing an action plan in five thematic areas: participation, transparency, integrity, accountability and technological innovation.

      The main difference from the past is this: previously, some public bodies made macro-data (that is, aggregated data) available through publications, online documents, DVDs, etc. Open data, by contrast, are micro-data that can be downloaded from the internet free of charge, already in matrix format (generally .csv or .xml) and immediately usable for secondary analysis.
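
      To make the distinction between micro-data and macro-data concrete, here is a minimal sketch of a secondary analysis. It assumes a small, invented .csv extract (the column names `region`, `age` and `employed` are hypothetical and not taken from any real portal) and aggregates the individual rows into one figure per region, which is the kind of macro-data that used to be published on its own.

```python
import csv
import io
from collections import defaultdict

# Hypothetical micro-data: one row per individual record (values invented for illustration).
MICRO_CSV = """region,age,employed
North,34,1
North,51,0
South,28,1
South,45,1
South,62,0
"""

def employment_rate_by_region(csv_text):
    """Aggregate individual-level rows (micro-data) into one rate per region (macro-data)."""
    totals = defaultdict(int)
    employed = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["region"]] += 1
        employed[row["region"]] += int(row["employed"])
    return {region: employed[region] / totals[region] for region in totals}

rates = employment_rate_by_region(MICRO_CSV)
for region, rate in sorted(rates.items()):
    print(f"{region}: {rate:.2f}")
```

      With the micro-data in hand, the same rows could just as easily be grouped by age band or any other variable, which is not possible when only the aggregated table is released.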



      These are generally data that have great relevance for the planning, monitoring and evaluation of public policies, and which are made open to all with a dual cognitive and regulatory objective. They provide technicians and experts with knowledge bases to redirect and improve policies and also allow citizens to find out whether the policies implemented have had the announced effects or not.

      Open data are also a consequence of the importance that transparency and accountability (the obligation for a subject to account for their decisions and to be responsible for the results achieved) are gaining nowadays.

      Although open data have created new opportunities for secondary research, there are limits to their use. The first is one of topic coverage: in principle open data can deal with any topic, but to date fully public data are almost exclusively economic, geographic or transport-related. The second limitation concerns the way in which data are opened. The matrices should be accompanied by additional information on the methodological choices made to produce the data and by indications of the various aspects of their quality. This information is often missing, which makes it difficult to analyze the data effectively.
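
      When quality indications are not supplied, a researcher can at least run a basic check before analysis. The sketch below is illustrative only: the sample extract and its column names (`station`, `year`, `passengers`) are invented, and the check covers just one aspect of quality, namely the share of empty values per column.

```python
import csv
import io

# Hypothetical open-data extract with gaps (values invented for illustration).
SAMPLE_CSV = """station,year,passengers
A,2019,1200
B,2019,
C,,950
"""

def missing_value_report(csv_text):
    """Return, for each column, the share of rows whose value is empty."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    report = {}
    for column in rows[0].keys():
        empty = sum(1 for row in rows if not (row[column] or "").strip())
        report[column] = empty / len(rows)
    return report

report = missing_value_report(SAMPLE_CSV)
for column, share in report.items():
    print(f"{column}: {share:.0%} missing")
```

      A check like this does not replace proper methodological documentation, but it shows quickly which variables need caution.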
    • Why should data be open?

      1. Transparency: In a well-functioning democratic society citizens need to know what their government is doing. To do that, they must be able to freely access government data and information and share that information with other citizens. Transparency isn’t just about access, it is also about sharing and reuse: material often has to be analyzed and visualized before it can be understood, and this requires the material to be open so that it can be freely used and reused.
      2. Releasing social and commercial value: In the digital age, data is a key resource for social and commercial activities. Everything from finding your local post office to building a search engine requires access to data, much of which is created or held by government. By opening up data, government can help drive the creation of innovative businesses and services that deliver social and commercial value.

      3. Participation and engagement: This means participatory governance or, for businesses and organizations, engaging with users and audiences. Much of the time citizens are only able to engage with their own governance sporadically, maybe just at an election every 4 or 5 years. By opening up data, governments enable citizens to be much more directly informed and involved in decision-making. This is more than transparency: it’s about making a full “read/write” society, one in which citizens not only know what is happening in the process of governance but are also able to contribute to it.