Working with data - A workshop on understanding, visualising and mapping data, organised by Transparent Chennai, from 3rd to 5th August 2012

The data camp focused on training activists, researchers and students to work with data and learn about open data, data visualisation and spatial data.
This workshop, organised by Transparent Chennai at The Institute of Financial Management and Research, Chennai, grew out of the earlier open data camp events organised by Transparent Chennai in Bangalore and Hyderabad. At those events there was wide discussion among attendees who were excited by the potential of data and the open data movement, but who did not have the skills or technical background needed to work effectively with it.
It was felt that a much larger community of activists, researchers and non-profits could benefit from learning to use the kinds of tools presented at the camps. This event was therefore planned differently from a data camp: it was a training workshop in which participants would learn about open data, data visualisation, spatial data and the practical issues that come up when working with data in various forms.
The workshop thus aimed at helping the participants to:
  • Understand various formats of data, diverse possibilities of data visualisation and effective tools for doing so, with a special focus on web-based tools
  • Understand how to think through projects involving collection, processing and visualisation of data
  • Develop a basic understanding of software packages and methods for visualising quantitative data, creating geo-visualisation and undertaking participatory mapping
  • Understand the connection between data technologies and rights to access and use data

Day 1: 3rd August 2012

The first day's discussions covered a basic understanding of what data is, the significance and relevance of using data for representation, types of data, open data, and the importance of understanding spatial data, including vector and raster data, data files and formats, and databases.
Session 1
  • What is data?
  • Uses of data
  • Types of data
  • What is open data?
  • What is spatial data?
  • Types of spatial data

Session 2

  • Data formats and files
  • Databases
Session 3
Stories through maps
The third session, on what can be done with data, was very interesting. The discussion highlighted how data can be used to tell stories through visualisation. The presenters led the participants through historical and hand-made maps to show how maps with manually plotted data and drawings could successfully convey messages and were used to find causal relationships. For example:
  • The first example discussed was the hand-drawn map made by the scientist John Snow. It helped establish the relationship between poor and deprived neighbourhoods, contaminated water supply and cholera, which led to the discovery that cholera is a water-borne disease.
  • The other project discussed was the Voice of Kibera project. Kibera, the largest slum in Nairobi, Kenya, was mapped with the participation of slum dwellers who were trained to map. The Voice of Kibera project focused on aspects that mattered to people in Kibera, such as news, deaths etc. Information was thus collected, classified and put up by the people themselves.
  • The Kibera project thus became a means of putting up news etc. on the net, and this attracted a lot of people. Sadly, the slum dwellers had to move out of the slums. However, this pioneering effort was lauded as a possible tool to empower people, one that demonstrated how maps could represent people's voices and tell stories.
  • The other point highlighted was that although data can be empowering, much depends on who accesses and uses it. For example, information can be used by the powerful to reach their goals at the cost of the people who produced it. Putting up this information does not mean that people's voices are heard, or that their problems are voiced and attempts made to address them. Data is power, and it is often the powerful who get to access and use it. The issue is also about who owns the data.
  • The other examples of representations or visualisations of data that were discussed included:
    • The true size of Africa
    • Google flu map: search queries used to collect information on flu in different areas
  • Other maps shared included:
    • A map of the use of Facebook worldwide
    • A map that used semantic analysis to detect protests made by people
    • A map that used participatory reporting of power cuts through phones

Session 4

  • Working with quantitative data
  • What is quantitative data?
  • Quantitative data formats
  • Software to work with quantitative data
  • Scraping
  • Scraping ethics
This session touched upon quantitative data formats, software for quantitative data, how to work with quantitative data, scraping of data, and storage in databases. It was introductory rather than detailed, and was meant only to help people understand the concepts. It was followed by a presentation on a mapping experiment, the Delhi Digest: A sketch Book, on e-waste, which hosted a growing collection of interviews, images and data around the issue of e-waste in Delhi. A series of conversations with activists, waste-pickers, urban planners, journalists and bureaucrats explored how citizens, the state and corporations coexisted along the often divisive lines of this issue. This example of data mapping was not only interesting; it also showed that data need not always be quantitative and that visual mapping is possible with qualitative data as well.

This session also included a group exercise in which the participants were split into three groups and asked to build possible stories out of data that they considered important. Each group was asked to work out the process of making the story: the basic problem they wanted to highlight, the underlying processes or hypothesis, the data needed, the possible ways of representing the data, the perspectives that might emerge, and the actions that could be planned based on the findings.

This session was a delight, both for the range of ideas people came up with and for the way they twisted and turned the hypotheses and basic theoretical frameworks around! These were some of the ideas that people brought out:
  • Public schemes like NREGA: how can we draw out stories from data that is hard to understand
  • Mapping carbon footprint
  • Food we eat
  • Revenue generated by movies
  • Rich and poor parts of the beach, and how garbage disposal is a problem
  • Corporate sponsors in Olympics
  • Corporate malpractices
  • Map bus routes at bus stops, to analyse gaps in public transport
  • Start looking at data on groundwater and variables related to it
  • Time series map of urbanisation and groundwater
  • Bubble chart of groundwater and agricultural production
  • Heat map showing groundwater in areas where there is construction
  • Groundwater level Time series inf
  • Agricultural productivity
  • Population and electricity supply
  • Farmer suicides
  • Industry and drinking water and sewage
  • Map noise level in city and traffic
  • Accidents and season, time
  • Sickness impacting livelihoods and education

Session 5

The day closed with an introductory session on spatial data that included:

  • Working with spatial data
  • Spatial data formats

Day 2: 4th August 2012

Session 1
  • What is data visualisation?
  • Types of charts
  • Principles of data visualisation
The second day started with a presentation by two researchers from the Open Knowledge Foundation, UK. The subsequent sessions introduced the basics of visualisation and its basic forms, such as scatter plots, histograms, bar diagrams, pie charts, bubble charts, radar charts, mind maps, tree maps, flow charts etc.

This was followed by an introduction to the guiding principles of visualisation. A range of maps and visualisations were shown to indicate the different ways in which data could be visualised. The presenters emphasised the importance of having clear, specific messages, keeping the target audience in mind, and making the visualisations simple to view.

The participants were asked to look up the following later:
  • Charles Minard (look for his maps online)
  • David McCandless, The Billion Pound-O-Gram
  • Christian Nold
  • Light painting; Paul Torrens, wifi geographies
  • How's Life?, OECD Better Life Index
  • Word clouds

Session 2

  • Tools for spatial data collection
  • Tools for spatial data visualisation
The next part of the session was an introduction to data visualisation tools. The participants were asked to download the Quantum GIS software and were given explanations on how to use it. It was clear, though, that while the concepts were understandable, the software would take some time to get used to and could be used effectively only after some practice.

Day 3: 5th August 2012
Session 1
  • Introduction to GPS
In this session, the participants were introduced to the usefulness of GPS for mapping locations and to the method for taking readings with a GPS tool. The participants were divided into groups of three and, after a basic introduction to the GPS tool, were asked to go out into the city and record the locations of selected points such as garbage bins, manholes, water pipes, handpumps etc. This was a very practical and useful session for learning how to use a GPS tool in the field. After the group work, there was a demonstration of how to load this data into Quantum GIS to make a map. The sessions that followed introduced other software such as Many Eyes, Wordle and others, which would need more introduction and practice.
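The step of turning field GPS readings into something a GIS package can display can be sketched in a few lines of Python. This is only an illustrative sketch, not the method demonstrated at the workshop: the readings below are invented sample values, and the function name `readings_to_geojson` and the output file name `gps_points.geojson` are hypothetical. It writes the points as GeoJSON, a common open format for vector data.

```python
import json

# Hypothetical readings: (label, latitude, longitude) in decimal degrees.
# These values are invented for illustration; they are not workshop data.
READINGS = [
    ("garbage bin", 13.0105, 80.2350),
    ("manhole", 13.0112, 80.2361),
    ("handpump", 13.0098, 80.2347),
]


def readings_to_geojson(readings):
    """Build a GeoJSON FeatureCollection of points from GPS readings."""
    features = []
    for label, lat, lon in readings:
        # Reject readings that cannot be valid latitude/longitude values.
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            raise ValueError(f"coordinates out of range for {label!r}")
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"label": label},
        })
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    collection = readings_to_geojson(READINGS)
    with open("gps_points.geojson", "w", encoding="utf-8") as f:
        json.dump(collection, f, indent=2)
```

A file produced this way should be loadable in Quantum GIS as a vector layer, with each recorded point carrying its label as an attribute.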

Overall, the workshop was a great experience for getting a basic grasp of the concepts, importance and significance of mapping and visualising data, although using the software would need further hands-on experience, handholding and practice.
The workshop was developed and delivered by Sajjad, Sumandro and Shashank. Their profiles are given below:
  • Shashank Srinivasan is an independent ecologist and cartographer who works with NGOs across India. He is interested in applied geographic information systems, conservation policy and environmental governance, and in the intersection of these disciplines.
  • Sumandro Chattapadhyay studied economics and worked with architect-planners, social researchers and programmers on urban development, policy analysis, data visualisation and participatory mapping. Presently he is an independent researcher based in Delhi. He contributes to The Pop Up City blog and edits an upcoming blog on open geospatial data, mapping practices and location-based services in India.
  • Sajjad Anwar is a hacktivist and programmer. He works on research and design of data analytics and infographics. He loves maps and has been part of the OpenStreetMap project for over four years. Associated with free software projects like Ubuntu and Mozilla Firefox, he is involved in building and testing accessible technologies.

The presentations from the workshop can be downloaded from below: