GIS data acquisition

Geographic Information Systems

A Geographic Information System (GIS) is an integrated set of hardware and software tools designed to capture, store, manipulate, analyse, manage, and digitally present spatial (or geographic) data and related attribute information. A GIS can relate information from different sources using two key index variables: space (or location) and time. Common GIS data types (models) include:

Spatial Data: Describe the absolute and relative location of geographic features.

  • Vectors

    • Arcs (Polylines): Line segments forming individual linear features
    • Polygons: Areas enclosed by arcs
    • Points: Single coordinate pairs
  • Rasters

    • Grid-Cells: single column/row positions
    • Cell size: the resolution (level of spatial detail) of the data

Attribute data: Describe characteristics of the spatial features. These characteristics can be quantitative and/or qualitative in nature. Attribute data is often referred to as tabular data.

The selection of a particular data model, vector or raster, is dependent on the source and type of data, as well as the intended use of the data. Certain analytical procedures require raster data while others are better suited to vector data.
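To make the two data models concrete, the following minimal Python sketch represents vector features as coordinates paired with attribute data, and a raster as a grid of cells with a fixed cell size. All names (`features`, `raster`, `cell_center`) and all coordinate and attribute values are illustrative assumptions; they do not come from any GIS library or real dataset.

```python
# Vector model: each feature is a geometry plus a row of attribute (tabular) data.
features = [
    {"geometry": {"type": "Point", "coordinates": (36.82, -1.29)},
     "attributes": {"name": "Substation A", "voltage_kv": 132}},
    {"geometry": {"type": "Polyline",
                  "coordinates": [(36.80, -1.30), (36.82, -1.29)]},
     "attributes": {"name": "Feeder 1", "status": "existing"}},
    {"geometry": {"type": "Polygon",
                  # A polygon ring is closed: the last vertex repeats the first.
                  "coordinates": [(36.7, -1.4), (36.9, -1.4), (36.9, -1.2),
                                  (36.7, -1.2), (36.7, -1.4)]},
     "attributes": {"name": "Study area"}},
]

# Raster model: a grid of cell values addressed by row/column and
# georeferenced by its upper-left corner and its cell size.
raster = {
    "origin_x": 36.7,   # x coordinate of the upper-left corner
    "origin_y": -1.2,   # y coordinate of the upper-left corner
    "cell_size": 0.1,   # resolution, in the same units as the coordinates
    "values": [
        [12, 40],
        [0, 7],
    ],
}


def cell_center(raster, row, col):
    """Return the (x, y) coordinate of the centre of the cell at (row, col)."""
    x = raster["origin_x"] + (col + 0.5) * raster["cell_size"]
    y = raster["origin_y"] - (row + 0.5) * raster["cell_size"]
    return (x, y)
```

The sketch shows why the choice of model matters: vector features carry exact coordinates and arbitrary attributes, whereas a raster stores one value per cell and its positional detail is bounded by the cell size.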

GIS data sources

Every day, governments, the private sector and development aid organizations collect data to inform, prepare and implement policies and investments. Yet, while elaborate reports are made public, the data underpinning the analysis often remain locked in a computer, out of reach. Because of this, the tremendous value they could bring to public and private actors in data-poor environments is too often lost. An open data platform, launched recently by The World Bank Group and several partners, is trying to change this energy data paucity. It has been developed as a public good available to governments, development organizations, non-governmental organizations, academia, civil society and individuals to share data and analytics that can help achieve universal access to modern energy services. The database covers a variety of open geospatial datasets of varying context and granularity. KTH Division of Energy Systems (KTH-dES), formerly known as KTH division of Energy Systems Analysis (KTH-dESA), contributes on a continuous basis by providing relevant datasets for electrification planning.


Indicative open libraries of GIS data

Over the past few years, KTH dES has been actively involved in the field of geospatial analysis. The following table presents a list of libraries and directories that provide access to open GIS data.

The Humanitarian Data Exchange | Different types

Country specific databases

With geospatial analysis gaining momentum in many research areas, many countries have set up their own geo-databases in an effort to facilitate interdisciplinary research activities in a geospatial context. Here are a few examples:

Country Source
East Timor

GIS data in OnSSET

OnSSET is a GIS-based tool and therefore requires data in a geographical format. In the context of the power sector, necessary data includes those on current and planned infrastructure (electric grid networks, road networks, power plants, industry, public facilities), population characteristics (distribution, location), economic and industrial activity, and local renewable energy flows. The table below lists all layers required for an OnSSET analysis.

# | Dataset | Type | Description
1 | Population density & distribution | Raster | Spatial identification and quantification of the current (base year) population. This dataset forms the basis of the OnSSET analysis, as it is directly connected with the electricity demand and the assignment of energy access goals.
2 | Administrative boundaries | Polygon | Delineates the boundaries of the analysis.
3 | Existing grid network | Line shapefile | Used to identify and spatially calibrate the currently electrified/non-electrified population.
4 | Power substations | Point shapefile | Current substation infrastructure, used to identify and spatially calibrate the currently electrified/non-electrified population. Also used to specify grid extension suitability.
5 | Roads | Line shapefile | Current road infrastructure, used to identify and spatially calibrate the currently electrified/non-electrified population. Also used to specify grid extension suitability.
6 | Planned grid network | Point shapefile | Represents the future plans for the extension of the national electric grid. It also includes extensions to current/future substations, power plants, mines and quarries.
7 | Nighttime lights | Raster | Used to identify and spatially calibrate the currently electrified/non-electrified population.
8 | GHI | Raster | Provides information about the Global Horizontal Irradiation (kWh/m2/year) over an area. This is later used to identify the availability/suitability of photovoltaic systems.
9 | Wind speed | Raster | Provides information about the wind velocity (m/sec) over an area. This is later used to identify the availability/suitability of wind power (using capacity factors).
10 | Hydro power potential | Point shapefile | Points showing mini/small hydropower potential. The dataset was developed by KTH dESA, includes environmental, social and topological restrictions, and provides the power availability at each identified point. Other sources can be used but should provide the same information to ensure that the model functions properly.
11 | Travel time | Raster | Maps the travel time required to reach, from any individual cell, the closest town with a population of more than 50,000 people.
12 | Elevation map | Raster | Filled DEM maps are used in a number of processes in the analysis (energy potentials, restriction zones, grid extension suitability map etc.).
13 | Slope | Raster | A by-product of the DEM, used in forming restriction zones and to specify grid extension suitability.
14 | Land cover | Raster | Land cover maps are used in a number of processes in the analysis (energy potentials, restriction zones, grid extension suitability map etc.).
15 | Service transformers | Point shapefile | Current transformer infrastructure, used to identify and spatially calibrate the currently electrified/non-electrified population. Also used to specify grid extension suitability.


  • Before a model can be built, one must acquire the layers of data outlined above.

More often than not, each layer must be acquired on its own. The final outcome is a multilayer map conveying all the information necessary to initiate an OnSSET electrification analysis.

  • The spatial resolution of the final map depends on the availability of input data and on the targeted level of accuracy.

OnSSET can handle various levels of input data, with typical resolutions ranging from 1x1 kilometers (km) to 10x10 km. The selection of inputs usually involves a trade-off between the time needed for computation and the desired level of detail. The modeler has to decide which resolution best fits the purpose of the analysis.
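The trade-off between resolution and computation time can be illustrated with simple arithmetic: halving the cell size roughly quadruples the number of cells to process. The sketch below is a minimal illustration only; the function name `cell_count` and the 800 km x 600 km extent are hypothetical and are not taken from OnSSET.

```python
import math


def cell_count(width_km, height_km, cell_km):
    """Number of square cells of side cell_km needed to cover a rectangular extent."""
    cols = math.ceil(width_km / cell_km)
    rows = math.ceil(height_km / cell_km)
    return rows * cols


# Hypothetical study area of 800 km x 600 km.
for cell_km in (1, 5, 10):
    print(f"{cell_km} km cells: {cell_count(800, 600, cell_km)}")
# 1 km cells: 480000
# 5 km cells: 19200
# 10 km cells: 4800
```

Moving from 10x10 km to 1x1 km cells multiplies the number of cells by 100, which gives a rough sense of how strongly the choice of resolution affects run time and storage.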

GIS basic datasets

Administrative boundaries

Coverage | Type | Resolution | Year | Source
World | shapefile | Counties, provinces, departments, bibhag, bundeslander, daerah istimewa, fivondronana, krong, landsvæðun, opština, sous-préfectures, counties & thana | 2011 | GADM
World (& per country) | shapefile | Countries | 2011 | DIVA-GIS
Europe | geodatabase/shapefile | Countries, provinces | 2013 | Eurostat

Population data

Transmission lines data

Coverage | Type | Resolution | Year | Source
UK | shapefile | Power transmission lines, underground cables, stations etc. | na | National Grid
US | raster | 100 m grid cells | 2015 | ArcGIS online
World | OSM | potential points or polylines | 2015 | OSM of various mirrors
World | From Vmap level 0 | Power lines and utilities | na | Can be downloaded from:

Power plants location data


Travel time to major cities

Mining and Quarrying

Coverage | Type | Resolution | Year | Source
USA | Shapefile, csv, KML, KMZ | Active mines and mineral plants in the US | 2003 | USGS
World | Shapefile, dBase, HTML, Tab text, csv, Google earth | points | 2012-2013 |


Coverage | Type | Resolution | Year | Source
World | ESRI ASCII GRID, GeoTIFF | 9 arc sec | 2017 | SolarGIS


Coverage | Type | Resolution | Year | Source
World | GeoTIFF | 250 m | 2018 | Technical University of Denmark

Land cover

Coverage | Type | Resolution | Year | Source
World | Bioenergy potential | 1 km | na | IRENA
World | CCI Land cover - raster | 300 m | time series from 1992 to 2015 | ESA
World | GeoTiff, Google earth, jpeg, png | 1-0.1 degrees | 2001-2010 | NASA-NEO
World | HDF-EOS | 0.5 degrees | 2001-2012 | NASA-MODIS
World | Raster, csv | 0.0028 - 0.0083 degrees | 2000, 2005, 2010 | ESA-ENVISAT
World/Protected areas | Shapefile, KML, csv | na | 2014 | Protected planet
World | various | various | 2015 | Global Land Cover Facility
World | Rasters for: Coastal areas, Cultivated areas, Forests, Mountains, Islands, Inland waters etc. | 0.00833 degrees | 2000 | SEDAC
World | Raster for croplands | 0.0833 degrees | 2000 | SEDAC
World | Various Rasters on Land Use | various | 1990-2010 | Nelson Institute
World | Soil type | various | na | Worldmap.Harvard
World | Various Rasters on Land Use | various | 1980-2014 | EarthStat

The model classifies the land cover in order to calculate the grid extension penalties. The default classification values are based on the MODIS dataset, whose legend ranges from 0 to 16; the values and corresponding land cover types are listed below. If land cover data are retrieved from other sources with different classification values, they should be reclassified in GIS (using the Reclassify tool in ArcGIS or r.reclass in QGIS) to match those below. Alternatively, the changes can be made in the Python code instead. If this reclassification is not performed, the result may be an incorrect grid penalty factor or, if the highest values are above 16, an error message while running the code.

Value Label
0 Water
1 Evergreen Needleleaf forest
2 Evergreen Broadleaf forest
3 Deciduous Needleleaf forest
4 Deciduous Broadleaf forest
5 Mixed forest
6 Closed shrublands
7 Open shrublands
8 Woody savannas
9 Savannas
10 Grasslands
11 Permanent wetlands
12 Croplands
13 Urban and built-up
14 Cropland/Natural vegetation mosaic
15 Snow and ice
16 Barren or sparsely vegetated
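The reclassification step can be sketched in plain Python as a lookup from source class values to the 0-16 legend above. This is a minimal illustration, not the OnSSET code and not the ArcGIS or QGIS tools. The function `reclassify`, the source legend (10, 20, 30, 40) and the chosen mapping are hypothetical assumptions; a real mapping has to be derived from the legend of the dataset actually used.

```python
VALID_CLASSES = set(range(17))  # class values 0-16 from the table above


def reclassify(grid, mapping, nodata=None):
    """Return a new grid with every cell value translated through `mapping`.

    Raises ValueError if a source value has no mapping or if a target
    value falls outside the 0-16 legend.
    """
    out = []
    for row in grid:
        new_row = []
        for value in row:
            if nodata is not None and value == nodata:
                new_row.append(value)  # leave no-data cells untouched
                continue
            if value not in mapping:
                raise ValueError(f"No mapping defined for source class {value}")
            target = mapping[value]
            if target not in VALID_CLASSES:
                raise ValueError(f"Target class {target} is outside 0-16")
            new_row.append(target)
        out.append(new_row)
    return out


# Hypothetical source legend: 10 = cropland, 20 = forest, 30 = water, 40 = built-up.
example_mapping = {10: 12, 20: 5, 30: 0, 40: 13}
example_grid = [[10, 20], [30, 40]]
print(reclassify(example_grid, example_mapping))  # [[12, 5], [0, 13]]
```

Failing loudly on unmapped or out-of-range values is a deliberate choice here: it surfaces legend mismatches before they turn into a silently wrong grid penalty factor.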


Coverage Type Resolution Year Source Link
World Coast Lines, oceans Physical vectors, ESRI shapefiles, GeoTIFF (1:10, 1:50 and 1:110 m) 2015 Natural Earth
World Climate data 30 arc seconds and 2.5/5/10 arc minutes na WorldClim
World/USA Climate change scenarios various na na
World/Australia Water and Landscape Dynamics 0.05 to 1 degrees 1979-2012 Australian National University
Open Street Map (OSM) - Osmosis osm.pbf depending on mirror source up to date NOAA
Nighttime lights Raster file 0.0083 degrees 1992-2013 na
Africa information Highway various vectors various AfDB
World Climate data various various Oregon State University
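Several of the datasets listed in this section quote their resolution in degrees or arc seconds rather than in kilometers. As a rough aid for comparing them, the sketch below converts an angular resolution into an approximate east-west ground distance. It assumes a spherical Earth with about 111.32 km per degree of longitude at the equator; the function names are hypothetical and the results are approximations only.

```python
import math

KM_PER_DEGREE = 111.32  # approximate length of one degree of longitude at the equator


def arcsec_to_degrees(arcsec):
    """Convert arc seconds to decimal degrees (3600 arc seconds per degree)."""
    return arcsec / 3600.0


def degrees_to_km(deg, latitude=0.0):
    """Approximate east-west ground size of an angular resolution at a latitude."""
    return deg * KM_PER_DEGREE * math.cos(math.radians(latitude))


# 30 arc seconds is a little under 1 km at the equator and shrinks towards the poles.
print(round(degrees_to_km(arcsec_to_degrees(30)), 3))
print(round(degrees_to_km(arcsec_to_degrees(30), latitude=60), 3))
```

For accurate distances, reprojecting the data to a suitable projected coordinate system in GIS is preferable to this approximation.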

Methodology for Open Street Map data and Osmosis


  • Open Street Map (OSM) is a collaborative project that intends to provide free and open access data used in mapping the world. This document briefly describes the methodology used to obtain OSM data and transform them into compatible and useful information with the use of Osmosis and QGIS.
  • To begin with, a bulk download of up-to-date OSM data can be performed through Planet OSM.
  • The files can be downloaded in .xml and .pbf format. However, due to the large volume of data, there are various mirrors/extracts that provide access to masked data for different regions of the planet. In previous cases, Geofabrik and BBBike were used successfully.
  • It should be mentioned at this point that an interesting tool is the Overpass API. More specifically, using the query and convert forms and redirecting to Overpass Turbo, it is possible to use the wizard function and obtain the required data for a defined area. The area is delineated by the map shown on the screen, while data types include nodes, ways and relations. The data can be exported in various formats, with .kml (amongst others) being compatible with the latest versions of QGIS. (As an example, type the word "power" in the wizard function and you will get the power-related information depicted on the map.) A disadvantage of this method is the restriction on the area size, which is limited to 100 square km.
  • Coming back to the other sources (Geofabrik, BBBike), data can be downloaded per region in .pbf format. In the latest version of QGIS it is possible to load these data directly by simply dragging the file onto the QGIS window. However, since the files are usually very large, it is recommended to transform the .pbf into a SpatiaLite database.
  • To do this transformation, open the OSGeo shell that comes with your installation, navigate to the folder in which you have your .pbf file (by typing cd [folder path]) and enter the following line: ogr2ogr -f SQLite X.sqlite Y.pbf (change X to the name you want to use for your SpatiaLite database and Y to the name of your downloaded .pbf file).
  • Once this transformation is finished (it may take some time), drag the new file into QGIS and work with it instead of the .pbf file.
  • OSM data provide access to a tremendous amount of information of various types. Feel free to explore the potential and share the results with an enthusiastic community.
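When several regional extracts have to be converted, the ogr2ogr call described above can be scripted. The following is a minimal sketch that only wraps the same command (ogr2ogr -f SQLite X.sqlite Y.pbf) using the Python standard library. It assumes that ogr2ogr (GDAL) is available on the PATH, for example inside the OSGeo shell; the function names and the file name `region.osm.pbf` are hypothetical.

```python
import shlex
import subprocess
from pathlib import Path


def build_ogr2ogr_command(pbf_path, sqlite_path=None):
    """Build the argument list for: ogr2ogr -f SQLite X.sqlite Y.pbf"""
    pbf = Path(pbf_path)
    if pbf.suffix != ".pbf":
        raise ValueError(f"Expected a .pbf file, got: {pbf_path}")
    # Default output name: same file name with .pbf replaced by .sqlite.
    out = Path(sqlite_path) if sqlite_path else pbf.with_suffix(".sqlite")
    return ["ogr2ogr", "-f", "SQLite", str(out), str(pbf)]


def convert(pbf_path, sqlite_path=None):
    """Run the conversion; requires ogr2ogr (GDAL) to be installed and on the PATH."""
    cmd = build_ogr2ogr_command(pbf_path, sqlite_path)
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)  # raises CalledProcessError if ogr2ogr fails


# Inspect the command without running it (hypothetical file name).
print(build_ogr2ogr_command("region.osm.pbf"))
```

Calling `convert("region.osm.pbf")` would then run the conversion; as noted above, this may take some time for large extracts.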