Archive for the ‘GIS’ Category

A satisfied customer

Posted: November 9, 2015 in GIS, This and That

Recently I did a QGIS scripting job and here is the feedback from an extremely satisfied customer:

It was fantastic to work with Yury. He provided an excellent product, that went far beyond what I was expecting. I will certainly be contacting Yury in the future for any jobs that relate to GIS and python scripting. Communication was excellent and his ability to understand the job requirement was very impressive. A+++

Guys, if you are in need of a geoprocessing tool for your project – don’t hesitate to contact me 😉

My QGIS Processing Scripts at GitHub

Posted: October 23, 2015 in GIS

This is probably my shortest post ever.

All my QGIS processing scripts (R and Python) and models that I have already blogged about, plus some extras, are now available on GitHub.

A Quick Map With QGIS and OSM

Posted: June 18, 2015 in GIS

What I love about QGIS is that one is able to create a nice map quickly. The other day I was asked to make a situation map for the project we are working on, to include in a presentation. All I had was a laptop with no relevant spatial data at all, but with QGIS installed (I didn’t even have a mouse to draw with). Still, it was more than enough: I loaded OSM as a base layer and used the annotation tool to add more sense to it. Voilà:

Posted: June 11, 2015 in GIS
This is strange, but I was unable to find instructions on importing QGIS layers into a PostGIS database with the PyQGIS API. The PyQGIS cookbook has an example of exporting layers as .shp-files via the QgsVectorFileWriter.writeAsVectorFormat() function and says that the other OGR-supported formats can be used with this function as well. PostGIS is supported by OGR, so people get confused, try to use this function to import data into PostGIS with no success, and end up writing generic import functions.


After a couple of hours of searching the internet for a solution I gave up and decided to find the answer the hard way: I started to search through the source code of the DB Manager plugin, which has this nice “import layer” feature. It took about 20 minutes to trace down the function in charge of the import: QgsVectorLayerImport.importLayer(). But the quest wasn’t over yet! The documentation says nothing about the provider names that are accepted by this function. “PostgreSQL” would be the obvious name for the provider, as that is the name of the PostgreSQL provider in OGR, but that is not the case. I had to go through the source code of DB Manager again, and luckily in the comments (and there are quite a few of them in DB Manager: I didn’t find a single doc-string there) the author wrote that it is “postgres” for PostgreSQL.

Now here is a very basic example of importing a QGIS layer into PostGIS:

uri = "dbname='test' host=localhost port=5432 user='user' password='password' key=gid type=POINT table=\"public\".\"test\" (geom) sql="
crs = None
# layer - QGIS vector layer
error = QgsVectorLayerImport.importLayer(layer, uri, "postgres", crs, False, False)
if error[0] != 0:
    iface.messageBar().pushMessage(u'Error', error[1], QgsMessageBar.CRITICAL, 5)
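
A side note: instead of assembling the connection string by hand, you can build it with the QgsDataSourceURI class; a small sketch with made-up connection details:

from qgis.core import QgsDataSourceURI

uri = QgsDataSourceURI()
uri.setConnection('localhost', '5432', 'test', 'user', 'password')
uri.setDataSource('public', 'test', 'geom')
print uri.uri() # the assembled connection string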

One of the optional inputs for the creation of classification models in OrfeoToolbox is an XML image statistics file that is produced by the Compute Image Second Order Statistics tool. If you opt to calculate these statistics, be sure to check the created XML file. If you see an “#INF” record in the band statistics (i.e. “1.#INF”), replace “#INF” with something like “0e+32” (“1.#INF” -> “1.0e+32”), or (a better solution) calculate the statistics for the problematic band independently (probably with another tool) and replace the value completely.
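
For example, a quick way to patch such a statistics file is a couple of lines of Python (the file name here is just a placeholder):

with open('statistics.xml') as f:
    stats = f.read()

# replace the non-numeric "1.#INF" with a large finite value
stats = stats.replace('1.#INF', '1.0e+32')

with open('statistics.xml', 'w') as f:
    f.write(stats)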

If you leave an “#INF” record in the XML file, classification models created from it will contain “#IND” records, and when you try to run a classification based on these models the process will be terminated, complaining about unrecognised “#IND” values, because values must be numeric and not characters.

I don’t know what causes the “#INF” records to appear in the first place; maybe improperly defined no-data values? In some cases it is impossible to provide a correct no-data value due to limitations of the input field (I use OTB from within the Processing module for QGIS).

Pan-sharpening Using R

Posted: February 8, 2015 in GIS, Spatial data

In my previous post I described how to perform pan-sharpening using OrfeoToolbox and QGIS. This time I will show you how to do it in R. At the bottom you will find several functions I wrote on top of the ‘raster’ package that allow convenient pan-sharpening in R.

Motivation

You may wonder why I even bothered with pan-sharpening in R when I already have a nice model for pan-sharpening in QGIS. See, one can’t control the datatype of the imagery returned by pan-sharpening that involves OTB. This leads to some unpleasant consequences: during pan-sharpening one gets floating-point pixel values even if the initial values were integers. So, for example, a 600 MiB multi-spectral image (with integer pixel values) will grow to 5.2 GB after pan-sharpening. But if we change the datatype of the resulting imagery to force it to store only integers, its size is reduced from 5.2 to 2.8 GB, which is a huge difference. The ‘raster’ package in R allows control over the output datatype. Plus, in R you get different filtering options to play with.

The Theory

In OTB pan-sharpening is performed using the following well-known formula:

PXS(i, j) = XS(i, j) × PAN(i, j) / PANsmooth(i, j)

where i and j are pixel indices, PAN is the panchromatic image, XS is the multi-spectral image and PANsmooth is the panchromatic image smoothed with a kernel to fit the multi-spectral image scale.


We will implement exactly the same approach using the ‘raster’ package for R.

Code Usage and Result

As pan-sharpening is the kind of procedure that will recur over time, I decided to write generic functions for the pan-sharpening itself and for saving the results, to have an easier time in the future.

The usage is as simple as:


pan <- raster('pan.tif')
multi <- brick('multi.tif')

pansharp <- processingPansharp(pan, multi)

output_path <- 'r_pansharp' # includes filename but not the extension
saveResult(pansharp, output_path)

Here are the example results from the script and from the OTB model for one of the illegal landfills in Russia:

Initial multi-band raster

Initial panchromatic raster

Result of pan-sharpening using R script

Result of pan-sharpening using OTB

Which output do you like better: the one from OTB or from R? Comparing both results you may notice that the output from R bears heavier filtering marks than the one from OTB. On the other hand, the R output has more hues of green, which actually helps in distinguishing different types of vegetation. As you will see in the code, one can easily adjust or modify the procedure for filtering the panchromatic raster (the extractLPF() function) to get the desired output.

The code

library(raster)


# Create needed functions -------------------------------------------------

pansharpFun <- function(raster){
    '
    This function pansharpens a raster
    ' 
    # @param raster - Raster object with 3 bands (to-be-pansharpened, high-res and low-frequency component of the high-res image)
    # @return pansharpened_raster - pansharpened Raster object
    # pansharp = Lowres * Highres / LPF[Highres]
    
    (raster[,1] * raster[,2]) / raster[,3]
}

extractLPF <- function(pan, multi, filter = 'auto', fun = mean) {
    '
    Returns a low-frequency component of the high-resolution raster by the
        filter adjusted to the low-resolution raster
    '
    # @param pan - a high-resolution panchromatic raster - Raster object
    # @param multi - low-resolution raster to be pansharpened - Raster object
    # @param filter - a smoothing window - matrix
    # @param fun - a function to process filter (part of the focal() function)
    # @return LPF - a low-frequency component of the high-resolution raster - Raster object
    
    # Adjust filter size 
    if (identical(filter, 'auto')) {
        pan_res <- res(pan) # (x, y) resolution of the panchromatic raster in CRS units
        multi_res <- res(multi) # (x, y) resolution of the lowres raster in CRS units
        x_res_ratio <- round(multi_res[1]/pan_res[1])
        y_res_ratio <- round(multi_res[2]/pan_res[2])
        filter <- matrix(1, nc = x_res_ratio, nr = y_res_ratio)
        
        # Ensure that the matrix has an uneven number of columns and rows (needed by focal())
        if (nrow(filter)%%2 == 0) {
            filter <- rbind(filter, 0)
        } 
        if (ncol(filter)%%2 == 0) {
            filter <- cbind(filter, 0)
        }
    }
    
    LPF <- focal(pan, w = filter, fun = fun) # low-frequency component
    LPF
}


processingPansharp <- function(pan, multi, filter = 'auto', fun = mean){
    ' 
    Pansharpening routine
    '
    # @param pan - a high-resolution panchromatic raster - Raster object
    # @param multi - low-resolution raster to be pansharpened - Raster object
    # @param filter - a smoothing window - matrix
    # @param fun - a function to process filter (part of the focal() function)
    # @return pansharp - pansharpened 'multi' raster - Raster object

    # Check that input parameters are valid - we can lose a lot of time if one of the inputs is wrong
      
    LPF <- extractLPF(pan, multi, filter, fun)
        
    multi <- resample(multi, pan) # resample low-resolution image to match the high-res one
        
    all <- stack(multi, pan, LPF)
    
    bands <- nlayers(multi)
    pan_band <- bands + 1
    lpf_band <- bands + 2
    
    # Pansharpen layers from the low-resolution raster one by one
    pansharp_bands <- list()
    for (band in 1:bands) {
        subset <- all[[c(band, pan_band, lpf_band)]]
        raster <- calc(subset, pansharpFun)
        pansharp_bands[[band]] <- raster
    }
    
    pansharp <- stack(pansharp_bands)
    pansharp
}

saveResult <- function(raster, path, format = 'GTiff', datatype = 'INT2S'){
    '
    Saves a Raster object to the given location
    '
    # @param raster - raster to be saved - Raster object
    # @param path - path including filename without extension - string
    # @param format - format of the output raster according to the writeRaster() function - string
    # @param datatype - datatype of the raster according to writeRaster() - string
    
    writeRaster(raster, 
                path, 
                format = format, 
                datatype = datatype, 
                overwrite = TRUE)
}


# Do pansharpening --------------------------------------------------------


pan <- raster('pan.tif')
multi <- brick('multi.tif')

pansharp <- processingPansharp(pan, multi)

output_path <- 'r_pansharp' # includes filename but not the extension
saveResult(pansharp, output_path)

UPD: you may also want to check out my post on Pansharpening Using R.
For a long time I have wanted to play with the OrfeoToolBox instruments (and its GUI named Monteverdi) – a set of powerful tools to process remote-sensing imagery. Finally I got the opportunity: I needed to perform pansharpening of a WorldView-2 scene, and I decided to do it using OTB modules available via the QGIS Processing toolbox.
To make OTB modules available in QGIS you need to install OTB on your system. The official site provides good instructions on how to do it. When you have OTB installed you need to enable it in QGIS. Go to Processing -> Options and configuration -> Providers, activate OTB and provide the folder information if needed:

A dialogue to enable OTB modules in QGIS

Pansharpening is done in two steps: 1) resampling of the low-resolution raster via the Superimpose sensor tool; 2) pansharpening using the Pansharpening tool. Alternatively you can do it with a single command (no QGIS or other GUI needed) in your OS’s console:
otbcli_BundleToPerfectSensor -inp pan_image -inxs xs_image -out output_image
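
The same two steps can also be scripted in the console; here is a sketch using OTB’s standalone Superimpose and Pansharpening applications (double-check the parameter names against your OTB version):

otbcli_Superimpose -inr pan_image -inm xs_image -out xs_resampled
otbcli_Pansharpening -inp pan_image -inxs xs_resampled -out output_image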
Performing two steps instead of just one is super boring! Let’s create a model that will allow us to perform pansharpening in QGIS in (virtually) one step. Go to Processing Toolbox -> Models -> Tools and start creating a new model:

Processing toolbox location of 'Create new model tool'

In the model creation window fill in the model name and group name fields. Then add two rasters as inputs and name them High_resolution_panchrom_raster and Low_resolution_raster. Now add the module named Superimpose sensor (Orfeo Toolbox -> Geometry -> Superimpose sensor) and configure it to resample the Low_resolution_raster: use High_resolution_panchrom_raster as the input in the Reference input field and Low_resolution_raster as the input in The image to reproject field.

Superimpose sensor dialogue window

Add the Pansharpening (rsc) module to the model. Use High_resolution_panchrom_raster as the input in the Input PAN Image field and the output of Superimpose sensor as the input in the Input XS Image field. Give a name (pansharpened_OTB) to the Output image to let the model know that this is the final stage of the processing.

Pansharpening (rsc) dialogue window

Here is how our model looks in the model builder:

Overview of the Pan-sharpening model in model builder

And this is how it looks when you launch it:

Pan-sharpening model window

Also, you can skip the model creation process and just download the model I created. You need to paste the files from the archive’s models folder into your ~/.qgis2/processing/models directory.
If you have issues running OTB Pansharpening, they are likely covered in the following gis.stackexchange topics: OTB Pansharpening Error: Adapter for adaptPansharpening-bayes not found and What causes OTB pansharpening ERROR: Inputs do not occupy the same physical space?

This post is mainly my answer to the corresponding question.

The main rule for accurate measurements is to define the correct CRS for your layers and project. This is a rather broad topic, so I won’t go into it here 😉

Let’s just discuss what you should do when the project CRS can’t easily be used for measurement (like Web Mercator) and you want to use the Measure tool to calculate distance or area. Go to Settings -> Project Properties; in the CRS tab enable ‘on the fly’ transformation; in the General tab, in the Measure tool menu, choose the ellipsoid for distance calculations. See the picture:

When you start measuring with the Measure tool, move your cursor over the Measure window – information about the measurement settings will pop up, just like this:

Now you can make your measurements and be sure that they are accurate. NOTE that you should always check the Measure tool settings as shown in the picture above, because the Ellipsoid settings for the Measure tool are not always stored or displayed correctly in the Project settings (QGIS 2.6.1).

The Travelling Salesman Problem (TSP) is a well-known computational challenge, and a lot of algorithms have been developed to solve it or its special cases.

I came across an article by Fang Liu, ‘A dual population parallel ant colony optimization algorithm for solving the travelling salesman problem’. In it he proposed a modification of the Ant Colony System algorithm for solving the TSP and presented the results obtained with his algorithm. In the table of results everything looked fine: his algorithm was able to provide very good solutions for the TSP instances from TSPLIB (the common testing ground for TSP algorithms).

So the researcher presented good results… it seems. But then he decided to show the best routes his algorithm was able to find and annotated them with the corresponding route costs. Let’s take a look at one of them. Here is his best route for the ‘att48’ instance from TSPLIB:

Route claimed to be optimal (but the cost is very wrong)

The optimal route for ‘att48’ and its cost are well known (this applies to all TSPLIB instances). Its cost is approximately 33523 (there are different approaches to rounding the distances between points). So what we see in the picture above should be the optimal route (or one extremely close to it). But dear reader, do you think you are seeing an optimal route? Humans are able to provide very good solutions to TSP instances that consist of not too many points; I bet you can draw a far better route yourself. The route in this picture is 1, 8, 46, 33, 20, 17, 43, 27, 19, 37, 6, 30, 36, 28, 7, 18, 44, 31, 38, 9, 40, 15, 12, 11, 47, 21, 13, 25, 14, 23, 3, 22, 16, 41, 34, 2, 29, 5, 48, 39, 32, 24, 42, 10, 45, 35, 4, 26, 1, and its cost is 41052, which is a whopping 22% away from the optimum! The same story goes for the other illustration in the article.
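
By the way, such claims are easy to verify: TSPLIB’s att instances use a pseudo-Euclidean distance, so recomputing a route cost takes a few lines of Python. Here is a minimal sketch; att48_coords is a placeholder for the coordinate list you would load from the TSPLIB att48 file yourself:

import math

def att_distance(p1, p2):
    # TSPLIB pseudo-Euclidean (ATT) distance between two points
    xd, yd = p1[0] - p2[0], p1[1] - p2[1]
    r = math.sqrt((xd * xd + yd * yd) / 10.0)
    t = int(round(r))
    return t + 1 if t < r else t

def route_cost(coords, route):
    # coords - list of (x, y) tuples; route - closed tour as 1-based city indices
    return sum(att_distance(coords[route[i] - 1], coords[route[i + 1] - 1])
               for i in range(len(route) - 1))

# e.g. route_cost(att48_coords, [1, 8, 46, ..., 26, 1]) for the route from the picture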

Now take a look at the optimal route, whose cost really is 33523:

Actually optimal route for ‘att48’ with Cost = 33523

So what can we conclude? I do believe that the routes demonstrated in the article are the best routes found by the given algorithm, but the costs are made up, both for the routes and for the table of results in the article. I think the author developed an algorithm that wasn’t able to find good solutions and provided a fraudulent table with made-up testing results. And clearly this article wasn’t reviewed by scientists with knowledge of the TSP area, because these plots are so obviously flawed that you can’t overlook it!

No one usually shows plots of the routes found by their algorithms. I wonder how many more modern algorithms come with made-up results?

For my own project I needed to create a graph based on a Delaunay triangulation using the NetworkX Python library, with the special condition that all edges must be unique. The points for the triangulation are stored in a .shp-file. I was lucky enough to find this thread on Delaunay triangulation using NetworkX graphs. I made a nice function out of it for processing NetworkX point graphs. This function preserves node attributes (which are lost after triangulation) and calculates the lengths of the edges. It can be further improved (and most likely will be), but even in its current state it is very handy.

Example of use:

import networkx as nx
import scipy.spatial
import matplotlib.pyplot as plt

path = '/directory/'
f_path = path + 'filename.shp'
G = nx.read_shp(f_path)

GD = createTINgraph(G, show = True)

Code for the function:

import networkx as nx
import scipy.spatial
import matplotlib.pyplot as plt
from math import sqrt

def createTINgraph(point_graph, show = False, calculate_distance = True):
  '''
  Creates a graph based on Delaunay triangulation

  @param point_graph: a NetworkX point graph, e.g. one made by read_shp()
  @param show: whether or not the resulting graph should be shown, boolean
  @param calculate_distance: whether the length of the edges should be calculated, boolean
  @return - a graph made from a Delaunay triangulation

  @Copyright notice: this code is an improved (by Yury V. Ryabov, 2014, riabovvv@gmail.com) version of
                    Tom's code taken from this discussion
                    https://groups.google.com/forum/#!topic/networkx-discuss/D7fMmuzVBAw
  '''

  TIN = scipy.spatial.Delaunay(list(point_graph.nodes())) # node coordinates as an array of points
  edges = set()
  # for each Delaunay triangle
  for n in xrange(TIN.nsimplex):
      # for each edge of the triangle
      # sort the vertices
      # (sorting avoids duplicated edges being added to the set)
      # and add to the edges set
      edge = sorted([TIN.vertices[n,0], TIN.vertices[n,1]])
      edges.add((edge[0], edge[1]))
      edge = sorted([TIN.vertices[n,0], TIN.vertices[n,2]])
      edges.add((edge[0], edge[1]))
      edge = sorted([TIN.vertices[n,1], TIN.vertices[n,2]])
      edges.add((edge[0], edge[1]))


  # make a graph based on the Delaunay triangulation edges
  graph = nx.Graph(list(edges))

  #add nodes attributes to the TIN graph from the original points
  original_nodes = point_graph.nodes(data = True)
  for n in xrange(len(original_nodes)):
    XY = original_nodes[n][0] # X and Y tuple - coordinates of the original points
    graph.node[n]['XY'] = XY
    # add other attributes
    original_attributes = original_nodes[n][1]
    for i in original_attributes.iteritems(): # for tuple i = (key, value)
      graph.node[n][i[0]] = i[1]


  # calculate Euclidian length of edges and write it as edges attribute
  if calculate_distance:
    edges = graph.edges()
    for i in xrange(len(edges)):
      edge = edges[i]
      node_1 = edge[0]
      node_2 = edge[1]
      x1, y1 = graph.node[node_1]['XY']
      x2, y2 = graph.node[node_2]['XY']
      dist = sqrt( pow( (x2 - x1), 2 ) + pow( (y2 - y1), 2 ) )
      dist = round(dist, 2)
      graph.edge[node_1][node_2]['distance'] = dist


  # plot graph
  if show:
    pointIDXY = dict(zip(range(len(point_graph)), point_graph))
    nx.draw(graph, pointIDXY)
    plt.show()

  return graph