Archive for the ‘researches’ Category

“Wow!” – I said to myself after reading the R Helps With Employee Churn post – “I can create interactive plots in R?!!! I have to try it out!”


I quickly came up with the idea of creating an interactive plot for my simple model that assesses the profitable ratio between the volume of waste that could be illegally disposed of and the costs of illegal disposal [Ryabov Y. (2013) Rationale of mechanisms for the land protection from illegal dumping (an example from the St.-Petersburg and Leningrad region). Regional Researches. №1 (39), p. 49-56]. The conditions for profitable illegal dumping can be described as follows:


Here: k – the probability of being fined for illegal disposal of waste;

P – maximum fine for illegal disposal of waste (illegal dumping);

V – volume of waste to be [illegally] disposed by the waste owner;

E – costs of illegal disposal of waste per unit;

T – official tax for waste disposal per unit.

The conditions for profitable landfilling can be described as follows:

Here: V1 – total volume of waste that is supposed to be disposed at illegal landfill;

Tc – tax for disposal of waste at illegal landfill per unit;

P1 – maximum fine for illegal landfilling;

E1 – expenditures of the illegal landfill owner for disposal of waste per unit.
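The formulas themselves are not reproduced here, so the following is only a sketch of one plausible reading of the definitions above, not the published model. It assumes that dumping pays off when the expected fine plus the cost of illegal disposal is lower than the official tax, and that an illegal landfill pays off when its income exceeds the expected fine plus the owner's expenditures. The function names, the reuse of k for the landfill owner and all numbers are hypothetical; Ta stands for the official tax T.

```r
# Assumed reading of the definitions above (NOT the published formulas):
# the waste owner compares expected fine + illegal disposal costs
# with the official tax for the same volume of waste.
dumping_is_profitable <- function(k, P, V, E, Ta) {
  k * P + E * V < Ta * V
}

# Same logic for the illegal landfill owner (assumption): income from the
# accepted waste must exceed expected fine + own expenditures.
landfill_is_profitable <- function(k, P1, V1, E1, Tc) {
  Tc * V1 > k * P1 + E1 * V1
}

# Hypothetical numbers, chosen only for illustration
dumping_is_profitable(k = 1e-5, P = 5000, V = 10, E = 100, Ta = 500)
landfill_is_profitable(k = 1e-5, P1 = 100000, V1 = 1000, E1 = 50, Tc = 200)
```

With a very small k the expected fine almost vanishes from both comparisons, so under this reading the result is driven by the gap between the official tax and the illegal costs.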

Let’s plot the graphs (using some random numbers, except for the fines, to get a nice-looking representation) to get an idea of what it looks like.


Note that there is a footnote (this post provides nice examples on how to do it) with the values used for plotting – it is important to have this kind of indication if we want to create a series of plots.

Now I will show you the result and then will provide the code and some tips.

Playing with the plot

Tips and Tricks

Before I show you the code, I want to share my hard-earned knowledge about the nuances of the manipulate library. There are several ways to get a static plot like that using ggplot, but some of them will fail to be interactive with manipulate.

  1. All the data for the plot must be stored in one dataframe.
  2. All data for plots must be derived from the dataframe (avoid passing single variables to ggplot).
  3. Do not use geom_hline() for the horizontal line – generate values for this line, store them inside the dataframe and draw them as a regular graph.
  4. To create a footnote (to know exactly which parameters were used for the current graph) use the arrangeGrob() function from the gridExtra library.
  5. Always use $ inside aes() settings to address columns of your dataframe if you want plots to be interactive.
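The tips above can be sketched roughly as follows. This is not the original code of the post: build_plot_data() is a hypothetical helper, and the curve formula (a break-even cost of illegal disposal per unit, Ta - k * P / V) is an assumption made only to have something to draw. The point is the layout, with everything, including the horizontal line, living in one data frame.

```r
# Hypothetical helper: builds the single data frame the plot reads from
# (tips 1 and 2). The curve formula is an assumption, not the post's model.
build_plot_data <- function(k, P, Ta, V = seq(1, 100, by = 1)) {
  data.frame(
    V = V,
    # assumed break-even cost of illegal disposal per unit for each volume
    E_max = Ta - k * P / V,
    # horizontal line stored as a regular column instead of geom_hline() (tip 3)
    Ta_line = rep(Ta, length(V))
  )
}

df <- build_plot_data(k = 0.5, P = 5000, Ta = 300)

# Plotting is guarded so the sketch also runs where ggplot2 is not installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df) +
    geom_line(aes(x = df$V, y = df$E_max)) +   # tip 5: columns addressed with $
    geom_line(aes(x = df$V, y = df$Ta_line), linetype = "dashed")
  print(p)
}

# In RStudio, the data frame construction and the ggplot call would both go
# inside manipulate::manipulate(), with k, P and Ta bound to
# manipulate::slider() controls.
```

Recent ggplot2 versions warn that using $ inside aes() is discouraged; the sketch keeps it only because that is what tip 5 recommends for manipulate.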

The Code

<pre class="brush: r; title: ; notranslate" title="">library(ggplot2)

## Ta --- official tax for waste utilisation per tonne or cubic metre.
## k --- probability of getting fined for illegal dumping the waste owner (0


Today I received a copy of the proceedings of the conference I participated in. A peculiar detail is that my article about using the Random Forest algorithm to forecast illegal dumping sites is the very first article of my section (as well as of the whole book), and it was placed there regardless of the alphabetical order of the authors’ family names (this order is followed for all other authors in all sections).

My presentation and speech were remarkable indeed – the director of my scientific-research centre later called it “the speech of a guru” (actually, not a “guru”, there is just no suitable equivalent in English for the word used). Also, the extended version of this article for one of the journals of the Russian Academy of Sciences received extremely positive feedback from the reviewers. So I suppose the position of my article is truly a sign of respect for the research and the presentation and not a random editorial mistake.

Now I should overcome procrastination and make a post (or most likely two) about this research of mine.


Ok, it’s time to finish the story about land monitoring in Sverdlovskaya region. In this post I would like to demonstrate some of the most unpleasant types of land use.

Let’s begin with illegal dumping. This dump (note the smoke from burning waste) is located right next to a potato field (mmm… seems these potatoes are tasty). The ground was intentionally excavated here for dumping waste. Obviously this dump is used by the agricultural firm that owns this land, but who cares…

Panorama of freshly burnt illegal dump

The next stop is peat cutting. Huge biotopes are destroyed for no good reason (I can’t agree that using peat as an energy source is a good one). In the picture below you can see a peat cutting with an area of 1402 ha. There are a dozen of them in the study area…

Peat Cutting (RapidEye, natural colours)

But the ugliest scars on the Earth’s surface are left by mining works. There is the town of Asbestos in Sverdlovskaya region. It was named after the asbestos that is mined there. The quarry has an area of 1470 ha and its depth is over 400 meters. Its slag-heaps cover another 2500 ha… The irony is that this quarry gives the town its jobs and is killing it at the same time. You see, if you want to dig deeper you have to make the quarry wider accordingly. The current depth is 450 m and the planned depth is over 900 m, but the quarry is already next to residential buildings. So the quarry is going to consume the town… By the way, the local cemetery has already been consumed. Guess what happened to the human remains? Well, it is Russia, so they were dumped onto the nearest slag-heap.

Here is the panorama of the quarry. You may try to locate BelAZ trucks down there 😉

Asbestos quarry

Here is the part of the biggest slag-heap:

A slag-heap

That’s how it looks from space:

Asbestos town area (imagery – RapidEye, NIR-G-B pseudo-colour composition)

And finally I will show you a very basic schema of disturbed land in the study area (no settlements or roads included). Terrifying, isn’t it?

Basic schema of disturbed land

There was a press conference on Tuesday the 19th about illegal dumping in Leningrad region (Russia). I was asked to be the main speaker there and to present my recent study on illegal dumping prevention to the press. I had already given two presentations on this subject recently: at the international scientific conference at St. Petersburg State University and at the round table for the discussion of the upcoming “Let’s do it. Russia” clean up event.

Some video from the press conference:

The main conclusion that I drew from investigating possible measures for illegal dumping prevention (such as increasing penalties, increasing the chance of being caught and decreasing the waste disposal fee) is that decreasing the waste disposal fee for the population is the most efficient one. And I managed to find two other publications that came to the same conclusion (for example, there is evidence that a 1% increase of the waste fee leads to a 3% increase of illegal dumping cases).

By the way, I was able to assess the probability of being caught for illegal dumping in Russia. It is about 10^-5, that is, 1 in 100,000 (you can die while playing soccer with such probability).

The only way to reduce waste fees is to use waste as a resource. That means that the only way to prevent illegal dumping is to create a waste management system capable of achieving the zero waste goal.

And here is an abstract from my article:

Mechanisms of land protection are discussed in this article. An algorithm for deciding whether to dump illegally or not is explained. Formulas for determining the profitable ratio of expenditures per unit and the amount of illegally dumped waste are substantiated. The effects of different types of measures that can be used for land protection from illegal dumping are discussed (such as changes in fees, changes in penalties and changes in the probability of penalty application). Decreasing waste disposal fees is acknowledged as the most effective way of preventing illegal dumping, but it is possible only if the «zero waste» concept is implemented.

There was a scientific seminar dedicated to environmental risk assessment in the scientific-research centre where I work. Unfortunately, the speaker was awfully ignorant of the subject. As a person who is experienced in environmental risk assessment (see my posts about risks and a particular methodology), I was afraid that I would be the one to ask the speaker (quite an old man) some inconvenient questions about the formulas he used, but luckily he was put to shame by someone else.

During the discussion the question of the monetary aspect of risk and damage to the environment was raised: whether it is possible to use money as the measure of risks that are applicable only to the environment itself. In other words: is it rational to use money when assessing possible damage solely to an ecosystem (there is no money in an ecosystem by itself), and how should such an assessment be performed?

What do YOU think? I wasn’t able to find an appropriate answer at that moment, but now I believe I have a point. My answer is YES, we can use money to assess risks and damage dealt to ecosystem only.

Firstly, the assessment is made by humans and for humans. And humans understand monetised value more easily. The approach that I want to propose is to assess the money that has to be spent to restore the ecosystem to exactly the same state it was in prior to the caused or possible damage. Just imagine how much money one has to spend on recreating and reintroducing just one extinct species (the Tasmanian wolf, for example). There you have it: monetised damage to the environment.

Another approach I have in mind is the evaluation of risks via the relative life value of species (which can be easily monetised too). Let’s use this formula to evaluate the life of an individual of a given species: V=(1/N)*P, where V – relative value, N – population of the given species (or of a given range of the species), P – total population of human beings. We will have a relative value of 1 for humans and 1*(P/N) for a given species. For example, for a tiger we will have a relative individual value of about 1 076 900! Literally, if we have a choice whether to save 1 million people or a single tiger, the tiger must be saved – not a million people!!!

And we can monetise this value by multiplying it by the average value of a single human life (you can play a bit with numbers given here).

So the damage to an ecosystem may be assessed via the loss of individuals of the species that live in it. We are able to easily evaluate the relative value of the individuals of each species, and this value can be easily monetised.
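As a quick check of the arithmetic, here is a small sketch of the formula V=(1/N)*P. The population figures are assumptions: 7 billion humans and 6,500 tigers are simply inputs that give roughly the tiger value quoted above, and the monetary value of a human life is purely hypothetical.

```r
# V = (1/N) * P, written as P / N (the same quantity)
relative_value <- function(N, P) P / N

P_humans <- 7e9    # assumed total human population
N_tigers <- 6500   # assumed tiger population (illustrative input)

tiger_value <- relative_value(N_tigers, P_humans)
human_value <- relative_value(P_humans, P_humans)

tiger_value   # relative value of a single tiger
human_value   # relative value of a single human

# Monetised value: multiply by a (hypothetical) value of one human life
human_life_value <- 1e6
tiger_value * human_life_value
```

The result is very sensitive to N: halving the assumed tiger population doubles the value of each individual.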

It’s actually two-month-old news already, but my research “Development of the Universal Methodology for Assessment of Environmental Risk Caused by Fires at Illegal Dumps” (download in RUSSIAN), which was made specially for the Fire Monitoring Challenge (by GIS-Lab, Microsoft, NEXTGIS, several universities and GIS/spatial data corporations), was awarded 2nd place. The prize consisted of a fancy diploma, a Lenovo IdeaPad G560 (thanks to all the gods it became much less ugly when I installed openSUSE on it and applied an OSM sticker 😉 ), a wireless mouse (my wife was happy to grab it) and a nice book on remote sensing for children.

Instead of abstract:

A methodology was developed for assessing the probability of fire depending on the spatial location and actual area of an illegal dump. It is applicable to any part of the world. Software used: QGIS, R.

Spatial component of the probability of the fire at illegal dump in Leningrad region, Russia

I was lucky to present this research at two conferences, and today I received a printed “minor” publication of the article (it is a beta version of the paper available at the link above). So it is now possible to cite it as:

Yury V. Ryabov (2011) Razrabotka univercal’noy metodiki rascheta veroyatnosti vozniknovenia pozhara na nesankcionirovennoy svalke // Sbornik nauchnih trudov molodyh specialistov, prepodavateley i aspirantov po resultatam provedenia Tret’ego molodezhnogo ecologichescogo congressa “Severnaya palmira”, 21-22 noyabria 2011, Sankt-Peterburg. – SPb NICEB RAN – pp. 93-106.

To Do: develop a formula for composition coefficient calculation; translation to English; major publication.

P.S. If you are interested in this research and do not speak Russian, don’t hesitate to contact me and ask for a general translation.

As you may already know, I’m the proud owner of an AMD FX-8150 8-core CPU. And I purchased it not for gaming, but for science. My previous CPU was painfully slow with calculations such as determining the relation between fires and the distance to the nearest highway. I didn’t even try to perform those calculations for the whole dataset of roads mapped in OSM in Leningrad region. But now I can do this!

With the new CPU I’ve recalculated the previous distribution (with the same data) for highways only and performed a new calculation on the whole roads dataset. Some numbers first:

  • 6,990 – number of fire points detected by FIRMS over the last 10 years in Leningrad region;
  • 10,966 – number of features used as highways for the calculations;
  • 87,422 – number of features in the whole dataset of roads;
  • 2.3 GB of RAM and a single core were consumed by R during calculations for the whole dataset.


Recalculated fire distribution for the highways

The recalculated values for the highways differ from those acquired last time even though the data was the same. But there was a hardware update and, most importantly, software updates for R and its packages (the OS was updated too). Anyway, this graph looks far more reasonable than the previous one.

Let’s see what we’ve got for the whole roads dataset (I will compare it to the graph above).

Distribution calculated for the whole dataset of roads

The maximum distance from a road decreased by more than a third: from 41 to 26 kilometres. The distance for the highest values decreased accordingly: the rapid decrease stops at 7 kilometres, whereas for highways only it was 18 kilometres.

So the first 5 kilometres from the road are the most probable zone for a fire event. This distance is easily covered on foot in two hours. Another piece of evidence of the massive anthropogenic impact on fire starting.

If I ever get my hands on road data from topographic maps (OSM data was used here), I will perform the calculation again to get the most precise result.

Conclusion: the FX-8150 was worth buying )))

 Instead of introduction

Just for fun I decided to investigate the relationship between fire intensity in Leningrad region (and St. Petersburg as well) and the distance to the nearest road in order to gain evidence of the major influence of the anthropogenic factor on fire starting.

 Materials and methods

Data used:

Software used: QGIS; R.

OSM data on the major roads of Leningrad region was used to create a distance map. The distance map and the locations of the fires detected from 2001 to June 2011 were used as arguments for the R “rhohat” function of the “spatstat” package in order to investigate the dependence of fire intensity on the distance to the nearest highway.
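A minimal sketch of this workflow on made-up data is given below. It is not the original script: the study window, the two “roads” and the simulated fire points are all hypothetical, and the real analysis used OSM roads and FIRMS hotspots prepared in QGIS. It only shows how distmap() and rhohat() from spatstat fit together.

```r
library(spatstat)  # assumed to be installed; provides psp, distmap, rhohat

set.seed(1)
win <- owin(c(0, 50), c(0, 50))    # hypothetical 50 x 50 km study area

# Two made-up "roads" as line segments
roads <- psp(x0 = c(0, 25), y0 = c(10, 0),
             x1 = c(50, 25), y1 = c(40, 50), window = win)

# Made-up fire points: uniform Poisson process, about 0.2 points per sq. km
fires <- rpoispp(lambda = 0.2, win = win)

dist_map <- distmap(roads)         # pixel image: distance to the nearest road
rho <- rhohat(fires, dist_map)     # fire intensity as a function of distance

plot(rho, main = "Fire intensity vs distance to the nearest road")
```

Because the simulated fires are uniform, the estimated curve here should be roughly flat; a peak close to the roads would only appear with real, road-dependent data.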

Results and discussion

Firstly, let’s look at the map of the spatial distribution of fire intensity below (this map is rough; it was created just to demonstrate the situation in general and wasn’t used for the computations described below). Because drawing the roads on this map would obscure the fire data, I used railways instead.

Rough fires intensity distribution in Leningrad region

As you can see, fire intensity is [somewhat] related to the location of the railways: note that there are almost no fires on the east side of the map, where the railway network is sparse (the north side of the map is similar, but Ladoga Lake is located there). Of course, railways are not the cause of the fires by themselves, but I suppose that in this particular case this can be evidence of significant human influence on fire events.

I believe you think that if we want to find clear evidence of human influence on fire starting, then we should investigate population density and compare it to fire intensity. Fair enough, but you may do it yourself using official data if you want. I will not do it because the results would be flawed. There is an issue with the population: there is no data on the actual population of Leningrad region in summer (late spring and early autumn as well), which is the time of fires. At this time a lot of people from St. Petersburg go to their summer houses in Leningrad region (take into account that the population of St. Petersburg is 5 times higher than the population of Leningrad region), so you have to estimate the population of the region accordingly, and that is far from a trivial task.

So we have to investigate not the population density by itself, but the proximity of the areas that were on fire to the population. We will measure the proximity as the distance to the nearest highway. Of course it is better to use all available roads for such research, but if we do so, R will take a very long time to calculate it (there are over 80 000 features in the shp-file) – I stopped the process after 10 hours of waiting. So I used only major highways (with “primary”, “secondary”, “trunk”, “tertiary” and “motorway” attributes in OSM) – over 10 000 features.

The rhohat{spatstat} function was used to create the following graph:

Dependence of the fires distribution on the distance to the nearest highway

The graph is somewhat weird (and we will talk about it), but it is obvious that the intensity of fires has its maximum at 2-3 kilometres from the highway and then goes down. So we do have a piece of evidence that the anthropogenic factor has a major influence on fire events.

There is an interesting “sinusoid” from the 18th to the 38th kilometre. The narrow gray stripe shows the possible error, so I assume that this graph is reliable. I need to explain it somehow. My wife, who is a better mathematician than I am, says that the best explanation is a shitty computation. Well, it is possible. But I have another opinion.

Possible explanations:

  1. Remember that we left more than 70 000 roads out of our calculations for this graph, and there are a lot of roads which are not recorded in OSM. I suppose that if we used an almost complete dataset of roads in Leningrad region and a more powerful computer than mine, we would get a more adequate graph where this sinusoid might disappear.
  2. The proximity estimated here is not only proximity for “occasional fire starters” but also proximity for fire fighters. If a fire starts far away from the road, it is more difficult to fight. So this is about the fire data and how we treat it in the computation: it is possible that fire events become steadily less frequent as we move away from the highway, but a single event produces a fire that covers a larger area than average because it is hard to fight quickly, and this single event produces more “hotspots” (which were used for this mini-research) than an average event.


We have got strong evidence that the anthropogenic factor plays a major role as a cause of fires in Leningrad region: the intensity of fires reaches its maximum at a distance of 2-3 km from the road and then goes down.

The “sinusoid” between the 18th and 38th kilometres may be caused by the insufficiency of the road data used. The calculations should be repeated with more comprehensive data. It may also be necessary to pre-process the fire data in order to replace the “hotspots” related to a single fire event with a single point, i.e. to make it one point for one fire.
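The pre-processing mentioned above could be sketched as follows. This is only one possible approach, not something that was done for this post: hotspots are grouped by single-linkage clustering and each group is replaced with its centroid. The hotspot table, the projected coordinates in metres and the 1000 m threshold are all assumptions, and detection dates are ignored for brevity.

```r
# Hypothetical hotspot table: projected coordinates in metres
hotspots <- data.frame(
  x = c(0, 300, 600, 10000, 10200, 50000),
  y = c(0, 200, 100, 10000, 10100, 50000)
)

# Single-linkage clustering: hotspots linked by gaps shorter than the
# threshold end up in the same group. The 1000 m value is an assumption.
threshold_m <- 1000
tree <- hclust(dist(hotspots[, c("x", "y")]), method = "single")
hotspots$fire_id <- cutree(tree, h = threshold_m)

# One point (the centroid) per presumed fire event
fires <- aggregate(cbind(x, y) ~ fire_id, data = hotspots, FUN = mean)
fires
```

A real version would also need to split groups by detection date, otherwise fires that happened in the same place in different years would be merged into one.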

Digital Globe published all the research papers that were submitted to the 8-Band Challenge. And you can find mine there 😉

Yes, this research on Illegal Dumping Monitoring With Implementation of WorldView-2 Imagery isn’t brilliant (my skills in remote sensing and English could be better), but if you are interested in illegal dumping monitoring it may provide you with some insights. And don’t hesitate to contact me if you would like to cooperate in illegal dumping research.

One of the most interesting findings of the research is that it is hard to distinguish an illegal landfill from a construction site (which is crucial for St. Petersburg). So it is necessary to use cadastral data to determine the type of land use of the land parcel (the cadastre contains information on whether there are construction works at the given parcel).

Mean values of digital numbers for Illegal Landfill, Construction Site and Constructions (buildings) in WV-2 imagery.

Also, I wasn’t able to properly test the Change Detection method (using the Non-Homogeneous Feature Difference index calculation method developed for WV-2), because I hadn’t ordered multi-temporal imagery in the first place… But it seems that it can provide some advantages. See the paper for more information.

Today I acquired another piece of evidence that FIRMS data can be used for monitoring fires at illegal dumps. Of course I was sure it was possible, and I had been able to detect fires that occurred at one of the landfills in St. Petersburg. But I had no evidence that fires which occurred next to illegal dumps with a location known [to me] were indeed fires at those dumps: I do not have information about the age of the dumps, and the FIRMS accuracy is about 1 km, which exceeds the common size of a dump (a couple of dozen to a few hundred meters). So there was no way I could insist that fire points next to illegal dumps were actually fires at illegal dumps.

But I’ve found some good (actually – bad) news today. There is a note about fires at two illegal dumps in Leningrad region. Look what we’ve got:

Although no points are located directly at the illegal dumps, they lie within a 1 km zone from the borders of the dumps, and the news shows that these fires are actually dump fires. The bad thing is that these fires lasted for several days, but were detected just once and twice respectively.
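The 1 km check described above can be sketched in a few lines. This is a simplification under stated assumptions, not the procedure actually used: the dumps are treated as points rather than polygons with borders, the coordinates are hypothetical projected coordinates in metres, and the table and column names are made up.

```r
# Hypothetical dump centres and FIRMS fire points (projected coords, metres)
dumps <- data.frame(id = c("A", "B"), x = c(1000, 8000), y = c(1000, 8000))
fire_points <- data.frame(x = c(1500, 8000, 4000), y = c(1400, 8900, 4000))

# Distance matrix: one row per fire point, one column per dump
d <- outer(seq_len(nrow(fire_points)), seq_len(nrow(dumps)),
           function(i, j) sqrt((fire_points$x[i] - dumps$x[j])^2 +
                               (fire_points$y[i] - dumps$y[j])^2))

fire_points$nearest_dump <- dumps$id[apply(d, 1, which.min)]
fire_points$dist_m <- apply(d, 1, min)

# Flag points that fall within the ~1 km FIRMS accuracy of a known dump
fire_points$possible_dump_fire <- fire_points$dist_m <= 1000
fire_points
```

Such a flag only marks candidates; as the text notes, confirmation still has to come from an independent source such as a news report.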