I came up with a weekly R challenge as a way to “force” myself into learning new things. One figure, every week! The catch is that every time I should try something new… maybe I’ll use a new type of visualization for the first time, or maybe I’ll explore a new dataset. Initially I thought it would be neat to have some unifying topic, e.g. water, but then I decided to keep it open.

### Intro to entry #2 for 2019

I moved to Aalborg (Denmark) on Thursday last week (17 Jan), and the next morning I had my first job interview. A couple of hours later I got a phone call from the boss, who told me that I was very (or maybe it was “most”) qualified for the position, but they needed someone who could communicate in Danish now. I wasn’t really surprised that I didn’t get the job. The project/program is mid-way, and there is no time/resources/patience to wait for me to improve and start speaking Danish. Hiring someone who’s not fluent in the working language would also mean revising all written reports and covering the oral presentations… for the first few months. The experience was good practice, and maybe one day I’ll work with the people who interviewed me, but the big take-home message is: no Danish, no job. This reaffirmed my plan to spend the next couple of months actively learning Danish and getting back into the national policy and other water/environment documents.

For this week’s challenge, I decided to continue with the water consumption data (last week: Bulgaria & Singapore), but also to check out the most recent DANVA benchmarking report. DANVA stands for Danish Water and Wastewater Association, which is the national industry and stakeholder organisation of Danish water and wastewater utilities. Information about the benchmarking can be found here.

I am keeping the color scheme and the y-axis scale the same as last week, but I am trying the symbols graph type with squares. Each square’s side is proportional to the water consumption for the specific year. The sizes are scaled so that the largest data-point is an inch wide. The largest water consumption in the period was in 2013, so the largest square is in 2013. More about the Danish water consumption → after the graph.

### Data

For this week’s graph, I used the official Danish statistics (dk: Danmarks Statistik) on:

1. Households’ consumption of water (Physical water accounts) for the period 2010-2016, given in 1000 m3. The data can be downloaded from here (dk: Forbrug af vand, Fysiske vandregnskab).
2. Population figures from the censuses (for the entire country). They can be downloaded from here (dk: Folketal, summariske tal fra folketællinger).

# years covered by the water accounts
year <- 2010:2016
# households' water consumption, converted from 1000 m3 to m3
h2o.m3 <- c(236910, 237352, 238396, 245861, 232541, 217466, 210001)*1000  # m3
# population of Denmark in each year
ppl <- c(5534738, 5560628, 5580516, 5602628, 5627235, 5659715, 5707251)  # n

### Graph and code

I couldn’t find ready-made consumption figures in liters per capita per day, so I calculated them from the available data. Of course, this is quite a rough estimate, but it comes out nearly the same as the estimate by DANVA (see below).

# convert to liters
h2o.l <- h2o.m3*1000  # l
# how many days are there in the years 2010:2016
days <- c(365, 365, 366, 365, 365, 365, 366)  # 2012 & 2016 are leap years
# calculate in l/cap/day
l.cap.d <- round(h2o.l/ppl/days)
# dataframe
data <- data.frame(year, l.cap.d)
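
Written out as a formula, the conversion above looks like this. The symbols are just my own labels for this sketch: $V$ is the yearly household consumption in m3, $P$ is the population, and $D$ is the number of days in the year. The 2016 figures are plugged in as a worked example.

$$
\text{l/cap/day} = \frac{V \times 1000}{P \times D}
\qquad\Rightarrow\qquad
\frac{210\,001\,000 \times 1000}{5\,707\,251 \times 366} \approx 100.5 \approx 101
$$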

The new thing for me is using the symbols plot. I had no idea that you can plot multivariate data with it. It allows for: circles (bubble chart), squares, rectangles, stars, thermometers, and boxplots. I’m quite curious what the thermometers look like, so I may test them when an appropriate dataset comes along. I was more interested in getting an interesting visual representation than in the informativeness of the figure.
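
Just to get a feel for the thermometers, here is a minimal sketch with the same yearly values. It is not part of this week's figure. The thermometer width and height, the `steelblue` color, and the idea of filling each thermometer as a fraction of the largest yearly value are all my own assumptions for illustration.

```r
# rounded l/cap/day values for 2010-2016 (from the calculation above)
year <- 2010:2016
l.cap.d <- c(117, 117, 117, 120, 113, 105, 101)

# symbols() expects a matrix for thermometers:
# column 1 = width, column 2 = height, column 3 = filled proportion (0 to 1)
# (the 0.3 and 1 are arbitrary illustrative sizes)
therm <- cbind(width = 0.3, height = 1, fill = l.cap.d / max(l.cap.d))

# one thermometer per year; fg is the fill color for thermometers
symbols(x = year, y = l.cap.d, thermometers = therm, inches = 0.5,
        fg = "steelblue", xlab = "", ylab = "l/cap/day")
```

The year with the largest consumption gets a completely filled thermometer, and all the others are filled relative to it.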

# setting the graph params for background, colors, fonts etc.
par(mar=c(1, 4, 1, 1), bg="#1A66FF", col.axis="#E6FFFF", col.lab="#E6FFFF", family="mono", fg="#E6FFFF")
library(scales)  # provides dichromat_pal(); the dichromat package must be installed too
cols <- rev(dichromat_pal("LightBluetoDarkBlue.10")(8))
plot(x=data$year, y=data$l.cap.d, type="n", xlim=c(2009, 2017), ylim=c(80, 160),  ylab="l/cap/day", xlab="", bty="n", axes=FALSE)
grid()
symbols(y=data$l.cap.d, x=data$year, squares = data$l.cap.d, bg=cols, fg=cols, add = TRUE, xlim=c(2009, 2017), ylim=c(80, 160))
points(x=data$year, y=data$l.cap.d, pch=15, col= "#1A66FF")
axis(side=2, las=2, lwd=0, lwd.ticks = 1)
mtext("@DenitzaV", side=1, line=-1, adj = 1, col="#E6FFFF", cex=0.7)
text(year-0.3, l.cap.d-4, labels = year, col="#1A66FF", cex=0.8)

The graph shows that the water consumption in 2016 was the lowest in the period, and that it has been steadily decreasing since 2013. Danmarks Statistik does not provide the numbers for pre-2010 water consumption, so unfortunately, that’s all we can say about these 7 data-points.

The DANVA benchmarking report (which is based on 52 water supply companies, representing 3.2 million users) gives the water consumption in 2016 as 104 l/cap/day [1]. My calculation gives 101 l/cap/day, which is pretty close to DANVA’s estimate. The newest DANVA benchmarking report [2] says that the water consumption in 2017 was 103 l/cap/day, while in 1987 this number was 172 l/cap/day, which is a 40% decrease in 31 years (p.7, Vand i Tal 2018 by DANVA). See my translation of this part of the report in my next blog-post.
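
A quick check of these numbers in R, as a small sketch. The variable names are made up for this snippet, and the values are the ones quoted above.

```r
# DANVA figures quoted above, in l/cap/day
danva.1987 <- 172
danva.2017 <- 103
danva.2016 <- 104
# my own estimate for 2016, in l/cap/day
own.2016 <- 101

# relative decrease from 1987 to 2017, in percent
decrease.pct <- (danva.1987 - danva.2017) / danva.1987 * 100
round(decrease.pct)  # 40

# gap between DANVA's 2016 figure and my estimate, in percent of DANVA's figure
gap.pct <- (danva.2016 - own.2016) / danva.2016 * 100
round(gap.pct, 1)  # 2.9
```

So my rough estimate is within about 3% of DANVA’s number for 2016.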

That’s all!

1. See p.5 of Water in Figures 2017 (pdf)

2. Vand i Tal 2018 (so far only in Danish) (pdf)