Preparing gridded raster data for visualization


This collection of articles presents a suggested workflow for creating custom data visualizations.

  1. Creating a custom vector dataset (GeoJSON)
  2. Data sources
  3. Replacing data sources
  4. Replacing name label
  5. Editing JSON
  6. Map data visualization with MapTiler Cloud
  7. Preparing gridded raster data for visualization
  8. How to add MapTiler attribution to a map

This document shows how to select and prepare raster data for use in a data visualization created with the MapTiler SDK and its Weather module. Data from different time slices are prepared to create an animation in the browser. The document highlights the steps required to get the best results when visualizing the data.

Introduction

Data visualization is a key use of maps, and it is growing in popularity beyond specialist software. While it can be easy to produce simple spatial data visualizations, such as basic choropleths, using JavaScript mapping libraries, controlling the finer details or presenting data differently for people with different needs can be difficult.

I wanted to create an interactive data visualization taking the form of a map showing how the global population has changed over time. To reveal the details, the user needs to be able to zoom and pan the map and move backward and forward in time. In this blog, I will take you through the processes and thinking behind the decisions made while building it using the MapTiler SDK Weather module.

This tutorial documents how the data was acquired and the processing and design decisions made to get the data ready for visualizing with the MapTiler SDK.

Along with this tutorial, there is a news article about advanced data visualization and two more tutorials on how to process the data in MapTiler Engine and how to code the demo using the MapTiler SDK:

  1. Visualizing population density on JavaScript Maps
  2. Global Population Density Data Processing
  3. Visualize and animate population data in a browser

Data Preparation

Visualizing global population density data over time requires aligning datasets from different time points in terms of their spatial extent and format.

Proper data preparation is key to any good data visualization. Sketches and planning will help you better understand what is possible and give you a better idea of the attributes and IDs you may need to display the data correctly or join it to other datasets.

Global Population Data

For this project, I used a gridded population of the world dataset from NASA’s Socioeconomic Data and Applications Center (SEDAC), namely the Population Density v4.11 dataset. I opted for the GeoTIFF format with the highest resolution (30 arc-seconds, which is approximately 1 km squares at the equator) and downloaded the data for 2000, 2005, 2010, 2015, and 2020.

Note: Downloading the data from SEDAC is free but requires you to be logged in.

SEDAC.png

The data are float32 rasters in which each pixel’s value is the average population density, expressed as “the number of people per square kilometer”. More information about this data can be found in the very comprehensive sidecar document from SEDAC (p. 14, “2. Population Density, v4.11 (2000, 2005, 2010, 2015, 2020)“).

Data Processing

I am using MapTiler’s Weather library for this data visualization, as it is designed to display raster maps that change over time. We want the maps to display on the web, so we’ll use the Web Mercator projection, which all JavaScript mapping libraries can handle. We’ll also want the data in PNG format so the tiles are compact enough to load quickly and keep the map interactive. The map should zoom in to at least level 7 to reveal all the details.

First, let’s look at the output size. Here, I made decisions to ensure the output is manageable and has enough detail.

The MapTiler Weather module uses square Web Mercator tiles. The total size of the data depends on the maximum zoom level and the individual tile size. Let’s see how far we can go:

Zoom level (z) | Tiles per axis | Total size in pixels | Total tiles
0 | 1 | 512 | 1
1 | 2 | 1 024 | 4
2 | 4 | 2 048 | 16
3 | 8 | 4 096 | 64
4 | 16 | 8 192 | 256
5 | 32 | 16 384 | 1 024
6 | 64 | 32 768 | 4 096
7 | 128 | 65 536 | 16 384

The original data is 43 200 × 21 600 px and could be reprojected to a 43 200 × 43 200 px image in Web Mercator without losing detail along the longitude axis. Since the pixel size of a tiled dataset can only be 512 × 2^z (a power of two), we need to choose wisely which zoom level (z) we want to generate tiles up to. In this situation, we have two possibilities:

  • I downsample from 43 200 to 32 768, targeting a max zoom level of 6
  • I upsample from 43 200 to 65 536, targeting a max zoom level of 7

To avoid losing precision, we’ll upsample and generate tiles up to zoom level 7. (spoiler: each yearly tileset up to zoom level 7 will be ~100 MB, for a total of 21 845 tiles!)
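As a quick sanity check, these figures can be computed directly; here is a minimal Python sketch (the 512 px tile size and the zoom levels match the table above):

# Tiles are 512 px wide; at zoom z there are 2**z tiles per axis and 4**z tiles in total.
TILE_SIZE = 512
for z in range(8):
    tiles_per_axis = 2 ** z
    print(z, tiles_per_axis, TILE_SIZE * tiles_per_axis, tiles_per_axis ** 2)

# Cumulative tile count for zoom levels 0 through 7:
print(sum(4 ** z for z in range(8)))  # -> 21845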

Next, we need to look at the scaling of the data. I’ve chosen to use PNG tiles, though you could also use WebP or JPEG. 8-bit PNG pixel values are limited to integers between 0 and 255. Since the population density greatly exceeds 255, we must scale the density values. We can find the max value by reading the highest pixel value in each TIFF file using gdalinfo or any image analysis software. Be aware that if there are errors in the data, the software will pick these up too.
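For example, GDAL can compute and print a band’s statistics, including the minimum and maximum pixel values (shown here on the 2000 file from SEDAC):

gdalinfo -stats gpw_v4_population_density_rev11_2000_30_sec.tif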

Note: If your data contains an erroneous outlier, it could lead to a mistake at this point. Cross-check any maximum value results with trusted research on the topic of your data! The data used in this example contained a peak value of over 800 000 people per km². Research on the topic reveals a sensible value to be nearer 80 000.

These are peak values found at the heart of the densest urban areas such as Macau, Paris, or Manhattan. When it comes to linear scaling down to a smaller integer range, there is no “one size fits all” strategy. We can include the maximum densities, but this will create larger “bins”: each value from the range [0-255] will represent more people, hence losing granularity.

Example:

  • scaling 0-20 000 down to 0-255 will result in a precision step of 78
  • scaling 0-40 000 down to 0-255 will result in a precision step of 156
  • scaling 0-80 000 down to 0-255 will result in a precision step of 313

In terms of data visualization, capturing the peak values is great, as they often act as a reference point (at least visually) for the rest of the data. However, capturing the fluctuations where values are minimal is important in less populated places. As we see above, if we choose a maximum value of 40 000, we won’t be able to distinguish densities between 0 and 156 people per square kilometer, and this granularity is too coarse for this project. It especially matters for visualizing changes in rural areas.

But scaling does not have to be linear! To capture both the peak values and the fine variations at the lower end, it’s better to encode the square root (sqrt) of the values. Let’s see how the two compare once decoded:

Tile value (uint8) | Population density (linear) | Population density (sqrt)
0 | 0 | 0
1 | 156 | 1
2 | 313 | 4
3 | 470 | 9
4 | 627 | 16
… | … | …
120 | 18 823 | 14 400
121 | 18 980 | 14 641
122 | 19 137 | 14 884
… | … | …
251 | 39 372 | 63 001
252 | 39 529 | 63 504
253 | 39 686 | 64 009
254 | 39 843 | 64 516
255 | 40 000 | 65 025

While the linear encoding applies the same step across the whole range, the sqrt method applies a much finer step to small values and a larger one at the upper end. As a bonus, sqrt can also encode larger peak values: the cap is now 255² = 65 025!
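Here is a minimal Python sketch of the two encodings (the helper names are mine; the linear variant assumes the 40 000 cap used above):

import numpy as np

MAX_LINEAR = 40_000  # assumed cap for the linear encoding

def encode_linear(density):
    # one uint8 step represents roughly 156 people per km2
    return np.clip(np.round(density * 255 / MAX_LINEAR), 0, 255).astype(np.uint8)

def decode_linear(value):
    return value * MAX_LINEAR / 255

def encode_sqrt(density):
    # finer steps at low densities; caps at 255**2 = 65 025
    return np.clip(np.round(np.sqrt(density)), 0, 255).astype(np.uint8)

def decode_sqrt(value):
    return value.astype(np.float64) ** 2

d = np.array([150.0, 15_000.0, 64_000.0])
print(decode_linear(encode_linear(d)))  # coarse at the low end, clipped at 40 000
print(decode_sqrt(encode_sqrt(d)))      # fine at the low end, keeps the ~64 000 peak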

Here’s how they compare in a more graphical form (green is linear, red is sqrt, black is 255, the maximum value possible in the PNG). I have swapped the axes to make it easier to read off the values.

encoding.png
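The curves can be reproduced with a short matplotlib sketch (axes swapped as described, with density on the horizontal axis):

import numpy as np
import matplotlib.pyplot as plt

v = np.arange(256).astype(float)  # encoded uint8 tile values
plt.plot(v * 40_000 / 255, v, "g", label="linear decode (max 40 000)")
plt.plot(v ** 2, v, "r", label="sqrt decode (max 65 025)")
plt.axhline(255, color="k", label="255, the uint8 cap")
plt.xlabel("population density (people per km2)")
plt.ylabel("tile value (uint8)")
plt.legend()
plt.show()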

Having a fine granularity on the lower end also makes our visualization more interesting in less crowded areas. Here is the East Coast of the US seen with the TURBO color ramp capped at 40 000 (year 2000):

Population density (linear, max: 40 000) | Population density (sqrt, max: 65 025)
linear40k.jpeg | linear65k.jpeg

We can already spot patterns at a semi-global scale thanks to the sqrt encoding and its smaller steps at low density. Have a look at the Appalachian terrain and its population density:

Appalachian terrain | Appalachian population density (sqrt)
AppalachianTerrain.jpeg | AppalachianSqrt.jpeg

To compute the sqrt form of the original GeoTIFF provided by SEDAC, we can use the following GDAL command:

gdal_calc.py -A gpw_v4_population_density_rev11_2000_30_sec.tif \
  --outfile=density_2000_sqrt.tif \
  --calc="numpy.where(A<0, 0, numpy.sqrt(A))" \
  --hideNoData

In the above, we also map the no-data value (-3.402823e+38) to 0: any negative input is replaced with zero before the square root is taken.

As a result, we get a GeoTIFF that is still float32 and still in the original projection, but with scaled values and no-data replaced by zeros (which makes sense in our case because no-data appears only where the population density is zero).
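Since the same conversion is needed for every time slice, the command can be run in a loop; here is a shell sketch assuming the SEDAC file-naming pattern shown above:

for YEAR in 2000 2005 2010 2015 2020; do
  gdal_calc.py -A gpw_v4_population_density_rev11_${YEAR}_30_sec.tif \
    --outfile=density_${YEAR}_sqrt.tif \
    --calc="numpy.where(A<0, 0, numpy.sqrt(A))" \
    --hideNoData
done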

Global Population data conversion

The final step in data processing is to convert from the data’s original projection, plate carrée (equirectangular), to Web Mercator. The most convenient way to tile a GeoTIFF, regardless of its source projection, is to use MapTiler Engine.

Once you open MapTiler Engine, you must follow a few steps to convert the GeoTIFF, export it as MBTiles, and host it on MapTiler Cloud. The steps are detailed in the Global Population Density Data Processing with MapTiler Engine tutorial.

engine.png

We will have to repeat these steps for the years 2005, 2010, 2015, and 2020. At the end of the process, the new raster tilesets are available under the My Tiles section of your MapTiler Cloud space, where you can get some info about them, including their tileset IDs (these will be useful later!).

Building an interactive JavaScript Web Map Visualization

Now that our data is ready for visualization, we need technology that can handle zooming, panning, and timelapse animation. At MapTiler, we have developed the Weather library for our web mapping SDK. This JavaScript library and the data layers we provide with it let you show many different animated layer types (temperature, wind with particles, cloud coverage, etc.) in a super easy way.

This library could also be used for non-weather raster data visualization, animated and interpolated over time; this is exactly what I did with the population density! The complete source code for the interface seen in the demo at the top of this article can be found in the Visualize and animate the evolution of population data tutorial.

Next steps

Continue to How to add MapTiler attribution to a map to learn how to include an appropriate copyright attribution.