This lesson covers some basics about making interactive maps in R. We will be looking at point and polygon data in this lesson.
First, let’s load some of the libraries we will be using today.
library(sf)
library(leaflet)
library(tidyverse)
library(data.table)
library(ggplot2)
library(formatR)
library(tidyft)
library(readr)
library(htmltools)
Leaflet is an easy way to make interactive maps with R. Leaflet maps work similarly to ggplot2 graphics, as they are built in layers. The layers start with basemaps and work up through points, polygons, popup labels or tooltips. You can also add clustering functions and layer controls to leaflet maps.
There are many different types of basemaps that can be used in leaflet. Here is a vignette on “Using Basemaps” from the leaflet webpage. You can also preview available provider tiles. Note that some of the basemaps require an account to access.
# Example basemaps
leaflet() %>%
addTiles()
leaflet() %>%
addProviderTiles("Esri.WorldTopoMap")
leaflet() %>%
addProviderTiles("Esri.NatGeoWorldMap")
leaflet() %>%
addProviderTiles("CartoDB.DarkMatter")
For this tutorial we are going to map The Universal Mill List of palm oil mills compiled by Rainforest Alliance and the World Resources Institute (WRI). The list is available as a .csv file here or a .shp file here.
We’ll work with the .csv file here. When loading the data, you’ll notice it has some issues with encoding and nonstandard characters. This is common when using real world data, as it may or may not be optimized for your system. A big part of working with data is figuring out how to clean it and read it properly on your particular computer. I’m including some of the links that helped with troubleshooting to demonstrate that workflow.
mills <- read_csv("https://maddiebrown.github.io/ANTH475/data/UML-Jan-2026.csv",
local = locale(encoding = "UTF-8")) #(this seems to have an issue loading, so trying a read_csv() as suggested here: https://www.reddit.com/r/rstats/comments/dtgk9n/i_am_new_to_r_and_am_having_a_problem_importing_a/)
# mills$Latitude <- as.numeric(mills$Latitude)
names(mills)
[1] "UML ID" "Group Name"
[3] "Parent Company" "Mill Name"
[5] "RSPO Status" "RSPO type"
[7] "Date RSPO Certification Status" "Latitude"
[9] "Longitude" "GPS coordinates"
[11] "ISO" "Country"
[13] "Province" "District"
[15] "Confidence level" "Alternative name"
str(mills)
Classes 'data.table' and 'data.frame': 2399 obs. of 16 variables:
$ UML ID : chr "PO1000000017" "PO1000000019" "PO1000000020" "PO1000000021" ...
$ Group Name : chr "AGROPALMA" "KULIM (MALAYSIA) BERHAD" "NBPOL" "ROYAL GOLDEN EAGLE" ...
$ Parent Company : chr "AGROPALMA" "EPA MANAGEMENT SDN BHD" "NEW BRITAIN PALM OIL" "INTI INDOSAWIT SUBUR" ...
$ Mill Name : chr "CPA" "SEDENAK" "MOSA" "BUATAN I" ...
$ RSPO Status : chr "Not RSPO Certified" "RSPO Certified" "RSPO Certified" "RSPO Certified" ...
$ RSPO type : chr "" "RSPO Certified, IP, MB" "RSPO Certified, IP" "RSPO Certified, MB" ...
$ Date RSPO Certification Status: chr "13/01/2026" "13/01/2026" "13/01/2026" "13/01/2026" ...
$ Latitude : chr "-2.253539" "1.730858" "-5.6225" "0.4344444" ...
$ Longitude : chr "-48.58567" "103.538323" "150.23735" "101.825" ...
$ GPS coordinates : chr "-2.253539, -48.58567" "1.730858, 103.538323" "-5.6225, 150.23735" "0.4344444, 101.825" ...
$ ISO : chr "BRA" "MYS" "PNG" "IDN" ...
$ Country : chr "Brazil" "Malaysia" "Papua New Guinea" "Indonesia" ...
$ Province : chr "Par\xe1" "Johor" "West New Britain" "Riau" ...
$ District : chr "Acar\xe1" "Kulaijaya" "Talasea" "Siak" ...
$ Confidence level : chr "1-Fully Verified" "1-Fully Verified" "1-Fully Verified" "2-High Confidence" ...
$ Alternative name : chr "" "" "" "" ...
- attr(*, ".internal.selfref")=<externalptr>
### here is a function from stack overflow that can help with multiByte
### character issues
### (https://stackoverflow.com/questions/4993837/r-invalid-multibyte-string)
find_offending_character <- function(x, maxStringLength = 256) {
print(x)
for (c in 1:maxStringLength) {
offendingChar <- substr(x, c, c)
# print(offendingChar) #uncomment if you want the indiv characters
# printed the next character is the offending multibyte Character
}
}
string_vector <- mills$Latitude
# lapply(string_vector, find_offending_character)
# fix latitude vector
mills$Latitude <- str_replace_all(mills$Latitude, "\xa0", "")
mills$Latitude <- as.numeric(mills$Latitude)
mills$Longitude <- str_replace_all(mills$Longitude, "\xa0", "")
mills$Longitude <- as.numeric(mills$Longitude)
Now that the latitude and longitude columns have been cleaned we can
plot our data. Leaflet works much like ggplot2
in that you build the map in layers. Below, we add in basemap tiles and
then add circle markers on top using the mills data.
leaflet() %>%
addTiles() %>%
addCircleMarkers(lng = as.numeric(mills$Longitude), lat = mills$Latitude)
Using filter() from tidyverse, make an
object that only includes the mills from Indonesia. Hint: Look at your
variable names and figure out which one you will need to select the data
from, then only filter out rows that equal “Indonesia”.
Map the Indonesian mills with leaflet. Hint: Look
back at the code for mapping global mills.
Indonesia_mills <- mills %>%
filter(Country == "Indonesia")
leaflet() %>%
addTiles() %>%
addCircleMarkers(lng = as.numeric(Indonesia_mills$Longitude), lat = Indonesia_mills$Latitude)
You can also cluster map markers dynamically as you zoom.
leaflet() %>%
addTiles() %>%
addMarkers(clusterOptions = markerClusterOptions(), lng = Indonesia_mills$Longitude,
lat = Indonesia_mills$Latitude)
You may want to color the map markers based on whether or not they have a specific attribute. In this case, we can use the code below to color and resize the points based on whether or not they are RSPO Certified.
First, let’s remove extra spaces in the column names of our dataframe.
# fix column names:
# https://dnidzgorski.wordpress.com/2017/06/09/r-fix-column-names-spaces/comment-page-1/
# code adapted from here
names(Indonesia_mills) <- str_replace_all(names(Indonesia_mills), c(` ` = ""))
Indonesia_mills <- Indonesia_mills %>%
as.data.table() %>%
utf8_encoding(GroupName)
Then we can map the mill locations using leaflet and color them based on their certification status.
leaflet(Indonesia_mills) %>%
addTiles() %>%
addCircleMarkers(lng = Indonesia_mills$Longitude, lat = Indonesia_mills$Latitude,
radius = ~ifelse(Indonesia_mills$RSPOStatus == "RSPO Certified", 5, 1), color = ~ifelse(Indonesia_mills$RSPOStatus ==
"RSPO Certified", "orange", "red"))
A neat thing in Leaflet is that you can add popup
labels or other interactive features. In the code below, we take the
map we previously made and add in a popup argument that
shows the Group Name and RSPO Status for each mill.
leaflet(Indonesia_mills) %>%
addTiles() %>%
addCircleMarkers(lng = Indonesia_mills$Longitude, lat = Indonesia_mills$Latitude,
radius = ~ifelse(RSPOStatus == "RSPO Certified", 5, 1), color = ~ifelse(RSPOStatus ==
"RSPO Certified", "purple", "yellow"), popup = ~paste("<b>Mill name:</b>",
htmltools::htmlEscape(Indonesia_mills$GroupName), "<br/>", "<b>RSPO Status:</b>",
htmltools::htmlEscape(Indonesia_mills$RSPOStatus), sep = " "))
We can also add options to turn data layers on and off or change the basemaps. In the code below, we make a map in which you can change the basemap and also toggle on and off the markers for mills based on whether or not they are RSPO certified.
# first make two data layers: one for certified mills, one for noncertified
# mills
RSPOcertified <- Indonesia_mills %>%
filter(RSPOStatus == "RSPO Certified")
NotRSPOcertified <- Indonesia_mills %>%
filter(RSPOStatus != "RSPO Certified")
leaflet(Indonesia_mills) %>%
addTiles() %>%
addProviderTiles("Esri.WorldTopoMap", group = "Topo") %>%
addProviderTiles("Esri.NatGeoWorldMap", group = "National Geographic") %>%
addMarkers(data = RSPOcertified, group = "RSPO Certified") %>%
addCircleMarkers(data = NotRSPOcertified, group = "Not RSPO Certified") %>%
addLayersControl(baseGroups = c("Topo", "National Geographic"), overlayGroups = c("RSPO Certified",
"Not RSPO Certified"))
Sometimes we want to map things that we know the name of, the address, or the city, but we don’t have the latitude and longitude. In this case, we can geocode the data to identify the spatial location. One way to geocode in R is to use the tidygeocoder package. You can also use tmaptools to geocode. See SESYNC’s guide to Geocoding with R for more information on available packages.
In this example, we will geocode the locations of arboretums in Maryland. These arboretum names were copied from a table on this Wikipedia page with a “List of botanical gardens and arboretums in Maryland”.
First we read in the data and then load the library
tidygeocoder. Then, we use the function
geocode() to match the place names to spatial
coordinates.
arboretums <- read.csv("https://maddiebrown.github.io/ANTH475/data/marylandarboretums.csv")
arboretums
Name
1 Adkins Arboretum
2 Brookside Gardens
3 Cylburn Arboretum
4 Historic London Town and Gardens
5 Ladew Topiary Gardens
6 McCrillis Gardens
7 Howard Peters Rawlings Conservatory and Botanic Gardens of Baltimore
8 Salisbury University Arboretum
9 Helen Avalynne Tawes Garden
10 University of Maryland Arboretum & Botanical Garden
11
Affiliation City X X.1
1 Tuckahoe State Park Ridgely NA NA
2 Maryland-National Capital Park and Planning Commission Wheaton NA NA
3 City of Baltimore Recreation and Parks Department Baltimore NA NA
4 Edgewater NA NA
5 Monkton NA NA
6 Maryland-National Capital Park and Planning Commission Bethesda NA NA
7 City of Baltimore Recreation and Parks Department Baltimore NA NA
8 Salisbury University Salisbury NA NA
9 Maryland Park Service Annapolis NA NA
10 University of Maryland College Park NA NA
11 NA NA
library(tidygeocoder)
# Run the function to geocode the place names. This can take a bit of time.
# Here we geocode the name, city, and a city_state combo to compare the results
# of each operation.
arboretums_spatial <- arboretums %>%
geocode(Name)
arboretums_city <- arboretums %>%
geocode(City)
# since all the arboretums are in Maryland, there is no 'state' column in this
# dataset. We can add one with the code below
arboretums_state <- arboretums %>%
dplyr::mutate(State = "MD")
# Now we can geocode using the combination of the city name and state name
arboretums_city_state <- arboretums_state %>%
geocode(city = City, state = State)
addCircleMarkers() function, add an argument for
label= set to the name of the arboretum. See this article on
adding labels for help.leaflet() %>%
addTiles() %>%
addCircleMarkers(lng = arboretums_spatial$long, lat = arboretums_spatial$lat,
label = arboretums_spatial$Name)
leaflet() %>%
addTiles() %>%
addCircleMarkers(lng = arboretums_city$long, lat = arboretums_city$lat, label = arboretums_city$Name)
leaflet() %>%
addTiles() %>%
addCircleMarkers(lng = arboretums_city_state$long, lat = arboretums_city_state$lat,
label = arboretums_city_state$Name)
Continuing our mapping of oil palm in Indonesia, let’s add in a layer that contains different types of tree plantations as polygon features. The Tree plantations dataset from Global Forest Watch (GFW) contains polygons and information about global tree plantations. For this tutorial, lets filter out just the data from Indonesia and Malaysia.
For this section, you will learn to load data into R from a shapefile. More information on loading shapefiles into R
This Try It asks you to practice downloading and using real data from the Internet.
Navigate to the Global Forest Watch tree plantation data website.
Click on “View Map” and find the “Filter Data” button on the sidebar menu. Filter out only the data from Indonesia.
Click on the “Download” button on the sidebar menu and be sure to “Toggle Filters” on so that you reduce the number of records to be downloaded. At the time of this tutorial writing, there were around 11,000 records.
Download the file, unzip it and read in the shapefile by pasting
the filepath into the read_sf() function.
Plot your data using a syntax similar to this:
leaflet() %>% addTiles() %>% addPolygons(data=treeplantation, popup=paste(treeplantation$spec_org), color=ifelse(treeplantation$spec_org %like% "Oil palm", "red", "yellow")).
Replace the data object name with your spatial object name.
If your data aren’t mapping, it might be a projection issue. You
can set the projection using this code:
projected_data_object <- st_transform(YOUR_DATA_OBJECT, "+init=epsg:4326").
Replace the data object with your named object. See this
forum post that helps with troubleshooting this issue.
We can also load geojson files into R. The code below walks you through loading and mapping the Indonesian tree plantations as a .geojson file.
Navigate back to the Global Forest Watch tree plantation data website. Follow the same instructions outlined in the previous section, but this time download the data as a .geoJSON file.
Once the data are loaded, take a look at the data structure.
treeplant <- read_sf("your file path")
str(treeplant)
Suppose we wanted to map only certain types of tree plantations. First, we can look at the list of unique species listed as the first species in each plantation as well as the simplified names.
unique(treeplant$spec_1)
table(treeplant$spec_1)
unique(treeplant$spec_simp)
Now let’s make a map showing some of the different species found in the tree plantations in Indonesia. We can color the layers based on the species and add control over the map layers.
# data selection code adapted from here:
# https://library.virginia.edu/data/articles/data-scientist-as-cartographer-an-introduction-to-making-interactive-maps-in-r-with-leaflet
leaflet(treeplant) %>%
addTiles() %>%
addPolygons(data = treeplant[treeplant$spec_1 == "Coffee", ], popup = ~spec_org,
color = "Orange", weight = 5, group = "Coffee") %>%
addPolygons(data = treeplant[treeplant$spec_1 == "Oil palm", ], popup = ~spec_org,
color = "Red", weight = 5, group = "Oil Palm") %>%
addLayersControl(overlayGroups = c("Coffee", "Oil Palm"), options = layersControlOptions(collapsed = F))
leaflet(treeplant) %>%
addTiles() %>%
addPolygons(data = treeplant[treeplant$spec_1 == "Coffee", ], popup = ~spec_org,
color = "Orange", weight = 5, group = "Coffee") %>%
addPolygons(data = treeplant[treeplant$spec_1 == "Oil palm", ], popup = ~spec_org,
color = "Red", weight = 5, group = "Oil Palm") %>%
addPolygons(data = treeplant[treeplant$spec_1 == "Eucalyptus", ], popup = ~spec_org,
color = "Magenta", weight = 5, group = "Eucalyptus") %>%
addCircleMarkers(data = Indonesia_mills, lng = Indonesia_mills$Longitude, lat = Indonesia_mills$Latitude,
radius = 2, group = "Mills") %>%
addLayersControl(overlayGroups = c("Coffee", "Oil Palm", "Eucalyptus", "Mills"),
options = layersControlOptions(collapsed = F))