ANTH475 Home

Overview

This lesson covers some basics about making interactive maps in R. We will be looking at point and polygon data in this lesson.

Loading Libraries

First, let’s load some of the libraries we will be using today.

library(sf)
library(leaflet)
library(tidyverse)
library(data.table)
library(ggplot2)
library(formatR)
library(tidyft)
library(readr)
library(htmltools)

Getting Started with Leaflet

Leaflet is an easy way to make interactive maps with R. Leaflet maps work similarly to ggplot2 graphics, as they are built in layers. The layers start with basemaps and work up through points, polygons, popup labels or tooltips. You can also add clustering functions and layer controls to leaflet maps.

Basemaps

There are many different types of basemaps that can be used in leaflet. Here is a vignette on “Using Basemaps” from the leaflet webpage. You can also preview available provider tiles. Note that some of the basemaps require an account to access.

# Example basemaps

leaflet() %>%
    addTiles()

leaflet() %>%
    addProviderTiles("Esri.WorldTopoMap")

leaflet() %>%
    addProviderTiles("Esri.NatGeoWorldMap")

leaflet() %>%
    addProviderTiles("CartoDB.DarkMatter")

Point data

For this tutorial we are going to map The Universal Mill List of palm oil mills compiled by Rainforest Alliance and the World Resources Institute (WRI). The list is available as a .csv file here or a .shp file here.

We’ll work with the .csv file here. When loading the data, you’ll notice it has some issues with encoding and nonstandard characters. This is common when using real world data, as it may or may not be optimized for your system. A big part of working with data is figuring out how to clean it and read it properly on your particular computer. I’m including some of the links that helped with troubleshooting to demonstrate that workflow.

mills <- read_csv("https://maddiebrown.github.io/ANTH475/data/UML-Jan-2026.csv",
    local = locale(encoding = "UTF-8"))  #(this seems to have an issue loading, so trying a read_csv() as suggested here: https://www.reddit.com/r/rstats/comments/dtgk9n/i_am_new_to_r_and_am_having_a_problem_importing_a/)

# mills$Latitude <- as.numeric(mills$Latitude)
names(mills)
 [1] "UML ID"                         "Group Name"                    
 [3] "Parent Company"                 "Mill Name"                     
 [5] "RSPO Status"                    "RSPO type"                     
 [7] "Date RSPO Certification Status" "Latitude"                      
 [9] "Longitude"                      "GPS coordinates"               
[11] "ISO"                            "Country"                       
[13] "Province"                       "District"                      
[15] "Confidence level"               "Alternative name"              
str(mills)
Classes 'data.table' and 'data.frame':  2399 obs. of  16 variables:
 $ UML ID                        : chr  "PO1000000017" "PO1000000019" "PO1000000020" "PO1000000021" ...
 $ Group Name                    : chr  "AGROPALMA" "KULIM (MALAYSIA) BERHAD" "NBPOL" "ROYAL GOLDEN EAGLE" ...
 $ Parent Company                : chr  "AGROPALMA" "EPA MANAGEMENT SDN BHD" "NEW BRITAIN PALM OIL" "INTI INDOSAWIT SUBUR" ...
 $ Mill Name                     : chr  "CPA" "SEDENAK" "MOSA" "BUATAN I" ...
 $ RSPO Status                   : chr  "Not RSPO Certified" "RSPO Certified" "RSPO Certified" "RSPO Certified" ...
 $ RSPO type                     : chr  "" "RSPO Certified, IP, MB" "RSPO Certified, IP" "RSPO Certified, MB" ...
 $ Date RSPO Certification Status: chr  "13/01/2026" "13/01/2026" "13/01/2026" "13/01/2026" ...
 $ Latitude                      : chr  "-2.253539" "1.730858" "-5.6225" "0.4344444" ...
 $ Longitude                     : chr  "-48.58567" "103.538323" "150.23735" "101.825" ...
 $ GPS coordinates               : chr  "-2.253539, -48.58567" "1.730858, 103.538323" "-5.6225, 150.23735" "0.4344444, 101.825" ...
 $ ISO                           : chr  "BRA" "MYS" "PNG" "IDN" ...
 $ Country                       : chr  "Brazil" "Malaysia" "Papua New Guinea" "Indonesia" ...
 $ Province                      : chr  "Par\xe1" "Johor" "West New Britain" "Riau" ...
 $ District                      : chr  "Acar\xe1" "Kulaijaya" "Talasea" "Siak" ...
 $ Confidence level              : chr  "1-Fully Verified" "1-Fully Verified" "1-Fully Verified" "2-High Confidence" ...
 $ Alternative name              : chr  "" "" "" "" ...
 - attr(*, ".internal.selfref")=<externalptr> 

### here is a function from stack overflow that can help with multiByte
### character issues
### (https://stackoverflow.com/questions/4993837/r-invalid-multibyte-string)
find_offending_character <- function(x, maxStringLength = 256) {
    print(x)
    for (c in 1:maxStringLength) {
        offendingChar <- substr(x, c, c)
        # print(offendingChar) #uncomment if you want the indiv characters
        # printed the next character is the offending multibyte Character
    }
}

string_vector <- mills$Latitude

# lapply(string_vector, find_offending_character)

# fix latitude vector
mills$Latitude <- str_replace_all(mills$Latitude, "\xa0", "")
mills$Latitude <- as.numeric(mills$Latitude)

mills$Longitude <- str_replace_all(mills$Longitude, "\xa0", "")
mills$Longitude <- as.numeric(mills$Longitude)

First Leaflet Map

Now that the latitude and longitude columns have been cleaned we can plot our data. Leaflet works much like ggplot2 in that you build the map in layers. Below, we add in basemap tiles and then add circle markers on top using the mills data.

leaflet() %>%
    addTiles() %>%
    addCircleMarkers(lng = as.numeric(mills$Longitude), lat = mills$Latitude)

Try It

  1. Using filter() from tidyverse, make an object that only includes the mills from Indonesia. Hint: Look at your variable names and figure out which one you will need to select the data from, then only filter out rows that equal “Indonesia”.

  2. Map the Indonesian mills with leaflet. Hint: Look back at the code for mapping global mills.

Click for solution
Indonesia_mills <- mills %>%
    filter(Country == "Indonesia")

leaflet() %>%
    addTiles() %>%
    addCircleMarkers(lng = as.numeric(Indonesia_mills$Longitude), lat = Indonesia_mills$Latitude)

Aesthetics

You can also cluster map markers dynamically as you zoom.

leaflet() %>%
    addTiles() %>%
    addMarkers(clusterOptions = markerClusterOptions(), lng = Indonesia_mills$Longitude,
        lat = Indonesia_mills$Latitude)

You may want to color the map markers based on whether or not they have a specific attribute. In this case, we can use the code below to color and resize the points based on whether or not they are RSPO Certified.

First, let’s remove extra spaces in the column names of our dataframe.

# fix column names:
# https://dnidzgorski.wordpress.com/2017/06/09/r-fix-column-names-spaces/comment-page-1/
# code adapted from here
names(Indonesia_mills) <- str_replace_all(names(Indonesia_mills), c(` ` = ""))

Indonesia_mills <- Indonesia_mills %>%
    as.data.table() %>%
    utf8_encoding(GroupName)

Then we can map the mill locations using leaflet and color them based on their certification status.

leaflet(Indonesia_mills) %>%
    addTiles() %>%
    addCircleMarkers(lng = Indonesia_mills$Longitude, lat = Indonesia_mills$Latitude,
        radius = ~ifelse(Indonesia_mills$RSPOStatus == "RSPO Certified", 5, 1), color = ~ifelse(Indonesia_mills$RSPOStatus ==
            "RSPO Certified", "orange", "red"))

Adding interactive layers

A neat thing in Leaflet is that you can add popup labels or other interactive features. In the code below, we take the map we previously made and add in a popup argument that shows the Group Name and RSPO Status for each mill.

leaflet(Indonesia_mills) %>%
    addTiles() %>%
    addCircleMarkers(lng = Indonesia_mills$Longitude, lat = Indonesia_mills$Latitude,
        radius = ~ifelse(RSPOStatus == "RSPO Certified", 5, 1), color = ~ifelse(RSPOStatus ==
            "RSPO Certified", "purple", "yellow"), popup = ~paste("<b>Mill name:</b>",
            htmltools::htmlEscape(Indonesia_mills$GroupName), "<br/>", "<b>RSPO Status:</b>",
            htmltools::htmlEscape(Indonesia_mills$RSPOStatus), sep = " "))

Layer control

We can also add options to turn data layers on and off or change the basemaps. In the code below, we make a map in which you can change the basemap and also toggle on and off the markers for mills based on whether or not they are RSPO certified.

# first make two data layers: one for certified mills, one for noncertified
# mills
RSPOcertified <- Indonesia_mills %>%
    filter(RSPOStatus == "RSPO Certified")
NotRSPOcertified <- Indonesia_mills %>%
    filter(RSPOStatus != "RSPO Certified")

leaflet(Indonesia_mills) %>%
    addTiles() %>%
    addProviderTiles("Esri.WorldTopoMap", group = "Topo") %>%
    addProviderTiles("Esri.NatGeoWorldMap", group = "National Geographic") %>%
    addMarkers(data = RSPOcertified, group = "RSPO Certified") %>%
    addCircleMarkers(data = NotRSPOcertified, group = "Not RSPO Certified") %>%
    addLayersControl(baseGroups = c("Topo", "National Geographic"), overlayGroups = c("RSPO Certified",
        "Not RSPO Certified"))

Geocoding with R

Sometimes we want to map things that we know the name of, the address, or the city, but we don’t have the latitude and longitude. In this case, we can geocode the data to identify the spatial location. One way to geocode in R is to use the tidygeocoder package. You can also use tmaptools to geocode. See SESYNC’s guide to Geocoding with R for more information on available packages.

In this example, we will geocode the locations of arboretums in Maryland. These arboretum names were copied from a table on this Wikipedia page with a “List of botanical gardens and arboretums in Maryland”.

First we read in the data and then load the library tidygeocoder. Then, we use the function geocode() to match the place names to spatial coordinates.

arboretums <- read.csv("https://maddiebrown.github.io/ANTH475/data/marylandarboretums.csv")
arboretums
                                                                   Name
1                                                      Adkins Arboretum
2                                                     Brookside Gardens
3                                                     Cylburn Arboretum
4                                      Historic London Town and Gardens
5                                                 Ladew Topiary Gardens
6                                                     McCrillis Gardens
7  Howard Peters Rawlings Conservatory and Botanic Gardens of Baltimore
8                                        Salisbury University Arboretum
9                                           Helen Avalynne Tawes Garden
10                  University of Maryland Arboretum & Botanical Garden
11                                                                     
                                              Affiliation         City  X X.1
1                                     Tuckahoe State Park      Ridgely NA  NA
2  Maryland-National Capital Park and Planning Commission      Wheaton NA  NA
3       City of Baltimore Recreation and Parks Department    Baltimore NA  NA
4                                                            Edgewater NA  NA
5                                                              Monkton NA  NA
6  Maryland-National Capital Park and Planning Commission     Bethesda NA  NA
7       City of Baltimore Recreation and Parks Department    Baltimore NA  NA
8                                    Salisbury University    Salisbury NA  NA
9                                   Maryland Park Service    Annapolis NA  NA
10                                 University of Maryland College Park NA  NA
11                                                                     NA  NA

library(tidygeocoder)

# Run the function to geocode the place names. This can take a bit of time.
# Here we geocode the name, city, and a city_state combo to compare the results
# of each operation.
arboretums_spatial <- arboretums %>%
    geocode(Name)
arboretums_city <- arboretums %>%
    geocode(City)

# since all the arboretums are in Maryland, there is no 'state' column in this
# dataset. We can add one with the code below
arboretums_state <- arboretums %>%
    dplyr::mutate(State = "MD")

# Now we can geocode using the combination of the city name and state name
arboretums_city_state <- arboretums_state %>%
    geocode(city = City, state = State)

Try it

  1. Map each of the geocoded arboretum locations using leaflet. Inside your addCircleMarkers() function, add an argument for label= set to the name of the arboretum. See this article on adding labels for help.
  2. How do the different geocoded datasets compare to one another?
Click for solution
leaflet() %>%
    addTiles() %>%
    addCircleMarkers(lng = arboretums_spatial$long, lat = arboretums_spatial$lat,
        label = arboretums_spatial$Name)

leaflet() %>%
    addTiles() %>%
    addCircleMarkers(lng = arboretums_city$long, lat = arboretums_city$lat, label = arboretums_city$Name)

leaflet() %>%
    addTiles() %>%
    addCircleMarkers(lng = arboretums_city_state$long, lat = arboretums_city_state$lat,
        label = arboretums_city_state$Name)

Polygon data

Continuing our mapping of oil palm in Indonesia, let’s add in a layer that contains different types of tree plantations as polygon features. The Tree plantations dataset from Global Forest Watch (GFW) contains polygons and information about global tree plantations. For this tutorial, lets filter out just the data from Indonesia and Malaysia.

Loading data from a Shapefile

For this section, you will learn to load data into R from a shapefile. More information on loading shapefiles into R

Try It

This Try It asks you to practice downloading and using real data from the Internet.

  1. Navigate to the Global Forest Watch tree plantation data website.

  2. Click on “View Map” and find the “Filter Data” button on the sidebar menu. Filter out only the data from Indonesia.

  3. Click on the “Download” button on the sidebar menu and be sure to “Toggle Filters” on so that you reduce the number of records to be downloaded. At the time of this tutorial writing, there were around 11,000 records.

  4. Download the file, unzip it and read in the shapefile by pasting the filepath into the read_sf() function.

  5. Plot your data using a syntax similar to this: leaflet() %>% addTiles() %>% addPolygons(data=treeplantation, popup=paste(treeplantation$spec_org), color=ifelse(treeplantation$spec_org %like% "Oil palm", "red", "yellow")). Replace the data object name with your spatial object name.

  6. If your data aren’t mapping, it might be a projection issue. You can set the projection using this code: projected_data_object <- st_transform(YOUR_DATA_OBJECT, "+init=epsg:4326"). Replace the data object with your named object. See this forum post that helps with troubleshooting this issue.

Loading Geojson files

We can also load geojson files into R. The code below walks you through loading and mapping the Indonesian tree plantations as a .geojson file.

Navigate back to the Global Forest Watch tree plantation data website. Follow the same instructions outlined in the previous section, but this time download the data as a .geoJSON file.

Once the data are loaded, take a look at the data structure.

treeplant <- read_sf("your file path")
str(treeplant)

Suppose we wanted to map only certain types of tree plantations. First, we can look at the list of unique species listed as the first species in each plantation as well as the simplified names.

unique(treeplant$spec_1)
table(treeplant$spec_1)
unique(treeplant$spec_simp)

Now let’s make a map showing some of the different species found in the tree plantations in Indonesia. We can color the layers based on the species and add control over the map layers.

# data selection code adapted from here:
# https://library.virginia.edu/data/articles/data-scientist-as-cartographer-an-introduction-to-making-interactive-maps-in-r-with-leaflet

leaflet(treeplant) %>%
    addTiles() %>%
    addPolygons(data = treeplant[treeplant$spec_1 == "Coffee", ], popup = ~spec_org,
        color = "Orange", weight = 5, group = "Coffee") %>%
    addPolygons(data = treeplant[treeplant$spec_1 == "Oil palm", ], popup = ~spec_org,
        color = "Red", weight = 5, group = "Oil Palm") %>%
    addLayersControl(overlayGroups = c("Coffee", "Oil Palm"), options = layersControlOptions(collapsed = F))

Try it

  1. Try adding a third layer to your tree plantation map based on an additional tree species.
  2. Try adding the point data for the palm oil mills we mapped earlier in this tutorial.
  3. Note the different geographic ranges of the layers you added to your map.
Click for solution
leaflet(treeplant) %>%
    addTiles() %>%
    addPolygons(data = treeplant[treeplant$spec_1 == "Coffee", ], popup = ~spec_org,
        color = "Orange", weight = 5, group = "Coffee") %>%
    addPolygons(data = treeplant[treeplant$spec_1 == "Oil palm", ], popup = ~spec_org,
        color = "Red", weight = 5, group = "Oil Palm") %>%
    addPolygons(data = treeplant[treeplant$spec_1 == "Eucalyptus", ], popup = ~spec_org,
        color = "Magenta", weight = 5, group = "Eucalyptus") %>%
    addCircleMarkers(data = Indonesia_mills, lng = Indonesia_mills$Longitude, lat = Indonesia_mills$Latitude,
        radius = 2, group = "Mills") %>%
    addLayersControl(overlayGroups = c("Coffee", "Oil Palm", "Eucalyptus", "Mills"),
        options = layersControlOptions(collapsed = F))