Sampling locations in a city

with the aim of getting an aerial map of that location.

Ken Butler http://ritsokiguess.site/blogg
10-10-2020

Introduction

Do you follow @londonmapbot on Twitter? You should. Every so often a satellite photo is posted of somewhere in London (the one in England), with the implied invitation to guess where it is. Along with the tweet is a link to openstreetmap, and if you click on it, it gives you a map of where the photo is, so you can see whether your guess was right. Or, if you’re me, you look at the latitude and longitude in the link, and figure out roughly where in the city it is. My strategy is to note that Oxford Circus, in the centre of London, is at about 51.5 north and 0.15 west, and work from there.1

Matt Dray, who is behind @londonmapbot, selects random points in a rectangle that goes as far in each compass direction as the M25 goes. (This motorway surrounds London in something like a circle, and is often taken as a definition of what is considered to be London; if outside, not in London. There is a surprising amount of countryside inside the M25.)

London has the advantage of being roughly a rectangle aligned north-south and east-west, and is therefore easy to sample from. I have been thinking about doing something similar for my home city Toronto, but I ran into an immediate problem:

Toronto with boundary

Toronto is not nicely aligned north-south and east-west, and so if you sample from a rectangle enclosing it, this is what will happen:

randomly sampled points from rectangle surrounding Toronto

You get some points inside the city, but you will also get a number of points in Vaughan or Mississauga or Pickering or Lake Ontario! How to eliminate the ones I don’t want?

Sampling from a region

Let’s load some packages:

I had this vague idea that it would be possible to decide if a sampled point was inside a polygon or not. So I figured I would start by defining the boundary of Toronto as a collection of straight lines joining points, at least approximately. The northern boundary of Toronto is Steeles Avenue, all the way across, and that is a straight line, but the southern boundary is Lake Ontario, and the western and eastern boundaries are a mixture of streets and rivers, so I tried to pick points which, when joined by straight lines, enclosed all of Toronto without too much extra. This is what I came up with:

boundary <- tribble(
  ~where, ~lat, ~long,
"Steeles @ 427", 43.75, -79.639,
"Steeles @ Pickering Townline", 43.855, -79.17,
"Twyn Rivers @ Rouge River", 43.815, -79.15,
"Rouge Beach", 43.795, -79.115,
"Tommy Thompson Park", 43.61, -79.33,
"Gibraltar Point", 43.61, -79.39,
"Sunnyside Beach", 43.635, -79.45,
"Cliff Lumsden Park", 43.59, -79.50,
"Marie Curtis Park", 43.58, -79.54,
"Rathburn @ Mill", 43.645, -79.59,
"Etobicoke Creek @ Eglinton", 43.645, -79.61,
"Eglinton @ Renforth", 43.665, -79.59,
"Steeles @ 427", 43.75, -79.639,
)
boundary
# A tibble: 13 × 3
   where                          lat  long
   <chr>                        <dbl> <dbl>
 1 Steeles @ 427                 43.8 -79.6
 2 Steeles @ Pickering Townline  43.9 -79.2
 3 Twyn Rivers @ Rouge River     43.8 -79.2
 4 Rouge Beach                   43.8 -79.1
 5 Tommy Thompson Park           43.6 -79.3
 6 Gibraltar Point               43.6 -79.4
 7 Sunnyside Beach               43.6 -79.4
 8 Cliff Lumsden Park            43.6 -79.5
 9 Marie Curtis Park             43.6 -79.5
10 Rathburn @ Mill               43.6 -79.6
11 Etobicoke Creek @ Eglinton    43.6 -79.6
12 Eglinton @ Renforth           43.7 -79.6
13 Steeles @ 427                 43.8 -79.6

I kind of had the idea that you could determine whether a point was inside a polygon or not. The idea turns out to be this: you draw a line to the right from your point; if it crosses the boundary of the polygon an odd number of times, it’s inside, and if an even number of times, it’s outside. So is there something like this in R? Yes: this function in the sp package.2

So now I could generate some points in the enclosing rectangle and see whether they were inside or outside the city, like this:

set.seed(457299)
n_point <- 20
tibble(lat = runif(n_point, min(boundary$lat), max(boundary$lat)),
       long = runif(n_point, min(boundary$long), max(boundary$long))) -> d
d %>% mutate(inside = point.in.polygon(d$long, d$lat, boundary$long, boundary$lat)) %>% 
  mutate(colour = ifelse(inside == 1, "blue", "red")) -> d
d
# A tibble: 20 × 4
     lat  long inside colour
   <dbl> <dbl>  <int> <chr> 
 1  43.8 -79.6      0 red   
 2  43.6 -79.5      0 red   
 3  43.6 -79.6      1 blue  
 4  43.7 -79.3      1 blue  
 5  43.7 -79.2      0 red   
 6  43.8 -79.3      1 blue  
 7  43.6 -79.2      0 red   
 8  43.6 -79.2      0 red   
 9  43.7 -79.5      1 blue  
10  43.8 -79.2      1 blue  
11  43.8 -79.4      1 blue  
12  43.7 -79.5      1 blue  
13  43.6 -79.1      0 red   
14  43.7 -79.4      1 blue  
15  43.8 -79.4      1 blue  
16  43.8 -79.2      1 blue  
17  43.7 -79.3      1 blue  
18  43.7 -79.1      0 red   
19  43.8 -79.6      0 red   
20  43.7 -79.2      1 blue  

The function point.in.polygon returns a 1 if the point is inside the polygon (city boundary) and a 0 if outside.3

I added a column colour to plot the inside and outside points in different colours on a map, which we do next. The leaflet package is much the easiest way to do this:

leaflet(d) %>% 
  addTiles() %>% 
  addCircleMarkers(color = d$colour) %>% 
    addPolygons(boundary$long, boundary$lat)

The polygons come from a different dataframe, so I need to specify that in addPolygons. Leaflet is clever enough to figure out which is longitude and which latitude (there are several possibilities it will understand).

This one seems to have classified the points more or less correctly. The bottom left red circle is just in the lake, though it looks as if one of the three rightmost blue circles is in the lake also. Oops. The way to test this is to generate several sets of random points, test the ones near the boundary, and if they were classified wrongly, tweak the boundary points and try again. The coastline around the Scarborough Bluffs is not as straight as I was hoping.

Mapbox

Matt Dray’s blog post gives a nice clear explanation of how to set up MapBox to return you a satellite image of a lat and long you feed it. What you need is a Mapbox API key. A good place to save this is in your .Renviron, and edit_r_environ from usethis is a good way to get at that. Then you use this key to construct a URL that will return you an image of that point.

Let’s grab one of those sampled points that actually is in Toronto:

d %>% filter(inside == 1) %>% slice(1) -> d1
d1
# A tibble: 1 × 4
    lat  long inside colour
  <dbl> <dbl>  <int> <chr> 
1  43.6 -79.6      1 blue  

and then I get my API key and use it to make a URL for an image at this point:

mapbox_token <- Sys.getenv("MAPBOX_TOKEN")
url <- str_c("https://api.mapbox.com/styles/v1/mapbox/satellite-v9/static/",
             d1$long,
             ",",
             d1$lat,
             ",15,0/600x400?access_token=",
             mapbox_token)

I’m not showing you the actual URL, since it contains my key! The last-but-one line contains the zoom (15) and the size of the image (600 by 400). These are slightly more zoomed out and bigger than the values Matt uses. (I wanted to have a wider area to make it easier to guess.)

Then download this and save it somewhere:

where <- "img.png"
download.file(url, where)

and display it:

satellite image of somewhere in Toronto

I don’t recognize that, so I’ll fire up leaflet again:

It’s the bit of Toronto that’s almost in Mississauga. The boundary is Etobicoke Creek, at the bottom left of the image.

References

How to determine if point inside polygon

point.in.polygon function documentation

Matt Dray blog post on londonmapbot


  1. London extends roughly between latitude 51.2 and 51.7 degrees, and between longitude 0.25 degrees east and 0.5 west. Knowing this enables you to place a location in London from its lat and long.↩︎

  2. Having had a bad experience with rgdal earlier, I was afraid that sp would be a pain to install, but there was no problem at all.↩︎

  3. It also returns a 2 if the point is on an edge of the polygon and a 3 if at a vertex.↩︎

Citation

For attribution, please cite this work as

Butler (2020, Oct. 10). Ken's Blog: Sampling locations in a city. Retrieved from http://ritsokiguess.site/blogg/posts/2021-11-07-sampling-locations-in-a-city/

BibTeX citation

@misc{butler2020sampling,
  author = {Butler, Ken},
  title = {Ken's Blog: Sampling locations in a city},
  url = {http://ritsokiguess.site/blogg/posts/2021-11-07-sampling-locations-in-a-city/},
  year = {2020}
}