In this week’s episode, Lju and Alexander explore data on rest areas across New York State, from the U.S. Government open data site. Catch-up on the episode now!
This week we explored rest areas available in New York State data, including location, name and facilities available.
Whilst it’s a relatively small data set (some 32 rows!), it is a rather fun data set for us to explore. It also gives us a great opportunity to have a play with Neo4j spatial.
The following questions for the data set came up during the stream:
-
If I’m traveling on a certain route in a certain direction, what rest stops are available?
-
What rest areas closest to me has a public phone?
-
What route has the most handicap spaces available?
You can view and play with the data model in Arrows App here.
We have used more than one label on a node! Looking at the data set, we could see that some rest areas were open, and others were closed. A convenient way to filter out open rest stops is to apply a label - this allows the query engine to quickly filter out nodes without looking at properties.
We’ve not loaded all of the available properties. By all means, if you wish to do a bit of exploring, please do!
To load the data into the database, we use the following query:
LOAD CSV WITH HEADERS FROM "https://data.ny.gov/api/views/qebf-4fd8/rows.csv?accessType=DOWNLOAD" AS row
MERGE (c:County {name:row.County})
MERGE (r:Route {name:row.Route})
CREATE (rs:RestStop {name:row.Name, description:tointeger(row.Description),
publicPhone:tointeger(row.`Public Phone`), handicapSpaces:tointeger(row.`Handicapped Spaces`),
truckSpaces:tointeger(row.`Truck Spaces`), carSpaces:tointeger(row.`Car Spaces`),
location:point({latitude:tofloat(row.Latitude), longitude:tofloat(row.Longitude)})})
CREATE (rs)-[:ON_ROUTE {direction:row.`Travel Direction`}]->(r)
CREATE (rs)-[:IN_COUNTY]->(c)
WITH rs, row.Status AS status WHERE status = "Open"
SET rs:Open
As before, we need to cast lots of the numbers into integers and floats as appropriate! We didn’t do this on the live stream, we’ve made the changes in this write up.
Another interesting thing about this data set, is that we get to use the Neo4j spatial types! You can read more about them here.
Question 1 - If I’m traveling on a certain route in a certain direction, what rest stops are available with truck parking?
WITH 'I-87' as route, 'Southbound' as direction
MATCH (r:Route{name:route})-[:ON_ROUTE {direction:direction}]-(rs:RestStop:Open)
WHERE rs.truckSpaces > 0
RETURN rs.name, rs.truckSpaces, rs.location.y, rs.location.x ORDER BY rs.location.y DESC
In this query, we are using the distance function from Neo4j spatial.
Question 2 - What rest areas closest to me has a public phone?
We quickly discovered that none of the rest areas had public telephones! so we switched it over from telephones to handicap spaces.
Here’s the random location in New York state we picked as the location to work with.
WITH point({latitude:43.102700,longitude:-75.232339}) as location
MATCH (rs:RestStop:Open) WHERE rs.handicapSpaces>0
RETURN rs.name, rs.location.y+","+rs.location.x, distance(rs.location, location) AS dist
ORDER BY dist
Question 3 - What route has the most handicap spaces available?
MATCH (rs:RestStop:Open)-[:ON_ROUTE]->(r:Route)
WHERE rs.handicapSpaces>0
WITH r, collect(rs) AS stops ORDER BY size(stops) DESC LIMIT 1
RETURN r.name AS `Route Name`, size(stops) AS `No. Stops`