Wednesday, November 18, 2015

Network Analysis

Introduction

The goal of this lab is to introduce the basic functions of Network Analysis in ArcMap.  Sand mines transport their sand in a multitude of ways.  Generally, all of the sand produced in Wisconsin will have to travel by railroad to leave the state.  Many mine sites have direct access to rail transportation and do not truck the sand via public roadways.  However, many of the mines in Wisconsin have to transport their sand a fair distance to reach a railroad terminal.

The Wisconsin Department of Transportation NW Region Planning Staff estimated that once the sand mining industry hit full stride, it could haul approximately 40 million tons of sand a year out of the state of Wisconsin (Hart).  Transporting 40 million tons of sand to the terminals could have a significant impact on the local roadways.

Using the Network Analysis function in ArcMap, I will apply a hypothetical value of 2.2 cents per mile to calculate the cost of additional roadway maintenance by county from sand transportation.  I will also use an arbitrary number of 50 trips per year from each mine to the railway facility.  Thus my findings will not be a true reflection of the cost, but my methods could be used to calculate the true impact if the proper cost figures were available.



Methods

The feature classes need to be prepared before using the Network Analysis tool.  I will be using the mine data received from the Wisconsin DNR, which was used in the previous lab.  Not all of the mines in the data are actively producing or transporting sand.  Many of the mines have rail loading stations directly at the mine site and will not be trucking any sand.  Additionally, it is highly likely that mines within 1.5 km of a railway will have had a rail spur built to transport their sand.  I wrote a Python script to select all of the mines that were active, had no rail loading station, and were not within 1.5 km of a railway.  In the end I was left with 41 mines which fit my criteria.
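
A minimal sketch of that selection logic in plain Python follows.  It is not the script used in the lab, which ran against the DNR feature class in ArcMap.  The field names (`status`, `has_rail_loadout`, `xy`), the sample mines, and the single rail segment are all assumptions made up for illustration, and coordinates are assumed to be in a projected system measured in meters.

```python
import math

RAIL_BUFFER_M = 1500.0  # the 1.5 km threshold from the text


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b (projected meters)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the position to the end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def select_trucking_mines(mines, rail_segments, buffer_m=RAIL_BUFFER_M):
    """Keep active mines with no rail loadout that are farther than buffer_m from rail."""
    selected = []
    for mine in mines:
        if mine["status"] != "Active" or mine["has_rail_loadout"]:
            continue
        nearest = min(point_segment_distance(mine["xy"], a, b) for a, b in rail_segments)
        if nearest > buffer_m:
            selected.append(mine["name"])
    return selected


# Hypothetical sample data: one straight rail line and four mines.
rail_segments = [((0.0, 0.0), (10000.0, 0.0))]
mines = [
    {"name": "A", "status": "Active", "has_rail_loadout": False, "xy": (5000.0, 4000.0)},
    {"name": "B", "status": "Active", "has_rail_loadout": False, "xy": (5000.0, 1000.0)},
    {"name": "C", "status": "Inactive", "has_rail_loadout": False, "xy": (2000.0, 9000.0)},
    {"name": "D", "status": "Active", "has_rail_loadout": True, "xy": (8000.0, 7000.0)},
]
print(select_trucking_mines(mines, rail_segments))
```

Only mine A passes all three tests here: B is within 1.5 km of the rail line, C is inactive, and D has its own rail loadout.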

I was provided a geodatabase for the lab exercise that contained a feature class of the rail terminals in Wisconsin, which I was instructed to use for the analysis.  I also added a street network dataset from ESRI StreetMap USA, which was provided as well.

Using ModelBuilder, I created a model to calculate the cost of maintaining the roadways from sand transportation (Fig. 1).  First I used the Make Closest Facility Layer and Add Locations tools, with the mines as the incidents and the rail terminals as the facilities, to set up the network analysis.  To actually run the analysis I added the Solve tool to determine the rail terminal with the shortest drive time from each mine.  The next step was to use the Select Data and Copy Features tools to create a feature class from the routes calculated by the network analysis.

The calculated routes were in a geographic coordinate system (GCS), which cannot be used for accurate distance measurement.  I brought in the Project tool to project the feature class into NAD 1983 HARN Transverse Mercator feet, which allowed accurate measurements and calculations.  The next step was to use the Intersect tool to break the route distances down by county.  Since some counties had multiple routes, I used the Summary Statistics tool to create a table with the total route distance for each county.

With the Add Field and Calculate Field tools I created 2 new fields within the table.  The first field converted the distance from feet to miles: I multiplied the distance in feet by 0.00018939 (1/5280).  The second field was the dollar amount of the impact cost of the trucking.  The cost of maintaining the road network was calculated by multiplying the number of trips to the railway station by 2 to account for the return trip to the mine, multiplying that result by the miles of the route, and finally multiplying that figure by .022 (the hypothetical cost of maintenance per mile).  The equation was entered in the tool as: "2 * 50 * [Dist_Miles] *.022".
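
The summary and field calculations can be sketched outside of ArcMap in a few lines of Python.  This is only an illustration of the arithmetic, not the model itself: the function name, the county labels, and the route lengths below are made up, and the only values taken from the lab are the 50 trips, the doubling for the return trip, and the hypothetical 2.2 cents per mile.

```python
from collections import defaultdict

FEET_PER_MILE = 5280      # 0.00018939 in the Calculate Field step is about 1 / 5280
TRIPS_PER_MINE = 50       # arbitrary trip count used in the lab
COST_PER_MILE = 0.022     # hypothetical maintenance cost, dollars per truck-mile


def cost_by_county(route_pieces):
    """route_pieces: (county, length_in_feet) pairs, one per intersected route piece."""
    total_feet = defaultdict(float)
    for county, length_ft in route_pieces:
        total_feet[county] += length_ft  # same idea as Summary Statistics: SUM by county

    result = {}
    for county, feet in total_feet.items():
        miles = feet / FEET_PER_MILE
        # The factor of 2 accounts for the return trip to the mine.
        cost = 2 * TRIPS_PER_MINE * miles * COST_PER_MILE
        result[county] = (round(miles, 2), round(cost, 2))
    return result


# Made-up lengths for illustration only; these are not results from the lab.
pieces = [("County A", 52800.0), ("County A", 26400.0), ("County B", 10560.0)]
print(cost_by_county(pieces))
```

With these invented inputs, County A totals 15 miles of route and County B totals 2 miles, so the formula gives $33.00 and $4.40 respectively.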

(Fig. 1) Model within ArcMap ModelBuilder for the creation of the network analysis tool and calculations.

Results

To better display the results I exported the final table with all of the calculations to an Excel file using the Table to Excel conversion tool within ArcMap.  With the table in Excel I was able to create a graph to display the results.

(Fig. 2) Graphic display of increased maintenance cost of roads due to sand mine truck traffic.
(Fig. 3) Additional roadway maintenance cost by county from sand transportation.




Discussion

The total amount of money is a lot lower than I originally expected.  I believe this is due to the dollar figure used in the calculations and the number of trips per mine.  Even if the dollar figure were correct, the number of trips is almost certainly higher than 50 per year.  I would venture to guess that on a good day the number of trips would be 50 per day.  Because the cost scales directly with the number of trips, even doubling the number of trips would double the dollar figure calculated.

The transportation model only used the railroad stations within Wisconsin.  Because of an error in my methods early on, I discovered that a couple of the mines in the western part of Wisconsin would save time by taking their sand to railroad stations in Minnesota.  Had we used the stations in Minnesota, the impact on specific counties in those areas would have changed.
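
The effect of limiting the set of facilities can be illustrated with a toy closest-facility search.  The sketch below uses Dijkstra's algorithm as a stand-in; how Network Analyst solves the problem internally is not covered in this lab.  The road network, node names, and drive times are invented for illustration.

```python
import heapq


def drive_times_from(graph, source):
    """Dijkstra's algorithm: shortest drive time (minutes) from source to each reachable node."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        time_so_far, node = heapq.heappop(queue)
        if time_so_far > best.get(node, float("inf")):
            continue  # stale queue entry, a faster path was already found
        for neighbor, minutes in graph.get(node, {}).items():
            candidate = time_so_far + minutes
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return best


def closest_facility(graph, incident, facilities):
    """Return (drive_time, facility) for the fastest reachable facility, or None."""
    times = drive_times_from(graph, incident)
    reachable = [(times[f], f) for f in facilities if f in times]
    return min(reachable) if reachable else None


# Toy road network: every name and drive time here is hypothetical.
roads = {
    "mine": {"junction": 10.0, "terminal_wi": 45.0},
    "junction": {"mine": 10.0, "terminal_wi": 30.0, "terminal_mn": 15.0},
    "terminal_wi": {"mine": 45.0, "junction": 30.0},
    "terminal_mn": {"junction": 15.0},
}

print(closest_facility(roads, "mine", ["terminal_wi"]))
print(closest_facility(roads, "mine", ["terminal_wi", "terminal_mn"]))
```

In this toy network the Wisconsin-only search sends the truck 40 minutes to `terminal_wi`, while adding the Minnesota terminal to the facility list shortens the trip to 25 minutes and changes which roads carry the traffic.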

Trempealeau County has the highest number of sand mines but does not have a high maintenance cost.  The cause is a centrally located rail facility which keeps the distance trucks have to travel down.  I believe this is one reason why there are so many mines located in Trempealeau County.

The two counties in northwestern Wisconsin show a cost, but there is no route displayed on the map.  The network analysis took the fastest route to get to the rail facility, and the majority of that route was in the state of Minnesota.  The trucks only traveled a short distance to get out of the state, hence the low maintenance cost for those counties.

The analysis tool chose the fastest route from the mine to the railroad.  Just because the tool chose this route does not mean the trucks use the determined path.  The actual routes used by the mining companies would need to be obtained to calculate the true impact to the roads, along with the true dollar figures and number of trips per year.

Conclusion

The network analysis tool has a vast number of uses and is fairly simple to use if you have a basic understanding of ArcMap.  Calculating the shortest or fastest route is useful, but the result may not always be the route chosen by actual people or businesses.  Businesses use this tool every day to save money by keeping the miles down on company vehicles, thus saving on maintenance and fuel costs.  Because the numerical values are hypothetical, I cannot derive any true conclusion about the impacts sand mine transportation has on local roadways.  There is no doubt in my mind that semi transportation has an impact on the longevity of roadways everywhere.

Sources

Hart, M. V., Adams, T., & Schwartz, A. (2013). Transportation Impacts of Frac Sand Mining in the MAFC Region: Chippewa County Case Study. In Mid-America Freight Coalition. Retrieved November 11, 2015, from http://midamericafreight.org/wp-content/uploads/FracSandWhitePaperDRAFT.pdf

Friday, November 6, 2015

Data Normalization, Geocoding, and Error Assessment

Goals and Background 

The objective of this exercise is to further our understanding of geocoding.  Proper geocoding requires the analyst to normalize the input data.  Normalizing the data is required for the geocoding tool within ArcMap to "properly" function.  After geocoding, the analyst must examine the errors inherently associated with geocoding.

This exercise had me geocoding the location of 20 of 129 sand mines in Wisconsin given to us from the Wisconsin Department of Natural Resources (DNR).  The entire list of mines was split between all of the members in our class, allowing for the mines to be geocoded by 3 different students.  Having multiple students geocode the locations will display if there was an error in the process by one or more of the students.  Our professor kindly removed the x,y coordinates from the file to simulate real world situations of acquired data.  In the following Methods section I will discuss the variety of ways I utilized to geocode the list of addresses I was given.

Methods

The first step in the process was to copy the information for my 20 mines from the original Excel file into a separate Excel file.  You can see in (Fig. 1) that there are multiple address columns and multiple formats within those address columns.


(Fig. 1) Portion of the Excel file received from the Wisconsin DNR.

After copying my 20 mines into my own Excel file, I created new columns to separate out the addresses from the city, zip codes, state, and the Public Land Survey System (PLSS) information.  Many of the mines in my group had no address and only had the PLSS information.  I also eliminated the fields not pertinent to geocoding.  Looking at (Fig. 2) you can see I created a separate field for each portion of the address.
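
I did this normalization by hand in Excel, but the idea of splitting one address string into separate fields can be sketched in Python.  The pattern below assumes a single "street, city, state zip" layout, which is a simplification: the real DNR rows came in several formats and many had only PLSS information.  The function name and the sample address are made up.

```python
import re

# Assumed layout: "<street>, <city>, <ST> <zip>" with an optional ZIP+4 suffix.
ADDRESS_RE = re.compile(
    r"^\s*(?P<street>[^,]+?)\s*,\s*(?P<city>[^,]+?)\s*,"
    r"\s*(?P<state>[A-Za-z]{2})\s+(?P<zip>\d{5})(?:-\d{4})?\s*$"
)


def normalize_address(raw):
    """Split a raw address string into fields, or return None if it does not fit the layout."""
    match = ADDRESS_RE.match(raw)
    if match is None:
        return None  # e.g. PLSS-only rows, which have to be located another way
    fields = match.groupdict()
    fields["street"] = " ".join(fields["street"].split())  # collapse repeated spaces
    fields["city"] = fields["city"].title()
    fields["state"] = fields["state"].upper()
    return fields


# Invented example values, not rows from the DNR file.
print(normalize_address("123  Main St , anytown, wi 54000"))
print(normalize_address("Sec 12 T25N R9W"))
```

Rows that return None are the ones that would still need manual attention, which matches my experience with the mines that only had PLSS descriptions.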

(Fig. 2) Portion of the table I normalized from the original data.

After normalizing my data I ran the geocoding tool within ArcMap.  My instructions were to use the World Geocode Service (ArcGIS Online) when geocoding in ArcMap.  After the tool ran I had results of 15 (75%) matched and 5 (25%) unmatched addresses.  These results are a little misleading.  After using Zoom to Candidates in the Geocoding Review/Rematch Address screen, I found only one of my mine locations was placed on the map at the actual location.  This meant I would have to locate all of the other mines on the map manually using the Pick Address from Map feature.

Due to the approximation of the tool, many of the locations which had actual addresses in the table were placed in the general area but needed to be adjusted to the precise location.  The precise location is desired for the most accurate analysis possible when we use this data in later exercises.

Additionally, I was verifying the locations using an ArcMap basemap which was not as current as I would prefer.  Many of the mines were not actually depicted on the basemap.  I used Google Earth, which has more recent imagery, to see if each mine was depicted for comparison against my basemap.  I also used Google Earth to check addresses which were not found using the geocoding tool.  The majority of the addresses I entered into Google Earth brought me directly to the location I was looking for.  Using Google Earth for reference, I would locate the same area on my basemap and adjust the geocoded point to the correct location using the Pick Address from Map feature.

For the mines with no address, I was provided PLSS data.  Using feature classes of the Townships, Sections, Quarter Sections, and Quarter Quarter Sections from the DNR geodatabase on our campus servers, I located the approximate locations of the mines.  After narrowing the location down using the PLSS system, I used the basemap image to find the actual location of the mine.  Because the basemap was not current, not all of the mines were depicted.  As before, I used Google Earth to locate the mines I could.

After manually checking and locating the 20 mines I was assigned, my results table finally showed 100% matched (Fig. 3).  This does not mean all of the mines are depicted in the correct location.  A few of the mines were not visible on the basemap or Google Earth.  For those, I chose the location on the map using a combination of the address (if available) and the PLSS information (if available).

(Fig. 3) Geocoding Result chart from ArcMap.

Results

The final task for this assignment was to compare my results against those of my classmates who geocoded the same mines I did.  However, 2 of the 4 class members did not turn in their shapefile for me to compare against.  Using the mines from the other 2 people I was only able to compare the accuracy of 7 mines.  Being able to compare only 7 of 20 mines doesn't give a good assessment of the accuracy between my points and the other people's points.  Even with only 7 mines I could see something was off with one of the mine locations (Fig. 4).  All of the points from my classmates were very close to where my points were located except for one.  You can see in (Fig. 4) the points along the west edge of Wisconsin do not line up.  After further investigation, it turned out my point was incorrect and my classmate's was correct.

(Fig. 4) Comparison between my geocoded locations (green triangles) and the 7 mines from other classmates (blue dots)

Due to the lack of points to compare against, I didn't calculate an average distance error, but I used the Near tool in ArcMap to complete a quick calculation of the distance between my points and those from my classmates (Fig. 5).  The row in the chart with Maiden Rock is the point location which I made a mistake on, and you can see from the last column that its distance error is the largest.

(Fig. 5) Error chart for the distances between my points and my classmates' points.

After geocoding we were given the shapefile with the actual locations of the mines to compare to our points.  Since I had point locations for all of the mines in Wisconsin, I selected my mines by their unique id and created a new feature class of just those mines.  I created a map comparing my geocoded point locations and the point locations of the actual mines (Fig. 6).  Looking at the map you can see the same point from my previous analysis is off, verifying my point is the one which is wrong.  As you peruse the map you will find a few of my other points which are not exactly where they should be.  The scale of the map does not precisely show the distance variation between the points.  I will examine the reasons for the errors in the discussion section.

(Fig. 6) Comparison of my mine locations with the actual locations of the mines.

After plotting the points on the map I again used the Near tool in ArcMap to calculate the distance between my points and the actual locations of the mines.  After the calculations were made I added the values to a new column in Excel to the corresponding mine.  Then I calculated the average distance of error between all of the mines.  As you can see from (Fig. 7) the average distance of error was ~1713 meters.
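
The error calculation itself is simple enough to sketch in Python.  Unlike the Near tool, which measures to the nearest feature, this sketch pairs points by a shared mine id, which is an assumption made here for clarity.  The ids and coordinates are made up and are assumed to be in a projected coordinate system in meters.

```python
import math


def near_distances(my_points, reference_points):
    """Both arguments map a mine id to (x, y) in projected meters."""
    distances = {}
    for mine_id, (x, y) in my_points.items():
        if mine_id in reference_points:
            rx, ry = reference_points[mine_id]
            distances[mine_id] = math.hypot(x - rx, y - ry)
    return distances


def mean_error(distances):
    """Average distance error across all compared mines."""
    if not distances:
        raise ValueError("no matching mines to compare")
    return sum(distances.values()) / len(distances)


# Invented coordinates for three hypothetical mines.
mine_points = {"m1": (0.0, 0.0), "m2": (100.0, 100.0), "m3": (0.0, 0.0)}
dnr_points = {"m1": (30.0, 40.0), "m2": (100.0, 100.0), "m3": (600.0, 800.0)}

errors = near_distances(mine_points, dnr_points)
print(errors)
print(mean_error(errors))
```

With these invented points, one badly placed mine (m3, 1000 m off) pulls the average up to 350 m even though the other two are within 50 m, which is the same effect my Maiden Rock mistake had on my average.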

(Fig. 7) Error distance of each mine along with the average calculation.

Discussion 

There are a number of factors which contribute to the error of the mine locations.  Digitization of the location was an inherent and operational cause of error.  The points from the DNR were centrally located within the mine area.  I was instructed to locate my point at the driveway entrance of the mine for future roadway analysis.  You can see the variance in location from the driveway entrance and the central point of the mine (Fig. 8).

(Fig. 8) Large scale image of a mine location and the variance of my point and the DNR location.

Inherent errors are very typical in geographic data.  A location is represented on the map with a small point even though the actual object (in this case a sand mine) is not point shaped or the same size.  Also, each person creating the map will choose a different point style and will locate that point based on their own purpose.  If the map designer needs highly accurate locations, geocoding is definitely not the proper way to locate points on a map.

The locations that were a great distance off were errors on my part.  Somewhere along the process I either missed relocating the point or never even looked at it.  I felt I went through the complete list more than once, but I was obviously wrong.  I never went back through to double check that all of my locations were correct.  This would have been a simple error to catch had I gone back through the list.

I would have liked to compare my locations with those of my classmates further.  However, their failure to complete their task left me unable to fully complete mine.  I feel this is a good reminder of how depending on others can go wrong.  In the real world there will be people who don't get their information in on time, which could jeopardize the entire job.  I feel this should be taken into account when planning the time frame for completing a job.

Conclusion

Overall this lab was a great learning experience in many respects.  Data from outside sources may not always be organized in the best format for your use.  Understanding your platform allows you to best prepare your data for analysis.  Geocoding (and mapping in general) is a generalization, and no two people will map the same locations the same way.  There will always be a variance in locations on a map unless you use an accurate GPS location of your desired point.  Preparing the locations per the task is the best way to achieve accurate results.