## ESRI: Geospatial Data Issues

## Turning Data into Information Using ArcGIS

*The Training Course*

The objective of this course was to learn about the scientific methods used to derive information from spatial data. It explored GIS theory pertaining to the visualization, measurement, transformation, and optimization of spatial data. Spatial data is inherently uncertain, and the course explored how to identify, measure, and deal with that uncertainty.

*Module One: Basics of Data and Information*

The objective of this module is to learn how to "describe what geographic data are; identify the different types of attribute data; identify the fundamental problems with representing geographic data; understand the two ways for conceptualizing geographic data; describe two methods of representing geographic data in digital form; describe the importance of visualization and interaction to GIS; define what spatial analysis is; and to identify the types of spatial analysis".

There are four parts to this module:

1. Representing geography covers how to produce representations of any part of the geographic world. You have to make choices about "what to represent, at what level of detail, and over what time period".

***Exercise: Explore geographic data***

Exploring the Geography and Campgrounds of Yellowstone Park

2. The nature of geographic data is an issue because much geographic data is collected from selected samples. You have to learn to understand the missing information between samples.

***Exercise: Explore how sampling scheme can affect spatial interpolation***

Sample Elevation and Hillshade Geoprocessing

3. Creating and visualizing information explores several types of spatial analysis.

***Exercise: Spatial analysis sampler***

4. Uncertainty explains the difficulties of representing the world and why our knowledge of spatial objects is limited.

*Module Two: Cartography, Map Production, and Geovisualization*

The objective of this module is to learn how to "list ways that a GIS map differs from a paper map; understand how GIS maps affect decision making; represent attributes with symbol variation; group attributes into classes for easier interpretation; identify ways in which the spatial properties of objects can be altered to clarify a map's message; understand how 3-D perspective, real-time animation, and virtual reality are changing GIS visualization; query a GIS for spatial and attribute information; and explain why dasymetric mapping and multivariate analysis result in more informative maps".

There are four parts to this module:

1. GIS-based visualization explained the importance of advances in computer technology for GIS analysis.

***Exercise: Explore the power of a GIS-based representation***

2. Representing attributes and spatial objects explored how to represent and classify attributes with symbology.

***Exercise: Represent features and attributes***

***Exercise: Simplify a line feature***

3. Scientific visualization explored the newer technology of computer-based GIS analysis software.

***Exercise: Interact with a GIS display using feature selections***

Select Features by Attribute and Location for Analysis

4. Advanced methods for improving visualizations explored dasymetric mapping and multivariate mapping.

*Module Three: Query and Measurement*

The objective of this module is to learn how to "name the different views of data a GIS provides; query a catalog view; query data for specific attribute and spatial conditions; link views to see relationships between maps, tables, and graphs; explain how distance is measured on planar and spherical surfaces; understand how compactness is measured; describe how slope and aspect are measured on raster surfaces; and derive slope and aspect surfaces from an elevation surface".

There are three parts to this module:

1. Querying views of a GIS explored how data can be viewed in a catalog, a map, a table, or a histogram/scatterplot by querying a GIS database for information.

***Exercise: Explore data using a catalog view***

2. Advanced queries explored the value of linking views and querying tables to obtain more interesting information from a database than a simple query provides. The attribute values can be viewed in a spatial context.

***Exercise: Create advanced queries to get information***

3. Querying for measurements of length or area is a routine task in GIS software because it is an easy and fast method.

***Exercise: Work with measurements***

Measurement of Aspect, Slope, Elevation, Distance to Find Suitable Sites
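
As a rough illustration of the planar versus spherical distance distinction named in this module's objectives, the sketch below compares a Pythagorean distance on projected coordinates with a great-circle (haversine) distance on latitude/longitude. The function names and the 6371 km mean Earth radius are assumptions made for this example; they are not taken from the course or from ArcGIS.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, not a course value


def planar_distance(x1, y1, x2, y2):
    """Straight-line (Pythagorean) distance between two projected points."""
    return math.hypot(x2 - x1, y2 - y1)


def spherical_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# A 3-4-5 triangle on a flat, projected surface.
print(planar_distance(0, 0, 3, 4))  # 5.0
# A quarter of the way around the equator on a sphere.
print(spherical_distance_km(0, 0, 0, 90))
```

The planar formula is only appropriate when the coordinates are already in a projected system with linear units; for unprojected latitude/longitude the spherical form is the safer choice.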

*Module Four: Transformations and Descriptive Summaries*

The objective of this module is to learn how to "describe what buffers are and how they are used; perform point-in-polygon analysis and explain how it is applied in discrete object and field perspectives; perform polygon overlay analysis and understand how it can be used; explain what surface interpolation is; explain the difference between the Inverse Distance Weighting and Kriging surface interpolation methods; list the different ways that density can be calculated; describe statistical summaries of data, including central tendency and dispersion; understand visual summaries of data, such as histograms and scatterplots; and understand methods of creating geographic summaries of data, such as spatial dependence and fragmentation".

There are five parts to this module:

1. Buffering, point-in-polygon, and polygon overlay explored the use of these tools to study relationships between different data layers.

***Exercise: Perform buffer, point-in-polygon, and polygon overlay operations***

The Buffer Process to find each school's assignment within the stream buffers.
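
To make the point-in-polygon idea from part 1 concrete, here is a minimal sketch of the classic ray-casting test. It is not how ArcGIS implements the operation; the function name and the sample square are hypothetical, and points lying exactly on a boundary are not treated specially.

```python
def point_in_polygon(x, y, vertices):
    """Return True if (x, y) falls inside the polygon given by vertices.

    Casts a horizontal ray to the right of the point and counts how many
    polygon edges it crosses; an odd count means the point is inside.
    """
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Only edges that straddle the ray's y value can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical 4 x 4 square zone.
zone = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(point_in_polygon(2, 2, zone))  # True
print(point_in_polygon(5, 2, zone))  # False
```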

2. Spatial interpolation and density estimation explored the use of the Inverse Distance Weighting and Kriging interpolation tools and the density calculation tools.

***Exercise: Use and compare Inverse Distance Weighting (IDW) and Kriging***

***Exercise: Estimate density***
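
The idea behind Inverse Distance Weighting can be sketched in a few lines: each sample contributes to the estimate with a weight that shrinks as its distance from the target location grows. The function name, the `power` parameter default of 2, and the sample values below are assumptions for illustration. Kriging is not shown because it also requires fitting a model of spatial dependence to the data.

```python
import math


def idw_estimate(samples, x, y, power=2.0):
    """Estimate a value at (x, y) from (sx, sy, value) samples using IDW."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            # The target sits exactly on a sample, so return that value.
            return float(value)
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Two hypothetical elevation samples 10 units apart.
samples = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0)]
print(idw_estimate(samples, 5.0, 0.0))  # midway, so close to the average
print(idw_estimate(samples, 2.0, 0.0))  # pulled toward the nearer sample
```

Raising `power` makes nearby samples dominate more strongly, which is one reason the choice of sampling scheme affects the interpolated surface.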

3. Centers and dispersion explored the best way to summarize a set of point locations using measures of central tendency and dispersion. The measures of central tendency covered were the mean, median, and mode; the measure of dispersion was the mean distance from the centroid. The Varignon Frame experiment was also described.

***Exercise: Calculate measures of central tendency and dispersion***
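
A small sketch of two of the summaries mentioned in part 3: the mean center of a set of points and the mean distance of those points from that center. The function names and the four sample points are hypothetical, and the code uses only Python's standard library rather than any ArcGIS tool.

```python
import math
from statistics import mean


def mean_center(points):
    """Mean of the x coordinates and mean of the y coordinates."""
    xs, ys = zip(*points)
    return (mean(xs), mean(ys))


def mean_distance_from_center(points):
    """Average straight-line distance from each point to the mean center."""
    cx, cy = mean_center(points)
    return mean(math.hypot(x - cx, y - cy) for x, y in points)


# Hypothetical point locations at the corners of a 4 x 4 square.
points = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(mean_center(points))  # (2.0, 2.0)
print(mean_distance_from_center(points))
```

A tight cluster gives a small mean distance and a scattered pattern gives a large one, so the two numbers together describe both where the points are and how spread out they are.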

4. The use of histograms, pie charts, and scatterplots was explored, including the issue of successfully comparing attributes when objects do not spatially coincide.

***Exercise: Work with charts and histograms***

5. Spatial dependence and fragmentation concepts and their use were explored, including the issue of positive vs. negative spatial autocorrelation.

***Exercise: Explore spatial dependence and fragmentation***

Calculate the Compactness of the Polygons in the Old Vegetation Layer

Compactness of the New Vegetation Layers
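
The compactness calculation can be sketched as follows. The write-up does not state which formula the exercise used, so this example assumes one common choice, the ratio 4πA/P², which equals 1 for a circle and falls toward 0 for elongated or fragmented shapes. The function names and sample polygons are hypothetical.

```python
import math


def polygon_area_perimeter(vertices):
    """Return (area, perimeter) of a simple polygon via the shoelace formula."""
    twice_area = 0.0
    perimeter = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(twice_area) / 2.0, perimeter


def compactness(vertices):
    """Assumed measure: 4 * pi * area / perimeter squared."""
    area, perimeter = polygon_area_perimeter(vertices)
    return 4.0 * math.pi * area / perimeter ** 2


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
strip = [(0, 0), (10, 0), (10, 1), (0, 1)]
print(compactness(square))  # pi / 4, about 0.785
print(compactness(strip))   # much lower for the long, thin shape
```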

*Module Five: Optimization and Hypothesis Testing*

The objective of this module is to learn how to "define what location-allocation problems are; describe routing problems, including the traveling salesman and orienteering problems; give an example of an optimum path problem; understand why heuristic techniques are used to solve complex problems in a short time; understand the concepts of random sampling and hypothesis testing; explain what a null hypothesis is and why it is used; explain what a confidence level is and name the common ones used in hypothesis testing; and describe two problems with using inferential statistics to characterize geographic data".

There are two parts to this module:

1. Optimization explored the tools and methods for determining the best locations or routes, problems for which there is rarely an absolute solution but which GIS is well suited to analyze. Point location, routing problems, optimum paths, determining the coverage area for fire stations, and finding the shortest route to visit customers were covered in detail.

***Exercise: Find the least cost path for a power line***

Site Analysis for Power Line Path Location - Least Cost
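
The least cost path idea can be illustrated with Dijkstra's algorithm run over a small cost raster. This is a simplified sketch, not the ArcGIS cost distance tools: it assumes non-negative cell costs, moves only between edge-adjacent cells, and charges the value of every cell on the path, including the start. The function name and the 3 x 3 cost surface are made up for the example.

```python
import heapq


def least_cost_path(cost, start, goal):
    """Return (total_cost, path) across a 2D cost grid using Dijkstra."""
    rows, cols = len(cost), len(cost[0])
    best = {start: cost[start[0]][start[1]]}
    previous = {}
    queue = [(best[start], start)]
    while queue:
        total, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if total > best[cell]:
            continue  # stale entry; a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_total = total + cost[nr][nc]
                if new_total < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_total
                    previous[(nr, nc)] = cell
                    heapq.heappush(queue, (new_total, (nr, nc)))
    # Walk back from the goal to the start to recover the route.
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    path.reverse()
    return best[goal], path


# Hypothetical cost surface: the 9s are expensive cells to cross.
cost_surface = [
    [1, 1, 1],
    [9, 9, 1],
    [1, 1, 1],
]
total, route = least_cost_path(cost_surface, (0, 0), (2, 0))
print(total)  # 7
print(route)
```

The direct route through the expensive cell would cost 11, so the algorithm detours around it even though the detour crosses more cells.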

2. Hypothesis testing uses inferential statistics to make educated guesses about the numeric characteristics of large groups, but caution must be used when doing this for geography. The concepts of sampling and of testing a hypothesis by actually testing its opposite, the null hypothesis, were discussed, along with the problem that inferential statistics can tell you about the characteristics of a random sample but cannot tell you about the spatial arrangement of values.

*Module Six: Uncertainty*

The objective of this module is to learn how to "measure error in nominal data using a confusion matrix; understand how error in nominal data is affected by sampling methods; describe some theoretical problems in sampling natural areas; know what 'accuracy' and 'precision' mean in the context of measuring ratio data; explain what the Root Mean Squared Error is used to measure; list factors that can limit error in a geographic database; name two approaches to managing error in a geographic database; explain the difference between the frequentist and subjectivist views of probability; describe how fuzzy set theory can be used to understand uncertainty in nominal data; and list some practical guidelines for living with uncertainty in a GIS".

There are three parts to this module:

1. Measuring uncertainty in nominal and ordinal values was explored, because both data types are subject to misclassification in a database. The concepts were discussed for nominal data, but the same applies to ordinal data. Four areas were covered: the use of confusion matrix tables for recording classification error in a database, summarizing a confusion matrix, spatial sampling, and the difficulties in sampling natural areas.
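
Two common ways to summarize a confusion matrix are the proportion correctly classified and Cohen's kappa, which discounts the agreement expected by chance. The write-up does not say which summaries the course used, so treat this as an assumed example; the function names and the 2 x 2 matrix are hypothetical. Rows are taken to be the classes recorded in the database and columns the classes observed in the field.

```python
def overall_accuracy(matrix):
    """Proportion of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def kappa(matrix):
    """Cohen's kappa: agreement beyond what chance alone would produce."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    # Chance agreement from the products of row totals and column totals.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(n)
    ) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical check of 100 sampled locations across two classes.
matrix = [
    [40, 10],
    [5, 45],
]
print(overall_accuracy(matrix))  # 0.85
print(kappa(matrix))
```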

***Exercise: Examine and interpret a confusion matrix***

2. Measuring uncertainty of interval or ratio values was explored; these values are measured on a continuous scale, so a wide range of statistical techniques can be used. The concepts of accuracy and precision, how to describe the distribution of errors, and uncertainty in spatial data were discussed in detail.

***Exercise: Examine Root Mean Squared Error***
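
Root Mean Squared Error is the square root of the average squared difference between measured values and their reference values. The sketch below shows the calculation on made-up elevation readings; the function name and the numbers are illustrative only.

```python
import math


def rmse(measured, reference):
    """Root Mean Squared Error between paired measured and reference values."""
    if len(measured) != len(reference):
        raise ValueError("measured and reference must have the same length")
    squared_errors = [(m - r) ** 2 for m, r in zip(measured, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical elevation readings checked against surveyed values.
measured = [102.0, 98.0, 105.0, 95.0]
reference = [100.0, 100.0, 100.0, 100.0]
print(rmse(measured, reference))  # sqrt(14.5), about 3.81
```

Because the errors are squared before averaging, the two readings that are off by 5 contribute far more to the result than the two that are off by 2.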

3. Measuring uncertainty of spatial data is a special concern in GIS. The different types of uncertainty are categorized. This exercise considers the issue from a systemic perspective; even though uncertainty is a chronic issue, it is a manageable problem in GIS. Four aspects of errors were discussed: spatial structure of errors, error propagation, fuzzy approaches, and living with uncertainty.

*Evaluation*

I found this ESRI course very educational and learned many skills, along with gaining a better appreciation of the many uses and limitations of GIS analysis. I have always been fascinated with the mysterious concept of fuzzy logic and related technology. It seems like random, illogical logic, yet it functions very well when used, which is difficult to fully understand.

I really appreciated the ending statement of the course, because the issue of uncertainty is always present when performing a GIS analysis. I always find myself asking the following questions: Do I have the correct and most accurate data for the specific project? Are the geoprocessing results the best possible? Is there an alternative method which will provide a better result or solution?

**It is important to learn the guidelines for living with GIS uncertainty.**

... To begin and end the process...