Shared Flashcard Set


Undergraduate 2

Additional Geography Flashcards




GIS and Projections

Projections are more than a simple tag in a metadata file.

-should be able to figure out projection, simply by looking at metadata


Allow us to understand the process used to flatten the earth.


Why do we bother?

-If we understand process used to flatten it, can control what should be represented.


Can be relatively easily changed if required.

-Which to change (ie. must know why you use certain projections in certain instances)


Transform the earth from a sphere to a flat earth.


Preserve distance, direction (ie. navigation), shape, or area.


-occasionally we can put two of these together

-no single projection that can do all of these

Defining the Earth's Shape

None of these things are actually the earth (just models)

surface- Actual surface

geoid - water surface covering entire earth. Sea level is remarkably variable.

More accurate representation of what is going on, gravitationally.

ellipsoid - simplified version of earth's shape.


Most are usually present (in the majority of projections).


Although there are several earth shape models for various parts of the world, they all consist of the same variables:

An equatorial and polar radius and a flattening factor (the difference between the two radii divided by the equatorial radius; given as a ratio or %)
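A minimal sketch of how the flattening factor follows from the two radii, assuming the usual definition f = (a - b) / a. The radii are the semi-major and semi-minor values quoted later in these notes for the 10TM parameters; the function name is only illustrative.

```python
def flattening(equatorial_radius_m, polar_radius_m):
    """Return the flattening f = (a - b) / a of an ellipsoid."""
    return (equatorial_radius_m - polar_radius_m) / equatorial_radius_m


# Semi-major and semi-minor axes quoted in the 10TM parameter list of these notes.
a = 6378137.0
b = 6356752.3142

f = flattening(a, b)
inverse_flattening = 1.0 / f  # spheroids are usually quoted by 1/f
print(f"f = {f:.9f}, 1/f = {inverse_flattening:.3f}")
```

The inverse flattening is the form most spheroid definitions use, since f itself is a very small number.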


Early geodesists from different regions of the Earth came up with a different model of its general shape.

The undulations in the Earth's surface due to differential forces of gravity are described by a Geoid model.

Newer technology, including satellite and laser measurements, gives better estimates of the Earth's shape.

Still use some of the old stuff for certain areas (good enough and nice and simple)



Water surface covering entire earth.

This imaginary sea is not affected by the moon, wind, or waves - only gravity. Thus the geoid model and true sea level are not always equivalent.


Flood entire land with water. Ups and downs seen in sea level would be caused by gravity, and nothing else.

Geographical Coordinate System


Angular distance N or S of the equator, measured from the centre of the earth (up to 90 degrees).

-from equator



Angular distance E or W of a point on the earth, measured from the centre of the earth (up to 180 degrees E or W; 360 degrees in total)



Lethbridge is 49° 49' 33" N (degrees, minutes, seconds)
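GIS software usually wants decimal degrees rather than degrees/minutes/seconds. A small sketch of the conversion, using the latitude value from this card; the function name and the sign convention (S and W negative) are assumptions for illustration.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere="N"):
    """Convert degrees/minutes/seconds to decimal degrees.

    S and W hemispheres are returned as negative values.
    """
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if hemisphere.upper() in ("S", "W") else value


# Latitude as written on the card above.
lethbridge_lat = dms_to_decimal(49, 49, 33, "N")
print(round(lethbridge_lat, 6))
```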




A network of lines representing a selection of the Earth's longitudes and latitudes.

-not all maps need a N arrow... just need graticule

-N is not constant.

9 Projection types (look at chart)




Normal (to sphere) = At right angles to the rotational axis of the earth

-very few maps are created using this orientation

-works with polar projection


Transverse = Perpendicular to the equator.

-has to be in reference to point of contact.

-need to know the infinitely small point of contact.


Oblique = any angle between those two extremes

-not used very often


Planar - one infinitesimally small place where the map is "right". Gets dramatically worse the further you get from the tangent point.


Cylindrical - More than one point touching. A line touches. Infinite # of touch points, all in a line.


Conical - Touching at a line as well. One point of intersection. Trade-offs. Often only use conics for half (ie. equator-up or equator-down)

Map Projections Orientation

Normal projection: Main axis of projection is parallel to axis of the planet (touching equator)


transversal: main axis orthogonal to the axis of the planet


polar: main axis is perpendicular to the axis of the planet

Properties of Projections

Equidistant (length preserving): NO projection can preserve lengths on the whole surface. Approximately for large scale maps.

-usually preserved in only ONE direction

-compromise between angle and area deformation


Equal Area (area preserving): Shape of earth changes; cannot be angle preserving.

-distorts shape, but areas are preserved

-Albers or Lambert's equal area projections

types: population density maps, land-use maps, resource management, etc


Equal Angle (Angle preserving or conformal): Scale is the same in any direction at a given point (conformal).

-angles on the surface are the same as on the projection

-shapes are preserved but NOT size/area

-Mercator most common on earth

-useful for navigation, but let's not put them on the news (makes people think Canada is really big)


Which Projection to Use?



Think about the eventual use of the map (digital maps need design)


Best practices include:

-some thought as to the end use of the map

-standardized projection?

-scale of map?

-Are quantitative measurements possible for the eventual use of the map (area, length)? (scary one)


Look at "Projection Differences" slide (#9, Jan 19th)

Map Projection Parameters

Standard Line - A line of tangency between the projection surface and the Earth

-tangent cylindrical and conic projections have just one standard line. Secant projections have two standard lines.

-represents an area of no distortion.


Standard Parallel - a standard line that follows a line of Latitude


Standard Meridian - a standard line that follows a line of Longitude



Faking Things

When a specific map projection is used to create a coordinate system, the central parallel and meridian are used as the origin (zero X, Y) for the map.


Create a False Origin (false easting and northing)


*check out series of diagrams (Jan 19th)


Universal Transverse Mercator

-Almost never use true Mercator; we use the one that the American Military developed during WWII

-most commonly used map projection

-world standard for topographical mapping

Selected Projections

Peters Projection

Equal area projection, and not particularly nice.

Make you realize that Africa is one of largest continents on earth.


Sinusoidal Equal-Area


Lambert Azimuthal Equal Area

-equal area focused on the pole


Lambert Conformal Conic

-very common

-origin: 23N, 96W

Standard Parallels: 20N, 60N

Projection Basics

Datums, Spheroids etc.

-How do we know where we are?

A datum is a point at which we start things.


Use a model of the Earth (a Spheroid)

-basic input


Clarke 1866

Geodetic Reference System 1980 (GRS80)

-lots have been created since

Clarke 1866

Alexander Clarke was a British surveyor

-we already knew earth was not a perfect sphere


Clarke 1866 is the longest used model of the Earth's surface (used faithfully for ~100 years)


Basis of the North American Datum defined in 1927 (NAD27)


"Based on Clarke's spheroid, we're going to call these things true - datums (singular point of reference)."

-so that we can know where we are relative to it


Used by the USGS for 50+ years

GRS80 and WGS84

New spheroids come and go


GRS-80 (geodetic reference system, 1980) defines only the geometric shape of its ellipsoid and a normal gravity field formula.


WGS (World Geodetic System) defines a fixed global reference frame for the entire earth

The latest revision is WGS 84 dating from 1984 (last revised in 2004) valid up to 2010.


WGS covers entire earth, while GRS does not


Reference from which measurements are made.


A datum is the starting point for all measurements of the Earth's surface - you have to start somewhere.

-set these up very carefully.

Set them up in reference to the earth's surface.


Based on an accurate spheroid - gives an estimate of the shape of the earth.


Must be based locally - errors tend to propagate away from the point of reference. When you're at something, you know where it is. The further you get away, the less accurate you know where you are.


Horizontal datums are used for describing a point on the Earth's surface in a coordinate system (x and y).


Vertical datums are used for elevation and depth - e.g. bathymetry (z element).

Datums in GIS

NAD27 is based on the Clarke 1866 spheroid that originates at a surface point at Meades Ranch, Kansas.


Geographic centre of the contiguous US.

(not the centre of North America, which is in N Dakota)


At that one point, error is zero. But nobody lived in Kansas in 1866. Maps at this time were a little wonky as a result.


USGS - North American Use (doesn't really work for Canada though- been screwing up our maps for a long time).


Most maps in Can and US are based on this datum. We get tons of errors from this.


Most are now being updated to more recent spheroids and datums.



North American Datum 1983

Can be based on either the GRS80 or WGS84 spheroids (they are pretty close).

Both are more accurate for Canada. The spheroids fit better due to how they were created.




Avoid time-consuming and possibly costly data alignment issues.


GIS users need a basic understanding of projections, spheroids, and datums.


Awareness of the datums associated with the data sets they use.

Projections - UTM

Universal Transverse Mercator (UTM) is a global, international, metric coordinate system.


It is mathematically consistent and well defined for the entire earth (apart from north of 84N and south of 80S). There's nothing there anyways.


The earth is divided into 60 zones that span 6 degrees of longitude each (covering latitudes 84N-80S).


Zone 1 starts at 180W.
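Because the zones are regular 6-degree strips starting at 180W, the zone number and its central meridian can be computed directly from a longitude. The sketch below assumes the plain 6-degree grid only (it ignores the special zone exceptions around Norway and Svalbard); function names are illustrative, and the Lethbridge longitude is approximate.

```python
import math


def utm_zone(longitude_deg):
    """Return the UTM zone number (1-60) for a longitude in [-180, 180)."""
    if not -180.0 <= longitude_deg < 180.0:
        raise ValueError("longitude must be in [-180, 180)")
    return int(math.floor((longitude_deg + 180.0) / 6.0)) + 1


def central_meridian(zone):
    """Return the central meridian (degrees, W negative) of a UTM zone."""
    return -183.0 + 6.0 * zone


lethbridge_lon = -112.8  # approximate longitude, W is negative
zone = utm_zone(lethbridge_lon)
print(zone, central_meridian(zone))
```

Each zone's central meridian sits 3 degrees inside its western edge, which is why distortion stays small across the strip.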

Can be based on various datums (NAD27; NAD83).

Limiting distortions away from the central meridian.

The zones are also split N and S of the equator.


UTM in 6 degree increments, yet the prairie provinces are 10 degrees wide. Makes problems for AB and SK mapping.


Each zone has a common reference system.


Origin (for N. Hem) is the equator and the zone's central meridian.


To eliminate negative coordinate values to the west of the central meridian in each UTM zone, the origin of each is given a false easting value of 500 000 meters.

-most common problem is that people will monkey around with the 500 000 number, or neglect to include it.


This places the origin 500 000 meters west of the central meridian.


The equator is the origin for northings, and it begins at zero for all Northern zones.


Southern UTM zones have a false northing at the equator equal to 10 000 000 meters so that there are no negative northings for southern zones.


Proper UTM coordinates are very large numbers (always Easting, Northing).
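The false easting and false northing are just constant offsets. A rough sketch of how they turn signed distances from the true origin into positive grid coordinates; the function name and its inputs (metres east of the central meridian, metres north of the equator) are assumptions for illustration, not part of any GIS package.

```python
FALSE_EASTING_M = 500_000.0            # applied in every zone
FALSE_NORTHING_SOUTH_M = 10_000_000.0  # applied only south of the equator


def to_utm_grid(east_of_central_meridian_m, north_of_equator_m):
    """Apply the UTM false origin to signed offsets (west and south negative).

    Returns (easting, northing), always in that order.
    """
    easting = FALSE_EASTING_M + east_of_central_meridian_m
    if north_of_equator_m >= 0:
        northing = north_of_equator_m
    else:
        northing = FALSE_NORTHING_SOUTH_M + north_of_equator_m
    return easting, northing


# A point 100 km west of the central meridian and 5 500 km north of the equator.
print(to_utm_grid(-100_000.0, 5_500_000.0))
```

Forgetting or altering the 500 000 m offset shifts every point east or west by the same amount, which is the common mistake noted above.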

Alberta - the Projection Problem Child

We sit in 2 UTM zones (11 and 12)

The meridian at 114W divides the zones (zone 11 covers 120W-114W and zone 12 covers 114W-108W).

Cannot use UTM to map all of AB


To properly map AB as a whole, the 10TM projection was introduced

-NAD83, AB 10TM coordinates

-Central meridian: -115

-Projection scale factor at Central Meridian 0.9992

-False easting: 0 [500 000]

-False Northing: 0 [0]


-semi-major = 6378137.0000

-semi-minor = 6356752.3142

-False easting is different in the "forestry" vs "resource" version of this projection (explanation of the brackets)

More "Oh-No's"

3TM = 3-Degree Transverse Mercator

3TM is used in municipalities

Used in all urban areas of Alberta (and others).


UTM used in rural areas (or 10TM).

3TM provides more accurate projections

Makes integration difficult

People outside AB find it difficult

Dominion Land Survey

Western Canada was divided (mostly) into one square mile sections.


Based on the systems the US uses (Public Land Survey)


Started in 1871 - Manitoba was founded in 1870 and the NWT became part of Canada.


The DLS is the largest survey grid in the world created at once.

Defined by meridians.

Going from East to West- there are 7 meridians in the system.


The first is just west of Winnipeg, second is the Man/Sask boundary, third is Moose Jaw, Fourth is the Sask/AB boundary, Fifth runs through Calgary (Barlow Trail), Sixth is Grande Prairie, and the seventh is between Hope and Vancouver.


The base unit of measure is the Township, which is 6 x 6 miles (NS-EW)


There are two tiers of townships to the north and two tiers to the south of each baseline.


E and W edges of a township are defined as lines of longitude and converge to the N (N edge of the township is shorter than the S) - Correction lines are created.... (finish later)


Townships are then designated by their "township number" and "range number".


Township 1 is the first N of the First Baseline, and the numbers increase to the North.


Range numbers recommence with Range 1 at each meridian and increase to the W


"Township 52, Range 25 W of the Fourth Meridian", abbreviated "52-25-W4". Meridians are not referenced in Manitoba.


Each section is divided into four "quarter sections" SE, SW, NW, NE. And the division goes on... these are Legal Subdivisions (LSD)
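The abbreviated designation is regular enough to parse mechanically. A minimal sketch, assuming only the "township-range-Wmeridian" form shown above (e.g. 52-25-W4); the function names are hypothetical, and the distance figure is nominal because it ignores road allowances and correction lines.

```python
import re

# Matches the abbreviated form used above, e.g. "52-25-W4".
_DLS_PATTERN = re.compile(r"^(\d+)-(\d+)-W(\d)$")


def parse_dls(text):
    """Split an abbreviated DLS designation into its three numbers."""
    match = _DLS_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a township-range-meridian string: {text!r}")
    township, range_number, meridian = (int(part) for part in match.groups())
    return {"township": township, "range": range_number, "meridian": meridian}


def nominal_miles_north(township):
    """Nominal distance of a township's south edge from the First Baseline.

    Assumes every township is exactly 6 miles tall.
    """
    return (township - 1) * 6


location = parse_dls("52-25-W4")
print(location, nominal_miles_north(location["township"]))
```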

Object vs Field Concepts

Objects are entities such as buildings, roads, pipes, properties; they have distinct boundaries; they are considered discrete entities.


Fields are continuous phenomena such as elevation, temperature, and soil chemistry; they exist everywhere (every point has an elevation or temperature); they are not discrete entities.


In modern GIS, we get away from Vectors/Rasters and move towards Objects/Fields

Spatial database models

Vector - points, lines, polygons.

Raster - gridded, classified space.



-a line only has width in raster.

-a point has area (single pixel)


Spatial Database - Internal GIS database

Attribute Database - External database


Node = functionally the same as a point.

Vertex = between a 'from' and 'to'.

Arcs = Lines (though arcs normally have an associated topology). A line does not. This is the fundamental difference between them.

-gets declared a little differently than a line.

Polygons - use arc and arc direction to combine polygons.

** look at charts, slide 14 on Jan 24th... may have to replicate tables.

Geographical Data Models

All geographic information systems are built using formal models that describe how things are located in space.


A formal model is an abstract and well-defined system of concepts.


A geographic data model defines the vocabulary for describing and reasoning about the things that are located on the earth.


Geographic Data models serve as the foundation on which all geographic information systems are built.

Information File Types

ESRI Coverage:

-Georelational data model (old school)

-Developed in mid-80s to separate GIS from CAD data models


Is topologically encoded (first ever)


-three topological relationships

---connectivity - arcs connected to each other

---area definition - areas defined by their arcs

---contiguity - arcs have directions and right/left polygons

The Shapefile

The shapefile is a standard, nontopological data format used in ESRI products.


Although the shapefile treats a point as a pair of x-,y-coordinates

A line as a series of points

A polygon as a series of line segments


No file describes the spatial relationships between these geometric objects.



-portability to non-topological GISs (interoperability)--- many GISs don't contain topology

-Speed of display

-Can be converted to coverages

---converting from coverage to shape file is easier than the other way around


***Check out Slide 13 (Jan 26th)***

-ArcView became a very popular product... got more and more capable, and started to become just like a GIS

-ArcGIS started with version 8 in 2000

-about to get version 10

Python: programming that can create workflows


**Check out slide 14 (coverages vs shapefiles)

-there's going to be a folder for

With a shapefile, you get 3 for each one.

-shape file (.shp), index file (.shx), and a database file (.dbf)

Advantages of Nontopological Vector Data

Nontopological data (such as shapefiles) have two main advantages:

-fast (they can display more rapidly on the computer monitor than topologically-based data)

-They are nonproprietary and interoperable, meaning that they can be used across different software packages (eg. MapInfo can use shapefiles and ArcGIS can use MapInfo Interchange Format files).

Object-Based Data Model

The object-based data model treats spatial data as objects. It differs from the georelational data model in two important ways:

-The object-based data model stores both the spatial and attribute data of spatial features in a single system (inconvenient)

-The object-based data model allows a spatial feature (object) to be associated with a set of properties and methods (important part!... this is brilliant).

---can create data that processes itself!

---if you open up the file it will do the thing immediately (can't do this in a georelational database)


Check out The slide after this one...

Classes and Class relationships

A class is a set of objects with similar attributes


Class relationships include association, aggregation, ....... (first slide, jan 28)


An interface represents a set of externally visible operations of an object. It allows the user to use the properties and methods of the object.


Look @ Figure 3.1 in text (IFeature interface)

IFeature has access to the properties of Extent and Shape and the method of Delete. Object-oriented technology uses symbols to represent interface, property, and method. The symbols for the 2 properties are different in this case because Extent is a read-only property whereas Shape is a read and write (by reference) property.

** learn this stuff better in lab



check out Fig 3.12

The Geodatabase

The geodatabase is part of ArcObjects, a collection of thousands of objects, properties, and methods that provides the foundation for ArcGIS Desktop.




The geodatabase organizes vector data sets into feature classes and feature datasets.


A feature class stores spatial data for the same geometry type.

-classes points/lines/polygons in categories


A feature dataset stores feature classes that share the same coordinate system and area extent. Container for feature classes.


In a geodatabase, feature classes can be standalone feature classes or members of a feature dataset.

-can hold them in a geodatabase, just can't hold them in a feature data set if they're not the same.


*This is all complicated, but it's consistent.

Topology Rules (geodatabase)

The geodatabase defines topology as relationship rules and lets the user choose the rules, if any, to be implemented in a feature dataset.

Could be topologically encoded OR not.

Compare/Contrast = shapefile has no topology, gdb can have or not.


The geodatabase offers 25 topology rules by feature type.

-don't memorize list of rules

-rules for polygons, lines, and points.

Advantages of the Geodatabase

The hierarchical structure of a geodatabase is useful for data organization and management.

The geodatabase, which is part of ArcObjects, can take advantage of object-oriented technology.


The geodatabase offers on-the-fly topology, applicable to features within a feature class or between two or more participating feature classes.


Thousands of objects, properties, and methods in ArcObjects are available for GIS users to develop customized applications.


ArcObjects provides a template for custom objects to be developed for different industries and applications.


"Data about data"

Data increases in size and relevance.

Accessing large repositories of data has become easier to do.


Canadian gov has released all data that they've collected with tax dollars.


Held in multiple locations. Used to be held in a single file (bad old days).


It is a reference and organizational system about data.


Allows for the description of:

-GIS data content

-quality of data (ie. 2 different map source resolutions)

-creation date

-redistribution rights

-spatial information


Relatively recent development

Makes data mining much easier

-data mining is deriving data from data (finding one common string through a mountain of data...... ie 'which arcs were changed on which day?')

-if you want to trace back the 'heritage' of your data

Reduces duplication

Shares knowledge about how data are created

Heritage of data

Composite Features

Composite features refer to those spatial features that are better represented as composites of points, lines, and polygons.


Composite features include TINs, regions, and routes.


A TIN approximates the terrain with a set of non-overlapping triangles.


A composite feature that allowed us to approximate terrain without the data-heavy raster style.


Allows us to hold data points very efficiently.

-all you need is points (don't even need to hold the triangles.. they're all created using an algorithm).


TIN is not only efficient spatially, but also in display (vs raster)

-it doesn't mean they're accurate


***Look at Figure 3.15 (Jan 31)



A region is a geographic area with similar characteristics.


A data model for regions must be able to handle two spatial characteristics: A region may have spatially joint or disjoint areas, and regions can overlap or cover the same area.

-normally in a GIS, this is not allowed.


The regions subclass allows overlapped regions and spatially disjoint components (the former is a special case).


A route is a linear feature such as a highway, a bike path, or a stream.

Not simple vectors. They can't be. It provides conveyance.

ie. a boundary would be a linear feature that does not deserve a route.


Unlike other linear features, a route has a measurement system that allows linear measures to be used on a projected coordinate system.


Need to create a topological structure (this is the direction, etc etc)

-Figure 3.19*

F- (from)

T- (to)


An early application of topology in geospatial technology is the TIGER (Topologically Integrated Geographic Encoding and Referencing) database from the U.S. Census Bureau.


Created in 80s


Equivalent Canadian organization is Stats Canada (large GIS division)


A little weird in the way it references stuff (Figure 3.4)

Another combined file type


Topology in the TIGER database involves 0-cells or points, 1-cells or lines, and 2-cells or areas.


Address ranges and zip codes in the TIGER database have the right- or left- side designation based on the direction of the street.

GIS Basic Data Models

What are the two types of Data Models?

(ie. models for graphically representing geographic space)



Note: A database structure need seldom be made to suit a data model. But a well prepared data model is vital for a successful GIS analysis.

Raster Data Models (Structure)

One model for representing geographic space.


Spatial locations are implicit (point of differentiation - in a vector they are explicit).


Relationships between entities/objects are explicit. Just can't have spatial locations that are.


Points associated with single grid cell (as accurate as you can get). GPS point lights up a single pixel.


Lines are a connected sequence of cells.

-generally results in lines of unequal areas (crossing multiple cells in some areas, and one in others)


Areas are a sequence of interconnected cells.

A cartesian coordinate system is composed of explicit x-, y-coordinates

Why would you change vector to raster?

Why would you give up accuracy of position?

1) continuous features

2) modelling

Raster Data: Description

Consists of a matrix of homogeneous grid cells (usually square in shape)


Each raster map layer has two origins:

1)The cartesian coordinate origin at the bottom left, referencing a cell's position to a real-world location (overlying grid with infinite resolution)

2)The row and column index origin at the top left, referencing a cell location within the grid matrix.
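The two origins are related by simple arithmetic: columns count right from the left edge, but rows count down from the top while the y-axis counts up from the bottom. A sketch of the conversion from a (row, column) index to the real-world coordinates of the cell centre, assuming square cells, zero-based indices, and a lower-left map origin; the function and parameter names are illustrative.

```python
def cell_center_xy(row, col, n_rows, x_min, y_min, cell_size):
    """Map a zero-based (row, col) index to the x, y of the cell centre.

    Row 0 is the top row of the grid; (x_min, y_min) is the lower-left
    corner of the raster in map units.
    """
    x = x_min + (col + 0.5) * cell_size
    y = y_min + (n_rows - row - 0.5) * cell_size
    return x, y


# A 1024 x 768 (columns x rows) raster with 10 m cells and origin at (0, 0).
print(cell_center_xy(0, 0, n_rows=768, x_min=0.0, y_min=0.0, cell_size=10.0))
```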


Individual grid cells in a raster image are referred to as "picture elements" or "pixels"


MetaData will tell you:

RC (row/column) .. ie 1024x768

Cart/real N(7values) E(6values)

Resolution (ie. 10m)

Raster Data Sources

Satellite imagery

-Landsat data; SPOT data, etc.


Existing cell-based data

-DEM; Arc/Info Grid; GRASS (a free GIS); IDRISI


Scanned Imagery

-aerial photographs; hard copy maps


Vector-to-raster conversion

Raster Data: Resolution

The area within a grid cell (ie. cell size) defines the spatial resolution of the raster.

***Doesn't mean that it covers what the raster resolves

-need multiples of anything to resolve something.

-30m pixels doesn't mean you'll find something 30m big on the ground


GSD (ground sampled distance) - a better notion... since 'resolution' gives people the impression that you can see stuff.


The smaller the cell, the greater the resolution and accuracy (more detailed feature representation)


There is a trade-off between resolution and cost of storage and processing.



Aerial Images:

-Digital Orthophoto Quadrangles (DOQs) 1m, 2.5m, 10m & 30m resolutions.

-Custom orthophotography - resolution 25cm.


Satellite Images:

-MODIS: 250m to 1km per pixel

-Landsat 30m, SPOT 10/5/2.5m, IKONOS 1m others



-DEMs: 10 & 30m

-Scanned maps, vector conversions - varies greatly


**resolution is "on the side"

-10m resolution is 4 times better than 20m.

Conversion: Vector to Raster

Centroid Method

-Code your polygons

-overlay raster grid

-calculate the centre of each raster cell (becomes a point file)

-then do "point in polygon"

-code each raster cell based on where the centre of it falls.


End up with over- and under-representations.
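The centroid method can be sketched in a few lines. This is an illustrative version only: the point-in-polygon test uses ray casting (one common choice, not necessarily what a given GIS uses), the triangle and grid are made-up data, and cell centres that fall exactly on a polygon edge are not treated specially.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize_by_centroid(polygon, code, n_rows, n_cols, cell_size,
                          x_min=0.0, y_min=0.0, background=0):
    """Code each cell by whether its centre falls inside the polygon."""
    grid = []
    for row in range(n_rows):
        values = []
        for col in range(n_cols):
            cx = x_min + (col + 0.5) * cell_size
            cy = y_min + (n_rows - row - 0.5) * cell_size  # row 0 is the top
            values.append(code if point_in_polygon(cx, cy, polygon) else background)
        grid.append(values)
    return grid


triangle = [(0.0, 0.0), (4.2, 0.0), (0.0, 4.2)]  # hypothetical polygon, code 7
grid = rasterize_by_centroid(triangle, code=7, n_rows=4, n_cols=4, cell_size=1.0)
for values in grid:
    print(values)
```

In this example ten cells receive the code, so the raster area is 10 square units against a true triangle area of about 8.8: an over-representation of the kind noted above.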

Conversion: Raster to Vector

Use an edge method.

Not the best representation you can imagine.



Conversion Errors

Errors caused by exchanging data between raster and vector formats.

Raster Data: Values

Each grid cell stores an associated value that defines which class, group, category, or member the cell belongs to.

The value is either an integer, floating point, or No Data value. Cells with No Data value are excluded during calculations and analysis.

Raster Data: Cell measurement values

Nominal: identifiers with no relation to a fixed point of linear scale (soil type)


Ordinal: Lists of discrete classes with inherent order but without magnitude or relative proportions (gold and silver medals)


Interval: classes not only with natural sequence, but also with meanings attached to the distance between sequential values (time of day, pH value, etc)


Ratio: variables with the same characteristics as interval variables, but in addition, they have a natural zero or starting point (age, distance, income).

-can't have less than zero distance.

Raster Cells: Coding (not in text)

There are GIS formats that allow for rasters to be smart (not really smart, just less limited).


Coding is assigning a number to rasters based on characteristics (as in for a vector to raster conversion).


Can hold all of these as a raster, or as a table


In the table, we can include more than one value.


One Object: Multiple Attribute Layers

-only one attribute value may be assigned to each cell (whether forest type, tree age, etc). Objects with several attributes are represented with a number of raster layers, one for each attribute.


Each raster cell will have a column of multiple variables.

***Need to look at slides for this (near end of Feb 2, I think lecture 11 - perhaps labelled as a second lecture 10, as he forgot to change first slide).

Raster Data: Methods of Compacting

Four common methods of storing data:

-Run-length codes

-Raster chain codes

-Block codes


Characteristics of raster images

Dimensions: Width, Height

The number of color planes

Bytes per/plane, bits/pixel



Map projection system


Spatial resolution: units/pixel


Height: the number of rows

Width: the number of columns

Number of color planes and bits per plane

One plane:

-binary: 1 bit/plane

-grayscale: 8 bit/plane


Three planes:

-RGB color images

-8 bit/plane

-images formed this way for display


N planes:


-8-16 bit/plane

-(spectral resolution)

-expensive images, becoming more common

-small size spatially, but rich dataset in terms of spectral content.


Every raster has a grayscale value (between 0-255).

0 is black (all off)

255 is white (all on)

Raster images with palette

Check out this slide (Feb 7)


16.7 million colour combinations (not all are useful)


R - 2^8

G - 2^8

B - 2^8

(3 planes/images/bands)


Added together = 2^24


**learn more on this

Sources of raster map images

-Photo: aerial photo, sat images

-Scanned paper map

-Generated and/or compiled raster images and raster maps

-Raster terrain Data


Raster Data




-simple data structures

-location specific manipulation of attribute data is easy

-many kinds of spatial analysis and filtering may be used

---gets very complicated (haven't figured out how to do these things with vector)

-mathematical modeling is easy because all spatial entities have a simple, regular shape

-technology is cheap

-many forms of data are available


So guess what- there's lots.



-exceedingly large data volumes

-using large grid cells to reduce data volumes reduces spatial resolution; loss of information and inability to recognize phenomenologically defined structures.

-crude raster maps are inelegant, though graphic elegance is becoming less of a problem

-coordinate transformations are difficult and time consuming unless special algorithms and hardware are used, and even then may result in loss of information or distortion of grid cell shape.

Raster Data Structure

1. Cell-by-cell encoding

-We've got rows, and we've got columns



2.Run Length Encoding

-works best with bits (not great for bytes or higher bit depth)

-The run length encoding method records the cell values in runs. Row 1, for example, has two adjacent cells in columns 5 and 6 that are gray or have the value of 1. Row 1 is therefore encoded with one run, beginning in column 5 and ending in column 6. The same method is used to record other rows (Figure 5.9).
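The row-by-row description above can be sketched as follows. The row used here is a made-up 8-column example that matches the "columns 5 and 6" case; it is not the actual data from Figure 5.9, and the function name is illustrative.

```python
def encode_row_runs(row, target=1):
    """Return (start_col, end_col) pairs, 1-based, for each run of `target`."""
    runs = []
    start = None
    for index, value in enumerate(row, start=1):
        if value == target and start is None:
            start = index                    # a new run begins here
        elif value != target and start is not None:
            runs.append((start, index - 1))  # the run ended on the previous cell
            start = None
    if start is not None:                    # run reaches the end of the row
        runs.append((start, len(row)))
    return runs


row_1 = [0, 0, 0, 0, 1, 1, 0, 0]  # hypothetical row: gray cells in columns 5 and 6
print(encode_row_runs(row_1))
```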


3. Quad Tree

Figure 5.10 in the textbook

-look in text.. they are difficult to describe (not to encode)

-says: if we take an area, we imagine we could split it up in regularly sized blocks

-The regional quad tree method divides a raster into a hierarchy of quadrants. The division stops when a quadrant is made of cells of the same value (gray or white). A quadrant that cannot be subdivided is called a leaf node. In the diagram, the quadrants are indexed spatially: 0 for NW, 1 for SW, 2 for SE, and 3 for NE. Using the spatial indexing method and the hierarchical quad tree structure, the gray cells can be coded as 02, 032, and so on.
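A recursive sketch of the regional quad tree, using the same quadrant indexing (0 = NW, 1 = SW, 2 = SE, 3 = NE). It assumes a square grid whose side is a power of two, with row 0 at the north edge. The grid below is invented for illustration, and the codes start from an empty string at the root, so they may be prefixed differently from the figure in the text.

```python
def quadtree_codes(grid, target=1):
    """Return index codes of the leaf quadrants made entirely of `target`."""
    codes = []

    def visit(top, left, size, code):
        values = {grid[r][c]
                  for r in range(top, top + size)
                  for c in range(left, left + size)}
        if len(values) == 1:          # homogeneous quadrant: a leaf node
            if target in values:
                codes.append(code)
            return
        half = size // 2
        visit(top, left, half, code + "0")                # NW
        visit(top + half, left, half, code + "1")         # SW
        visit(top + half, left + half, half, code + "2")  # SE
        visit(top, left + half, half, code + "3")         # NE

    visit(0, 0, len(grid), "")
    return codes


grid = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
print(quadtree_codes(grid))
```

Large homogeneous areas collapse into a single short code (here the whole NE quadrant is just "3"), which is where the storage saving comes from.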



***REad about these three**

Data Compression

Lossless data compression: Store/Transmit big files using few bytes so that the original files can be perfectly retrieved (ie. zip)


Lossy data compression: Same thing, but approximately retrieved (ie. mp3)

-willing to give up something (holding it makes no sense)


Motivation: Save storage space and/or bandwidth (speed)

Digital Sampling Theorem

In 1927, Harry Nyquist, an engineer at the Bell Telephone Lab, determined the following principle of digital sampling.


When sampling a signal (ie converting from an analog signal to digital), the sampling frequency must be at least twice the highest frequency present in the input signal if you want to reconstruct the original perfectly from the sampled version.

-not a physical law.. predominantly theory.


His work was later expanded by Claude Shannon and led to modern information theory.


For this reason the theorem is now known as the Nyquist-Shannon Sampling Theorem.

Relevance to GIS?

The images/raster are samples of real scenes

-optics are used to create a two dimensional analog signal made up of spatial waveforms

-DEMs are samples from a real landscape


Spatial resolution in the development of DEM's is often the result of rules of thumb


The sampling frequency is approximated by the pixel spacing


Must be at least twice the highest spatial frequency present in the scene to faithfully record the information in the image.
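Turned around, the rule gives an upper limit on pixel spacing: sampling at twice the highest spatial frequency means at least two pixels per shortest wavelength. A small sketch of that arithmetic; the function names are illustrative, and "wavelength" here is taken to mean the shortest repeat distance of the terrain or scene detail you want to capture.

```python
def max_pixel_spacing(shortest_wavelength_m):
    """Largest pixel spacing that samples the wavelength at least twice."""
    return shortest_wavelength_m / 2.0


def is_undersampled(pixel_spacing_m, shortest_wavelength_m):
    """True when the pixel spacing is too coarse for the given wavelength."""
    return pixel_spacing_m > max_pixel_spacing(shortest_wavelength_m)


# Detail that repeats every 60 m needs pixels of 30 m or finer.
print(max_pixel_spacing(60.0))
print(is_undersampled(30.0, 30.0))  # 30 m pixels cannot capture 30 m detail
```

This is the same point as the earlier resolution card: a 30 m pixel does not mean a 30 m feature will be resolved.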


"What is the right resolution for that piece of data?"

"Why did you choose that resolution?"


Oversampling is okay, just less efficient. Undersampling is bad.


If you depart from this rule it's called Undersampling or Oversampling.

Data compression (2)

Data compression refers to the reduction of data volume

A variety of techniques are available for image compression

-compression techniques can be lossless or lossy


The wavelet transform, the latest technology for image compression, treats an image as a wave and progressively decomposes the wave into simpler wavelets.

Frequency dependent codes

Not all data appear with same frequency, some are more prevalent than the others.


Frequently appearing values could be assigned shorter codes than the others - results in reduced number of bits.
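
One well-known frequency-dependent code is Huffman coding. The notes do not name a specific scheme, so using Huffman here is an assumption. This sketch only computes code lengths (in bits) for hypothetical land-cover cell values.

```python
import heapq
from collections import Counter


def huffman_code_lengths(values):
    """Return {value: code length in bits} for a Huffman code."""
    counts = Counter(values)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (total count, tie-breaker, {value: depth so far}).
    heap = [(n, i, {v: 0}) for i, (v, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, d1 = heapq.heappop(heap)
        n2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every value in them one level deeper.
        merged = {v: depth + 1 for v, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


cells = ["water"] * 6 + ["forest"] * 3 + ["urban"] * 1
lengths = huffman_code_lengths(cells)
total_bits = sum(lengths[v] for v in cells)
print(lengths["water"], lengths["forest"], lengths["urban"])  # 1 2 2
print(total_bits)  # 14, versus 20 bits with a fixed 2-bit code
```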

Run Length Encoding

Simple form of raster data compression.


Based on creating simple frequency codes

Method based on lossless compression

Not particularly efficient- but simple... good place to start.


Instead of sending long runs of '0's or '1's, it sends only how many are in the run.


If the raster has repeating spatial structures then RLE is useful.


Runs of the same bit.

Used in fax machine transmissions (used with 1s and 0s)

Bitmap = 1s and 0s

Text on a page doesn't take up much space.

When faxing, about 80% of a page is white (immediately much smaller).


If fax data, there are many 0s (white spots)

-transmit the run-length as a fixed-size binary integer (only 1s and 0s)


Receiver generates proper number of bits in the run and inserts the other bit in between.


For us, transform codes into runs (same thing)


Works best when there are many long runs of zeros; as the frequency of 1s increases, it becomes less efficient.
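
A minimal sketch of run-length encoding for one raster row, assuming runs are stored as (value, count) pairs. A real fax code packs the counts into fixed-size binary integers instead.

```python
from itertools import groupby


def rle_encode(cells):
    """Collapse each run of equal values into a (value, run length) pair."""
    return [(value, len(list(run))) for value, run in groupby(cells)]


def rle_decode(runs):
    """Expand (value, run length) pairs back into the original row."""
    return [value for value, count in runs for _ in range(count)]


row = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
encoded = rle_encode(row)
print(encoded)                     # [(0, 6), (1, 2), (0, 4)]
print(rle_decode(encoded) == row)  # True, so the method is lossless
```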


Relative Encoding:

Some applications may not benefit from the above: video image- little repetitive within, but much repetition from one image to the next.

-Differential encoding is based on coding only the difference from one to the next

-have a first frame, a second frame, and a difference.

-the difference can be run-length encoded very efficiently.
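
A small sketch of differential encoding, assuming two hypothetical frames stored as flat lists of pixel values. Only the first frame and the difference need to be kept.

```python
def frame_difference(previous, current):
    """Per-pixel change from the previous frame to the current one."""
    return [c - p for p, c in zip(previous, current)]


def rebuild_frame(previous, difference):
    """Recover the current frame from the previous frame plus the difference."""
    return [p + d for p, d in zip(previous, difference)]


frame1 = [10, 10, 12, 15, 15, 15, 11, 10]
frame2 = [10, 10, 12, 15, 16, 15, 11, 10]
diff = frame_difference(frame1, frame2)
print(diff)                                   # [0, 0, 0, 0, 1, 0, 0, 0]
print(rebuild_frame(frame1, diff) == frame2)  # True
```

The difference is mostly long runs of zeros, which is exactly the case where run-length encoding does well.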


Run-length encoding is the method used to save data storage space by reducing a row of cells with the same value to a single unit having a specific value and quantity.

Raster Chain and Block Codes

Raster chain code assigns numbers 1-4 to indicate direction (N, S, E, W), then notes how many grid cells to move in each direction, along with assigning the grid cell value.... etc


(missed slide, Feb 9)

(Not testable)

Fourier Transform

The frequency spectrum of the signal shows what frequencies exist in the signal



-frequency domain

-temporal domain


No frequency information is available in time-domain


No time information is available in frequency-domain signal.


-Take space, and reduce it to frequency.

-Can start switching space, time and frequency domain around.

-Things are way easier to do in the frequency domain.
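
A minimal sketch of moving a signal into the frequency domain, using a plain (slow) discrete Fourier transform written with the standard library only. The test signal, a sine wave with 3 cycles across 16 samples, is made up for illustration.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude of each frequency bin of a discrete Fourier transform."""
    n = len(samples)
    magnitudes = []
    for k in range(n):
        total = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        magnitudes.append(abs(total) / n)
    return magnitudes


n = 16
signal = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
mags = dft_magnitudes(signal)
# Only the first half of the bins is independent for a real-valued signal.
dominant = max(range(n // 2), key=lambda k: mags[k])
print(dominant)  # 3
```

The spectrum says which frequency is present (bin 3) but nothing about where along the signal it occurs, matching the point above about losing time information.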

What is a wavelet transform?

Provides time-frequency representation


Wavelet transform decomposes a signal into a set of basis functions (wavelets)


Wavelets are obtained from a single prototype wavelet called mother wavelet by dilations and shifting.


Similar to what the Fourier transform does, but blazingly fast.


Wavelets = Little Waves

pure mathematics



-analysis (mainly studying functions and operators)

---fourier, harmonics, wavelets



-Continuous Wavelet Transform (CWT)

-Discrete Wavelet Transform (DWT) - use allll the time

1D Discrete Wavelet Transform

Separates the high and low-frequency portions of a signal through the use of filters


One level of transform:

-signal is passed through low and high pass filters

-down sample by a factor of two


Multiple levels (scales) are made by repeating the filtering and decimation process on low-pass outputs

Haar Wavelet Transform

Find the average of each pair of samples

Find the difference between the average and sample

Fill the first half with averages

Fill the second half with differences

Repeat the process on the first half

Step 1:

[3 5 4 8 13 7 5 3]

[4 6 10 4 -1 -2 3 1]

Averaging    Differencing

(Feb11 s12)


Final stage gets us down to holding only one digit.

End up with one averaged number, and all the differencing

-differencing is quite compressible (lots of ones and twos and threes)
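
The averaging and differencing steps can be sketched as follows. This assumes the signal length is a power of two and takes each difference as (first sample of the pair minus the pair average), which is the convention that reproduces the Step 1 numbers above.

```python
def haar_step(values):
    """One level: pair averages in the first half, differences in the second."""
    averages = [(a + b) / 2 for a, b in zip(values[0::2], values[1::2])]
    differences = [a - avg for a, avg in zip(values[0::2], averages)]
    return averages + differences


def haar_transform(values):
    """Repeat the step on the averaged half until one average remains."""
    out = list(values)
    length = len(out)
    while length > 1:
        out[:length] = haar_step(out[:length])
        length //= 2
    return out


signal = [3, 5, 4, 8, 13, 7, 5, 3]
print(haar_step(signal))       # [4.0, 6.0, 10.0, 4.0, -1.0, -2.0, 3.0, 1.0]
print(haar_transform(signal))  # [6.0, -1.0, -1.0, 3.0, -1.0, -2.0, 3.0, 1.0]
```

The final list holds one overall average (6.0) followed by small differences only.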

Haar Signal Decomposition

Magnitudes of the fluctuation subsignal (d) are often significantly smaller than those of the original signal


Logical: samples are from a continuous analog signal with very short time increments


Has application to signal compression


-textbook says this is the most advanced method, but it isn't. At least 15 years old. Actually, kind of old school.

2-D DWT (discrete wavelet transform)

ie. image compression

-Why would you compress an ocean in rows when you can do it from all sides?


Step 1: replace each row with its 1-D DWT

Step 2: Replace each column with its 1-D DWT

Step 3: Repeat steps 1+2 on the lowest subband for the next scale

Step 4: Repeat step 3 until as many scales as desired.


I need to hold many things in the image to run the compression.

-vertical, horizontal, and diagonal details at many scales


-check out slide 18 feb11


If you want to compress, can just start eliminating high frequency stuff
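
A sketch of one scale of the 2-D transform (steps 1 and 2), assuming the Haar averaging/differencing step as the 1-D DWT and a small made-up 4x4 image. The top-left quadrant of the result is the lowest subband that step 3 would process again.

```python
def haar_step(values):
    """1-D Haar step: pair averages first, then differences."""
    averages = [(a + b) / 2 for a, b in zip(values[0::2], values[1::2])]
    differences = [a - avg for a, avg in zip(values[0::2], averages)]
    return averages + differences


def haar_2d_level(image):
    """Apply the 1-D step to every row, then to every column."""
    rows = [haar_step(row) for row in image]
    columns = [haar_step(list(col)) for col in zip(*rows)]
    # Transpose back so the result is indexed [row][column] again.
    return [list(row) for row in zip(*columns)]


image = [
    [8, 8, 2, 2],
    [8, 8, 2, 2],
    [4, 4, 6, 6],
    [4, 4, 6, 6],
]
result = haar_2d_level(image)
for row in result:
    print(row)
# [8.0, 2.0, 0.0, 0.0]
# [4.0, 6.0, 0.0, 0.0]
# [0.0, 0.0, 0.0, 0.0]
# [0.0, 0.0, 0.0, 0.0]
```

Because this image is made of uniform 2x2 blocks, every detail coefficient is zero and only the 2x2 low subband carries information.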

Image compression

JPEG compression- both for grayscale and color images


Previous compression methods were lossless

-it was possible to recover all the information from the compressed code


JPEG is lossy: image recovered may not be the same as the original

JPEG Compression

It consists of three phases: Discrete Cosine Transform (DCT), Quantization, Encoding



-image is divided into blocks of 8*8 pixels

-for grey-scale images, pixel is represented by 8-bits

-for color images, pixel is represented by 24 bits, or three 8-bit groups


DCT takes an 8*8 matrix and produces another

Computes how rapidly the image changes as a function of distance

Reproduces the image as a sequence of values called spatial frequencies


Spatial frequencies directly relate to how much the pixel values change as a function of their positions in the block


The block of spatial frequencies can be directly related to the original signal (or simplified)


Quantization: Provides a way of ignoring small differences in an image that may not be perceptible



If the loss is minimal, the vision system may not notice

Retain the effects of lower spatial frequencies as much as possible - less subtle features noticed if changed
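
A rough sketch of the first two JPEG phases on a single 8*8 block. It uses the standard 2-D DCT-II formula and, as a simplifying assumption, one uniform quantization step for every coefficient. Real JPEG uses an 8*8 quantization table with larger steps for higher frequencies.

```python
import math

N = 8


def dct_2d(block):
    """2-D DCT-II of an N*N block (direct, unoptimized formula)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (
                        block[x][y]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                    )
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out


def quantize(coefficients, step):
    """Divide by the step and round, discarding small differences."""
    return [[round(value / step) for value in row] for row in coefficients]


flat_block = [[100] * N for _ in range(N)]  # a block with no spatial variation
coeffs = dct_2d(flat_block)
quantized = quantize(coeffs, 16)
print(quantized[0][0])                          # 50
print(sum(abs(q) for row in quantized for q in row) - quantized[0][0])  # 0
```

A block with no variation ends up as a single nonzero value (the zero-frequency term), and the remaining 63 zeros compress very well in the encoding phase.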

What is a GIS?

A GIS is a computer-based system including software, hardware, people, and geographic information


A GIS can: create, edit, query, analyze and display map information on the computer



Geographic- 80% of government data collected is associated with some location in space

Information- attributes, or the characteristics (data), can be used to symbolize and provide further insight into a given location

System- a seamless operation linking the information to the geography - which requires hardware, networks, software, data, and operational procedures


Not just software!

Not just for making maps!

Who uses GIS?

International organizations

-UN HABITAT, The World Bank, UNEP, FAO, WHO etc


Private industry

-Transport, Real Estate, Insurance, etc.



Government

-Ministries of Environment, Housing, Agriculture, etc.

-Local Authorities, Cities, Municipalities, etc.

-Provincial Agencies for Planning, Parks, Transportation, etc


Non-profit organizations/NGO's

-World Resources Institute, ICMA, etc.


Academic and Research Institutions

-Smithsonian Institution, CIESIN, etc.


What can you do with a GIS?



The possibilities are unlimited...

-Environmental impact assessment

-Resource management

-Land use planning

-Tax mapping

-Water and Sanitation Mapping

-Transportation routing


How does a GIS work?

GIS data has a spatial/geographic reference

-This might be a reference that describes a feature on the earth using:

-a lat/long

-a national coordinate system

-an address

-a district

-a wetland identifier

-a road name


Geography and Databases

-A GIS stores information about the world as a collection of thematic layers that can be linked together by geography


GIS provides data integration

Two fundamental types of data


Vector

-a series of X,Y coordinates

-for discrete data represented as points, lines, polygons



Raster

-grid and cells

-for continuous data such as elevation, slope surfaces


A desktop GIS should be able to handle both types of data effectively


The most common data format

Easy to perform mathematical and overlay operations

Satellite information is easily incorporated

Better represents "continuous" type data



Accurate positional information that is best for storing discrete thematic features (eg roads, shorelines, sea-bed features)

Compact data storage requirements

Can associate unlimited numbers of attributes with specific features

Other features of a GIS

Produce good cartographic products


Generate and maintain metadata


Use and share geoprocessing models


Managing data in a geodatabase using data models for each sector


GISs Don't Make Maps!!!

-Good to know something about these issues when creating a map and doing spatial analysis..



---basic cartographic principles regarding design, generalization, etc.

Spatial Data Infrastructure (SDI)

Definition- the technology, policies, standards, human resources, and related activities necessary to acquire, process, distribute, use, maintain, and preserve spatial data


Part of many nations' e-Gov strategy

Geospatial data
Are data describing both the locations and characteristics of spatial features such as roads, land parcels, and vegetation stands on the Earth's surface
Coordinate System

Spatial features on the Earth's surface are referenced onto a geographic coordinate system in longitude and latitude values.

When displayed on maps, spatial features are typically based on a projected coordinate system in x-y coordinates


Geographic and projected coordinate systems are connected by the process of projection, which transforms the Earth's spherical surface onto a plane surface.

Thousands of geographic and projected coordinate systems are in use.

Vector Data Model

The vector data model uses points and their x-y coordinates to represent discrete features with a clear spatial location and boundary, such as streams, land parcels, and vegetation stands.


Depending on the data structure, a vector data model can be georelational or object-based, with or without topology, and simple or composite.


The vector data model uses x-y coordinates to represent point features, and the raster data model uses cells in a grid to represent point features.


Adds Intelligence


Topology refers to the relationships or connectivity between spatial objects


GIS analysis answers many questions:

-Where is it?

-What is it next to?

-Is it inside or outside?

-How far is it from something else?


The mathematical terms for these answers are:







Topological: Coverage (geo-relational), geodatabase (object-based)

Non-topological: Shapefile (geo-relational), geodatabase (object-based)

Other spatial features

Built on compositing simple features (points, lines, polygons)

TINs are a good example (points and lines= triangles and topography)


Regions model

-allows overlapping spatial features in a single layer


Dynamic Segmentation

-process of transforming linearly referenced data (events) stored in a TABLE into spatial features

Utility company could segment pipeline condition based on a unique identifier and location along a linear feature.

GIS Operations
GIS activities can be grouped into spatial data input, attribute data management, data display, data exploration, data analysis, and GIS modeling
geospatial data

data that describe both the locations and the characteristics of spatial features such as roads, land parcels, and vegetation stands on the Earth's surface.


consists of spatial and attribute data



Describe the locations of spatial features, which may be discrete (do not exist spatially between observations) or continuous (exist spatially between observations)

Raster and Vector Data Structure

The raster model uses a simple data structure with rows and columns and fixed cell locations


The vector data model may be georelational or object-based, may or may not involve topology, and may include simple or composite features.

Georelational Data Model vs Object-based data model

Geo-relational DM:

Uses a split system to store spatial data and attribute data.


Object Based DM:

Stores spatial data and attribute data in a single system


Coverages (topological) and shapefiles (non-topological) are geo-relational types of data, while the geodatabase (either) is object-based


Joining Spatial and Attribute Data

The georelational data model stores attribute data separately from spatial data in a split system. The two data components are linked through the feature IDs.

The object-based data model stores spatial data as an attribute along with other attributes in a single system. Thus, it eliminates the complexity of coordinating and synchronizing two sets of data files as required in a split system.

Relational Database

Whether spatial and attribute data are stored in a split or single system, the relational database model is the norm for data management in GIS.


A relational database is a collection of tables (relations). The connection b/w tables is made through a key, a common field whose values can uniquely identify a record in a table. For example, the feature ID serves as the key in the georelational data model to link spatial data and attribute data.


A relational db is efficient and flexible for data search, retrieval, editing, and creating tabular reports. Each table in the database can be prepared, maintained, and edited separately from other tables. And the tables can remain separate until a query or analysis requires that the attribute data from different tables be linked or joined together.
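
A small sketch of joining two tables through a key, using Python's built-in sqlite3 module with an in-memory database. The table names, field names, and values are hypothetical. The point is that the tables stay separate until a query joins them on the feature ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One table holds the spatial side, the other the attributes.
conn.execute(
    "CREATE TABLE parcels_geom (feature_id INTEGER PRIMARY KEY, x REAL, y REAL)"
)
conn.execute(
    "CREATE TABLE parcels_attr (feature_id INTEGER PRIMARY KEY, land_use TEXT)"
)
conn.executemany(
    "INSERT INTO parcels_geom VALUES (?, ?, ?)",
    [(1, 10.0, 20.0), (2, 15.0, 25.0)],
)
conn.executemany(
    "INSERT INTO parcels_attr VALUES (?, ?)",
    [(1, "residential"), (2, "forest")],
)

# The feature ID is the key that links the two tables.
rows = conn.execute(
    "SELECT g.feature_id, g.x, g.y, a.land_use "
    "FROM parcels_geom AS g "
    "JOIN parcels_attr AS a ON g.feature_id = a.feature_id "
    "ORDER BY g.feature_id"
).fetchall()
print(rows)  # [(1, 10.0, 20.0, 'residential'), (2, 15.0, 25.0, 'forest')]
```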

GIS Operations

Spatial data Input

-data entry, editing, projection, etc


Attribute Data Management

-data entry and verification

-database management

-attribute data manipulation


Data Display

-Cartographic symbolization

-Map design


Data exploration

-Attribute data query

-spatial data query

-Geographic visualization


Data Analysis

-Vector data analysis: buffering, overlay, distance measurement, spatial statistics, map manipulation

-Raster Data analysis: local neighborhood, zonal, global, raster data manipulation

-terrain mapping and analysis

-viewshed and watershed

-spatial interpolation

-geocoding and dynamic segmentation

-path analysis and network applications


GIS Modelling

-Binary models

-index models

-regression models

-process models


Attribute Data Management

Attribute data reside as tables in a relational database


An attribute table is organized by row (spatial feature) and column (characteristic)

An ESRI data format for topological vector data
Dynamic Segmentation model
A data model that allows the use of linearly measured data on a coordinate system
The process of transforming from a geographic grid to a plane coordinate system
Relational Database
A collection of tables which can be connected to each other by attributes whose values can uniquely identify a record in a table.
Geographic Coordinate System

Is the location reference system for spatial features on the Earth's surface. The geographic coordinate system is defined by longitude and latitude.


Meridians are lines of equal longitude

-prime meridian passes through Greenwich, England


Parallels are lines of equal latitude


Entered as x (longitude) and y (latitude) values into a GIS. Longitude values are positive in the eastern hemisphere and negative in the western hemisphere.

Latitude values are positive north of the Equator and negative south of it.
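
A minimal sketch of the sign convention, assuming coordinates arrive as degrees, minutes, seconds plus a hemisphere letter. The function name and the sample location are illustrative only.

```python
def to_signed_degrees(degrees, minutes, seconds, hemisphere):
    """Convert degrees-minutes-seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    if hemisphere in ("W", "S"):
        return -value  # west longitudes and south latitudes are negative
    if hemisphere in ("E", "N"):
        return value
    raise ValueError("hemisphere must be one of N, S, E, W")


x = to_signed_degrees(114, 30, 0, "W")  # longitude goes in x
y = to_signed_degrees(51, 0, 0, "N")    # latitude goes in y
print(x, y)  # -114.5 51.0
```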


A datum is a known and constant surface which is used to describe the location of unknown points on the earth.


It can be thought of as a spheroid with an origin.


A datum is a mathematical model of the Earth, which serves as the reference or base for calculating the geographic coordinates of a location.


The definition of a datum consists of an origin, the parameters of the spheroid selected for the computations, and the separation of the spheroid and the earth at the origin.


Until 1980, Clarke 1866 was the standard spheroid for mapping the US. NAD27 is a local datum based on this spheroid, with its origin in Kansas.


In 1986, NAD83 was introduced, an earth-centred datum based on the GRS80 spheroid.

-more accurate representation of the earth's shape


WGS84 is similar to GRS80, but has parameters that refer to local datums used in different countries. WGS84 is the datum used for GPS readings.

Map projections (book) & projection types based on preserved properties

The process of projection transforms the spherical Earth's surface to a plane.


Cartographers group map projections by the preserved property into the following four classes:


Conformal projection: preserves local angles and shapes


Equivalent projection: represents areas in correct relative size


Equidistant projection: maintains consistency of scale along certain lines.

Azimuthal projection: retains certain accurate directions


****The conformal and equivalent properties are mutually exclusive****

-these are global


The equidistant and azimuthal properties can be combined with others, though they are local properties.

Central line

the standard line should not be confused with the central line. Whereas the standard line dictates the distribution pattern of projection distortion, the central lines (the central parallel and meridian) define the centre of map projection.


In a secant projection, the standard lines have a scale factor of 1, and the central line has a scale factor of less than 1


The centre of the map projection, as defined by the central parallel and the central meridian, becomes the origin of the coordinate system and divides the coordinate system into four quadrants.


To avoid having negative coordinates, we can assign false eastings and false northings.
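
A short sketch of what a false easting and false northing do. The offsets used are the UTM values (500,000 m easting, and 10,000,000 m northing in the southern hemisphere). The input coordinates are made up.

```python
def apply_false_origin(x, y, false_easting, false_northing):
    """Shift projected coordinates so that values in the zone stay positive."""
    return x + false_easting, y + false_northing


# A point west of the central meridian and south of the equator would
# otherwise have negative x and y.
shifted = apply_false_origin(-120_000.0, -35_000.0, 500_000.0, 10_000_000.0)
print(shifted)  # (380000.0, 9965000.0)
```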

Projected Coordinate System

AKA Plane coordinate System


Built on a map projection. PCSs and Map Projections are often used interchangeably.

It is a plane coordinate system that is based on a map projection.


A PCS is often divided into different zones, with each zone defined by a different projection centre.


A projected coordinate system is defined not only by the parameters of the map projection it is based on, but also the parameters (ie datum) of the geographic coordinate system that the map projection is derived from.


Include the UTM Grid System, the Universal Polar Stereographic (UPS) grid system, and the State Plane Coordinate (SPC) system, among others.


Because datum is part of the definition of a projected coordinate system, the UTM grid system may be based on NAD27, NAD83, or WGS84.

Working with Coordinate systems in GIS

Basic GIS tasks with coordinate systems involve defining a coordinate system, projecting geographic coordinates to projected coordinates, and reprojecting projected coordinates from one system to another.


A GIS system typically has many options of datums, spheroids and coordinate systems (combinations of these three variables can be up to 3000 options)


An early application of topology in preparing geospatial data is the TIGER database. It contains legal and statistical area boundaries such as counties, census tracts, and block groups, which can be linked to the census data, as well as roads, railroads, streams, water bodies, powerlines, and pipelines.


In the TIGER database, points are called 0-cells, lines 1-cells, and areas 2-cells. Each 1-cell is a directed line, meaning that the line is directed from a starting point toward an end point with an explicit left and right.

ESRI's coverage model

Was introduced to separate GIS from computer-aided design (CAD).

Coverage is a topology-based vector data format. A coverage can be a point coverage, line coverage, or polygon coverage. The coverage model supports three basic topological relationships:

-connectivity (arcs connect at nodes)

-area definition (area defined by connected arcs)

-contiguity (arcs have directions and left/right polygons)
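
A rough sketch of how these three relationships can be read from an arc table. The record layout, the ID numbers, and the use of polygon 0 for the outside area are assumptions for illustration, not the actual coverage file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    arc_id: int
    from_node: int   # arcs have direction: from-node to to-node
    to_node: int
    left_poly: int   # polygon on the left when travelling along the arc
    right_poly: int  # polygon on the right


def arcs_at_node(arcs, node):
    """Connectivity: arcs that meet at a node."""
    return [a.arc_id for a in arcs if node in (a.from_node, a.to_node)]


def arcs_of_polygon(arcs, poly):
    """Area definition: arcs that bound a polygon."""
    return [a.arc_id for a in arcs if poly in (a.left_poly, a.right_poly)]


def neighbours(arcs, poly):
    """Contiguity: polygons on the other side of the bounding arcs."""
    result = set()
    for a in arcs:
        if a.left_poly == poly:
            result.add(a.right_poly)
        elif a.right_poly == poly:
            result.add(a.left_poly)
    return result


arcs = [
    Arc(1, 11, 12, 0, 101),    # outer boundary of polygon 101
    Arc(2, 12, 11, 102, 101),  # shared boundary between 101 and 102
    Arc(3, 12, 11, 0, 102),    # outer boundary of polygon 102
]
print(arcs_at_node(arcs, 12))      # [1, 2, 3]
print(arcs_of_polygon(arcs, 101))  # [1, 2]
print(neighbours(arcs, 101))       # {0, 102}
```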


Coverages can be converted to shapefiles, and vice versa.

Composite features

Composite features refer to those spatial features that are better represented as composites of points, lines, and polygons.



TIN (triangulated irregular network): approximates the terrain with a set of nonoverlapping triangles. Each triangle assumes a constant gradient.

-points and lines represent elevation
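
A minimal sketch of the constant gradient of one triangle: fit the plane through its three (x, y, z) corner points and read the slope off the plane's normal vector. The corner coordinates are made up.

```python
import math


def triangle_slope_percent(p1, p2, p3):
    """Slope (rise over run, in percent) of the plane through three points."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = p1, p2, p3
    # Two edge vectors of the triangle.
    ux, uy, uz = x2 - x1, y2 - y1, z2 - z1
    vx, vy, vz = x3 - x1, y3 - y1, z3 - z1
    # Their cross product is normal to the triangle's plane.
    nx = uy * vz - uz * vy
    ny = uz * vx - ux * vz
    nz = ux * vy - uy * vx
    if nz == 0:
        raise ValueError("triangle is vertical or degenerate")
    return 100 * math.hypot(nx, ny) / abs(nz)


# The surface rises 1 m over 10 m in the x direction and is level in y.
slope = triangle_slope_percent((0, 0, 100), (10, 0, 101), (0, 10, 100))
print(slope)  # 10.0
```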

Object-Based Data Model

Geodatabase model as opposed to Georelational Database


Spatial and attribute data are held together.


The data model treats spatial data as objects. Objects can represent a spatial feature such as a road, a timber stand, or a hydrologic unit.


Major differences:

1)Object-based data model stores both spatial and attribute data of spatial features in a single system rather than a split system.

2)Allows a spatial feature (object) to be associated with a set of properties and methods. A property describes an attribute or characteristic of an object (ie. shape and extent). A method performs a specific action (ie. delete).


A class (in the object-based data model) is a set of objects with similar attributes.


Operationally, a class defines the properties and methods of objects that are members of the class.

Relationships between classes


Association

-defines how many instances of one class can be associated with the other class through multiplicity expressions at both ends of the relationship.



Aggregation

-describes the whole-part relationships between classes. Aggregation is a type of association except that the multiplicity at the composite "whole" end is typically 1 and the multiplicity at the other "part" end is zero or any positive integer. For example, a census tract is an aggregate of a number of census blocks.



Composition

-Similar to aggregation in that the multiplicity at the composite end is 1 and the multiplicity at the other end is zero or any positive integer. But the composite in a composition relationship solely owns the part. For example, a highway can have 0 or any number of roadside rest areas. The highway controls the lifetime of a roadside rest area.


Type Inheritance

-defines the relationship between a superclass and a subclass.



Instantiation

-means that an object of a class can be created from an object of another class. (ie. a high density residential area object can be created from a residential area object)

Geodatabase model

The geodatabase data model is the third major data model offered by ESRI following the coverage model in the 1980s and the shapefile in the 1990s.



The geodatabase data model distinguishes between "feature class" and "feature dataset" in data structure. A feature class stores spatial data of the same geometry type. A feature dataset stores feature classes that share the same coordinate system and area extent.


A feature class does not have to be included in a feature dataset. Feature classes that are included in a feature data set typically participate in topological relationships with each other.


A feature class is like a .shp in having simple features, and a feature dataset is similar to a coverage in having multiple data sets that are based on the same coordinate system and area extent. But a feature dataset can contain different theme layers, whereas a coverage contains only different parts of a single layer such as arcs, nodes, and tics.

Topology and geodatabase model

Defines topology as relationship rules and lets the user choose the rules, if any, to be implemented in a feature dataset.


Different from the coverage model, which enforces three topological relationships with its data structure.

Advantages of the Geodatabase model

Built on ArcObjects. A geodatabase can therefore take advantage of new functionalities from object-oriented technology.


-group objects into subtypes by a valid range of values or a valid set of values for an attribute

-connect objects that are associated

-build geometric networks such as streams, roads, and water utilities.


A geodatabase provides a convenient framework for storing and managing different types of GIS data. Besides vector data, a geodatabase can also store raster data, TINs, location data, and attribute tables.


Eliminates the complexity of coordinating b/w the spatial and attribute components, thus reducing the processing overhead.

Cell Value

Each cell in a raster carries a value, which represents the characteristic of a spatial phenomenon at the location denoted by its row and column. Depending on the coding of its cell values, a raster can be either an integer (no decimal digits) or a floating-point (has decimal digits) raster.


Integer = used for categorical data

Floating point = continuous, numeric data


Each type has advantages and disadvantages (space take-up, ease of query, raster value availability in attribute table, etc).



-A raster may have a single band or multiple bands. If it has multiple bands, each cell has multiple cell values (one for each band).

Raster Data Structure

Refers to the method of raster storage


CELL by CELL Encoding

Provides the simplest raster data structure. A raster is stored as a matrix and its cell values are written into a file by row and column.

DEMs use this, because neighbouring elevation values are rarely the same.



Run-Length Encoding

The cell-by-cell encoding method becomes inefficient if a raster contains many redundant cell values.

Run-length encoding records the cell values by row and by group.



Quad Tree

Uses recursive decomposition to divide a raster into a hierarchy of quadrants. Recursive decomposition refers to a process of continuous subdivision until every quadrant in a quad tree contains only one cell value.

A quad tree contains nodes and branches
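
A minimal sketch of recursive decomposition, assuming a square raster whose side is a power of two. Leaves are stored as the single cell value, branches as a list of four quadrants in NW, NE, SW, SE order. That ordering and representation are choices made for this example.

```python
def build_quadtree(raster, row=0, col=0, size=None):
    """Subdivide until every quadrant contains only one cell value."""
    if size is None:
        size = len(raster)
    first = raster[row][col]
    uniform = all(
        raster[r][c] == first
        for r in range(row, row + size)
        for c in range(col, col + size)
    )
    if uniform:
        return first  # leaf node: the whole quadrant has one value
    half = size // 2
    return [  # branch node with four children
        build_quadtree(raster, row, col, half),                # NW
        build_quadtree(raster, row, col + half, half),         # NE
        build_quadtree(raster, row + half, col, half),         # SW
        build_quadtree(raster, row + half, col + half, half),  # SE
    ]


raster = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
tree = build_quadtree(raster)
print(tree)  # [1, 0, 0, [0, 1, 1, 1]]
```

Three of the four top-level quadrants are uniform and are stored as a single value; only the mixed SE quadrant is subdivided further.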

Data Compression

Refers to the reduction of data volume, a topic particularly important for data delivery and Internet mapping.


Lossless compression allows the original image to be precisely reconstructed.


Lossy compression cannot reconstruct fully the original image but can achieve high compression ratios.


wavelet transform: treats an image as a wave and progressively decomposes the wave into simpler wavelets. Using a wavelet function, the transform repetitively averages groups of adjacent pixels and, at the same time, records the differences between the original pixel values and the average.

Data Conversion
rasterization and vectorization
JPEG Compression (cont)

Why do it?

If the loss is minimal, the vision system may not notice.


Retain the effects of lower spatial frequencies as much as possible - less subtle features noticed if changed.


Many such 8*8 arrays, adjacent blocks with little difference => more potential for compression.


JPEG may provide 95% compression (depends on image and quantization array).


GIF (Graphics Interchange Format) reduces color to 256 colors. Best suited for images with a few colors and sharp boundaries (charts, lines). Not good for variations and shading, such as a full color photo.

Check out slide on Feb 14!! Great slide near beginning of raster vs vector data models.... lec 15?
Rate Measures

Original image: 6000*5000pel

Rate: 3 byte/pixel * 8 bit/byte = 24 bit/pixel

File size: 6000*5000*3 = 90MB


Compressed image: Rate= 0.53 bit/pixel, then file size goes down to 2MB..


Compression ratio: 45 (90/2)


45:1 (pretty standard)
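
The arithmetic behind these rate measures, written out as a quick check. It takes 1 MB as 1,000,000 bytes, which is what the 90MB figure above implies.

```python
width, height = 6000, 5000
pixels = width * height

original_bits_per_pixel = 3 * 8                          # three 8-bit bands
original_bytes = pixels * original_bits_per_pixel // 8   # 90,000,000 bytes

compressed_bits_per_pixel = 0.53
compressed_bytes = pixels * compressed_bits_per_pixel / 8  # about 2 MB

ratio = original_bytes / compressed_bytes
print(original_bytes / 1e6)            # 90.0 (MB)
print(round(compressed_bytes / 1e6))   # 2 (MB)
print(round(ratio))                    # 45
```

The ratio can also be read straight from the rates: 24 / 0.53 is about 45.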

File Formats for Raster maps

GeoTIFF: Tagged Image File Format

DRG: Digital Raster Graphics

ECW: Enhanced Compression Wavelets

MrSID: Multiresolution Seamless Image Database



(Raster image + georeferencing data)

GeoTIFF (Tagged Image File Format)

Tagged Image File Format


GeoTIFF is a standard for storing georeferenced image data


It is based on the standard TIFF image format with an extra header.


Tags used to store various types of georeferencing information.


Digital Raster Graphics


Designed to work for Scanned topographic maps


USGS invented it


Images are stored in TIFF version 6


PackBits compression: run-length encoding


Georeferencing information: GeoTIFF, vers. 1.0


Enhanced Compression Wavelet


Developed by Earth Resource Mapping (no longer exists)


ECW for compressing and managing very large images


Tested up to 50 terabytes

MrSiD, JPEG2000, etc


-developed by LizardTech

-available in ArcGIS, ArcView, ArcObjects

-Wavelet based image compression


JPEG: DCT-based image compression

JPEG2000: wavelet-based image compression


DjVu: combined scheme for graphics and halftone (wavelet based)



**Most new compression is wavelet based**

Compression Remarks

We cannot compress all data. Thus, we must concentrate on compressing "relevant" data.


Where, When? (exam Long Q)

ie. don't JPEG compress a line file


It is trivial to compress data known in advance. We should concentrate on compressing data about which there is uncertainty.


We will use probability theory as a tool to model uncertainty about relevant data.




Rectification is a process of transforming the data from one grid system (columns and rows) into another (map coordinate system) using a geometric transformation model.


Georeferencing (geo-registration)

-used to be called vertical integration


Rectification and Resampling are the two steps needed. One step happens in a raster environment, the other in a vector environment.

Ground Control Points

1)Find Ground Control Points (GCP) on the aerial photo


2)Match the GCP to the features on the base map.


Points are chosen on the image that can be matched to points on the base map


Road intersections and other cultural features are preferred as reference points rather than natural features.


GPS- Ground Survey (Paint big L's on benchmarks - this is a million times better than digitizing)

Geometric Transformation Models

Many different options..


1)Affine model

2)Plane projection model

3)Polynomial model

4)piecewise affine model

5)etc. (ie. ortho)


#1 provides most of the relatively simple framework for what we need to do

Affine Model

Can do four things with this model..


-rescaling (stretch or compress)




You can also do all of these things at once. Difficult to show graphically.


The equations don't allow you to curve the photo. Thus the affine model is not so great for mountainous regions.


We have a set of points from one, and a set of points from another. We do all of the transformation stuff in a vector world. You've digitized a set of points. The vectors are distorted, not the rasters.
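
A minimal sketch of applying an affine model, assuming the usual six-coefficient form X = a*x + b*y + c, Y = d*x + e*y + f. The coefficient values are hypothetical (2 m pixels, rows counted downward from a made-up upper-left corner). In practice they would be estimated from the ground control points.

```python
def apply_affine(coeffs, x, y):
    """Map image coordinates (x, y) to map coordinates (X, Y)."""
    a, b, c, d, e, f = coeffs
    return a * x + b * y + c, d * x + e * y + f


# a, e: scaling; b, d: rotation and skew terms; c, f: shift of origin.
coeffs = (2.0, 0.0, 500_000.0, 0.0, -2.0, 6_000_000.0)

start = apply_affine(coeffs, 0, 0)
end = apply_affine(coeffs, 20, 40)
middle = apply_affine(coeffs, 10, 20)
print(start)   # (500000.0, 6000000.0)
print(end)     # (500040.0, 5999920.0)
print(middle)  # (500020.0, 5999960.0)
```

The midpoint in the image lands exactly halfway between the two transformed end points: the equations are linear, so straight lines stay straight and the model cannot bend the photo.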
