Wednesday 16 December 2009

3.11 References and Resources

Blog
http://eleanor-mclachlan-dita.blogspot.com/
Webspace
http://www.student.city.ac.uk/~abgy297/first.html
JavaScript application
http://www.student.city.ac.uk/~abgy297/javascript.html


3.1 Introduction

Web 2.0 technology
http://en.wikipedia.org/wiki/Web_2.0
John Kitson’s blog
http://www.insurancetimes.co.uk/thejohnkitsonblog/


3.2 Text/ HTML

ASCII
http://en.wikipedia.org/wiki/ASCII
ESRI ASCII Raster format
http://resources.esri.com/help/9.3/arcgisengine/java/GP_ToolRef/spatial_analyst_tools/esri_ascii_raster_format.htm
ESRI Whitepaper on metadata
http://www.esri.com/library/whitepapers/pdfs/metadata-and-gis.pdf


3.3 Internet/WWW

Hypertext markup language
http://en.wikipedia.org/wiki/HTML
Internet
http://www.walthowe.com/navnet/history.html
World Wide Web
http://en.wikipedia.org/wiki/History_of_the_World_Wide_Web
Digital divide
http://commons.wikimedia.org/wiki/File:Global_Digital_Divide1.png
Webpages
http://www.student.city.ac.uk/~abgy297/first.html
http://www.student.city.ac.uk/~abgy297/index.html
http://www.student.city.ac.uk/~abgy297/Cat.html


3.4 Images and Graphics

GIF
http://en.wikipedia.org/wiki/Gif
JPEG
http://en.wikipedia.org/wiki/JPEG
PNG
http://en.wikipedia.org/wiki/Portable_Network_Graphics
Other image formats used in GIS
http://en.wikipedia.org/wiki/GIS_file_formats


3.5 XML

XML
http://en.wikipedia.org/wiki/XML
GML
http://gislounge.com/geography-markup-language-gml-20-enabling-the-geo-spatial-web/
OS Mastermap in GML
http://www.ordnancesurvey.co.uk/oswebsite/business/sectors/wireless/news/articles/pdf/OS%20MasterMap%20in%20GML.pdf


3.6 CSS

Cascading style sheets
http://en.wikipedia.org/wiki/Cascading_Style_Sheets
Examples of a blog post using different style sheets
http://www.student.city.ac.uk/~abgy297/css_blog_inverse.html
http://www.student.city.ac.uk/~abgy297/css_blog_large.html
http://www.student.city.ac.uk/~abgy297/css_blog_own.html


3.7 Databases

Program data dependence
http://wiki.answers.com/Q/What_is_program_data_dependence
Spatial databases
http://en.wikipedia.org/wiki/Spatial_database
Spatial database engines
http://www.esri.com/software/arcgis/arcsde/index.html


3.8 Information Retrieval

Stop words
http://www.webconfs.com/stop-words.php
Stemming
http://en.wikipedia.org/wiki/Stemming
Inverted file
http://en.wikipedia.org/wiki/Inverted_index
Google’s three distinct parts
http://www.googleguide.com/google_works.html
Confusion about Google’s use of stop words
http://searchengineland.com/conjunction-junction-google-no-longer-displays-stop-words-malfunction-13161


3.9 Client side programming

My JavaScript program
http://www.student.city.ac.uk/~abgy297/javascript.html


3.10 Information Architectures

Grid indexing
http://en.wikipedia.org/wiki/Grid
Quadtree indexing
http://en.wikipedia.org/wiki/Quadtree
R-Tree indexing
http://en.wikipedia.org/wiki/R-tree
ESRI Whitepaper
http://www.esri.com/library/whitepapers/pdfs/metadata-and-gis.pdf
ArcCatalog
http://webhelp.esri.com/arcgisdesktop/9.2/index.cfm?TopicName=An_overview_of_ArcCatalog

Tuesday 8 December 2009

3.10 Information Architectures

Information architecture concerns the organisation, labelling and navigation of information within an information system. Organising information so that it can be retrieved efficiently is becoming increasingly important as data volumes continue to grow. This is especially relevant in GIS, where the vast quantities of data available make it imperative that only the relevant information is returned to the user, so that queries execute and maps draw swiftly. Geographic information stored in databases is indexed for this purpose. Grid, Quadtree and R-tree indices all organise geographic information according to its spatial location to speed up queries and the return of information.
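To make the idea of a spatial index concrete, here is a minimal sketch of the simplest of the three, a fixed-size grid index for point features. The cell size, feature ids and coordinates are made-up assumptions, and the function names are hypothetical; real GIS databases use far more refined structures.

```javascript
// Minimal sketch of a fixed-size grid index for point features.
// cellSize, the feature ids and the coordinates are illustrative assumptions.
function cellKey(x, y, cellSize) {
  return Math.floor(x / cellSize) + ":" + Math.floor(y / cellSize);
}

// Group the points by the grid cell they fall in.
function buildGridIndex(points, cellSize) {
  const index = new Map();
  for (const p of points) {
    const key = cellKey(p.x, p.y, cellSize);
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(p);
  }
  return index;
}

// Return the ids of points inside the box, visiting only the overlapping cells.
function queryBox(index, cellSize, minX, minY, maxX, maxY) {
  const hits = [];
  for (let cx = Math.floor(minX / cellSize); cx <= Math.floor(maxX / cellSize); cx++) {
    for (let cy = Math.floor(minY / cellSize); cy <= Math.floor(maxY / cellSize); cy++) {
      for (const p of index.get(cx + ":" + cy) || []) {
        if (p.x >= minX && p.x <= maxX && p.y >= minY && p.y <= maxY) hits.push(p.id);
      }
    }
  }
  return hits;
}

const points = [
  { id: "a", x: 120, y: 80 },
  { id: "b", x: 130, y: 95 },
  { id: "c", x: 910, y: 640 },
];
const gridIndex = buildGridIndex(points, 100);
console.log(queryBox(gridIndex, 100, 100, 50, 150, 100)); // [ 'a', 'b' ]
```

The query only looks inside the two cells that overlap the search box, so point "c" is never examined. That is the saving a spatial index is meant to provide.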

The labelling of geographic information takes the form of metadata, literally defined as data about data. In GIS, metadata is often stored as a separate XML file containing information about a file's content, quality, type and creation, together with spatial information such as the coordinate system. An ESRI whitepaper describes metadata as making “spatial information more useful to all types of users by making it easier to document and locate data sets” (ESRI, 2002). Metadata is thus another form of information architecture that facilitates the efficient organisation and effective retrieval of spatial data.

Navigation of geographic information is handled in ESRI's GIS software by the ArcCatalog application. Spatial data often consists of a series of files; for example, a single polygon shapefile can consist of up to seven separate files, including an index file, a projection file and a geometry file. ArcCatalog displays all these separate pieces of information as a single file, making storage, organisation and editing simpler. The application allows you to browse and find geographic information from various sources, such as databases, the internet and local storage; view and manage metadata; and manage datasets and data sources.

3.9 Client Side Programming

My JavaScript application uses three functions declared in the head of the HTML document. The first, ‘newsorsport’, produces a prompt box asking the user whether they are interested in news or sport and requesting that they enter 1 for news or 2 for sport. The user's input is stored in a variable and converted to an integer and, depending on the value, either the function ‘area’ or the function ‘sporttype’ is called. Both of these functions request further input from the user via a prompt box and display a link depending on that input.
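The flow described above can be sketched roughly as follows. The three function names come from the description; the prompt wording, the second-level choices and the example.com URLs are assumptions, not the contents of the actual page. The `ask` parameter is a hypothetical stand-in for the browser's prompt box so that the logic can run outside a browser.

```javascript
// Sketch of the described flow. Prompt wording, choices and example.com URLs
// are assumptions. `ask` stands in for the browser's prompt box.
function area(ask) {
  const choice = parseInt(ask("Enter 1 for local news or 2 for national news"), 10);
  if (choice === 1) {
    return "http://example.com/news/local";
  } else if (choice === 2) {
    return "http://example.com/news/national";
  }
  return null; // unrecognised input
}

function sporttype(ask) {
  const choice = parseInt(ask("Enter 1 for football or 2 for cricket"), 10);
  if (choice === 1) {
    return "http://example.com/sport/football";
  } else if (choice === 2) {
    return "http://example.com/sport/cricket";
  }
  return null; // unrecognised input
}

function newsorsport(ask) {
  // prompt boxes return text, so convert the answer to an integer first
  const choice = parseInt(ask("Enter 1 for news or 2 for sport"), 10);
  if (choice === 1) {
    return area(ask);
  } else if (choice === 2) {
    return sporttype(ask);
  }
  return null; // unrecognised input
}

// Simulate a user who types "2" and then "1".
const answers = ["2", "1"];
const link = newsorsport(() => answers.shift());
console.log(link); // http://example.com/sport/football
```

In a browser the same functions could be called as `newsorsport((message) => window.prompt(message))`, with the returned URL written into the page as a link.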


I found the following aspects of creating the program particularly challenging:
  • Whether to put all the script in the body section of the HTML document or to define functions in the head and then call them in the body
  • The syntax of if statements. In VBA (of which I have some limited experience) 'If ... Then ... Else' is used, whereas JavaScript has no 'Then': the condition goes in round brackets after 'if' and the statements go in curly brackets
  • Getting the curly brackets in the correct place
  • Remembering to use double equals signs (==) for comparison, since a single equals sign assigns a value
  • Getting the anchor-tagged URLs to work. They need to be in the paragraph tag, and the whole phrase needs to sit within double quotation marks while the URL sits within single quotes
The fruits of all my labours (and frustrations) can be found here: http://www.student.city.ac.uk/~abgy297/javascript.html
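The syntax points in the list above can be illustrated with a short sketch. The function name, link text and example.com URLs are made up for illustration and are not taken from the application itself.

```javascript
// Illustrative only: the function name, link text and URLs are assumptions.
function linkFor(choice) {
  // The condition sits in round brackets and there is no 'Then'.
  // Comparison needs == (or the stricter ===); a single = would assign.
  if (choice === 1) {
    // Double quotes around the whole string, single quotes around the URL.
    return "<p><a href='http://example.com/news'>News</a></p>";
  } else {
    return "<p><a href='http://example.com/sport'>Sport</a></p>";
  }
}

console.log(linkFor(1));
```

Writing `if (choice = 1)` by mistake would set choice to 1 and always take the first branch, which is why the double equals sign matters.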

Sunday 29 November 2009

3.8 Information Retrieval

Information retrieval refers to the retrieval of unstructured information relevant to a particular user’s requirements. Because the relevance of the results is subjective, it is probabilistic, whereas querying a database for structured information is deterministic. For example, many users may enter the same search terms into a search engine while actually looking for different information, whereas several users querying an RDBMS with the same SQL should be attempting to retrieve the same information.

To facilitate the efficient retrieval of unstructured information such as text, the information has to be indexed. This involves identifying the relevant fields and words to index and preparing the text by removing stop words, stemming and identifying synonyms. The most widely used type of index is the inverted file, an index of searchable terms in which each term has a list of the documents that contain it.
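A minimal sketch of these preparation steps and of an inverted file is given below. The stop-word list, the very crude suffix-stripping "stemmer" and the three sample documents are all assumptions chosen for illustration, and synonym handling is left out entirely. Real search engines use much more careful versions of each step.

```javascript
// Minimal inverted-index sketch. The stop-word list, the crude "stemmer" and
// the sample documents are illustrative assumptions.
const STOP_WORDS = new Set(["the", "a", "of", "and", "in", "is", "to"]);

// Very rough stemming: strip a trailing "ing" or "s" from longer words.
function stem(word) {
  if (word.length > 5 && word.endsWith("ing")) return word.slice(0, -3);
  if (word.length > 3 && word.endsWith("s")) return word.slice(0, -1);
  return word;
}

// Lower-case the text, split it into words, drop stop words, then stem.
function tokenize(text) {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((w) => w && !STOP_WORDS.has(w))
    .map(stem);
}

// Map each term to the list of ids of the documents that contain it.
function buildInvertedIndex(docs) {
  const index = new Map();
  for (const [id, text] of Object.entries(docs)) {
    for (const term of new Set(tokenize(text))) {
      if (!index.has(term)) index.set(term, []);
      index.get(term).push(id);
    }
  }
  return index;
}

const docs = {
  d1: "Indexing the maps of London",
  d2: "A map index is useful",
  d3: "Rivers and roads in London",
};
const invertedIndex = buildInvertedIndex(docs);
console.log(invertedIndex.get("map"));    // [ 'd1', 'd2' ]
console.log(invertedIndex.get("london")); // [ 'd1', 'd3' ]
```

A search for "maps" would be tokenised in the same way, reduced to "map", and answered by a single lookup in the index rather than by scanning every document.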

To find resources for my DITA blog I have relied mainly on Google. Google has three distinct parts: Googlebot, the web crawler that finds and retrieves web pages; the indexer, which sorts through the full text of web pages and stores search terms in a massive database; and the query processor, which carries out the search by comparing the entered terms with the index. There is currently some confusion about Google’s use of stop words. Google used to ignore stop words automatically, but informed you that it was doing so and gave you the option to repeat the search with the words included. This message no longer appears, and it is unclear whether Google has stopped using stop words and now indexes every single word, or whether it still uses them but simply doesn’t tell the searcher.

Tuesday 17 November 2009

3.7 Databases

Before the advent of the database approach in the early 1970s, data users had no means of centrally storing and sharing information, which led to duplication, inaccuracies and program data dependence. A Database Management System (DBMS) is a suite of software programs that allows information to be stored, organised and accessed in a systematic and consistent way in a central location, so that numerous users can access the same data. This increases efficiency by removing the duplication and inaccuracies that come with maintaining multiple tables of the same information.

In GIS, the development of spatial databases and spatial database engines has enabled geographic data to be stored alongside non-spatial database tables within a single DBMS, thus driving the integration of spatial information. Using SQL, information can be retrieved from both spatial and non-spatial data simultaneously. For example, if we wanted to view a database table of customer addresses on a map, the table could be joined to a spatial table of addresses. As long as the spatial attribute field(s) are included in the output table (either the numeric coordinates or a proprietary geometry field), the table can be imported into a GIS and the customers' addresses viewed spatially. The following SQL query retrieves the customer number field from Customer_table and the coordinates from Address_table and writes them to an output table called Customer_location.

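-- join customers to the spatial address table on the shared address field
create table customer_location as
select customer_table.customer_number, address_table.xcoord, address_table.ycoord
from customer_table join address_table
on customer_table.address = address_table.address;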

Customer_table

Customer_number  Address            Postcode
NR173974         45 Laurel Avenue   HP1584
TM184903         7 High Street      E45GE
HA194829         Mill Cottage       IP76CD
MX960417         11 Vincent Street  HP114YE


Address_table

ID          Address            Postcode  Xcoord  Ycoord
ODFD197843  45 Laurel Avenue   HP1584    816304  497628
BNBV497553  7 High Street      E45GE     794382  201975
ASTT796962  Mill Cottage       IP76CD    794682  412876
PEKD969710  11 Vincent Street  HP114YE   994685  325874

Customer_location

Customer_number  Xcoord  Ycoord
NR173974         816304  497628
TM184903         794382  201975
HA194829         794682  412876
MX960417         994685  325874
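To trace how the join arrives at Customer_location, the same matching can be written out over the sample rows in plain JavaScript. This is a sketch of what the query does logically, not of how a DBMS executes it. The function name is hypothetical, and the lookup assumes each address appears only once in Address_table, as it does in the sample data.

```javascript
// Sample rows copied from the tables above, with lower-cased column names.
const customerTable = [
  { customer_number: "NR173974", address: "45 Laurel Avenue", postcode: "HP1584" },
  { customer_number: "TM184903", address: "7 High Street", postcode: "E45GE" },
  { customer_number: "HA194829", address: "Mill Cottage", postcode: "IP76CD" },
  { customer_number: "MX960417", address: "11 Vincent Street", postcode: "HP114YE" },
];
const addressTable = [
  { id: "ODFD197843", address: "45 Laurel Avenue", postcode: "HP1584", xcoord: 816304, ycoord: 497628 },
  { id: "BNBV497553", address: "7 High Street", postcode: "E45GE", xcoord: 794382, ycoord: 201975 },
  { id: "ASTT796962", address: "Mill Cottage", postcode: "IP76CD", xcoord: 794682, ycoord: 412876 },
  { id: "PEKD969710", address: "11 Vincent Street", postcode: "HP114YE", xcoord: 994685, ycoord: 325874 },
];

// Inner join on the address field, keeping only the three selected columns.
// Assumes addresses are unique in the address table.
function joinOnAddress(customers, addresses) {
  const byAddress = new Map(addresses.map((a) => [a.address, a]));
  const output = [];
  for (const c of customers) {
    const match = byAddress.get(c.address);
    if (match) {
      output.push({
        customer_number: c.customer_number,
        xcoord: match.xcoord,
        ycoord: match.ycoord,
      });
    }
  }
  return output;
}

const customerLocation = joinOnAddress(customerTable, addressTable);
console.log(customerLocation);
```

Customers whose address has no match in the address table are simply left out, which is the behaviour of the plain join in the query.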

Sunday 8 November 2009

3.6 CSS

Cascading Style Sheets (CSS) are a means of describing the aesthetic and stylistic aspects of a web page, using a defined syntax to instruct a browser how to display the page's contents: for example, the font type, size and colour, the background colour and the layout.

Style sheets bring efficiency to web design because one sheet can be applied to any number of documents. A whole website can reference the same CSS file and adhere to the same stylistic rules, giving it a distinct aesthetic feel. The term ‘cascade’ refers to the fact that numerous style sheets can be referenced in the same document; the browser reads the sheets in order, so where rules conflict, those in earlier sheets are overridden by those in later ones. Styles can be included in an HTML document in three ways: as an external CSS file to which the HTML points using the link tag, within the document using the style tag, or directly on an element via the style attribute.
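The ordering rule can be modelled very roughly in a few lines. This is a simplified sketch: real browsers also weigh selector specificity, !important and the origin of each sheet, none of which is modelled here. The function name and sheet contents are made up for illustration.

```javascript
// Simplified model of the cascade: each sheet maps a selector to declarations
// and sheets are applied in the order they are referenced. Specificity,
// !important and origin are ignored. The sheet contents are made-up examples.
function resolveStyles(sheets, selector) {
  const result = {};
  for (const sheet of sheets) {
    // later sheets overwrite any property already set by earlier ones
    Object.assign(result, sheet[selector] || {});
  }
  return result;
}

const baseSheet = { body: { color: "black", "font-size": "12px" } };
const largeSheet = { body: { "font-size": "18px" } };

console.log(resolveStyles([baseSheet, largeSheet], "body"));
// { color: 'black', 'font-size': '18px' }
```

The colour from the first sheet survives because the second sheet says nothing about it, while the font size is replaced by the later value.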

Pros of Cascading Style Sheets:
  • Separate style from content so HTML remains legible and accessible to all users (e.g. visually impaired users can use screen readers or apply different style sheets to the content)
  • Make it easy to change the look of webpages
  • Can be applied to any number of documents, improving efficiency as code only has to be written once
  • Reduce network traffic: a sheet applied to numerous pages only has to be downloaded once

Cons of Cascading Style Sheets:
  • Different browsers treat some of the styling instructions in different ways
  • Earlier versions of Internet Explorer don’t support CSS well
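
Examples of Cascading Style Sheets:
This blog post displayed using three different style sheets:
http://www.student.city.ac.uk/~abgy297/css_blog_inverse.html
http://www.student.city.ac.uk/~abgy297/css_blog_large.html
http://www.student.city.ac.uk/~abgy297/css_blog_own.html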

Friday 30 October 2009

3.5 XML

XML is a means of describing data. It is a method of encoding data in text that makes the data easy to store, transport and interpret. Rather than being a language with a fixed vocabulary, it provides a framework that allows users to write their own markup languages for their own specific needs. GML (Geography Markup Language) is one such language, concerned with the description of geographic content. It describes data or objects that have a spatial element by encoding geometry and spatial reference systems. Because it is based on XML, it can be read and edited using any text editor, is easy to transport and transform and, crucially, can be easily integrated with non-spatial data.

This final point is incredibly important. For many years spatial data was stored, viewed and analysed separately from non-spatial data. In recent years GIS developers have striven to integrate spatial data and analysis. GML enables the integration of geographical information by using a set of rules and guidelines (XML) which can be applied to any data type for any purpose. This means that geographic information can be integrated with a massive range of non-geographic data types, thus greatly enhancing the value and accessibility of spatial information.

The Ordnance Survey’s MasterMap dataset is a digital database of vector information describing geographical features on the ground in incredible detail. Individual building outlines, bollards, trees and road networks all have their own unique identifier, called a TOID (Topographic Identifier), as well as information on their geometry and attributes. MasterMap is based on GML. The OS uses GML because its “well defined geometric primitives coupled with a structured mechanism for defining features ensures that when spatial data is exchanged in GML it can be interpreted and understood by everyone.”

Below is a sample of MasterMap for my house viewed in a GIS, and its associated GML.

MasterMap Data Around My House



Crown Copyright/ Database right 2009
An Ordnance Survey/ Edina supplied service

A Sample of the GML for the above MasterMap Data