Why you need the ArcGIS API for Python in your life...

ArcGIS API for Python recently got a new release (v1.2.1), but what is it? ArcGIS API for Python is a web GIS integrator for arcpy and other Python 3 packages, and you can think of it as a unifying technology for the ArcGIS Platform.

Python scripting in ArcGIS 10.1

Want to dive into Python scripting in ArcGIS 10.1 but don't feel like reading a long, involved blog post about how to get started? That's good, because I don't have time to write one!

The Python Window is an easy way to start experimenting with Python in ArcMap. You can click on tools in toolboxes and "drag and drop" them into the Python Window.

The ArcGIS Python Community page has quick links to presentations, videos, tutorials, and other helpful resources. This is invaluable to beginners and experts alike.

If you want to automate repetitive tasks involving map documents (MXDs), for example checking all the MXDs in a folder for broken data layers, you will be interested in the arcpy.mapping sample scripts.

If your work involves more ArcGIS Server administration, then the Server REST API administration scripts will be more useful to you.

Finally, one of the main new Python features at 10.1 is the Data Access module (called arcpy.da). This provides new improved cursor objects to access geodatabases that are more powerful (for example, they can control edit sessions) and much faster, often more than 5 times quicker than the old cursors. The old cursors are still available to support legacy applications.

Working with Date Fields in Field Calculator

First things first: I am not a developer. That sounds simple enough, but sometimes, when I want to work with date fields in my feature classes, it feels like I need to be. If this statement resonates with you, then help is at hand. If you spend hours staring blankly at the field calculator, trying to work out what Date ( ) and Now ( ) actually mean, then I share your pain.

In my work, I frequently have to work with date and time data in tables: creating new data, reformatting existing data or performing analysis using this data. The trouble is, I always forget the correct expression required to perform some of these calculations, and as the required syntax is not always straightforward, I usually end up pestering my colleagues who have developer experience to help me out.

And so, as an aide-mémoire to me and hopefully for the benefit of a few of you, I have listed below some of the most common challenges I face, with examples of the expressions needed to get the required results. If this inspires you to try some more, then I have also listed some other resources (which I used in preparing this list) at the bottom of this post.

Although my preference has been to write expressions in VB Script, I've also included equivalent Python examples. Python is becoming more and more integrated within ArcGIS, so if you're not familiar with it, the examples I've included are an easy way to dip your toe in.

Example Data Calculations include:

  1. The difference between Shapefiles and Geodatabase date fields – not an expression, but very useful to understand before you carry any out!
  2. How to field-calculate today’s date
    • As Date and Time
    • As Date only
  3. How to field-calculate a specific date
  4. How to field-calculate random dates for a specific time period
  5. How to convert dates in a String/Text field into a Date field

1. The difference between Shapefiles and Geodatabase date fields

One of the first things to be aware of is a subtle, yet crucial, difference between the way a shapefile and a geodatabase store date values.

  • A shapefile (shp) stores dates in a date field with this format: yyyy-mm-dd.
  • A geodatabase (gdb, mdb or sde) formats the date as datetime yyyy-mm-dd hh:mm:ss AM or PM.

Therefore if your data contains times, then you must use geodatabases (file, personal or SDE) or your times will be truncated to dates.

Settings on your Windows system determine how dates are displayed in ArcMap (M/d/yy, MM/dd/yy, yy/MM/dd, and so on). ArcMap uses the system short date format (numerical) for displaying dates. To alter your settings, go to Start > Control Panel > Region and Language.

2. How to field-calculate today’s date

This may seem a simple task, but even this requires the correct Expression.  These expressions may be used to populate a Date field or a Text field.

a. As Date and Time (for Geodatabases only – expression will work for shapefiles but will return the date only)

Using VB Script

MyField = Now ( )

Using Python

MyField = datetime.datetime.now( )

b. As Date only

Using VB Script

MyField = Date ( )

Using Python

MyField = time.strftime("%d/%m/%Y")
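If you want to see what these Python expressions return before using them in the Field Calculator, they can be tried in any plain Python session. This is only a minimal sketch using the standard library; the variable names are my own and are not part of ArcGIS:

```python
import datetime

# Date and time together, as in the expression for 2a
now = datetime.datetime.now()

# Date only, formatted as DD/MM/YYYY text, as in the expression for 2b
today_text = now.strftime("%d/%m/%Y")

# A date-only object with no time component at all
today = now.date()

print(now)
print(today_text)
print(today)
```

The second value is a piece of text, whereas the first and third are date objects, which is worth remembering when deciding whether the target is a Text field or a Date field.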

3. How to field-calculate a specific date

Sometimes, you have a specific date you want to populate numerous records with. To do this, you simply surround the date with # symbols in VB Script, or with quotation marks in Python, as shown below:

Using VB Script

Date format: #DD-MM-YYYY# or #DD-MM-YYYY HH:MM:SS#

Example expressions:

e.g. for dates only:                         

MyField = #30-01-2012#

 e.g. for dates and times:              

MyField = #30-01-2012 12:35:15#

Using Python

Date format: “DD-MM-YYYY” or “DD-MM-YYYY HH:MM:SS”

Example expressions:

e.g. for dates only:                         

MyField = "30-01-2012"

 e.g. for dates and times:              

MyField = "30-01-2012 12:35:15"
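A typo in a quoted date is easy to miss, so it can help to check the text in plain Python first. The sketch below is only an illustration: parse_uk_date is a hypothetical helper of my own (not an ArcGIS function) that accepts the two formats described above and rejects anything else.

```python
import datetime

def parse_uk_date(text):
    """Parse 'DD-MM-YYYY' or 'DD-MM-YYYY HH:MM:SS' text into a datetime."""
    for fmt in ("%d-%m-%Y %H:%M:%S", "%d-%m-%Y"):
        try:
            return datetime.datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next format
    raise ValueError("Not a DD-MM-YYYY date: %r" % text)

print(parse_uk_date("30-01-2012"))           # 2012-01-30 00:00:00
print(parse_uk_date("30-01-2012 12:35:15"))  # 2012-01-30 12:35:15
```

An impossible date such as "31-02-2012" raises a ValueError instead of quietly producing something unexpected.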

4. How to field-calculate random dates for a specific time period

This may not be an everyday requirement, but it is something I need to do a lot when creating fictitious demo data.  For example, the following code will create random dates between 01/01/2010 (inclusive) and 01/01/2011 (exclusive). In other words, the random dates go from 01/01/2010 to 31/12/2010, i.e. any date in 2010.

Using VB Script

Check “Show Codeblock”, and, in the “Pre-Logic Script Code” section, enter the following:

Dim MinDate, MaxDate, RandDate
MinDate = #2010-01-01#
MaxDate = #2011-01-01#
Randomize
RandDate = MinDate + INT((MaxDate - MinDate)*RND)

Edit the MinDate and MaxDate as appropriate. The MaxDate should always be the day after the maximum date you want to allow. So if you want to allow all dates in February 2010, “MinDate” should be 2010-02-01 and “MaxDate” should be 2010-03-01.

Then, beneath:

MyField = RandDate

Using Python

Check “Show Codeblock”, and, in the “Pre-Logic Script Code” section, enter the following:

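import datetime
import random

def randomDate(minDate, maxDate):
    # Convert the "YYYY-MM-DD" text into datetime objects
    minDate = datetime.datetime.strptime(minDate, "%Y-%m-%d")
    maxDate = datetime.datetime.strptime(maxDate, "%Y-%m-%d")
    # Number of whole days in the range (maxDate itself is excluded)
    diff = (maxDate - minDate).days
    return minDate + datetime.timedelta(random.randrange(0, diff))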

Then, beneath:

randomDate("2010-01-01", "2011-01-01")
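If you want some reassurance that the maximum date really is excluded, the same function can be exercised outside ArcMap. This is just a sketch for checking the behaviour: the fixed seed and the sample size of 1000 are my own arbitrary choices.

```python
import datetime
import random

def randomDate(minDate, maxDate):
    # Same logic as the codeblock above
    minDate = datetime.datetime.strptime(minDate, "%Y-%m-%d")
    maxDate = datetime.datetime.strptime(maxDate, "%Y-%m-%d")
    diff = (maxDate - minDate).days
    return minDate + datetime.timedelta(random.randrange(0, diff))

random.seed(1)  # assumption: fixed seed so repeated runs give the same dates
samples = [randomDate("2010-01-01", "2011-01-01") for _ in range(1000)]

# Every generated date should fall inside 2010
print(min(samples), max(samples))
print(all(s.year == 2010 for s in samples))
```

Because random.randrange(0, diff) never returns diff itself, the latest possible result is the day before the maximum date.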

5. How to convert dates in a String/Text field into a Date field

Often when data is provided to me or imported into ArcMap from Excel, dates are brought in as text fields rather than date fields. On some occasions the dates are nicely formatted in one field, whilst at other times the date may be spread over a number of fields.

If the date is already correctly formatted (e.g. '30/01/2012', '30-01-2012' or '30 Jan 2012') then the field may be calculated directly by referencing the text field, as follows:

Using VB Script

Expression:                        

MyField = [StringDateField]

Using Python

Expression:                        

MyField = !StringDateField!

If, however, the required date elements are split over several text fields, or an element of the date is missing, the following style of expression may be used, which forms the date by joining each date element together:

Using VB Script

Example input 1:              3 fields formatted as 30 | 01 | 2012 (Day|Month|Year)

MyField = [Day]&"-"& [Month]&"-"& [Year]

Example input 2:              2 fields formatted as 30 | 01 (Day|Month) with no year

                                                let’s assume we want to set the year as 2009

MyField = [Day]&"-"& [Month]&"-"& 2009

Using Python

Example input 1:              3 fields formatted as 30 | 01 | 2012 (Day|Month|Year)

MyField = !Day! + "-" + !Month! + "-" + !Year!

Example input 2:              2 fields formatted as 30 | 01 (Day|Month) with no year

                                                let’s assume we want to set the year as 2009

MyField = !Day! + "-" + !Month! + "-" + "2009"
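The Python expressions above join pieces of text, so they rely on the Day, Month and Year fields being text fields. As a rough way to test the idea outside ArcMap, the sketch below uses a hypothetical helper of my own, build_date, that joins the parts and then checks that they form a real date:

```python
import datetime

def build_date(day, month, year="2009"):
    """Join text parts as DD-MM-YYYY and check they form a real date."""
    text = day + "-" + month + "-" + year
    # strptime raises ValueError for impossible dates such as 31-02-2009
    datetime.datetime.strptime(text, "%d-%m-%Y")
    return text

print(build_date("30", "01", "2012"))  # 30-01-2012
print(build_date("30", "01"))          # 30-01-2009
```

If any of the parts were numbers rather than text, Python would refuse to add them to "-" and raise a TypeError; wrapping each part in str( ) is the usual way around that.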

Resources

If that has left you wanting more, the following references, which I used when preparing this post, are a good place to start:

The Field Calculator Unleashed:

http://www.esri.com/news/arcuser/0405/files/fieldcalc_1.pdf

Simplify Date and Time Calculations

http://www.esri.com/news/arcwatch/1210/tip.html

Fundamentals of Date Fields:

http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//005s00000018000000.htm

Date/Time Field manipulation in Python:

http://www.doughellmann.com/PyMOTW/datetime/

Useful Python Resources for ArcGIS:

http://blogs.esri.com/support/blogs/supportcenter/archive/2012/01/10/10-easy-ways-to-tame-python-scripting-in-arcgis.aspx

Creating a web map from UK Open Data

There's never been a better time to make web maps! Source data is available for free online (see this blog post on Open Data), the software to visualise this data is cheaper and more powerful than ever, and the Internet is a convenient way to share the maps you create.

I'm going to talk about my own experience of creating a web map from Open Data published by the Driving Standards Agency (DSA) in the UK. The DSA administers all the practical driving tests in Great Britain (but not Northern Ireland) and publishes statistics about how many people pass at each test centre. The national average for the practical car driving test was 46% in the year 1 April 2010 to 31 March 2011, but this number hides a lot of variation among the different test centres across the country. Dense urban areas tend to have much lower pass rates than more sparsely populated rural regions: the average is as low as 30% in parts of London and West Yorkshire, and as high as 80% in the remote Scottish islands. This is the kind of data that would look great on a map, and I decided to make one when I couldn't find anything like it already.

Above: Average car driving test pass rates across Great Britain. Cities such as Birmingham, Glasgow, Leeds and London (dark red) have the highest failure rates. Dark blue areas have the highest pass rates. This map should not be interpreted as a map of "easy" places to take your test; if you're badly prepared then you'll fail, no matter where you take it! Click here for the full map.


First, the legal niceties. In 2010, the UK government created a generic data licence called the Open Government Licence (OGL); anyone in the world is allowed to reuse data released under the OGL (e.g. make maps from it) without charge, as long as the original creator of the data is acknowledged. Many UK public sector bodies, such as the Department for Transport (of which the DSA is a part), release much of their data under the OGL. If in doubt, you should email the agency who produced the data you are interested in to get clarification. It's always worth doing this because it's incredibly frustrating to spend time producing a beautiful web map, only to realise you can't actually publish it because you've breached copyright!

The data source I started off with was this PDF file, which has statistics broken down by calendar month, driving test location and gender. Although this is fine for looking up details of your local test centre, it's very difficult to compare different centres using a long list of tables.

The first challenge was to get the location of each driving test centre as a pair of (X, Y) coordinates, i.e. geocode the test centre names. The Department for Transport (DfT) publishes a full list of test centres with addresses on their website; this data can be extracted, or scraped, from the web page using a custom script. I also found an online resource called ScraperWiki, where programmers and citizens with ideas can get together and collaborate to produce scraping software for difficult data sources. This particular screen-scraping script (try saying that quickly three times) was designed to pull out a list of driving test centres from the DfT website, so I had a usable list of test centre locations to work with, without having to write my own scraper.

The next step was to write a Python script to take the data in the PDF file, look up the postcode of each test centre in the scraped data, then use the free Code-Point Open dataset to convert the postcode into an easting/northing coordinate. The output was a CSV file with a row for each test centre containing its location and associated pass rate statistics. This wasn't straightforward for two reasons: firstly, the names of the test centres are sometimes slightly different in the PDF compared to the scraped data (e.g. "Island of Mull" vs. "Isle of Mull"), so the Python code had to do a bit of guessing; secondly, some of the postcodes on the DfT website are invalid! In this case, I had to manually correct them.
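To give a flavour of the "guessing" step, here is one way that near-matching names could be paired up using Python's standard difflib module. This is an illustrative sketch and not the script I actually used; match_centre, the 0.8 cutoff and the short list of names are assumptions made for the example.

```python
import difflib

def match_centre(name, known_names, cutoff=0.8):
    """Return the closest known name, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(name, known_names, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# Hypothetical sample of names as they might appear in the scraped data
scraped = ["Isle of Mull", "Isle of Skye", "Leeds", "Birmingham"]

print(match_centre("Island of Mull", scraped))  # Isle of Mull
print(match_centre("Zzz", scraped))             # None
```

Anything that comes back as None still needs checking by hand, which is also how the invalid postcodes had to be dealt with.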

Once I had the locations and statistics for each test centre, it was easy to import them into ArcGIS Desktop. I used the Create Thiessen Polygons tool to generate a catchment area polygon around each point, then clipped my polygons using Ordnance Survey's free Boundary-Line dataset. Thiessen polygons mark out areas around each test centre containing locations closer to that test centre than any other test centre (in a straight-line sense). This assumes that people will travel in a straight line to their nearest test centre: not altogether realistic, but a straightforward piece of analysis that produces simple geometries.
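The rule behind Thiessen polygons can be written in a few lines of plain Python: every location belongs to whichever test centre is nearest in a straight line. The sketch below only illustrates that rule (it does not build any polygons), and the centre names and coordinates are made up for the example.

```python
import math

def nearest_centre(point, centres):
    """Return the name of the centre closest to point in a straight line."""
    x, y = point
    return min(
        centres,
        key=lambda name: math.hypot(centres[name][0] - x, centres[name][1] - y),
    )

# Hypothetical easting/northing pairs, for illustration only
centres = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (0.0, 10.0)}

print(nearest_centre((2.0, 1.0), centres))  # A
print(nearest_centre((9.0, 3.0), centres))  # B
```

The Create Thiessen Polygons tool effectively answers this question for every location at once and draws the boundaries between the answers.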

Uploading my map to ArcGIS Online was also easy. The red-blue colour scheme was chosen to be friendly to certain users with Colour Vision Deficiency, a topic that my colleague Will White touched upon in a recent blog post.

Two final notes: first, if the idea of Python scripting makes you want to run away, don't worry because it's a gentle language to learn! The reality is that it's still a frustrating experience working with most Open Data without scripting experience, although if you are an experienced spreadsheet user then you may be able to get around this. There are inevitably times when you will need to automate part of your workflow, so even a modest knowledge of Python (the scripting language of choice in GIS these days) can go a long way. This page links to several useful resources for Python beginners and Esri UK also runs introductory Python training courses delivered over the Internet.

Second, in my experience, at least 70% of your time building a web map will be spent collecting and processing data rather than designing a map. Of course, this doesn't mean that the aesthetic elements of a web map aren't important, and the balance of work can certainly tilt more towards design if your map has a complex layout and symbology. Still, it's important not to underestimate the amount of time that data preparation takes. On the bright side, once you're done massaging raw data into something usable, most of the pain is over. Have fun mapping!

ArcPy demonstrations from the 2011 UC

Earlier this week I presented a session on using the ArcPy site package to manipulate map objects. I decided to take this approach since standard geoprocessing through Python is quite well known now. It turned out to be a timely presentation, since only yesterday Esri announced a new online course on the subject: http://training.esri.com/acb2000/showdetl.cfm?DID=6&Product_ID=1011

I also demonstrated a tool validator class, which isn't exactly new (it was introduced at 9.3) but is little-used functionality within the model/scripting framework. I realise that whilst the slides will be published, there wasn't much there without the demos. This article is partly to redress that by attaching the scripts I used so that anyone can download them:

https://static1.squarespace.com/static/55bb8935e4b046642e9d3fa7/55bb8e8ee4b03fcc125a74c0/55bb8e90e4b03fcc125a7933/1306326540373/ArcPydemos.zip

Note that throughout the demos I chose to mix and match arcpy and core Python (or os module) methods. In many cases these are interchangeable when dealing with disk files.

The first demo centred on adding a feature class as a layer in a map. Much of the processing relies on creating a temporary disk file for the layer and this is then added to the map. I also played with the progressor bar in this, although it is fairly quick so you have to be watching to see it change.

The second demo iterated through the layers in a map and re-pointed these from an FGDB to an ArcSDE geodatabase. It was, in fact, a personal ArcSDE gdb on my machine but the same logic applies for an enterprise geodatabase. This showed the simplicity of looping with python lists and required nested lists to iterate through layers within data frames within the map. It also shows the perverse way in which group layers are handled. When I started the script I imagined that I’d iterate through a group layer if I found one but the demo2 script shows this is not the case. The layers are presented in the order they are in the TOC as if all the + toggles are hit. So the group layer appears in turn and the next layer is the first in the group. Once all the layers in the group layer are accessed the next layer is whatever is in the TOC after the group.

Due to time constraints I didn't get a chance to talk about the error handling. As a result of making the whole thing more "pythonic", it is no longer enough just to output arcpy.GetMessages(2). Many of the errors are raised as Python errors (assertion error, type error etc.) and these have to be trapped using the sys methods and the traceback module. The demos both show a rather crude example of this, which has the essential steps to get the Python error details out of a failing script.
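For anyone who doesn't want to open the zip file, the general pattern looks something like the sketch below. It is not copied from the demo scripts: run_step and failing_step are hypothetical names, and the example uses only the standard library so that it runs without arcpy.

```python
import sys
import traceback

def run_step(step):
    """Run a callable; return None on success or an error report on failure."""
    try:
        step()
        return None
    except Exception:
        exc_type, exc_value, exc_tb = sys.exc_info()
        # The last traceback entry points at the line that actually failed
        tb_lines = traceback.format_tb(exc_tb)
        return "%s: %s\n%s" % (exc_type.__name__, exc_value, tb_lines[-1])

def failing_step():
    # Stand-in for a scripting mistake that raises a plain Python error
    raise TypeError("expected a layer object")

print(run_step(failing_step))
```

In a real script tool the report would typically be passed on alongside arcpy.GetMessages(2), so that both the geoprocessing errors and the Python errors are visible.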

The final demo is a simple tool validator class that extracts all the feature classes from an FGDB and uses the list to populate a value list on a string field. While the example easily shows what could be done, it is not especially useful, as the field has to be of type String, not FeatureClass, for a value list to be applied to it. The called script therefore has to establish the workspace in the same way as the tool validator has. Take care when writing code in a tool validator, as it can be very hard to debug. Also ensure that the code is in the correct method: initializeParameters, updateParameters or updateMessages. initializeParameters runs as the tool opens. updateParameters runs whenever a parameter is changed by the user. The code must determine whether a parameter that is being "watched" has changed; there is no distinction in the invocation of the class method.

One of the biggest strengths of choosing Python is that it opens up a whole range of prebuilt packages to perform tasks. Yesterday I was advising on using Python modules to zip files and directories once geoprocessing has completed, and investigation revealed that at least three (zipfile, gzip and tarfile) are packaged with the default Python 2.6 installation.
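As an example of how little code this takes, here is a sketch of zipping a folder with the zipfile module. zip_folder is a hypothetical helper written for this post, the demo works on a throwaway temporary folder, and the with statement on ZipFile assumes a more recent Python than 2.6.

```python
import os
import tempfile
import zipfile

def zip_folder(folder, zip_path):
    """Write every file under folder into zip_path, keeping relative paths."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(folder):
            for name in files:
                full_path = os.path.join(root, name)
                archive.write(full_path, os.path.relpath(full_path, folder))
    return zip_path

# Demo on a throwaway folder so nothing real is touched
work = tempfile.mkdtemp()
source = os.path.join(work, "outputs")
os.makedirs(os.path.join(source, "tables"))
with open(os.path.join(source, "tables", "result.txt"), "w") as handle:
    handle.write("done")

archive_path = zip_folder(source, os.path.join(work, "outputs.zip"))
with zipfile.ZipFile(archive_path) as archive:
    names = archive.namelist()
print(names)
```

Keep the zip file outside the folder being zipped, otherwise the archive will try to include itself.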

I hope you find the demo scripts useful, and if you attended the session I hope you enjoyed it. Thanks, Rob

Printing in ArcGIS Server

Printing in ArcGIS Server is one of those topics we get asked about on a regular basis. If you were looking for something more than the out-of-the-box functionality, the solution was usually to roll up your sleeves and get stuck into ArcObjects. At 10, however, Esri released the arcpy.mapping Python module. This gives a relatively simple option for some nice prints. If you haven't seen it before, Esri have written a great getting-started blog article with some nice examples. Well worth a read:

http://blogs.esri.com/Dev/blogs/arcgisserver/archive/2011/04/12/An-introduction-to-arcpy.mapping-for-ArcGIS-Server-developers.aspx

Enjoy!

Improving the performance of geoprocessing models and services

Recently one of my colleagues wanted to calculate driving routes and CO2 emissions for car journeys in the UK. He put together a prototype in ModelBuilder and published it as a geoprocessing service to ArcGIS Server so that we could run it from the web.

Although the model worked well, we noticed that it felt sluggish for a web service that was going to be accessed repeatedly, so we started looking at straightforward ways to boost the model's performance.

The first thing we did was to measure! We looked carefully at which tools in the model were running quickly and which ones weren't. To do this, we opened up the model for editing in ArcCatalog and from the "Model" menu selected "Run Entire Model". As each tool in the sequence is executed, its process box is temporarily highlighted red.

Any tool that stays red for an extended period of time is a possible bottleneck that needs further investigation. This is a great way to visualise where delays in your model are happening. For the more numerically inclined, the number of seconds taken to execute each tool (rounded down to the nearest second) is also shown in the progress box.
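If part of your workflow is already a Python script, the same "measure first" idea can be applied with a small timing wrapper. This is a generic sketch and not something ModelBuilder provides; timed is a hypothetical helper, and the step being timed here is just a stand-in calculation.

```python
import time

def timed(label, func, *args, **kwargs):
    """Call func, print how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print("%-20s %.3f s" % (label, elapsed))
    return result, elapsed

# Stand-in for a slow processing step
total, seconds = timed("sum of squares", sum, (i * i for i in range(100000)))
print(total)
```

Wrapping each step in turn gives a list of timings that plays the same role as watching which boxes stay red.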

Although how you proceed will depend on your individual circumstances, we found two quick wins to speed up our particular model, which made it run about four times faster!

  1. We used in-memory workspaces to store temporary data
  2. We reduced the number of tools by replacing entire tool sequences with Python scripts

Let's look at these in more detail.

Typically, each tool in the model processes some data and writes its output to a feature class in the scratch workspace (a directory on your computer's hard disk). The next tool in the model reads its input from this location and the cycle repeats.

Writing to the hard disk is much slower than writing to a temporary location in the computer's memory (RAM), so consider using an in-memory workspace for passing data from one tool to the next.

In-memory workspaces have some important limitations, which are discussed in the articles below:

Intermediate data and the scratch workspace (ArcGIS 9.3, scroll down to see the section on in-memory workspaces)

Using in-memory workspace (ArcGIS 10.0)

In particular, data (e.g. feature classes) written to an in-memory workspace cannot be edited once they have been written. You should also delete in-memory data after you are finished with them, otherwise they can fill up your computer's RAM until you restart your ArcGIS Desktop client (if running a Desktop model) or your server process is recycled (if running a geoprocessing service), which might not happen for hours. The "Delete" tool can be used to clear an "in_memory" workspace.

Secondly, our model uses a sequence of tools such as "Calculate Field" and "Join Field" to look up CO2 emissions data from a table and calculate an estimate for our car journey. However, we could do this more efficiently by replacing the tool sequence with a single Python script that doesn't spend a lot of time passing data between different tools.
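The core of such a script can be very small, because a Python dictionary does the job of the table join. The sketch below shows the idea only: it is not our actual script, the function and field names are hypothetical, and the emission factors are placeholder numbers rather than real figures.

```python
# Placeholder emission factors in grams of CO2 per km; NOT real figures
EMISSION_FACTORS = {"petrol": 100.0, "diesel": 90.0}

def estimate_co2(journeys, factors):
    """Return (journey_id, grams_co2) pairs using one dictionary lookup per row."""
    results = []
    for journey_id, fuel, distance_km in journeys:
        # The lookup replaces the join; the multiplication replaces the field calculation
        results.append((journey_id, factors[fuel] * distance_km))
    return results

# Hypothetical journeys: (id, fuel type, distance in km)
journeys = [(1, "petrol", 10.0), (2, "diesel", 20.0)]
print(estimate_co2(journeys, EMISSION_FACTORS))  # [(1, 1000.0), (2, 1800.0)]
```

Everything happens in memory in a single pass over the rows, so no intermediate tables need to be written between steps.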

The following pages have useful information and Python code samples that show you how to wire up a Python script's inputs and outputs so it can be used as a tool in a ModelBuilder model.

Setting script tool parameters using the arcgisscripting module (ArcGIS 9.3)

Setting script tool parameters using the arcpy module (ArcGIS 10.0)

The general process of performance tuning involves measuring the areas of your model that are slow and making them more efficient (e.g. slow actions might be reading from/writing to disk, using more tools than you need to, using data without attribute indexes, etc.).

If you're interested in learning more, the ArcGIS Resource Center has some general performance tips for geoprocessing services.

For more detailed insights into geoprocessing on Server, there's a useful video presentation from the 2010 Esri, Inc. Developer Summit:

Building and Optimizing Geoprocessing Services in ArcGIS Server

Happy performance tuning!