Emil's Arc/GIS/Python: 2015

Thursday, December 17, 2015

Creating Unique Feature Class, Feature Layer, and Field Names with ArcPy

Most Python/ArcPy scripts frequently require the creation of new feature classes, feature layers, and, at times, fields. I developed scripts that return unique names for these so that I don't have to worry about my script throwing an error because files already exist, or worry that my script copies over existing data.

Unique file or feature class name:

def UniqueFileName(Location = "in_memory", Name = "file", Extension = ""):
    """
    returns an unused file name
    'Location' will be the file folder
    'Name' will be the file name plus numeric extension if needed
    'Extension' is the file extension

    similar to arcpy.CreateUniqueName
    """
    import arcpy
    import os

    if Extension:

        outName = os.path.join (Location, Name + "." + Extension)
    else:
        outName = os.path.join (Location, Name)
    i = 0
    while arcpy.Exists (outName):
        i += 1
        if Extension:
            outName = os.path.join (Location, 
                                    Name + "_" + str(i) + "." + Extension)
        else:
            outName = os.path.join (Location, Name + "_" + str(i))
        
    return outName

Unique feature layer name:

def UniqueLayerName(inFC = None, sql = None):
    """
    returns an unused feature layer name
    """
    import arcpy
    lyrName = "lyr0"
    i = 0
    while arcpy.Exists (lyrName):
        i += 1
        lyrName = "lyr" + str(i)
    if inFC:
        arcpy.MakeFeatureLayer_management (inFC, lyrName, sql) 
    return lyrName

Unique field name:

def UniqueFieldName(inFC, inField = "NEWFIELD"):
    """
    returns an unused field name
    """
    import arcpy

    Type = arcpy.Describe (inFC).dataType

    if Type == "ShapeFile":
        inField = inField[:10]

    fields = [f.name for f in arcpy.ListFields (inFC)]
    i = 0
    outField = inField
    while outField in fields:
        i += 1
        outField = inField + "_" + str(i)

        q = 0
        while Type == "ShapeFile" and len (outField) > 10:
            q -= 1
            outField = inField[:q] + "_" + str(i)

    return outField

Friday, November 6, 2015

Creating Data Driven Pages with a Maximum Scale

I often use Data Driven Pages to create sets of maps. Data Driven Pages iterates through a feature class and sets ArcMap's extent to match each feature of the feature class, and generates a PDF map for each. Sometimes the features in the feature class driving the Data Driven Pages can be small, meaning the scale of the map will be quite large. Large scale maps are sometimes not much use, as they don't have enough information to let the user know what the map represents. A while ago I creates a script that creates a new feature class from an input feature class. The script checks the scale of each feature in the feature class that will be generated with Data Driven Pages. If the scale is too large, it replaces the feature with a new one at the indicated maximum scale.

Inputs for the script are the feature class to check, the maximum scale desired in Layout View (for example, 1:2000 is input as 2000), the percent margin (corresponding to the percent margin in the Data Driven Pages setup window), the new feature class name and the new feature class location, and finally the map document you'll be using the new Data Driven Pages feature class with.

Here's an example of the output, with California's Bay Area divided into regions:

The input is the green features, and the output is the purple-bordered polygons. Smaller features of the input have been replaced by rectangular features at the maximum scale.

Using the output feature class to drive our Data Driven Pages, the maps created won't zoom in closer than desired, even on the smaller features:

The code:

#----------------------------#
#-------- User Inputs -------#
#----------------------------#
 
 
#Feature class used to generate DDP
DataDrivenPageFeature = r"C:\Users\e1b8\Desktop\E1B8\Blog\BlogData.gdb\Input"
#Maximum scale
MaxScale = 20000
#Percent margin map extent
MMEx = 125
 
#New feature class name
NewDDPFeat = "DDP_Ref_Output"
#New feature class GDB location
NewFeatGDB = r"C:\Users\e1b8\Desktop\E1B8\Blog\BlogData.gdb"
 
#Map document
MXDoc = r"C:\Users\e1b8\Desktop\E1B8\Blog\DdpMaxScale.mxd"
 
#----------------------------#
#---- End of User Inputs ----#
#----------------------------#
 
 
import arcpy
import os
 
arcpy.env.overwriteOutput = True
 
mxd = arcpy.mapping.MapDocument(MXDoc)
df = arcpy.mapping.ListDataFrames(mxd)[0]
mxd.activeView = "PAGE_LAYOUT"
 
#Data drive page feature type
fcShapeType = arcpy.Describe (DataDrivenPageFeature).shapeType
#Create temp feature class
DDPFeature = os.path.join ("in_memory", "tempFc")
if fcShapeType == "Polygon":
    print "Copying feature class to memory"
    arcpy.CopyFeatures_management (DataDrivenPageFeature,
                                   DDPFeature)
else:
    #Create minimum bounding rectangles
    print "Converting features to minimum bounding rectangles"
    arcpy.MinimumBoundingGeometry_management (DataDrivenPageFeature,
                                              DDPFeature, "ENVELOPE")
 
#Create layer for selection
arcpy.MakeFeatureLayer_management (DDPFeature, "DDPlyr")
 
#Get values for new features at maximum scale
df.scale = MaxScale / ((float(MMEx) / 100))
XAdjust = ((df.extent.XMax) - (df.extent.XMin)) / 2
YAdjust = ((df.extent.YMax) - (df.extent.YMin)) / 2
 
#Create new feature class
print
print "Creating new feature class"
print
 
NewFeat = os.path.join (NewFeatGDB, NewDDPFeat)
sr = arcpy.Describe(DDPFeature).SpatialReference
arcpy.CreateFeatureclass_management (NewFeatGDB,
                                     NewDDPFeat,
                                     "POLYGON",
                                     DDPFeature,
                                     "", "", sr)
 
#List fields
copyFeatureOIDList = []
 
shpFld = arcpy.Describe(DDPFeature).shapeFieldName
fieldNameList = [fld.name for fld in
                 arcpy.ListFields (DDPFeature)
                 if fld.editable and
                 fld.name != shpFld]
 
#Create cursor object
cursor = arcpy.da.SearchCursor(DDPFeature, ["SHAPE@",
                                            "SHAPE@TRUECENTROID",
                                            "OID@"]
                               + fieldNameList)
 
FeatureCount = 0
for row in cursor:
    df.extent = row[0].extent
    #Check if scale is larger than maximum
    if df.scale * (float(MMEx) / 100) >= MaxScale:
        #Add OID of feature to list if feature is large enough
        copyFeatureOIDList.append(str(row[2]))
        continue
     
    FeatureCount += 1
    print "Creating new feature", FeatureCount
    #Make x,y for four corners of extent
    array = arcpy.Array()
    FeatPnt1 = (row[1][0] + XAdjust, row[1][1] + YAdjust)
    FeatPnt2 = (row[1][0] + XAdjust, row[1][1] - YAdjust)
    FeatPnt3 = (row[1][0] - XAdjust, row[1][1] - YAdjust)
    FeatPnt4 = (row[1][0] - XAdjust, row[1][1] + YAdjust)
     
    pointList = [FeatPnt1, FeatPnt2, FeatPnt3, FeatPnt4]
    #Create each point and add to array
    for point in pointList:
        pnt = arcpy.Point(point[0], point[1])
        array.add(pnt)
    #Add first point to array to close polygon
    array.add(array.getObject(0))
    #Create a new polygon from array
    newPolygon = arcpy.Polygon(array)
    #Add polygon to new feature class
    arcpy.Append_management (newPolygon, NewFeat, "NO_TEST")
    #Update new feature with values
    oidFld = arcpy.Describe (NewFeat).OIDFieldName
    delimOidFld = arcpy.AddFieldDelimiters (NewFeat,
                                            oidFld)
    sql = "{0} = {1}".format (delimOidFld, FeatureCount)
    with arcpy.da.UpdateCursor(NewFeat,
                               ["OID@",
                                "SHAPE@",
                                "SHAPE@TRUECENTROID"]
                               + fieldNameList,
                               sql) as updateCursor:
        for udrow in updateCursor:
            #Populate fields
            if fieldNameList:
                udrow[3:] = row[3:]
            #Shift new rectangle from centroid
            #(center of gravity) to spatial center
            FeatmidpointX = (row[0].extent.XMin
                             + row[0].extent.XMax) / 2
            FeatmidpointY = (row[0].extent.YMin
                             + row[0].extent.YMax) / 2
            NewmidpointX = (udrow[1].extent.XMin
                            + udrow[1].extent.XMax) / 2
            NewmidpointY = (udrow[1].extent.YMin
                            + udrow[1].extent.YMax) / 2
            DiffX = NewmidpointX - FeatmidpointX
            DiffY = NewmidpointY - FeatmidpointY
            udrow[2] = udrow[2][0] - DiffX, udrow[2][1] - DiffY
            updateCursor.updateRow(udrow)
                 
    del updateCursor
     
del row
del cursor
 
#Select all features in list
if copyFeatureOIDList:
    if len(copyFeatureOIDList) == 1:
        print "Copying feature"
    else:
        print "Copying features"
    oidStr = ", ".join (map (str, copyFeatureOIDList))
    oidFld = arcpy.Describe("DDPlyr").OIDFieldName
    sql = "{0} IN ({1})".format (oidFld, oidStr)
    arcpy.SelectLayerByAttribute_management ("DDPlyr", "", sql)
     
#Copy selection
if arcpy.Describe("DDPlyr").FIDSet:
    arcpy.Append_management ("DDPlyr", NewFeat)
 
#delete temp feature
arcpy.Delete_management (DDPFeature)
 
print "Created:", NewFeat
print "Done"

Monday, November 2, 2015

Selecting Features Randomly by Percent or Count in ArcMap

A coworker of mine recently asked for code to select features from a layer at random by percent. I created a function that could be pasted into the Python shell in ArcMap, and then called when the user wanted to make the random selection. I also developed a version to select by count instead of percent. There was a related question on GIS Stackexchage that didn't have an accepted answer, so I posted the code there as well.

The code for a selection by percent:

def SelectRandomByPercent (layer, percent):
    #layer variable is the layer name in TOC
    #percent is percent as whole number  (0-100)
    if percent > 100:
        print "percent is greater than 100"
        return
    if percent < 0:
        print "percent is less than zero"
        return
    import random
    featureCount = float (arcpy.GetCount_management 
                          (layer).getOutput (0))
    count = int (featureCount * float (percent) / float (100))
    oidFldName = arcpy.Describe (layer).OIDFieldName
    delimOidFld = arcpy.AddFieldDelimiters (layer, oidFldName)
    if not count:
        #select zero
        sql = "{0} = -1".format (delimOidFld)
        arcpy.SelectLayerByAttribute_management (layer, "", sql)
        return
    oids = [oid for oid, in arcpy.da.SearchCursor (layer, "OID@")]
    randOids = random.sample (oids, count)
    oidsStr = ", ".join (map (str, randOids))
    sql = "{0} IN ({1})".format (delimOidFld, oidsStr)
    arcpy.SelectLayerByAttribute_management (layer, "", sql)

The code for a selection by count:

def SelectRandomByCount (layer, count):
    #layer is the layer name in TOC
    #count is the count of features to be selected
    import random
    layerCount = int (arcpy.GetCount_management 
                      (layer).getOutput (0))
    if layerCount < count:
        print "input count is greater than layer count"
        return
    oids = [oid for oid, in arcpy.da.SearchCursor (layer, "OID@")]
    oidFldName = arcpy.Describe (layer).OIDFieldName
    delimOidFld = arcpy.AddFieldDelimiters (layer, oidFldName)
    randOids = random.sample (oids, count)
    oidsStr = ", ".join (map (str, randOids))
    sql = "{0} IN ({1})".format (delimOidFld, oidsStr)
    arcpy.SelectLayerByAttribute_management (layer, "", sql)

Saturday, October 10, 2015

Working Around Oracle's SQL Where Clause 1000 Item Limit with ArcPy

One option for working in an ArcSDE environment is to utilize Oracle databases. Oracle databases has an unfortunate limitation: they only allow for up to 1000 items when an SQL where clause is applied to them. When I create ArcGIS Python scripts, I will oftentimes perform a series of geoprocesses and throw field values (GUIDs perhaps) into a python list for features that meet certain criteria. I'll then want to filter my analyzed feature class by these values by means of a selection or through the creation of a feature layer. Both ArcPy's SelectLayerByAttribute_management and MakeFeatureLayer_management allow for the inclusion of an SQL where clause. In a large feature class, however, the 1000 item limit can be easily breached. To work around this limitation, I created a function that creates a feature layer from a list of values, even if the number of values is greater than 1000. The code first creates a new feature layer, and then performs multiple selections in 1000 item chunks. Once all the desired features have been selected, a final output feature layer is made and returned based on the selected features.

The code:

import arcpy

def LayerFromList (inLyr, inField, inList):

    """
    Returns an arcpy feature layer of the 
     features matching the values in 
     the inList in field inField
    inLyr can be a feature layer or 
     feature class
    inField is the field name in the 
     inLyr used for selection
    inList is a python list of values 
     to match to values in inField of inLyr
    Created to work around 1000 item 
     limit in Oracle database SQL querry
    """

    #convert list to strings
    mapList = map (str, inList)

    #Get Field Type
    fldType = [field.type for field
               in arcpy.ListFields (inLyr)
               if field.name == inField][0]
    
    #Make temporary feature layer to apply selections to
    tempLyr = "lyr0"
    num = 0
    #Get available layer name
    while arcpy.Exists (tempLyr):
        num += 1
        tempLyr = "lyr" + str (num)
    #Make layer
    arcpy.MakeFeatureLayer_management (inLyr, tempLyr)

    #features selected counter
    selected = 0
    #iterate
    while True:
        #Get next 1000 rows
        partial = mapList[selected:selected + 1000]
        #Exit while loop if there are no more items in the list
        if not partial:
            break

        #Create the SQL where cluase
        if fldType in ["String", "Guid"]:
            joinStr = "'{0}'".format ("', '".join (partial))
        else:
            joinStr = ", ".join (partial)
        sql = "{0} IN ({1})".format (arcpy.AddFieldDelimiters
                                     (tempLyr, inField), joinStr)

        #Add next 1000 rows to selection
        arcpy.SelectLayerByAttribute_management (tempLyr, 
                                                 "ADD_TO_SELECTION", 
                                                 sql)
        #Add 1000 to selected varaible
        selected += 1000


    #Make output feature layer
    num += 1
    outLyr = "lyr" + str (num)
    while arcpy.Exists (outLyr):
        num += 1
        outLyr = "lyr" + str (num)
    arcpy.MakeFeatureLayer_management (tempLyr, outLyr)

    #Delete temporary layer
    arcpy.Delete_management (tempLyr)
    
    return outLyr

Tuesday, October 6, 2015

Zipping Shapefiles with Python

My previous employer liked using GIS Pro to collect GIS field data on iPads. My employer wanted to equip surveyors with a variety of GIS information to be used in the field - parcels, political boundaries, company assets, and others. To do so, we first had to create shapefiles and add them to zip files for our data to be compatible with GIS Pro. All this information required geoprocessing to make sure the right data was added to each surveyor's iPad without bogging down GIS Pro's software with a crippling overabundance of data. Once that processing was complete I needed to create a function that would get my shapefiles zipped up nicely and ready for upload. To start, I hunted down all the possible shapefile file extensions. With use of Python's zipfile module I developed my function to add all shapefile files to a new zipfile that inherits its file name from the shapefile. I also added the option to keep or delete the shapefile from its directory after zipping by setting the Boolean Delete variable to either True or False.

The code:

import os
import zipfile
import arcpy

def ZipShp (inShp, Delete = True):

    """
    Creates a zip file containing the input shapefile
    inputs -
    inShp: Full path to shapefile to be zipped
    Delete: Set to True to delete shapefile files after zip
    """
    
    #List of shapefile file extensions
    extensions = [".shp",
                  ".shx",
                  ".dbf",
                  ".sbn",
                  ".sbx",
                  ".fbn",
                  ".fbx",
                  ".ain",
                  ".aih",
                  ".atx",
                  ".ixs",
                  ".mxs",
                  ".prj",
                  ".xml",
                  ".cpg"]

    #Directory of shapefile
    inLocation = arcpy.Describe (inShp).path
    #Base name of shapefile
    inName = arcpy.Describe (inShp).baseName
    #Create zipfile name
    zipfl = os.path.join (inLocation, inName + ".zip")
    #Create zipfile object
    ZIP = zipfile.ZipFile (zipfl, "w")
    
    #Iterate files in shapefile directory
    for fl in os.listdir (inLocation):
        #Iterate extensions
        for extension in extensions:
            #Check if file is shapefile file
            if fl == inName + extension:
                #Get full path of file
                inFile = os.path.join (inLocation, fl)
                #Add file to zipfile
                ZIP.write (inFile, fl)
                break

    #Delete shapefile if indicated
    if Delete == True:
        arcpy.Delete_management (inShp)

    #Close zipfile object
    ZIP.close()

    #Return zipfile full path
    return zipfl

And for those of you without the ArcPy module:

import os
import zipfile

def ZipShp (inShp, Delete = True):

    """
    Creates a zip file containing the input shapefile
    inputs -
    inShp: Full path to shapefile to be zipped
    Delete: Set to True to delete shapefile files after zip
    """
    
    #List of shapefile file extensions
    extensions = [".shp",
                  ".shx",
                  ".dbf",
                  ".sbn",
                  ".sbx",
                  ".fbn",
                  ".fbx",
                  ".ain",
                  ".aih",
                  ".atx",
                  ".ixs",
                  ".mxs",
                  ".prj",
                  ".xml",
                  ".cpg",
                  ".shp.xml"]

    #Directory of shapefile
    inLocation = os.path.dirname (inShp)
    #Base name of shapefile
    inName = os.path.basename (os.path.splitext (inShp)[0])
    #Create zipfile name
    zipfl = os.path.join (inLocation, inName + ".zip")
    #Create zipfile object
    ZIP = zipfile.ZipFile (zipfl, "w")

    #Empty list to store files to delete
    delFiles = []
    
    #Iterate files in shapefile directory
    for fl in os.listdir (inLocation):
        #Iterate extensions
        for extension in extensions:
            #Check if file is shapefile file
            if fl == inName + extension:
                #Get full path of file
                inFile = os.path.join (inLocation, fl)
                #Add file to delete files list
                delFiles += [inFile]
                #Add file to zipfile
                ZIP.write (inFile, fl)
                break

    #Delete shapefile if indicated
    if Delete == True:
        for fl in delFiles:
            os.remove (fl)

    #Close zipfile object
    ZIP.close()

    #Return zipfile full path
    return zipfl

Sunday, September 27, 2015

Determining Lengths of Polyline Features Regardless of Spatial Reference using ArcPy

Edit 2017: After all the below work, I've come to learn that there's a much easier way to determining feature lengths regardless of spatial reference with use of the field calcutor. This blog explains how.

Here's an example:

arcpy.CalculateField_management ("InputLayer", "LenMiles", "!shape.length@miles!", "PYTHON_9.3")

Original Blog:

When working with GIS, knowing lengths of polyline features can be fundamental to accomplishing your goals. However, something so basic isn't necessarily always simple. I wanted to create a python function that would take any feature class and return each feature's length in a standard unit (meters). I wanted the code to work for any feature class, regardless of projection or how large or small an area of the globe the feature class's features covered.

Determining lengths of features in a projected feature class is easy. ArcPy allows you to do so by the feature class's spatial reference object. This object has the metersPerUnit method, which returns the conversion factor for the feature class's units to meters. Once this is known, a cursor can be used to obtain the length of each feature. This distance can be multiplied by the conversion factor for the standardized distance.

Geographic coordinate systems do not have a measure unit, so determining each feature's length is not as straight forward a task. Initially I couldn't find a solution, and so asked this question on GIS StackExchange. The answer presented the solution of setting the feature class's coordinate system within a cursor. Picking which coordinate system also wasn't easy. The answer suggested WGS Web Mercator. However, one of the commenters warned that WGS Web Mercator isn't actually in meters. He suggested using the UTM zone the data fell in as the projection. This was a good suggestion, except for the fact that I wanted my code to be robust enough to work regardless of where it was geographically. I had to find a way to determine which UTM zone each feature fell within. With a little poking around I came across Calculate UTM Zone. This tool appears to be designed for use with data driven pages, but I found it suited my needs relatively well. The output, which populates a field with WKT, wasn't perfect however. I asked this question for suggestions on how I could determine each feature's UTM zone from its output consistently, but didn't find an answer. Initially I came up with my own string manipulation method, but later discovered the loadFromString () method of the spatial reference object, that lets you update a spatial reference from well known text. From here, it was simply a matter of projecting each feature into its corresponding UTM zone and creating the dictionary.

The code:

def FeatureLengths (inFc):
 
    """
    input - inFc: the input feature class
    returns dictionary -
    key: feature OID
    value: distance in meters
    """
 
    #get spatial reference type
    sr = arcpy.Describe (inFc).spatialReference
    srType = sr.type
 
 
    #Projected coordinate systems
    if srType == "Projected":
        #meters per unit
        metersPerUnit = sr.metersPerUnit
 
        #out Di
        return dict ([(oid, dist * metersPerUnit) for
                      oid, dist in
                      arcpy.da.SearchCursor
                      (inFc, ["OID@", "SHAPE@LENGTH"])])
     
    #Geographic coordinate systems
    #add UTM field to feature class
    flds = [f.name for f in arcpy.ListFields (inFc)]
    utmFld = "UTM"
    i = 0
    #Make sure field name is not in use
    while utmFld in flds:
        utmFld = "UTM{0}".format (i)
        i += 1
    arcpy.AddField_management (inFc,
                               utmFld,
                               "TEXT",
                               field_length = 1000)
 
    
    #calculate UTM zone
    arcpy.CalculateUTMZone_cartography (inFc, utmFld)

    #add sql field delimiters
    delimUtmFld = arcpy.AddFieldDelimiters (inFc, utmFld)

    #output dictionary that will hold oids and distances in meters
    outDi = {}
    #list to hold utm zones
    utms = []

    #get unique layer name
    layer = "lyr"
    while arcpy.Exists (layer):
        i += 1
        layer = "lyr" + str (i)
 
    #access table
    with arcpy.da.SearchCursor (inFc, [utmFld]) as cursor:
        for utmWkt, in cursor:
            #check if utm has been iterated
            if utmWkt in utms:
                continue
            utms += [utmWkt]

            #create new spatial reference
            sr = arcpy.SpatialReference ()

            #update spatial reference with wkt
            sr.loadFromString (utmWkt)

            #create sql to filter for utm zone
            sql = "{0} = '{1}'".format (delimUtmFld, utmWkt)

            #create feature layer per utm zone
            arcpy.MakeFeatureLayer_management (inFc, layer, sql)

            #iterate layer with projection
            with arcpy.da.SearchCursor (layer,
                                        ["OID@", "SHAPE@LENGTH"],
                                        spatial_reference
                                        = sr) as lyrCursor:
                for oid, shpLen in lyrCursor:
                    #update dictionary with length
                    if shpLen:
                        outDi [oid] = shpLen
                    #if shape error, return zero
                    else:
                        outDi [oid] = 0
            del lyrCursor

            #Delete layer
            arcpy.Delete_management (layer)
            
    del cursor

    #delete utm zone field
    arcpy.DeleteField_management (inFc, utmFld)
    
    #return out dictionary
    return outDi