NYGeog

Geography, GIS, Geospatial, NYC, etc.

Tuesday, October 25, 2011

My first Stata command: summing two fields into a new field, and why it's easier than ArcGIS

So I barely use any software that's not GIS-related. Without making a list of the silly things that take forever in ArcGIS, in ModelBuilder and even in Python (e.g., deleting multiple fields from a large dataset took 8 hours last night!!!!), I've known I need to learn Stata to better manage my output files. So I requested Stata 12 and figured I'd share a little of my learning experience, for my own reference but also in case anyone else is just starting out.

I'd like to start generating some calculated variables outside of ArcGIS. Some of my variables, such as the percent (PCT) of a population or densities, could be calculated more efficiently outside of ArcGIS.
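For reference, here's a rough sketch in plain Python (no arcpy) of what "calculating outside of ArcGIS" could look like for a percent and a density. It's only an illustration, not what I actually ran: it assumes the attribute table has been exported to CSV, and TRACT, TRT_POP, AREA_SQMI, PCT_M11M12 and POP_DENS are made-up field names (only TRT_POPM11 and TRT_POPM12 come from my data).

```python
import csv
import io


def add_calculated_fields(rows):
    """Return copies of rows with SUM_M11M12, PCT_M11M12 and POP_DENS added."""
    out = []
    for row in rows:
        pop = float(row["TRT_POP"])      # hypothetical total-population field
        area = float(row["AREA_SQMI"])   # hypothetical area field
        total = float(row["TRT_POPM11"]) + float(row["TRT_POPM12"])
        new_row = dict(row)
        new_row["SUM_M11M12"] = total
        # Guard against empty tracts and zero areas so we never divide by zero
        new_row["PCT_M11M12"] = 100.0 * total / pop if pop else None
        new_row["POP_DENS"] = pop / area if area else None
        out.append(new_row)
    return out


# Made-up sample standing in for an exported attribute table
sample = io.StringIO(
    "TRACT,TRT_POP,TRT_POPM11,TRT_POPM12,AREA_SQMI\n"
    "000100,4000,300,500,0.5\n"
    "000200,0,0,0,0.25\n"
)
rows = add_calculated_fields(csv.DictReader(sample))
for r in rows:
    print(r["TRACT"], r["SUM_M11M12"], r["PCT_M11M12"], r["POP_DENS"])
```

The zero checks are there because a tract with no population would otherwise blow up the percent calculation; returning None for those rows is just one possible choice.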

Here's how I've been adding a new field and calculating in ArcGIS:

I created a model that has input parameters and lets the user enter everything needed to add a new field and calculate it. It saves me just a little time by not having to add both tools or type out all of these lengthy commands.

Here's the Python code:
# Import arcpy module
import arcpy

# Script arguments
Input_Table = arcpy.GetParameterAsText(0)

New_Field_Name = arcpy.GetParameterAsText(1)

Field_Type = arcpy.GetParameterAsText(2)
if Field_Type == '#' or not Field_Type:
    Field_Type = "LONG" # provide a default value if unspecified

Calculate_Field_Name = arcpy.GetParameterAsText(3)

Calc_Expression = arcpy.GetParameterAsText(4)

Field_Expression_Type = arcpy.GetParameterAsText(5)
if Field_Expression_Type == '#' or not Field_Expression_Type:
    Field_Expression_Type = "PYTHON_9.3" # provide a default value if unspecified

# Local variables:
Output_Feature_Class = Input_Table
Output_Feature_Class__2_ = Output_Feature_Class

# Process: Add Field
arcpy.AddField_management(Input_Table, New_Field_Name, Field_Type, "", "", "", "", "NULLABLE", "NON_REQUIRED", "")

# Process: Calculate Field
arcpy.CalculateField_management(Output_Feature_Class, Calculate_Field_Name, Calc_Expression, Field_Expression_Type, "")

Granted, there are parameters instead of real values, but you get the idea: each of those parameters needs to be filled out, and all of this code needs to run.

Here's the Stata code:

gen SUM_M11M12 = TRT_POPM11 + TRT_POPM12

That was easy!

Granted, this was only a numeric calculation, and I still feel super confident about my Python/VBScript string 'skillz'[:5] + 's'. Just saying, though: Stata is super fast, and at the end of the day, those last numeric calcs might be best served in Stata.