none
Converting a geometry polygon (eastings/northings) to geography polygon (lat/long)

    Question

  • Hi guys,

    First off, my understanding of Spatial concepts is pretty poor, so plesae excuse me if this question is fundamentally dumb.

    I have a UK OS dataset which contains a polygon for each postcode in the UK. They are stored at MID/MIF files. I want to store the polygons in a SQL geography field. Obviously, there are going to be some conversion hoops to jump through.

    I managed to use the MapInfo universal translator to convert the files to shapefiles, knowing that I could then use Shape2SQL to import them into a SQL geography field. I've done this a few times with other polygon datasets, and it's worked well.

    The conversion to shapefiles worked fine, but when I went to use Shape2SQL to load the shapefile into SQL, I could only load into a geometry - when I tried to load into geography, it gave me a warning about the points being out of range.

    I was a bit puzzled, but loaded the data into a geometry field, thinking I could then convert geometry to geography with no problem. However, when I examined the content of the gemetry field by CONVERTing it to text, I could see the polygon was made up of points stored as eastings and northings.

    I am assuming that the geography datatype can only store polygons where the points are stored as lat / long; even with my limited understanding, this would make sense - eastings and northings are 'flat', and geography is 'round'.

    In principle then, do I need to convert each point in the geometry polygon to lat/long, then convert the geometry polygon to geography? And if this is the case, what is the best tool for doing this?

    Any help would be greatly appreciated.

    Sam

     

    Friday, November 25, 2011 9:45 AM

Answers

  • Hi Sam,

    Not a dumb question at all. In fact, I'd say from your description of the problem that your understanding of spatial concepts is pretty good, actually :)

    You're absolutely correct - geography is for lat/long data. geometry is for x/y data.

    I'm guessing your postcodes are defined using the OS National Grid of Great Britain, which, by default, should be imported into SQL Server using the geometry datatype and SRID 27700.

    However, all spatial data used in a given query in SQL Server (say, to find out which points intersect a polygon) needs to be defined using the same datatype, and the same SRID. So, if you want to use this postcode data in queries involving your other geography data then you need to convert it to whatever coordinate system that data is defined in (probably SRID 4326), and use the corresponding geography datatype instead.

    SQL Server does not provide the ability to reproject data, so you need to convert the data outside of SQL Server prior to importing it. Shape2SQL can't do this, but I wrote a few posts using OGR2OGR (open source) to convert and load Ordnance Survey data from shapefile to SQL Server which you might find helpful:

    http://alastaira.wordpress.com/2011/02/21/using-ogr2ogr-to-convert-reproject-and-load-spatial-data-to-sql-server/

    (shows how to reproject OS data from geometry 27700 to geography 4326 and saves the result as WKT because, at the time, OGR2OGR couldn't load directly to SQL Server. If you get the latest version 1.8+ it can, as described in the next post:)

    http://alastaira.wordpress.com/2011/06/18/importing-spatial-data-to-sql-server-with-ogr2ogrnow-even-easier/

     


    twitter: @alastaira blog: http://alastaira.wordpress.com/
    • Marked as answer by Sam Loud Friday, November 25, 2011 5:31 PM
    Friday, November 25, 2011 4:55 PM

All replies

  • Hi Sam,

    Not a dumb question at all. In fact, I'd say from your description of the problem that your understanding of spatial concepts is pretty good, actually :)

    You're absolutely correct - geography is for lat/long data. geometry is for x/y data.

    I'm guessing your postcodes are defined using the OS National Grid of Great Britain, which, by default, should be imported into SQL Server using the geometry datatype and SRID 27700.

    However, all spatial data used in a given query in SQL Server (say, to find out which points intersect a polygon) needs to be defined using the same datatype, and the same SRID. So, if you want to use this postcode data in queries involving your other geography data then you need to convert it to whatever coordinate system that data is defined in (probably SRID 4326), and use the corresponding geography datatype instead.

    SQL Server does not provide the ability to reproject data, so you need to convert the data outside of SQL Server prior to importing it. Shape2SQL can't do this, but I wrote a few posts using OGR2OGR (open source) to convert and load Ordnance Survey data from shapefile to SQL Server which you might find helpful:

    http://alastaira.wordpress.com/2011/02/21/using-ogr2ogr-to-convert-reproject-and-load-spatial-data-to-sql-server/

    (shows how to reproject OS data from geometry 27700 to geography 4326 and saves the result as WKT because, at the time, OGR2OGR couldn't load directly to SQL Server. If you get the latest version 1.8+ it can, as described in the next post:)

    http://alastaira.wordpress.com/2011/06/18/importing-spatial-data-to-sql-server-with-ogr2ogrnow-even-easier/

     


    twitter: @alastaira blog: http://alastaira.wordpress.com/
    • Marked as answer by Sam Loud Friday, November 25, 2011 5:31 PM
    Friday, November 25, 2011 4:55 PM
  • What a sensational answer; I shall have a go with your OGR2OGR approach. Thank you very much indeed.
    Friday, November 25, 2011 5:33 PM
  • OK, I finally accomplished what I set out to do, and I'm going to share a few of the things I learned on here - I know most people on here are probably spatial ninjas; if this is you, please ignore, but someone with limited spatial expereince might need to do what I did, and this may help them.

     

    I wanted to load OS Code-Point with polygons data into a SQL geography datatype. The problem was that the polygons in Code-Point stored their points as Eastings / Northings, and geography needs them stored as lat / long.

    Acting on Alistair's advice, and fantastic blog posts, I downloaded the OGR2OGR tool, and built the solution. Unfortuntely, when I ran it, it threw errors indicating that there was a missing dll - proj.dll. After Googling it, I found that this was a fairly common problem, and downloaded  proj446_win32_bin.zip from http://trac.osgeo.org/proj/. After copying these files into OGR2OGR/bin, this problem went away.

    Next, I planned to run OGR2OGR, pulling the raw-ish data from the Code-Point MID/MIF files all the way into the SQL geography field. I hoped I could do all of the transforms / reprojections 'in-flight' without staging the data anywhere. This was not to be.

    Although theoretically possible, whenever I tried to load into a geography datatype I got System.ArgumentException: 24200: The specified input does not represent a valid geography instance. The data loaded into a geometry datatype just fine, but even after MakeValid, and SET ogr_geography = geography::STGeomFromWKB(ogr_geometry.STAsBinary(), 4326), I couldn't get the data to load into the geography field.

    So, I ended up using OGR2OGR to convert from MIF to .shp. The reprojection from EPSG:2770 to EPSG:4326 worked just fine. Getting the command line syntax just right was a bit tricky, but in the end, this is what worked for me (be really, really careful with the quoting):

    C:\OGR2OGR\bin\ogr2ogr.exe -s_srs EPSG:27700 -t_srs EPSG:4326 -f "ESRI Shapefile" "D:\SSIS\DataWarehouseIn\OSData\CodePoint\MK.shp" "D:\SSIS\DataWarehouseIn\OSData\CodePoint\MK.mif"

    So then I used Shape2SQL to load the .shp into the geography field, which worked fine. However, I had 120 of the MIF files to process, so it had to be automated. I used the (somewhat lightly documented) Shape2SQL command line. Again, I got the syntax wrong a few times before I got it right, so here's what worked for me:

    C:\Shape2SQL\Shape2Sql.exe -shp="D:\SSIS\DataWarehouseIn\OSData\CodePoint\MK.shp" -connstr="Data Source=.;Initial Catalog=XXX_Staging;Integrated Security=SSPI;" -table="STG_PostcodePolygons" -srid="4326" -oidname="ID" -geomname="geography" -append="true" -geography=true

     

    Now I needed to pull all this together into a SSIS package that would loop over all of the MIF files in the directory, and use OGR2OGR to conver to the .shp, then Shape2SQL to load the .shp into the dB table.

    This required an Execute Process Task, and you won't be suprised to know that there were various problems with this. It's a fairly well known SSIS trick, when using Execute Process Tasks to call cmd.exe and pass in the actual executable you want to run along with the /c switch, and all of the arguments. I ended up doing this.

    Only 2 things need setting in the Execute Process task: Executable, and an EXPRESSION on the Arguments

    N.B. @[User::Filename] is the file SSIS is currently workign on

    Executable:

    c:\windows\system32\cmd.exe

    EXPRESSION on Arguments:

     

    "/c C:\\OGR2OGR\\bin\\ogr2ogr.exe -s_srs EPSG:27700 -t_srs EPSG:4326 -f \"ESRI Shapefile\" \"D:\\SSIS\\DataWarehouseIn\\OSData\\CodePoint\\"+ @[User::Filename]+".shp\" \"D:\\SSIS\\DataWarehouseIn\\OSData\\CodePoint\\"+ @[User::Filename]+".mif\" -overwrite"

    You'll  note there is a lot of double quote escaping going on - this is the \ before each ". Note also the double \\ in the file paths - this is escaping the \.

     

    The same goes for Executing Shape2SQL in the Execute Process task

     

    Executable:

    c:\windows\system32\cmd.exe

    EXPRESSION on Arguments:

    "/c C:\\Shape2SQL\\Shape2Sql.exe -shp=\"D:\\SSIS\\DataWarehouseIn\\OSData\\CodePoint\\"+ @[User::Filename]+".shp\" -connstr=\"Data Source=.;Initial Catalog=XXX_Staging;Integrated Security=SSPI;\" -table=\"STG_PostcodePolygons\" -srid=\"4326\" -oidname=\"ID\" -geomname=\"geography\" -append=\"true\" -geography=true"

     

    Sorry for length of this post, just hope it may help someone. Tip of the hat to Alistair for all his ace info.

     

    Wednesday, November 30, 2011 2:42 PM