Converting GeoPDFs with GDAL
GDAL, the Geospatial Data Abstraction Library, is a set of tools for reading and writing geospatial raster data formats. It is open-source software released under an MIT-style license by the Open Source Geospatial Foundation (OSGeo). It can be used as an alternative to programs such as Global Mapper and Esri’s ArcGIS for raster processing tasks.
In this tutorial, GDAL is run entirely from the command prompt window in Microsoft Windows. For more information on using GDAL, please visit the GDAL website. There you can find help forums, tutorials, and documentation for the GDAL utilities, including gdal_translate, the utility featured in this blog posting.
If you haven’t already downloaded your USGS Topo Maps, please revisit the blog posting Using the National Map Viewer to Download US Topo Maps.
To begin, open the OSGeo4W Shell that was installed with OSGeo4W. You should be able to find this on your desktop. Once the Shell has been opened, you should see the following:
Now, we will use the following batch script to transform USGS GeoPDF topographic maps into the GeoTIFF format for easier use with ArcGIS.
Please note that this will only convert your file from GeoPDF to GeoTIFF. Although we will remove information from the collar such as scale bars and text, this will not remove the actual collar. A white border will still remain. To remove this remaining collar, you can use ArcGIS.
Here is some important information about this script:
- This will transform the GeoPDF file to a GeoTIFF file.
- The GDAL_PDF_LAYERS_OFF configuration option turns off only the listed layers and keeps every other layer, which is convenient when most layers are wanted in the output. This also helps to create a GeoTIFF that is similar in appearance to USGS Topo Maps.
- Exclude both the Shaded Relief and Orthoimages layers to increase processing speed and to decrease output size.
- The -of GTiff option sets the output format to GeoTIFF.
- Change the DPI (GDAL_PDF_DPI) to the desired level. A value of 400 gives good quality but a large file (around 900 MB to 1 GB).
- Use the @echo on command to show each command and its progress in the OSGeo4W Shell.
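To get a feel for the DPI trade-off mentioned in the list above, here is a rough Python sketch. Python is an assumption here, since the tutorial itself only uses a batch script. The sketch assumes file size grows with pixel count, that is, with the square of the DPI, which is roughly true for uncompressed GeoTIFFs. The helper name estimate_size_mb is hypothetical, and the 900 MB reference is simply the lower figure quoted above.

```python
def estimate_size_mb(target_dpi, ref_dpi=400, ref_size_mb=900.0):
    """Scale a known file size to another DPI.

    Assumption: size is proportional to pixel count, i.e. to DPI squared.
    Compression or a different band count would change the real numbers.
    """
    if target_dpi <= 0 or ref_dpi <= 0:
        raise ValueError("DPI values must be positive")
    scale = (target_dpi / ref_dpi) ** 2
    return ref_size_mb * scale


if __name__ == "__main__":
    for dpi in (200, 300, 400):
        print(f"{dpi} DPI -> roughly {estimate_size_mb(dpi):.0f} MB")
```

Under these assumptions, halving the DPI from 400 to 200 cuts the output to about a quarter of its size.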
@echo on
FOR /F "delims=" %%i IN ('DIR /B *.pdf') DO (
  gdal_translate -of GTiff "%%i" "%%~ni.tif" --config GDAL_PDF_LAYERS_OFF "Map_Collar","Map_Frame.Projection_and_Grids","Map_Frame.Terrain.Shaded_Relief","Images.Orthoimage" --config GDAL_PDF_DPI 400
)
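For readers who prefer Python to batch files, the same loop can be sketched as follows. This is not part of the original workflow. It assumes a Python interpreter is available and that gdal_translate is on the PATH, as it is inside the OSGeo4W Shell. The function names build_command and convert_all are hypothetical, and writing each GeoTIFF next to its PDF with a .tif extension is an assumption.

```python
import subprocess
from pathlib import Path

# Layers to switch off, copied from the batch script above.
LAYERS_OFF = [
    "Map_Collar",
    "Map_Frame.Projection_and_Grids",
    "Map_Frame.Terrain.Shaded_Relief",
    "Images.Orthoimage",
]


def build_command(pdf_path, dpi=400):
    """Return the gdal_translate argument list for one GeoPDF."""
    pdf_path = Path(pdf_path)
    # Assumption: the output sits beside the input, with a .tif extension.
    tif_path = pdf_path.with_suffix(".tif")
    return [
        "gdal_translate", "-of", "GTiff",
        str(pdf_path), str(tif_path),
        "--config", "GDAL_PDF_LAYERS_OFF", ",".join(LAYERS_OFF),
        "--config", "GDAL_PDF_DPI", str(dpi),
    ]


def convert_all(folder=".", dpi=400):
    """Run gdal_translate on every .pdf in folder (needs GDAL installed)."""
    for pdf in sorted(Path(folder).glob("*.pdf")):
        subprocess.run(build_command(pdf, dpi), check=True)


if __name__ == "__main__":
    # Dry run: print the command for a hypothetical file name.
    print(" ".join(build_command("example_quad.pdf")))
```

Calling convert_all() from the folder that holds the GeoPDFs would perform the actual conversion. Passing the arguments as a list avoids quoting problems with file names that contain spaces.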
Before we enter the script into the OSGeo4W Shell, there are a few preliminary steps that must be taken.
First, we will need to navigate to the drive and directory containing the GeoPDFs. To do this, type the following:
- On the first line, type your drive letter followed by the colon. Hit ENTER when done.
- See the yellow underlined portion above.
- On the second line, begin by typing lowercase cd followed by a space. This command means “change directory”. Next, type in your directory location.
- See the orange underlined portion above.
- You can also copy the directory path from Windows Explorer and paste it by right-clicking in the Shell window and selecting Paste.
- You are now ready to run your script.
- First, copy and paste the script text from the box above into a Notepad document.
- Save this in the same directory as your GeoPDF files. Give it a name such as “GDAL_GeoPDF_to_GeoTIFF_Script.bat”
- Make sure you save this as a .bat file (batch file). This will allow you to select the file in the OSGeo4W Shell and run the script.
- Next, in your Shell window, use the TAB key to scroll through the files within your directory until you have your batch file (.bat) selected.
- Press ENTER to begin running.
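Before starting a long conversion, it can help to confirm that the Shell is pointed at the right folder. The following Python sketch is an optional check, not part of the original steps. It lists each GeoPDF in the current directory with the GeoTIFF name a conversion would be expected to produce. The function name plan_outputs is hypothetical, and the .tif naming is an assumption.

```python
import os


def plan_outputs(filenames):
    """Map each GeoPDF name to the GeoTIFF name a conversion would produce."""
    plan = {}
    for name in sorted(filenames):
        base, ext = os.path.splitext(name)
        if ext.lower() == ".pdf":
            # Assumption: outputs reuse the input base name with .tif.
            plan[name] = base + ".tif"
    return plan


if __name__ == "__main__":
    folder = os.getcwd()
    print(f"Current directory: {folder}")
    for pdf, tif in plan_outputs(os.listdir(folder)).items():
        exists = os.path.exists(os.path.join(folder, tif))
        status = "exists" if exists else "new"
        print(f"{pdf} -> {tif} ({status})")
```

If the listing is empty, the Shell is probably not in the directory that holds the GeoPDF files.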
You should see something that looks similar to this:
The red arrow above indicates where you will see the progression towards completion. This will increase to 100 when processing is completed. This process will repeat for as many GeoPDF files as you are converting.
When finished, you will have GeoTIFF files with a white border area. To remove these collars, use Esri ArcGIS to clip off the borders.
This blog posting was developed with the support of a competitive grant (cooperative agreement number P09AC00212; task agreement number P13AC00875) from the National Park Service in partnership with the North Atlantic Coast Cooperative Ecosystems Studies Unit. It is part of a larger document available for download on the IRMA Portal.