Using AI to Digitize Data From Scanned Maps

A LinkedIn announcement by Michael Egan, one of the cofounders of Bunting Labs, recently caught my attention. Bunting Labs has a new release, the QGIS AI Map Tracing Plugin, a tool for AI-driven automated digitization of features from scanned maps and plans. Egan and cofounder Brendan Ashworth developed the plugin to speed up the process of extracting geographic features from these sources.

Typically, extracting geospatial information from a PDF involves a laborious process of manually tracing features, such as lines, polygons, and points, from the PDF or scanned map image to create GIS vector data. Depending on the complexity of the PDFs or scanned maps, this process can be quite time-consuming.

To automate this process, Bunting Labs launched the map tracing plugin after six months of development and 158,000 map training datapoints. The plugin is designed to replace the Add Line/Polygon Feature or Create Features tool in QGIS that GIS users normally use for heads-up digitizing. Instead, the user seeds the digitizing process by tracing the beginning of a line or polygon and then lets the plugin's autocomplete feature take over.

Given the value that a plugin like this offers for rapidly extracting data from scanned maps and PDFs, I wanted to test it out on a sample map. Users can try the QGIS AI Map Tracing Plugin as a demo. If you provide your work email, you can test up to 2,000 completions, which Bunting Labs defines as “every line segment (additional vertex) added to a line or polygon after you click.”

How to install the QGIS AI Map Tracing Plugin

The plugin is available either via the QGIS plugin repository (Plugins –> Manage and Install Plugins and then searching for Bunting Labs) or by downloading it from the Bunting Labs site and then uploading it into QGIS.

Once the plugin is installed, you can take advantage of the 2,000 completion credits by registering your work email. Registering your work email will then give you access to the “Plugin Secret Key” to take advantage of those credits.

Testing out the Digitizing plugin with some sample maps

Testing out the features of the plugin requires that a digital file such as a GeoTIFF, TIFF, PDF, or other image format be loaded directly into QGIS. Serving an image from a Web service/REST Server doesn’t work. For this review, I georeferenced a historical map to test with, but if you don’t want to georeference a scanned map yourself, search for a historical map that already has geographic coordinates attached, like this collection of maps from the David Rumsey Map Collection.
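For readers unsure what "georeferenced" means in practice: a georeferenced raster carries a transform that converts pixel positions into map coordinates. The sketch below shows the six-parameter affine form used by GDAL-style geotransforms. The function name and all of the numbers are made up for illustration and do not describe the Clay County scan.

```python
from typing import Tuple

# (origin_x, pixel_width, row_rotation, origin_y, column_rotation, pixel_height)
GeoTransform = Tuple[float, float, float, float, float, float]


def pixel_to_map(gt: GeoTransform, col: float, row: float) -> Tuple[float, float]:
    """Convert a pixel position (col, row) to map coordinates (x, y)."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


# Hypothetical north-up raster: 2 map units per pixel, origin at the top-left
# corner. The pixel height is negative because rows increase downward.
example_gt: GeoTransform = (500000.0, 2.0, 0.0, 4800000.0, 0.0, -2.0)

print(pixel_to_map(example_gt, 0, 0))
print(pixel_to_map(example_gt, 100, 50))
```

A scan without this kind of transform is just an image, so any features traced from it would have no real-world position.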

The historical map that I selected was this scanned 1896 map of property owners in Clay County, Iowa. (Side note: I chose this map in part because the cartographer who made the map apparently couldn’t even be bothered to spell county correctly.)

Map typos aside, this map is a good example of an older scanned map in typical condition, containing the types of features that a cartographer might want to extract from a historical map.

Using the AI Vectorizer to autocomplete digitization of features from a map

Screenshot showing how to create a new layer in QGIS.

Select Layer –> Create Layer –> New Shapefile Layer to set up a new dataset in QGIS for digitizing.

Using the AI digitizing plugin is pretty simple. After adding the georeferenced map to a QGIS project, I set up a new data layer with polygons as the feature class so I could test the plugin’s capabilities when digitizing one of the lakes on the Clay County map.

Screenshot showing how to set up a new shapefile layer.

For digitizing lakes from the 1896 Clay County map, I set up the shapefile with a polygon geometry type.

I focused on digitizing Swan Lake, which is located in the upper right quadrant of the map.

Screenshot showing the pencil icon tool that toggles editing on in QGIS.

Clicking on the pencil icon toggles editing to on in QGIS.

Following the instructions from Bunting Labs, I first toggled on the editing mode for the newly created polygon shapefile by clicking the pencil icon. Once editing mode is on, the “Vectorize with AI” tool button changes from a grayed-out state to an active one.

Screenshot showing the location of the "vectorize with AI" tool.

Clicking on the “Vectorize with AI” tool activates the plugin.

The “Vectorize with AI” tool replaces the “Add Line/Polygon Feature” or “Create Features” tool that you would normally use to start digitizing features. Clicking on “Vectorize with AI” activates the tool, and the cursor changes from a black arrow to a target. The tool itself seems to have no settings – you simply trigger the tool and seed the digitizing by tracing a small section of the feature you want to capture.

To start the AI-driven digitization, I placed three clicks along the lake polygon feature I wanted to capture and moved my cursor inside the polygon. Watching the plugin in action is really quite satisfying – for the most part, the plugin was able to cleanly trace the boundaries of the lake, entering far more vertexes than I would have manually digitized. If you want to see this automated digitizing in action, Bunting Labs has a short clip of what it looks like.

A screenshot showing a drawing of a lake on a map with a red screen overlay indicating where the AI plugin is digitizing the feature.

The AI vectorizing plugin in progress.

There was one section where the AI plugin veered off along the property lines adjacent to the lake.

A screenshot showing a close up view of a lake on an old map with red screen overlay showing the results of digitization veering off from the lake and along property lines.

Property lines near the lake caused the AI digitizing plugin to veer off course.

The solution to errant digitizing, according to Bunting Labs, is to press the Shift key and left-click while hovering over the section where the digitizing went off course, which cuts that area. Following those instructions, I found correcting the over-digitization pretty straightforward, even when restricted to a trackpad.

Side by side map clips showing correcting errant digitizing.

Using the Shift key halts the automated digitizing. The last digitized vertex is then highlighted in teal so the user can go back and remove errant vertexes.

Hitting the Shift key highlights the last vertex in teal which makes it easy to hover over it and click to remove. After removing the errant digitizing, I was able to then manually click new vertexes in the problem area to move the AI plugin past the confusing part and then let the vectorizer resume automating.

My end result from digitizing Swan Lake from the map was fairly decent. The vertexes that the AI tool digitized were cleaner than what I would have produced by digitizing manually, although there were still some jagged sections where I had stepped in to manually move the tool past some tricky parts.

It did take some trial and error to get comfortable with how the tool functions. I found if I wasn’t quite quick enough with the shift key to pause the AI digitizing, too many errant vertexes were created and I just ended up with weird polygons trying to fix it. It was easier to delete the feature and try again, making sure to pause the AI tool before too many vertexes were digitized.
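One way a polygon ends up "weird" after botched fixes is that its boundary crosses itself. The sketch below is not part of the plugin. It is a minimal pure-Python check, with hypothetical function names, that flags a ring if any two of its edges properly cross. As a simplifying assumption it ignores edges that merely touch or overlap along a line.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def _orient(a: Point, b: Point, c: Point) -> float:
    """Signed area test: >0 if c is left of a->b, <0 if right, 0 if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _segments_cross(p1: Point, p2: Point, p3: Point, p4: Point) -> bool:
    """True only when the two segments cross at a single interior point."""
    d1 = _orient(p3, p4, p1)
    d2 = _orient(p3, p4, p2)
    d3 = _orient(p1, p2, p3)
    d4 = _orient(p1, p2, p4)
    return d1 * d2 < 0 and d3 * d4 < 0


def ring_self_intersects(ring: List[Point]) -> bool:
    """Check a ring given as ordered vertexes, first vertex not repeated at the end."""
    n = len(ring)
    for i in range(n):
        a1, a2 = ring[i], ring[(i + 1) % n]
        for j in range(i + 1, n):
            b1, b2 = ring[j], ring[(j + 1) % n]
            # Adjacent edges share an endpoint, so one orientation value is
            # zero and they are never reported as a proper crossing.
            if _segments_cross(a1, a2, b1, b2):
                return True
    return False


square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
bowtie = [(0.0, 0.0), (4.0, 4.0), (4.0, 0.0), (0.0, 4.0)]
print(ring_self_intersects(square), ring_self_intersects(bowtie))
```

The pairwise loop is quadratic in the number of vertexes, which is fine for a single lake outline but would be slow on very dense features.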

A completed polygon that has been digitized with artificial intelligence from a map.

The completed AI digitized polygon for Swan Lake.

Overall thoughts

This tool is quite exciting given its promise. In my testing, the AI vectorizing tool definitely sped up onscreen digitization by rapidly identifying and vectorizing features from the scanned maps. Older maps can have a lot of inherent noise due to age and the scanning process, so there is understandably still quite a bit of errant digitizing. This isn’t a completely hands-off process.

I found more success when I actively guided the tool by manually adding vertexes in spots where I had previously seen the tool go haywire and start digitizing features outside of the area I wanted to capture.

Using the Shift key to pause the AI digitizing and then manually removing errant vertexes was relatively straightforward. Depending on the level of spatial accuracy required, there would still need to be some manual cleaning of the end vector product to adjust for areas of the features that deviated from the original source map.

Screenshot showing the AI vectorization tool jumping off the screen.

A section of a road on a map where the AI vectorizing tool jumped off the screen.

In some situations, the tool went far off track – jumping to features outside of the current view of the map canvas even though there were no features near the one I was trying to digitize. There are no settings to set tolerance levels or constrain the tool to specific areas of the map. Adding these settings might help reduce user frustration in a situation like this.

Overall, the tool is definitely something I would want in my toolkit for extracting features from scanned maps. Even with errant digitizing, it produced a faster and cleaner vectorization of the lake polygon from a yellowed historical map, with only minor manual adjustments.

The 2,000 completion credits that come with the free trial do deplete quickly. In testing the plugin on both polygon and line features, I quickly ran down the credits. If you have only a map or two a month that you want to digitize a few features from, the free trial version allows up to 1,500 completion credits per month.

For users with a larger digitization workload, there are two pricing tiers. Both pricing tiers give you access to faster and more accurate AI models as well as priority support. The highest tier (currently $99/month) also provides user-provided map training to further improve the AI model. Bunting Labs is also offering free 20-minute demo sessions with the developers.
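To get a feel for how quickly credits run out, here is a back-of-the-envelope sketch. It assumes that each vertex the plugin adds costs one completion, following the Bunting Labs definition quoted earlier, and that vertexes clicked by hand are not charged. The source does not confirm the second assumption, and the vertex counts are hypothetical rather than measured from my Swan Lake polygon.

```python
def completions_used(total_vertices: int, manual_vertices: int) -> int:
    """Estimate completions for one feature: one per AI-added vertex (assumed)."""
    if manual_vertices < 0 or manual_vertices > total_vertices:
        raise ValueError("manual_vertices must be between 0 and total_vertices")
    return total_vertices - manual_vertices


def features_per_budget(budget: int, total_vertices: int, manual_vertices: int) -> int:
    """How many similar features fit in a completion budget."""
    per_feature = completions_used(total_vertices, manual_vertices)
    if per_feature == 0:
        raise ValueError("feature uses no completions; budget is never consumed")
    return budget // per_feature


# Hypothetical lake outline: 250 vertexes in total, 3 of them seed clicks.
print(completions_used(250, 3))           # completions for one such lake
print(features_per_budget(2000, 250, 3))  # lakes covered by 2,000 credits
print(features_per_budget(1500, 250, 3))  # lakes covered by 1,500 credits
```

Under these assumptions a single detailed outline uses a sizable share of the budget, and every discarded attempt costs credits as well.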

Source: National Geographic