Building an interactive tool to generate image maps

Published on June 04, 2024 under the Coding category.

Image maps let you make images where you can click on different regions to open web pages. For example, you could have an image map of a whiteboard with sticky notes where every sticky note linked to a document about that note. I use image maps across my site, such as on my explore page where I let you explore my blog by clicking items on my desk (there is even a musical easter egg on that page!).

With that said, making image maps is time-consuming: you have to manually draw the polygons around the objects that you want to make clickable and add URLs.

At IndieWebCamp Düsseldorf this year, I built a tool that runs on your computer and makes it easier to generate image maps. The tool lets you click on an object of interest in an image and assign a URL to that object. The tool then generates the corresponding image map code, which you can copy and paste onto your website. The generated code automatically includes a reference to David J. Bradshaw’s open source image-map-resizer, a package that makes image maps responsive on mobile devices and different screen resolutions.

In this article, I am going to summarise how I built the tool, and what I learned doing so.

Here is a demo of the tool I made:

The scene: A hackathon

IndieWebCamps culminate in a Create Day, in which participants have a few hours to create a project related to their websites. I took on a project to build a browser-based tool for generating image maps.

My plan was for the application to:

Let you upload an image.
Download a version of the Segment Anything machine learning model, which can identify precise boundaries around objects in an image.
Let you click on regions of an image to segment them, and allow you to assign a URL that will open when a user clicks that region in your image.
Generate the code for an image map that will work with your image as you go.

I experimented with an open source implementation of Segment Anything in the browser, but the code was too complex for me to understand in the time I had. I needed more guidance. I learned a valuable lesson: with only a few hours available, I should not take on a project that involves learning a complex technology with which I have limited experience.

I decided to re-orient. Instead of running the model in the browser, I decided to make the tool a local-first web application. This means that the model would run on your machine, with the browser acting only as the interface. While not ideal, this is something I was able to accomplish in the time I had.

Building the system

With my local-first decision made, I created a blank Flask Python application. This application loads a version of Segment Anything called FastSAM. This model lets you provide a point in an image and returns a polygon that outlines the region of the image on which you clicked. For example, if you click on a sticky note, FastSAM should return a polygon whose edges align with the sticky note.
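To make the "point in, region out" idea concrete, here is a minimal sketch in plain Python. It does not call FastSAM. It assumes the model has already produced a set of candidate masks for the image, and it picks the smallest mask that covers the clicked point, so that clicking a sticky note selects the note rather than the desk behind it. The names `Mask`, `mask_area`, and `select_mask_at_point` are hypothetical and are not taken from the project's code.

```python
from typing import List, Optional, Tuple

# Assumption: a mask is a grid where mask[y][x] is True if the object covers that pixel.
Mask = List[List[bool]]


def mask_area(mask: Mask) -> int:
    """Count the pixels covered by a mask."""
    return sum(sum(1 for cell in row if cell) for row in mask)


def select_mask_at_point(masks: List[Mask], point: Tuple[int, int]) -> Optional[int]:
    """Return the index of the smallest mask covering point (x, y), or None."""
    x, y = point
    best_index: Optional[int] = None
    best_area: Optional[int] = None
    for index, mask in enumerate(masks):
        # Skip masks that do not contain the clicked pixel at all.
        if not (0 <= y < len(mask) and 0 <= x < len(mask[y])):
            continue
        if not mask[y][x]:
            continue
        area = mask_area(mask)
        if best_area is None or area < best_area:
            best_index, best_area = index, area
    return best_index


# Toy 4x4 image: the "desk" covers everything, the "note" covers the centre.
desk: Mask = [[True] * 4 for _ in range(4)]
note: Mask = [
    [False, False, False, False],
    [False, True, True, False],
    [False, True, True, False],
    [False, False, False, False],
]

chosen = select_mask_at_point([desk, note], (1, 1))
print(chosen)
```

In the real tool the chosen mask would still need to be converted into polygon coordinates before it can be used in an image map. That step is left out of this sketch.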

When you run the application from the command line, you need to provide an image to load. This image is then loaded into the model and displayed in the web application. From the web application, you can click on an object to generate the polygon boundaries for that whole object.

The user flow is as follows:

Click on an object.
If the object boundary is correct, you can type a URL and press enter to save the polygon.
If the object boundary is incorrect, you can click inside the polygon to remove the part that is incorrect, or outside the polygon to expand it to the new place you clicked.
The image map is generated as you go.
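The refinement step in the flow above can be sketched as a small decision: a click inside the current polygon is treated as a "remove this" point, and a click outside is treated as an "add this" point. The sketch below uses the Segment Anything convention of labelling foreground points 1 and background points 0. Whether the tool stores its clicks this way is an assumption, and `point_in_polygon` and `label_click` are hypothetical names.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: count how many polygon edges a ray to the right crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def label_click(point: Point, current_polygon: List[Point]) -> int:
    """Return 1 (foreground, expand) or 0 (background, remove) for a click."""
    if not current_polygon:
        # First click on an object: always a foreground point.
        return 1
    return 0 if point_in_polygon(point, current_polygon) else 1


square = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(label_click((5, 5), square))   # click inside the current polygon
print(label_click((15, 5), square))  # click outside the current polygon
```

The labelled points would then be passed back to the model together, so each extra click narrows or widens the selection.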

With this flow, a user – me! – can make an image map by clicking on an object once (in the default case), rather than having to carefully draw polygon boundaries around an object.
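Once a polygon and a URL have been saved, turning them into image map markup is mostly string formatting. Here is a minimal sketch that builds the `img`, `map`, and `area` elements from a list of saved regions. The function name `build_image_map` and the per-region label (used for `alt` text) are my own additions for illustration, and the sketch leaves out the image-map-resizer reference that the real tool includes.

```python
from html import escape
from typing import List, Tuple

Polygon = List[Tuple[float, float]]
# Assumption: each saved region is (polygon, url, label). The label is hypothetical.
Region = Tuple[Polygon, str, str]


def build_image_map(image_src: str, map_name: str, regions: List[Region]) -> str:
    """Return HTML for an image plus a <map> with one polygon <area> per region."""
    lines = [
        f'<img src="{escape(image_src)}" usemap="#{escape(map_name)}" alt="">',
        f'<map name="{escape(map_name)}">',
    ]
    for polygon, url, label in regions:
        # The coords attribute for shape="poly" is a flat list: x1,y1,x2,y2,...
        coords = ",".join(f"{int(x)},{int(y)}" for x, y in polygon)
        lines.append(
            f'  <area shape="poly" coords="{coords}" '
            f'href="{escape(url)}" alt="{escape(label)}">'
        )
    lines.append("</map>")
    return "\n".join(lines)


regions = [([(1, 2), (3, 4), (5, 6)], "https://example.com/?a=1&b=2", "Sticky note")]
markup = build_image_map("desk.jpg", "desk", regions)
print(markup)
```

Escaping the URL and label matters because they are typed in by hand and may contain characters such as `&` or quotes.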

The web page shows the image you selected when you ran the application from the command line. On top of this image, there is a canvas. Every time the user clicks on the image, the point they clicked on the canvas is sent to the web application. The web application then runs FastSAM and returns the polygon from the model. This polygon is then displayed in the web browser. Here is the demo mentioned earlier in this guide that shows the system in action:

The polygon coordinates are then used to draw a polygon in the browser with a blue fill colour, which makes it clear to the user what region has been selected.
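The round trip described in this section, where a click goes to the web application and a polygon comes back, can be sketched as a single handler function. The sketch is framework-agnostic so it runs without Flask. In a Flask application it would be called from a route. The JSON field names `x`, `y`, and `polygon`, the function `handle_click`, and the stand-in `fake_segment` are all assumptions for illustration and are not the project's actual interface.

```python
import json
from typing import Callable, List, Tuple

Polygon = List[Tuple[int, int]]


def handle_click(body: str, segment: Callable[[int, int], Polygon]) -> Tuple[int, str]:
    """Parse a click from a JSON request body and return (status, JSON response)."""
    try:
        payload = json.loads(body)
        x = int(payload["x"])
        y = int(payload["y"])
    except (ValueError, KeyError, TypeError):
        return 400, json.dumps({"error": "expected JSON with numeric x and y"})
    if x < 0 or y < 0:
        return 400, json.dumps({"error": "coordinates must not be negative"})
    # `segment` stands in for the model call that maps a point to a polygon.
    polygon = segment(x, y)
    return 200, json.dumps({"polygon": [list(p) for p in polygon]})


def fake_segment(x: int, y: int) -> Polygon:
    """Stand-in for the model: returns a small box around the clicked point."""
    return [(x - 1, y - 1), (x + 1, y - 1), (x + 1, y + 1), (x - 1, y + 1)]


status, response = handle_click('{"x": 10, "y": 20}', fake_segment)
print(status, response)
```

Keeping the model behind a plain function argument also makes the handler easy to test without loading any weights.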

Using this project

Usage instructions for this project are available on GitHub. If you require any assistance in getting the code to work, let me know! I would love for this to be a command line tool you could install. I may work on this in the future. I would also love a hosted website that runs a Segment Anything model in my browser. I don’t plan to work on that, but I would definitely use such software!