What is this?

Auto Alt Text is a Chrome extension that generates descriptive captions for images.

Currently, users who are visually impaired must rely on metadata and alt-text descriptions put in by website developers in order to understand what an image actually contains. However, not all web developers take the time to caption all their images. This is where Auto Alt Text steps in.

Using artificial intelligence, the extension can analyze an image and detect the contents of the scene depicted in it within 5 seconds!

How does it work?

It's pretty simple to get up and running:

  1. Download the extension
  2. Right-click on any image element (note: this does not work with background images at the moment)
  3. Click "Get Image Info" from the dropdown
  4. Wait a few seconds and get your caption! (Note: the caption is also spoken aloud, so adjust your volume accordingly.)

[Image: right-click menu with Auto Alt Text]

What is the tech behind it?

Auto Alt Text is based on the im2txt model, which was created by Vinyals et al. for the 2015 MSCOCO Image Captioning Challenge.

The model itself is an encoder-decoder neural network (basically a deep conv net paired with an LSTM). The deep conv net, Inception v3 (a popular image recognition model), first encodes an image into a vector representation. The LSTM then decodes that vector into a caption, one word at a time.
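To make the encode-then-decode flow concrete, here is a minimal toy sketch. It is not the im2txt code: `encode_image`, `decoder_step`, and `generate_caption` are hypothetical stand-ins (a two-number "encoding" instead of Inception v3, and a tiny lookup table instead of an LSTM), and it uses simple greedy decoding, which may differ from how the real model picks words.

```python
from typing import Dict, List, Sequence, Tuple

START, END = "<S>", "</S>"

# Toy next-word scores for every word except "a" (which depends on the image).
FIXED_SCORES: Dict[str, Dict[str, float]] = {
    START: {"a": 0.9, END: 0.1},
    "bright": {"scene": 0.8, END: 0.2},
    "dark": {"scene": 0.8, END: 0.2},
    "scene": {END: 1.0},
}


def encode_image(pixels: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for the Inception v3 encoder: image -> vector."""
    mean = sum(pixels) / len(pixels)
    return [mean, max(pixels) - min(pixels)]


def decoder_step(state: List[float], prev_word: str) -> Tuple[List[float], Dict[str, float]]:
    """Hypothetical stand-in for one LSTM step: returns (state, next-word scores)."""
    if prev_word == "a":
        # The image encoding influences which word comes next.
        brightness = state[0]
        return state, {"bright": brightness, "dark": 1.0 - brightness}
    return state, FIXED_SCORES[prev_word]


def generate_caption(pixels: Sequence[float], max_len: int = 20) -> str:
    """Greedy decoding: encode once, then emit the best-scoring word each step."""
    state = encode_image(pixels)
    word, words = START, []
    for _ in range(max_len):
        state, scores = decoder_step(state, word)
        word = max(scores, key=scores.get)
        if word == END:
            break
        words.append(word)
    return " ".join(words)


print(generate_caption([0.9, 0.8, 1.0]))  # a bright scene
print(generate_caption([0.1, 0.2, 0.0]))  # a dark scene
```

The point of the sketch is the shape of the loop: the image is encoded once, and the decoder is called repeatedly, feeding each chosen word back in until it emits the end token.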

I converted the model into an API and pared it down so that it could fit on a Lambda instance and stay loaded in memory. That keeps responses under 5 seconds (compared with the more than 15 seconds the model needs to caption an image out of the box).
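The "stay loaded in memory" idea can be sketched roughly like this. This is an illustration of the general pattern, not the actual API code: `load_model`, `get_model`, `handler`, the `image_url` event field, and the `LOAD_COUNT` counter are all hypothetical names, and the handler signature only assumes a Lambda-style `(event, context)` entry point.

```python
from typing import Any, Callable, Dict, Optional

# Module-level cache: survives between requests for as long as the process stays warm.
_MODEL: Optional[Callable[[str], str]] = None
LOAD_COUNT = 0  # only here to show how often the expensive load runs


def load_model() -> Callable[[str], str]:
    """Hypothetical expensive loader; a real one would read model weights from disk."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    return lambda image_url: f"caption for {image_url}"


def get_model() -> Callable[[str], str]:
    """Load the model on first use, then reuse the in-memory copy."""
    global _MODEL
    if _MODEL is None:
        _MODEL = load_model()
    return _MODEL


def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    """Lambda-style entry point (assumed shape): caption the image named in the event."""
    model = get_model()
    return {"caption": model(event["image_url"])}


handler({"image_url": "cat.png"})
handler({"image_url": "dog.png"})
print(LOAD_COUNT)  # 1: the second request reused the loaded model
```

Only the first request pays the loading cost; later requests served by the same warm process skip straight to captioning.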

If you want to learn more about the model itself, you can read the paper here.

[Image: neural network icon]

Who made this/How do I help?

My name is Abhinav Suri and I am a junior at the University of Pennsylvania. I love CS + Biology and am always looking for ways to benefit the community around me through programming.

One of the causes I am involved in is Hack4Impact. We're a student-led 501(c)(3) organization that works with nonprofits and other socially responsible organizations to build apps that serve the community. We've worked on apps to combat wage theft, help foster youth find resources around them, and much more. If you'd like to work with us (or know any nonprofits that have app ideas), shoot me an email at [email protected] or donate to us.

All donation proceeds go towards Hack4Impact

Also, if you want to help with the code, I have open-sourced everything related to the API on GitHub. Shoot me a PR and I'll take a look :)