PNG2SRT (tool to OCR image subtitles)

Download on Github

This is a tool that can perform OCR (optical character recognition) on XML/PNG subtitles and output the result as an SRT file. This can be used for subtitles obtained from DVD and Blu-ray. The Google Cloud Vision API is used for the OCR, and it has very good accuracy. This program is based on a python script originally posted by zx573 on the kanji koohii forums.

Before using this program, you may need to get your subtitles into the XML/PNG format. For DVD or Blu-ray, I’m not going to write a detailed guide on ripping subtitles from the disc, as there are plenty of other guides out there on the internet. It is assumed that you can figure out how to obtain your subtitles as SUB/IDX or SUP format. From there, I recommend using a Windows program called Subtitle Edit to convert them into XML/PNG format. There may be other software that can do this, but Subtitle Edit is the one I am most familiar with.

Using Subtitle Edit to convert DVD or Blu-ray subs to XML/PNG

The File menu in Subtitle Edit has several options for importing subtitles in SUB/IDX or SUP format. Choose the appropriate one, and you will come to an import screen. From here, right-click on one of the subtitle lines, then select Export > BDN xml/png.

On the next screen that comes up, select “export all lines”, and choose a folder to save to.

Now you should have a folder containing a bunch of PNG images and an XML file. The next step is to create an API key on the Google Cloud Platform.
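Before moving on, it can help to know what the exported XML file contains: it pairs each PNG image with a start and end timecode. The sketch below reads such a file with Python's standard library. The element and attribute names (Event, InTC, OutTC, Graphic) and the sample content are assumptions based on a typical BDN XML export, not code taken from PNG2SRT, so check them against your own file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of a BDN XML export; a real file has one Event per image.
SAMPLE_BDN = """<BDN Version="0.93">
  <Events>
    <Event InTC="00:00:01:12" OutTC="00:00:03:00" Forced="False">
      <Graphic Width="600" Height="80" X="660" Y="940">0001.png</Graphic>
    </Event>
    <Event InTC="00:00:04:00" OutTC="00:00:06:06" Forced="False">
      <Graphic Width="720" Height="80" X="600" Y="940">0002.png</Graphic>
    </Event>
  </Events>
</BDN>
"""


def read_events(xml_text):
    """Return a list of (image_name, start_timecode, end_timecode) tuples."""
    root = ET.fromstring(xml_text)
    events = []
    for event in root.iter("Event"):
        graphic = event.find("Graphic")
        if graphic is None or not graphic.text:
            continue  # skip events that do not reference an image
        events.append((graphic.text.strip(), event.get("InTC"), event.get("OutTC")))
    return events


for name, start, end in read_events(SAMPLE_BDN):
    print(name, start, end)
```

If your export uses different tag names, only the strings passed to iter, find, and get need to change.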

Create an API Key for Google Cloud Vision API

Google’s OCR is by far the most accurate I have seen. It is also free for a limited amount of use each month. According to their current pricing structure, you can OCR up to 1,000 items per month for free. My program can batch several PNG images into a single item, so you should be able to do several episodes or movies in a single month without having to pay anything. Google also offers a generous trial (at least at the time of writing): you can get $300 of free credit when you sign up, and you have no obligation to pay anything or continue using the service.

If you sign up for the Google Cloud Platform, then after logging in, you need to enable the Cloud Vision API and generate an API key.

  1. In the left hand menu, select APIs & Services > Dashboard
  2. Select Enable APIs & Services
  3. In the search box, type “vision”, and then select Google Cloud Vision API.
  4. Select Enable. It may walk you through setting up a billing profile at this point if one has not been created already. Again, there is no obligation to actually pay anything, as you can use this API a certain amount for free each month, and you may get free credits when signing up.
  5. Back at the APIs & Services Dashboard, select Credentials > Create Credentials > API Key.
  6. Once you have generated the API key, be sure to copy it or keep it open in your browser so you can access it later.

Use PNG2SRT to OCR the images

Now, we can use PNG2SRT to send the subtitle images through the Cloud Vision API.
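For the curious, a request to the Cloud Vision API is a JSON document containing base64-encoded image data. The sketch below only builds such a request body and sends nothing over the network. It is not the code PNG2SRT uses; the helper name build_request_body and the placeholder image bytes are made up for illustration.

```python
import base64
import json

# Public REST endpoint for Cloud Vision image annotation.
VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_request_body(image_bytes, language="ja"):
    """Build the JSON body for one text-detection request (hypothetical helper)."""
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "TEXT_DETECTION"}],
                "imageContext": {"languageHints": [language]},
            }
        ]
    }


# Placeholder bytes stand in for the contents of a real PNG file.
body = build_request_body(b"\x89PNG placeholder bytes", language="ja")
print(json.dumps(body, indent=2))
# The body would be POSTed to VISION_URL with the API key in the "key" query parameter.
```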


Version 1.0.1 – May 12, 2018

Download on Github

Download the appropriate version for your computer, and then extract the archive.

Next, you need to paste your API Key into a text file named API_KEY.txt located in the same folder as the application (the file should contain ONLY your API key, and no other text).
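The reason the file must contain only the key is that a program reading it will typically use the whole contents as the key. A minimal sketch of such a reader is below, assuming the file is plain text. The function load_api_key and the example key are hypothetical and not taken from PNG2SRT.

```python
import tempfile
from pathlib import Path


def load_api_key(path="API_KEY.txt"):
    """Read the key file, tolerating a byte-order mark and a trailing newline."""
    key = Path(path).read_text(encoding="utf-8-sig").strip()
    # Anything besides a single unbroken token suggests extra text in the file.
    if not key or any(ch.isspace() for ch in key):
        raise ValueError("API_KEY.txt should contain only the API key")
    return key


# Demonstration with a temporary file and a made-up key.
with tempfile.TemporaryDirectory() as folder:
    demo_file = Path(folder) / "API_KEY.txt"
    demo_file.write_text("  EXAMPLE-KEY-123\n", encoding="utf-8")
    print(load_api_key(demo_file))
```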

When you run the application, a window opens showing your API key at the top, followed by options for the input folder, language, and chunk size.

First, you need to make sure that your API Key is displayed correctly in the top area. If not, make sure you did the previous step correctly.

Then, you just select a folder containing XML/PNG files, which is what will be converted to SRT.

Note: You may get an error if the folder name contains non-ASCII (Unicode) characters. In that case, please rename the folder using only English (ASCII) characters.

There is also an option to select the language that you want Google to recognize. It defaults to Japanese, because that is what I use, but you can select whichever language you need. You can find a full list of language codes here.

The only other option is the chunk size. The default of 15 is usually fine. If you press the start button, and the program appears to begin working but then gives you an error message part way through, you might need to decrease the chunk size to a smaller value like 10 or even 5.
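To give a feel for what the chunk size changes, here is a small sketch that assumes the chunk size is the number of subtitle images grouped into one request. The function name and the image list are hypothetical; this is an illustration of batching in general, not the tool's actual code.

```python
def split_into_chunks(items, chunk_size=15):
    """Split a list into consecutive groups of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


# 960 made-up image names, standing in for one movie's subtitle images.
images = [f"{n:04d}.png" for n in range(1, 961)]

print(len(split_into_chunks(images, 15)))  # 64 requests
print(len(split_into_chunks(images, 10)))  # 96 requests
```

A smaller chunk size means more requests, each carrying fewer images.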

After you press start, if all goes well, the program should run and it will output an SRT file inside your input folder.
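For reference, an SRT file is plain text: each entry has a running number, a line with start and end times in the form hours:minutes:seconds,milliseconds, and then the subtitle text. The sketch below formats one entry. The function names and sample values are illustrative and not taken from PNG2SRT.

```python
def srt_timestamp(total_ms):
    """Format a time in milliseconds as HH:MM:SS,mmm."""
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def srt_entry(index, start_ms, end_ms, text):
    """Return one SRT entry; entries in a file are separated by a blank line."""
    return f"{index}\n{srt_timestamp(start_ms)} --> {srt_timestamp(end_ms)}\n{text}\n"


print(srt_entry(1, 1500, 3000, "Hello"))
```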

64 thoughts to “PNG2SRT (tool to OCR image subtitles)”

  1. Hey, Alan. I followed your steps, but only got a subtitle with all lines with “TEXT HERE”. Any idea what I did wrong?

    1. Not sure, but it may be an issue with the format of your images. If you could upload a few of them somewhere with the xml file I could take a look.

    1. According to the error, it’s saying it didn’t locate an XML file in the folder that you selected. Your input needs to contain both png images and an xml file.

      1. Thanks, I did not notice that Subtitle Edit generated an XML file before moving the PNG files into a specific directory.
        It says now that Cloud Vision API has not been enabled, I’ll have to sort that out. The Google interface is far from user friendly!

          1. Looks like it’s a problem with the format of the images you are trying to use. Mind uploading your original subtitle file somewhere, plus a few of the png files you are trying to use? I’ll try to check it out and see if I can figure out what’s wrong.

          2. Hi Alan,

            I am so sorry I didn’t tell you the details of my case.

            I use VideoSubFinder to extract hard subtitles from videos, not to extract subtitles from DVDs.

            I have some Chinese subtitle pictures with a background (no transparency). I want to use your tool to get an OCR subtitle file.

            (I have already converted all of them to grayscale.)

            Can you edit your script to handle this, so that the program works not only with transparent pictures but also with pictures that have a background?

          3. I think the main problem here is that your images are all JPEG files, and this software is only designed to work with PNG images. If you can convert your images to PNG, it might work.

  2. I’m a beginner Linux user and I’d like to know how to download and use this program. I have a Mac as well, but when I downloaded the Mac file I wasn’t sure how to install/there wasn’t the primer file that does all the setup. Was wondering if you could point me to a resource that walks through how to install the program here. Thanks

    1. For Linux, you can download and run the source code; it’s just a single Python file. You will need to make sure Python 3 is on your system (most Linux distributions include Python, but some of them only include Python 2).
      You also need 3 Python packages: Requests, Pillow, Gooey. Some googling should show you how to install them.

  3. Question… I’m trying to sign up for Google Cloud Platform and it’s asking me for my credit card info without stating the cost. The last thing I want is to enter my credit card info and end up spending thousands just to convert PNG into Chinese subtitles for Love 020 so I can watch it on my phone on my way to work.

  4. Hello, I tried to convert .sup subtitles to .srt subtitles. It worked with English subtitles, but when I tried with Japanese subtitles it didn’t work; I got Unicode characters.

    1. Seems to be working, you just need to type “fa” as the language code to use. The accuracy of the OCR doesn’t appear to be very good though, with it leaving several lines blank.

  5. Is it possible for the OCR to recognize symbols? The music symbol is in many of my subtitles and it is wrongly OCR’d as a bunch of random letters.
    Very useful program btw – thanks.

  6. Thank you very much for this post and the tools you’ve created!! I work with Arabic and this post saved me! Appreciate the hard work!

  7. I have two copies of the same video — one with hardcoded subtitles and the other with no subtitles. How do I get the hardcoded subtitles into the XML/PNG format? Thanks.

  8. Hi, Alan.

    I’m wondering, instead of converting to srt files, if there’s a way to create sub/idx files from the images, xml, and idx extracted from Netflix with the script you suggested? I googled about creating sub files, but I only found how to convert sub to srt or sub to images etc., not images to sub.

    Any ideas?

  9. Hi, Alan.
    First, I have to thank you for this tool, because it’s exactly what I was searching for to convert Japanese vobsub files from DVDs.
    Anyway, I’m using it on a Mac (using the Windows version of Subtitle Edit through Wine) and I have a problem with movies that have more than 615 lines of subtitles. Under 615 lines everything is ok, and I can obtain my .srt file, and it’s perfect.
    But if it’s longer than 615 lines your script gives me this kind of error:

    Generating request (41/64)…
    Requesting OCR text…
    [5975] Failed to execute script PNG2SRT
    Traceback (most recent call last):
    File “”, line 192, in
    File “gooey/python_bindings/”, line 78, in
    File “”, line 189, in main
    File “”, line 141, in PNG2SRT
    File “”, line 117, in ocr_text
    KeyError: ‘responses’

    If the chunk size is set at 15 it always stops at 41, and if the chunk is set to 10 it stops at 61, as you can see here:

    Requesting OCR text…
    Generating request (61/96)…
    Requesting OCR text…
    [6369] Failed to execute script PNG2SRT
    Traceback (most recent call last):
    File “”, line 192, in
    File “gooey/python_bindings/”, line 78, in
    File “”, line 189, in main
    File “”, line 141, in PNG2SRT
    File “”, line 117, in ocr_text
    KeyError: ‘responses’

    The same happens if the chunk size is set to 5 (it stops at chunk number 123).
    And it happens with different files and movies.

    I don’t know anything about Python, but maybe the fact that it always stops after line 615 means something to you? Something regarding memory, maybe? Something related only to the Mac?

    I hope you can help!

    1. That’s interesting, I have not encountered this error before.
      Can you post up a link to your sub/idx for me to download and test?

        1. Thanks, I got it.
          I’ve only tested the DVD functionality on one or two discs, so if it’s a problem of length, mine might not have been long enough. I’ll see what I can find out.

          1. Ok, thanks. If you need more files from DVDs or Bluray write me an email (at the address I’m using to write this comment) and I’ll send you some more. I hope it’s not a server-side problem…

      1. Hi, Alan.
        I just noticed that it seems exactly the same problem described on your Github page, here:

        In that page Coasterfr also wrote that “It goes thru 40 batches of images before giving this error.”
        Could there be a limit set by the Google servers when there are more than 600 requests?

        1. After some investigation I discovered that the problem is indeed caused by this program exceeding 600 requests per minute, which is a limit set by Google. To solve this, I have tried to build in a simple delay to slow it down a bit.

          I will send you an email with a new build that you can test. Unfortunately I can not test it myself, as I no longer have any available quota for this month, and I don’t want to have to pay money for further testing.

          1. It works, Alan! It works!

            Generating request (69/69)…
            Requesting OCR text…
            Generating SRT file…

            I’ve tested it on two files (one from DVD and one from BD, 960 and 1,035 lines), and everything went well.
            So, thank you very much for your fast solution!

  10. Hi, Thanks for this great post about subtitles extracting.
    I just tried following the instructions but getting same errors as EDDIE’s.
    The language is Korean.
    KeyError: ‘responses’
    starting netflix
    Found input file C:\netflix subs\S01E01.WEBRip.Netflix\manifest_ttml2.xml
    Generating request (1/74)…
    Requesting OCR text…
    Traceback (most recent call last):
    File “”, line 190, in
    File “site-packages\gooey\python_bindings\”, line 78, in
    File “”, line 187, in main
    File “”, line 139, in PNG2SRT
    File “”, line 115, in ocr_text
    KeyError: ‘responses’

    1. Signed up the api thing with a new account and it finally worked!
      Probably I made a mistake getting an api key before.

      Thank you so much for the post and png2srt! It really helped me a lot.

  11. Hello, thank you for your tool.
    I’m trying to extract Netflix subs with your tool. Even though I followed your instructions exactly, this error keeps coming up. I double-checked the API key and reduced the chunk size to 10 and 5, but it still didn’t work. Is there any solution for this T.T

    starting netflix
    Found input file D:\Drama\Prison.Playbook.S01E01\manifest_ttml2.xml
    Generating request (1/127)…
    Requesting OCR text…
    Traceback (most recent call last):
    File “”, line 190, in
    File “site-packages\gooey\python_bindings\”, line 78, in
    File “”, line 187, in main
    File “”, line 139, in PNG2SRT
    File “”, line 115, in ocr_text
    KeyError: ‘responses’

    1. Have you verified that the api key displayed in the top of the window matches your api key exactly? My best guess for this error would be an issue with the api key.

        1. I’m not sure what could be causing the problem then. If I release a new version, I’ll add some additional error checking to help narrow down the cause of problems like this.

  12. Hi. I’m planning to do a like quest for me to jump into reading a cooking book in japanese.
    Do you think I could put this tool to work with pages (recipes)? Eventually I would scan the whole book, but I was planning to do one recipe at a time.

    1. If you upload an image to Google Drive, then open it in Google Docs, Google will do the OCR on it.

      It’s very difficult to enhance my tool to make it useful for books, because the most difficult part is the formatting. Something as simple as furigana over the text or having text in multiple columns can put the text in an order that is very difficult to read without constantly getting lost.

      So this trick with google drive is probably your best option for now.

  13. I wonder if it’s possible to convert manga images in a similar fashion? For my case the output doesn’t have to be properly readable. I’m really just interested in creating a list of the most common words so I can know where to focus my learning.

    Maybe there’s already a tool out there to achieve this?

    1. For that case, I would recommend just using anime scripts if there is an anime version of the manga you are looking at. The dialog is often fairly similar and will have a lot of the same words.

      This tool probably wouldn’t be very useful for such a case, but the code might be a decent starting point to create something from, if someone were interested in doing so.

      You might also try one of the following and see if either of them work for your needs:

      1. Anime scripts/subtitles are what I’ve been using up to now. There are a lot of manga that I want to read (or am reading) that don’t have this option though, so I thought it could be useful to try extract the text.
        That being said I’m kind of at the point where I’m probably wasting my time thinking too much ABOUT studying, rather than just studying…

      2. As for the linked tools, they don’t seem very suited to say, extracting the text from a whole manga series and outputting it to a text file. That and, as far as I can tell, the Google Cloud Vision API is the best OCR engine out there at the moment.

  14. Bravo!!! thanks for all the hard work.
    Will try this new tool. But already very happy with the “old” one as well.
    Netflix has become my well of (Japanese) wisdom thanks to these nifty tools.

    1. Glad you are finding it useful. If you are just using it for Netflix content, you won’t really find any improvements though. This new one basically just adds support for DVD and Blu-ray, so I decided it was time to change the name as well.
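Several comments above report KeyError: ‘responses’, which the replies trace either to a problem with the API key or to exceeding Google’s limit of 600 requests per minute. The sketch below is not the code from PNG2SRT. It shows how a script could turn a reply without a ‘responses’ field into a readable message, and how long it could wait between requests. The function names are hypothetical, the error layout follows the usual Google API error format, and the pacing assumes the limit counts individual images, which is only a guess based on the failures at about 615 lines regardless of chunk size.

```python
MAX_PER_MINUTE = 600  # limit mentioned in the replies above


def extract_responses(payload):
    """Return the list of OCR responses, or raise a readable error."""
    if "responses" in payload:
        return payload["responses"]
    # Assumed error layout: {"error": {"code": ..., "message": ...}}
    error = payload.get("error", {})
    message = error.get("message", "the reply contained no 'responses' field")
    raise RuntimeError(f"OCR request failed: {message}")


def seconds_between_requests(images_per_request, per_minute=MAX_PER_MINUTE):
    """Pause needed after each request so the image rate stays at the limit."""
    return 60.0 * images_per_request / per_minute


# A made-up error reply, similar to what an invalid key might produce.
try:
    extract_responses({"error": {"code": 400, "message": "API key not valid."}})
except RuntimeError as exc:
    print(exc)

print(seconds_between_requests(15))
```

A script would pass the result of seconds_between_requests to time.sleep after each request.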
