Open-source OCR for Scanned PDFs: A Comprehensive Guide

  • Published 2 Dec 2024

COMMENTS • 3

  • @jdaniele 3 months ago

    Hi, very interesting video, thanks for sharing it.
    By the way, I'm looking for something that can preserve the formatting and structure of the page: keeping all graphical elements such as images and lines, placing the extracted text in the same position as in the original, and also keeping the font types and styles.
    Do you think these features can be implemented in Python?
    The goal is to create a PDF that is an exact copy of the scanned one, with selectable and searchable text.
    Thanks.

  • @aarongolden106 6 months ago

    Am I the only one who doesn't understand Step 2? Why does your interface look different from ours? Why do you have another window for Step 2? Aren't we using the same command window for the !pip and import commands? If we can't replicate the steps, then it isn't a step-by-step tutorial.

    • @AmnahsLab 5 months ago

      Hello, thank you for your question. Step 2 requires you to upload your own PDF to the notebook so that the subsequent steps can run. In my case, I used a PDF I made that contains the Medium article this video is based on. You can upload any PDF you'd like for this step. I hope that clears things up.
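
For anyone stuck on the same step, the upload could look roughly like the sketch below. It assumes the notebook runs in Google Colab, which the thread does not confirm; `pick_pdf` and `upload_pdf` are hypothetical helper names, not something taken from the video.

```python
from pathlib import Path


def pick_pdf(uploaded: dict) -> str:
    """Return the name of the first uploaded file whose extension is .pdf."""
    for name in uploaded:
        if Path(name).suffix.lower() == ".pdf":
            return name
    raise ValueError("No PDF found among the uploaded files")


def upload_pdf() -> str:
    """Open the Colab file picker and return the uploaded PDF's filename.

    Assumption: this runs inside Google Colab, where google.colab.files
    exists. In a plain Jupyter notebook, place the PDF next to the
    notebook and use its path directly instead.
    """
    from google.colab import files  # only importable inside Colab

    uploaded = files.upload()  # dict mapping filename -> file contents (bytes)
    return pick_pdf(uploaded)
```

In a Colab cell, `pdf_path = upload_pdf()` would open the file picker in the cell output, which may be why the step looks like a separate window. The later OCR steps can then read the file from `pdf_path`.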