PDF to HTML converter

For some while, I’ve been suggesting making back issues of Anglo-Celtic Roots, BIFHSGO’s quarterly chronicle, full-text searchable as a corpus. You can search individual issues posted on the BIFHSGO website as PDFs, but not the whole collection at once. If you remember something mentioned in an article but can’t recall the issue, it’s a hard slog finding it.

When I mentioned this in reviewing the new BIFHSGO website, I received a pointed email from a Director about the need for volunteers.

I thought I might have a workaround to make the search possible. Paper to HTML Converter is “an experimental prototype that aims to render scientific papers in HTML so they can be more easily read by screen readers or on mobile devices. Because of our reliance on statistical machine learning techniques, some errors are inevitable.”

To try it, I downloaded an issue to my computer desktop and submitted it. A minute or so later I got the message “The page isn’t working.” Trying a second, earlier issue broke the system.

That isn’t surprising: the system was designed for individual scientific papers, not whole issues. It does show this is an area of active research.

Suggestions for other (affordable) approaches that would enhance the functionality of the back issues of ACR would be welcome via a comment.
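For anyone inclined to experiment, here is a minimal sketch of what searching across the whole collection could look like once the text of each issue has been extracted from its PDF by some other tool. It is only an illustration under stated assumptions, not something BIFHSGO uses: the issue labels, the sample text, and the function names `build_index` and `search` are all hypothetical.

```python
import re
from collections import defaultdict


def build_index(issues):
    """Map each lowercase word to the set of issue labels that contain it."""
    index = defaultdict(set)
    for label, text in issues.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(label)
    return index


def search(index, query):
    """Return the sorted issue labels that contain every word in the query."""
    words = re.findall(r"[a-z0-9']+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


# Hypothetical sample text standing in for the extracted contents of two issues.
issues = {
    "Vol 20 No 1": "An article about sharpshooters in the militia.",
    "Vol 21 No 3": "Home children and parish registers.",
}
index = build_index(issues)
print(search(index, "sharpshooters"))  # ['Vol 20 No 1']
```

The search returns only the issues that contain every word typed, which is enough to tell you which PDF to open. It does no ranking and ignores phrases and spelling variants.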

7 Replies to “PDF to HTML converter”

  1. A simple workaround would be to create your own private Google Drive (free) and upload them all to a folder. Then you can search whenever you want, and the search feature searches within all the issues.

  2. I think there are two options:
    (1) There are ways to process PDF files to make them more searchable (some of our PDF files on the website are like that) and we’re experimenting with that option.
    (2) We can prepare an index of all the issues – starting presumably from the most recent – but that is where volunteers are required.
    Sheila Dohoo Faure, web manager

  3. Several societies that I belong to have the frustrating problem of trying to search back issues. It would be wonderful if there is an approach that would enhance the functionality.

  4. The Ottawa Branch OGS catalog includes a short description of each ACR issue in the catalog entries for Vol 1 to 25 (up to the end of 2019). The index is prepared by library volunteers. Search the catalog at
    https://ottawa.ogs.on.ca/researching/indexes/library-catalog/. Put Anglo Celtic Roots in the search box, and the catalog entries will come up. Use the green plus sign on the left-hand side to access the description for that issue. Or search using Anglo Celtic Roots and a keyword for the subject you are looking for. For example, a search on Anglo Celtic Roots returns 93 catalog entries, while Anglo Celtic Roots sharpshooters returns 5 entries; then look at the descriptions for those. Once you know the volume and issue number you want, you can go back to the PDFs on the BIFHSGO website.
