Omniscribe v1.0.2

Main Authors: Jonathan Quach, Dawn Childress, Morgan Madjukie, Peter Broadwell
Format: info software eJournal
Language: English
Published: 2019
Online Access: https://zenodo.org/record/3007970
Contents:
Thanks for using our package! Omniscribe is a command-line interface for detecting annotations in IIIF-hosted printed books. Give the script a list of IIIF manifest URLs (either saved locally or hosted elsewhere) and it will generate an IIIF-compliant `resultsManifest.json` file that can be displayed in IIIF viewers such as Mirador. Other available outputs are an HTML gallery and a plain text file.

## Installing Omniscribe

Requires Python 3.6.x.

1. Download the Source Code package, unzip it, and save the `Omniscribe-1.0.2` folder to your local machine or server.
2. Download `model.h5` and save it to the `Omniscribe-1.0.2` folder.
3. Using the command line, navigate to the `Omniscribe-1.0.2` folder.
4. Install dependencies by running `pip install -r requirements.txt`.

NOTE: We recommend setting up a virtual environment before installing Omniscribe. For more information on setting up a virtual environment, please refer to https://packaging.python.org/guides/installing-using-pip-and-virtualenv/ up to the "Leaving the virtualenv" section of that documentation.

## Usage

Run `inferencer.py` with the manifest URLs in which you wish to detect annotations:

    python3 inferencer.py [export] [confidence] [manifest-url/path]

Export options:

- `--manifest` exports `resultsManifest.json`, an IIIF manifest listing the images with detected annotations.
- `--text` exports `resultsURIs.txt`, a text file containing the URLs of images with detected annotations.
- `--html` exports `resultsImages.html`, a simple HTML gallery of images with detected annotations.

If no export option is specified, the default export is `resultsManifest.json`.

Confidence option:

- `--confidence=VALUE` sets the detection threshold to any value between 0 and 1 (inclusive). For example, `--confidence=0.91` sets the threshold to 0.91, meaning any region that receives a score of 0.91 or higher from our model will be inferred to be an annotation. If no confidence value is specified, the default is 0.95.
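As an illustration of the option handling described above, here is a minimal Python sketch. It is not the actual `inferencer.py` code; the function name and structure are assumptions made for clarity, but the defaults (manifest export, confidence 0.95) match the documented behavior.

```python
# Illustrative sketch (not Omniscribe's actual internals) of how the CLI
# arguments map to export choices, a confidence threshold, and manifest inputs.
def parse_args(argv):
    exports = set()
    confidence = 0.95          # documented default when --confidence is omitted
    manifests = []
    for arg in argv:
        if arg in ("--manifest", "--text", "--html"):
            exports.add(arg.lstrip("-"))
        elif arg.startswith("--confidence="):
            confidence = float(arg.split("=", 1)[1])
            if not 0 <= confidence <= 1:
                raise ValueError("confidence must be between 0 and 1 (inclusive)")
        else:
            manifests.append(arg)  # anything else is a manifest URL or local path
    if not exports:
        exports = {"manifest"}     # documented default export: resultsManifest.json
    return exports, confidence, manifests

print(parse_args(["--html", "--confidence=0.9", "manifest1.json"]))
```

Running it with `["manifest1.json"]` alone returns the defaults: a manifest export at confidence 0.95.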
## Gauging a "Good" Confidence Value

We found that marginalia are often detected with confidence values of 0.90 and higher, but detecting interlinear annotations requires lower confidence values, somewhere between 0.70 and 0.85. This means that setting `--confidence=0.90` will detect marginalia but will be less effective at detecting interlinear annotations, since these often receive scores below the 0.90 threshold. Setting `--confidence=0.70` will detect both interlinear annotations and marginalia (as both types of annotation will receive scores at or above the threshold); however, the lower threshold will likely also produce more false positives.

## Operating on Multiple Manifests

The manifests can be hosted or local IIIF manifest files. You can input multiple manifest URLs or paths, and the application will crawl through all the images in each manifest, so the resulting export is a single aggregate of the sub-results from every manifest.
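The crawl-and-aggregate behavior can be sketched as follows. This is a simplified illustration, not Omniscribe's actual code, and it assumes the IIIF Presentation API 2.x manifest layout (`sequences` → `canvases` → `images` → `resource["@id"]`).

```python
# Illustrative sketch: combining image URLs from several IIIF 2.x manifests
# into one aggregate list, mirroring how results from multiple input
# manifests end up in a single export.
def image_urls(manifest):
    """Yield image resource URLs from a IIIF Presentation 2.x manifest dict."""
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                yield image["resource"]["@id"]

def crawl(manifests):
    """Aggregate image URLs from every manifest into a single list."""
    urls = []
    for manifest in manifests:
        urls.extend(image_urls(manifest))
    return urls

# Two toy manifests with one image each:
m1 = {"sequences": [{"canvases": [{"images": [{"resource": {"@id": "https://example.org/a.jpg"}}]}]}]}
m2 = {"sequences": [{"canvases": [{"images": [{"resource": {"@id": "https://example.org/b.jpg"}}]}]}]}
print(crawl([m1, m2]))  # both images appear in one combined list
```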
Example manifest inputs:

    https://marinus.library.ucla.edu/iiif/annotated/uclaclark_SB322S53-short.json
    path/to/a/localManifestFile.json

## Example Command Lines

    python3 inferencer.py --manifest --confidence=0.93 manifest1.json
    python3 inferencer.py --html --confidence=0.90 manifest1.json
    python3 inferencer.py --text --confidence=0.94 manifest1.json
    python3 inferencer.py --manifest --html --confidence=0.92 manifest1.json
    python3 inferencer.py --text --manifest --confidence=0.97 manifest1.json
    python3 inferencer.py --html --text --confidence=0.93 manifest1.json
    python3 inferencer.py --html --manifest --text --confidence=0.91 manifest1.json
    python3 inferencer.py --confidence=0.95 manifest1.json
    python3 inferencer.py --text manifest1.json
    python3 inferencer.py manifest1.json
    python3 inferencer.py --manifest --text --html --confidence=0.96 manifest1.json manifest2.json
    python3 inferencer.py --manifest --text manifest1.json manifest2.json manifest3.json manifest4.json

Note that omitting the confidence option is interpreted as setting the confidence score to 0.95, and omitting all export options is interpreted as exporting a manifest file.

## Collecting the Results

After `inferencer.py` has finished processing all the images, you will see the message `Finished detecting annotations`. All export files are saved in the `Omniscribe-1.0.2` folder.

## Sample Images

The command line typically displays the following as it processes the images. <img src="https://imgur.com/56RFHD3.png">

TensorFlow may automatically use any available GPUs for prediction, as shown below. <img src="https://i.imgur.com/kDwYaNP.png">

HTML gallery: <img src="https://imgur.com/mXWEuCF.png">

Displaying `resultsManifest.json` in Mirador, an image-viewing client that supports IIIF: <img src="https://imgur.com/qrPaNJrl.png">