MOA FAQ

Can I download a whole book?

Yes. The uncorrected plain text of each book is available for download. Locate the volume you are interested in, view its bibliographic citation, and click on the "note on viewing the plain text of this volume." After you have read the note, you will see a download link. Clicking it opens a dialog box that gives you the option of saving the file. Save the file to your computer; you can then open it in your browser or a word processor. Please be aware that some of these texts are as long as 1,000 pages and will take a long time to download, particularly over a modem. Such a large download may also crash your Web browser. Also note that the plain text of an entire volume is not broken up or formatted by page -- it is one big block of text.

Can I have access to the text instead of just the page images?

Yes. Go to the desired page and choose "view as text" from the "view as" menu in the toolbar at the top. Pages will continue to display as text as you page through the volume until you choose another "view as" option (such as 75% or thumbnail).

If you'd like to know more about the plain text, you might want to read our report on the accuracy of the OCR.

I keep losing my results list. How can I get back to it?

In most recent Web browsers, you should be able to return to your search results by using your browser's Back button. If you cannot do this, you might want to open the results in a new browser window that you can refer back to. On a Mac, click and hold down on the page, then choose the "new window with this frame" item from the little menu that pops up. On Windows, click on the page with the right mouse button and select the "open frame in a new window" option.

How do you convert the TIFFs to GIFs?

At the University of Michigan, the MOA project stores only TIFF G4 (bitonal) 600dpi images online. When you select a page, the TIFF is converted at that moment to the desired level of GIF. The software that performs this operation is called tif2gif and was written by Doug Orr. Binaries and source code are available online.
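The idea of deriving a smaller view from a bitonal scan on request can be pictured with a short sketch. This is not the tif2gif source, and the method shown (averaging square blocks of black-and-white pixels into gray levels) is an assumption chosen for illustration. The function name `downsample_bitonal` and its `factor` parameter are hypothetical.

```python
def downsample_bitonal(pixels, factor):
    """Shrink a bitonal image by averaging factor x factor blocks.

    pixels: list of equal-length rows; each pixel is 0 (white) or 1 (black).
    Returns rows of gray values, 255 for white down to 0 for solid black.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    out = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            # Collect the block, clipping at the right and bottom edges.
            block = [
                pixels[y][x]
                for y in range(top, min(top + factor, height))
                for x in range(left, min(left + factor, width))
            ]
            ink = sum(block) / len(block)  # fraction of black pixels
            row.append(round(255 * (1 - ink)))
        out.append(row)
    return out
```

Under this assumption, a 600 dpi page reduced with `factor=6` would yield a 100 dpi grayscale view, in which partly inked blocks show up as intermediate grays rather than pure black or white.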

I searched a word I knew was on one of the pages, but I got "No hits". Why did this happen?

The UM MOA page images are used to generate OCR for searching the collection. OCR is still an imperfect process. Most words are captured accurately, but approximately 1% of the characters come out wrong. If a character in your word was misread on that page, a search for the correctly spelled word will not find that occurrence. Since words are frequently repeated, and since some search capability is better than none, we hope this provides a useful avenue into the collection.
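A rough calculation shows what a 1% character error rate means for searching. The sketch below assumes that errors strike each character independently, which real OCR does not strictly obey, and the function names are made up for this illustration.

```python
def word_accuracy(char_error_rate, word_length):
    """Chance that every character of a word was recognized correctly."""
    return (1.0 - char_error_rate) ** word_length


def chance_of_hit(char_error_rate, word_length, occurrences):
    """Chance that at least one occurrence of the word is fully correct."""
    miss = 1.0 - word_accuracy(char_error_rate, word_length)
    return 1.0 - miss ** occurrences
```

Under these assumptions, `word_accuracy(0.01, 8)` comes to roughly 92%, so about one in thirteen occurrences of an eight-letter word would be unsearchable. If the word appears three times in a volume, `chance_of_hit(0.01, 8, 3)` is above 99.9%, which is why repetition makes the uncorrected text useful for finding a volume even when a particular page is missed.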

Why are some of the pages I saw skewed on the page?

Materials are currently undergoing a "Quality Control" process. Pages with an unacceptable degree of skew will be reprocessed.

Why is some of the bibliographic information incorrect?

Some errors are cataloging errors; others were introduced when the bibliographic records were incorporated into MOA. Materials are currently undergoing a "Quality Control" process. We will correct these problems over the coming months.

How do I get a complete list of titles in the UM MOA system?

A browseable list, organized by author, is provided on the MOA "Browse" page.

Why are some of the texts in cleaned-up, proofread HTML while others are images of the original pages?

The UM MOA resources have been encoded in a simple SGML form (a 40-element DTD conforming to the TEI Guidelines); consequently, we are able to seamlessly integrate both automatically processed (i.e., "raw") texts and texts whose OCR and encoding have been carefully evaluated (i.e., "cooked" texts). Users who encounter a "cooked" text will find attractively rendered HTML with links to page images, while "raw" texts are presented as page images until resources can be found to supplement them with corrected and encoded text.

Migration from "raw" to "cooked" takes place gradually, based on the availability of resources and specific demands. The Humanities Text Initiative, a part of the Digital Library Production Services at the UM, undertakes the process of proofing OCR and refining markup.
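The "raw"/"cooked" distinction can be sketched as a page record that always carries an image and only sometimes carries corrected text. Everything below (the `Page` class, its fields, the `render` function, and the file names) is hypothetical and is not MOA's actual data model or software; it only illustrates how a single delivery path could serve both kinds of text.

```python
from dataclasses import dataclass
from html import escape
from typing import Optional


@dataclass
class Page:
    image_file: str                       # page image, always present
    corrected_text: Optional[str] = None  # filled in once the page is proofread


def render(page: Page) -> str:
    """Return HTML for a page: text plus image link if cooked, image only if raw."""
    image = escape(page.image_file, quote=True)
    if page.corrected_text is not None:
        # "Cooked": show the proofread text and link back to the page image.
        return f'<p>{escape(page.corrected_text)}</p><a href="{image}">page image</a>'
    # "Raw": no vetted text yet, so show the page image itself.
    return f'<img src="{image}" alt="page image">'
```

In this sketch, moving a page from "raw" to "cooked" amounts to filling in `corrected_text`; nothing else about how the page is requested or delivered has to change.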


