How to Preserve White Space for OCR

Extensible Markup Language (XML) is a markup language used to encode documents in a format that both people and programs can read. Optical Character Recognition (OCR) converts an image-based file into a text-based document in which all of the text is searchable. An Extensible Stylesheet Language Transformations (XSLT) stylesheet can control which whitespace survives when such a document is transformed: the "<xsl:preserve-space>" element names the elements whose whitespace-only text is kept, and the "<xsl:strip-space>" element names the elements whose whitespace-only text is removed. You can create the stylesheet in Notepad.

Instructions

    • 1

      Click on the Windows "Start" menu followed by "All Programs" and "Accessories." Click the "Notepad" shortcut icon, which will launch the Windows Notepad application.

    • 2

      Copy and paste the following XSLT stylesheet into Notepad:

      <?xml version="1.0" encoding="ISO-8859-1"?>
      <xsl:stylesheet version="1.0"
          xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

        <!-- Elements whose whitespace-only text is removed -->
        <xsl:strip-space elements="STRIP_ELEMENTS" />
        <!-- Elements whose whitespace-only text is kept -->
        <xsl:preserve-space elements="PRESERVE_ELEMENTS" />

        <xsl:template match="/">
          <html>
            <body>
              <!-- Select the document element so its children can be read -->
              <xsl:for-each select="/*">
                <p>
                  <xsl:value-of select="value_1" /><br />
                  <xsl:value-of select="value_2" /><br />
                  <xsl:value-of select="value_3" /><br />
                  <xsl:value-of select="value_4" /><br />
                </p>
              </xsl:for-each>
            </body>
          </html>
        </xsl:template>

      </xsl:stylesheet>

      Replace "value_1," "value_2," "value_3" and "value_4" with the names of the elements whose text you want to output. Replace "STRIP_ELEMENTS" and "PRESERVE_ELEMENTS" with space-separated lists of element names, or with "*" to match every element. Both settings apply only to text that consists entirely of whitespace; whitespace inside text that also contains other characters is always kept. This gives you a basic stylesheet for preserving whitespace when transforming OCR output stored as XML.

    • 3

      Click on "File" followed by "Save." A "Save As" dialog box appears. Set the save location to the Windows Desktop, and then change "Save as type" to "All Files" so that Notepad does not append ".txt" to the name. Name the file "whitespace.xml," and then click on "Save." This will save the file to the Windows Desktop.

    • 4

      Go to the Windows Desktop and confirm that the file has been saved.
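
To see what the "<xsl:strip-space>" and "<xsl:preserve-space>" settings from step 2 do to a document, the following Python sketch imitates their effect on a small sample. It is not an XSLT processor and it ignores most of XSLT's matching rules; it only removes whitespace-only text under "stripped" elements and leaves it alone under "preserved" ones. The element names "page," "line," "block" and "word," and the function name "strip_whitespace_nodes," are made up for this illustration and do not come from any OCR tool.

```python
from xml.dom import minidom

# Hypothetical OCR-style sample; the element names are illustrative only.
SAMPLE = """<page>
  <line>  Invoice   No. 42  </line>
  <block>
    <word>Total</word> <word>19.99</word>
  </block>
</page>"""


def strip_whitespace_nodes(node, strip, preserve):
    """Remove whitespace-only text nodes whose parent element is named in
    `strip` (or matched by "*") and is not named in `preserve`.

    This loosely mirrors xsl:strip-space / xsl:preserve-space. Text that
    contains any non-whitespace character is never touched.
    """
    for child in list(node.childNodes):  # copy: we remove while iterating
        if child.nodeType == child.TEXT_NODE:
            parent = node.nodeName
            strippable = ("*" in strip or parent in strip) and parent not in preserve
            # XML whitespace is space, tab, carriage return and line feed.
            if strippable and child.data.strip(" \t\r\n") == "":
                node.removeChild(child)
        elif child.nodeType == child.ELEMENT_NODE:
            strip_whitespace_nodes(child, strip, preserve)


doc = minidom.parseString(SAMPLE)
# Comparable to elements="*" for strip-space and elements="block" for preserve-space.
strip_whitespace_nodes(doc.documentElement, strip={"*"}, preserve={"block"})
result = doc.documentElement.toxml()
print(result)
```

In the printed result, the line breaks and indentation directly under "page" are gone, the space between the two "word" elements inside "block" is still there, and the spacing inside "  Invoice   No. 42  " is unchanged because that text is not whitespace-only.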
