What Is XML Encoding?

Many different languages and character sets are used within computing.

Computers encode the characters in saved files in a number of possible ways. This encoding refers to how the characters are translated to and from the numerical sequences which are used to represent them within the computer system itself. XML encoding refers to the encoding used for saving the data in an XML document. As a user of XML, you won't generally need to worry about encoding, but it can occasionally cause errors.

    ASCII

    • The American Standard Code for Information Interchange (ASCII) is one of the oldest and most widely used character encodings in computing. It defines only 128 characters, covering unaccented Latin letters, digits, punctuation and control codes, whereas XML data can contain characters outside that set. Some text editors save files in ASCII or another single-byte encoding by default, so an XML file that contains non-ASCII characters but is saved that way may cause errors. XML is designed to be flexible and to be used in many different contexts, which is partly why it allows such a wide range of characters.
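
As a rough illustration of that limit, the Python sketch below checks whether a string can be stored as ASCII at all. The helper name fits_in_ascii and the sample strings are hypothetical, chosen only for this example.

```python
def fits_in_ascii(text: str) -> bool:
    """Return True if every character in `text` belongs to the ASCII set."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        # At least one character has no ASCII representation.
        return False
    return True


# Hypothetical sample strings: plain Latin, accented Latin, and Arabic.
for sample in ("cafe", "caf\u00e9", "\u0645\u0631\u062d\u0628\u0627"):
    print(repr(sample), fits_in_ascii(sample))
```

Only the first sample passes the check; the accented and Arabic strings need a wider encoding.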

    Unicode

    • Unicode is a standard for character encoding that allows a far wider range of characters than ASCII, including the scripts of many different languages. While ASCII is based on English, Unicode aims to support many languages, alphabets and symbols, so it provides adequate coverage for XML files. Unicode has several encoding forms, chiefly UTF-8, UTF-16 and UTF-32. All of them can represent every Unicode character; they differ in how many bytes each character occupies.
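
To make the difference between the Unicode encoding forms concrete, here is a minimal Python sketch that counts the bytes each form uses for a few characters. The function name encoded_lengths and the sample characters are assumptions made for illustration. The little-endian codec variants are used so that no byte order mark is added to the counts.

```python
# The "-le" codecs omit the byte order mark, so only the character is counted.
ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")


def encoded_lengths(character: str) -> dict:
    """Map each encoding name to the number of bytes it uses for `character`."""
    return {name: len(character.encode(name)) for name in ENCODINGS}


# Hypothetical samples: an ASCII letter, an accented letter,
# an Arabic letter, and a character outside the Basic Multilingual Plane.
samples = {
    "Latin A": "A",
    "e with acute": "\u00e9",
    "Arabic beh": "\u0628",
    "emoji": "\U0001F600",
}

for label, character in samples.items():
    print(label, encoded_lengths(character))
```

Every sample can be encoded in all three forms; only the byte counts change.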

    Errors

    • The most common errors caused by incorrectly encoded XML involve characters such as the accented letters used in languages like French, or letters that are not part of the Latin alphabet, such as those used in Arabic. Where these errors occur, the solution in most cases is to change the encoding used. To do this, it's normally necessary to save the XML file again with a different encoding selected, and optionally to include a matching encoding attribute.
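
The sketch below reproduces this kind of error with Python's standard xml.etree.ElementTree parser and then repairs it by re-saving the data in a different encoding. It assumes the file is known to have been saved as Latin-1 (ISO 8859-1); the sample document and the helper name parses are hypothetical.

```python
import xml.etree.ElementTree as ET

# A hypothetical document that declares UTF-8 but was saved as Latin-1,
# so the byte used for the accented letter is not valid UTF-8.
document = '<?xml version="1.0" encoding="UTF-8"?><name>Fran\u00e7ois</name>'
mismatched = document.encode("latin-1")


def parses(data: bytes) -> bool:
    """Return True if the parser accepts `data` as well-formed XML."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True


print("as saved:", parses(mismatched))

# Repair: decode with the encoding the file was really saved in
# (assumed here to be Latin-1), then save again in the declared encoding.
repaired = mismatched.decode("latin-1").encode("utf-8")
print("re-saved:", parses(repaired))
print(ET.fromstring(repaired).text)
```

The repair only works when the real encoding of the file is known; guessing it wrongly would silently produce the wrong characters.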

    Encoding Attribute

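    • An XML document can state its encoding with an encoding attribute in the XML declaration, which must be the first thing in the file. The following sample shows UTF-16, a Unicode encoding that uses two or four bytes per character, being declared at the start of an XML document: <?xml version="1.0" encoding="UTF-16"?>. The encoding attribute can name many different encoding standards, including UTF-8 (a Unicode encoding that uses one to four bytes per character) and the ISO 8859 family. If XML data contains characters in alphabets other than the Latin one, or uses characters with accents, it's generally advisable to use a Unicode encoding such as UTF-8 or UTF-16, both of which every XML processor is required to support.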
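
As a small sketch of the declaration in use, the Python code below builds a one-element document whose encoding attribute names the same encoding used to produce the bytes, then parses it back. The function name build_document and the <greeting> element are hypothetical, and the text is assumed to contain no markup characters such as < or &.

```python
import xml.etree.ElementTree as ET


def build_document(text: str, encoding: str) -> bytes:
    """Return a one-element XML document declared and encoded as `encoding`.

    Assumes `text` contains no characters that would need escaping.
    """
    declaration = f'<?xml version="1.0" encoding="{encoding}"?>'
    body = f"<greeting>{text}</greeting>"
    # The same name drives both the declaration and the byte encoding.
    return (declaration + body).encode(encoding)


# Hypothetical sample text in Arabic script.
arabic = "\u0645\u0631\u062d\u0628\u0627"

for name in ("UTF-16", "UTF-8"):
    data = build_document(arabic, name)
    root = ET.fromstring(data)
    print(name, len(data), "bytes, round trip ok:", root.text == arabic)
```

Python's "UTF-16" codec writes a byte order mark at the start of the data, which the parser uses to detect the byte order.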

    Text Editors

    • XML documents can be created, viewed and edited in most standard text editors, and there are also various editors designed specifically for XML. Some text editing programs, such as older versions of Notepad for Windows, save files by default in a legacy single-byte encoding (labeled "ANSI") rather than Unicode, which will cause problems for certain XML documents. In such cases, developers can choose a Unicode encoding in the "Save as" dialog and include the matching XML encoding attribute. The encoding attribute and the saving options must name the same encoding standard to prevent errors when the XML data is read.
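
When a file is written by a program rather than an editor, one way to keep the two settings in step is to take both from a single value. The following Python sketch does that; save_xml, the file name and the sample content are hypothetical, and the approach is one option rather than a required pattern.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def save_xml(path: str, body: str, encoding: str = "UTF-8") -> None:
    """Write `body` to `path`, declaring and saving with the same encoding."""
    declaration = f'<?xml version="1.0" encoding="{encoding}"?>\n'
    # One `encoding` value feeds both the declaration and the file writer,
    # so the attribute and the saved bytes cannot drift apart.
    with open(path, "w", encoding=encoding, newline="\n") as handle:
        handle.write(declaration + body)


# Usage with a hypothetical file in a temporary directory.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "names.xml")
    save_xml(path, "<name>Zo\u00eb</name>", encoding="UTF-16")
    root = ET.parse(path).getroot()
    print(root.text == "Zo\u00eb")
```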
