jonnyseymour
ServiceNow Employee

Web services became popular soon after they were introduced, and they remain a great tool for exchanging information. ServiceNow's web service implementations are top notch. When encoding international characters, the safest option is Unicode. One of its most popular encodings is UTF-8, which is the one ServiceNow has adopted. If you need to connect to your instance and use non-ASCII characters, you should read this blog, especially if you are seeing � or question marks (?) in the data received through web services. Luckily, the solution is simple: if you are not using UTF-8, safe-encode the non-ASCII characters before sending them (see below).

Utf8webgrowth.png

Encoding aside, incoming SOAP requests with characters like "&", "<" or ">" in the data can cause errors. Those characters are reserved because they are part of the XML markup used to delimit the data. If they appear unescaped in the data, they interfere with the SOAP message itself, and ServiceNow reports the error "Unable to parse SOAP document".
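As a minimal sketch of the fix for reserved characters, the snippet below escapes only "&", "<" and ">" before a value is placed inside an element. The helper name `escapeReserved` is illustrative and is not a ServiceNow API; `console.log` is used so the snippet runs in Node.js or a browser.

```javascript
// Hypothetical helper: escape the XML-reserved characters in element text.
// A single pass with a lookup table avoids re-escaping the "&" of an
// entity that was just produced.
function escapeReserved(text) {
    var map = { '&': '&amp;', '<': '&lt;', '>': '&gt;' };
    return String(text).replace(/[&<>]/g, function (ch) {
        return map[ch];
    });
}

var raw = 'R&D <urgent> a > b';
var element = '<short_description>' + escapeReserved(raw) + '</short_description>';
console.log(element);
// <short_description>R&amp;D &lt;urgent&gt; a &gt; b</short_description>
```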

Recommendations to safely pass non-ascii characters

I will focus on SOAP web services. They use SOAP messages to exchange data between the SOAP nodes (e.g. your SOAP client and the instance). However, while the data travels in the message, both parties need to agree on the encoding. We make it easy: the instance uses UTF-8 to encode characters.
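One way for a client to keep its side of that agreement is to declare the charset and send bytes that actually match the declaration. The sketch below assumes a generic HTTP client; the helper name `buildSoapRequest` and the shape of the returned object are illustrative only.

```javascript
// Hypothetical helper: pair the declared charset with matching bytes.
function buildSoapRequest(envelopeXml) {
    // TextEncoder always produces UTF-8, so the declared encoding cannot
    // drift away from the encoding of the bytes that are sent.
    var bytes = new TextEncoder().encode(
        '<?xml version="1.0" encoding="UTF-8"?>' + envelopeXml
    );
    return {
        headers: { 'Content-Type': 'text/xml; charset=UTF-8' },
        body: bytes
    };
}

// "\u00e9" is é; it occupies two bytes in UTF-8.
var req = buildSoapRequest('<comments>caf\u00e9</comments>');
console.log(req.headers['Content-Type']); // text/xml; charset=UTF-8
console.log(req.body.length);             // byte count, one more than the character count
```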

unicode-shield.png

If you interact with your instance using an encoding other than Unicode and your data contains special characters, you may face problems. However, XML lets you safe-encode most characters as escape sequences (character references).
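As an illustration of safe encoding, the sketch below replaces every non-ASCII character with a hexadecimal XML character reference. The helper name `toCharRefs` is hypothetical. It uses ES2015 features (`for...of`, `codePointAt`), so it is meant for Node.js or a browser rather than an ES5-only scripting engine. It deliberately leaves "&", "<" and ">" alone; those still need their own escaping.

```javascript
// Hypothetical helper: replace every non-ASCII character with a
// hexadecimal XML character reference such as &#xE9;.
function toCharRefs(text) {
    var out = '';
    // for...of iterates by code point, so surrogate pairs stay together.
    for (var ch of String(text)) {
        var cp = ch.codePointAt(0);
        out += cp > 0x7F ? '&#x' + cp.toString(16).toUpperCase() + ';' : ch;
    }
    return out;
}

// "\u00A1Ol\u00E9! \u0141\u00F3d\u017A" is "¡Olé! Łódź"
console.log(toCharRefs('\u00A1Ol\u00E9! \u0141\u00F3d\u017A'));
// &#xA1;Ol&#xE9;! &#x141;&#xF3;d&#x17A;
```

The result contains only ASCII bytes, which read the same in UTF-8 and ISO-8859-1, so the receiving side no longer depends on guessing the sender's encoding.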

| For SOAP message | Is the message encoded in UTF-8? | Recommendation: does the message need to be safe encoded? |
| --- | --- | --- |
| Incoming to the instance | Yes | No, it is not necessary. If safe encoded, it will also work. |
| Incoming to the instance | No | Yes, safe encode the data to avoid ? or � characters |
| Outbound from the instance | Yes | Yes, if the target is not UTF-8, to avoid ? or � characters |
| Outbound from the instance | No | Yes, always safe encode the data |
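The recommendations above can be restated as a small decision function. This is only a sketch of the rules as listed; the function name `mustSafeEncode` and its parameters are hypothetical.

```javascript
// Hypothetical helper mirroring the recommendations above.
// direction:     'incoming' (to the instance) or 'outbound' (from the instance)
// messageIsUtf8: whether the message itself is encoded in UTF-8
// targetIsUtf8:  for outbound calls, whether the endpoint expects UTF-8
function mustSafeEncode(direction, messageIsUtf8, targetIsUtf8) {
    if (!messageIsUtf8) {
        return true; // both "No" rows: always safe encode
    }
    if (direction === 'outbound') {
        return !targetIsUtf8; // only needed when the endpoint is not UTF-8
    }
    return false; // incoming UTF-8: optional, and harmless if done anyway
}

console.log(mustSafeEncode('incoming', false, true)); // true
console.log(mustSafeEncode('outbound', true, true));  // false
```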

Outbound SOAP messages from the instance are always encoded in UTF-8.

"Incoming" means SOAP calls toward your instance. "Outbound" means SOAP calls from the instance to your endpoint. To safely encode the message, you need to transform every non-ASCII character into an XML character reference (for example, é becomes &#xE9;).

Example of a message sent to the instance using a Unicode (UTF-8) encoding

In the following example, I use SOAP UI to send "comments" that contain non-ASCII characters and "work_notes" that contain the same characters safely encoded. I expect the system to have no problem with the characters because both SOAP nodes are using UTF-8.

In SOAP UI, it looks like this:

SOAP ui.jpg

In more detail, the message looks like:

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:inc="http://www.service-now.com/incident">

    <soapenv:Header/>

    <soapenv:Body>

          <inc:insert>

                <short_description>Testing with encoding characters characters</short_description>

                <comments>Testing with encoding characters characters -not safe encoded

Basic Latin

! " # $ % &amp; ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; &#x3C; = &#x3E; ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~

Latin-1 Supplement

  ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ ­ ® ¯ ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ

Latin Extended-A

Ā ā Ă ă Ą ą Ć ć Ĉ ĉ Ċ ċ Č č Ď ď Đ đ Ē ē Ĕ ĕ Ė ė Ę ę Ě ě Ĝ ĝ Ğ ğ Ġ ġ Ģ ģ Ĥ ĥ Ħ ħ Ĩ ĩ Ī ī Ĭ ĭ Į į İ ı Ĳ ĳ Ĵ ĵ Ķ ķ ĸ Ĺ ĺ Ļ ļ Ľ ľ Ŀ ŀ Ł ł Ń ń Ņ ņ Ň ň ŉ Ŋ ŋ Ō ō Ŏ ŏ Ő ő Œ œ Ŕ ŕ Ŗ ŗ Ř ř Ś ś Ŝ ŝ Ş ş Š š Ţ ţ Ť ť Ŧ ŧ Ũ ũ Ū ū Ŭ ŭ Ů ů Ű ű Ų ų Ŵ ŵ Ŷ ŷ Ÿ Ź ź Ż ż Ž ž ſ

Latin Extended-B

ƀ Ɓ Ƃ ƃ Ƅ ƅ Ɔ Ƈ ƈ Ɖ Ɗ Ƌ ƌ ƍ Ǝ Ə Ɛ Ƒ ƒ Ɠ Ɣ ƕ Ɩ Ɨ Ƙ ƙ ƚ ƛ Ɯ Ɲ ƞ Ɵ Ơ ơ Ƣ ƣ Ƥ ƥ Ʀ Ƨ ƨ Ʃ ƪ ƫ Ƭ ƭ Ʈ Ư ư Ʊ Ʋ Ƴ ƴ Ƶ ƶ Ʒ Ƹ ƹ ƺ ƻ Ƽ ƽ ƾ ƿ ǀ ǁ ǂ ǃ Ǆ ǅ ǆ Ǉ ǈ ǉ Ǌ ǋ ǌ Ǎ ǎ Ǐ ǐ Ǒ ǒ Ǔ ǔ Ǖ ǖ Ǘ ǘ Ǚ ǚ Ǜ ǜ ǝ Ǟ ǟ Ǡ ǡ Ǣ ǣ Ǥ ǥ Ǧ ǧ Ǩ ǩ Ǫ ǫ Ǭ ǭ Ǯ ǯ ǰ Ǳ ǲ ǳ Ǵ ǵ Ƕ Ƿ Ǹ ǹ Ǻ ǻ Ǽ ǽ Ǿ ǿ ...

</comments>

                <work_notes>Testing with encoding characters characters -safe encoded

Basic Latin

! &#x22; # $ % &#x26; &#x27; ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; &#x3C; = &#x3E; ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ &#x60; a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~

Latin-1 Supplement

  &#xA1; &#xA2; &#xA3; &#xA4; &#xA5; &#xA6; &#xA7; &#xA8; &#xA9; &#xAA; &#xAB; &#xAC; &#xAD; &#xAE; &#xAF; &#xB0; &#xB1; &#xB2; &#xB3; &#xB4; &#xB5; &#xB6; &#xB7; &#xB8; &#xB9; &#xBA; &#xBB; &#xBC; &#xBD; &#xBE; &#xBF; &#xC0; &#xC1; &#xC2; &#xC3; &#xC4; &#xC5; &#xC6; &#xC7; &#xC8; &#xC9; &#xCA; &#xCB; &#xCC; &#xCD; &#xCE; &#xCF; &#xD0; &#xD1; &#xD2; &#xD3; &#xD4; &#xD5; &#xD6; &#xD7; &#xD8; &#xD9; &#xDA; &#xDB; &#xDC; &#xDD; &#xDE; &#xDF; &#xE0; &#xE1; &#xE2; &#xE3; &#xE4; &#xE5; &#xE6; &#xE7; &#xE8; &#xE9; &#xEA; &#xEB; &#xEC; &#xED; &#xEE; &#xEF; &#xF0; &#xF1; &#xF2; &#xF3; &#xF4; &#xF5; &#xF6; &#xF7; &#xF8; &#xF9; &#xFA; &#xFB; &#xFC; &#xFD; &#xFE; &#xFF;

Latin Extended-A

&#x100; &#x101; &#x102; &#x103; &#x104; &#x105; &#x106; &#x107; &#x108; &#x109; &#x10A; &#x10B; &#x10C; &#x10D; &#x10E; &#x10F; &#x110; &#x111; &#x112; &#x113; &#x114; &#x115; &#x116; &#x117; &#x118; &#x119; &#x11A; &#x11B; &#x11C; &#x11D; &#x11E; &#x11F; &#x120; &#x121; &#x122; &#x123; &#x124; &#x125; &#x126; &#x127; &#x128; &#x129; &#x12A; &#x12B; &#x12C; &#x12D; &#x12E; &#x12F; &#x130; &#x131; &#x132; &#x133; &#x134; &#x135; &#x136; &#x137; &#x138; &#x139; &#x13A; &#x13B; &#x13C; &#x13D; &#x13E; &#x13F; &#x140; &#x141; &#x142; &#x143; &#x144; &#x145; &#x146; &#x147; &#x148; &#x149; &#x14A; &#x14B; &#x14C; &#x14D; &#x14E; &#x14F; &#x150; &#x151; &#x152; &#x153; &#x154; &#x155; &#x156; &#x157; &#x158; &#x159; &#x15A; &#x15B; &#x15C; &#x15D; &#x15E; &#x15F; &#x160; &#x161; &#x162; &#x163; &#x164; &#x165; &#x166; &#x167; &#x168; &#x169; &#x16A; &#x16B; &#x16C; &#x16D; &#x16E; &#x16F; &#x170; &#x171; &#x172; &#x173; &#x174; &#x175; &#x176; &#x177; &#x178; &#x179; &#x17A; &#x17B; &#x17C; &#x17D; &#x17E; &#x17F;

Latin Extended-B

&#x180; &#x181; &#x182; &#x183; &#x184; &#x185; &#x186; &#x187; &#x188; &#x189; &#x18A; &#x18B; &#x18C; &#x18D; &#x18E; &#x18F; &#x190; &#x191; &#x192; &#x193; &#x194; &#x195; &#x196; &#x197; &#x198; &#x199; &#x19A; &#x19B; &#x19C; &#x19D; &#x19E; &#x19F; &#x1A0; &#x1A1; &#x1A2; &#x1A3; &#x1A4; &#x1A5; &#x1A6; &#x1A7; &#x1A8; &#x1A9; &#x1AA; &#x1AB; &#x1AC; &#x1AD; &#x1AE; &#x1AF; &#x1B0; &#x1B1; &#x1B2; &#x1B3; &#x1B4; &#x1B5; &#x1B6; &#x1B7; &#x1B8; &#x1B9; &#x1BA; &#x1BB; &#x1BC; &#x1BD; &#x1BE; &#x1BF; &#x1C0; &#x1C1; &#x1C2; &#x1C3; &#x1C4; &#x1C5; &#x1C6; &#x1C7; &#x1C8; &#x1C9; &#x1CA; &#x1CB; &#x1CC; &#x1CD; &#x1CE; &#x1CF; &#x1D0; &#x1D1; &#x1D2; &#x1D3; &#x1D4; &#x1D5; &#x1D6; &#x1D7; &#x1D8; &#x1D9; &#x1DA; &#x1DB; &#x1DC; &#x1DD; &#x1DE; &#x1DF; &#x1E0; &#x1E1; &#x1E2; &#x1E3; &#x1E4; &#x1E5; &#x1E6; &#x1E7; &#x1E8; &#x1E9; &#x1EA; &#x1EB; &#x1EC; &#x1ED; &#x1EE; &#x1EF; &#x1F0; &#x1F1; &#x1F2; &#x1F3; &#x1F4; &#x1F5; &#x1F6; &#x1F7; &#x1F8; &#x1F9; &#x1FA; &#x1FB; &#x1FC; &#x1FD; &#x1FE; &#x1FF;

</work_notes>

          </inc:insert>

    </soapenv:Body>

</soapenv:Envelope>

Once processed into the target table, the characters are correctly displayed.

UTF 8 ENCODED.jpg

This shows that the data is processed without any problems.
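The reason both fields end up identical is that an XML parser resolves each character reference to the character it names. The sketch below imitates that step for numeric references only; `decodeCharRefs` is a hypothetical helper, not part of ServiceNow, and it uses the ES2015 `String.fromCodePoint`.

```javascript
// Hypothetical helper: resolve numeric character references the way an
// XML parser would, to show that both fields carry the same text.
function decodeCharRefs(text) {
    return String(text).replace(/&#(x[0-9A-Fa-f]+|[0-9]+);/g, function (match, ref) {
        var cp = ref.charAt(0) === 'x'
            ? parseInt(ref.slice(1), 16)  // hexadecimal form, e.g. &#x100;
            : parseInt(ref, 10);          // decimal form, e.g. &#256;
        return String.fromCodePoint(cp);
    });
}

var comments = '\u0100 \u0101 \u0102 \u0103';          // Ā ā Ă ă sent as raw UTF-8
var workNotes = '&#x100; &#x101; &#x102; &#x103;';     // the same text, safe encoded
console.log(decodeCharRefs(workNotes) === comments);   // true
```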

Example of a message sent to the instance using a non-Unicode (ISO-8859-1) encoding

As in the previous example, I use SOAP UI to send "comments" that contain non-ASCII characters and "work_notes" that contain the same characters safely encoded. This time I encode the request in 'iso-8859-1'. I expect the system to try to interpret the characters as UTF-8.

In SOAP UI, it looks like this:

non unicode 8 soap ui.jpg

In more detail, the message looks like:

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:inc="http://www.service-now.com/incident">

    <soapenv:Header/>

    <soapenv:Body>

          <inc:insert>

                <short_description>Testing with encoding characters characters</short_description>

                <comments>Testing with encoding characters characters -not safe encoded

Basic Latin

! " # $ % ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ;   @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~

Latin Extended-A

A a Ă ă Ą ą Ć ć C c C c Č č Ď ď Đ đ E e E e E e Ę ę Ě ě G g G g G g G g H h H h I i I i I i I i I i J j K k Ĺ ĺ L l Ľ ľ Ł ł Ń ń N n Ň ň O o O o Ő ő O o Ŕ ŕ R r Ř ř Ś ś S s Ş ş Š š Ţ ţ Ť ť T t U u U u U u Ů ů Ű ű U u W w Y y Y Ź ź Ż ż Ž ž

Latin Extended-B

b Đ F f I l O O o t T U u z | ! A a I i O o U u U u U u U u U u A a G g G g K k O o O o j

</comments>

                <work_notes>Testing with encoding characters characters -safe encoded

Basic Latin

! &#x22; # $ % &#x26; &#x27; ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; &#x3C; = &#x3E; ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ &#x60; a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~

Latin-1 Supplement

  &#xA1; &#xA2; &#xA3; &#xA4; &#xA5; &#xA6; &#xA7; &#xA8; &#xA9; &#xAA; &#xAB; &#xAC; &#xAD; &#xAE; &#xAF; &#xB0; &#xB1; &#xB2; &#xB3; &#xB4; &#xB5; &#xB6; &#xB7; &#xB8; &#xB9; &#xBA; &#xBB; &#xBC; &#xBD; &#xBE; &#xBF; &#xC0; &#xC1; &#xC2; &#xC3; &#xC4; &#xC5; &#xC6; &#xC7; &#xC8; &#xC9; &#xCA; &#xCB; &#xCC; &#xCD; &#xCE; &#xCF; &#xD0; &#xD1; &#xD2; &#xD3; &#xD4; &#xD5; &#xD6; &#xD7; &#xD8; &#xD9; &#xDA; &#xDB; &#xDC; &#xDD; &#xDE; &#xDF; &#xE0; &#xE1; &#xE2; &#xE3; &#xE4; &#xE5; &#xE6; &#xE7; &#xE8; &#xE9; &#xEA; &#xEB; &#xEC; &#xED; &#xEE; &#xEF; &#xF0; &#xF1; &#xF2; &#xF3; &#xF4; &#xF5; &#xF6; &#xF7; &#xF8; &#xF9; &#xFA; &#xFB; &#xFC; &#xFD; &#xFE; &#xFF;

Latin Extended-A

&#x100; &#x101; &#x102; &#x103; &#x104; &#x105; &#x106; &#x107; &#x108; &#x109; &#x10A; &#x10B; &#x10C; &#x10D; &#x10E; &#x10F; &#x110; &#x111; &#x112; &#x113; &#x114; &#x115; &#x116; &#x117; &#x118; &#x119; &#x11A; &#x11B; &#x11C; &#x11D; &#x11E; &#x11F; &#x120; &#x121; &#x122; &#x123; &#x124; &#x125; &#x126; &#x127; &#x128; &#x129; &#x12A; &#x12B; &#x12C; &#x12D; &#x12E; &#x12F; &#x130; &#x131; &#x132; &#x133; &#x134; &#x135; &#x136; &#x137; &#x138; &#x139; &#x13A; &#x13B; &#x13C; &#x13D; &#x13E; &#x13F; &#x140; &#x141; &#x142; &#x143; &#x144; &#x145; &#x146; &#x147; &#x148; &#x149; &#x14A; &#x14B; &#x14C; &#x14D; &#x14E; &#x14F; &#x150; &#x151; &#x152; &#x153; &#x154; &#x155; &#x156; &#x157; &#x158; &#x159; &#x15A; &#x15B; &#x15C; &#x15D; &#x15E; &#x15F; &#x160; &#x161; &#x162; &#x163; &#x164; &#x165; &#x166; &#x167; &#x168; &#x169; &#x16A; &#x16B; &#x16C; &#x16D; &#x16E; &#x16F; &#x170; &#x171; &#x172; &#x173; &#x174; &#x175; &#x176; &#x177; &#x178; &#x179; &#x17A; &#x17B; &#x17C; &#x17D; &#x17E; &#x17F;

Latin Extended-B

&#x180; &#x181; &#x182; &#x183; &#x184; &#x185; &#x186; &#x187; &#x188; &#x189; &#x18A; &#x18B; &#x18C; &#x18D; &#x18E; &#x18F; &#x190; &#x191; &#x192; &#x193; &#x194; &#x195; &#x196; &#x197; &#x198; &#x199; &#x19A; &#x19B; &#x19C; &#x19D; &#x19E; &#x19F; &#x1A0; &#x1A1; &#x1A2; &#x1A3; &#x1A4; &#x1A5; &#x1A6; &#x1A7; &#x1A8; &#x1A9; &#x1AA; &#x1AB; &#x1AC; &#x1AD; &#x1AE; &#x1AF; &#x1B0; &#x1B1; &#x1B2; &#x1B3; &#x1B4; &#x1B5; &#x1B6; &#x1B7; &#x1B8; &#x1B9; &#x1BA; &#x1BB; &#x1BC; &#x1BD; &#x1BE; &#x1BF; &#x1C0; &#x1C1; &#x1C2; &#x1C3; &#x1C4; &#x1C5; &#x1C6; &#x1C7; &#x1C8; &#x1C9; &#x1CA; &#x1CB; &#x1CC; &#x1CD; &#x1CE; &#x1CF; &#x1D0; &#x1D1; &#x1D2; &#x1D3; &#x1D4; &#x1D5; &#x1D6; &#x1D7; &#x1D8; &#x1D9; &#x1DA; &#x1DB; &#x1DC; &#x1DD; &#x1DE; &#x1DF; &#x1E0; &#x1E1; &#x1E2; &#x1E3; &#x1E4; &#x1E5; &#x1E6; &#x1E7; &#x1E8; &#x1E9; &#x1EA; &#x1EB; &#x1EC; &#x1ED; &#x1EE; &#x1EF; &#x1F0; &#x1F1; &#x1F2; &#x1F3; &#x1F4; &#x1F5; &#x1F6; &#x1F7; &#x1F8; &#x1F9; &#x1FA; &#x1FB; &#x1FC; &#x1FD; &#x1FE; &#x1FF;

</work_notes>

          </inc:insert>

    </soapenv:Body>

</soapenv:Envelope>

Below, you can see that some of the raw characters were translated incorrectly. However, the characters sent as XML character references arrived correctly.

incorrectly translated UTF .jpg

Sending raw non-ASCII characters can cause them to be transferred incorrectly. With XML-encoded characters, you can safely transfer them even when the encoding is not UTF-8.
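To see where the garbled characters come from, the sketch below reinterprets the same bytes under the wrong encoding. It assumes a Node.js runtime, because it relies on the global `Buffer` class, and it is not something you would run on the instance.

```javascript
// Node.js only: Buffer is used to reinterpret bytes under another encoding.
var original = '\u00E9'; // é

// UTF-8 bytes (C3 A9) misread as ISO-8859-1 give two wrong characters.
var utf8ReadAsLatin1 = Buffer.from(original, 'utf8').toString('latin1');
console.log(utf8ReadAsLatin1); // Ã©

// An ISO-8859-1 byte (E9) misread as UTF-8 is invalid and becomes U+FFFD (�).
var latin1ReadAsUtf8 = Buffer.from(original, 'latin1').toString('utf8');
console.log(latin1ReadAsUtf8 === '\uFFFD'); // true

// The safe-encoded form is pure ASCII, so it survives the same mismatch.
var safe = '&#xE9;';
console.log(Buffer.from(safe, 'latin1').toString('utf8') === safe); // true
```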

How to safely encode XML data

There are several ways to encode the data. Here is a simple background script function that does it:

// Simple function to encode XML data.
// It escapes "&" itself, so do not run it on text that is already escaped
// (that would double-escape the characters).
function escapeXMLEntities(xmldata) {
    // Matches the reserved characters & < > and code points U+00A0 to U+2666.
    return xmldata.replace(/[\u00A0-\u2666<>&]/g, function (ch) {
        // Replace each match with its decimal character reference, e.g. "&#258;".
        return "&#" + ch.charCodeAt(0) + ";";
    });
}

var str = "A a Ă ă Ą ą Ć ć C c C c Č č Ď ď Đ đ E e E e E e Ę ę Ě ě G g G g G g G g H h H h I i I i I i I i I i J j K k Ĺ ĺ L l Ľ ľ Ł ł Ń ń N n Ň ň O o O o Ő ő O o Ŕ ŕ R r Ř ř Ś ś S s Ş ş Š š Ţ ţ Ť ť T t U u U u U u Ů ů Ű ű U u W w Y y Y Ź ź Ż ż Ž ž";

gs.print(escapeXMLEntities(str));

Result:

Script completed in scope global: script

*** Script: A a &#258; &#259; &#260; &#261; &#262; &#263; C c C c &#268; &#269; &#270; &#271; &#272; &#273; E e E e E e &#280; &#281; &#282; &#283; G g G g G g G g H h H h I i I i I i I i I i J j K k &#313; &#314; L l &#317; &#318; &#321; &#322; &#323; &#324; N n &#327; &#328; O o O o &#336; &#337; O o &#340; &#341; R r &#344; &#345; &#346; &#347; S s &#350; &#351; &#352; &#353; &#354; &#355; &#356; &#357; T t U u U u U u &#366; &#367; &#368; &#369; U u W w Y y Y &#377; &#378; &#379; &#380; &#381; &#382;
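Because the function above escapes every "&", running it on text that is already escaped turns "&#258;" into "&#38;#258;". The variant below is one possible guard, offered as an illustration rather than as the original code: an "&" that already begins a character reference or named entity is left alone. The name `escapeXMLEntitiesOnce` is hypothetical, and `console.log` stands in for `gs.print` so the snippet runs outside the instance.

```javascript
// Variant of the function above (an illustration, not the original code):
// an "&" that already begins a reference such as &#258;, &#x102; or &amp;
// is skipped, so running the function twice gives the same result.
function escapeXMLEntitiesOnce(xmldata) {
    var pattern = /&(?!#[0-9]+;|#x[0-9A-Fa-f]+;|[A-Za-z][A-Za-z0-9]*;)|[\u00A0-\u2666<>]/g;
    return String(xmldata).replace(pattern, function (ch) {
        return '&#' + ch.charCodeAt(0) + ';';
    });
}

var once = escapeXMLEntitiesOnce('\u0141 & \u0142'); // "Ł & ł"
var twice = escapeXMLEntitiesOnce(once);
console.log(once);           // &#321; &#38; &#322;
console.log(once === twice); // true
```

The trade-off is that input which literally contains text such as "&amp;" and is meant to be shown that way will be passed through unescaped, so this guard only suits data whose escaping state is unknown.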

When using SOAP messages, make sure you transfer the data as UTF-8, or make sure non-ASCII characters are escaped into XML-safe character references. Note that the same technique is useful in many other situations. Luckily, it is not a big challenge to do programmatically.

More information can be found here:
