What Are Non-GSM Characters?

GSM-7 is a character encoding standard that packs the most commonly used letters and symbols of many languages into 7 bits each for use on GSM networks. Because an SMS message carries a payload of 140 8-bit octets, a GSM-7 encoded SMS message can hold up to 160 characters.

The basic character set for GSM-7 can be found here.

Some characters, such as ‘{’ and ‘]’, require an escape code, so even in a GSM-7 encoded message each of these characters takes up the space of two characters.
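The effect of escape codes on message length can be sketched as follows. This is only an illustration: the name `septet_length` is hypothetical, the extension set is taken from the GSM 03.38 extension table, and the function assumes the text is already GSM-7 compatible.

```python
# Characters from the GSM-7 extension table. Each one is transmitted as an
# escape code followed by a second code, so it occupies two 7-bit units.
GSM7_EXTENSION = set("^{}\\[~]|€\f")


def septet_length(text):
    """Count the 7-bit units needed for `text`.

    Assumption: `text` contains only GSM-7 compatible characters.
    """
    return sum(2 if ch in GSM7_EXTENSION else 1 for ch in text)


print(septet_length("hello"))    # 5
print(septet_length("{hello}"))  # 9: seven characters, two of them escaped
```

A message that looks like 160 characters on screen can therefore exceed the 160-unit limit if it contains braces, brackets, or the euro sign.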

The 160-character limit follows directly from the payload size: 140 octets × 8 bits = 1,120 bits, and 1,120 / 7 = 160.

How We Encode Your Messages

When sending SMS messages, we automatically use the most compact encoding possible. If your message body includes any non-GSM-7 characters, we fall back to UCS-2 encoding, which limits a single message to 70 characters. When a body is too long for a single message, it is split into segments, and each segment is prefixed with a 6-byte User Data Header (which tells the receiving device how to reassemble the message). This leaves 153 GSM-7 characters or 67 UCS-2 characters per segment for your text.

Note that this may cause more messages to be sent than you expect. A body with 152 GSM-7-compatible characters and a single non-GSM-7 character must be encoded in UCS-2, and at 67 characters per segment its 153 characters are split into 3 messages. This will incur charges for 3 outgoing messages against your account.
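The limits and the 3-message example above can be reproduced with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the length is already expressed in code units (7-bit units for GSM-7, 16-bit units for UCS-2).

```python
import math

SEGMENT_OCTETS = 140  # payload of one SMS message, in 8-bit octets
UDH_OCTETS = 6        # User Data Header carried by each part of a split message


def limits(bits_per_unit):
    """Return (single-message limit, per-segment limit) in code units."""
    single = SEGMENT_OCTETS * 8 // bits_per_unit
    multi = (SEGMENT_OCTETS - UDH_OCTETS) * 8 // bits_per_unit
    return single, multi


def segments_needed(units, bits_per_unit):
    """Number of messages needed for a body of `units` code units."""
    single, multi = limits(bits_per_unit)
    if units <= single:
        return 1
    return math.ceil(units / multi)


print(limits(7))                 # (160, 153) for GSM-7
print(limits(16))                # (70, 67) for UCS-2
print(segments_needed(153, 7))   # 1
print(segments_needed(153, 16))  # 3
```

The same 153-character body fits in one message as GSM-7 but needs three as UCS-2, which is why a single stray character can triple the cost.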

How Do I Check if My Message Can Be Encoded in GSM-7?

This page contains an interactive tool which can check if encoding your message in GSM-7 is possible or if UCS-2 is needed.
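If you would rather run a similar check in your own code, a minimal sketch might look like the following. It is not the tool's implementation: the character tables are transcribed from the GSM 03.38 default alphabet and extension table, and the names `non_gsm7_chars` and `pick_encoding` are hypothetical.

```python
# GSM 03.38 default alphabet (the escape code itself is omitted).
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension table: characters that need an escape code.
GSM7_EXTENSION = "\f^{}\\[~]|€"
GSM7_ALL = set(GSM7_BASIC + GSM7_EXTENSION)


def non_gsm7_chars(text):
    """Return the characters outside GSM-7, in order of first appearance."""
    found = []
    for ch in text:
        if ch not in GSM7_ALL and ch not in found:
            found.append(ch)
    return found


def pick_encoding(text):
    """Return "GSM-7" if every character fits, otherwise "UCS-2"."""
    return "UCS-2" if non_gsm7_chars(text) else "GSM-7"


print(pick_encoding("Hello, world!"))   # GSM-7
print(pick_encoding("It\u2019s here"))  # UCS-2 (curly apostrophe)
```

Listing the offending characters, rather than returning only a yes or no, makes it easier to find what forced the fallback.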

How Can I Avoid My Messages Being Split When I Expect Them to be in GSM-7?

Unfortunately, GSM-7 is not a supported character encoding in many text editors. Even setting the encoding to ASCII (or US_ASCII, or UTF-8) will not guarantee that the text you write is limited to GSM-7. You can use the above-linked tool to quickly check how many segments (that is, total messages) a given text will be divided into.

If you are writing in an editor with Unicode support, you’ll need to be particularly careful. Text editors designed for writing may automatically insert curly smart quotes, non-standard spaces, or other punctuation that looks like a GSM-7 character but is actually a different Unicode character.
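One way to guard against this is to replace common look-alike characters before sending. The sketch below is only an illustration: the mapping is a small hypothetical sample, not a complete list, and `normalize_for_gsm7` is a made-up name.

```python
# Common look-alikes inserted by word processors, mapped to GSM-7 equivalents.
LOOKALIKES = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / curly apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00a0": " ",    # no-break space
    "\u200b": "",     # zero-width space (removed)
}
_TABLE = str.maketrans(LOOKALIKES)


def normalize_for_gsm7(text):
    """Replace known look-alike characters; leave everything else untouched."""
    return text.translate(_TABLE)


print(normalize_for_gsm7("\u201cHi\u201d \u2013 it\u2019s\u00a0ok\u2026"))
# "Hi" - it's ok...
```

Note that replacing an ellipsis with three periods makes the text longer, so it is worth rechecking the segment count after normalizing.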