JavaScript Encode URL, Escape String

By Xah Lee.

Browsers provide the following functions to encode URLs:

  1. encodeURI and decodeURI → Useful to encode non-ASCII chars in a URL. Changes some chars to the percent-encoded form of the char's UTF-8 byte sequence.
  2. encodeURIComponent and decodeURIComponent → Useful to embed a URL in a URL. Like encodeURI but changes more chars.
  3. escape and unescape → Deprecated. Changes some chars to a percent-encoded form of the char's UTF-16 code unit (for most chars, the same number as its Unicode code point).

Note: these are standard global functions. In a browser, they are also available as properties of the window object.
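As a quick comparison, here is a minimal sketch that runs the two non-deprecated encoders on the same input and then decodes the results. The sample string and the variable names are made up for illustration.

```javascript
// Sample input, made up for illustration: a space, an ampersand,
// an equals sign, and a non-ASCII char (é, U+00E9).
const sample = "a b&c=é";

// encodeURI leaves URL delimiters such as & and = alone.
const viaEncodeURI = encodeURI(sample);
console.log(viaEncodeURI); // a%20b&c=%C3%A9

// encodeURIComponent also encodes those delimiters.
const viaComponent = encodeURIComponent(sample);
console.log(viaComponent); // a%20b%26c%3D%C3%A9

// Each decoder reverses its matching encoder.
console.log(decodeURI(viaEncodeURI) === sample); // true
console.log(decodeURIComponent(viaComponent) === sample); // true
```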

encodeURI

encodeURI(str) → returns a new string that is the percent-encoded form of str.

// encode a url that contains EN DASH Unicode 8211 (U+2013).
console.log(
    encodeURI("http://en.wikipedia.org/wiki/Sylvester–Gallai_theorem")
);
// prints http://en.wikipedia.org/wiki/Sylvester%E2%80%93Gallai_theorem

The result can be used in an HTML link. For example:

<a href="http://en.wikipedia.org/wiki/Sylvester%E2%80%93Gallai_theorem">Sylvester–Gallai theorem</a>

The %E2%80%93 stands for the hexadecimal bytes E2 80 93, which are the UTF-8 byte sequence of the en dash. This is called percent encoding. The URI syntax requires it for all non-ASCII chars in a URI (though browsers usually handle an unencoded URL fine).
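To see where those bytes come from, the following sketch rebuilds the percent encoding of the en dash by hand. It assumes an environment that has the standard TextEncoder (current browsers and Node.js); the variable names are only for illustration.

```javascript
// TextEncoder always produces UTF-8 bytes.
const bytes = new TextEncoder().encode("–"); // EN DASH, U+2013

// Format each byte as % followed by two uppercase hexadecimal digits.
const percentEncoded = Array.from(bytes, (b) =>
    "%" + b.toString(16).toUpperCase().padStart(2, "0")
).join("");

console.log(percentEncoded); // %E2%80%93
console.log(percentEncoded === encodeURI("–")); // true
```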

What characters are changed by encodeURI?

Printable ASCII chars that are changed are: the space char and {" % < > [ ] \ ^ ` { | }}

All non-ASCII Unicode chars are also changed, and so are the ASCII control chars.

Each changed char becomes a sequence of %dd groups, where dd is 2 hexadecimal digits. The groups are the bytes of the char's UTF-8 encoding.

For example, the en dash (U+2013: EN DASH) is changed to %E2%80%93.
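The set of changed chars can be checked directly. This sketch loops over the printable ASCII range (codes 32 to 126) and collects every char that encodeURI changes. Note that the loop also reports the space char. The array name is made up for illustration.

```javascript
// Collect every printable ASCII char that encodeURI changes.
const changedByEncodeURI = [];
for (let code = 32; code <= 126; code++) {
    const ch = String.fromCharCode(code);
    if (encodeURI(ch) !== ch) {
        changedByEncodeURI.push(ch);
    }
}

console.log(changedByEncodeURI.length); // 13
console.log(changedByEncodeURI);
// the 13 chars are: space " % < > [ \ ] ^ ` { | }
```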

The following chars are NOT changed by encodeURI:

// unchanged chars of encodeURI

var xx = "-_.!~*'()";
console.log(encodeURI(xx) === xx); // true

var xx = ";,/?:@&=+$#";
console.log(encodeURI(xx) === xx); // true

var xx = "0123456789";
console.log(encodeURI(xx) === xx); // true

var xx = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
console.log(encodeURI(xx) === xx); // true

var xx = "abcdefghijklmnopqrstuvwxyz";
console.log(encodeURI(xx) === xx); // true

encodeURIComponent

encodeURIComponent(str) is like encodeURI(str), but it also encodes the following printable ASCII chars: {; , / ? : @ & = + $ #}.

console.log(
    encodeURIComponent("http://en.wikipedia.org/wiki/Sylvester–Gallai_theorem")
);
// prints http%3A%2F%2Fen.wikipedia.org%2Fwiki%2FSylvester%E2%80%93Gallai_theorem

encodeURIComponent is useful if you want to embed a URI as a parameter inside a URI. For example:

http://example.com/url?url=http%3A%2F%2Fen.wikipedia.org%2Fwiki%2FSylvester%E2%80%93Gallai_theorem
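The sketch below builds that URL and then recovers the embedded one on the receiving side. It assumes the standard URL class is available (current browsers and Node.js); the variable names are made up for illustration.

```javascript
// Inner URL to embed, and the outer URL that carries it as a parameter.
const inner = "http://en.wikipedia.org/wiki/Sylvester–Gallai_theorem";
const outer = "http://example.com/url?url=" + encodeURIComponent(inner);
console.log(outer);
// prints http://example.com/url?url=http%3A%2F%2Fen.wikipedia.org%2Fwiki%2FSylvester%E2%80%93Gallai_theorem

// URL.searchParams decodes the percent encoding when reading a parameter.
const recovered = new URL(outer).searchParams.get("url");
console.log(recovered === inner); // true

// decodeURIComponent on the raw query value gives the same result.
const raw = outer.split("?url=")[1];
console.log(decodeURIComponent(raw) === inner); // true
```

If encodeURI were used here instead, the chars {: / ? & =} of the inner URL would stay as they are, and a receiver could not tell them apart from the delimiters of the outer URL.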

// unchanged chars of encodeURIComponent

var yy = "-_.!~*'()";
console.log(encodeURIComponent(yy) === yy); // true

// unchanged chars
var yy = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
console.log(encodeURIComponent(yy) === yy); // true

// ASCII chars that are changed
console.log(encodeURIComponent(
  " ;,/?:@&=+$#"
));
// %20%3B%2C%2F%3F%3A%40%26%3D%2B%24%23

The “escape” Function

escape(str) → returns a new string from str. The chars that are not changed are: ASCII letters, digits, and {@ * _ + - . /}. All other chars are replaced by the form %dd or %udddd, where dd is 2 hexadecimal digits and dddd is 4 hexadecimal digits. The digits are the char's UTF-16 code unit value: %dd is used for values below 256, and %udddd for the rest. A char outside the Basic Multilingual Plane becomes two %udddd groups, one for each surrogate.

escape(…) is deprecated.

Use the unescape(…) function to decode a string encoded with escape(…).
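As a small sketch of how the two functions relate, the code below rebuilds the escape(…) form of the en dash from its char code and checks the unescape(…) round trip. It assumes an environment that still ships the deprecated functions (browsers and Node.js do); the variable names are made up for illustration.

```javascript
// escape() writes the char's UTF-16 code unit in uppercase hexadecimal.
const dash = "–"; // EN DASH, U+2013
const byHand = "%u" + dash.charCodeAt(0).toString(16).toUpperCase().padStart(4, "0");
console.log(byHand); // %u2013
console.log(byHand === escape(dash)); // true

// unescape() reverses escape().
const text = "αβγ♥ a/b";
console.log(unescape(escape(text)) === text); // true

// For comparison, encodeURIComponent uses the UTF-8 bytes instead.
console.log(encodeURIComponent(dash)); // %E2%80%93
```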

console.log(
    escape("http://en.wikipedia.org/wiki/Sylvester–Gallai_theorem")
);
// prints http%3A//en.wikipedia.org/wiki/Sylvester%u2013Gallai_theorem

// escape() examples

// unchanged
var xx = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
console.log(escape(xx) === xx); // true

// double quote and backslash
console.log(escape(
"\"\\"
));
// %22%5C

console.log(escape(
"!#$%&'()*+,-./:;<=>?@[]^_`{|}~"
));
// %21%23%24%25%26%27%28%29*+%2C-./%3A%3B%3C%3D%3E%3F@%5B%5D%5E_%60%7B%7C%7D%7E

// space, tab, return
console.log(escape(
" \n\t"
));
// %20%0A%09

// unicode
console.log(escape(
"αβγ♥"
));
// %u03B1%u03B2%u03B3%u2665

// print out all ASCII chars that escape() encodes

for (var i = 0; i < 128; i++) {
    var char = String.fromCharCode(i);
    if (escape(char) !== char) {
        console.log(i + " " + char);
    }
}

// 0 through 31 (control chars, including 9 tab, 10 line feed, 13 carriage return)
// 32 (space)
// 33 !
// 34 "
// 35 #
// 36 $
// 37 %
// 38 &
// 39 '
// 40 (
// 41 )
// 44 ,
// 58 :
// 59 ;
// 60 <
// 61 =
// 62 >
// 63 ?
// 91 [
// 92 \
// 93 ]
// 94 ^
// 96 `
// 123 {
// 124 |
// 125 }
// 126 ~
// 127 (DEL control char)