Golang: String

By Xah Lee.

Interpreted String Literal

String syntax is like this:

"abc"

package main

import "fmt"

func main() {

    var x = "abc and ♥"

    fmt.Println(x)
    // abc and ♥
}

A string can contain any Unicode character, e.g. ♥ (U+2665: BLACK HEART SUIT)

Any character can appear between the double quotes, except the double quote itself, a literal newline, or a lone backslash (backslash starts an escape sequence).

A literal newline is not allowed. The following is a syntax error:

var x = "can't
do this"

To include a quote character, use \", e.g. "the \"thing\""

To include newline, use \n.
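
The two escapes above can be tried together in a small program. This is a minimal sketch; the helper name quoted is made up for illustration and is not part of any library.

```go
package main

import "fmt"

// quoted is a hypothetical helper for this example.
// It wraps s in double quote characters, written with the \" escape.
func quoted(s string) string {
	return "\"" + s + "\""
}

func main() {
	fmt.Println("the " + quoted("thing")) // the "thing"

	// \n inside the literal becomes a real newline in the output
	fmt.Println("line 1\nline 2")
}
```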

Backslash Escapes

Within a double-quoted string, a character sequence starting with a backslash may have special meaning, e.g. \n means newline.

package main

import "fmt"

func main() {
    var x = "a\nb"
    fmt.Println(x)
}

// prints
// a
// b

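Here's the complete list:

\a → U+0007 alert or bell
\b → U+0008 backspace
\t → U+0009 horizontal tab
\n → U+000A line feed or newline
\v → U+000B vertical tab
\f → U+000C form feed
\r → U+000D carriage return
\\ → U+005C backslash
\" → U+0022 double quote
\ooo → a byte, given as 3 octal digits
\xhh → a byte, given as 2 hexadecimal digits
\uhhhh → a Unicode character, given as 4 hexadecimal digits
\Uhhhhhhhh → a Unicode character, given as 8 hexadecimal digits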

[see ASCII Character Symbols ␀ ␣ ¶]

package main

import "fmt"

func main() {
    fmt.Printf("%v\n", "A" == "\x41")       // true
    fmt.Printf("%v\n", "♥" == "\u2665")     // true
    fmt.Printf("%v\n", "😂" == "\U0001f602") // true
}

Raw String Literal

If you don't want backslash to have special meaning, use ` (U+60: GRAVE ACCENT) to quote the string.

var x = `long text`

Anything can appear inside, including literal newlines, except the grave accent character itself.

Any carriage return character (Unicode codepoint 13) inside it is discarded. If you run the command line tool gofmt, it will remove the carriage returns.

package main

import "fmt"

var x = `Alice was beginning to get very tired of sitting by her
sister on the bank, and of having nothing to do: once or twice she had
peeped into the book her sister was reading, but it had no pictures or
conversations in it, «and what is the use of a book,» thought Alice «without
pictures or conversation?».`

func main() {

    fmt.Printf("%v\n", x)
}
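
To see the difference between the two kinds of quoting, compare the same characters written both ways. This is a small sketch; the constant names interpreted and raw are chosen only for this example.

```go
package main

import "fmt"

// Same source characters, two kinds of quoting.
// The names are made up for this example.
const interpreted = "a\nb" // 3 bytes: a, newline, b
const raw = `a\nb`         // 4 bytes: a, backslash, n, b

func main() {
	fmt.Println(len(interpreted)) // 3
	fmt.Println(len(raw))         // 4

	// the raw string equals an interpreted string with the backslash escaped
	fmt.Println(raw == "a\\nb") // true
}
```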

String is a Sequence of Bytes

A Go string is a sequence of bytes, not characters.

Normally, we use a string as a sequence of characters. A Go string can contain any Unicode character.

In Go, each character is stored as 1 to 4 bytes, using the UTF-8 encoding.

If you are not familiar with Unicode and encoding, first read Unicode Basics: What's Character Set, Character Encoding, UTF-8?

You can use a string to store bytes, any bytes.

A string can contain byte sequences that are not a valid encoding of any Unicode character.

You can create a string of any bytes by using the hexadecimal escape \xhh

For example, the character A has codepoint 65 in decimal, which is 41 in hexadecimal. So, "A" and "\x41" create the same string. With the same escape you can also create byte sequences that are not a valid encoding of any Unicode character.

package main

import "fmt"

func main() {

    fmt.Printf("%v\n", "A" == "\x41") // true

}
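
One way to check whether a string holds valid UTF-8 is the standard library function utf8.ValidString. The sketch below is only an illustration; the helper name describe is made up for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe is a hypothetical helper for this example.
// It shows the bytes of s in hexadecimal and whether they are valid UTF-8.
func describe(s string) string {
	return fmt.Sprintf("% x valid=%v", s, utf8.ValidString(s))
}

func main() {
	// the 3 bytes of ♥ in UTF-8, written with \xhh escapes
	fmt.Println("\xe2\x99\xa5" == "♥")    // true
	fmt.Println(describe("\xe2\x99\xa5")) // e2 99 a5 valid=true

	// these 2 bytes are not the encoding of any Unicode character
	fmt.Println(describe("\xff\xfe")) // ff fe valid=false
}
```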

String Index

s[n] → returns the nth byte of string s. The return value's type is uint8. (byte is an alias of uint8)

[see Golang: Basic Types]

Index starts at 0.

package main

import "fmt"

func main() {

	var x = "abc"

	// byte at index 2
	fmt.Printf("%#v\n", x[2]) // 0x63

	// byte at index 2, print as char
	fmt.Printf("%q\n", x[2]) // 'c'

	// result is type uint8 (aka byte)
	fmt.Printf("%T\n", x[2]) // uint8

	var y = "👍"
	// utf8 encoding for thumb up char emoji 👍 is 4 bytes: #xF0 #x9F #x91 #x8D

	// byte at index 0
	fmt.Printf("%#v\n", "👍"[0]) // 0xf0

	// type
	fmt.Printf("%T\n", y[0]) // uint8

}
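
Since the index runs from 0 to len(s)-1, you can visit every byte of a string with a plain loop. This is a minimal sketch; the helper name hexBytes is made up for this example.

```go
package main

import "fmt"

// hexBytes is a hypothetical helper for this example.
// It returns each byte of s formatted as hexadecimal, in index order.
func hexBytes(s string) []string {
	out := make([]string, 0, len(s))
	for i := 0; i < len(s); i++ { // i goes from 0 to len(s)-1
		out = append(out, fmt.Sprintf("%#x", s[i]))
	}
	return out
}

func main() {
	fmt.Println(hexBytes("abc")) // [0x61 0x62 0x63]
	fmt.Println(hexBytes("👍"))   // [0xf0 0x9f 0x91 0x8d]
}
```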

SubString

s[n:m] → returns the substring of s from byte index n up to, but not including, byte index m. The return value's type is string.

package main

import "fmt"

func main() {

	var x = "012345"

	// substring
	fmt.Printf("%#v\n", x[2:3]) // "2"

	fmt.Printf("%#v\n", x[2:4]) // "23"

	fmt.Printf("%#v\n", x[2:2]) // ""

}

Remember, a string is a sequence of bytes. So, if the string contains non-ASCII Unicode characters, an arbitrary index range may cut through the middle of a character and produce a string that is not a sequence of whole characters.

package main

import "fmt"

func main() {

	const x = "♥♥♥"

	fmt.Printf("%#v\n", x[2:4]) // "\xa5\xe2"

}
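
If the index range falls on character boundaries, the result is whole characters again. The sketch below checks a slice with utf8.ValidString from the standard library; the helper name validSlice is made up for this example, and it assumes the whole string is valid UTF-8 to begin with.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

const x = "♥♥♥" // each ♥ is 3 bytes, so 9 bytes in total

// validSlice is a hypothetical helper for this example.
// It reports whether s[n:m] is valid UTF-8, that is, whether the cut
// did not go through the middle of a character (assuming s is valid).
func validSlice(s string, n, m int) bool {
	return utf8.ValidString(s[n:m])
}

func main() {
	fmt.Printf("%#v\n", x[0:3]) // "♥"
	fmt.Printf("%#v\n", x[3:9]) // "♥♥"

	fmt.Println(validSlice(x, 0, 3)) // true
	fmt.Println(validSlice(x, 2, 4)) // false
}
```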

Length (Number of Bytes)

len(string) → returns the number of bytes in string.

package main

import "fmt"

func main() {
    fmt.Printf("%v\n", len("abc")) // 3
    fmt.Printf("%v\n", len("ab♥")) // 5
}
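
If you want the number of characters instead of bytes, one option is utf8.RuneCountInString from the standard library. A small sketch comparing the two counts follows; the helper name counts is made up for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// counts is a hypothetical helper for this example.
// It returns the number of bytes and the number of characters (runes) in s.
func counts(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	fmt.Println(counts("abc")) // 3 3
	fmt.Println(counts("ab♥")) // 5 3
}
```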

Join String

Use + to join strings, e.g.

"abc" + "def"

package main

import "fmt"

func main() {
    fmt.Printf("%v\n", "a"+"b") // ab
}
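
When there are many pieces to join, + can be used in a loop, or the standard library function strings.Join can do it in one call. This is a sketch only; the helper name joinPlus is made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// joinPlus is a hypothetical helper for this example.
// It joins all parts by repeated use of +.
func joinPlus(parts []string) string {
	var out = ""
	for _, p := range parts {
		out = out + p
	}
	return out
}

func main() {
	var parts = []string{"ab", "cd", "ef"}

	fmt.Println(joinPlus(parts))           // abcdef
	fmt.Println(strings.Join(parts, ""))   // abcdef
	fmt.Println(strings.Join(parts, ", ")) // ab, cd, ef
}
```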

Print String: Bytes vs Characters

Because a string is a byte sequence, sometimes you want to print it in hexadecimal to see the bytes. Other times you want to print it as characters.

The fmt.Printf function has several verbs to help.

[see ASCII Table]

package main

import "fmt"

func main() {
    var x = "♥\t😂" // with a tab (U+0009) in middle

    fmt.Printf("%s\n", x)  // ♥ 😂
    fmt.Printf("%q\n", x)  // "♥\t😂"
    fmt.Printf("%+q\n", x) // "\u2665\t\U0001f602"
    fmt.Printf("% x\n", x) // e2 99 a5 09 f0 9f 98 82

    // turn the string into rune slice, then print it with %U
    fmt.Printf("%U\n", []rune(x)) // [U+2665 U+0009 U+1F602]
}

Rune

Before you can work with a string as a character sequence (instead of a byte sequence), you need to understand rune.

Golang: Rune

Working with String as Character Sequence

Golang: String as Chars

Reference

The Go Programming Language Specification, section "String literals"
