Golang: String

By Xah Lee.

Source File Encoding

Go source files must be encoded in UTF-8.

Set up your text editor to save files as UTF-8. Look for the setting in a menu or in the preferences.

[see Unicode Basics: What's Character Set, Character Encoding, UTF-8?]

Interpreted String Literal

An interpreted string literal is written between double quotes, like this:

"abc"

package main

import "fmt"

func main() {

    var x = "abc and ♥"

    fmt.Println(x)
    // abc and ♥
}

A string can contain any Unicode character, e.g. ♥ (U+2665: BLACK HEART SUIT)

Any character can appear between the double quotes, except the double quote itself and the newline character. A backslash starts an escape sequence.

A literal newline is not allowed. The following is a syntax error:

var x = "can't
do this"

To include a double quote character, use \", e.g. "the \"thing\""

To include a newline, use \n.
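
Here is a small sketch that puts both escapes together. The helper name quoted is made up for this example, it is not from any package.

```go
package main

import "fmt"

// quoted is a hypothetical helper: it wraps s in double quote characters.
func quoted(s string) string {
    return "\"" + s + "\""
}

func main() {
    fmt.Println("the \"thing\"") // the "thing"
    fmt.Println(quoted("abc"))   // "abc"

    // the \n escape is one character in the string, not two
    fmt.Println(len("a\nb")) // 3
}
```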

Backslash Escapes

Within a double quoted string, a character sequence starting with a backslash has special meaning, e.g. \n means newline.

package main

import "fmt"

func main() {
    var x = "a\nb"
    fmt.Println(x)
}

// prints
// a
// b

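Here's the complete list of escapes allowed in a double quoted string:

\a → alert or bell (U+0007)
\b → backspace (U+0008)
\f → form feed (U+000C)
\n → line feed or newline (U+000A)
\r → carriage return (U+000D)
\t → horizontal tab (U+0009)
\v → vertical tab (U+000B)
\\ → backslash (U+005C)
\" → double quote (U+0022)
\ooo → a byte, given by exactly 3 octal digits
\xhh → a byte, given by exactly 2 hexadecimal digits
\uhhhh → a Unicode character, given by exactly 4 hexadecimal digits
\Uhhhhhhhh → a Unicode character, given by exactly 8 hexadecimal digits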

[see ASCII Character Symbols ␀ ␣ ¶]

package main

import "fmt"

func main() {
    fmt.Printf("%v\n", "A" == "\x41")       // true
    fmt.Printf("%v\n", "♥" == "\u2665")     // true
    fmt.Printf("%v\n", "😂" == "\U0001f602") // true
}

Raw String Literal

If you don't want the backslash to have special meaning, quote the string with ` (U+60: GRAVE ACCENT).

var x = `long text`

Anything can appear inside, including literal newlines, except the grave accent character itself.

Carriage return characters (Unicode codepoint 13) inside a raw string literal are discarded from the string value. If you run the command line tool gofmt, it will remove carriage returns.

package main

import "fmt"

var x = `Alice was beginning to get very tired of sitting by her
sister on the bank, and of having nothing to do: once or twice she had
peeped into the book her sister was reading, but it had no pictures or
conversations in it, «and what is the use of a book,» thought Alice «without
pictures or conversation?».`

func main() {

    fmt.Printf("%v\n", x)
}
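
To see the difference between the two kinds of quoting, here is a short sketch that puts the same characters in both. The variable names are made up for illustration. The last variable shows one possible way to get a grave accent into the text, by joining a raw string with a double quoted one.

```go
package main

import "fmt"

// same characters between the quotes, different quoting
const interpreted = "a\nb" // \n becomes one newline character
const raw = `a\nb`         // backslash and n stay as two characters

// a grave accent cannot appear inside a raw string, so join pieces with +
var withGrave = `use ` + "`" + ` to quote`

func main() {
    fmt.Println(len(interpreted)) // 3
    fmt.Println(len(raw))         // 4
    fmt.Println(raw == "a\\nb")   // true
    fmt.Println(withGrave)        // use ` to quote
}
```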

Embed Expression in String?

Go has no syntax for embedding an expression in a string. (as in Ruby or JavaScript)

The closest is fmt.Sprintf:

fmt.Sprintf("Name: %v\nAge: %v", "John", 10)

package main

import "fmt"

func main() {

    var name = "John"
    var age = 30

    var x = fmt.Sprintf("Name: %v, Age: %v", name, age)

    fmt.Println(x) // Name: John, Age: 30
}
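
The %v verb works for any value. As a sketch, here is the same idea with the more specific verbs %s (string), %d (decimal integer), and %q (quoted string), next to plain joining with + and strconv.Itoa. The helper name describe is made up for this example.

```go
package main

import "fmt"
import "strconv"

// describe is a hypothetical helper: it builds the same text two ways.
func describe(name string, age int) (string, string) {
    // format verbs
    a := fmt.Sprintf("%s is %d years old", name, age)
    // joining, the int must be converted to string first
    b := name + " is " + strconv.Itoa(age) + " years old"
    return a, b
}

func main() {
    a, b := describe("John", 30)
    fmt.Println(a)      // John is 30 years old
    fmt.Println(a == b) // true

    fmt.Printf("%q\n", "John") // "John"
}
```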

Length (Number of Bytes)

len(string) → returns the number of bytes in string.

package main

import "fmt"

func main() {
    fmt.Printf("%v\n", len("abc")) // 3
    fmt.Printf("%v\n", len("ab♥")) // 5
}

Number of Characters

From package unicode/utf8:

utf8.RuneCountInString(string) → returns the number of characters in string. (Character here means Unicode codepoint, aka rune.)

package main

import "fmt"
import "unicode/utf8"

func main() {
    var x = "I ♥ U"

    // number of bytes
    fmt.Printf("%v\n", len(x)) // 7

    // number of characters
    fmt.Printf("%v\n", utf8.RuneCountInString(x)) // 5
}
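
Because a character here means a codepoint, what looks like one character on screen can count as more than one. A minimal sketch, using é written two ways. The helper name counts is made up for this example.

```go
package main

import "fmt"
import "unicode/utf8"

// counts is a hypothetical helper: it returns the number of bytes
// and the number of codepoints in s.
func counts(s string) (int, int) {
    return len(s), utf8.RuneCountInString(s)
}

func main() {
    var a = "\u00e9"  // é as one codepoint, U+00E9
    var b = "e\u0301" // e followed by U+0301 COMBINING ACUTE ACCENT

    fmt.Println(counts(a)) // 2 1
    fmt.Println(counts(b)) // 3 2
    fmt.Println(a == b)    // false

    // converting to a rune slice gives the same count
    fmt.Println(len([]rune(b))) // 2
}
```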

Join String

Use + to join strings, e.g.

"abc" + "def"

package main

import "fmt"

func main() {
    fmt.Printf("%v\n", "a"+"b") // ab
}
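
When joining many pieces in a loop, strings.Builder from package strings is a common alternative to repeated +. The following is only a sketch, and the helper name joinRepeated is made up for this example.

```go
package main

import "fmt"
import "strings"

// joinRepeated is a hypothetical helper: it writes word n times,
// with sep between each.
func joinRepeated(word string, sep string, n int) string {
    var sb strings.Builder
    for i := 0; i < n; i++ {
        if i > 0 {
            sb.WriteString(sep)
        }
        sb.WriteString(word)
    }
    return sb.String()
}

func main() {
    var x = "abc"
    x += "def" // += joins and assigns
    fmt.Println(x) // abcdef

    fmt.Println(joinRepeated("go", "-", 3)) // go-go-go
}
```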

String Functions

String functions are in package “strings”.

See https://golang.org/pkg/strings/

Here are some examples.

package main

import "fmt"
import "strings"

var pl = fmt.Println

func main() {

    pl("ab" == "ab") // true

    pl(strings.Contains("abcd", "bc"))  // true
    pl(strings.HasPrefix("abca", "ab")) // true
    pl(strings.HasSuffix("abca", "ca")) // true

    pl(strings.ToLower("ABC") == "abc") // true

    pl(strings.Trim(" abc ", " ") == "abc") // true

    pl(strings.Count("abcaab", "ab") == 2) // true

    pl(strings.Index("abc", "bc") == 1) // true

    pl(strings.Join([]string{"a", "and", "b"}, " ") == "a and b") // true

    // split into slice
    pl(strings.Split("a b c", " ")) // [a b c]
}
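
A few more functions from the same package, as a sketch. The helper name normalize is made up for this example. It shows how the functions can be combined.

```go
package main

import "fmt"
import "strings"

// normalize is a hypothetical helper: lowercase the text, drop whitespace
// at both ends, and turn each run of whitespace into one space.
func normalize(s string) string {
    return strings.Join(strings.Fields(strings.ToLower(s)), " ")
}

func main() {
    fmt.Println(normalize("  Hello   Go\tWorld \n")) // hello go world

    fmt.Println(strings.TrimSpace("\t abc \n") == "abc") // true
    fmt.Println(strings.ToUpper("abc"))                  // ABC
    fmt.Println(strings.Repeat("ab", 3))                 // ababab
    fmt.Println(strings.LastIndex("abcab", "ab"))        // 3

    // Index returns -1 when not found
    fmt.Println(strings.Index("abc", "x")) // -1
}
```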

Find Replace

Use the regexp package.

See: Golang: regexp
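
For replacing plain text with no pattern, package strings also has Replace and ReplaceAll (ReplaceAll needs Go 1.12 or later). Here is a minimal sketch of both, next to a regexp replacement. The names digits and maskDigits are made up for this example.

```go
package main

import "fmt"
import "regexp"
import "strings"

// digits matches one or more ASCII digits
var digits = regexp.MustCompile(`[0-9]+`)

// maskDigits is a hypothetical helper: replace each run of digits with #
func maskDigits(s string) string {
    return digits.ReplaceAllString(s, "#")
}

func main() {
    // plain text, replace every occurrence
    fmt.Println(strings.ReplaceAll("a-b-c", "-", "+")) // a+b+c

    // plain text, replace the first occurrence only
    fmt.Println(strings.Replace("a-b-c", "-", "+", 1)) // a+b-c

    // regex pattern
    fmt.Println(maskDigits("room 12, floor 3")) // room #, floor #
}
```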

String is a Sequence of Bytes

A Go string is a sequence of bytes.

Normally, you use a string as a sequence of characters, any Unicode characters.

Each character is encoded as 1 to 4 bytes by UTF-8.

You can also use a string to store bytes, any bytes.

A string can contain byte sequences that are not a valid encoding of any Unicode character.

You can create a string of any bytes by using the hexadecimal escape \xhh

For example, the character A has codepoint 65 in decimal, and 41 in hexadecimal. So, "A" and "\x41" create the same string. But with \xhh you can also create byte sequences that are not a valid encoding of any Unicode character, e.g. "\xff".

package main

import "fmt"

func main() {

    fmt.Printf("%v\n", "A" == "\x41") // true

}
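
Here is a sketch of a string that is not valid UTF-8, and of getting at the bytes of a string. utf8.ValidString is from package unicode/utf8. The names good, bad, and isValid are made up for this example.

```go
package main

import "fmt"
import "unicode/utf8"

var good = "♥"       // 3 bytes: e2 99 a5
var bad = "\xe2\x99" // the first 2 bytes only, not a complete character

// isValid is a hypothetical helper: it reports whether s is valid UTF-8
func isValid(s string) bool {
    return utf8.ValidString(s)
}

func main() {
    fmt.Println(isValid(good)) // true
    fmt.Println(isValid(bad))  // false
    fmt.Println(len(bad))      // 2

    // indexing a string gives a byte, not a character
    fmt.Println(good[0]) // 226

    // convert to byte slice to get all the bytes
    fmt.Println([]byte(good)) // [226 153 165]

    fmt.Println(good == "\xe2\x99\xa5") // true
}
```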

Loop Thru Characters in String

for i, c := range string {…} → go thru the characters in string. i is the index of the character's first byte, c is the character.

package main

import "fmt"

func main() {
    const x = "abc♥ 😂d"
    for i, c := range x {
        fmt.Printf("%v %q\n", i, c)
    }
}

// 0 'a'
// 1 'b'
// 2 'c'
// 3 '♥'
// 6 ' '
// 7 '😂'
// 11 'd'

If you don't need the index, do:

for _, c := range string {…}

package main

import "fmt"

func main() {
    const x = "♥ 😂"
    for _, c := range x {
        fmt.Printf("%q, %U\n", c, c)
    }
}

// '♥', U+2665
// ' ', U+0020
// '😂', U+1F602

Note: when you loop thru a string by range, each character is given as a value of type “rune”, which is golang's term for Unicode codepoint. That is, an integer id for the character. The type rune is an alias for int32, which is why %T prints int32.

package main

import "fmt"

func main() {
    const x = "♥ 😂"
    for _, c := range x {
        // print the char and its type
        fmt.Printf("%q, %T\n", c, c)
    }
}

// '♥', int32
// ' ', int32
// '😂', int32

[see Golang: Rune]
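
For comparison, here is a sketch of going thru the bytes with an index loop, and of what range gives when the string has a byte that is not valid UTF-8 (the codepoint U+FFFD, moving forward one byte). The helper name codepoints is made up for this example.

```go
package main

import "fmt"

// codepoints is a hypothetical helper: it collects what range gives for s
func codepoints(s string) []rune {
    var out []rune
    for _, c := range s {
        out = append(out, c)
    }
    return out
}

func main() {
    const x = "a♥"

    // index loop goes thru bytes, x[i] is a byte
    for i := 0; i < len(x); i++ {
        fmt.Printf("%v %x\n", i, x[i])
    }
    // 0 61
    // 1 e2
    // 2 99
    // 3 a5

    // the byte ff is not valid UTF-8, range gives U+FFFD for it
    fmt.Printf("%U\n", codepoints("a\xffb")) // [U+0061 U+FFFD U+0062]
}
```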

Print String: Bytes vs Characters

Because a string is a byte sequence, sometimes you want to print it as hexadecimal to see the bytes. Other times you want to print it as characters.

The fmt.Printf function has several verbs to help.

[see ASCII Character Symbols ␀ ␣ ¶]

package main

import "fmt"

func main() {
    var x = "♥\t😂" // with a tab (U+0009) in middle

    fmt.Printf("%s\n", x)  // ♥ 😂
    fmt.Printf("%q\n", x)  // "♥\t😂"
    fmt.Printf("%+q\n", x) // "\u2665\t\U0001f602"
    fmt.Printf("% x\n", x) // e2 99 a5 09 f0 9f 98 82

    // turn the string into rune slice, then print it with %U
    fmt.Printf("%U\n", []rune(x)) // [U+2665 U+0009 U+1F602]
}
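
A few variations on the same verbs, as a sketch. fmt.Sprintf takes the same verbs and returns the result as a string. The helper name hexBytes is made up for this example.

```go
package main

import "fmt"

// hexBytes is a hypothetical helper: the bytes of s as hexadecimal,
// separated by space
func hexBytes(s string) string {
    return fmt.Sprintf("% x", s)
}

func main() {
    var x = "♥"

    fmt.Println(hexBytes(x)) // e2 99 a5
    fmt.Printf("%x\n", x)    // e299a5
    fmt.Printf("% X\n", x)   // E2 99 A5

    // print the bytes in decimal
    fmt.Printf("%d\n", []byte(x)) // [226 153 165]

    // %c prints the character of a codepoint
    fmt.Printf("%c\n", 0x2665) // ♥
}
```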

Rune

Golang: Rune

Reference

The Go Programming Language Specification: String literals
