Elisp: Parse URL 🚀

By Xah Lee.

Here's an Emacs Lisp function that parses a URL.

(defun xah-html-parse-url (HrefVal)
  "Parse URL HrefVal, return a hashtable.
The result has the following keys (string type), shown with sample values:

href → http://www.example.com:49158/a/b/c?x=1&y=2#xx
origin → http://www.example.com:49158
protocol → http:
host → www.example.com:49158
hostname → www.example.com
port → 49158
pathname → /a/b/c
search → ?x=1&y=2
hash → #xx

If a value does not exist, such as the hash string, its value is nil.

(Keys are modeled after the browser URL object.)
All other values are strings.

URL `http://xahlee.info/emacs/emacs/emacs_parse_url.html'
Version: 2023-02-20 2023-02-21 2023-02-23"
  (let (xproto xhost xorigin xhostname xport xpath xsearch xhash xp1 xp2
               (xhtable (make-hash-table :test 'equal :size 10)))
    (puthash "href" HrefVal xhtable)
    ;; work in a temp buffer so the current buffer is not modified
    (with-temp-buffer
      (insert HrefVal)
      (goto-char (point-min))
      (search-forward "://")
      (setq xp1 (point))
      ;; protocol keeps the trailing colon, e.g. "http:"
      (setq xproto (buffer-substring-no-properties (point-min) (- xp1 2)))
      (puthash "protocol" xproto xhtable)
      (if (search-forward "/" nil "move")
          (progn
            ;; a slash follows the host
            (setq xp2 (point))
            (setq xhost (buffer-substring-no-properties xp1 (1- xp2)))
            (puthash "host" xhost xhtable)
            (setq xorigin (buffer-substring-no-properties (point-min) (1- xp2)))
            (puthash "origin" xorigin xhtable)

            ;; group 1 is the path after the first slash,
            ;; group 2 is the query string, group 3 is the fragment
            (re-search-forward "\\([^?#]*\\)?\\(\\?[^#]*\\)?\\(#.+\\)?")
            (setq xpath (concat "/" (match-string 1)))
            (setq xsearch (match-string 2))
            (setq xhash (match-string 3))
            (puthash "pathname" xpath xhtable)
            (puthash "search" xsearch xhtable)
            (puthash "hash" xhash xhtable))
        (progn
          ;; no slash after the host, e.g. "http://www.example.com"
          (setq xhost (buffer-substring-no-properties xp1 (point-max)))
          (puthash "host" xhost xhtable)
          (puthash "origin" (buffer-substring-no-properties (point-min) (point-max)) xhtable)
          (puthash "pathname" "/" xhtable)
          (puthash "search" nil xhtable)
          (puthash "hash" nil xhtable)))
      ;; split host into hostname and port at the first colon
      (let ((xx (string-match-p ":" xhost)))
        (if xx
            (progn
              (setq xhostname (substring xhost 0 xx))
              (setq xport (substring xhost (1+ xx))))
          (progn
            (setq xhostname xhost)
            (setq xport nil))))
      (puthash "hostname" xhostname xhtable)
      (puthash "port" xport xhtable))
    xhtable))
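
The regex step is the densest part of the function. Here is a small sketch of a close variant of that pattern, applied to a plain string with string-match instead of a buffer search. The helper name my-split-path-search-hash is made up for illustration and is not part of the original code. It assumes the input is the part of the URL after the host, starting with the slash. Unlike the original pattern, it is anchored to the whole string and accepts an empty fragment.

```elisp
;; Hypothetical helper, for illustration only.
;; Input: the part of a URL after the host, e.g. "/a/b/c?x=1&y=2#xx".
;; Returns a list (PATHNAME SEARCH HASH).
;; SEARCH and HASH are nil when the URL has no query string or fragment.
(defun my-split-path-search-hash (str)
  "Split STR into path, query string, and fragment."
  (when (string-match "\\`\\([^?#]*\\)\\(\\?[^#]*\\)?\\(#.*\\)?\\'" str)
    (list (match-string 1 str)
          (match-string 2 str)
          (match-string 3 str))))

(my-split-path-search-hash "/a/b/c?x=1&y=2#xx")
;; ("/a/b/c" "?x=1&y=2" "#xx")

(my-split-path-search-hash "/a/b/c#xx")
;; ("/a/b/c" nil "#xx")
```

A group that does not take part in the match makes match-string return nil, which is why a missing query string or fragment comes back as nil and not as an empty string.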

;; ;; test path, query string, frag
;; (xah-html-parse-url "http://www.example.com:49158/a/b/c?x=1&y=2#xx")
;; (xah-html-parse-url "http://www.example.com:49158/a/b/c")
;; (xah-html-parse-url "http://www.example.com:49158/a/b/c#xx")
;; (xah-html-parse-url "http://www.example.com:49158/a/b/c?x=1&y=2")
;; (xah-html-parse-url "http://www.example.com:49158/?x=1&y=2")
;; (xah-html-parse-url "http://www.example.com:49158/")

;; ;; test path
;; (xah-html-parse-url "http://www.example.com")
;; (xah-html-parse-url "http://www.example.com/")
;; (xah-html-parse-url "http://www.example.com/a")
;; (xah-html-parse-url "http://www.example.com/a/")

;; ;; test port
;; (xah-html-parse-url "http://www.example.com:49158/a/b/")
;; (xah-html-parse-url "http://www.example.com:/a/b/")
;; (xah-html-parse-url "http://www.example.com/a/b/")
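
To read the result, look up keys with gethash. The sketch below shows one way to do that. The names my-url-table-to-alist and my-sample-table are hypothetical and not part of the original code. The sample table is built by hand to stand in for the function's return value, so the snippet runs on its own; its values are what the docstring describes for a URL with no fragment.

```elisp
;; Hypothetical helper, not part of the original code.
(defun my-url-table-to-alist (table)
  "Return the entries of hash table TABLE as an alist, in a fixed key order."
  (mapcar (lambda (key) (cons key (gethash key table)))
          '("href" "origin" "protocol" "host" "hostname"
            "port" "pathname" "search" "hash")))

;; Stand-in for the value returned by
;; (xah-html-parse-url "http://www.example.com:49158/a/b/c?x=1&y=2")
(defvar my-sample-table
  (let ((tb (make-hash-table :test 'equal)))
    (dolist (pair '(("href" . "http://www.example.com:49158/a/b/c?x=1&y=2")
                    ("origin" . "http://www.example.com:49158")
                    ("protocol" . "http:")
                    ("host" . "www.example.com:49158")
                    ("hostname" . "www.example.com")
                    ("port" . "49158")
                    ("pathname" . "/a/b/c")
                    ("search" . "?x=1&y=2")
                    ("hash" . nil)))
      (puthash (car pair) (cdr pair) tb))
    tb))

(gethash "port" my-sample-table) ; "49158"
(gethash "hash" my-sample-table) ; nil, the URL has no fragment

;; look at every key at once
(my-url-table-to-alist my-sample-table)
```

With the real function loaded, replace my-sample-table with a call such as (xah-html-parse-url "http://www.example.com:49158/a/b/c?x=1&y=2"). The table is created with :test 'equal because the keys are strings. The default test, eql, compares strings by identity, so a lookup with a fresh string literal would not find the key.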