Emacs: HTML, Make Citation Link

By Xah Lee.

Here's an elisp command that makes an HTML citation link. Example:

Emacs Lisp Basics
http://ergoemacs.org/emacs/elisp_basics.html
By Xah Lee
2020-01-17

becomes:

[<cite>Emacs Lisp Basics</cite> <time>2020-01-17</time> By Xah Lee. At <a href="http://ergoemacs.org/emacs/elisp_basics.html" data-accessed="2020-11-14">http://ergoemacs.org/emacs/elisp_basics.html</a> ]

With proper Cascading Style Sheets (CSS) [see Visual CSS], it is rendered in browsers like this:

[Emacs Lisp Basics By Xah Lee. At http://xahlee.info/emacs/emacs/elisp_basics.html ]

Including the article title, author, and date helps with the link rot problem. When a link is dead, you can still search for the article by title, author, or date.
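The output format shown above can be sketched as a small string-building function. This is only an illustration: the name my-citation-format is hypothetical and not part of any package, and it assumes the data-accessed attribute holds the date the link was visited, in yyyy-mm-dd form, defaulting to today.

```elisp
;; Sketch only: `my-citation-format' is a hypothetical helper for
;; illustration. It is not part of Xah HTML Mode.
(defun my-citation-format (title date author url &optional accessed)
  "Return an HTML citation string built from TITLE, DATE, AUTHOR, URL.
ACCESSED is the access date as yyyy-mm-dd. It defaults to today (an assumption)."
  (let ((accessed (or accessed (format-time-string "%Y-%m-%d"))))
    (format "[<cite>%s</cite> <time>%s</time> By %s. At <a href=\"%s\" data-accessed=\"%s\">%s</a> ]"
            title date author url accessed url)))

;; Example call, using the fields from the example above:
;; (my-citation-format "Emacs Lisp Basics" "2020-01-17" "Xah Lee"
;;                     "http://ergoemacs.org/emacs/elisp_basics.html"
;;                     "2020-11-14")
```

The sketch does no HTML escaping of the title or url, so it is only safe for input that contains no characters such as & or <.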

Solution

(defun xah-html-make-citation ()
  "Reformat current text block or selection into a canonical citation format.
For example, place cursor somewhere in the following block:

Why Utopian Communities Fail
https://areomagazine.com/2018/03/08/why-utopian-communities-fail/
2018-03-08
by Ewan Morrison

becomes

 [<cite>Why Utopian Communities Fail</cite> <time>2018-03-08</time> By Ewan Morrison. At <a href=\"https://areomagazine.com/2018/03/08/why-utopian-communities-fail/\" data-accessed=\"2018-03-24\">https://areomagazine.com/2018/03/08/why-utopian-communities-fail/</a> ]

If there's a text selection, use it for input, otherwise the input is a text block between blank lines.

The lines for {title, author, date/time, url} can be in any order. The author line should start with “by”.

URL `http://xahlee.info/emacs/emacs/elisp_make-citation.html'
Version 2020-07-15 2021-05-02"
  (interactive)
  (let* (
         ($bds (xah-get-bounds-of-thing 'block))
         ($p1 (car $bds))
         ($p2 (cdr $bds))
         ($inputText (buffer-substring-no-properties $p1 $p2))
         ;; ($inputText (replace-regexp-in-string "^[[:space:]]*" "" (elt $bds 0))) ; remove white space in front
         ;; ($lines (split-string $inputText "[ \t]*\n[ \t]*" t "[[:space:]]*"))
         ($lines (split-string $inputText "\n" t " *"))
         $title $author $date $url )
    ;; set title, date, url, author,
    (let ($x (case-fold-search t))
      ;; This is not an optimal implementation. We have n lines and need to
      ;; pair each one with one of n fields. Some lines (url, date, author)
      ;; can be recognized with certainty, so we match those first. Whatever
      ;; is left is then easy to identify, and is taken as the title.
      ;; A data structure that makes it easy to remove items already
      ;; identified, such as a hash table, would be cleaner.
      (while (> (length $lines) 0)
        (setq $x (pop $lines))
        (cond
         ((string-match "https?://" $x) (setq $url $x))
         ((xah-html--is-datetimestamp-p $x) (setq $date $x))
         ((string-match "^ *[bB]y:* " $x) (setq $author $x))
         (t (setq $title $x)))))
    ;; each variable is nil when its check fails, so there is nothing useful to print with %s
    (when (not $url) (error "I can't find “url” in the input"))
    (when (not $date) (error "I can't find “date” in the input"))
    (when (not $title) (error "I can't find “title” in the input"))
    (when (not $author) (error "I can't find “author” in the input"))
    (setq $title (string-trim $title))
    (setq $title (replace-regexp-in-string "^\"\\(.+\\)\"$" "\\1" $title))
    (setq $title (xah-replace-pairs-in-string $title '(["’" "'"] ["&" "&amp;"] ))) ; escape ampersand for HTML
    (setq $author (string-trim $author))
    (setq $author (replace-regexp-in-string "\\. " " " $author)) ; remove period in initials
    (setq $author (replace-regexp-in-string "^ *[Bb]y:* +" "" $author))
    (setq $author (upcase-initials (downcase $author)))
    (setq $date (string-trim $date))
    (setq $date (xah-fix-datetime-string $date))
    (setq $url (string-trim $url))
    (setq $url (with-temp-buffer (insert $url) (xah-html-source-url-linkify 1) (buffer-string)))
    (delete-region $p1 $p2 )
    (insert (concat "[<cite>" $title "</cite> ")
            "<time>" $date "</time>"
            " By " $author
            ". At " $url
            " ]")))

The code is part of Emacs: Xah HTML Mode. You need that package, because the command calls helper functions defined there.
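To see how the line identification step works without installing the package, here is a dependency-free sketch that uses only built-in functions. The name my-citation-classify-lines is hypothetical. The date test is a simplification that accepts only the yyyy-mm-dd form, and is not the same as the date check in Xah HTML Mode.

```elisp
;; Sketch only: `my-citation-classify-lines' is a hypothetical helper.
;; It mirrors the cond logic of the command, using only built-ins.
(defun my-citation-classify-lines (text)
  "Split TEXT into lines and return a plist with :title :url :date :author.
A line is matched as url, date, or author first. Any other line is the title."
  (let ((case-fold-search t) ; so that “By” and “by” both match
        title url date author)
    (dolist (line (split-string text "\n" t "[ \t]+"))
      (cond
       ((string-match-p "\\`https?://" line)
        (setq url line))
       ;; simplified date test (assumption): exactly yyyy-mm-dd
       ((string-match-p "\\`[0-9]\\{4\\}-[0-9]\\{2\\}-[0-9]\\{2\\}\\'" line)
        (setq date line))
       ((string-match-p "\\`by:? +" line)
        (setq author (replace-regexp-in-string "\\`by:? +" "" line)))
       (t (setq title line))))
    (list :title title :url url :date date :author author)))

;; Example call:
;; (my-citation-classify-lines
;;  "Emacs Lisp Basics\nhttp://ergoemacs.org/emacs/elisp_basics.html\nBy Xah Lee\n2020-01-17")
```

As in the command itself, a title that happens to start with “by ” would be taken as the author line, and if several lines match no pattern, the last one wins as the title.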