Edge Cases in Computing, and What Exactly is Whitespace?

By Xah Lee. Date: 2017-01-26. Last updated: 2017-02-01.

Edge cases are a thorny, inevitable, and widespread problem in computing. Every function, protocol, and design consideration has lots of edge cases.

Error checking of input is also a big problem. To check or not to check? To check is a massive waste of resources. But not to check is a large source of bugs.

One edge case in computing is the meaning of the newline char. No perspective is absolutely perfect. You end up with a mess.

Two ways to view newlines, both of which are self-consistent, are that newlines either separate lines or that they terminate lines. If a newline is considered a separator, there will be no newline after the last line of a file. Some programs have problems processing the last line of a file if it is not terminated by a newline. On the other hand, programs that expect newline to be used as a separator will interpret a final newline as starting a new (empty) line. Conversely, if a newline is considered a terminator, all text lines including the last are expected to be terminated by a newline. If the final character sequence in a text file is not a newline, the final line of the file may be considered to be an improper or incomplete text line, or the file may be considered to be improperly truncated.
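The two views are easy to see side by side. Here is a minimal sketch in Python, picked only for illustration. The helper names `split_as_separator` and `split_as_terminator` are made up for this example and are not from any library.

```python
def split_as_separator(text):
    # Newline separates lines: a trailing newline starts a new (empty) last line.
    return text.split("\n")


def split_as_terminator(text):
    # Newline terminates lines: every complete line ends in "\n".
    # Returns (complete_lines, leftover), where leftover is any unterminated
    # text after the last newline ("" when the text ends with a newline).
    *complete, leftover = text.split("\n")
    return complete, leftover


with_final_newline = "a\nb\n"
without_final_newline = "a\nb"

print(split_as_separator(with_final_newline))      # ['a', 'b', '']
print(split_as_separator(without_final_newline))   # ['a', 'b']
print(split_as_terminator(with_final_newline))     # (['a', 'b'], '')
print(split_as_terminator(without_final_newline))  # (['a'], 'b')
```

Same two inputs, four different answers. Under the terminator view, the trailing `b` is the "improper or incomplete" line from the quote above.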

Representations…

LF

CR+LF

CR

RS

0x9B

LF+CR

Unicode

LF: Line Feed, U+000A

VT: Vertical Tab, U+000B

FF: Form Feed, U+000C

CR: Carriage Return, U+000D

CR+LF: CR (U+000D) followed by LF (U+000A)

NEL: Next Line, U+0085

LS: Line Separator, U+2028

PS: Paragraph Separator, U+2029

[2017-02-01 Wikipedia Newline]
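To see how one language treats the list above, here is a small Python sketch. It feeds each candidate newline to the built-in `str.splitlines()`. The `samples` table is just my own test data, and other languages will very likely answer differently.

```python
# Each sample is "a", one candidate newline, then "b".
samples = {
    "LF": "a\nb",
    "VT": "a\x0bb",
    "FF": "a\x0cb",
    "CR": "a\rb",
    "CR+LF": "a\r\nb",
    "NEL": "a\x85b",
    "LS": "a\u2028b",
    "PS": "a\u2029b",
    "LF+CR": "a\n\rb",
}

# str.splitlines() splits on every line boundary Python recognizes.
results = {name: text.splitlines() for name, text in samples.items()}

for name, lines in results.items():
    print(f"{name:6} {lines}")
```

In Python 3, every sample from the Unicode list comes back as `['a', 'b']`. LF+CR comes back as `['a', '', 'b']`, because it is read as two line breaks with an empty line in between.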

And here is JavaScript's take on newline: ECMAScript 2015 §ECMAScript Language: Lexical Grammar#sec-unicode-format-control-characters
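As far as I can tell, the ECMAScript 2015 lexical grammar lists only four line terminator code points: LF, CR, LS, and PS. The sketch below imitates that rule in Python so it can be compared with Python's own `str.splitlines()`. The function `split_ecmascript_style` and the regex are my own approximation, not anything defined by the spec.

```python
import re

# Assumption: the four line terminator code points listed in ECMAScript 2015
# (LF, CR, LS, PS), with CR+LF treated as a single line break.
ECMASCRIPT_LINE_BREAK = re.compile("\r\n|[\n\r\u2028\u2029]")


def split_ecmascript_style(text):
    # Split only where the ECMAScript-style rule above sees a line break.
    return ECMASCRIPT_LINE_BREAK.split(text)


text = "a\x85b\u2028c\x0bd"  # a NEL b LS c VT d

print(split_ecmascript_style(text))  # ['a\x85b', 'c\x0bd']
print(text.splitlines())             # ['a', 'b', 'c', 'd']
```

Same string, two lines under one rule and four under the other. NEL and VT are line breaks to Python but ordinary characters under the four-terminator rule.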

Unicode added more chars for newline: Next Line U+0085, Line Separator U+2028, Paragraph Separator U+2029. They only added complexity.

A simple question: exactly what does your language's trimString function do? Bet you don't know!
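One way to find out is to ask the language directly. Here is a minimal Python sketch that probes the built-in `str.strip()` one character at a time. The `candidates` table and the helper `is_trimmed` are made up for this test, and the list of characters is far from complete.

```python
candidates = {
    "SPACE U+0020": "\u0020",
    "TAB U+0009": "\u0009",
    "LF U+000A": "\u000a",
    "VT U+000B": "\u000b",
    "FF U+000C": "\u000c",
    "CR U+000D": "\u000d",
    "NEL U+0085": "\u0085",
    "NBSP U+00A0": "\u00a0",
    "LS U+2028": "\u2028",
    "PS U+2029": "\u2029",
    "ZWSP U+200B": "\u200b",
    "BOM U+FEFF": "\ufeff",
}


def is_trimmed(ch):
    # True if str.strip() with no arguments removes ch from both ends.
    return (ch + "x" + ch).strip() == "x"


trimmed = {name: is_trimmed(ch) for name, ch in candidates.items()}

for name, result in trimmed.items():
    print(f"{name:14} {result}")

# bytes.strip() only knows ASCII whitespace, so byte 0xA0 stays put.
print(b"\xa0x\xa0".strip() == b"\xa0x\xa0")
```

In Python 3, everything in the table is trimmed except the zero width space U+200B and the byte order mark U+FEFF, and the last line prints `True`. So even inside one language, "whitespace" depends on whether you are holding text or bytes.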