Python Doc Problem: os.path.split()

By Xah Lee. Date:

Quote from:

Split the pathname path into a pair, (head, tail) where tail is the last pathname component and head is everything leading up to that. The tail part will never contain a slash; if path ends in a slash, tail will be empty. If there is no slash in path, head will be empty. If path is empty, both head and tail are empty. Trailing slashes are stripped from head unless it is the root (one or more slashes only). In nearly all cases, join(head, tail) equals path (the only exception being when there were multiple slashes separating head from tail).

This confusing verbiage is a result of the author's pretension to an austere style and his failure to think clearly before writing.
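To see what the quoted paragraph actually claims, it helps to run each of its sentences as a test. The following is a small sketch, not part of the official doc. It uses posixpath (the module that os.path refers to on unix-like systems) so the results are the same on any platform, and the sample paths are made up for illustration.

```python
import posixpath

# Each sample path is hypothetical; the expected pairs follow the quoted doc.
cases = {
    "/usr/local/bin": ("/usr/local", "bin"),  # ordinary case
    "/usr/local/": ("/usr/local", ""),        # path ends in a slash: tail is empty
    "readme.txt": ("", "readme.txt"),         # no slash: head is empty
    "": ("", ""),                             # empty path: both are empty
    "///": ("///", ""),                       # root (slashes only): head is kept as-is
    "/usr//bin": ("/usr", "bin"),             # repeated slash between head and tail
}

results = {path: posixpath.split(path) for path in cases}

for path, expected in cases.items():
    assert results[path] == expected
    print(repr(path), "->", results[path])

# join(head, tail) gives back the path, except in the repeated-slash case.
roundtrip = {path: posixpath.join(*results[path]) == path for path in cases}
print(roundtrip)
```

Only the "/usr//bin" entry fails the round trip, which is the one exception the doc mentions in its last sentence.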

Suggested rewrite:

Returns a pair (dirname, filename), where dirname is the part of path up to the last slash, and filename is the rest of the string after the last slash.

Exceptional cases are:
• If path is a single slash (or a repeated slash), then dirname == path and filename is empty.
• If the last slash is repeated, the repeated slashes are treated as one single slash.
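The suggested rewrite is short enough to turn directly into code. Below is a minimal sketch that follows the wording of the rewrite and then compares its output with posixpath.split(). The function name split_as_rewritten and the sample paths are my own, for illustration only; this is not claimed to be how the library implements it.

```python
import posixpath

def split_as_rewritten(path):
    """Split path into (dirname, filename) following the suggested rewrite."""
    last = path.rfind("/")
    if last == -1:
        # No slash at all: everything is the filename.
        return "", path
    dirname, filename = path[:last + 1], path[last + 1:]
    # Drop the trailing slash (or repeated slashes) from dirname,
    # unless dirname consists of slashes only (the root case).
    trimmed = dirname.rstrip("/")
    if trimmed:
        dirname = trimmed
    return dirname, filename

# Hypothetical sample paths covering the normal and exceptional cases.
samples = ["/usr/local/bin", "/usr/local/", "readme.txt", "",
           "/", "///", "/usr", "/usr//bin", "a/b/c.txt"]

for p in samples:
    assert split_as_rewritten(p) == posixpath.split(p)
    print(repr(p), "->", split_as_rewritten(p))
```

On these samples the sketch agrees with posixpath.split(), which suggests the two exceptional cases in the rewrite are enough to describe the behavior.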

I was working on a program where I needed to split a path into dirname, corename, and suffix.
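For the record, that task takes two calls: split() to separate the directory from the file name, then splitext() to separate the file name from its suffix. Here is a sketch; the helper name split_path and the sample paths are hypothetical.

```python
import os.path

def split_path(path):
    """Return (dirname, corename, suffix) for a path."""
    dirname, filename = os.path.split(path)         # directory part vs. last component
    corename, suffix = os.path.splitext(filename)   # name vs. extension, dot kept in suffix
    return dirname, corename, suffix

# Hypothetical sample paths.
print(split_path("/home/user/report.final.txt"))
print(split_path("notes"))
print(split_path("/tmp/.bashrc"))
```

Note that splitext() treats only the last dot as the start of the suffix, and a leading dot (as in .bashrc) is considered part of the name, not a suffix.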

I came to this page and it took me a while to understand what split() is about. There are other path related functions: splitext(), splitdrive(), basename(), dirname(). A user has to scan the whole page and painfully read each one to fully understand how to choose and use them for the task at hand.
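How these functions relate is easier to see when they are run side by side on the same path. The following sketch uses posixpath so the output does not depend on the platform; the sample path is made up.

```python
import posixpath

path = "/home/user/web/index.html"  # hypothetical sample path

head, tail = posixpath.split(path)
root, ext = posixpath.splitext(path)
drive, rest = posixpath.splitdrive(path)

# dirname() and basename() are just the two halves of split().
assert posixpath.dirname(path) == head
assert posixpath.basename(path) == tail

print("split:     ", (head, tail))
print("splitext:  ", (root, ext))
print("splitdrive:", (drive, rest))  # drive is always empty in posixpath
```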

As I have explained before (see references at bottom), documentation should be organized around the programmer's tasks, not alphabetically, by compiler view, or by some computer sciency scheme. In this os.path module, split(), splitext(), dirname(), basename() should all be under one section. This way, their usefulness and each one's fitness become clearer, and they are also easier to document as a collective. Other functions that test files or get info about files should be grouped together. Don't be afraid of having functions that won't fit into some grouping scheme; e.g. walk() and supports_unicode_filenames() can be lumped at the bottom as Other. The need to present materials in some aloof, computer sciency, academic, technically precise way is a major usability problem of the Python doc.

(The Pythoners' need to present materials in a formal style is a backlash against the happy-go-lucky sloppiness of the unix/perl community. However, being pretty much the same crowd, without significant critical thinking and writing skills, they cannot do better by hiding in formality.)

Also, at the top we see:

Warning: On Windows, many of these functions do not properly support UNC pathnames. splitunc() and ismount() do handle them correctly.

As indicated before, this is an exhibition of tech geeking and jargonizing. If this warning is necessary, place it at the bottom of the page as a footnote. Also, spell out UNC, and provide a link to its proper spec.

Tech geekers are very pretentious and cryptic in their tech docs. They are afraid that spelling out UNC would make them look unprofessional, that their peers would deem them inferior. There are myriad technical standards, and any programmer can be familiar with only a fraction of them, confined to his area of expertise. Standards and their acronyms come and go, each with varying degrees of precision and actual relevance, and they are intermingled with de facto practices in the commercial world that may not even have official names. The tech geekers are clouded by their tech expertise. The purpose of documentation is not some cold academic presentation. The vast majority who come to use os.path don't know what UNC is, nor do they need to know. Spell things out when in doubt.

UNC here isn't really a significant “standard”. This warning should be left out.


See also: Why Open Source Documentation is of Low Quality