Python: Sort

By Xah Lee. Date: . Last updated: .

To sort a list, use the “sort” method.

# -*- coding: utf-8 -*-
# python

li = [1,9,2,3]
li.sort()
print li    # [1, 2, 3, 9]

Note that sort is a method, and the variable is changed in-place.

Sorting a Matrix

Here's a example of sorting a matrix by second column:

# -*- coding: utf-8 -*-
# python

# sort a matrix

li = [[2,6],[1,3],[5,4]]

li.sort(lambda x, y: cmp(x[1],y[1]))

print li; #  [[1, 3], [5, 4], [2, 6]]

The argument to sort is a function of two arguments, that returns -1, 0, 1. This function is a decision function that tells sort() how to decide the order of any two elements in your list. If the first argument is “less” then second argument, the function should return -1. If equal, then 0. Else, 1.

If you want to sort without modifying the variable, use “sorted()” (Python 2.4).

Sort by Custom Order

Here's a more complex example. Suppose you have a list of strings.

'my283.jpg'
'my23i.jpg'
'web7-s.jpg'
'fris88large.jpg'
…

You want to sort them by the number embedded in them.

You need to define a ordering function, and pass it to sort. The function should takes two strings, and compare the integer inside the string. Here's the solution:

# -*- coding: utf-8 -*-
# python

# sort by custom order

import re

li = [
'my283.jpg',
'my23i.jpg',
'web7-s.jpg',
'fris88large.jpg',
]

# compare number inside string
def myComp (x,y):
    def getNum(str): return float(re.findall(r'\d+',str)[0])
    return cmp(getNum(x),getNum(y))

li.sort(myComp)

print li
# ['web7-s.jpg', 'my23i.jpg', 'fris88large.jpg', 'my283.jpg']

Here, we defined a function “myComp” to tell sort about the ordering.

Math of Ordering

In general, the function f used to determine the order of any two element must satisfy some constraints:

If the comparison function does not behave as the above, then it is not consistent. It means that the result ordered list may actually be different depending the order of the original list or how the language happens to implement sort.

The significance of all this is that in real software you may want to sort a list of complex object by a specialized ordering. For example, you may want to sort a list of triangles in 3D space by their orientation. Or, you need to sort graphical user interface window objects in a Word Processor by their width. Or, sorting image selections in a photo-editing program. It is in advanced cases like these, understanding the basic math about ordering is important. Otherwise, you might have a bewildering result yet unable to locate any flaws in your code.

The order-deciding function given by the user is called a “predicate” in computer science. (A predicate function is any function that returns either true or false) And, it is usually defined inline using a pure function construct called “lambda”.

Python's “sort” method's optional parameters: “key” and “reverse”

Most of the time, sorting is done for a list of atomic element such as [3,2,4]. This is simply done by myList.sort() without any argument. Other than simple list, sort is frequently used on matrixes (For example, [[2,6],[1,3],[5,4]]). For matrixes, almost always a particular column is used for the basis of ordering. For example, if we want to sort by second column, we do: li.sort(lambda x, y: cmp(x[1],y[1])). Since this is frequently used, Python provides a somewhat shorter syntax for it, by specifying the column used as the ordering “key”. For example:

li=[[2,6],[1,3],[5,4]]
li.sort(key=lambda x:x[1] ) # is equivalent to the following
#li.sort(lambda x, y: cmp(x[1],y[1]))
print li; # prints [[1, 3], [5, 4], [2, 6]]

Because the Python compiler is not very refined, this specialized syntax is algorithmically a order of magnitude faster than the general form lambda x, y: cmp(x[1],y[1]).

That is to say, the programer must now learn 2 syntaxes for the ordering function, one general and one specialized, and he must choose the right one otherwise he will be penalized by a magnitude of slow down. This also means that, in order to make the right choice of the syntax, the programer must known in advance of the nature of the list to be sorted.

Another idiosyncratic provision is the optional “reverse” argument. This parameter is necessary when using the “key” parameter. Normally, when using the order comparison function lambda(x,y), the ascending/descending nature can be changed by simply switching the parameters x and y. But now a single key=lambda(x) can't provide that, thus another optional parameter “reverse” is provided. Example:

The following are equivalent:

Of course, one can reverse a list by the reverse() method for lists. But li.sort(…).reverse() is algorithmically another order of magnitude slower since Python compiler is not smart.

The official doc on Python's sort method is at (bottom): http://python.org/doc/2.4/lib/typesseq-mutable.html

Thanks to Lasse Vågsæther Karlsen for informing me the existence of the sorted() function in Python 2.4.

See also: Python Doc Problem Example: sort()

If you have a question, put $5 at patreon and message me.

Python by Example

  1. Python Basics
  2. Print Version String
  3. Builtin Help
  4. Quote String
  5. String Operations
  6. String Methods
  7. Format String
  8. True, False
  9. if then else
  10. for, while, Loops
  11. List Basics
  12. Loop Thru List
  13. Map Function to List
  14. List Comprehension
  15. List Methods
  16. Dictionary
  17. Loop Thru Dict
  18. Dict Methods
  19. Function
  20. Class
  21. List Modules
  22. Write a Module
  23. Unicode 🐍

Regex

  1. Regex Basics
  2. Regex Reference

Text Processing

  1. Read/Write File
  2. Traverse Directory
  3. Manipulate Path
  4. Process Unicode
  5. Convert File Encoding
  6. Find Replace in dir
  7. Find Replace by Regex
  8. Count Word Frequency

Web

  1. Send Email
  2. GET Web Page
  3. Web Crawler
  4. HTTP POST
  5. Check Page Load Size
  6. Thumbnail Generation

Misc

  1. JSON
  2. Find Script Path
  3. Get Env Var
  4. System Call
  5. Decompress Gzip
  6. Complex Numbers

Advanced

  1. Sort
  2. Copy Nested List
  3. Tuple vs List
  4. Sets, Union, Intersection
  5. Closure in Python 2
  6. Decorator
  7. Append String in Loop
  8. Timing f timeit
  9. Keyword Arg Default Value Unstable