
import math
nums = [5, 9, 10, 11, 22, 1, 0, 1] #< Fill list with values
nums.sort() #< Sort the list in ascending order
mid_num = ( len( nums ) - 1 ) / 2.0 #< Middle index; fractional for an even number of values
if len( nums ) % 2 != 0: #< There was an odd number of values
    median = nums[ int( mid_num ) ]
else: #< There was an even number of values
    # Make sure to type results of math.floor/ceil to int for use in list indices
    ceil = int( math.ceil( mid_num ) )
    floor = int( math.floor( mid_num ) )
    median = ( nums[ ceil ] + nums[ floor ] ) / 2.0
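
As a rough sketch, the median calculation can also be wrapped in a function that leaves the input list untouched. The name `median_of` is made up for this example, and the cross-check against `statistics.median` assumes Python 3.4 or newer, which is later than the versions this post was written for.

```python
import statistics

def median_of(values):
    """Return the median of a non-empty list of numbers (hypothetical helper)."""
    ordered = sorted(values)  # sorted() returns a new list; the input is not modified
    count = len(ordered)
    if count == 0:
        raise ValueError("median_of() needs at least one value")
    mid = count // 2
    if count % 2 != 0:
        # Odd number of values: the middle one is the median
        return ordered[mid]
    # Even number of values: average the two middle ones
    return (ordered[mid - 1] + ordered[mid]) / 2.0

nums = [5, 9, 10, 11, 22, 1, 0, 1]
print(median_of(nums))                             # 7.0
print(median_of(nums) == statistics.median(nums))  # True
```

Using `//` for the index keeps it an integer on both Python 2 and 3, so no float ever reaches the list subscript.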

Lower Quartile

import math
nums = [5, 9, 10, 11, 22, 1, 0, 1] #< Fill list with values
nums.sort() #< Sort the list in ascending order
low_mid = int( round( ( len(nums) + 1 ) / 4.0 ) - 1 ) #< Thanks @Alex (comments)
lq = nums[low_mid]
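
One thing to watch with the rounded position: Python 2's `round()` rounds halves away from zero, while Python 3's rounds them to the nearest even number, so `round(2.5)` is `3.0` on Python 2 but `2` on Python 3. The sketch below sidesteps that with an explicit half-up rounding helper. Both function names are invented for this example, and the half-up choice is an assumption about the intended behaviour.

```python
import math

def round_half_up(value):
    """Round a non-negative number to the nearest integer, halves going up."""
    return int(math.floor(value + 0.5))

def lower_quartile(values):
    """Lower quartile by the rounded-position rule (hypothetical helper)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("lower_quartile() needs at least one value")
    position = round_half_up((len(ordered) + 1) / 4.0)  # 1-based position
    index = max(position - 1, 0)  # convert to a 0-based index, never negative
    return ordered[index]

print(lower_quartile([5, 9, 10, 11, 22, 1, 0, 1]))   # 1
print(lower_quartile([1, 2, 3, 4, 5, 6, 7, 8, 9]))   # 3
```

With nine values the position is exactly 2.5, which is the case where the two Python versions would otherwise pick different elements.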

Upper Quartile

import math
nums = [5, 9, 10, 11, 22, 1, 0, 1] #< Fill list with values
nums.sort() #< Sort the list in ascending order
high_mid = ( len( nums ) - 1 ) * 0.75 #< Position of the upper quartile; usually fractional
if high_mid == int( high_mid ): #< The position is a whole index
    uq = nums[ int( high_mid ) ]
else: #< The position falls between two values
    # Make sure to type results of math.floor/ceil to int for use in list indices
    ceil = int( math.ceil( high_mid ) )
    floor = int( math.floor( high_mid ) )
    uq = ( nums[ ceil ] + nums[ floor ] ) / 2.0
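
The upper quartile can be packaged the same way. This is only a sketch of the approach used in this post (average the two values either side of the 0-based position `(n - 1) * 0.75`), not a standard definition; the name `upper_quartile` is made up for the example.

```python
import math

def upper_quartile(values):
    """Upper quartile by averaging the neighbours of position (n - 1) * 0.75."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("upper_quartile() needs at least one value")
    position = (len(ordered) - 1) * 0.75  # 0-based and usually fractional
    below = int(math.floor(position))
    above = int(math.ceil(position))
    # When position is whole, below == above and the average is that one value
    return (ordered[below] + ordered[above]) / 2.0

print(upper_quartile([5, 9, 10, 11, 22, 1, 0, 1]))  # 10.5
print(upper_quartile([1, 2, 3, 4, 5]))              # 4.0
```

Always taking the floor and ceiling removes the need for a separate whole-index branch, at the cost of returning a float even when the position is whole.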

Credits
http://www.mathsteacher.com.au/year9/ch17_statistics/06_quartiles/quartiles.htm
Hi, you forgot to mention that it works only on a sorted list.
Good call; I went ahead and added that in. Thanks for your input!
median = ( sizes[ ceil ] + sizes[ floor ] ) / 2
Your function “sizes” is not defined in the math module or the standard Python library (I’m using 2.7.2.5). What are you trying to calculate here? Are you using a certain version of Python to access this function?
Sorry about that, it looks like I decided to change the variable name from `sizes` to `nums`, but missed some references. Had I been referencing the variables correctly, this code is compatible with Python versions >= 2.6. The post has been updated; thank you for catching this.
Please delete the above, my brain wasn’t working.
On entering the except: the only way to do this (again, with respect to 2.7.5; I'm not sure about implementation changes in 3) is to use modulo, i.e. take the length % 2 and check for != 0.
On calculating the quartile:
You can’t take ( float( len ) - 1 ) / 4. Imagine what happens in the case of len = 9 numbers (same example as the stats link). Your formula results in an index of 2, or the 3rd number in the list. That is wrong; it should be the average of the 2nd and 3rd numbers (if following the stats link method).
The formula at the stats link will produce inconsistent results. For example, with 8 numbers, where you would correctly assume the 2nd number would be the largest number in the lower quartile, the formula would calculate it as the 9/4 = 2.25th number. At that point you would need to decide whether to use linear interpolation, round off, or round to the nearest .5. If their approach is to round except when exactly at the midpoint between two positions, their stated goal of finding the “median of the lower half of the data set” is clearly incorrect. There are better approaches. Let's stick for now with the easiest approach, which is to always round to the nearest position, where L = round( ( n + 1 ) / 4 ). In that case your code still won’t work but is closer:
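
To make the nine-number case concrete, here is a sketch of the "median of the lower half" reading of the lower quartile. It assumes the overall median is left out of the lower half when the list has an odd length, which is one common convention and the one that yields the average of the 2nd and 3rd numbers; the helper names are made up for this example.

```python
def median_sorted(ordered):
    """Median of an already sorted, non-empty list."""
    count = len(ordered)
    mid = count // 2
    if count % 2 != 0:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2.0

def lower_quartile_by_halves(values):
    """Lower quartile as the median of the lower half (hypothetical helper)."""
    ordered = sorted(values)
    if len(ordered) < 2:
        raise ValueError("need at least two values to form a lower half")
    # For an odd length, len // 2 stops just before the overall median
    lower_half = ordered[:len(ordered) // 2]
    return median_sorted(lower_half)

nine = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(lower_quartile_by_halves(nine))    # 2.5, the average of the 2nd and 3rd numbers
print(nine[int((len(nine) - 1) / 4.0)])  # 3, what the (len - 1) / 4 index picks
```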
nums.sort() #< Sort the list in ascending order
low_mid = int( round( ( len(nums) + 1 ) / 4.0 ) - 1 )
lq = nums[low_mid]
This will result in consistent treatment of any length list. There are nicer ways of handling lower quartile however: http://mathforum.org/library/drmath/view/60969.html
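
For comparison, the sketch below puts two of the options mentioned above side by side for the 2.25th position of an 8-value list: linear interpolation between neighbours, and rounding to the nearest position. The data values and the helper name `value_at_position` are made up for illustration, and the helper assumes the position lies between 1 and the length of the list.

```python
import math

def value_at_position(ordered, position):
    """Linearly interpolate the value at a 1-based, possibly fractional position.

    Assumes 1 <= position <= len(ordered) and that `ordered` is sorted.
    """
    index = position - 1.0                    # switch to 0-based
    below = int(math.floor(index))
    above = min(below + 1, len(ordered) - 1)  # stay inside the list
    fraction = index - below
    return ordered[below] + (ordered[above] - ordered[below]) * fraction

data = [10, 20, 30, 40, 50, 60, 70, 80]  # already sorted
position = (len(data) + 1) / 4.0         # 2.25

interpolated = value_at_position(data, position)
rounded = data[int(math.floor(position + 0.5)) - 1]  # nearest position, halves up

print(interpolated)  # 22.5
print(rounded)       # 20
```

Rounding always returns a value that is actually in the list, while interpolation can return one that is not; which is preferable depends on how the quartile will be used.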
Sorry for spamming your blog, please delete the comments preceding this one, and thanks for writing the formula and making me think!
No worries on all the comments, I’m happy to see that my article provoked some thought! I actually came back to this recently while porting the code to Node.JS, and noticed exactly what you did. Excellent formula, I’m updating the article now. Thanks for reading my blog, and for the spam :)