Median and Quartiles – Python

[[[TOC]]]
----------
= Median =
{{{ lang=python line=1
nums = [5, 9, 10, 11, 22, 1, 0, -1] #< Fill list with values
nums.sort() #< Sort the list in ascending order
mid_num = ( len( nums ) - 1 ) // 2 #< Index of the middle (or lower-middle) value
if len( nums ) % 2 != 0: #< Odd number of values: a single middle value
    median = nums[ mid_num ]
else: #< Even number of values: average the two middle values
    median = ( nums[ mid_num ] + nums[ mid_num + 1 ] ) / 2.0
}}}
----------
= Lower Quartile =
{{{ lang=python line=1
nums = [5, 9, 10, 11, 22, 1, 0, -1] #< Fill list with values
nums.sort() #< Sort the list in ascending order
# Round the 1-based position (n + 1) / 4 to a whole position, then convert it to a 0-based index
low_mid = int( round( ( len( nums ) + 1 ) / 4.0 ) - 1 ) #< Thanks @Alex (comments)
lq = nums[ low_mid ]
}}}
----------
= Upper Quartile =
{{{ lang=python line=1
import math
nums = [5, 9, 10, 11, 22, 1, 0, -1] #< Fill list with values
nums.sort() #< Sort the list in ascending order
high_mid = ( len( nums ) - 1 ) * 0.75 #< Fractional index three quarters of the way through the list
# Cast the results of math.floor/ceil to int for use as list indices
ceil = int( math.ceil( high_mid ) )
floor = int( math.floor( high_mid ) )
uq = ( nums[ ceil ] + nums[ floor ] ) / 2.0 #< Average the two neighbours (identical when high_mid is whole)
}}}
----------
= Credits =
[[http://www.mathsteacher.com.au/year9/ch17_statistics/06_quartiles/quartiles.htm]]
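----------
= Function Sketch =
The calculations in this post can also be wrapped in small reusable functions. What follows is a minimal sketch, assuming Python 3. The names `median` and `quartile_by_rounding` are hypothetical, and applying the rounding rule to the three-quarter position is an extension for illustration, so its result need not agree with the floor/ceil averaging used in the Upper Quartile snippet.

```python
import statistics


def median(values):
    """Median of a non-empty sequence, using an explicit even/odd check."""
    ordered = sorted(values)  # work on a sorted copy, leave the input untouched
    n = len(ordered)
    mid = n // 2
    if n % 2 != 0:
        return ordered[mid]  # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2.0  # even count: average the middle pair


def quartile_by_rounding(values, fraction):
    """Value at the 1-based position round((n + 1) * fraction).

    fraction=0.25 matches the lower-quartile rule from the comments;
    fraction=0.75 is an assumed extension of that rule.
    """
    ordered = sorted(values)
    position = int(round((len(ordered) + 1) * fraction))
    position = min(max(position, 1), len(ordered))  # keep the position inside the list
    return ordered[position - 1]


nums = [5, 9, 10, 11, 22, 1, 0, -1]
print(median(nums))                      # 7.0
print(statistics.median(nums))           # 7.0, standard-library cross-check
print(quartile_by_rounding(nums, 0.25))  # 0
print(quartile_by_rounding(nums, 0.75))  # 11
```

Sorting inside each function removes the need to remember `nums.sort()` beforehand, at the cost of sorting again on every call.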


Comments

6 responses to “Median and Quartiles – Python”

  1. Antoine

    Hi, you forgot to mention that it works only on a sorted list.

    1. dlasley

      Good call; I went ahead and added that in. Thanks for your input!

  2. Rom

    median = ( sizes[ ceil ] + sizes[ floor ] ) / 2

    Your function “sizes” is not defined in the math module or the standard Python library (I’m using 2.7.2.5). What are you trying to calculate here? Are you using a certain version of Python to access this function?

    1. dlasley

      Sorry about that. It looks like I decided to change the variable name from `sizes` to `nums` but missed some references. With the variables referenced correctly, this code is compatible with Python versions >= 2.6. The post has been updated; thank you for catching this.

  3. Alex

    Please delete the above, my brain wasn’t working.

    On entering the except: the only way to do this (again, with respect to 2.7.5; I’m not sure about implementation changes in 3) is to use modulo: take `% 2` and check for `!= 0`.

    On calculating the quartile:
    You can’t take ( float( len ) - 1 ) / 4. Imagine what happens in the case of len = 9 numbers (the same example as the stats link). Your formula results in an index of 2, or the 3rd number in the list. That is wrong; it should be the average of the 2nd and 3rd numbers (if following the stats link method).

    The formula at the stats link will produce inconsistent results. For example, with 8 numbers, where you would correctly assume the 2nd number would be the largest number in the lower quartile, the formula would calculate it as the 9/4 = 2.25th number. At that point you would need to decide whether to use linear interpolation, round off, or round to the nearest .5. If their approach is to round except when exactly at the midpoint between two positions, their stated goal of finding the “median of the lower half of the data set” is clearly incorrect. There are better approaches. Let’s stick for now with the easiest approach, which is to always round to the nearest position, where L = round( (n + 1) / 4 ). In that case your code still won’t work but is closer:

    nums.sort() #< Sort the list in ascending order

    low_mid = int( round( ( len(nums) + 1 ) / 4.0 ) - 1 )

    lq = nums[low_mid]

    This will result in consistent treatment of a list of any length. There are nicer ways of handling the lower quartile, however: http://mathforum.org/library/drmath/view/60969.html

    Sorry for spamming your blog, please delete the comments preceding this one, and thanks for writing the formula and making me think!

    1. Dave Lasley

      No worries about all the comments; I’m happy to see that my article provoked some thought! I actually came back to this recently while porting the code to Node.js, and noticed exactly what you did. Excellent formula; I’m updating the article now. Thanks for reading my blog, and for the spam :)
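
The two readings of the lower quartile discussed in the comments can be compared numerically. This is a minimal sketch, assuming Python 3. The function names are hypothetical. Leaving the middle value out of the lower half for odd-length lists is an assumption inferred from the 9-number example in the thread, not something quoted from the linked page.

```python
def lower_quartile_position(n):
    """1-based position L = (n + 1) / 4, before any rounding."""
    return (n + 1) / 4.0


def median_of_lower_half(ordered):
    """Median of the lower half of an already sorted list with at least two values.

    Assumption: for an odd count, the overall middle value is left out of the half.
    """
    half = ordered[: len(ordered) // 2]
    mid = len(half) // 2
    if len(half) % 2 != 0:
        return half[mid]  # odd-sized half: single middle value
    return (half[mid - 1] + half[mid]) / 2.0  # even-sized half: average the middle pair


eight = [1, 2, 3, 4, 5, 6, 7, 8]
nine = [1, 2, 3, 4, 5, 6, 7, 8, 9]

print(lower_quartile_position(len(eight)))  # 2.25
print(lower_quartile_position(len(nine)))   # 2.5
print(median_of_lower_half(eight))          # 2.5
print(median_of_lower_half(nine))           # 2.5, the average of the 2nd and 3rd values
```

For 9 values the unrounded position is exactly 2.5, so the rounded result depends on the interpreter: Python 3's `round(2.5)` returns 2 (ties go to the even neighbour), whereas Python 2 returns 3.0.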
