CS331 - Data Structures and Algorithms

Version 1

Course webpage for CS331

Searching, Sorting, and Timing

Agenda

  1. Timing (Empirical runtime analysis)
  2. Prelude: Timing list indexing
  3. Linear search
  4. Binary search
  5. Insertion sort
  6. Bubble sort

1. Timing

import time
print(time.time())

1612976291.3405042

t1 = time.time()
time.sleep(1)
t2 = time.time()
print(t2 - t1)

1.0015649795532227

2. Prelude: Timing list indexing

lst = [0] * 10**5
import timeit
print(timeit.timeit(stmt='lst[0]', globals=globals()))

0.03113916900474578

print(timeit.timeit(stmt='lst[10**5-1]', globals=globals()))

0.03360662600607611

print('lst[{}]'.format(1))

lst[1]

times = [timeit.timeit(stmt='lst[{}]'.format(i),
                         globals=globals(),
                         number=1000)
           for i in range(10**5)]
times[:10]
  [3.1036994187161326e-05,
  2.981899888254702e-05,
  3.060800372622907e-05,
  2.9064001864753664e-05,
  3.014199319295585e-05,
  3.0212002457119524e-05,
  2.9727991204708815e-05,
  2.9686998459510505e-05,
  2.9867005650885403e-05,
  3.0117997084744275e-05]
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot(times, 'bo')

Observation: accessing an element in a list by index takes a constant amount of time, regardless of position.

How? A Python list uses an array as its underlying data storage mechanism. To access an element in an array, the interpreter:

  1. Computes an offset into the array by multiplying the element’s index by the size of one array entry (all entries are the same size, since they are merely references to the actual elements)
  2. Adds the offset to the base address of the array
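
The two steps above can be sketched in a few lines. This is only an illustration of the arithmetic, not how the interpreter is actually written: `element_address`, the base address, and the 8-byte entry size (typical for a reference on a 64-bit build) are all assumptions made for the example.

```python
def element_address(base_address, index, entry_size=8):
    """Return the (simulated) address of the array entry at `index`."""
    offset = index * entry_size      # step 1: scale the index by the entry size
    return base_address + offset     # step 2: add the offset to the base address

base = 0x1000  # hypothetical base address of the array
print(element_address(base, 0))          # the first entry sits at the base address
print(element_address(base, 10**5 - 1))  # the last entry of a 10**5-element array
```

Either call performs one multiplication and one addition, which is why the cost does not depend on the index.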

3. Linear search

Task: to locate an element with a given value in a list (array).

def lindex(lst, x):
    for i in range(len(lst)):
        if x == lst[i]:
            return i
    return -1
lst = list(range(100))
lindex(lst, 10)
10
lindex(lst, 99)
99
lindex(lst, -2)
-1
import timeit
lst = list(range(1000))
ltimes = [timeit.timeit(stmt='lindex(lst, {})'.format(x),
                         globals=globals(),
                         number=100)
           for x in range(1000)]
import matplotlib.pyplot as plt
plt.plot(ltimes, 'ro')
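
One way to read these timings: the running time tracks the number of `==` comparisons made before the loop returns. The sketch below counts them directly. `lindex_count` is a hypothetical helper introduced for this illustration; it mirrors the loop in `lindex` but returns the comparison count instead of the index.

```python
def lindex_count(lst, x):
    """Like lindex, but return the number of comparisons performed."""
    comparisons = 0
    for i in range(len(lst)):
        comparisons += 1
        if x == lst[i]:
            return comparisons
    return comparisons

data = list(range(1000))
print(lindex_count(data, 0))     # 1 comparison: found at the front
print(lindex_count(data, 999))   # 1000 comparisons: found at the very end
print(lindex_count(data, -2))    # 1000 comparisons: not found, whole list scanned
```

The count grows in step with the position of the target, and a missing value costs as much as the last element.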

4. Binary search

Task: to locate an element with a given value in a list (array) whose contents are sorted in ascending order.

def index(lst, x):
    def binsearch_rec(lst,x,l,h):
        # search the half-open range [l, h); an empty range means x is absent
        if l >= h:
            return -1
        mid = ((h - l) // 2) + l
        if lst[mid] == x:
            return mid
        if lst[mid] < x:
            return binsearch_rec(lst,x,mid + 1,h) # discard lower half and mid
        return binsearch_rec(lst,x,l,mid)         # discard upper half; h is exclusive
    return binsearch_rec(lst,x,0,len(lst))
print(index(lst, 999))
print(index(lst, -1))
import timeit
lst = list(range(1000))
times = [timeit.timeit(stmt='index(lst, {})'.format(x),
                         globals=globals(),
                         number=1000)
           for x in range(1000)]
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot(times, 'ro')
plt.show()
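def iter_index(lst, x):
    l = 0
    h = len(lst)  # search the half-open range [l, h)
    while h > l:
        mid = ((h - l) // 2) + l
        if lst[mid] == x:
            return mid
        if lst[mid] < x:
            l = mid + 1  # x can only be to the right of mid
        else:
            h = mid      # h is exclusive, so this keeps index mid - 1 in range
    return -1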
import timeit
iter_no_times = []
for size in range(1000, 100000, 100):
    lst = list(range(size))
    iter_no_times.append(timeit.timeit(stmt='iter_index(lst, -1)',
                                 globals=globals(),
                                 number=1000))
import matplotlib.pyplot as plt
plt.plot(iter_no_times, 'ro')
import timeit
etimes = []
for e in range(5, 20):
    lst = list(range(2**e))
    etimes.append(timeit.timeit(stmt='iter_index(lst, -1)',
                                 globals=globals(),
                                 number=100000))
import matplotlib.pyplot as plt
plt.plot(etimes, 'ro')
plt.show()
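
The slow growth in these timings is consistent with halving the search range on every step. As a hedged illustration, the sketch below counts loop iterations for an unsuccessful search over a half-open range `[l, h)`. `count_steps` is a hypothetical helper, not part of the notes, and it sets `h = mid` when discarding the upper half.

```python
import math

def count_steps(lst, x):
    """Binary-search lst for x and return how many loop iterations ran."""
    l, h = 0, len(lst)
    steps = 0
    while l < h:
        steps += 1
        mid = l + (h - l) // 2
        if lst[mid] == x:
            break
        if lst[mid] < x:
            l = mid + 1
        else:
            h = mid
    return steps

for e in range(5, 20, 5):
    n = 2**e
    # Searching for -1 always discards the upper half, so the range
    # shrinks n -> n/2 -> ... -> 1 -> 0, which takes log2(n) + 1 steps.
    print(n, count_steps(list(range(n)), -1), int(math.log2(n)) + 1)
```

Multiplying the list size by 1024 adds only ten iterations, which is why the plot over sizes `2**e` looks close to a straight line in `e`.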

5. Insertion sort

  • Task: to sort the values in a given list (array) in ascending order.

      import random
      lst = list(range(1000))
      random.shuffle(lst)
    
      plt.plot(lst, 'ro')
      plt.show()
    
      def insertion_sort(lst):
          for i in range(1,len(lst)): # outer loop runs n-1 times
              for j in range(i,0,-1): # inner loop runs at most 1, 2, 3, ..., n-1 times
                  if lst[j] < lst[j-1]: # strict comparison: equal elements are not swapped
                      lst[j-1], lst[j] = lst[j], lst[j-1]
                  else:
                      break
    
      insertion_sort(lst)
    
      plt.plot(lst, 'ro')
      plt.show()
    
      import timeit
      import random
      times = [timeit.timeit(stmt='insertion_sort(lst)',
                               setup='lst=list(range({})); random.shuffle(lst)'.format(size),
                               globals=globals(),
                               number=1)
                 for size in range(100, 5000, 250)]
    
      plt.plot(times, 'ro')
      plt.show()
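
The loop-count comments in `insertion_sort` suggest a worst case of 1 + 2 + ... + (n-1) = n(n-1)/2 inner-loop steps. A small sketch to check this on reverse-sorted input follows. `insertion_sort_swaps` is a hypothetical instrumented copy written for this illustration, not part of the notes.

```python
def insertion_sort_swaps(lst):
    """Sort lst in place and return the number of swaps performed."""
    swaps = 0
    for i in range(1, len(lst)):
        for j in range(i, 0, -1):
            if lst[j] < lst[j-1]:
                lst[j-1], lst[j] = lst[j], lst[j-1]
                swaps += 1
            else:
                break
    return swaps

n = 100
worst = list(range(n, 0, -1))          # reverse-sorted: every pair is out of order
print(insertion_sort_swaps(worst))     # n*(n-1)//2 = 4950 swaps
print(insertion_sort_swaps(worst))     # the list is sorted now: 0 swaps
```

The quadratic worst case matches the upward-curving shape one would expect from the timing plot, while already-sorted input needs no swaps at all.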
    

6. Bubble sort

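  • Another simple sorting algorithm is bubble sort. This algorithm repeatedly sweeps through the list and swaps adjacent elements that are out of order, so that after each pass the largest remaining value has "bubbled" to the end.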

      def bubble_sort(lst):
          pass
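
The stub above is left unfinished in the notes. One common way to fill it in is sketched below, assuming the textbook form of bubble sort with an early exit when a pass makes no swaps. The name `bubble_sort_sketch` is hypothetical and chosen so that it does not replace the stub.

```python
import random

def bubble_sort_sketch(lst):
    """Sort lst in place in ascending order using bubble sort."""
    n = len(lst)
    for end in range(n - 1, 0, -1):      # lst[end+1:] is already in final position
        swapped = False
        for j in range(end):
            if lst[j] > lst[j+1]:        # adjacent pair out of order: swap it
                lst[j], lst[j+1] = lst[j+1], lst[j]
                swapped = True
        if not swapped:                  # a pass with no swaps means lst is sorted
            break

values = list(range(20))
random.shuffle(values)
bubble_sort_sketch(values)
print(values == list(range(20)))   # True
```

Like insertion sort, this performs on the order of n² comparisons in the worst case, so it can be timed with the same `timeit` setup used above.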
    
Last updated on Wednesday, February 10, 2021
Published on Wednesday, February 3, 2021