CMU 15-112: Fundamentals of Programming and Computer Science
Class Notes: Searching and Sorting


  1. Searching
  2. Sorting

  1. Searching
    • Linear search
      • Basic idea: check each element in turn
      • Use: find an element in an unsorted list
      • Cost: O(N)
    • Binary search
      • Basic idea: in a sorted list, check middle element, eliminate half on each pass
      • Uses:
        • Find an element in a sorted list
        • Number-guessing game (eg: guess a random number between 1 and 1000)
        • Find a root (zero) of a function with bisection (adapted binary search)
      • Cost: O(logN)

  2. Sorting (selection, bubble, merge, builtin sorts)
    Sorting videos:
    • selectionsort
    • bubblesort
    • mergesort
    • sorting timing
    # sorting.py import time, random #################################################### # swap #################################################### def swap(a, i, j): (a[i], a[j]) = (a[j], a[i]) #################################################### # selectionSort #################################################### def selectionSort(a): n = len(a) for startIndex in range(n): minIndex = startIndex for i in range(startIndex+1, n): if (a[i] < a[minIndex]): minIndex = i swap(a, startIndex, minIndex) #################################################### # bubbleSort #################################################### def bubbleSort(a): n = len(a) end = n swapped = True while (swapped): swapped = False for i in range(1, end): if (a[i-1] > a[i]): swap(a, i-1, i) swapped = True end -= 1 #################################################### # mergeSort #################################################### def merge(a, start1, start2, end): index1 = start1 index2 = start2 length = end - start1 aux = [None] * length for i in range(length): if ((index1 == start2) or ((index2 != end) and (a[index1] > a[index2]))): aux[i] = a[index2] index2 += 1 else: aux[i] = a[index1] index1 += 1 for i in range(start1, end): a[i] = aux[i - start1] def mergeSort(a): n = len(a) step = 1 while (step < n): for start1 in range(0, n, 2*step): start2 = min(start1 + step, n) end = min(start1 + 2*step, n) merge(a, start1, start2, end) step *= 2 #################################################### # builtinSort (wrapped as a function) #################################################### def builtinSort(a): a.sort() #################################################### # testSort #################################################### def testSort(sortFn, n): a = [random.randint(0,2**31) for i in range(n)] sortedA = sorted(a) startTime = time.time() sortFn(a) endTime = time.time() elapsedTime = endTime - startTime assert(a == sortedA) print("%20s n=%d time=%6.3fs" % (sortFn.__name__, n, elapsedTime)) def testSorts(): n = 2**8 # use 2**8 for Brython, use 2**12 or larger for Python for sortFn in [selectionSort, bubbleSort, mergeSort, builtinSort]: testSort(sortFn, n) testSorts()

  3. Sorting Links

  4. SelectionSort vs MergeSort
    • Definitions
      • Selection Sort: repeatedly select largest remaining element and swap it into sorted position
      • Mergesort: sort blocks of 1's into 2's, 2's into 4's, etc, on each pass merging sorted blocks into sorted larger blocks
    • Analysis
      This is mostly informal, and all you need to know for a 112-level analysis of these algorithms. You can easily find much more detailed and rigorous proofs on the web.
      • selectionsort
        On the first pass, we need N compares and swaps (N-1 compares and 1 swap).
        On the second pass, we need only N-1 (since one value is already sorted).
        On the third pass, only N-2.
        So, total steps are about 1 + 2 + ... + (N-1) + N = N(N+1)/2 = O(N2).

      • mergesort
        On each pass, we need about 3N compares and copies (N compares, N copies down, N copies back).
        So total cost = (3N steps per pass) x (# of passes)
        After pass 0, we have sorted lists of size 20 (1)
        After pass 1, we have sorted lists of size 21 (2)
        After pass 2, we have sorted lists of size 22 (4)
        After pass k, we have sorted lists of size 2k
        So we need k passes, where N = 2k
        So # of passes = k = log2N
        Recall that total cost = (3N steps per pass) x (# of passes)
        So total cost = (3N)(log2N) = O(NlogN).
        Note: This is the theoretical best-possible Big-O for comparison-based sorting!