What's the fastest way to alphabetize your bookshelf? - Chand John
- 2,493,475 Views
- 5,078 Questions Answered
- TEDEd Animation
Let’s Begin…
You work at the college library. You’re in the middle of a quiet
afternoon when suddenly, a shipment of 1,280 books arrives. The books
are in a straight line, but they're all out of order, and the automatic
sorting system is broken. How can you sort the books quickly? Chand John
shows how, shedding light on how algorithms help librarians and search engines speedily sort information.
Additional Resources for you to Explore
Sorting happens all around us. We often don't even notice it. Have you ever visited a new city and searched for the nearest grocery store or restaurant? Have you ever sorted recipes alphabetically so you could find them? Have you ever stacked boxes so that the biggest, heaviest one is at the bottom and the lightest, smallest one is at the top? All of these situations involve sorting.
Most of the time, items are sorted so that we can search more quickly. Imagine, in the video, if we had not sorted the 1,280 books in order, and students had flooded into the library looking for books. Most of them might have been unable to find the books they were looking for, since they would have had no idea where to look! Sorting things in advance can make it much easier to search for things later.
Some items can be sorted in multiple ways, depending on what criteria you're using to compare them. You can make a list of the largest cities in the world ranked by land area, by population, or by average temperature. You can sort your favorite foods in terms of how much you like them, by how much they cost, or by how many calories they contain.
Even though there are many, many ways of sorting things, and many different situations in which things can be sorted, there are a few basic principles that guide the development of efficient sorting strategies.
Perhaps the most important principle is what computer scientists call divide-and-conquer. The divide-and-conquer approach involves taking a large problem, such as the problem we saw in the video of sorting 1,280 books, and dividing it up into smaller and smaller sub-problems, and then solving those sub-problems in easier ways, leading to a solution to the original big problem. This is precisely what QuickSort does.
Usually, when you first learn about sorting algorithms, you'll learn about the three simple, but slow algorithms: Bubble Sort, Insertion Sort, and Selection Sort. After that, you'll learn about some more efficient, but more complex algorithms like Merge Sort and QuickSort. Often, right before or right after learning about sorting algorithms, you'll learn about searching algorithms: in particular, the slower but very simple Linear Search algorithm and the much faster but somewhat more complex Binary Search algorithm.
Merge Sort, QuickSort, and Binary Search are often fast precisely because they all use a divide-and-conquer strategy.
Efficiency of algorithms is measured in something called “big-O notation.” If you study computer science, you'll hear a lot about “big-O notation” and “N-squared” and “N-log-N” algorithms. What people may not tell you is that they have an image in their mind of what these things mean, and in fact, if you watched the video, you've already seen these concepts in action.
If we place a single grain of salt on a sheet of paper each time we compare two books during the execution of an algorithm in the video, and arrange the grains of salt into geometric shapes, we'd see that Bubble Sort ends up making a big right triangle of dots, while Insertion Sort also makes a big right triangle (the same width as Bubble Sort's triangle but perhaps about half the height), and QuickSort makes more of a thin rectangle of dots, much much thinner but about the same width as the triangles for Bubble Sort and Insertion Sort. If you measured the area of the Bubble Sort triangle, it might be about double the area of the Insertion Sort triangle, while the QuickSort rectangle would be much, much smaller in area than either of those triangles. If the Bubble Sort triangle is about N grains of salt wide and about N grains of salt tall, its area would be base times height over 2, or roughly (N^2)/2 (“N-squared over 2”). The Insertion Sort triangle might instead be roughly N times N/2 over 2, or (N^2)/4 (“N-squared over 4”). The QuickSort rectangle, on the other hand, might be N grains of salt wide and just “log-base-2 of N” (often written as just “log N”) grains of salt tall, so the area of the rectangle would be base times height or N*(log N). This is the gist of what computer scientists are picturing when they talk about “N-squared” or “N-log-N” algorithms – they're describing the areas of these shapes associated with each algorithm's comparisons. “N-squared” sorting algorithms, which include any algorithm leading to an area of N^2 times or divided by any constant number like 2 or 4 (like Bubble Sort and Insertion Sort), tend to be quite slow when dealing with large numbers of items, while “N-log-N” sorting algorithms like QuickSort tend to be much faster.
In summary, strategies that take big problems and divide them into smaller sub-problems, solving each sub-problem in an effective way, tends to be much, much faster than trying to solve a big problem all at once.
Most of the time, items are sorted so that we can search more quickly. Imagine, in the video, if we had not sorted the 1,280 books in order, and students had flooded into the library looking for books. Most of them might have been unable to find the books they were looking for, since they would have had no idea where to look! Sorting things in advance can make it much easier to search for things later.
Some items can be sorted in multiple ways, depending on what criteria you're using to compare them. You can make a list of the largest cities in the world ranked by land area, by population, or by average temperature. You can sort your favorite foods in terms of how much you like them, by how much they cost, or by how many calories they contain.
Even though there are many, many ways of sorting things, and many different situations in which things can be sorted, there are a few basic principles that guide the development of efficient sorting strategies.
Perhaps the most important principle is what computer scientists call divide-and-conquer. The divide-and-conquer approach involves taking a large problem, such as the problem we saw in the video of sorting 1,280 books, and dividing it up into smaller and smaller sub-problems, and then solving those sub-problems in easier ways, leading to a solution to the original big problem. This is precisely what QuickSort does.
Usually, when you first learn about sorting algorithms, you'll learn about the three simple, but slow algorithms: Bubble Sort, Insertion Sort, and Selection Sort. After that, you'll learn about some more efficient, but more complex algorithms like Merge Sort and QuickSort. Often, right before or right after learning about sorting algorithms, you'll learn about searching algorithms: in particular, the slower but very simple Linear Search algorithm and the much faster but somewhat more complex Binary Search algorithm.
Merge Sort, QuickSort, and Binary Search are often fast precisely because they all use a divide-and-conquer strategy.
Efficiency of algorithms is measured in something called “big-O notation.” If you study computer science, you'll hear a lot about “big-O notation” and “N-squared” and “N-log-N” algorithms. What people may not tell you is that they have an image in their mind of what these things mean, and in fact, if you watched the video, you've already seen these concepts in action.
If we place a single grain of salt on a sheet of paper each time we compare two books during the execution of an algorithm in the video, and arrange the grains of salt into geometric shapes, we'd see that Bubble Sort ends up making a big right triangle of dots, while Insertion Sort also makes a big right triangle (the same width as Bubble Sort's triangle but perhaps about half the height), and QuickSort makes more of a thin rectangle of dots, much much thinner but about the same width as the triangles for Bubble Sort and Insertion Sort. If you measured the area of the Bubble Sort triangle, it might be about double the area of the Insertion Sort triangle, while the QuickSort rectangle would be much, much smaller in area than either of those triangles. If the Bubble Sort triangle is about N grains of salt wide and about N grains of salt tall, its area would be base times height over 2, or roughly (N^2)/2 (“N-squared over 2”). The Insertion Sort triangle might instead be roughly N times N/2 over 2, or (N^2)/4 (“N-squared over 4”). The QuickSort rectangle, on the other hand, might be N grains of salt wide and just “log-base-2 of N” (often written as just “log N”) grains of salt tall, so the area of the rectangle would be base times height or N*(log N). This is the gist of what computer scientists are picturing when they talk about “N-squared” or “N-log-N” algorithms – they're describing the areas of these shapes associated with each algorithm's comparisons. “N-squared” sorting algorithms, which include any algorithm leading to an area of N^2 times or divided by any constant number like 2 or 4 (like Bubble Sort and Insertion Sort), tend to be quite slow when dealing with large numbers of items, while “N-log-N” sorting algorithms like QuickSort tend to be much faster.
In summary, strategies that take big problems and divide them into smaller sub-problems, solving each sub-problem in an effective way, tends to be much, much faster than trying to solve a big problem all at once.
TED-Ed
Lesson Creator
New York, NY
Create and share a new lesson based on this one.