Decision trees are a popular way of classifying objects into different categories. They work by splitting the objects into smaller and smaller groups until each group is pure, i.e. contains only one type of object.

Decision trees can be used in many different fields, including marketing, finance, and sociology. In marketing, they can help decide how to market products based on which types of people are likely to want them. In finance, they can help estimate which investments are most likely to be profitable. In sociology, they can help identify social groups and patterns.

They work by starting at a root question and following a branching structure, one answer at a time, until you reach a leaf that holds one definitive answer. The same structure can model a comparison sort: each question compares two items, and each leaf is one final ordering. This article discusses how shallow a leaf in such a tree can be, that is, the smallest number of comparisons that can settle the order of a list.

## Definition of a comparison sort

A comparison sort is a method of ordering a list or array that learns about the order of its elements only by comparing them two at a time. It requires that any two elements can be compared (for example, with "less than or equal to"), and it uses the outcomes of those comparisons to decide each element’s position in the sorted list.

Comparison sorts are usually described by how the number of comparisons grows with the length n of the list. A quadratic sort, such as insertion sort or bubble sort, makes on the order of n^2 comparisons in the worst case.

Merge sort and heapsort do better, with on the order of n log n comparisons in the worst case. No comparison sort is linear in the worst case, although some finish in about n comparisons on favorable inputs: insertion sort, for instance, makes only n - 1 comparisons on a list that is already sorted.

A comparison sort can therefore be close to linear on its best inputs and quadratic or n log n on its worst, depending on the algorithm and on the order in which the input arrives.
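To make the idea of counting comparisons concrete, here is a minimal Python sketch. The function names `insertion_sort_count` and `merge_sort_count` are illustrative choices, not part of any library, and the two algorithms are textbook versions picked only as examples.

```python
from typing import List, Tuple


def insertion_sort_count(items: List[int]) -> Tuple[List[int], int]:
    """Return a sorted copy of items and the number of comparisons made."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            comparisons += 1              # one comparison of a[j-1] with a[j]
            if a[j - 1] <= a[j]:
                break                     # already in order, stop moving left
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
    return a, comparisons


def merge_sort_count(items: List[int]) -> Tuple[List[int], int]:
    """Return a sorted copy of items and the number of comparisons made."""
    if len(items) <= 1:
        return list(items), 0
    mid = len(items) // 2
    left, left_count = merge_sort_count(items[:mid])
    right, right_count = merge_sort_count(items[mid:])
    merged: List[int] = []
    comparisons = left_count + right_count
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1                  # one comparison per merge step
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])               # leftovers need no comparisons
    merged.extend(right[j:])
    return merged, comparisons


if __name__ == "__main__":
    for data in ([1, 2, 3, 4, 5], [5, 4, 3, 2, 1]):
        print(data, insertion_sort_count(data)[1], merge_sort_count(data)[1])
```

On the already-sorted list, insertion sort makes n - 1 = 4 comparisons; on the reversed list it makes n(n - 1)/2 = 10.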

## Complexity of a tree and length of leaves

Another interesting aspect of decision trees is their leaves. In a decision tree for a comparison sort, a leaf represents one possible outcome: a single ordering (permutation) of the input. The length of the path from the root to a leaf, called the depth of that leaf, is the number of comparisons the sort makes before it reaches that outcome.

The number of leaves in a decision tree for a comparison sort can be calculated in two ways. One way is to count the internal (comparison) nodes and add one. This holds whenever every internal node has exactly two children.

Another way is to count the possible outcomes. A list of n distinct elements can arrive in n! different orders, and each order must end at its own leaf, so the tree has at least n! leaves.
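The leaf count can be turned into numbers with a few lines of Python. This is a sketch under two assumptions: the input has n distinct elements, and the tree keeps only the leaves that some input can reach, so it has exactly n! of them. The function names are made up for this example.

```python
import math


def reachable_leaves(n: int) -> int:
    """Number of orderings of n distinct elements, one leaf per ordering."""
    return math.factorial(n)


def internal_nodes_if_full(n: int) -> int:
    """Internal nodes when every internal node has exactly two children."""
    return reachable_leaves(n) - 1


def min_worst_case_comparisons(n: int) -> int:
    """Smallest height of a binary tree with n! leaves: ceil(log2(n!)).

    (x - 1).bit_length() equals ceil(log2(x)) for any integer x >= 1,
    which avoids floating-point rounding.
    """
    return (math.factorial(n) - 1).bit_length()


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, reachable_leaves(n), min_worst_case_comparisons(n))
```

For n = 3 this gives 6 leaves, 5 internal nodes, and a height of at least 3.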

## The smallest possible depth of a leaf in a comparison sort tree

A comparison sort orders items based on a defined sort criterion. It typically starts from a list in some initial order and rearranges it using only the results of pairwise comparisons under that criterion.

For example, you may have a list of names in alphabetical order that must be reordered by age. The sort only ever asks whether one person is older than another, and each answer moves it one level down the decision tree.

In the decision tree for that sort, each internal node is one comparison and each leaf is one finished ordering. The depth of a leaf is the number of comparisons on the path from the root to it. For a list of n distinct items, the smallest possible depth of a leaf is n - 1: with fewer comparisons, at least one item has not been linked to the others, so its position cannot be known. Insertion sort reaches this bound on input that is already sorted, because it compares each item once with its left neighbor and moves nothing.
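One way to check the depth of the shallowest leaf is brute force: run a sort on every possible input order and record how many comparisons each run makes. The sketch below does this for insertion sort, chosen here only as an example; `count_comparisons` and `leaf_depths` are hypothetical helper names.

```python
from itertools import permutations
from typing import List, Sequence


def count_comparisons(items: Sequence[int]) -> int:
    """Comparisons made by a textbook insertion sort on items."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            comparisons += 1
            if a[j - 1] <= a[j]:
                break
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
    return comparisons


def leaf_depths(n: int) -> List[int]:
    """Depth of the leaf reached by each of the n! input orders, sorted."""
    return sorted(count_comparisons(p) for p in permutations(range(n)))


if __name__ == "__main__":
    for n in range(2, 7):
        depths = leaf_depths(n)
        print(n, min(depths), max(depths))
```

Each input order reaches exactly one leaf, so the smallest value in `leaf_depths(n)` is the depth of the shallowest leaf. For every n tried here it equals n - 1.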

## Examples of comparison sort trees

As previously mentioned, a decision tree can record how a comparison sort narrows down the possible orderings. It does this by asking a series of yes-or-no questions, each of which rules out some of the orderings that were still possible.

How to build a decision tree for a comparison sort is a common question. The answer is: you trace the algorithm from the top down.

The root is the first comparison the algorithm makes. Its two branches are the two possible outcomes, and each branch leads to the next comparison the algorithm would make given that outcome. A branch ends in a leaf when the comparisons along it leave only one possible ordering.

Because the tree is built from the root down, the comparisons near the root apply to every input, while those near the leaves apply only to the few orderings that are still possible. For three items a, b and c sorted by insertion sort, the root compares a with b, and the six leaves sit at depths 2 and 3.
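As an illustration, the tree can be built by tracing a sort on every input order and merging the traces. The sketch below assumes insertion sort and represents each node as a plain dictionary; the keys `"compare"` and `"leaf"` and all function names are choices made for this example.

```python
from itertools import permutations
from typing import Dict, List, Sequence, Tuple

Step = Tuple[Tuple[int, int], bool]   # ((left position, right position), in order?)


def trace(perm: Sequence[int]) -> Tuple[List[Step], Tuple[int, ...]]:
    """Run insertion sort on perm, logging each comparison by original position."""
    a = [(value, pos) for pos, value in enumerate(perm)]
    path: List[Step] = []
    for k in range(1, len(a)):
        j = k
        while j > 0:
            (left_value, left_pos), (right_value, right_pos) = a[j - 1], a[j]
            in_order = left_value <= right_value
            path.append(((left_pos, right_pos), in_order))
            if in_order:
                break
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
    # The leaf label lists the original positions in sorted order.
    return path, tuple(pos for _, pos in a)


def build_tree(n: int) -> Dict:
    """Merge the traces of all n! input orders into one decision tree."""
    root: Dict = {}
    for perm in permutations(range(n)):
        path, order = trace(perm)
        node = root
        for question, answer in path:
            node["compare"] = question           # same question for every input here
            node = node.setdefault(answer, {})   # follow the True or False branch
        node["leaf"] = order
    return root


def count_leaves(node: Dict) -> int:
    if "leaf" in node:
        return 1
    return sum(count_leaves(node[b]) for b in (True, False) if b in node)


def shallowest_leaf(node: Dict) -> int:
    if "leaf" in node:
        return 0
    return 1 + min(shallowest_leaf(node[b]) for b in (True, False) if b in node)


if __name__ == "__main__":
    tree = build_tree(3)
    print(tree["compare"], count_leaves(tree), shallowest_leaf(tree))
```

For three items the root compares positions 0 and 1, the tree has 6 leaves, and the shallowest leaf is at depth 2.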

## Applications of comparison sort trees

Comparison sort trees are useful beyond theory. They can be applied to real-world ordering tasks as well as to computer programming.

In computer programming, comparison sorts are used to order objects according to some criterion. For example, a comparison sort could order the planets by their distance from the sun. Mercury would come first, followed by the other inner planets Venus, Earth and Mars, then Jupiter and Saturn, then Uranus, and finally Neptune. The matching decision tree records which pairs of distances were compared to reach that ordering.

In more complex cases, comparison sorts can be applied to molecules, taxonomies of living things, or organizational structures, as long as there is a rule that says which of two items comes first. Any collection with such a rule can be ordered by a comparison sort and analyzed with its decision tree.

Before we conclude this article, let’s take a look at a few more uses of the comparison sort tree.

## Comparison sort trees are useful for organizing data sets with many variables

A comparison sort can order objects based on more than one variable. For example, you could compare people by last name, then by salary, and then by job title.

In data science, a comparison sort is used to order data objects according to some defined criteria. These objects can be anything: people, places, or things.

A decision tree is a way of visualizing a comparison sort. It has a root node, intermediate nodes that each divide into two branches, and leaf nodes that mark the end of the sort and are labeled with the resulting order.

Each internal node represents a yes-or-no question about two of the objects being sorted: does the first come before the second? The variables determine how that question is answered.

For example, in sorting people by last name, salary, and job title, there would be three variables. Two people are compared on last name first; salary is consulted only if the last names are equal, and job title only if the salaries are equal too.
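A multi-variable comparison can be written as a single yes-or-no style comparator. The sketch below uses Python's standard `functools.cmp_to_key`; the `Person` type, its fields (last name, salary, job title), the sample records, and the counting wrapper are all invented for illustration.

```python
from functools import cmp_to_key
from typing import Callable, Dict, NamedTuple, Tuple


class Person(NamedTuple):
    last_name: str
    salary: int
    job_title: str


def make_counting_comparator() -> Tuple[Callable[[Person, Person], int], Dict[str, int]]:
    """Return a comparator plus a counter of how often it was called."""
    calls = {"count": 0}

    def compare(a: Person, b: Person) -> int:
        calls["count"] += 1               # one node visited in the decision tree
        key_a = (a.last_name, a.salary, a.job_title)
        key_b = (b.last_name, b.salary, b.job_title)
        # Tuples compare field by field, so later fields only break ties.
        return (key_a > key_b) - (key_a < key_b)

    return compare, calls


# Made-up sample data.
people = [
    Person("Ng", 70000, "nurse"),
    Person("Adams", 90000, "doctor"),
    Person("Ng", 65000, "doctor"),
    Person("Adams", 90000, "analyst"),
]

compare, calls = make_counting_comparator()
ordered = sorted(people, key=cmp_to_key(compare))

if __name__ == "__main__":
    for person in ordered:
        print(person)
    print("comparisons:", calls["count"])
```

The counter shows how many questions the sort asked on this particular input, which is the depth of the leaf it reached.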

## Comparison sort trees are useful for visualizing data sets with many variables

A comparison sort is a method of ordering objects based on a series of variables or attributes.

For example, you can create a comparison sort for books based on the author, title, publisher, price, and genre. The books are ordered by author first; books by the same author are ordered by title, then by publisher, then by price, and finally by genre.
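There are two common ways to sort on several attributes, sketched below in Python. One sorts once on a compound key; the other makes one stable pass per attribute, starting with the least important. The `Book` type and the sample records are placeholders invented for this example.

```python
from operator import attrgetter
from typing import List, NamedTuple


class Book(NamedTuple):
    author: str
    title: str
    publisher: str
    price: float
    genre: str


FIELDS = ("author", "title", "publisher", "price", "genre")


def sort_by_compound_key(books: List[Book]) -> List[Book]:
    """One sort; ties on earlier fields are broken by later fields."""
    return sorted(books, key=attrgetter(*FIELDS))


def sort_by_stable_passes(books: List[Book]) -> List[Book]:
    """One stable sort per field, from least to most important."""
    result = list(books)
    for field in reversed(FIELDS):
        result.sort(key=attrgetter(field))   # list.sort is stable
    return result


# Made-up sample data.
books = [
    Book("Author B", "Title 2", "Press X", 12.0, "fiction"),
    Book("Author A", "Title 9", "Press Y", 30.0, "history"),
    Book("Author B", "Title 1", "Press X", 15.0, "fiction"),
    Book("Author A", "Title 9", "Press X", 25.0, "history"),
]

if __name__ == "__main__":
    for book in sort_by_compound_key(books):
        print(book)
```

Both functions return the same order. The stable-pass version works because a stable sort keeps the order from earlier passes for records that tie on the current field.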

Sorting data this way can make patterns easier to see. Records that share the leading variables end up next to each other, so you can see which variables most strongly shape the final order.

Decision trees are not themselves a kind of comparison sort; they are a model of one. They depict possible outcomes as branches off a root question. Each path from the root to a leaf represents one possible outcome following a series of questions.

The question at the root of the tree can be any comparison between two objects. Its answer is definitive, yes or no, and it determines which further questions are asked.

## Comparison sort trees are useful for identifying patterns in data sets with many variables

In this article, you learned how a decision tree models a comparison sort: each internal node is a comparison, each leaf is a final ordering, and the depth of a leaf is the number of comparisons needed to reach it.

Counting the leaves gives a limit on the worst case. A list of n distinct items can arrive in n! orders, so the tree has at least n! leaves, and a binary tree with that many leaves must have a leaf at depth log2(n!) or more, which grows like n log n.

The best case requires a different argument, because it concerns the shallowest leaf instead of the deepest one.

The smallest possible depth of a leaf in a comparison sort tree is n - 1, because fewer comparisons cannot connect all n items. Insertion sort reaches that depth on input that is already sorted.