What is Indexing

What is an index?

Basically an index on a table works like an index in a book (that's where the name came from):

Let's say you have a book about databases and you want to find some information about, say, storage. Without an index (assuming no other aid, such as a table of contents) you'd have to go through the pages one by one, until you found the topic (that's a full table scan). On the other hand, an index has a list of keywords, so you'd consult the index and see that storage is mentioned on pages 113-120,231 and 354. Then you could flip to those pages directly, without searching (that's a search with an index, somewhat faster).

Of course, how useful the index will be, depends on many things - a few examples, using the simile above:
  • If you had a book on databases and indexed the word "database", you'd see that it's mentioned on pages 1-59,61-290, and 292 to 400. In such case, the index is not much help and it might be faster to go through the pages one by one (in a database, this is "poor selectivity").
  • For a 10-page book, it makes no sense to make an index, as you may end up with a 10-page book prefixed by a 5-page index, which is just silly - just scan the 10 pages and be done with it.
  • The index also needs to be useful - there's generally no point to index e.g. the frequency of the letter "L" per page.
In  technical term an index is a data structure (most commonly a B- tree) that stores the values for a specific column in a table. An index is created on a column of a table. So, the key points to remember are that an index consists of column values from one table, and that those values are stored in a data structure. The index is a data structure – remember that.

Let’s start out our tutorial and explanation of why you would need a database index by going through a very simple example. Suppose that we have a database table called Employee with three columns – Employee_Name, Employee_Age, and Employee_Address. Assume that the Employee table has thousands of rows.

Now, let’s say that we want to run a query to find all the details of any employees who are named ‘Jesus’? So, we decide to run a simple query like this:


SELECT * FROM Employee WHERE Employee_Name = 'Jesus'

What would happen without an index on the table?

Once we run that query, what exactly goes on behind the scenes to find employees who are named Jesus? Well, the database software would literally have to look at every single row in the Employee table to see if the Employee_Name for that row is ‘Jesus’. And, because we want every row with the name ‘Jesus’ inside it, we can not just stop looking once we find just one row with the name ‘Jesus’, because there could be other rows with the name Jesus. So, every row up until the last row must be searched` – which means thousands of rows in this scenario will have to be examined by the database to find the rows with the name ‘Jesus’. This is what is called a full table scan.

How a database index can help performance?

You might be thinking that doing a full table scan sounds inefficient for something so simple – shouldn’t software be smarter? It’s almost like looking through the entire table with the human eye – very slow and not at all sleek. But, as you probably guessed by the title of this article, this is where indexes can help a great deal. The whole point of having an index is to speed up search queries by essentially cutting down the number of records/rows in a table that need to be examined.

What kind of data structure is an index?

B- trees are the most commonly used data structures for indexes. The reason B- trees are the most popular data structure for indexes is due to the fact that they are time efficient – because look-ups, deletions, and insertions can all be done in logarithmic time. And, another major reason B- trees are more commonly used is because the data that is stored inside the B- tree can be sorted. The RDBMS typically determines which data structure is actually used for an index. But, in some scenarios with certain RDBMS’s, you can actually specify which data structure you want your database to use when you create the index itself.

How does an index improve performance?

Because an index is basically a data structure that is used to store column values, looking up those values becomes much faster. And, if an index is using the most commonly used data structure type – a B- tree – then the data structure is also sorted. Having the column values be sorted can be a major performance enhancement – read on to find out why. Let’s say that we create a B- tree index on the Employee_Name column This means that when we search for employees named “Jesus” using the SQL we showed earlier, then the entire Employee table does not have to be searched to find employees named “Jesus”. Instead, the database will use the index to find employees named Jesus, because the index will presumably be sorted alphabetically by the Employee’s name. And, because it is sorted, it means searching for a name is a lot faster because all names starting with a “J” will be right next to each other in the index! It’s also important to note that the index also stores pointers to the table row so that other column values can be retrieved.

How to create an index in MYSQL/SQL? 

Here’s what the actual SQL would look like to create an index on the Employee_Name column from our example earlier:
CREATE INDEX name_index ON Employee (Employee_Name);

How to create a multi-column index in SQL?

CREATE INDEX name_index ON Employee (Employee_Name, Employee_Age);


If you like this post leave a comment from comment box.

Chears :)
Happy Coding..

Post a Comment

Previous Post Next Post