Indexing and its Types

Indexing

  • Indexing is defined as a data structure technique which allows you to quickly retrieve records from a database file.
  • It is based on the same attributes on which the indices has been done.
  • An index
    • Takes a search key as input
    • Efficiently returns a collection of matching records.
  • An Index is a small table having only two columns.
    • The first column comprises a copy of the primary or candidate key of a table.
    • The second column contains a set of pointers for holding the address of the disk block where that specific key value stored.
  • There are mainly 4 types of indexing methods
    • Primary Indexing
    • Secondary Indexing
    • Cluster Indexing
    • Multilevel Indexing
Types of indexing

Primary Indexing

  • Primary Index is an ordered file which has fixed length size with two fields.
  • The first field is the same a primary key and second field is pointed to that specific data block.
  • In the primary index, there is always one to one relationship between the entries in the index table.
  • Primary Indexing is further divided into two types.
    • Dense Index
    • Sparse Index

Dense Index

  • In a dense index, a record is created for every search key valued in the database.
  • Dense indexing helps you to search faster but needs more space to store index records.
  • In dense indexing, records contain search key value and points to the real record on the disk.

Sparse Index

  • Sparse index is an index record that appears for only some of the values in the file.
  • Sparse Index helps you to resolve the issues of dense indexing.
  • In sparse indexing technique, a range of index columns stores the same data block address, and when data needs to be retrieved, this block address will be fetched.
  • Sparse indexing method stores index records for only some search key values.
  • It needs less space, less maintenance overhead for insertion, and deletions but it is slower compared to the dense index for locating records.
Sparse Index

Secondary Indexing

  • The secondary index can be generated by a field which has a unique value for each record.
  • It is also known as a non-clustering index.
  • This two-level database indexing technique is used to reduce the mapping size of the first level.
  • For the first level, a large range of numbers is selected, because of this mapping size always remains small.

Example

  • In a bank account database, data is stored sequentially by Account_No,  we may want to find all accounts in of a specific branch of some bank.
  • In this case, we can have a secondary index for every search key.
  • Index record is a record pointing to a bucket that contains pointers to all the records with their specific search key value.
Secondary Index

Cluster Indexing

  • In a clustered index, records themselves are stored in the index and not pointers.
  • Sometimes the index is created on non-primary key columns which might not be unique for each record. In such a situation, you can group two or more columns to get the unique values and create an index which is called clustered Index.
  • This also helps you to identify the record faster.

Example

  • Consider a company recruited many employees in various departments. In this case, clustering index should be created for all employees who belong to the same dept.
  • In a single cluster it is considered that an index points to the cluster as a whole.

Multilevel Indexing

  • Multilevel Indexing is created when a primary index does not fit in memory.
  • In this type of indexing method, you can reduce the number of disk accesses to short any record and kept on a disk as a sequential file and create a sparse base on that file.

B-Tree Indexing

  • B-tree index is the widely used data structures for indexing.
  • It is a multilevel index format technique which is balanced binary search trees.
  • All leaf nodes of the B tree signify actual data pointers.
  • In B-Tree indexing, all leaf nodes are interlinked with a link list, which allows a it to support both random and sequential access.

Properties of B-Tree

  • Every path from the root to leaf are mostly of equal length.
  • Every node which is not a root or a leaf has between n/2 and n children, where n is the degree of B-tree.
B-tree Index

Advantages of Indexing

  • It helps you to reduce the total number of I/O operations needed to retrieve the data, so you don’t need to access a row in the database from an index structure.
  • Offers faster search and retrieval of data to users.
  • Indexing also helps you to reduce table space as you don’t need to link to a row in a table, as there is no need to store the ROWID in the Index. Thus you will able to reduce the table space.
  • You don’t need to sort data in the lead nodes as the value of the primary key classifies it.

Disadvantages of Indexing

  • To perform the indexing database management system, you need a primary key on the table with a unique value.
  • You are not allowed to partition an index-organized table.
  • SQL indexing decrease performance in INSERT, DELETE, and UPDATE query.