
What Are We Looking For?

Author
Abstract

A Comparative Review of Clustering Algorithms and Cluster Quality Indices For Sequence Analysis
Sequence Analysis (SA) aims to provide a holistic view of life trajectories by creating a typology. Technically, it involves three steps: comparing the trajectories using a dissimilarity measure, grouping similar pathways into types using a clustering algorithm, and evaluating the quality of the resulting typology using a cluster quality index.
This article aims to provide a comprehensive overview of the different clustering algorithms and cluster quality indices and to draw guidelines on their choice. The different methods are evaluated using simulations designed to reflect the different kinds of research questions addressed with sequence analysis as well as common longitudinal data characteristics. The results highlight the good performance of Consensus Clustering with Ward linkage and Partition Around Medoids, depending on the research aims and data characteristics. They also highlight the need to further define the required level of detail of a typology before choosing a cluster quality index. While the Calinski-Harabasz Index with squared distances can be advocated for when aiming to create a parsimonious typology, the Hubert C index is recommended otherwise.
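The three steps described in the abstract can be illustrated with a minimal sketch on toy data. This is not the authors' code: the function names and the state sequences (E = education, W = work, U = unemployment) are hypothetical, the Hamming distance is an assumed stand-in for whichever dissimilarity measure a study would use, and the exhaustive medoid search only approximates what Partition Around Medoids does with its swap heuristic on realistic data sizes. The Hubert C index is computed from its usual definition, where lower values indicate a better partition.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def distance_matrix(seqs):
    """Step 1: pairwise dissimilarities (Hamming distance is an assumption)."""
    n = len(seqs)
    return [[hamming(seqs[i], seqs[j]) for j in range(n)] for i in range(n)]

def total_cost(d, medoids):
    """Sum of distances from every sequence to its closest medoid."""
    return sum(min(d[i][m] for m in medoids) for i in range(len(d)))

def k_medoids(d, k):
    """Step 2: exhaustive medoid search, feasible for toy data only.

    PAM minimises the same cost but uses a build/swap heuristic instead
    of enumerating every candidate set of medoids.
    """
    n = len(d)
    best = min(combinations(range(n), k), key=lambda ms: total_cost(d, ms))
    labels = [min(best, key=lambda m: d[i][m]) for i in range(n)]
    return best, labels

def c_index(d, labels):
    """Step 3: Hubert C index (0 is best, 1 is worst).

    Assumes at least one within-cluster pair and non-constant distances.
    """
    pairs = [(d[i][j], labels[i] == labels[j])
             for i, j in combinations(range(len(d)), 2)]
    within = [dist for dist, same in pairs if same]
    ordered = sorted(dist for dist, _ in pairs)
    n_w = len(within)
    s_w = sum(within)                # observed within-cluster sum
    s_min = sum(ordered[:n_w])       # smallest possible sum for n_w pairs
    s_max = sum(ordered[-n_w:])      # largest possible sum for n_w pairs
    return (s_w - s_min) / (s_max - s_min)

# Hypothetical toy trajectories: three education-to-work, three unemployment.
seqs = ["EEEWWW", "EEWWWW", "EEEEWW", "UUUUUU", "UUUUUW", "WUUUUU"]
d = distance_matrix(seqs)
medoids, labels = k_medoids(d, 2)
quality = c_index(d, labels)
print(medoids, labels, quality)
```

In this toy case the two groups are perfectly separated, so every within-cluster distance is smaller than every between-cluster distance and the C index reaches its minimum. Real trajectories rarely separate this cleanly, which is where the choice of algorithm and quality index discussed in the paper matters.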

Year of Publication
2026
Journal
LIVES Working Papers
Volume
108
Start Page
1-74
Number of Pages
74
Date Published
02/2026
ISSN Number
2296-1658
URL
http://dx.doi.org/10.12682/lives.2296-1658.2026.108
DOI
10.12682/lives.2296-1658.2026.108
Keywords