C++ interface to fast hierarchical clustering algorithms
========================================================

This is a simplified C++ interface to fast implementations of hierarchical
clustering by Daniel Müllner. The original library with interfaces to R
and Python is described in:

  Daniel Müllner: "fastcluster: Fast Hierarchical, Agglomerative Clustering
  Routines for R and Python." Journal of Statistical Software 53 (2013),
  no. 9, pp. 1–18, http://www.jstatsoft.org/v53/i09/


Usage of the library
--------------------

To use the library, the following source files are needed:

  fastcluster_dm.cpp, fastcluster_R_dm.cpp
     original code by Daniel Müllner;
     these are included by fastcluster.cpp via #include, and therefore
     need not be compiled to object code

  fastcluster.[h|cpp]
     simplified C++ interface;
     fastcluster.cpp is the only file that must be compiled

The library provides the clustering function *hclust_fast*, which
creates the dendrogram information in the encoding used by the
R function *hclust*. For a description of the parameters, see fastcluster.h.
Its parameter *method* can be one of

  HCLUST_METHOD_SINGLE
     single link with the minimum spanning tree algorithm (Rohlf, 1973)

  HCLUST_METHOD_COMPLETE
     complete link with the nearest-neighbor-chain algorithm (Murtagh, 1984)

  HCLUST_METHOD_AVERAGE
     average link with the nearest-neighbor-chain algorithm (Murtagh, 1984)

  HCLUST_METHOD_MEDIAN
     median link with the generic algorithm (Müllner, 2011)

For splitting the dendrogram into clusters, the two functions *cutree_k*
and *cutree_cdist* are provided.
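
To make the dendrogram encoding and the idea of cutting it into k clusters
more concrete, here is a minimal, self-contained sketch. It is not the
library's *cutree_k*: the function name cut_into_k and the column-wise layout
of the merge array (step s joins merge[s] and merge[(n-1)+s]) are assumptions
made for illustration, so check fastcluster.h for the actual layout. The sign
convention follows R's *hclust*: a negative entry -j denotes the single
observation j, a positive entry t denotes the cluster formed in step t (both
counted from 1).

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Cut an R-style dendrogram into k clusters by replaying the first n-k merges.
// ASSUMED layout (verify against fastcluster.h): merge holds 2*(n-1) entries,
// and step s (0-based) joins merge[s] with merge[(n-1)+s].
std::vector<int> cut_into_k(int n, const std::vector<int>& merge, int k) {
    assert(n >= 1 && k >= 1 && k <= n);
    assert(static_cast<int>(merge.size()) == 2 * (n - 1));

    // Union-find forest over the n observations (0-based internally).
    std::vector<int> parent(n);
    std::iota(parent.begin(), parent.end(), 0);
    auto find = [&parent](int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    };

    // step_rep[s] is one observation inside the cluster created in step s.
    std::vector<int> step_rep(n > 1 ? n - 1 : 0, 0);
    auto member = [&](int code) {
        // negative: observation -code (1-based); positive: earlier merge step
        return code < 0 ? -code - 1 : step_rep[code - 1];
    };

    for (int s = 0; s < n - k; ++s) {
        const int a = find(member(merge[s]));
        const int b = find(member(merge[(n - 1) + s]));
        parent[b] = a;
        step_rep[s] = a;
    }

    // Number the remaining components 0, 1, ... in order of first appearance.
    std::vector<int> labels(n), root_label(n, -1);
    int next = 0;
    for (int i = 0; i < n; ++i) {
        const int r = find(i);
        if (root_label[r] < 0) root_label[r] = next++;
        labels[i] = root_label[r];
    }
    return labels;
}
```

For four observations merged as (1,2), then (3,4), then both pairs, the
assumed layout gives merge = {-1, -3, 1, -2, -4, 2}, and cutting at k = 2
separates the two pairs.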

Note that output parameters must be allocated beforehand, e.g.
  int* merge = new int[2*(npoints-1)];
For a complete usage example, see lines 135-142 of demo.cpp.
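
As an alternative to raw new[], the buffers can be held in std::vector
objects and passed on via data(). The sketch below only prepares the data and
does not link against the library. The size of merge is taken from the note
above. Everything else is an assumption to be confirmed in fastcluster.h: the
condensed (upper-triangle) distance layout, a height buffer with one entry
per merge step, and a labels buffer with one entry per point. The names
Point, condensed_distances and HclustOutput are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { double x, y; };  // hypothetical input type

// Distances for all pairs (i, j) with i < j, stored row by row.
// ASSUMPTION: this condensed layout is what hclust_fast expects.
std::vector<double> condensed_distances(const std::vector<Point>& pts) {
    const std::size_t n = pts.size();
    std::vector<double> d;
    d.reserve(n < 2 ? 0 : n * (n - 1) / 2);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = i + 1; j < n; ++j)
            d.push_back(std::hypot(pts[i].x - pts[j].x, pts[i].y - pts[j].y));
    return d;
}

// Output buffers, allocated before the clustering call.
struct HclustOutput {
    std::vector<int> merge;      // 2*(npoints-1) entries, as in the note above
    std::vector<double> height;  // ASSUMPTION: one height per merge step
    std::vector<int> labels;     // ASSUMPTION: one cluster label per point
    explicit HclustOutput(int npoints)
        : merge(2 * (npoints - 1)), height(npoints - 1), labels(npoints) {
        assert(npoints >= 1);
    }
};

// With fastcluster.cpp compiled in, the call might then look like this
// (argument order is an assumption, see fastcluster.h):
//   hclust_fast(npoints, dist.data(), HCLUST_METHOD_SINGLE,
//               out.merge.data(), out.height.data());
```

Using vectors keeps the sizes in one place and releases the memory
automatically, which the raw new[] variant leaves to the caller.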


Demonstration program
---------------------

A simple demo is implemented in demo.cpp, which can be compiled and run with

  make
  ./hclust-demo -m complete lines.csv

It creates two clusters of line segments such that the angles between
line segments of different clusters have a maximum (cosine) dissimilarity.
To visualize the result, plotresult.r can be used as follows
(requires R <https://r-project.org> to be installed):

  ./hclust-demo -m complete lines.csv | Rscript plotresult.r
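
The text above describes the demo's distance between two segments only as a
(cosine) dissimilarity of their angle. As a sketch of one plausible
definition, and not necessarily what demo.cpp computes, the value
1 - cos^2(theta) is 0 for parallel segments regardless of their direction and
1 for perpendicular ones. The Segment struct and the function name are
hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Segment { double x1, y1, x2, y2; };  // hypothetical: two end points

// Dissimilarity 1 - cos^2(theta) between the directions of two segments.
// ASSUMPTION: one possible "cosine dissimilarity"; see demo.cpp for the
// measure the demo actually uses.
double angle_dissimilarity(const Segment& a, const Segment& b) {
    const double ax = a.x2 - a.x1, ay = a.y2 - a.y1;
    const double bx = b.x2 - b.x1, by = b.y2 - b.y1;
    const double dot = ax * bx + ay * by;
    const double norm2 = (ax * ax + ay * ay) * (bx * bx + by * by);
    assert(norm2 > 0.0);  // zero-length segments have no direction
    const double d = 1.0 - (dot * dot) / norm2;
    return d < 0.0 ? 0.0 : d;  // clamp tiny negative rounding errors
}
```

Squaring the cosine makes the measure independent of which end point of a
segment is listed first. Values like these, computed for every pair of
segments, would form the distance input to *hclust_fast*.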


Authors & Copyright
-------------------

Daniel Müllner, 2011, <http://danifold.net>
Christoph Dalitz, 2018, <http://www.hsnr.de/ipattern/>


License
-------

This code is provided under a BSD-style license.
See the file LICENSE for details.