Referencias
Alonso C., J. C. (2020). Herramientas del Business Analitycs en R : Análisis de Componentes Principales para resumir variables. Economics Lecture Notes, 10, 1–32.
Alonso, J. C. (2002). A new accelerator for the EM ALgorithm (p. 61) [Master’s thesis]. Iowa State University; Thesis (Ms)–Iowa State University, 2002.
Alonso, J. C. (2022). Empezando a transformar bases de datos con R y dplyr. Universidad Icesi. https://doi.org/10.18046/EUI/bda.h.2
Alonso, J. C. (2024). Introducción al modelo clásico de regresión para científico de datos en R. Universidad Icesi. https://doi.org/10.18046/EUI/bda.h.4
Alonso, J. C., & Arboleda, A. M. (2025). Introducción al análisis de canastas de compra para analytics translators y científicos de datos (empleando R). Universidad Icesi.
Alonso, J. C., & Hoyos, C. C. (2025a). Una introducción a los modelos de clasificación empleando R. Universidad Icesi. https://doi.org/10.18046/EUI/bda.h.5
Alonso, J. C., & Hoyos, C. C. (2025b). Una introducción a los modelos de clasificación empleando R. Universidad Icesi. https://doi.org/10.18046/EUI/bda.h.5
Alonso, J. C., & Largo, M. F. (2023). Empezando a visualizar datos con R y ggplot2. (2. ed.). Universidad Icesi. https://doi.org/10.18046/EUI/bda.h.3.2
Alonso, J. C., & Ocampo, M. P. (2022). Empezando a usaR: Una guía paso a paso. Universidad Icesi. https://doi.org/10.18046/EUI/bda.h.1
Arthur, D., Vassilvitskii, S., et al. (2007). K-means++: The advantages of careful seeding. Soda, 7, 1027–1035.
Baker, F. B., & Hubert, L. J. (1976). A graph-theoretic approach to goodness-of-fit in complete-link hierarchical clustering. Journal of the American Statistical Association, 71(356), 870–878.
Ball, G. H., & Hall, D. J. (1965). ISODATA, a novel method of data analysis and pattern classification. Stanford research inst Menlo Park CA.
Beale, E. (1969). Euclidean cluster analysis. Scientific Control Systems Limited.
Bezdek, J. C., & Pal, N. R. (1998). Some new indexes of cluster validity. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 28(3), 301–315.
Brock, G., Pihur, V., Datta, S., & Datta, S. (2008). clValid: An R package for cluster validation. Journal of Statistical Software, 25(4), 1–22. https://www.jstatsoft.org/v25/i04/
Caliński, T., & Harabasz, J. (1974). A dendrite method for cluster analysis. Communications in Statistics-Theory and Methods, 3(1), 1–27.
Charrad, M., Ghazzali, N., Boiteau, V., & Niknafs, A. (2014). NbClust: An R package for determining the relevant number of clusters in a data set. Journal of Statistical Software, 61(6), 1–36. http://www.jstatsoft.org/v61/i06/
Chen, Y., Ruys, W., & Biros, G. (2020). KNN-DBSCAN: A DBSCAN in high dimensions. arXiv Preprint arXiv:2009.04552.
Davies, D. L., & Bouldin, D. W. (1979). A cluster separation measure. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2, 224–227.
de Vries, A., & Ripley, B. D. (2024). Ggdendro: Create dendrograms and tree diagrams using ’ggplot2’. https://CRAN.R-project.org/package=ggdendro
Dimitriadou, E., Dolničar, S., & Weingessel, A. (2002). An examination of indexes for determining the number of clusters in binary data sets. Psychometrika, 67(1), 137–159.
Duda, R. O., Hart, P. E., et al. (1973). Pattern classification and scene analysis (Vol. 3). Wiley New York.
Dunn, J. C. (1974). Well-separated clusters and optimal fuzzy partitions. Journal of Cybernetics, 4(1), 95–104.
Edwards, A. W., & Cavalli-Sforza, L. L. (1965). A method for cluster analysis. Biometrics, 362–375.
Ester, M., Kriegel, H.-P., Sander, J., Xu, X., et al. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. Kdd, 96, 226–231.
Fox, J., & Weisberg, S. (2019). An R companion to applied regression (Third). Sage. https://socialsciences.mcmaster.ca/jfox/Books/Companion/
Frey, T., & Van Groenewoud, H. (1972). A cluster analysis of the D2 matrix of white spruce stands in saskatchewan based on the maximum-minimum principle. The Journal of Ecology, 873–886.
Friedman, H. P., & Rubin, J. (1967). On some invariant criteria for grouping data. Journal of the American Statistical Association, 62(320), 1159–1178.
Fukunaga, K., & Koontz, W. L. (1970). A criterion and an algorithm for grouping data. IEEE Transactions on Computers, 100(10), 917–923.
Godichon-Baggioni, A., & Surendran, S. (2023). Kmedians: K-medians. https://CRAN.R-project.org/package=Kmedians
Gordon, A. (1999). Cluster description. University of St. Andrews Scotland.
Hahsler, M., Piekenbrock, M., & Doran, D. (2019). dbscan: Fast density-based clustering with R. Journal of Statistical Software, 91(1), 1–30. https://doi.org/10.18637/jss.v091.i01
Haldiki, M., Batistakis, Y., & Vazirgiannis, M. (2002). Cluster validity methods. SIGMOD, 31, 40–45.
Halkidi, M., & Vazirgiannis, M. (2001). Clustering validity assessment: Finding the optimal partitioning of a data set. Proceedings 2001 IEEE International Conference on Data Mining, 187–194.
Hartigan, J. A. (1975). Clustering algorithms. John Wiley & Sons, Inc.
Hill, R. S. (1980). A stopping rule for partitioning dendrograms. Botanical Gazette, 141(3), 321–324.
Hubert, L. J., & Levin, J. R. (1976). A general statistical framework for assessing categorical clustering in free recall. Psychological Bulletin, 83(6), 1072.
Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2(1), 193–218.
Kassambara, A. (2017). Practical guide to cluster analysis in r: Unsupervised machine learning (Vol. 1). Sthda.
Kassambara, A., & Mundt, F. (2020). Factoextra: Extract and visualize the results of multivariate data analyses. https://CRAN.R-project.org/package=factoextra
Kaufman, L., & Rousseeuw, P. J. (2009). Finding groups in data: An introduction to cluster analysis (Vol. 344). John Wiley & Sons.
Kraemer, H. C. (2004). Biserial correlation. Encyclopedia of Statistical Sciences, 1.
Lance, G. N., & Williams, W. T. (1967). Mixed-data classificatory programs i - agglomerative systems. Australian Computer Journal, 1(1), 15–20.
Lebart, L., Morineau, A., & Piron, M. (1995). Statistique exploratoire multidimensionnelle (Vol. 3). Dunod Paris.
Lüdecke, D., Ben-Shachar, M. S., Patil, I., & Makowski, D. (2020). Extracting, computing and exploring the parameters of statistical models using R. Journal of Open Source Software, 5(53), 2445. https://doi.org/10.21105/joss.02445
Maechler, M., Rousseeuw, P., Struyf, A., Hubert, M., & Hornik, K. (2022). Cluster: Cluster analysis basics and extensions. https://CRAN.R-project.org/package=cluster
Marriott, F. (1971). Practical problems in a method of cluster analysis. Biometrics, 501–514.
McClain, J. O., & Rao, V. R. (1975). Clustisz: A program to test for the quality of clustering of a set of objects. Journal of Marketing Research, 456–460.
Milligan, G. W. (1980). An examination of the effect of six types of error perturbation on fifteen clustering algorithms. Psychometrika, 45(3), 325–342.
Milligan, G. W. (1981). A monte carlo study of thirty internal criterion measures for cluster analysis. Psychometrika, 46(2), 187–199.
Milligan, G. W., & Cooper, M. C. (1985). An examination of procedures for determining the number of clusters in a data set. Psychometrika, 50(2), 159–179.
Murtagh, F., & Legendre, P. (2014). Ward’s hierarchical agglomerative clustering method: Which algorithms implement ward’s criterion? Journal of Classification, 31(3), 274–295.
Orlóci, L. (1967). An agglomerative method for classification of plant communities. The Journal of Ecology, 193–206.
R Core Team. (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/
R Core Team. (2023). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/
Ratkowsky, D., & Lance, G. (1978). Criterion for determining the number of groups in a classification.
Rdusseeun, L., & Kaufman, P. (1987). Clustering by means of medoids. Proceedings of the Statistical Data Analysis Based on the L1 Norm Conference, Neuchatel, Switzerland, 31.
Rohlf, F. J. (1974). Methods of comparing classifications. Annual Review of Ecology and Systematics, 5(1), 101–113.
Rousseeuw, P. J. (1987). Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20, 53–65.
Schloerke, B., Cook, D., Larmarange, J., Briatte, F., Marbach, M., Thoen, E., Elberg, A., & Crowley, J. (2023). GGally: Extension to ’ggplot2’. https://CRAN.R-project.org/package=GGally
Scott, A. J., & Symons, M. J. (1971). Clustering methods based on likelihood ratio criteria. Biometrics, 387–397.
Scrucca, L., Fraley, C., Murphy, T. B., & Raftery, A. E. (2023). Model-based clustering, classification, and density estimation using mclust in R. Chapman; Hall/CRC. https://doi.org/10.1201/9781003277965
Shah, A. (2021). Credit card customer data. https://www.kaggle.com/datasets/aryashah2k/credit-card-customer-data?select=Credit+Card+Customer+Data.csv
Tibshirani, R., Walther, G., & Hastie, T. (2001). Estimating the number of clusters in a data set via the gap statistic. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63(2), 411–423.
Ward Jr, J. H. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58(301), 236–244.
Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org
Wickham, H., François, R., Henry, L., & Müller, K. (2022). Dplyr: A grammar of data manipulation. https://CRAN.R-project.org/package=dplyr
You, K. (2023). Maotai: Tools for matrix algebra, optimization and inference. https://CRAN.R-project.org/package=maotai
Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8(3), 338–353. https://doi.org/https://doi.org/10.1016/S0019-9958(65)90241-X