Электронная книга: Mohamed Nadif «Co-Clustering. Models, Algorithms and Applications»

Co-Clustering. Models, Algorithms and Applications

Cluster or co-cluster analyses are important tools in a variety of scientific areas. The introduction of this book presents a state of the art of already well-established, as well as more recent methods of co-clustering. The authors mainly deal with the two-mode partitioning under different approaches, but pay particular attention to a probabilistic approach. Chapter 1 concerns clustering in general and the model-based clustering in particular. The authors briefly review the classical clustering methods and focus on the mixture model. They present and discuss the use of different mixtures adapted to different types of data. The algorithms used are described and related works with different classical methods are presented and commented upon. This chapter is useful in tackling the problem of co-clustering under the mixture approach. Chapter 2 is devoted to the latent block model proposed in the mixture approach context. The authors discuss this model in detail and present its interest regarding co-clustering. Various algorithms are presented in a general context. Chapter 3 focuses on binary and categorical data. It presents, in detail, the appropriated latent block mixture models. Variants of these models and algorithms are presented and illustrated using examples. Chapter 4 focuses on contingency data. Mutual information, phi-squared and model-based co-clustering are studied. Models, algorithms and connections among different approaches are described and illustrated. Chapter 5 presents the case of continuous data. In the same way, the different approaches used in the previous chapters are extended to this situation. Contents 1. Cluster Analysis. 2. Model-Based Co-Clustering. 3. Co-Clustering of Binary and Categorical Data. 4. Co-Clustering of Contingency Tables. 5. Co-Clustering of Continuous Data. About the Authors Gérard Govaert is Professor at the University of Technology of Compiègne, France. He is also a member of the CNRS Laboratory Heudiasyc (Heuristic and diagnostic of complex systems). His research interests include latent structure modeling, model selection, model-based cluster analysis, block clustering and statistical pattern recognition. He is one of the authors of the MIXMOD (MIXtureMODelling) software. Mohamed Nadif is Professor at the University of Paris-Descartes, France, where he is a member of LIPADE (Paris Descartes computer science laboratory) in the Mathematics and Computer Sciencedepartment. His research interests include machine learning, data mining, model-based cluster analysis, co-clustering, factorization and data analysis. Cluster Analysis is an important tool in a variety of scientific areas. Chapter 1 briefly presents a state of the art of already well-established aswell more recent methods. The hierarchical, partitioning and fuzzy approaches will be discussed amongst others. The authors review the difficulty of these classical methods in tackling the high dimensionality, sparsity and scalability. Chapter 2 discusses the interests of coclustering, presenting different approaches and defining a co-cluster. The authors focus on co-clustering as a simultaneous clustering and discuss the cases of binary, continuous and co-occurrence data. The criteria and algorithms are described and illustrated on simulated and real data. Chapter 3 considers co-clustering as a model-based co-clustering. A latent block model is defined for different kinds of data. The estimation of parameters and co-clustering is tackled under two approaches: maximum likelihood and classification maximum likelihood. Hard and soft algorithms are described and applied on simulated and real data. Chapter 4 considers co-clustering as a matrix approximation. The trifactorization approach is considered and algorithms based on update rules are described. Links with numerical and probabi

Издательство: "John Wiley&Sons Limited"

ISBN: 9781118649497

электронная книга

Купить за 7794.92 руб и скачать на Litres

Другие книги схожей тематики:

АвторКнигаОписаниеГодЦенаТип книги
Clarisse DhaenensMetaheuristics for Big DataBig Data is a new field, with many technological challenges to be understood in order to use it to its full potential. These challenges arise at all stages of working with Big Data, beginning with… — @John Wiley&Sons Limited, @ @ @ @ Подробнее...
8108.47электронная книга

Look at other dictionaries:

  • k-means clustering — In statistics and data mining, k means clustering is a method of cluster analysis which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean. This results into a partitioning of… …   Wikipedia

  • Ant colony optimization algorithms — Ant behavior was the inspiration for the metaheuristic optimization technique. In computer science and operations research, the ant colony optimization algorithm (ACO) is a probabilistic technique for solving computational problems which can be… …   Wikipedia

  • HSL and HSV — Fig. 1. HSL (a–d) and HSV (e–h). Above (a, e): cut away 3D models of each. Below: two dimensional plots showing two of a model’s three parameters at once, holding the other constant: cylindrical shells (b, f) of constant saturation, in this case… …   Wikipedia

  • Eigenvalues and eigenvectors — For more specific information regarding the eigenvalues and eigenvectors of matrices, see Eigendecomposition of a matrix. In this shear mapping the red arrow changes direction but the blue arrow does not. Therefore the blue arrow is an… …   Wikipedia

  • Conceptual clustering — is a machine learning paradigm for unsupervised classification developed mainly during the 1980s. It is distinguished from ordinary data clustering by generating a concept description for each generated class. Most conceptual clustering methods… …   Wikipedia

  • Environment for DeveLoping KDD-Applications Supported by Index-Structures — ELKI 0.4 visualisiert OPTICS Ergebnis Basisdaten Maintainer …   Deutsch Wikipedia

  • Mixture model — See also: Mixture distribution In statistics, a mixture model is a probabilistic model for representing the presence of sub populations within an overall population, without requiring that an observed data set should identify the sub population… …   Wikipedia

  • 2-satisfiability — In computer science, 2 satisfiability (abbreviated as 2 SAT or just 2SAT) is the problem of determining whether a collection of two valued (Boolean or binary) variables with constraints on pairs of variables can be assigned values satisfying all… …   Wikipedia

  • Non-negative matrix factorization — NMF redirects here. For the bridge convention, see new minor forcing. Non negative matrix factorization (NMF) is a group of algorithms in multivariate analysis and linear algebra where a matrix, , is factorized into (usually) two matrices, and… …   Wikipedia

  • Word-sense disambiguation — Disambiguation redirects here. For other uses, see Disambiguation (disambiguation). In computational linguistics, word sense disambiguation (WSD) is an open problem of natural language processing, which governs the process of identifying which… …   Wikipedia

  • Spatial analysis — In statistics, spatial analysis or spatial statistics includes any of the formal techniques which study entities using their topological, geometric, or geographic properties. The phrase properly refers to a variety of techniques, many still in… …   Wikipedia


We are using cookies for the best presentation of our site. Continuing to use this site, you agree with this.