Researchers Database

HAMASUNA Yukihiro

Department of Informatics, Associate Professor

Last Updated: 2024/04/25

Researcher Information

Research funding number

70610559

J-Global ID

201401059571433209

Research Interests

Data Science; Machine Learning; Soft Computing; Clustering

Research Areas

Informatics / Sensitivity (kansei) informatics

Informatics / Soft computing

Informatics / Intelligent informatics

Informatics / Information theory

Published Papers

A Novel Noise Clustering Based on Local Outlier Factor
Yukihiro Hamasuna; Yoshitomo Mori
Lecture Notes in Computer Science Springer Nature Switzerland 14376 179 - 191 0302-9743 2023/10 [Refereed]

The relationship between Gaussian process based c-regression models and kernel c-regression models
Yukihiro Hamasuna; Yuya Yokoyama; Kaito Takegawa
2022 Joint 12th International Conference on Soft Computing and Intelligent Systems and 23rd International Symposium on Advanced Intelligent Systems (SCIS&ISIS) IEEE 2022/11 [Refereed]

Network Clustering with Controlled Node Size
Yukihiro Hamasuna; Shusuke Nakano; Yasunori Endo
Modeling Decisions for Artificial Intelligence, LNAI 12898 Springer International Publishing 243 - 256 0302-9743 2021/09 [Refereed]

Jensen–Shannon Divergence-Based k-Medoids Clustering
Yuto Kingetsu; Yukihiro Hamasuna
Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press Ltd. 25 (2) 226 - 233 1343-0130 2021/03 [Refereed]

Several conventional clustering methods use the squared L₂-norm as the dissimilarity. The squared L₂-norm is calculated from only the object coordinates and obtains a linear cluster boundary. To extract meaningful cluster partitions from a set of massive objects, it is necessary to obtain cluster partitions that consist of complex cluster boundaries. In this study, a JS-divergence-based k-medoids (JSKMdd) is proposed. In the proposed method, JS-divergence, which is calculated from the object distribution, is considered as the dissimilarity. The object distribution is estimated from kernel density estimation to calculate the dissimilarity based on both the object coordinates and their neighbors. Numerical experiments were conducted using five artificial datasets to verify the effectiveness of the proposed method. In the numerical experiments, the proposed method was compared with k-means clustering, k-medoids clustering, and spectral clustering. The results show that the proposed method yields better results in terms of clustering performance than the other conventional methods.

Three Controlled-Sized Clustering Methods for Time-Series Data
TSUDA Nobuhiko; HAMASUNA Yukihiro; ENDO Yasunori
Journal of Japan Society for Fuzzy Theory and Intelligent Informatics Japan Society for Fuzzy Theory and Intelligent Informatics 33 (2) 608 - 616 1347-7986 2021 [Refereed]

Time-series data is data that contains information about time-varying phenomena, and it has a wide range of applications. Clustering is one of the data analysis methods used to analyze large, complex time-series data and extract their features. The important issues in clustering time-series data are the selection of a suitable dissimilarity and the selection of a suitable clustering algorithm. In this paper, we propose new clustering methods to handle imbalanced time-series data by introducing the concept of size control into the clustering methods for time-series data. The proposed methods are constructed by extending k-medoids using dynamic time warping (DTW) for dissimilarity, and k-medoids and k-shape using shape-based distance (SBD) for dissimilarity, which are typical methods for time-series data. The performance of the proposed methods is verified by numerical experiments using 12 datasets available in the UCR Time Series Classification Archive. From the numerical experiments, we confirmed that k-medoids with size control using DTW obtains the best cluster partition among the proposed methods.

Controlled-Sized Clustering for Time-Series Data
Nobuhiko Tsuda; Yukihiro Hamasuna
2020 Joint 11th International Conference on Soft Computing and Intelligent Systems and 21st International Symposium on Advanced Intelligent Systems (SCIS-ISIS) IEEE 2020/12 [Refereed]

On Fuzzy c-Regression Models Based on Gaussian Process Regression Models
Yukihiro Hamasuna; Daiki Kobayashi; Yasunori Endo
The 17th International Conference on Modeling Decisions for Artificial Intelligence 2020/09 [Refereed]

On hard c-means with cluster radius that makes the cluster partition homotopy equivalent to weighted 𝛼-complex
Yasunori Endo; Kanata Hoshino; Yukihiro Hamasuna
Journal of Ambient Intelligence and Humanized Computing Springer Science and Business Media LLC 1868-5137 2020/07 [Refereed]

k-Medoids Clustering Based on Kernel Density Estimation and Jensen-Shannon Divergence
Yukihiro Hamasuna; Yuto Kingetsu; Shusuke Nakano
The 16th International Conference on Modeling Decisions for Artificial Intelligence (MDAI 2019) 2019/09 [Refereed]

Cluster Validity Measures Based Agglomerative Hierarchical Clustering for Network Data
Yukihiro Hamasuna; Shusuke Nakano; Ryo Ozaki; Yasunori Endo
Journal of Advanced Computational Intelligence and Intelligent Informatics (JACIII) 23 (3) 577 - 583 2019/05 [Refereed]

Cluster Validity Measures for Network Data
Yukihiro Hamasuna; Daiki Kobayashi; Ryo Ozaki; Yasunori Endo
Journal of Advanced Computational Intelligence and Intelligent Informatics (JACIII) 22 (4) 544 - 550 2018/07 [Refereed]

Fuzzified Even-Sized Clustering Based on Optimization
Kei Kitajima; Yasunori Endo; Yukihiro Hamasuna
Journal of Advanced Computational Intelligence and Intelligent Informatics (JACIII) 22 (4) 537 - 543 2018/07 [Refereed]

Two-stage clustering based on cluster validity measures
Yukihiro Hamasuna; Ryo Ozaki; Yasunori Endo
Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 22 (1) 54 - 61 1883-8014 2018/01 [Refereed]

To handle a large-scale object, a two-stage clustering method has been previously proposed. The method generates a large number of clusters during the first stage and merges clusters during the second stage. In this paper, a novel two-stage clustering method is proposed by introducing cluster validity measures as the merging criterion during the second stage. The significant cluster validity measures used to evaluate cluster partitions and determine the suitable number of clusters act as the criteria for merging clusters. The performance of the proposed method based on six typical indices is compared with eight artificial datasets. These experiments show that a trace of the fuzzy covariance matrix Wtr and its kernelization KWtr are quite effective when applying the proposed method, and obtain better results than the other indices.

Even-sized clustering based on optimization and its variants
Yasunori Endo; Yukihiro Hamasuna; Tsubasa Hirano; Naohiko Kinoshita
Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 22 (1) 62 - 69 1883-8014 2018/01 [Refereed]

A clustering method referred to as K-member clustering classifies a dataset into certain clusters, the size of which is more than a given constant K. Even-sized clustering, which classifies a dataset into even-sized clusters, is also considered along with K-member clustering. In our previous study, we proposed Even-sized Clustering Based on Optimization (ECBO) to output adequate results by formulating an even-sized clustering problem as linear programming. The simplex method is used to calculate the belongingness of each object to clusters in ECBO. In this study, ECBO is extended by introducing ideas that were introduced in k-means or fuzzy c-means to resolve problems of initial-value dependence, robustness against outliers, calculation costs, and nonlinear boundaries of clusters. We also reconsider the relation between the dataset size, the cluster number, and K in ECBO. Moreover, we verify the effectiveness of the variants of ECBO based on experimental results using synthetic datasets and a benchmark dataset.

Agglomerative hierarchical clustering based on local optimization for cluster validity measures
Ryo Ozaki; Yukihiro Hamasuna; Yasunori Endo
2017 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2017 Institute of Electrical and Electronics Engineers Inc. 2017- 1822 - 1827 1062-922X 2017/11 [Refereed]

Modularity is an evaluation measure for graph clustering. The Louvain method is constructed by local optimization of modularity and is a bottom-up method, as is agglomerative hierarchical clustering. Cluster validity measures, like modularity, are used to evaluate cluster partitions. They are traditional evaluation measures in the field of clustering. We propose a novel graph clustering method based on agglomerative hierarchical clustering. The proposed method is constructed by local optimization of cluster validity measures. The effectiveness of the proposed method is shown through numerical examples, which show that the proposed method has a different clustering property from the Louvain method because of the features of cluster validity measures.

On Fuzzified Even-sized Clustering Based on Optimization
Kei Kitajima; Yasunori Endo; Yukihiro Hamasuna
The 14th International Conference on Modeling Decisions for Artificial Intelligence (MDAI 2017) 2017/10 [Refereed]

On Edge Penalty Based Hard and Fuzzy c-Medoids for Uncertain Networks
Yukihiro Hamasuna; Yasunori Endo
The 14th International Conference on Modeling Decisions for Artificial Intelligence (MDAI 2017) 2017/10 [Refereed]

Hierarchical clustering algorithms with automatic estimation of the number of clusters
Ryosuke Abe; Sadaaki Miyamoto; Yasunori Endo; Yukihiro Hamasuna
IFSA-SCIS 2017 - Joint 17th World Congress of International Fuzzy Systems Association and 9th International Conference on Soft Computing and Intelligent Systems Institute of Electrical and Electronics Engineers Inc. 1 - 5 2017/08 [Refereed]

The problem of estimating an appropriate number of clusters has been a main and difficult issue in clustering research. There are different methods for this in hierarchical clustering: a typical approach is to try clustering for different numbers of clusters and compare the results using a measure to estimate the cluster number. On the other hand, there is no such method to automatically estimate the number of clusters in agglomerative hierarchical clustering (AHC), since AHC produces a family of clusters with different cluster numbers at the same time in the form of dendrograms. An exception is the Newman method in network clustering, but this method does not have a useful dendrogram output. The aim of the present paper is to propose new methods to automatically estimate the number of clusters in AHC. We show two approaches for this purpose: one is to use a variation of a cluster validity measure, and the other is to use a statistical model selection method such as BIC.

A study on cluster validity measures for clustering network data
Yukihiro Hamasuna; Ryo Ozaki; Yasunori Endo
IFSA-SCIS 2017 - Joint 17th World Congress of International Fuzzy Systems Association and 9th International Conference on Soft Computing and Intelligent Systems Institute of Electrical and Electronics Engineers Inc. 1 - 6 2017/08 [Refereed]

Modularity is one of the evaluation measures for network data and is used as the criterion for merging two clusters in the Louvain method. To construct useful cluster validity measures for network data, the effectiveness of eight conventional cluster validity measures is compared with that of Modularity. Cluster partitions of six artificial network datasets are obtained by k-medoids and evaluated by cluster validity measures including Modularity. Numerical experiments show that Dunn's index is more effective than the other conventional cluster validity measures.

On various types of controlled-sized clustering based on optimization
Yasunori Endo; Sachiko Ishida; Naohiko Kinoshita; Yukihiro Hamasuna
IEEE International Conference on Fuzzy Systems Institute of Electrical and Electronics Engineers Inc. 1 - 6 1098-7584 2017/08 [Refereed]

Clustering is one of the unsupervised classification methods, that is, it classifies a data set into some clusters without any external criterion. Typical clustering methods, e.g. k-means (KM) or fuzzy c-means (FCM), are constructed based on optimization of a given objective function. Many clustering methods, as well as KM and FCM, are formulated as optimization problems with typical objective functions and constraints. The objective function itself is also an evaluation guideline for the results of clustering methods. Considered together with its theoretical extensibility, there is a great advantage in constructing clustering methods in the framework of optimization. From the viewpoint of optimization, some of the authors proposed an Even-sized Clustering method Based on Optimization (ECBO), which has tight constraints on cluster size, and constructed some variations of ECBO. The constraint considered in ECBO is that each cluster size is K or K + 1, and the belongingness of each object to clusters is calculated by the simplex method in each iteration. It is considered that ECBO has an advantage over other similar methods from the viewpoint of clustering accuracy, cluster size, and optimization framework. However, the constraint on cluster size in ECBO is tight, so it may be inconvenient when some extra margin of cluster size is allowed. Moreover, it is expected that new clustering algorithms in which each cluster size can be controlled can deal with a wider variety of datasets. From the above viewpoint, we proposed two new clustering algorithms based on ECBO. One is COntrolled-sized Clustering Based on Optimization (COCBO), and the other is an extended COCBO, which is referred to as COntrolled-sized Clustering Based on Optimization++ (COCBO++). Each cluster size can be controlled in the algorithms. However, these algorithms have some problems. In this paper, we describe various types of COCBO to solve the above problems and evaluate the methods through some numerical examples.

Two Roles of Cluster Validity Measures for Clustering Network Data
Yukihiro Hamasuna; Ryo Ozaki; Yasunori Endo
The 2017 conference of the International Federation of Classification Societies (IFCS2017) 2017/08 [Refereed]

On some clustering algorithms based on tolerance
Yukihiro Hamasuna; Yasunori Endo
Studies in Computational Intelligence Springer Verlag 671 87 - 99 1860-949X 2017/01 [Refereed]

A large number of clustering algorithms have been proposed to handle target data and deal with various real-world problems such as uncertain data mining, semi-supervised learning, and so on. We focus on the above two topics and introduce two concepts to construct significant clustering algorithms. We propose the tolerance and penalty-vector concepts for handling uncertain data. We also propose the clusterwise tolerance concept for semi-supervised learning. These concepts are quite similar approaches from the viewpoint of handling objects flexibly for each clustering topic. We construct two clustering algorithms, FCMT and FCMQ, for handling uncertain data. We also construct two clustering algorithms, FCMCT and SSFCMCT, for semi-supervised learning. We consider that those concepts have the potential to resolve conventional and brand new clustering topics in various ways.

Comparison of Trace of Fuzzy Covariance Matrix with Its Kernelization in Cluster Validity Measures based x-means
Yukihiro Hamasuna; Yasunori Endo
The 14th International Conference on Modeling Decisions for Artificial Intelligence (MDAI 2017) 2016/09 [Refereed]

Comparison of cluster validity measures based x-means
Yukihiro Hamasuna; Naohiko Kinoshita; Yasunori Endo
Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 20 (5) 845 - 853 1883-8014 2016/09 [Refereed]

The x-means determines the suitable number of clusters automatically by executing k-means recursively. The Bayesian Information Criterion is applied to evaluate a cluster partition in the x-means. A novel type of x-means clustering is proposed by introducing cluster validity measures that are used to evaluate the cluster partition and determine the number of clusters instead of the information criterion. The proposed x-means uses cluster validity measures in the evaluation step, and an estimation of the particular probabilistic model is therefore not required. The performances of a conventional x-means and the proposed method are compared for crisp and fuzzy partitions using eight datasets. The comparison shows that the proposed method obtains better results than the conventional method, and that the cluster validity measures for a fuzzy partition are effective in the proposed method.

On Fuzzy non-metric model for data with tolerance and its application to incomplete data clustering
Yasunori Endo; Tomoyuki Suzuki; Naohiko Kinoshita; Yukihiro Hamasuna
Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 20 (4) 571 - 579 1883-8014 2016 [Refereed]

The fuzzy non-metric model (FNM) is a representative non-hierarchical clustering method, which is very useful because the belongingness or the membership degree of each datum to each cluster can be calculated directly from the dissimilarities between data and the cluster centers are not used. However, the original FNM cannot handle data with uncertainty. In this study, we refer to the data with uncertainty as "uncertain data," e.g., incomplete data or data that have errors. Previously, a method was proposed based on the concept of a tolerance vector for handling uncertain data and some clustering methods were constructed according to this concept, e.g. fuzzy c-means for data with tolerance. These methods can handle uncertain data in the framework of optimization. Thus, in the present study, we apply the concept to FNM. First, we propose a new clustering algorithm based on FNM using the concept of tolerance, which we refer to as the fuzzy non-metric model for data with tolerance. Second, we show that the proposed algorithm can handle incomplete data sets. Third, we verify the effectiveness of the proposed algorithm based on comparisons with conventional methods for incomplete data sets in some numerical examples.

On Various Types of Even-Sized Clustering Based on Optimization
Yasunori Endo; Tsubasa Hirano; Naohiko Kinoshita; Yukihiro Hamasuna
MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE, (MDAI 2016) SPRINGER-VERLAG BERLIN 9880 165 - 177 0302-9743 2016 [Refereed]

Clustering is a very useful tool for data mining. A clustering method referred to as K-member clustering classifies a dataset into clusters whose size is more than a given constant K. K-member clustering is useful and is applied in many applications. Naturally, clustering methods that classify a dataset into even-sized clusters can be considered, and some even-sized clustering methods have been proposed. However, conventional even-sized clustering methods often output inadequate results. One of the reasons is that they are not based on optimization. Therefore, we proposed Even-sized Clustering Based on Optimization (ECBO) in our previous study. The simplex method is used to calculate the belongingness of each object to clusters in ECBO. In this study, ECBO is extended by introducing some ideas which were introduced in k-means or fuzzy c-means to resolve problems of initial-value dependence, robustness against outliers, calculation cost, and nonlinear boundaries of clusters. Moreover, we reconsider the relation between the dataset size, the cluster number, and K in ECBO.

A Method of Two-Stage Clustering Based on Cluster Validity Measures
Ryo Ozaki; Yukihiro Hamasuna; Yasunori Endo
2016 JOINT 8TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS (SCIS) AND 17TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (ISIS) IEEE 410 - 415 2016 [Refereed]

Two-stage clustering consists of a generating stage and a merging stage. To handle a large number of objects, a two-stage clustering algorithm generates a large number of clusters in the first stage and merges clusters in the second stage. A novel two-stage clustering method is proposed by introducing cluster validity measures, which are used to evaluate cluster partitions and determine the suitable number of clusters. The significant cluster validity measure is used in the second stage and plays a role as the criterion for merging clusters. The performance of the proposed method is compared using six artificial datasets and three benchmark datasets. These experiments show that several cluster validity measures, that is, the trace of the fuzzy covariance matrix and membership-degree-based indices, are effective in the proposed method and obtain better results than other indices.

On Kernelized Sequential Hard Clustering
Yukihiro Hamasuna; Yasunori Endo
2016 JOINT 8TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS (SCIS) AND 17TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (ISIS) IEEE 416 - 419 2016 [Refereed]

A method of sequential clustering extracts clusters sequentially without determining the number of clusters. Sequential hard clustering is based on noise clustering and is one of the typical sequential clustering methods. A kernelized sequential hard clustering is proposed by introducing the kernel method to sequential hard clustering to handle datasets which consist of non-linear clusters and to execute robust clustering. The performance of the proposed method is evaluated with a typical dataset which has a non-linear cluster boundary. Negative results are obtained through numerical examples, and these show that the proposed method cannot extract non-linear clusters.

Fuzzy non-metric model for data with tolerance and its application to incomplete data clustering
Yasunori Endo; Tomoyuki Suzuki; Naohiko Kinoshita; Yukihiro Hamasuna; Sadaaki Miyamoto
IEEE International Conference on Fuzzy Systems Institute of Electrical and Electronics Engineers Inc. 2015- 1098-7584 2015/11 [Refereed]

Clustering is a technique of unsupervised classification. The methods are classified into two types: one is hierarchical and the other is non-hierarchical. The fuzzy non-metric model (FNM) is a representative method of non-hierarchical clustering. FNM is very useful because the belongingness or the membership degree of each datum to each cluster is calculated directly from dissimilarities between data, and cluster centers are not used. However, FNM cannot handle data with uncertainty, called uncertain data, e.g. incomplete data, or data which have errors. In order to handle such data, the concept of a tolerance vector has been proposed. The clustering methods using the concept can handle the uncertain data in the framework of optimization, e.g. fuzzy c-means for data with tolerance (FCM-T). In this paper, we will first propose a new clustering algorithm that applies the concept of tolerance to FNM, called the fuzzy non-metric model for data with tolerance (FNM-T). Second, we will show that the proposed algorithm handles incomplete data sets. Third, we will verify the effectiveness of the proposed algorithm in comparison with conventional ones for incomplete data sets through some numerical examples.

On Cluster Validity Measures based x-means
Yukihiro Hamasuna; Yasunori Endo
The 12th International Conference on Modeling Decisions for Artificial Intelligence (MDAI 2015) Japan Society for Fuzzy Theory and Intelligent Informatics 31 99 - 100 2015/09 [Refereed]

The x-means divides a set of objects without determining the number of clusters by using iterative k-means and evaluation criteria. A series of cluster validity measures is also used in order to evaluate the clustering results and determine a suitable number of clusters. We propose cluster validity measures based x-means by introducing cluster validity measures instead of information criteria. We moreover show the effectiveness of the proposed method through numerical examples.

Fuzzy c-means with quadratic penalty-vector regularization using kullback-leibler information for uncertain data
Naohiko Kinoshita; Yasunori Endo; Yukihiro Hamasuna
Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 19 (5) 624 - 631 1883-8014 2015/09 [Refereed]

Clustering, a highly useful unsupervised classification, has been applied in many fields. When, for example, we use clustering to classify a set of objects, it generally ignores any uncertainty included in objects. This is because uncertainty is difficult to deal with and model. It is desirable, however, to handle individual objects as is so that we may classify objects more precisely. In this paper, we propose new clustering algorithms that handle objects having uncertainty by introducing penalty vectors. We show the theoretical relationship between our proposal and conventional algorithms verifying the effectiveness of our proposed algorithms through numerical examples.

On sequential cluster extraction based on L1-regularized possibilistic c-means
Yukihiro Hamasuna; Yasunori Endo
Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 19 (5) 655 - 661 1883-8014 2015/09 [Refereed]

Sequential cluster extraction algorithms are useful clustering methods that extract clusters one by one without the number of clusters having to be determined in advance. Typical examples of these algorithms are sequential hard c-means (SHCM) and possibilistic clustering (PCM) based algorithms. Two types of L1-regularized possibilistic clustering are proposed to induce crisp and possibilistic allocation rules and to construct a novel sequential cluster extraction algorithm. The relationship between the proposed method and SHCM is also discussed. The effectiveness of the proposed method is verified through numerical examples. Results show that the entropy-based method yields better results for the Rand Index and the number of extracted clusters.

Fuzzy Non-metric Model for Data with Tolerance and Its Application to Incomplete Data Clustering
Yasunori Endo; Tomoyuki Suzuki; Naohiko Kinoshita; Yukihiro Hamasuna; Sadaaki Miyamoto
2015 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE 2015) IEEE 20 (4) 571 - 579 1544-5615 2015 [Refereed]

On a family of new sequential hard clustering
Yukihiro Hamasuna; Yasunori Endo
Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 19 (6) 759 - 765 1883-8014 2015 [Refereed]

This paper presents a new algorithm of sequential cluster extraction based on hard c-means and hard c-medoids clustering. Sequential cluster extraction means that the algorithm extracts 'one cluster at a time.' A characteristic parameter, called a noise parameter, is used in noise clustering based sequential clustering. We propose a novel sequential clustering method called new sequential clustering, which extracts an arbitrary number of objects as one cluster by considering the noise parameter as a variable to be optimized. Experimental results with four data sets confirm the effectiveness of our proposal. These results also show that classification results strongly depend on parameter ν and that our proposal is applicable to the first stage in a two-stage clustering algorithm.

Fuzzy Non-metric Model for Data with Tolerance and Its Application to Incomplete Data Clustering
Yasunori Endo; Tomoyuki Suzuki; Naohiko Kinoshita; Yukihiro Hamasuna; Sadaaki Miyamoto
2015 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE 2015) IEEE 1 - 7 1544-5615 2015 [Refereed]

On cluster extraction from relational data using L1-regularized possibilistic assignment prototype algorithm
Yukihiro Hamasuna; Yasunori Endo
Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 19 (1) 23 - 28 1883-8014 2015/01 [Refereed]

This paper proposes entropy-based L1-regularized possibilistic clustering and a method of sequential cluster extraction from relational data. Sequential cluster extraction means that the algorithm extracts clusters one by one. The assignment prototype algorithm is a typical clustering method for relational data. The membership degree of each object to each cluster is calculated directly from dissimilarities between objects. An entropy-based L1-regularized possibilistic assignment prototype algorithm is proposed first to induce belongingness for a membership grade. An algorithm of sequential cluster extraction based on the proposed method is constructed and the effectiveness of the proposed methods is shown through numerical examples.

On New Sequential Hard c-Medoids
Yukihiro Hamasuna; Yasunori Endo
2014 JOINT 7TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS (SCIS) AND 15TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (ISIS) IEEE 489 - 494 2014 [Refereed]

This paper presents a new sequential cluster extraction algorithm based on hard c-medoids clustering. The term sequential cluster extraction means that the algorithm extracts one cluster at a time. Hard c-medoids is one of the variants of hard c-means clustering. The cluster medoid, which is referred to as the representative of each cluster, is an object in hard c-medoids. The sequential clustering algorithms are based on Dave's noise clustering approach. A characteristic parameter called the noise parameter is used in noise clustering. We construct a new sequential hard c-medoids algorithm by considering the noise parameter as a variable in the optimization problem. First, the optimization problem of new sequential hard c-medoids clustering is introduced. Next, the sequential clustering algorithm is constructed based on the optimization problem. Moreover, the effectiveness of the proposed method is shown through numerical experiments.

On Even-sized Clustering Algorithm Based on Optimization
Tsubasa Hirano; Yasunori Endo; Naohiko Kinoshita; Yukihiro Hamasuna
2014 JOINT 7TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS (SCIS) AND 15TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (ISIS) IEEE 495 - 500 2014 [Refereed]

Clustering methods that divide a data set into clusters whose size is more than a given constant K are very useful in many applications. Such methods are called K-member clustering (KMC). As a natural extension, clustering methods that divide a data set into even-sized clusters can be considered. However, there are no algorithms of such methods based on optimization, which is why the conventional algorithms often output inadequate results. Therefore, we should consider an algorithm based on optimization. In this paper, we propose an even-sized clustering algorithm using the simplex method, which is one of the optimization methods, and verify the proposed method through some numerical examples.

On Cluster Extraction from Relational Data Using Entropy Based Relational Crisp Possibilistic Clustering
Yukihiro Hamasuna; Yasunori Endo
KNOWLEDGE AND SYSTEMS ENGINEERING (KSE 2013), VOL 2 SPRINGER-VERLAG BERLIN 245 57 - 67 2194-5357 2014 [Refereed]

Relational clustering is one of the clustering methods for relational data. The membership grade of each datum to each cluster is calculated directly from dissimilarities between data, and the cluster center, which is referred to as the representative of a cluster, is not used in relational clustering. This paper discusses a new possibilistic approach for relational clustering from the viewpoint of inducing crispness. In a previous study, crisp possibilistic clustering and its variant have been proposed by using L1-regularization. These crisp possibilistic clustering methods induce crispness in the membership function. In this paper, entropy-based crisp possibilistic relational clustering is proposed for handling relational data. Next, the way of sequential extraction is also discussed. Moreover, the effectiveness of the proposed method is shown through numerical examples.

Hard and Fuzzy c-means Algorithms with Pairwise Constraints by Non-metric Terms
Yasunori Endo; Naohiko Kinoshita; Kuniaki Iwakura; Yukihiro Hamasuna
MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE, MDAI 2014 SPRINGER-VERLAG BERLIN 8825 145 - 157 0302-9743 2014 [Refereed]

Recently, semi-supervised clustering has attracted attention, e.g., Refs. [2-5]. Semi-supervised clustering algorithms improve clustering results by incorporating prior information with the unlabeled data. This paper proposes three new clustering algorithms with pairwise constraints by introducing non-metric terms to the objective functions of well-known clustering algorithms. Moreover, their effectiveness is verified through some numerical examples.

Semi-Supervised Hard and Fuzzy c-Means with Assignment Prototype Term
Yukihiro Hamasuna; Yasunori Endo
MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE, MDAI 2014 SPRINGER-VERLAG BERLIN 8825 135 - 144 0302-9743 2014 [Refereed]

Semi-supervised learning is an important task in the field of data mining. Pairwise constraints such as must-link and cannot-link are used in order to improve clustering properties. This paper proposes a new type of semi-supervised hard and fuzzy c-means clustering with an assignment prototype term. The assignment prototype term is based on Windham's assignment prototype algorithm and handles pairwise constraints between objects in the proposed method. First, an optimization problem of the proposed method is formulated. Next, a new clustering algorithm is constructed based on the above discussions. Moreover, the effectiveness of the proposed method is shown through numerical experiments.

On New Sequential Hard c-Means and its Kernelization
Yukihiro Hamasuna; Yasunori Endo
2014 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING (GRC) IEEE 82 - 87 2014 [Refereed]

This paper presents a new sequential clustering algorithm based on sequential hard c-means clustering. The term sequential cluster extraction means that the algorithm extracts one cluster at a time. Sequential hard c-means is one of the typical and conventional sequential clustering methods. The proposed new sequential clustering algorithm is based on Dave's noise clustering approach. A characteristic parameter called the noise parameter is applied in Dave's approach. We construct a new sequential hard c-means algorithm by introducing another new parameter, which controls the number of extracted objects, and by considering the noise parameter as a variable in the optimization problem. First, the optimization problem of the new sequential hard c-means clustering is introduced. Next, the sequential clustering algorithm and its kernelization are constructed based on the above optimization problem. Moreover, the effectiveness of the proposed method is shown through numerical experiments.

Non Metric Model Based on Rough Set Representation.
Yasunori Endo; Ayako Heki; Yukihiro Hamasuna
Journal of Advanced Computational Intelligence and Intelligent Informatics (JACIII) 17 (4) 540 - 551 2013/07 [Refereed]

Sequential extraction by using two types of crisp possibilistic clustering
Yukihiro Hamasuna; Yasunori Endo
Proceedings - 2013 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2013 IEEE 2013 Vol.5 3505 - 3510 1062-922X 2013 [Refereed]

Possibilistic clustering is well known as one of the useful clustering methods because it is robust against noise or outliers in data. In a previous study, sparse possibilistic clustering and its variant have been proposed by using L1-regularization. These possibilistic clustering methods with L1-regularization are quite different from the viewpoint of the membership function. Two types of new possibilistic approach with L1-regularization, named crisp possibilistic clustering, are proposed in this paper. The classification function of the proposed methods, which shows the allocation rule in the whole space, and the way of sequential cluster extraction are also proposed. Moreover, the effectiveness of the proposed methods is shown through numerical examples. © 2013 IEEE.

On sequential cluster extraction based on L1-regularized possibilistic non-metric model
Yukihiro Hamasuna; Yasunori Endo
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Springer 8234 204 - 213 0302-9743 2013 [Refereed]

The fuzzy non-metric model is one of the clustering methods in which the membership grade of each datum to each cluster is calculated directly from dissimilarities between data. The cluster center which is referred to as representative of cluster is not used in fuzzy non-metric model. This paper discusses a new possibilistic approach for non-metric model from the viewpoint of being in the cluster. In the previous study, new possibilistic clustering and its variant have been proposed by using L1-regularization. These possibilistic clustering methods with L1-regularization induce a change in the membership function. Two types of non-metric model based on possibilistic approach named L1-regularized possibilistic non-metric model are proposed in this paper. Next, the way of sequential extraction algorithm is also discussed. Moreover, the results of sequential extraction based on proposed methods are shown. © 2013 Springer-Verlag.

On semi-supervised fuzzy c-means clustering for data with clusterwise tolerance by opposite criteria
Yukihiro Hamasuna; Yasunori Endo
SOFT COMPUTING SPRINGER 17 (1) 71 - 81 1432-7643 2013/01 [Refereed]

This paper presents a new semi-supervised fuzzy c-means clustering for data with clusterwise tolerance by opposite criteria. In semi-supervised clustering, pairwise constraints, that is, must-link and cannot-link, are frequently used in order to improve clustering performances. From the viewpoint of handling pairwise constraints, a new semi-supervised fuzzy c-means clustering is proposed by introducing clusterwise tolerance-based pairwise constraints. First, a concept of clusterwise tolerance-based pairwise constraints is introduced. Second, the optimization problems of the proposed method are formulated. Especially, must-link and cannot-link are handled by opposite criteria in our proposed method. Third, a new clustering algorithm is constructed based on the above discussions. Finally, the effectiveness of the proposed algorithm is verified through numerical examples.

On Sparse Possibilistic Clustering with Crispness - Classification Function and Sequential Extraction
Yukihiro Hamasuna; Yasunori Endo
6TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS, AND THE 13TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS IEEE 1801 - 1806 2012 [Refereed]

In addition to fuzzy c-means clustering, possibilistic clustering is well known as one of the useful techniques because it is robust against noise in data. In particular, sparse possibilistic clustering is quite different from other possibilistic clustering methods in terms of the membership function. We propose a way to induce crispness in possibilistic clustering by using L1-regularization and show the classification function of sparse possibilistic clustering with crispness for understanding the allocation rule. We, moreover, show the way of sequential extraction by the proposed method. After that, we show the effectiveness of the proposed method through numerical examples.

On rough set based non metric model
Yasunori Endo; Ayako Heki; Yukihiro Hamasuna
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Springer 7647 394 - 407 0302-9743 2012 [Refereed]

The non metric model is a kind of clustering method in which belongingness, or the membership grade of each object to each cluster, is calculated directly from dissimilarities between objects, and cluster centers are not used. Meanwhile, the concept of rough sets has recently attracted attention. Conventional clustering algorithms classify a set of objects into some clusters with clear boundaries, that is, one object must belong to one cluster. However, many objects belong to more than one cluster in the real world, since the boundaries of clusters overlap with each other. Fuzzy set representation of clusters makes it possible for each object to belong to more than one cluster. On the other hand, the fuzzy degree sometimes may be too descriptive for interpreting clustering results. Rough set representation could handle such cases. Clustering based on rough set representation could provide a solution that is less restrictive than conventional clustering and less descriptive than fuzzy clustering. This paper shows two types of Rough set based Non Metric model (RNM). One algorithm is the Rough set based Hard Non Metric model (RHNM) and the other is the Rough set based Fuzzy Non Metric model (RFNM). In both algorithms, clusters are represented by rough sets and each cluster consists of lower and upper approximations. Second, the proposed methods are kernelized by introducing kernel functions, which are a powerful tool to analyze clusters with nonlinear boundaries. © 2012 Springer-Verlag.

Hard c-means using quadratic penalty-vector regularization for uncertain data
Yasunori Endo; Arisa Taniguchi; Yukihiro Hamasuna
Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 16 (7) 831 - 840 1883-8014 2012 [Refereed]

Clustering is an unsupervised classification technique for data analysis. In general, each datum in real space is transformed into a point in a pattern space to apply clustering methods. Data cannot often be represented by a point, however, because of its uncertainty, e.g., measurement error margin and missing values in data. In this paper, we will introduce quadratic penalty-vector regularization to handle such uncertain data using Hard c-Means (HCM), which is one of the most typical clustering algorithms. We first propose a new clustering algorithm called hard c-means using quadratic penalty-vector regularization for uncertain data (HCMP). Second, we propose sequential extraction hard c-means using quadratic penalty-vector regularization (SHCMP) to handle datasets whose cluster number is unknown. Furthermore, we verify the effectiveness of our proposed algorithms through numerical examples.

Comparison of semi-supervised hierarchical clustering using clusterwise tolerance
Yukihiro Hamasuna; Yasunori Endo
Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 16 (7) 819 - 824 1883-8014 2012 [Refereed]

This paper presents a new semi-supervised agglomerative hierarchical clustering algorithm with the Ward method using clusterwise tolerance. Semi-supervised clustering has recently been noted and studied in many research fields. Must-link and cannot-link, called pairwise constraints, are frequently used in order to improve clustering properties in semi-supervised clustering. First, clusterwise tolerance based pairwise constraints are introduced in order to handle must-link and cannot-link constraints. Next, a new semi-supervised hierarchical clustering algorithm with the Ward method is constructed based on the above discussions. The effectiveness of the proposed algorithms is, moreover, verified through numerical examples.

On agglomerative hierarchical clustering using clusterwise tolerance based pairwise constraints
Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 16 (1) 174 - 179 1883-8014 2012 [Refereed]

This paper presents a semi-supervised agglomerative hierarchical clustering algorithm using clusterwise tolerance based pairwise constraints. In semi-supervised clustering, pairwise constraints, that is, must-link and cannot-link, are frequently used in order to improve clustering properties. In that sense, we will propose another way, named clusterwise tolerance based pairwise constraints, to handle must-link and cannot-link constraints in L2-space. In addition, we will propose a semi-supervised agglomerative hierarchical clustering algorithm based on it. We will, moreover, show the effectiveness of the proposed method through numerical examples.

On Hard c-Means Using Quadratic Penalty-Vector Regularization for Uncertain Data
Yasunori Endo; Arisa Taniguchi; Aoi Takahashi; Yukihiro Hamasuna
MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE, MDAI 2011 SPRINGER-VERLAG BERLIN 6820 126 - + 0302-9743 2011 [Refereed]

Clustering is one of the unsupervised classification techniques of data analysis. Data are transformed from a real space into a pattern space to apply clustering methods. However, the data often cannot be represented by a point because of uncertainty of the data, e.g., measurement error margin and missing values in data. In this paper, we introduce quadratic penalty-vector regularization into hard c-means (HCM), which is one of the most typical clustering algorithms, to handle such uncertain data. First, we propose a new clustering algorithm called hard c-means using quadratic penalty-vector regularization for uncertain data (HCMP). Second, we propose sequential extraction hard c-means using quadratic penalty-vector regularization (SHCMP) to handle datasets whose cluster number is unknown. Moreover, we verify the effectiveness of our proposed algorithms through some numerical examples.

Semi-supervised Agglomerative Hierarchical Clustering with Ward Method Using Clusterwise Tolerance
Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE, MDAI 2011 SPRINGER-VERLAG BERLIN 6820 103 - + 0302-9743 2011 [Refereed]

This paper presents a new semi-supervised agglomerative hierarchical clustering algorithm with the Ward method using clusterwise tolerance. Recently, semi-supervised clustering has been noted and studied in many research fields. In semi-supervised clustering, must-link and cannot-link, called pairwise constraints, are frequently used in order to improve clustering properties. First, clusterwise tolerance based pairwise constraints are introduced in order to handle must-link and cannot-link constraints. Next, a new semi-supervised agglomerative hierarchical clustering algorithm with the Ward method is constructed based on the above discussions. Moreover, the effectiveness of the proposed algorithms is verified through numerical examples.

Fuzzy c-Means Clustering with Mutual Relation Constraints - Construction of Two Types of Algorithms
Yasunori Endo; Yukihiro Hamasuna
KNOWLEDGE-BASED AND INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT I SPRINGER 6881 131 - 140 0302-9743 2011 [Refereed]

Recently, semi-supervised clustering has attracted many researchers' interest. In particular, constraint-based semi-supervised clustering has received attention, and the constraints of must-link and cannot-link play a very important role in the clustering. There are many kinds of relations besides must-link and cannot-link, and one of the most typical relations is the trade-off relation. Thus, in this paper we formulate the trade-off relation and propose a new "semi-supervised" concept called mutual relation. Moreover, we construct two types of new clustering algorithms with the mutual relation constraints based on the well-known and useful fuzzy c-means, called fuzzy c-means with the mutual relation constraints.

Hard and fuzzy c-means clustering with mutual relation constraints
Yasunori Endo; Yukihiro Hamasuna
International Conference on Intelligent Systems Design and Applications, ISDA IEEE 557 - 562 2164-7143 2011 [Refereed]

Recently, semi-supervised clustering has attracted many researchers' interest. In particular, constraint-based semi-supervised clustering has received attention, and the constraints of must-link and cannot-link play a very important role in the clustering. There are many kinds of relations besides must-link and cannot-link, and one of the most typical relations is the trade-off relation. Thus, in this paper we formulate the trade-off relation and propose a new "semi-supervised" concept called mutual relation. Moreover, we construct two types of new clustering algorithms with the mutual relation constraints based on the well-known and useful hard c-means (HCM) and fuzzy c-means (FCM), called hard c-means with the mutual relation constraints (HCMMR) and fuzzy c-means with the mutual relation constraints (FCMMR). © 2011 IEEE.

On semi-supervised fuzzy c-means clustering with clusterwise tolerance by opposite criteria
Yukihiro Hamasuna; Yasunori Endo
Proceedings - 2011 IEEE International Conference on Granular Computing, GrC 2011 IEEE 225 - 230 2011 [Refereed]

The importance of semi-supervised clustering lies in handling pairwise constraints as prior knowledge. In this paper, we will propose a new semi-supervised fuzzy c-means clustering with clusterwise tolerance by opposite criteria. First, the concepts of clusterwise tolerance and pairwise constraints are introduced. Second, the optimization problem of the proposed method is formulated. In particular, must-link and cannot-link constraints are handled and introduced by opposite criteria in the proposed method. Third, a new clustering algorithm is constructed based on the above discussions. Finally, the effectiveness of the proposed algorithm is verified through numerical examples. © 2011 IEEE.

On principal component analysis for data with tolerance
Yasunori Endo; Tatsuyoshi Tsuji; Yukihiro Hamasuna; Kota Kurihara
Proceedings - 2011 IEEE International Conference on Granular Computing, GrC 2011 IEEE 177 - 182 2011 [Refereed]

In many cases, data are handled as intervals on the pattern space because the data generally contain the uncertainty of error, loss and so on. The concept of tolerance in this paper enables us to handle these data as a point on the pattern space. The advantage is that we can handle uncertain data in the framework of optimization without introducing any particular measures between intervals. In recent years, this concept is positively introduced into clustering methods and the effectiveness is confirmed. However, there are few applications of the concept into multivariate analysis methods except regression models in spite of its effectiveness. Therefore, we propose a new algorithm of principal component analysis for uncertain data by introducing the concept of the tolerance in this paper. Moreover, we verify the effectiveness through some numerical examples. © 2011 IEEE.

On Mahalanobis Distance Based Fuzzy c-Means Clustering for Uncertain Data Using Penalty Vector Regularization
Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ 2011) IEEE 810 - 815 1098-7584 2011 [Refereed]

This paper presents Mahalanobis distance based fuzzy c-means clustering for uncertain data using penalty vector regularization. When we handle a set of data, the data contain inherent uncertainty, e.g., errors, ranges, or missing values of attributes. In order to handle such uncertain data as a point in a pattern space, the concept of penalty vector has been proposed. Some significant clustering algorithms based on it have also been proposed. In conventional clustering algorithms, Mahalanobis distance has been used as dissimilarity, as well as the squared L2-norm and the L1-norm. From the viewpoint of the guideline of dissimilarity, Mahalanobis distance based fuzzy c-means clustering for uncertain data should be considered. In this paper, we introduce fuzzy c-means clustering for uncertain data using penalty vector regularization as our conventional work. Next, we propose a Mahalanobis distance based one. Moreover, we show the effectiveness of the proposed method through numerical examples.

Kernelized Fuzzy c-Means Clustering for Uncertain Data using Quadratic Penalty-Vector Regularization with Explicit Mappings
Yasunori Endo; Isao Takayama; Yukihiro Hamasuna; Sadaaki Miyamoto
IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ 2011) IEEE 804 - 809 1098-7584 2011 [Refereed]

Recently, fuzzy c-means clustering with kernel functions has been notable because these algorithms can handle datasets which consist of clusters with nonlinear boundaries. However, the algorithms have the following problems: (1) the cluster centers cannot be calculated explicitly, (2) it takes a long time to calculate clustering results. Meanwhile, we have proposed clustering algorithms using penalty-vector regularization to handle uncertain data. In this paper, we propose new clustering algorithms using quadratic penalty-vector regularization by introducing explicit mappings of kernel functions to solve the above problems. Moreover, we construct fuzzy classification functions for our proposed clustering methods.

Fuzzy c-means clustering for data with clusterwise tolerance based on L2- and L1-regularization
Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 15 (1) 68 - 75 1883-8014 2011 [Refereed]

Detecting various kinds of cluster shape is an important problem in the field of clustering. In general, it is difficult to obtain clusters with different sizes or shapes by single-objective function. From that sense, we have proposed the concept of clusterwise tolerance and constructed clustering algorithms based on it. In the field of data mining, regularization techniques are used in order to derive significant classifiers. In this paper, we propose another concept of clusterwise tolerance from the viewpoint of regularization. Moreover, we construct clustering algorithms for data with clusterwise tolerance based on L2- and L1-regularization. After that, we describe fuzzy classification functions of proposed algorithms. Finally, we show the effectiveness of proposed algorithms through numerical examples.

Fuzzy c-means clustering for uncertain data using quadratic penalty-vector regularization
Yasunori Endo; Yasushi Hasegawa; Yukihiro Hamasuna; Yuchi Kanzawa
Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 15 (1) 76 - 82 1883-8014 2011

Clustering - defined as an unsupervised data-analysis classification transforming real-space information into data in pattern space and analyzing it - may require that data be represented by a set, rather than points, due to data uncertainty, e.g., measurement error margin, data regarded as one point, or missing values. These data uncertainties have been represented as interval ranges for which many clustering algorithms are constructed, but the lack of guidelines in selecting available distances in individual cases has made selection difficult and raised the need for ways to calculate dissimilarity between uncertain data without introducing a nearest-neighbor or other distance. The tolerance concept we propose represents uncertain data as a point with a tolerance vector, not as an interval. While this is convenient for handling uncertain data, tolerance-vector constraints make mathematical development difficult. We attempt to remove the tolerance-vector constraints using quadratic penalty-vector regularization similar to the tolerance vector. We also propose clustering algorithms for uncertain data considering optimization and obtaining an optimal solution to handle uncertainty appropriately.

Semi-supervised Fuzzy c-Means Clustering for Data with Clusterwise Tolerance with Pairwise Constraints
Yukihiro Hamasuna; Yasunori Endo
Joint 5th International Conference on Soft Computing and Intelligent Systems and 11th International Symposium on Advanced Intelligent Systems (SCIS & ISIS 2010) 2010/12 [Refereed]

On tolerant fuzzy c-means clustering and tolerant possibilistic clustering
Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
SOFT COMPUTING SPRINGER 14 (5) 487 - 494 1432-7643 2010/03 [Refereed]

This paper presents two new types of clustering algorithms using a tolerance vector, called tolerant fuzzy c-means clustering and tolerant possibilistic clustering. In the proposed algorithms, the new concept of tolerance vector plays a very important role. The original concept is developed to handle data flexibly, that is, a tolerance vector is attributed not only to each datum but also to each cluster. Using the new concept, we can consider the influence of clusters on each datum by the tolerance. First, the new concept of tolerance is introduced into optimization problems. Second, the optimization problems with tolerance are solved by using Karush-Kuhn-Tucker conditions. Third, new clustering algorithms are constructed based on the optimal solutions for clustering. Finally, the effectiveness of the proposed algorithms is verified through numerical examples and its fuzzy classification function.

Semi-supervised Agglomerative Hierarchical Clustering Using Clusterwise Tolerance Based Pairwise Constraints
Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE (MDAI) SPRINGER-VERLAG BERLIN 6408 152 - 162 0302-9743 2010 [Refereed]

Recently, semi-supervised clustering has been noted and discussed in many studies. In semi-supervised clustering, pairwise constraints, that is, must-link and cannot-link, are frequently used in order to improve clustering results by using prior knowledge or information. In this paper, we will propose a clusterwise tolerance based pairwise constraint. In addition, we will propose semi-supervised agglomerative hierarchical clustering algorithms with the centroid method based on it. Moreover, we will show the effectiveness of the proposed method through numerical examples.

Semi-supervised fuzzy c-means clustering using clusterwise tolerance based pairwise constraints
Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
Proceedings - 2010 IEEE International Conference on Granular Computing, GrC 2010 IEEE Computer Society 188 - 193 2010 [Refereed]

Recently, semi-supervised clustering has been noted and discussed in many research fields. In semi-supervised clustering, prior knowledge or information is often formulated as pairwise constraints, that is, must-link and cannot-link. Such pairwise constraints are frequently used in order to improve clustering properties. In this paper, we will propose a new semi-supervised fuzzy c-means clustering by using clusterwise tolerance and pairwise constraints. First, the concepts of clusterwise tolerance and pairwise constraints are introduced. Second, the optimization problem of fuzzy c-means clustering using clusterwise tolerance based pairwise constraints is formulated. In particular, the must-link constraint is considered and introduced as a pairwise constraint. Third, a new clustering algorithm is constructed based on the above discussions. Finally, the effectiveness of the proposed algorithm is verified through numerical examples. © 2010 IEEE.

Cluster Validity Measures for Data with Tolerance
Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
2010 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE 2010) IEEE 1 - 6 1098-7584 2010 [Refereed]

Cluster validity measures are used in order to determine an appropriate number of clusters and evaluate cluster partitions obtained by clustering algorithms. When we handle a set of data, the data contain inherent uncertainty, e.g., errors, ranges, or missing values of attributes. The concept of tolerance has been proposed from the viewpoint of handling such uncertain data. In this paper, we introduce clustering algorithms for data with tolerance. Moreover, we propose five new measures for data with tolerance, that is, the determinants and the traces of fuzzy covariance matrices, the Xie-Beni index, the Fukuyama-Sugeno index, and the Davies-Bouldin index. We compare the performance of the conventional measures with their tolerance versions. We found that our proposed measures take smaller values than the conventional ones. These results indicate that the tolerance based clustering method is suitable for handling uncertain data.
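
As a point of reference, the conventional Xie-Beni index (not the tolerance version proposed in the paper) divides the membership-weighted within-cluster scatter by the number of objects times the smallest squared distance between cluster centers, so smaller values indicate more compact and better separated partitions. The sketch below is a plain illustration under that standard definition; the function name `xie_beni` and the argument layout are assumptions.

```python
def xie_beni(X, V, U, m=2.0):
    """Conventional Xie-Beni index; smaller values indicate a better partition.

    X : list of n data points (tuples of floats)
    V : list of c cluster centers (tuples of floats), c >= 2
    U : c x n membership matrix, U[i][k] = membership of X[k] in cluster i
    m : fuzzifier applied to the memberships
    """
    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    n, c = len(X), len(V)
    # Compactness: membership-weighted squared distances to the centers.
    compactness = sum(
        U[i][k] ** m * sqdist(X[k], V[i]) for i in range(c) for k in range(n)
    )
    # Separation: smallest squared distance between two distinct centers.
    separation = min(sqdist(V[i], V[j]) for i in range(c) for j in range(i + 1, c))
    return compactness / (n * separation)


# Usage: two compact, well-separated groups on a line with crisp memberships.
X = [(0.0,), (2.0,), (10.0,), (12.0,)]
V = [(1.0,), (11.0,)]
U = [[1, 1, 0, 0], [0, 0, 1, 1]]
print(xie_beni(X, V, U))  # 4 / (4 * 100) = 0.01
```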

Hard and Fuzzy c-Regression Models for Data with Tolerance in Independent and Dependent Variables
Yasunori Endo; Kouta Kurihara; Sadaaki Miyamoto; Yukihiro Hamasuna
2010 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE 2010) IEEE 1 - 8 1098-7584 2010 [Refereed]

c-regression models are known as very useful tools in many fields. Until now, many attempts have been made to construct c-regression models for data with uncertainty in independent and dependent variables. However, there are currently few c-regression models for data with uncertainty in independent variables in comparison with dependent variables. The reason is as follows. The models are constructed using optimal solutions which are derived by solving an optimization problem "analytically". The problem for data with uncertainty in dependent variables can be easily solved, but it is very difficult to solve the problem for data with uncertainty in independent variables "analytically". Therefore, most of the models for data with uncertainty in independent variables are constructed so that the solutions are calculated "numerically". Meanwhile, we have proposed "tolerance" as a convenient tool to handle data with uncertainty [3] and applied it to some clustering algorithms [4]-[7]. This concept of tolerance is very useful. The reason is that, by using the concept, we can handle data with uncertainty in the framework of optimization without introducing any particular measure between intervals. This tool is especially effective when we handle data with missing attribute values in the framework of optimization, as in fuzzy c-means clustering [6]. Besides, we think that the tolerance is also available when we construct a regression model for data with uncertainty in independent and dependent variables. In this paper, we first derive the optimal solutions for c-regression models for data with uncertainty in independent and dependent variables "analytically" by using the concept of tolerance. Second, we construct hard and fuzzy c-regression models for data with tolerance in independent and dependent variables. Moreover, we estimate the effectiveness of the algorithms through some numerical examples.

Fuzzy c-Regression Model for Data with Tolerance
Kouta Kurihara; Yasunori Endo; Yukihiro Hamasuna; Sadaaki Miyamoto
The 6th International Conference on Modeling Decisions for Artificial Intelligence (MDAI2009) 2009/12 [Refereed]

On Hierarchical Clustering for Data with Tolerance
Yasunori Endo; Yukihiro Hamasuna; Ayaka Tagaya
The 6th International Conference on Modeling Decisions for Artificial Intelligence (MDAI2009) 2009/12 [Refereed]

Two types of Tolerant Hard c-Means Clustering
Yukihiro Hamasuna; Yasunori Endo
2009 International Symposium on Nonlinear Theory and its Applications (Nolta2009) 2009/10 [Refereed]

Clustering algorithm based on probabilistic dissimilarity
Makito Yamashiro; Yasunori Endo; Yukihiro Hamasuna
Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 13 (4) 429 - 433 1883-8014 2009 [Refereed]

The clustering algorithm we propose is based on probabilistic dissimilarity, which is formed by introducing the concept of probability into conventional dissimilarity. After defining probabilistic dissimilarity, we present examples of probabilistic dissimilarity functions. We then consider an objective function with probabilistic dissimilarity and construct a clustering algorithm based on probabilistic dissimilarity using the optimal solutions that maximize the objective function. Numerical examples verify the effectiveness of our algorithm.

On Tolerant Fuzzy c-Means Clustering with L-1-Regularization
Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
PROCEEDINGS OF THE JOINT 2009 INTERNATIONAL FUZZY SYSTEMS ASSOCIATION WORLD CONGRESS AND 2009 EUROPEAN SOCIETY OF FUZZY LOGIC AND TECHNOLOGY CONFERENCE EUROPEAN SOC FUZZY LOGIC & TECHNOLOGY 1152 - 1157 2009 [Refereed]

We have proposed tolerant fuzzy c-means clustering (TFCM) from the viewpoint of handling data more flexibly. This paper presents a new type of tolerant fuzzy c-means clustering with L-1-regularization. L-1-regularization is well known as one of the most successful techniques for inducing sparseness. The proposed algorithm is distinguished by the sparseness of its tolerance vectors. In the original concept of tolerance, a tolerance vector is attributed to each datum. This paper develops the concept to handle data flexibly, that is, a tolerance vector is attributed not only to each datum but also to each cluster. First, the new concept of tolerance is introduced into optimization problems. These optimization problems are based on conventional fuzzy c-means clustering (FCM). Second, the optimization problems with tolerance are solved by using Karush-Kuhn-Tucker conditions and an optimization method for L-1-regularization. Third, new clustering algorithms are constructed based on the explicit optimal solutions. Finally, the effectiveness of the proposed algorithm is verified through some numerical examples.
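
To illustrate why an L-1 term makes tolerance vectors sparse, the sketch below solves a simplified per-coordinate problem in closed form. It assumes the tolerance for one datum and one cluster minimizes weight * (r_j + e_j)^2 + gamma * |e_j| in each coordinate j, where r_j is the residual between the datum and the cluster center, `weight` stands for the membership-dependent factor, and `gamma` is the regularization parameter. This objective and the name `l1_tolerance` are assumptions for illustration; the paper's exact formulation may differ.

```python
def l1_tolerance(residual, weight, gamma):
    """Per-coordinate minimizer of weight*(r + e)**2 + gamma*abs(e) (soft-thresholding).

    residual: list of r_j values (datum minus cluster center, assumed form)
    weight:   positive scalar multiplying the squared term
    gamma:    non-negative L-1 regularization parameter
    """
    threshold = gamma / (2.0 * weight)
    tolerance = []
    for r in residual:
        # Residuals no larger than the threshold get exactly zero tolerance.
        shrink = max(abs(r) - threshold, 0.0)
        # The tolerance points against the residual, pulling the datum toward the center.
        tolerance.append(-shrink if r > 0 else shrink)
    return tolerance
```

For example, `l1_tolerance([3.0, -0.2, 0.0, -2.0], weight=1.0, gamma=2.0)` gives a threshold of 1.0, so only the first and last coordinates receive a nonzero tolerance.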

Comparison of Tolerant Fuzzy c-Means Clustering with L-2- and L-1-Regularization
Hamasuna Yukihiro; Endo Yasunori; Miyamoto Sadaaki
2009 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING ( GRC 2009) IEEE 197 - + 2009 [Refereed]

In this paper, we propose two types of tolerant fuzzy c-means clustering with regularization terms for the tolerance vector: one uses an L-2-regularization term and the other an L-1-regularization term. By introducing the concept of clusterwise tolerance, we have previously proposed tolerant fuzzy c-means clustering from the viewpoint of handling data more flexibly. Tolerant fuzzy c-means clustering uses a constraint that places an upper bound on the tolerance vector. In this paper, regularization terms for the tolerance vector are used instead of the constraint. First, the concept of clusterwise tolerance is introduced. Second, optimization problems for tolerant fuzzy c-means clustering with regularization terms are formulated. Third, optimal solutions of these optimization problems are derived. Fourth, new clustering algorithms are constructed based on the explicit optimal solutions. Finally, the effectiveness of the proposed algorithms is verified through numerical examples.

On Fuzzy c-Means Clustering for Uncertain Data Using Quadratic Regularization of Penalty Vectors
Endo Yasunori; Hamasuna Yukihiro; Kanzawa Yuchi; Miyamoto Sadaaki
2009 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING ( GRC 2009) IEEE 148 - + 2009 [Refereed]

In recent years, data from many natural and social phenomena have been accumulated in huge databases across the worldwide network of computers. Advanced data analysis techniques that extract valuable knowledge from such data using today's computing power are therefore required. Clustering is an unsupervised classification technique for data analysis, and hard and fuzzy c-means clustering are among its most typical methods. In clustering, information in a real space is transformed into data in a pattern space and then analyzed. However, the data should often be represented not by a point but by a set because of uncertainty, e.g., measurement error margins, data that cannot be regarded as one point, and missing values. Such uncertainty has been represented as interval ranges, and many clustering algorithms for interval data have been constructed. However, no guideline for selecting an appropriate distance in each case has been shown, so this selection problem is difficult. Methods that calculate the dissimilarity between such uncertain data without introducing a particular distance, e.g., the nearest-neighbor distance, have therefore been strongly desired. From this viewpoint, we have proposed the concept of tolerance, which represents uncertain data not as an interval but as a point with a tolerance vector. In this paper, we remove the constraint on tolerance vectors by using quadratic regularization of a penalty vector, which is similar to a tolerance vector, and propose new clustering algorithms for uncertain data by considering the optimization problems and obtaining their optimal solutions, in order to handle such uncertainty more appropriately.

On L-1-Norm based Tolerant Fuzzy c-Means Clustering
Hamasuna Yukihiro; Endo Yasunori; Miyamoto Sadaaki
2009 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-3 IEEE 1125 - + 1098-7584 2009 [Refereed]

In this paper, we propose two types of L-1-norm based tolerant fuzzy c-means clustering (TFCM) from the viewpoint of handling data more flexibly. One is based on a constraint on the tolerance vector and the other is based on a regularization term. In these methods, a tolerance vector is attributed not only to each datum but also to each cluster. First, the concept of clusterwise tolerance is introduced into optimization problems. Second, optimal solutions for these optimization problems are derived. Third, new clustering algorithms are constructed based on the explicit optimal solutions. Finally, the effectiveness of the proposed algorithms is verified through numerical examples.

On Semi-Supervised Fuzzy c-Means Clustering
Endo Yasunori; Hamasuna Yukihiro; Yamashiro Makito; Miyamoto Sadaaki
2009 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-3 IEEE 1119 - + 1098-7584 2009 [Refereed]

There are two approaches to pattern classification: supervised and unsupervised. Unsupervised classification, which is called clustering and classifies data without external criteria, is very useful and has been applied in many fields. There are two types of clustering: hierarchical and non-hierarchical. Hard c-means clustering (HCM) and fuzzy c-means clustering (FCM) are often used as typical methods of non-hierarchical clustering. Supervised classification can achieve practical classification results but cannot handle a large amount of data. On the other hand, unsupervised classification can handle a large amount of data, but the method is complex and the results sometimes look somewhat strange. Therefore, semi-supervised classification has recently been studied. It combines the advantages of both methods, e.g., practical results, low cost, and short calculation time. In this paper, we propose new semi-supervised classification algorithms based on fuzzy c-means clustering in which some membership grades are given in advance as supervised membership grades.

On tolerant fuzzy c-means clustering
Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 13 (4) 421 - 428 1883-8014 2009 [Refereed]

This paper presents a new type of clustering algorithm using a tolerance vector, called tolerant fuzzy c-means clustering (TFCM). In the proposed algorithms, the new concept of the tolerance vector plays a very important role. In the original concept of tolerance, a tolerance vector is attributed to each datum. This concept is developed to handle data flexibly, that is, a tolerance vector is attributed not only to each datum but also to each cluster. Using the new concept, we can consider the influence of clusters on each datum through the tolerance. First, the new concept of tolerance is introduced into optimization problems based on conventional fuzzy c-means clustering (FCM). Second, the optimization problems with tolerance are solved by using Karush-Kuhn-Tucker conditions. Third, new clustering algorithms are constructed based on the explicit optimal solutions of the optimization problems. Finally, the effectiveness of the proposed algorithms is verified through numerical examples using the fuzzy classification function.
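
The alternating optimization described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' exact algorithm: it assumes squared Euclidean dissimilarity, a tolerance vector for every datum-cluster pair whose norm is bounded by a single hypothetical parameter `kappa`, and the standard FCM membership update with fuzzifier `m`. Under a norm-ball constraint, the optimal tolerance simply moves the datum toward the cluster center by at most `kappa`. The function name `tolerant_fcm` and the `floor` guard against division by zero are also assumptions.

```python
import math


def tolerant_fcm(data, centers, kappa, m=2.0, n_iter=100, floor=1e-12):
    """Sketch of fuzzy c-means with a norm-bounded tolerance per datum-cluster pair.

    data, centers: lists of equal-length coordinate lists; kappa: tolerance bound.
    Returns (memberships, centers).
    """
    n, c, dim = len(data), len(centers), len(data[0])
    centers = [list(v) for v in centers]
    u = [[1.0 / c] * c for _ in range(n)]
    for _ in range(n_iter):
        # Tolerance step: shift each datum toward each center by at most kappa.
        shifted = [[None] * c for _ in range(n)]
        for k in range(n):
            for i in range(c):
                diff = [data[k][j] - centers[i][j] for j in range(dim)]
                norm = math.sqrt(sum(d * d for d in diff))
                alpha = 1.0 if norm <= kappa else kappa / norm
                shifted[k][i] = [data[k][j] - alpha * diff[j] for j in range(dim)]
        # Membership step: standard FCM update on the shifted dissimilarities.
        for k in range(n):
            dist = [
                max(sum((shifted[k][i][j] - centers[i][j]) ** 2 for j in range(dim)), floor)
                for i in range(c)
            ]
            inv = [d ** (-1.0 / (m - 1.0)) for d in dist]
            total = sum(inv)
            u[k] = [w / total for w in inv]
        # Center step: weighted mean of the shifted data.
        for i in range(c):
            w = [u[k][i] ** m for k in range(n)]
            total = sum(w)
            centers[i] = [
                sum(w[k] * shifted[k][i][j] for k in range(n)) / total for j in range(dim)
            ]
    return u, centers
```

With `kappa=0` the shift vanishes and the sketch reduces to ordinary fuzzy c-means, which is one way to check that the tolerance step is the only addition.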

On Projection Correlation Proposal for a New Dissimilarity and Application to Hierarchical Clustering Algorithms
Yasunori Endo; Fuyuki Uchida; Yukihiro Hamasuna
Modeling Decisions for Artificial Intelligence (MDAI2008) 2008/10 [Refereed]

New Clustering Algorithms by using Tolerance Vector
Yukihiro Hamasuna; Yasunori Endo
Modeling Decisions for Artificial Intelligence (MDAI2008) 2008/10 [Refereed]

On a New Dissimilarity of Projection Correlation
Yasunori Endo; Fuyuki Uchida; Yukihiro Hamasuna
Joint 4th International Conference on Soft Computing and Intelligent Systems and 9th International Symposium on advanced Intelligent Systems (SCIS & ISIS 2008) 2008/09 [Refereed]

On Fuzzy c-Means for Data with Uncertainty using Spring Modulus
Yasushi Hasegawa; Yasunori Endo; Yukihiro Hamasuna
Joint 4th International Conference on Soft Computing and Intelligent Systems and 9th International Symposium on advanced Intelligent Systems (SCIS & ISIS 2008) 2008/09 [Refereed]

Fuzzy c-means for Data with Rectangular Maximum Tolerance Range
Yasunori Endo; Yasushi Hasegawa; Yukihiro Hamasuna; Sadaaki Miyamoto
Journal of Advanced Computational Intelligence and Intelligent Informatics (JACIII) Fuji Technology Press (富士技術出版株式会社) 12 (5) 461 - 466 2008/09 [Refereed]

Hard Clustering for Data with Tolerance
Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto; Yasushi Hasegawa
Journal of Japan Society for Fuzzy Theory and Intelligent Informatics Japan Society for Fuzzy Theory and intelligent informatics 20 (3) 388 - 398 1881-7203 2008/06 [Refereed]

In this paper, two clustering algorithms that handle data with tolerance are proposed. One is based on hard c-means (HCM) while the other is based on learning vector quantization (LVQC). Tolerance is a new concept for handling data with uncertainty, such as errors, ranges, or missing attributes, within the optimization framework. The concept of tolerance is included in both algorithms. Dissimilarity in earlier clustering algorithms is defined by using the nearest-neighbor, furthest-neighbor, or Hausdorff distance. In contrast, dissimilarity in the proposed algorithms is defined by the squared L₂ (Euclidean) norm, and the algorithms can handle data with uncertainty as strict optimization problems. First, the concept of tolerance, which covers errors, ranges, and missing attributes of data, is described. Optimization problems that take the tolerance into account are formulated. A unique and explicit optimal solution is given by the Karush-Kuhn-Tucker conditions. An alternate minimization algorithm and a learning algorithm are constructed. Moreover, the effectiveness of the proposed algorithms is verified through numerical examples.

On Tolerant Entropy Regularized Fuzzy c-Means
Yukihiro Hamasuna; Yasunori Endo; Makito Yamashiro
2008 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING, VOLS 1 AND 2 IEEE 244 - 247 2008 [Refereed]

This paper presents a new type of clustering algorithm using a tolerance vector. In the proposed algorithm, the tolerance vector is considered from a new viewpoint: it represents a correlation between each datum and the cluster centers. First, a new concept of tolerance is introduced into an optimization problem. This optimization problem is based on entropy regularized fuzzy c-means. Second, the optimization problem with the tolerance is solved by using the Karush-Kuhn-Tucker conditions. Next, a new clustering algorithm is constructed based on the unique and explicit optimal solutions of the optimization problem. Finally, the effectiveness of the proposed algorithm is verified through some numerical examples.

Support Vector Machine for Data with Tolerance based on Hard-Margin and Soft-Margin
Hamasuna Yukihiro; Endo Yasunori; Miyamoto Sadaaki
2008 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-5 IEEE 750 - + 1098-7584 2008 [Refereed]

This paper presents two new types of Support Vector Machine (SVM) algorithms: one is based on hard-margin SVM and the other on soft-margin SVM. These algorithms can handle data with tolerance, a concept that covers errors, ranges, or missing values in data. First, the concept of tolerance is introduced into the optimization problems of the Support Vector Machine. Second, the optimization problems with the tolerance are solved by using the Karush-Kuhn-Tucker conditions. Next, new algorithms are constructed based on the unique and explicit optimal solutions of the optimization problems. Finally, the effectiveness of the proposed algorithms is verified through some numerical examples on artificial data.

Clustering Algorithms Based on Tolerance Vector Concept
Yasunori Endo; Yasushi Hasegawa; Yukihiro Hamasuna; Sadaaki Miyamoto
Proc. 2007 International Symposium on Nonlinear Theory and Its Applications (Nolta2007) 2007/09 [Refereed]

Fuzzy c-means for data with tolerance defined as Hyper-Rectangle
Yasushi Hasegawa; Yasunori Endo; Yukihiro Hamasuna; Sadaaki Miyamoto
MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE, PROCEEDINGS SPRINGER-VERLAG BERLIN 4617 237 - + 0302-9743 2007 [Refereed]

The paper presents new clustering algorithms based on fuzzy c-means. The algorithms can treat data with tolerance defined as a hyper-rectangle. First, the tolerance is introduced into optimization problems of clustering. This is a generalization of calculation errors or missing values. Next, the problems are solved and algorithms are constructed based on the results. Finally, the usefulness of the proposed algorithms is verified through numerical examples.

Agglomerative hierarchical clustering for data with tolerance
Endo Yasunori; Hamasuna Yukihiro; Miyamoto Sadaaki
GRC: 2007 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING, PROCEEDINGS IEEE COMPUTER SOC 404 - 409 2007 [Refereed]

This paper presents new clustering algorithms based on agglomerative hierarchical clustering (AHC) with the centroid method. The algorithms can handle data with tolerance, a concept that covers errors, ranges, or missing values in data. First, the tolerance is introduced into optimization problems of clustering. Second, an objective function is introduced for calculating the centroid of a cluster, and the problem is solved using the Kuhn-Tucker conditions. Next, new algorithms are constructed based on the solution of the problem. Finally, the effectiveness of the proposed algorithms is verified through some numerical examples on artificial data.

Two clustering algorithms for data with tolerance based on hard c-means
Yukihiro Hamasuna; Yasunori Endo; Yasushi Hasegawa; Sadaaki Miyamoto
2007 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-4 IEEE, ELECTRON DEVICES SOC & RELIABILITY GROUP 687 - + 1098-7584 2007 [Refereed]

Two clustering algorithms that handle data with tolerance are proposed. One is based on hard c-means while the other uses learning vector quantization. First, the concept of tolerance, which covers errors, ranges, and missing attributes of data, is described. Optimization problems that take the tolerance into account are then formulated. Since the Kuhn-Tucker conditions give a unique and explicit optimal solution, an alternate minimization algorithm and a learning algorithm are constructed. Moreover, the effectiveness of the proposed algorithms is verified through numerical examples.

Metaheuristic Algorithms for Container Loading Problems: Framework and Knowledge Utilization
Sadaaki Miyamoto; Yasunori Endo; Koki Hanzawa; Yukihiro Hamasuna
Journal of Advanced Computational Intelligence and Intelligent Informatics (JACIII) Fuji Technology Press (富士技術出版株式会社) 11 (1) 51 - 60 2007/01 [Refereed]

Metaheuristic Algorithms for Container Loading Problem by Grouping Objects
Yasunori Endo; Koki Hanzawa; Yukihiro Hamasuna
Journal of Japan Society for Fuzzy Theory and Intelligent Informatics Japan Society for Fuzzy Theory and intelligent informatics 18 (6) 859 - 866 1881-7203 2006/12 [Refereed]

A family of automatic container loading problems is studied and algorithms are proposed. The algorithms are constructed with metaheuristics and include flat and/or vertical loading schemes, loading efficiency, stability of loaded objects, and computational requirement. Handling groups of objects in a metaheuristic scheme is moreover considered. Numerical examples are given.

Metaheuristic Algorithms for Container Loading Problem Using Grouping Objects
Yasunori Endo; Sadaaki Miyamoto; Koki Hanzawa; Yukihiro Hamasuna
Proc. 2006 International Symposium on Nonlinear Theory and Its Applications (Nolta2006) 2006/09 [Refereed]

Container Loading Problem: Formulation, Knowledge Utilization, and Algorithms
Yasunori Endo; Sadaaki Miyamoto; Koki Hanzawa; Yukihiro Hamasuna
Modeling Decisions for Artificial Intelligence (MDAI2005) 2005/07 [Refereed]

MISC

On Sequential Cluster Extraction Using Possibilistic Size Control Clustering
Ryota Uto; Yukihiro Hamasuna Proceedings of the 39th Fuzzy System Symposium 2023/09

A Study on Cluster Validity Measures Based on Fuzzy Membership for Time-Series Data
Kenshin Fujita; Yukihiro Hamasuna Proceedings of the 39th Fuzzy System Symposium 2023/09

A Study on Parameter Estimation in Gaussian Process based c-Regression Models
Yuya Yokoyama; Yukihiro Hamasuna Proceedings of the 39th Fuzzy System Symposium 2023/09

A Study on Initial Value Determination Using k-medoids++ in Controlled Edge-Sized Network Clustering
Hiroto Migita; Yukihiro Hamasuna Proceedings of the 39th Fuzzy System Symposium 2023/09

Noise Clustering based on Local Outlier Factor
Yoshitomo Mori; Yukihiro Hamasuna Proceedings of the 39th Fuzzy System Symposium 2023/09

A Study on Network Clustering Using Similarity Based on Node Neighborhood Sets
Katsumi Endo; Yukihiro Hamasuna Proceedings of the 39th Fuzzy System Symposium 2023/09

Comparison of Automatic Cluster Number Estimation Methods by Hierarchical Clustering
Atusya Higashino; Yukihiro Hamasuna Proceedings of the 39th Fuzzy System Symposium 2023/09

Hyperparameter Optimization for Gaussian Process Sequential Regression Models
Kaito Takegawa; Yukihiro Hamasuna Proceedings of the 39th Fuzzy System Symposium 2023/09

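A Study on Fuzzy Clustering with a Size Control Function Using the L1 Norm
青木悠真; Yukihiro Hamasuna SICE Systems and Information Division Annual Conference 2022 (SSI2022) 2022/11

An Analysis of Robustness in Clustering Using Doc2Vec and Hierarchical Clustering
奥早和紀; Yukihiro Hamasuna SICE Systems and Information Division Annual Conference 2022 (SSI2022) 2022/11

A Study on Edge Control in Network Clustering
Yota Echikawa; Yukihiro Hamasuna SICE Systems and Information Division Annual Conference 2022 (SSI2022) 2022/11

A Study on Sequential Extraction Regression Models Based on Gaussian Processes
Kaito Takegawa; Yukihiro Hamasuna The 38th Fuzzy System Symposium (FSS2022) 2022/09

c-Regression Models Based on Gaussian Process Regression
Yuya Yokoyama; Yukihiro Hamasuna The 38th Fuzzy System Symposium (FSS2022) 2022/09

A Study on Automatic Estimation of the Number of Clusters Using Hierarchical Clustering
Atusya Higashino; Yukihiro Hamasuna The 38th Fuzzy System Symposium (FSS2022) 2022/09

A Consideration of Cluster Partitioning of Network Data Using Hierarchical Clustering
Katsumi Endo; Yukihiro Hamasuna The 38th Fuzzy System Symposium (FSS2022) 2022/09

A Study on Cluster Validity Measures for Time-Series Data
Kenshin Fujita; Yukihiro Hamasuna The 38th Fuzzy System Symposium (FSS2022) 2022/09

A Study on Ward's Method for Time-Series Data
大野淳寛; Yukihiro Hamasuna The 38th Fuzzy System Symposium (FSS2022) 2022/09

Possibilistic Clustering with a Size Control Function
Ryota Uto; Yukihiro Hamasuna The 30th Intelligent Systems Symposium (FAN2022) 35 -40 2022/09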

A Study on Controlled-Sized Clustering for Time Series Data
Nobuhiko Tsuda; Yukihiro Hamasuna Joint 11th International Conference on Soft Computing and Intelligent Systems and 21st International Symposium on Advanced Intelligent Systems(SCIS/ISIS) 417 -422 2020/09

A Study on c-Regression Model Based on Gaussian Process
Yuto Kingetsu; Yukihiro Hamasuna The 36th Fuzzy System Symposium 2020/09

Map Segmentation in RoboCupRescue Using Louvain Method and Its Application to Agent Control
Soma Kitamura; Yoshihiro Nishimura; Yukihiro Hamasuna The 36th Fuzzy System Symposium 2020/09

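k-medoids Using JS Divergence
Yuto Kingetsu; Yukihiro Hamasuna The 35th Fuzzy System Symposium (FSS2019) 2019/08

Validity Evaluation of Cluster Partitions Based on Voronoi Diagrams
Nobuhiko Tsuda; Yukihiro Hamasuna The 35th Fuzzy System Symposium (FSS2019) 2019/08

Evaluation Using Packing Rate in the RoboCup 2D League
大津拓登; Soma Kitamura; Yukihiro Hamasuna The 35th Fuzzy System Symposium (FSS2019) 2019/08

Implementation and Evaluation of the Five-Lane Theory for the RoboCup 2D League
Soma Kitamura; 大津拓登; Yukihiro Hamasuna The 35th Fuzzy System Symposium (FSS2019) 2019/08

On Clustering in Which the Cluster Partition Is Homotopy Equivalent to a Weighted Alpha Complex
星野翔大; Yasunori Endo; Yukihiro Hamasuna The 35th Fuzzy System Symposium (FSS2019) 2019/08

Two-Stage Clustering Using DP-means and Hierarchical Clustering
大津拓登; Yukihiro Hamasuna The 23rd Workshop on Tackling Ambiguous Feelings (H&M2018) 2018/12

Clustering for Unweighted Network Data and Its Evaluation
小林大記; Yukihiro Hamasuna The 34th Fuzzy System Symposium (FSS2018) 2018/09

A Study on Network Clustering Based on Constraints on the Number of Nodes
Shusuke Nakano; Yukihiro Hamasuna; Yasunori Endo The 34th Fuzzy System Symposium (FSS2018) 2018/09

On Clustering Based on Objective Function Optimization in Which the Cluster Partition Is Homotopy Equivalent to a Weighted Alpha Complex
星野翔太; Yasunori Endo; Yukihiro Hamasuna The 34th Fuzzy System Symposium (FSS2018) 2018/09

A Study on Algorithms for Estimating the Number of Clusters in Network Data
Ryo Ozaki; Yukihiro Hamasuna The 27th Intelligent Systems Symposium 2017/11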

A Study on Outlier Detection for Network Data by Using Fuzzy Clustering
Hamasuna Yukihiro; Ozaki Ryo Proceedings of the Japan Joint Automatic Control Conference 60- (0) 1550 -1551 2017/11

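Cluster Validity Measures Based Agglomerative Hierarchical Clustering for Network Data
Ryo Ozaki; Yukihiro Hamasuna Proceedings of the Fuzzy System Symposium 33- 435 -440 2017/09

A Study on Outlier Detection for Network Data
Yukihiro Hamasuna; Ryo Ozaki The 33rd Fuzzy System Symposium (FSS2017) 2017/09

Clustering and others : beyond the k-means
Yukihiro Hamasuna Proceedings of the Annual Conference of the Institute of Systems, Control and Information Engineers 61- 6p 2017/05

Two-Stage Clustering Using Validity Measures Based on Kernel Methods
Ryo Ozaki; Yukihiro Hamasuna; Yasunori Endo The 26th Intelligent Systems Symposium (FAN2016) 2016/10

A Study on Validity Measures for Graph Clustering
藤澤拓也; Ryo Ozaki; Yukihiro Hamasuna The 26th Intelligent Systems Symposium (FAN2016) 2016/10

A Study on Sequential Extraction Clustering Using Kernel Functions
Yukihiro Hamasuna; Yasunori Endo The 32nd Fuzzy System Symposium (FSS2016) 2016/09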

On Cluster Validity Measures Based x-means for Fuzzy Partition
Hamasuna Yukihiro; Endo Yasunori Proceedings of the Fuzzy System Symposium 31- 99 -100 2015/09

Practical Statistical Tests and Machine Learning(6)Introduction to Clustering : Beyond k-means
HAMASUNA Yukihiro Systems, control and information 59- (6) 240 -245 2015

On x-means Using Validity Measures
Yukihiro Hamasuna; Yasunori Endo The 6th Workshop on Computational Intelligence 2014/12

On Semi-supervised Clustering with Assignment Prototype Term
Hamasuna Yukihiro; Endo Yasunori Proceedings of the Fuzzy System Symposium 30- 450 -451 2014/09

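Sequential Cluster Extraction Using the L1-Regularized Assignment-Prototype Algorithm
Yukihiro Hamasuna; Yasunori Endo 2014/03

On Entropy-Based Possibilistic Clustering Using L1 Regularization
Yukihiro Hamasuna; Yasunori Endo The 29th Fuzzy System Symposium (FSS2013) 2013/09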

On Entropy Based Fuzzy Non-Metric Model with A Variable Controlling Cluster Sizes
Hamasuna Yukihiro; Endo Yasunori Proceedings of the Fuzzy System Symposium 29- 163 -163 2013/09

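Estimating the Number of Clusters in Fuzzy c-Regression Models Using Information Criteria
Yukihiro Hamasuna; Yasunori Endo The 57th Annual Conference of the Institute of Systems, Control and Information Engineers (SCI'13) 2013/05

Estimation of the Optimal Number of Clusters in Fuzzy c-Regression Models
Yukihiro Hamasuna; Yasunori Endo The 39th Fuzzy Workshop 2013/03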

On Sparse Possibilistic Clustering with Crispness
Yukihiro Hamasuna; Yasunori Endo Proceedings of the Fuzzy System Symposium 28- (0) 859 -862 2012/09

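On Sequential Extraction Hard Clustering Using Clusterwise Tolerance
樋口徹; Yukihiro Hamasuna; Yasunori Endo The 56th Annual Conference of the Institute of Systems, Control and Information Engineers (SCI'12) 2012/05

On Semi-Supervised Hierarchical Clustering Using Pairwise Constraints Based on Clusterwise Tolerance
中矢亮祐; Yukihiro Hamasuna; Yasunori Endo The 56th Annual Conference of the Institute of Systems, Control and Information Engineers (SCI'12) 2012/05

Performance Comparison of Semi-Supervised c-Means Using Clusterwise Tolerance
Yukihiro Hamasuna; Yasunori Endo The 38th Fuzzy Workshop 2012/03

On Semi-supervised Fuzzy c-Means Clustering Using Clusterwise Tolerance
Yukihiro Hamasuna; Yasunori Endo Proceedings of the Fuzzy System Symposium 27- 323 -326 2011/09

On Support Vector Machine for Uncertain Data with Penalty Vectors
髙山勲; Yasunori Endo; Yukihiro Hamasuna Proceedings of the Fuzzy System Symposium 27- 327 -330 2011/09

A Study on Cluster Validity Measures for Data with Tolerance
日置彩子; Yasunori Endo; Yukihiro Hamasuna Proceedings of the Fuzzy System Symposium 27- 317 -322 2011/09

On principal component analysis for data with tolerance
Yasunori Endo; Tatsuyoshi Tsuji; Yukihiro Hamasuna; Kota Kurihara Proceedings - 2011 IEEE International Conference on Granular Computing, GrC 2011 27- 177 -182 2011

On Fuzzy c-Means with a Penalty Term for Uncertain Data
宮本智明; Yukihiro Hamasuna; Yasunori Endo The 26th Fuzzy System Symposium (FSS2010) 2010/09

A Field Survey of Graduate Education at the Vienna University of Economics and Business
Yasunori Endo; Yukihiro Hamasuna; Sadaaki Miyamoto リスク工学研究 (Risk Engineering Research) 4- 12 -15 2008/03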

Awards & Honors

2012/09 Japan Society for Fuzzy Theory and Intelligent Informatics Encouragement Award

Research Grants & Projects

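Extension and Advancement of Clustering Methods for Network Data with Structural Fluctuations
Japan Society for the Promotion of Science：Grant-in-Aid for Scientific Research (C)
Date (from‐to) : 2019/04 -2022/03
Author : HAMASUNA Yukihiro

Development of Clustering Robust to Data Structure
The Telecommunications Advancement Foundation：Research Grant
Date (from‐to) : 2019/04 -2020/03
Author : HAMASUNA Yukihiro

Establishment of Clustering Methods for Graph Data with Structural Fluctuations
Japan Society for the Promotion of Science：Grant-in-Aid for Young Scientists (B)
Date (from‐to) : 2016/04 -2019/03
Author : HAMASUNA Yukihiro

Development of Knowledge-Fusion Clustering Techniques for Graph Data
The Telecommunications Advancement Foundation：Research Grant
Date (from‐to) : 2016/04 -2017/03
Author : HAMASUNA Yukihiro

Advancement of Semi-Supervised Spectral Clustering: With a Particular Focus on the Analysis of Social Data
The Telecommunications Advancement Foundation：Research Grant
Date (from‐to) : 2013/04 -2014/03
Author : HAMASUNA Yukihiro

Advancement of Clustering through the Fusion of Knowledge Bases: With a Particular Focus on the Analysis of Uncertain Data
Japan Society for the Promotion of Science：Grants-in-Aid for Scientific Research
Date (from‐to) : 2009 -2010
Author : HAMASUNA Yukihiro

With the remarkable development of information and communication technology, data of a scale and complexity incomparable to earlier times are being accumulated, and the need to extract useful information from such data through flexible, human-like processing continues to grow. Clustering is one such data analysis method. It is an important method for extracting structure from large-scale, complex data that is difficult for humans to extract, and it has been applied in various fields such as natural language and image recognition. Data handled in clustering are usually represented as points in a pattern space. However, when data carry inherent uncertainty such as errors or missing values, they are represented as intervals or ranges and are therefore difficult to handle with existing methods. This research project therefore aimed to build a methodology capable of handling uncertainty as flexibly as humans do, and worked on advancing clustering methods that treat the uncertainty accompanying data as data with tolerance. As outcomes of the project, clustering methods and regression analysis for data with tolerance were constructed, and clustering methods using clusterwise tolerance were established. In addition, we attempted to extend validity measures, which evaluate the classification results obtained by clustering, to uncertain data, and newly constructed validity measures for data with tolerance. In parallel, regression analysis, a supervised learning method, was extended to uncertain data. From the results of these studies, we consider that a methodology of data analysis that handles uncertain data using the concept of tolerance has been established. In particular, the clustering methods for uncertain data differ substantially from conventional data analysis methods in that they make it possible to discuss everything from the classification of data to its evaluation within a unified framework based on the concept of tolerance. Furthermore, we were able to show various avenues for further development, such as applying clusterwise tolerance, an extension of this project, to semi-supervised clustering methods, and we consider that the objectives of this research project were fully achieved.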

Others

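2019/04 -2020/03 Advancement of Clustering for Time-Series Data
Kindai University Internal Research Grant, Encouragement Research Grant SR12 Research content: Advancement of Clustering for Time-Series Data

2016/04 -2017/03 Development of Algorithms for Estimating the Number of Clusters in Large-Scale Data
Kindai University Internal Research Grant, Encouragement Research Grant SR01 Research content: Development of Algorithms for Estimating the Number of Clusters by Modeling Cluster Structure

2013/04 -2013/04 Development of Algorithms for Automatically Estimating the Number of Clusters: With a Particular Focus on Information Criteria
Kindai University Internal Research Grant, Encouragement Research Grant SR04 Research content: Development of Algorithms for Automatically Estimating the Number of Clusters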
