HAMASUNA Yukihiro

Department of Informatics, Associate Professor

Last Updated: 2024/10/10

■Researcher basic information

Researcher number

70610559

Research Keyword

  • Data Science, Machine Learning, Soft Computing, Clustering

Research Field

  • Informatics / Sensitivity (kansei) informatics
  • Informatics / Soft computing
  • Informatics / Intelligent informatics
  • Informatics / Information theory

■Research activity information

Award

  • 2012/09 Japan Society for Fuzzy Theory and Intelligent Informatics, Encouragement Award

Paper

  • Yukihiro Hamasuna; Yoshitomo Mori
    Lecture Notes in Computer Science Springer Nature Switzerland 77 - 89 0302-9743 2024/08 [Refereed]
  • Gaussian process based sequential regression models
    Kaito Takegawa; Yuya Yokoyama; Yukihiro Hamasuna
    IEEE World Congress on Computational Intelligence (IEEE WCCI 2024) 2024/07 [Refereed]
  • Yukihiro Hamasuna; Yoshitomo Mori
    Lecture Notes in Computer Science Springer Nature Switzerland 14376 179 - 191 0302-9743 2023/10 [Refereed]
  • Yukihiro Hamasuna; Yuya Yokoyama; Kaito Takegawa
    2022 Joint 12th International Conference on Soft Computing and Intelligent Systems and 23rd International Symposium on Advanced Intelligent Systems (SCIS&ISIS) IEEE 2022/11 [Refereed]
  • Yukihiro Hamasuna; Shusuke Nakano; Yasunori Endo
    Modeling Decisions for Artificial Intelligence, LNAI 12898 Springer International Publishing 243 - 256 0302-9743 2021/09 [Refereed]
  • Yuto Kingetsu; Yukihiro Hamasuna
    Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press Ltd. 25 (2) 226 - 233 1343-0130 2021/03 [Refereed]
     
    Several conventional clustering methods use the squared L2-norm as the dissimilarity. The squared L2-norm is calculated from only the object coordinates and obtains a linear cluster boundary. To extract meaningful cluster partitions from a set of massive objects, it is necessary to obtain cluster partitions consisting of complex cluster boundaries. In this study, a JS-divergence-based k-medoids (JSKMdd) is proposed. In the proposed method, JS-divergence, which is calculated from the object distribution, is considered as the dissimilarity. The object distribution is estimated by kernel density estimation so that the dissimilarity is based on both the object coordinates and their neighbors. Numerical experiments were conducted using five artificial datasets to verify the effectiveness of the proposed method. In the numerical experiments, the proposed method was compared with k-means clustering, k-medoids clustering, and spectral clustering. The results show that the proposed method yields better clustering performance than the other conventional methods.
  • TSUDA Nobuhiko; HAMASUNA Yukihiro; ENDO Yasunori
    Journal of Japan Society for Fuzzy Theory and Intelligent Informatics Japan Society for Fuzzy Theory and Intelligent Informatics 33 (2) 608 - 616 1347-7986 2021 [Refereed]
     

    Time-series data is data that contains information about time-varying phenomena, and it has a wide range of applications. Clustering is one of the data analysis methods used to analyze large, complex time-series data and extract their features. The important issues in clustering time-series data are the selection of a suitable dissimilarity and the selection of a suitable clustering algorithm. In this paper, we propose new clustering methods to handle imbalanced time-series data by introducing the concept of size control into clustering methods for time-series data. The proposed methods are constructed by extending k-medoids using dynamic time warping (DTW) as the dissimilarity, and k-medoids and k-shape using shape-based distance (SBD) as the dissimilarity, which are typical methods for time-series data. The performance of the proposed methods is verified by numerical experiments using 12 datasets available in the UCR Time Series Classification Archive. From the numerical experiments, we confirmed that k-medoids with size control using DTW obtains the best cluster partition among the proposed methods.

  • Nobuhiko Tsuda; Yukihiro Hamasuna
    2020 Joint 11th International Conference on Soft Computing and Intelligent Systems and 21st International Symposium on Advanced Intelligent Systems (SCIS-ISIS) IEEE 2020/12 [Refereed]
  • Yukihiro Hamasuna; Daiki Kobayashi; Yasunori Endo
    The 17th International Conference on Modeling Decisions for Artificial Intelligence 2020/09 [Refereed]
  • Yasunori Endo; Kanata Hoshino; Yukihiro Hamasuna
    Journal of Ambient Intelligence and Humanized Computing Springer Science and Business Media LLC 1868-5137 2020/07 [Refereed]
  • k-Medoids Clustering Based on Kernel Density Estimation and Jensen-Shannon Divergence
    Yukihiro Hamasuna; Yuto Kingetsu; Shusuke Nakano
    The 16th International Conference on Modeling Decisions for Artificial Intelligence (MDAI 2019) 2019/09 [Refereed]
  • Cluster Validity Measures Based Agglomerative Hierarchical Clustering for Network Data
    Yukihiro Hamasuna; Shusuke Nakano; Ryo Ozaki; Yasunori Endo
    Journal of Advanced Computational Intelligence and Intelligent Informatics (JACIII) 23 (3) 577 - 583 2019/05 [Refereed]
  • Daiki Kobayashi; Yukihiro Hamasuna
    Proceedings - 2018 Joint 10th International Conference on Soft Computing and Intelligent Systems and 19th International Symposium on Advanced Intelligent Systems, SCIS-ISIS 2018 Institute of Electrical and Electronics Engineers Inc. 648 - 653 2018/07 
    Most network data obtained from the real world consists only of edge connections. To analyze network data using clustering, weights or dissimilarities between nodes are required. Although various weighting methods have been proposed, which weighting is suitable has not been discussed. In this study, we used two weighting methods calculated from edge connections to identify a suitable weighting method for unweighted network data. One is the Euclidean distance based on the adjacency matrix and the other is the diffusion kernel. Next, the k-medoids method and the Louvain method are executed to obtain network cluster partitions. The obtained partitions are then evaluated by cluster validity measures including Modularity. The results showed that k-medoids with the diffusion kernel is effective. In addition, the cluster validity measures perform well for the partition obtained by k-medoids with the diffusion kernel.
  • Shusuke Nakano; Yukihiro Hamasuna; Yasunori Endo
    Proceedings - 2018 Joint 10th International Conference on Soft Computing and Intelligent Systems and 19th International Symposium on Advanced Intelligent Systems, SCIS-ISIS 2018 Institute of Electrical and Electronics Engineers Inc. 826 - 831 2018/07 
    The Louvain method is one of the typical network clustering methods. It is well known that the Louvain method obtains a good cluster partition in a short time. However, there are several network datasets for which the Louvain method does not obtain a good cluster partition. One of the reasons is that the Louvain method focuses only on edge connections. We proposed a method that focuses on node size. The proposed method optimizes the objective function of k-medoids by solving a linear programming problem under constraints on node size. We verified the usefulness of the proposed method from the viewpoint of calculation time and accuracy with artificial and benchmark unweighted network datasets. The numerical examples show that the proposed method is faster and obtains a better cluster partition than the Louvain method. The Euclidean distance based on the adjacency matrix does not obtain a good cluster partition for datasets that consist of terminal nodes or high-degree nodes.
  • Yukihiro Hamasuna; Daiki Kobayashi; Ryo Ozaki; Yasunori Endo
    Journal of Advanced Computational Intelligence and Intelligent Informatics (JACIII) 22 (4) 544 - 550 2018/07 [Refereed]
  • Kei Kitajima; Yasunori Endo; Yukihiro Hamasuna
    Journal of Advanced Computational Intelligence and Intelligent Informatics (JACIII) 22 (4) 537 - 543 2018/07 [Refereed]
  • Yukihiro Hamasuna; Ryo Ozaki; Yasunori Endo
    Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 22 (1) 54 - 61 1883-8014 2018/01 [Refereed]
     
    To handle large-scale objects, a two-stage clustering method has been previously proposed. The method generates a large number of clusters during the first stage and merges clusters during the second stage. In this paper, a novel two-stage clustering method is proposed by introducing cluster validity measures as the merging criterion during the second stage. The significant cluster validity measures used to evaluate cluster partitions and determine the suitable number of clusters act as the criteria for merging clusters. The performance of the proposed method based on six typical indices is compared using eight artificial datasets. These experiments show that the trace of the fuzzy covariance matrix Wtr and its kernelization KWtr are quite effective when applying the proposed method, and obtain better results than the other indices.
  • Yasunori Endo; Yukihiro Hamasuna; Tsubasa Hirano; Naohiko Kinoshita
    Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 22 (1) 62 - 69 1883-8014 2018/01 [Refereed]
     
    A clustering method referred to as K-member clustering classifies a dataset into certain clusters, the size of which is more than a given constant K. Even-sized clustering, which classifies a dataset into even-sized clusters, is also considered along with K-member clustering. In our previous study, we proposed Even-sized Clustering Based on Optimization (ECBO) to output adequate results by formulating an even-sized clustering problem as linear programming. The simplex method is used to calculate the belongingness of each object to clusters in ECBO. In this study, ECBO is extended by introducing ideas that were introduced in k-means or fuzzy c-means to resolve problems of initial-value dependence, robustness against outliers, calculation costs, and nonlinear boundaries of clusters. We also reconsider the relation between the dataset size, the cluster number, and K in ECBO. Moreover, we verify the effectiveness of the variants of ECBO based on experimental results using synthetic datasets and a benchmark dataset.
  • Ryo Ozaki; Yukihiro Hamasuna; Yasunori Endo
    2017 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2017 Institute of Electrical and Electronics Engineers Inc. 2017- 1822 - 1827 1062-922X 2017/11 [Refereed]
     
    Modularity is an evaluation measure for graph clustering. The Louvain method is constructed by local optimization of modularity and is a bottom-up method, as is agglomerative hierarchical clustering. Cluster validity measures, like modularity, are used to evaluate cluster partitions. They are traditional evaluation measures in the field of clustering. We propose a novel graph clustering method based on agglomerative hierarchical clustering. The proposed method is constructed by local optimization of cluster validity measures. The effectiveness of the proposed method is shown through numerical examples, which show that the proposed method has a different clustering property from the Louvain method because of the features of cluster validity measures.
  • On Fuzzified Even-sized Clustering Based on Optimization
    Kei Kitajima; Yasunori Endo; Yukihiro Hamasuna
    The 14th International Conference on Modeling Decisions for Artificial Intelligence (MDAI 2017) 2017/10 [Refereed]
  • On Edge Penalty Based Hard and Fuzzy c-Medoids for Uncertain Networks
    Yukihiro Hamasuna; Yasunori Endo
    The 14th International Conference on Modeling Decisions for Artificial Intelligence (MDAI 2017) 2017/10 [Refereed]
  • Ryosuke Abe; Sadaaki Miyamoto; Yasunori Endo; Yukihiro Hamasuna
    IFSA-SCIS 2017 - Joint 17th World Congress of International Fuzzy Systems Association and 9th International Conference on Soft Computing and Intelligent Systems Institute of Electrical and Electronics Engineers Inc. 1 - 5 2017/08 [Refereed]
     
    The problem of estimating an appropriate number of clusters has been a main and difficult issue in clustering research. There are different methods for this in non-hierarchical clustering; a typical approach is to try clustering for different numbers of clusters and compare the results using a measure to estimate the cluster number. On the other hand, there is no such method to automatically estimate the number of clusters in agglomerative hierarchical clustering (AHC), since AHC produces a family of clusters with different cluster numbers at the same time in the form of dendrograms. An exception is the Newman method in network clustering, but this method does not have a useful dendrogram output. The aim of the present paper is to propose new methods to automatically estimate the number of clusters in AHC. We show two approaches for this purpose: one is to use a variation of a cluster validity measure, and the other is to use a statistical model selection method such as BIC.
  • Yukihiro Hamasuna; Ryo Ozaki; Yasunori Endo
    IFSA-SCIS 2017 - Joint 17th World Congress of International Fuzzy Systems Association and 9th International Conference on Soft Computing and Intelligent Systems Institute of Electrical and Electronics Engineers Inc. 1 - 6 2017/08 [Refereed]
     
    Modularity is one of the evaluation measures for network data and is used as the criterion for merging two clusters in the Louvain method. To construct useful cluster validity measures for network data, the effectiveness of eight conventional cluster validity measures is compared with that of Modularity. Cluster partitions of six artificial network datasets are obtained by k-medoids and evaluated by cluster validity measures including Modularity. Numerical experiments show that Dunn's index is more effective than the other conventional cluster validity measures.
  • Yasunori Endo; Sachiko Ishida; Naohiko Kinoshita; Yukihiro Hamasuna
    IEEE International Conference on Fuzzy Systems Institute of Electrical and Electronics Engineers Inc. 1 - 6 1098-7584 2017/08 [Refereed]
     
    Clustering is an unsupervised classification method; that is, it classifies a data set into clusters without any external criterion. Typical clustering methods, e.g., k-means (KM) or fuzzy c-means (FCM), are constructed based on optimization of a given objective function. Many clustering methods, including KM and FCM, are formulated as optimization problems with typical objective functions and constraints. The objective function itself is also an evaluation guideline for the results of clustering methods. Considered together with its theoretical extensibility, there is a great advantage in constructing clustering methods in the framework of optimization. From the viewpoint of optimization, some of the authors proposed an Even-sized Clustering method Based on Optimization (ECBO), which has tight constraints on cluster size, and constructed some variations of ECBO. The constraint considered in ECBO is that each cluster size is K or K + 1, and the belongingness of each object to clusters is calculated by the simplex method in each iteration. ECBO is considered to have an advantage over other similar methods from the viewpoint of clustering accuracy, cluster size, and optimization framework. However, the constraint on cluster sizes in ECBO is tight, so it may be inconvenient when some extra margin of cluster size is allowed. Moreover, it is expected that new clustering algorithms in which each cluster size can be controlled can deal with a wider variety of datasets. From the above viewpoint, we proposed two new clustering algorithms based on ECBO. One is COntrolled-sized Clustering Based on Optimization (COCBO), and the other is an extended COCBO, which is referred to as COntrolled-sized Clustering Based on Optimization++ (COCBO++). Each cluster size can be controlled in the algorithms. However, these algorithms have some problems. In this paper, we describe various types of COCBO to solve the above problems and evaluate the methods in some numerical examples.
  • Yukihiro Hamasuna; Ryo Ozaki; Yasunori Endo
    The 2017 conference of the International Federation of Classification Societies (IFCS2017) 2017/08 [Refereed]
  • Yukihiro Hamasuna; Yasunori Endo
    Studies in Computational Intelligence Springer Verlag 671 87 - 99 1860-949X 2017/01 [Refereed]
     
    A large number of clustering algorithms have been proposed to handle target data and deal with various real-world problems such as uncertain data mining and semi-supervised learning. We focus on these two topics and introduce two concepts to construct significant clustering algorithms. We propose the tolerance and penalty-vector concepts for handling uncertain data. We also propose the clusterwise tolerance concept for semi-supervised learning. These concepts are quite similar in that they handle objects flexibly according to each clustering topic. We construct two clustering algorithms, FCMT and FCMQ, for handling uncertain data. We also construct two clustering algorithms, FCMCT and SSFCMCT, for semi-supervised learning. We consider that these concepts have the potential to resolve conventional and brand-new clustering topics in various ways.
  • Comparison of Trace of Fuzzy Covariance Matrix with Its Kernelization in Cluster Validity Measures based x-means
    Yukihiro Hamasuna; Yasunori Endo
    The 14th International Conference on Modeling Decisions for Artificial Intelligence (MDAI 2017) 2016/09 [Refereed]
  • Yukihiro Hamasuna; Naohiko Kinoshita; Yasunori Endo
    Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 20 (5) 845 - 853 1883-8014 2016/09 [Refereed]
     
    The x-means determines the suitable number of clusters automatically by executing k-means recursively. The Bayesian Information Criterion is applied to evaluate a cluster partition in the x-means. A novel type of x-means clustering is proposed by introducing cluster validity measures that are used to evaluate the cluster partition and determine the number of clusters instead of the information criterion. The proposed x-means uses cluster validity measures in the evaluation step, and an estimation of the particular probabilistic model is therefore not required. The performances of a conventional x-means and the proposed method are compared for crisp and fuzzy partitions using eight datasets. The comparison shows that the proposed method obtains better results than the conventional method, and that the cluster validity measures for a fuzzy partition are effective in the proposed method.
  • Yasunori Endo; Tomoyuki Suzuki; Naohiko Kinoshita; Yukihiro Hamasuna
    Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 20 (4) 571 - 579 1883-8014 2016 [Refereed]
     
    The fuzzy non-metric model (FNM) is a representative non-hierarchical clustering method, which is very useful because the belongingness or the membership degree of each datum to each cluster can be calculated directly from the dissimilarities between data and the cluster centers are not used. However, the original FNM cannot handle data with uncertainty. In this study, we refer to the data with uncertainty as "uncertain data," e.g., incomplete data or data that have errors. Previously, a method was proposed based on the concept of a tolerance vector for handling uncertain data and some clustering methods were constructed according to this concept, e.g., fuzzy c-means for data with tolerance. These methods can handle uncertain data in the framework of optimization. Thus, in the present study, we apply the concept to FNM. First, we propose a new clustering algorithm based on FNM using the concept of tolerance, which we refer to as the fuzzy non-metric model for data with tolerance. Second, we show that the proposed algorithm can handle incomplete data sets. Third, we verify the effectiveness of the proposed algorithm based on comparisons with conventional methods for incomplete data sets in some numerical examples.
  • Yasunori Endo; Tsubasa Hirano; Naohiko Kinoshita; Yukihiro Hamasuna
    MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE, (MDAI 2016) SPRINGER-VERLAG BERLIN 9880 165 - 177 0302-9743 2016 [Refereed]
     
    Clustering is a very useful tool for data mining. A clustering method referred to as K-member clustering classifies a dataset into clusters whose size is more than a given constant K. K-member clustering is useful and has been applied in many applications. Naturally, clustering methods that classify a dataset into even-sized clusters can be considered, and some even-sized clustering methods have been proposed. However, conventional even-sized clustering methods often output inadequate results. One of the reasons is that they are not based on optimization. Therefore, we proposed Even-sized Clustering Based on Optimization (ECBO) in our previous study. The simplex method is used to calculate the belongingness of each object to clusters in ECBO. In this study, ECBO is extended by introducing some ideas that were introduced in k-means or fuzzy c-means to resolve problems of initial-value dependence, robustness against outliers, calculation cost, and nonlinear boundaries of clusters. Moreover, we reconsider the relation between the dataset size, the cluster number, and K in ECBO.
  • Ryo Ozaki; Yukihiro Hamasuna; Yasunori Endo
    2016 JOINT 8TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS (SCIS) AND 17TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (ISIS) IEEE 410 - 415 2016 [Refereed]
     
    Two-stage clustering consists of a generating stage and a merging stage. To handle a large number of objects, a two-stage clustering algorithm generates a large number of clusters in the first stage and merges clusters in the second stage. A novel two-stage clustering method is proposed by introducing cluster validity measures, which are used to evaluate cluster partitions and determine the suitable number of clusters. A significant cluster validity measure is used in the second stage and plays the role of the criterion for merging clusters. The performance of the proposed method is compared using six artificial datasets and three benchmark datasets. These experiments show that several cluster validity measures, namely the trace of the fuzzy covariance matrix and indices based on membership degrees, are effective in the proposed method and obtain better results than other indices.
  • Yukihiro Hamasuna; Yasunori Endo
    2016 JOINT 8TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS (SCIS) AND 17TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (ISIS) IEEE 416 - 419 2016 [Refereed]
     
    A sequential clustering method extracts clusters sequentially without determining the number of clusters in advance. Sequential hard clustering is based on noise clustering and is one of the typical sequential clustering methods. A kernelized sequential hard clustering is proposed by introducing the kernel method to sequential hard clustering in order to handle datasets that consist of non-linear clusters and to execute robust clustering. The performance of the proposed method is evaluated with a typical dataset that has a non-linear cluster boundary. Negative results are obtained through numerical examples, which show that the proposed method cannot extract non-linear clusters.
  • Yasunori Endo; Tomoyuki Suzuki; Naohiko Kinoshita; Yukihiro Hamasuna; Sadaaki Miyamoto
    IEEE International Conference on Fuzzy Systems Institute of Electrical and Electronics Engineers Inc. 2015- 1098-7584 2015/11 [Refereed]
     
    Clustering is a technique of unsupervised classification. The methods are classified into two types: hierarchical and non-hierarchical. The fuzzy non-metric model (FNM) is a representative method of non-hierarchical clustering. FNM is very useful because the belongingness or the membership degree of each datum to each cluster is calculated directly from dissimilarities between data, and cluster centers are not used. However, FNM cannot handle data with uncertainty, called uncertain data, e.g., incomplete data or data that have errors. To handle such data, the concept of a tolerance vector has been proposed. Clustering methods using this concept can handle uncertain data in the framework of optimization, e.g., fuzzy c-means for data with tolerance (FCM-T). In this paper, we first propose a new clustering algorithm that applies the concept of tolerance to FNM, called the fuzzy non-metric model for data with tolerance (FNM-T). Second, we show that the proposed algorithm handles incomplete data sets. Third, we verify the effectiveness of the proposed algorithm in comparison with conventional ones for incomplete data sets through some numerical examples.
  • Yukihiro Hamasuna; Yasunori Endo
    The 12th International Conference on Modeling Decisions for Artificial Intelligence (MDAI 2015) Japan Society for Fuzzy Theory and Intelligent Informatics 31 99 - 100 2015/09 [Refereed]
     
    The x-means divides a set of objects without determining the number of clusters by using iterative k-means and evaluation criteria. A series of cluster validity measures is also used to evaluate the clustering results and determine a suitable number of clusters. We propose cluster validity measures based x-means by introducing cluster validity measures instead of information criteria. We moreover show the effectiveness of the proposed method through numerical examples.
  • Naohiko Kinoshita; Yasunori Endo; Yukihiro Hamasuna
    Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 19 (5) 624 - 631 1883-8014 2015/09 [Refereed]
     
    Clustering, a highly useful unsupervised classification, has been applied in many fields. When, for example, we use clustering to classify a set of objects, it generally ignores any uncertainty included in objects. This is because uncertainty is difficult to deal with and model. It is desirable, however, to handle individual objects as is so that we may classify objects more precisely. In this paper, we propose new clustering algorithms that handle objects having uncertainty by introducing penalty vectors. We show the theoretical relationship between our proposal and conventional algorithms, and verify the effectiveness of our proposed algorithms through numerical examples.
  • Yukihiro Hamasuna; Yasunori Endo
    Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 19 (5) 655 - 661 1883-8014 2015/09 [Refereed]
     
    Sequential cluster extraction algorithms are useful clustering methods that extract clusters one by one without the number of clusters having to be determined in advance. Typical examples of these algorithms are sequential hard c-means (SHCM) and possibilistic clustering (PCM) based algorithms. Two types of L1-regularized possibilistic clustering are proposed to induce crisp and possibilistic allocation rules and to construct a novel sequential cluster extraction algorithm. The relationship between the proposed method and SHCM is also discussed. The effectiveness of the proposed method is verified through numerical examples. Results show that the entropy-based method yields better results for the Rand Index and the number of extracted clusters.
  • Yasunori Endo; Tomoyuki Suzuki; Naohiko Kinoshita; Yukihiro Hamasuna; Sadaaki Miyamoto
    2015 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE 2015) IEEE 20 (4) 571 - 579 1544-5615 2015 [Refereed]
     
  • Yukihiro Hamasuna; Yasunori Endo
    Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 19 (6) 759 - 765 1883-8014 2015 [Refereed]
     
    This paper presents a new algorithm of sequential cluster extraction based on hard c-means and hard c-medoids clustering. Sequential cluster extraction means that the algorithm extracts 'one cluster at a time.' A characteristic parameter, called a noise parameter, is used in noise clustering based sequential clustering. We propose a novel sequential clustering method, called new sequential clustering, which extracts an arbitrary number of objects as one cluster by considering the noise parameter as a variable to be optimized. Experimental results with four data sets confirm the effectiveness of our proposal. These results also show that classification results strongly depend on parameter ν and that our proposal is applicable to the first stage in a two-stage clustering algorithm.
  • Yasunori Endo; Tomoyuki Suzuki; Naohiko Kinoshita; Yukihiro Hamasuna; Sadaaki Miyamoto
    2015 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE 2015) IEEE 1 - 7 1544-5615 2015 [Refereed]
     
  • Yukihiro Hamasuna; Yasunori Endo
    Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 19 (1) 23 - 28 1883-8014 2015/01 [Refereed]
     
    This paper proposes entropy-based L1-regularized possibilistic clustering and a method of sequential cluster extraction from relational data. Sequential cluster extraction means that the algorithm extracts clusters one by one. The assignment prototype algorithm is a typical clustering method for relational data. The membership degree of each object to each cluster is calculated directly from dissimilarities between objects. An entropy-based L1-regularized possibilistic assignment prototype algorithm is proposed first to induce belongingness for a membership grade. An algorithm of sequential cluster extraction based on the proposed method is constructed and the effectiveness of the proposed methods is shown through numerical examples.
  • Yukihiro Hamasuna; Yasunori Endo
    2014 JOINT 7TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS (SCIS) AND 15TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (ISIS) IEEE 489 - 494 2014 [Refereed]
     
    This paper presents a new sequential cluster extraction algorithm based on hard c-medoids clustering. The term sequential cluster extraction means that the algorithm extracts one cluster at a time. Hard c-medoids is one of the variants of hard c-means clustering. In hard c-medoids, the cluster medoid, which is referred to as the representative of each cluster, is an object. The sequential clustering algorithms are based on Dave's noise clustering approach. A characteristic parameter called the noise parameter is used in noise clustering. We construct a new sequential hard c-medoids algorithm by considering the noise parameter as a variable in the optimization problem. First, the optimization problem of the new sequential hard c-medoids clustering is introduced. Next, the sequential clustering algorithm is constructed based on the optimization problem. Moreover, the effectiveness of the proposed method is shown through numerical experiments.
  • Tsubasa Hirano; Yasunori Endo; Naohiko Kinoshita; Yukihiro Hamasuna
    2014 JOINT 7TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS (SCIS) AND 15TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (ISIS) IEEE 495 - 500 2014 [Refereed]
     
    Clustering methods that divide a data set into clusters whose size is more than a given constant K are very useful in many applications. These methods are called K-member clustering (KMC). As a natural extension, clustering methods that divide a data set into even-sized clusters can be considered. However, there are no optimization-based algorithms for such methods, which is why the conventional algorithms often output inadequate results. Therefore, an algorithm based on optimization should be considered. In this paper, we propose an even-sized clustering algorithm using the simplex method, which is one of the optimization methods, and verify the proposed method through some numerical examples.
  • Yukihiro Hamasuna; Yasunori Endo
    KNOWLEDGE AND SYSTEMS ENGINEERING (KSE 2013), VOL 2 SPRINGER-VERLAG BERLIN 245 57 - 67 2194-5357 2014 [Refereed]
     
    Relational clustering is one of the clustering methods for relational data. The membership grade of each datum to each cluster is calculated directly from dissimilarities between data, and the cluster center, which is referred to as the representative of a cluster, is not used in relational clustering. This paper discusses a new possibilistic approach for relational clustering from the viewpoint of inducing crispness. In previous studies, crisp possibilistic clustering and its variant have been proposed by using L1-regularization. These crisp possibilistic clustering methods induce crispness in the membership function. In this paper, entropy-based crisp possibilistic relational clustering is proposed for handling relational data. Next, the way of sequential extraction is also discussed. Moreover, the effectiveness of the proposed method is shown through numerical examples.
  • Yasunori Endo; Naohiko Kinoshita; Kuniaki Iwakura; Yukihiro Hamasuna
    MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE, MDAI 2014 SPRINGER-VERLAG BERLIN 8825 145 - 157 0302-9743 2014 [Refereed]
     
    Recently, semi-supervised clustering has attracted attention, e.g., Refs. [2-5]. Semi-supervised clustering algorithms improve clustering results by incorporating prior information with the unlabeled data. This paper proposes three new clustering algorithms with pairwise constraints by introducing a non-metric term into the objective functions of well-known clustering algorithms. Moreover, their effectiveness is verified through some numerical examples.
  • Yukihiro Hamasuna; Yasunori Endo
    MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE, MDAI 2014 SPRINGER-VERLAG BERLIN 8825 135 - 144 0302-9743 2014 [Refereed]
     
    Semi-supervised learning is an important task in the field of data mining. Pairwise constraints such as must-link and cannot-link are used in order to improve clustering properties. This paper proposes a new type of semi-supervised hard and fuzzy c-means clustering with an assignment prototype term. The assignment prototype term is based on Windham's assignment prototype algorithm and handles pairwise constraints between objects in the proposed method. First, an optimization problem of the proposed method is formulated. Next, a new clustering algorithm is constructed based on the above discussions. Moreover, the effectiveness of the proposed method is shown through numerical experiments.
  • Yukihiro Hamasuna; Yasunori Endo
    2014 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING (GRC) IEEE 82 - 87 2014 [Refereed]
     
    This paper presents a new sequential clustering algorithm based on sequential hard c-means clustering. The term sequential cluster extraction means that the algorithm extracts one cluster at a time. Sequential hard c-means is one of the typical and conventional sequential clustering methods. The proposed new sequential clustering algorithm is based on Dave's noise clustering approach. A characteristic parameter called the noise parameter is applied in Dave's approach. We construct a new sequential hard c-means algorithm by introducing another new parameter, which controls the number of extracted objects, and by considering the noise parameter as a variable in the optimization problem. First, the optimization problem of the new sequential hard c-means clustering is introduced. Next, the sequential clustering algorithm and its kernelization are constructed based on the above optimization problem. Moreover, the effectiveness of the proposed method is shown through numerical experiments.
  • Yasunori Endo; Ayako Heki; Yukihiro Hamasuna
    Journal of Advanced Computational Intelligence and Intelligent Informatics (JACIII) 17 (4) 540 - 551 2013/07 [Refereed]
  • Yukihiro Hamasuna; Yasunori Endo
    Proceedings - 2013 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2013 IEEE 2013 Vol.5 3505 - 3510 1062-922X 2013 [Refereed]
     
    Possibilistic clustering is well known as one of the useful clustering methods because it is robust against noise or outliers in data. In previous studies, sparse possibilistic clustering and its variant have been proposed by using L1-regularization. These possibilistic clustering methods with L1-regularization are quite different from the viewpoint of the membership function. Two types of new possibilistic approach with L1-regularization, named crisp possibilistic clustering, are proposed in this paper. The classification function of the proposed methods, which shows the allocation rule in the whole space, and the way of sequential cluster extraction are also proposed. Moreover, the effectiveness of the proposed methods is shown through numerical examples. © 2013 IEEE.
  • Yukihiro Hamasuna; Yasunori Endo
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Springer 8234 204 - 213 0302-9743 2013 [Refereed]
     
    The fuzzy non-metric model is one of the clustering methods in which the membership grade of each datum to each cluster is calculated directly from dissimilarities between data. The cluster center which is referred to as representative of cluster is not used in fuzzy non-metric model. This paper discusses a new possibilistic approach for non-metric model from the viewpoint of being in the cluster. In the previous study, new possibilistic clustering and its variant have been proposed by using L1-regularization. These possibilistic clustering methods with L1-regularization induce a change in the membership function. Two types of non-metric model based on possibilistic approach named L1-regularized possibilistic non-metric model are proposed in this paper. Next, the way of sequential extraction algorithm is also discussed. Moreover, the results of sequential extraction based on proposed methods are shown. © 2013 Springer-Verlag.
  • Yukihiro Hamasuna; Yasunori Endo
    SOFT COMPUTING SPRINGER 17 (1) 71 - 81 1432-7643 2013/01 [Refereed]
     
    This paper presents a new semi-supervised fuzzy c-means clustering for data with clusterwise tolerance by opposite criteria. In semi-supervised clustering, pairwise constraints, that is, must-link and cannot-link, are frequently used in order to improve clustering performances. From the viewpoint of handling pairwise constraints, a new semi-supervised fuzzy c-means clustering is proposed by introducing clusterwise tolerance-based pairwise constraints. First, a concept of clusterwise tolerance-based pairwise constraints is introduced. Second, the optimization problems of the proposed method are formulated. Especially, must-link and cannot-link are handled by opposite criteria in our proposed method. Third, a new clustering algorithm is constructed based on the above discussions. Finally, the effectiveness of the proposed algorithm is verified through numerical examples.
  • Yukihiro Hamasuna; Yasunori Endo
    6TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS, AND THE 13TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS IEEE 1801 - 1806 2012 [Refereed]
     
    In addition to fuzzy c-means clustering, possibilistic clustering is well known as one of the useful techniques because it is robust against noise in data. In particular, sparse possibilistic clustering is quite different from other possibilistic clustering methods in terms of its membership function. We propose a way to induce crispness in possibilistic clustering by using L1-regularization and show the classification function of sparse possibilistic clustering with crispness in order to understand the allocation rule. Moreover, we show the way of sequential extraction by the proposed method. After that, we show the effectiveness of the proposed method through numerical examples.
  • Yasunori Endo; Ayako Heki; Yukihiro Hamasuna
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Springer 7647 394 - 407 0302-9743 2012 [Refereed]
     
    The non metric model is a kind of clustering method in which the belongingness, or membership grade, of each object to each cluster is calculated directly from dissimilarities between objects, and cluster centers are not used. Meanwhile, the concept of rough sets has recently attracted attention. Conventional clustering algorithms classify a set of objects into some clusters with clear boundaries, that is, one object must belong to one cluster. However, many objects belong to more than one cluster in the real world, since the boundaries of clusters overlap with each other. Fuzzy set representation of clusters makes it possible for each object to belong to more than one cluster. On the other hand, the fuzzy degree may sometimes be too descriptive for interpreting clustering results. Rough set representation could handle such cases. Clustering based on rough set representation could provide a solution that is less restrictive than conventional clustering and less descriptive than fuzzy clustering. This paper presents two types of Rough set based Non Metric model (RNM). One algorithm is the Rough set based Hard Non Metric model (RHNM) and the other is the Rough set based Fuzzy Non Metric model (RFNM). In both algorithms, clusters are represented by rough sets and each cluster consists of a lower and an upper approximation. In addition, the proposed methods are kernelized by introducing kernel functions, which are a powerful tool to analyze clusters with nonlinear boundaries. © 2012 Springer-Verlag.
  • Yasunori Endo; Arisa Taniguchi; Yukihiro Hamasuna
    Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 16 (7) 831 - 840 1883-8014 2012 [Refereed]
     
    Clustering is an unsupervised classification technique for data analysis. In general, each datum in real space is transformed into a point in a pattern space to apply clustering methods. Data cannot often be represented by a point, however, because of its uncertainty, e.g., measurement error margin and missing values in data. In this paper, we will introduce quadratic penalty-vector regularization to handle such uncertain data using Hard c-Means (HCM), which is one of the most typical clustering algorithms. We first propose a new clustering algorithm called hard c-means using quadratic penalty-vector regularization for uncertain data (HCMP). Second, we propose sequential extraction hard c-means using quadratic penalty-vector regularization (SHCMP) to handle datasets whose cluster number is unknown. Furthermore, we verify the effectiveness of our proposed algorithms through numerical examples.
  • Yukihiro Hamasuna; Yasunori Endo
    Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 16 (7) 819 - 824 1883-8014 2012 [Refereed]
     
    This paper presents a new semi-supervised agglomerative hierarchical clustering algorithm with the Ward method using clusterwise tolerance. Semi-supervised clustering has recently been noted and studied in many research fields. Must-link and cannot-link, called pairwise constraints, are frequently used in order to improve clustering properties in semi-supervised clustering. First, clusterwise tolerance based pairwise constraints are introduced in order to handle must-link and cannot-link constraints. Next, a new semi-supervised hierarchical clustering algorithm with the Ward method is constructed based on the above discussions. The effectiveness of the proposed algorithms is, moreover, verified through numerical examples.
  • Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
    Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 16 (1) 174 - 179 1883-8014 2012 [Refereed]
     
    This paper presents a semi-supervised agglomerative hierarchical clustering algorithm using clusterwise tolerance based pairwise constraints. In semi-supervised clustering, pairwise constraints, that is, must-link and cannot-link, are frequently used in order to improve clustering properties. In that sense, we propose another way, named clusterwise tolerance based pairwise constraints, to handle must-link and cannot-link constraints in L2-space. In addition, we propose a semi-supervised agglomerative hierarchical clustering algorithm based on it. Moreover, we show the effectiveness of the proposed method through numerical examples.
  • Yasunori Endo; Arisa Taniguchi; Aoi Takahashi; Yukihiro Hamasuna
    MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE, MDAI 2011 SPRINGER-VERLAG BERLIN 6820 126 - + 0302-9743 2011 [Refereed]
     
    Clustering is one of the unsupervised classification techniques of data analysis. Data are transformed from a real space into a pattern space to apply clustering methods. However, the data often cannot be represented by a point because of their uncertainty, e.g., measurement error margin and missing values in data. In this paper, we introduce quadratic penalty-vector regularization into hard c-means (HCM), which is one of the most typical clustering algorithms, to handle such uncertain data. First, we propose a new clustering algorithm called hard c-means using quadratic penalty-vector regularization for uncertain data (HCMP). Second, we propose sequential extraction hard c-means using quadratic penalty-vector regularization (SHCMP) to handle datasets whose cluster number is unknown. Moreover, we verify the effectiveness of our proposed algorithms through some numerical examples.
  • Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
    MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE, MDAI 2011 SPRINGER-VERLAG BERLIN 6820 103 - + 0302-9743 2011 [Refereed]
     
    This paper presents a new semi-supervised agglomerative hierarchical clustering algorithm with the Ward method using clusterwise tolerance. Recently, semi-supervised clustering has been noted and studied in many research fields. In semi-supervised clustering, must-link and cannot-link, called pairwise constraints, are frequently used in order to improve clustering properties. First, clusterwise tolerance based pairwise constraints are introduced in order to handle must-link and cannot-link constraints. Next, a new semi-supervised agglomerative hierarchical clustering algorithm with the Ward method is constructed based on the above discussions. Moreover, the effectiveness of the proposed algorithms is verified through numerical examples.
  • Yasunori Endo; Yukihiro Hamasuna
    KNOWLEDGE-BASED AND INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT I SPRINGER 6881 131 - 140 0302-9743 2011 [Refereed]
     
    Recently, semi-supervised clustering has attracted the interest of many researchers. In particular, constraint-based semi-supervised clustering has received attention, and the must-link and cannot-link constraints play a very important role in the clustering. There are many kinds of relations besides must-link and cannot-link, and one of the most typical relations is the trade-off relation. Thus, in this paper we formulate the trade-off relation and propose a new "semi-supervised" concept called mutual relation. Moreover, we construct two types of new clustering algorithms with the mutual relation constraints based on the well-known and useful fuzzy c-means, called fuzzy c-means with the mutual relation constraints.
  • Yasunori Endo; Yukihiro Hamasuna
    International Conference on Intelligent Systems Design and Applications, ISDA IEEE 557 - 562 2164-7143 2011 [Refereed]
     
    Recently, semi-supervised clustering has attracted the interest of many researchers. In particular, constraint-based semi-supervised clustering has received attention, and the must-link and cannot-link constraints play a very important role in the clustering. There are many kinds of relations besides must-link and cannot-link, and one of the most typical relations is the trade-off relation. Thus, in this paper we formulate the trade-off relation and propose a new "semi-supervised" concept called mutual relation. Moreover, we construct two types of new clustering algorithms with the mutual relation constraints based on the well-known and useful hard c-means (HCM) and fuzzy c-means (FCM), called hard c-means with the mutual relation constraints (HCMMR) and fuzzy c-means with the mutual relation constraints (FCMMR). © 2011 IEEE.
  • Yukihiro Hamasuna; Yasunori Endo
    Proceedings - 2011 IEEE International Conference on Granular Computing, GrC 2011 IEEE 225 - 230 2011 [Refereed]
     
    The importance of semi-supervised clustering lies in handling pairwise constraints as prior knowledge. In this paper, we propose a new semi-supervised fuzzy c-means clustering with clusterwise tolerance by opposite criteria. First, the concepts of clusterwise tolerance and pairwise constraints are introduced. Second, the optimization problem of the proposed method is formulated. In particular, must-link and cannot-link constraints are handled and introduced by opposite criteria in the proposed method. Third, a new clustering algorithm is constructed based on the above discussions. Finally, the effectiveness of the proposed algorithm is verified through numerical examples. © 2011 IEEE.
  • Yasunori Endo; Tatsuyoshi Tsuji; Yukihiro Hamasuna; Kota Kurihara
    Proceedings - 2011 IEEE International Conference on Granular Computing, GrC 2011 IEEE 177 - 182 2011 [Refereed]
     
    In many cases, data are handled as intervals on the pattern space because the data generally contain uncertainty such as error, loss, and so on. The concept of tolerance in this paper enables us to handle these data as a point on the pattern space. The advantage is that we can handle uncertain data in the framework of optimization without introducing any particular measures between intervals. In recent years, this concept has been actively introduced into clustering methods and its effectiveness has been confirmed. However, despite its effectiveness, there are few applications of the concept to multivariate analysis methods other than regression models. Therefore, in this paper we propose a new algorithm of principal component analysis for uncertain data by introducing the concept of tolerance. Moreover, we verify its effectiveness through some numerical examples. © 2011 IEEE.
  • Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
    IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ 2011) IEEE 810 - 815 1098-7584 2011 [Refereed]
     
    This paper presents Mahalanobis distance based fuzzy c-means clustering for uncertain data using penalty vector regularization. When we handle a set of data, the data contain inherent uncertainty, e.g., errors, ranges, or missing values of attributes. In order to handle such uncertain data as a point in a pattern space, the concept of a penalty vector has been proposed. Some significant clustering algorithms based on it have also been proposed. In conventional clustering algorithms, the Mahalanobis distance has been used as a dissimilarity, as have the squared L2-norm and the L1-norm. From the viewpoint of the guideline of dissimilarity, Mahalanobis distance based fuzzy c-means clustering for uncertain data should be considered. In this paper, we introduce fuzzy c-means clustering for uncertain data using penalty vector regularization from our previous works. Next, we propose a Mahalanobis distance based one. Moreover, we show the effectiveness of the proposed method through numerical examples.
  • Yasunori Endo; Isao Takayama; Yukihiro Hamasuna; Sadaaki Miyamoto
    IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ 2011) IEEE 804 - 809 1098-7584 2011 [Refereed]
     
    Recently, fuzzy c-means clustering with kernel functions has been attracting attention because these algorithms can handle datasets that consist of clusters with nonlinear boundaries. However, the algorithms have the following problems: (1) the cluster centers cannot be calculated explicitly, and (2) it takes a long time to calculate clustering results. Meanwhile, we have proposed clustering algorithms using penalty-vector regularization to handle uncertain data. In this paper, we propose new clustering algorithms using quadratic penalty-vector regularization by introducing explicit mappings of kernel functions to solve the above problems. Moreover, we construct fuzzy classification functions for our proposed clustering methods.
  • Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
    Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 15 (1) 68 - 75 1883-8014 2011 [Refereed]
     
    Detecting various kinds of cluster shapes is an important problem in the field of clustering. In general, it is difficult to obtain clusters with different sizes or shapes with a single objective function. In that sense, we have proposed the concept of clusterwise tolerance and constructed clustering algorithms based on it. In the field of data mining, regularization techniques are used in order to derive significant classifiers. In this paper, we propose another concept of clusterwise tolerance from the viewpoint of regularization. Moreover, we construct clustering algorithms for data with clusterwise tolerance based on L2- and L1-regularization. After that, we describe fuzzy classification functions of the proposed algorithms. Finally, we show the effectiveness of the proposed algorithms through numerical examples.
  • Yasunori Endo; Yasushi Hasegawa; Yukihiro Hamasuna; Yuchi Kanzawa
    Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 15 (1) 76 - 82 1883-8014 2011 
    Clustering - defined as an unsupervised data-analysis classification transforming real-space information into data in pattern space and analyzing it - may require that data be represented by a set, rather than points, due to data uncertainty, e.g., measurement error margin, data regarded as one point, or missing values. These data uncertainties have been represented as interval ranges for which many clustering algorithms are constructed, but the lack of guidelines in selecting available distances in individual cases has made selection difficult and raised the need for ways to calculate dissimilarity between uncertain data without introducing a nearest-neighbor or other distance. The tolerance concept we propose represents uncertain data as a point with a tolerance vector, not as an interval. While this is convenient for handling uncertain data, tolerance-vector constraints make mathematical development difficult. We attempt to remove the tolerance-vector constraints using quadratic penalty-vector regularization similar to the tolerance vector. We also propose clustering algorithms for uncertain data considering optimization and obtaining an optimal solution to handle uncertainty appropriately.
  • Semi-supervised Fuzzy c-Means Clustering for Data with Clusterwise Tolerance with Pairwise Constraints
    Yukihiro Hamasuna; Yasunori Endo
    Joint 5th International Conference on Soft Computing and Intelligent Systems and 11th International Symposium on Advanced Intelligent Systems (SCIS & ISIS 2010) 2010/12 [Refereed]
  • Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
    SOFT COMPUTING SPRINGER 14 (5) 487 - 494 1432-7643 2010/03 [Refereed]
     
    This paper presents two new types of clustering algorithms using a tolerance vector, called tolerant fuzzy c-means clustering and tolerant possibilistic clustering. In the proposed algorithms, the new concept of a tolerance vector plays a very important role. The original concept is developed to handle data flexibly, that is, a tolerance vector is attributed not only to each datum but also to each cluster. Using the new concept, we can consider the influence of clusters on each datum through the tolerance. First, the new concept of tolerance is introduced into optimization problems. Second, the optimization problems with tolerance are solved by using the Karush-Kuhn-Tucker conditions. Third, new clustering algorithms are constructed based on the optimal solutions for clustering. Finally, the effectiveness of the proposed algorithms is verified through numerical examples and their fuzzy classification functions.
  • Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
    MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE (MDAI) SPRINGER-VERLAG BERLIN 6408 152 - 162 0302-9743 2010 [Refereed]
     
    Recently, semi-supervised clustering has been noted and discussed in many studies. In semi-supervised clustering, pairwise constraints, that is, must-link and cannot-link, are frequently used in order to improve clustering results by using prior knowledge or information. In this paper, we propose a clusterwise tolerance based pairwise constraint. In addition, we propose semi-supervised agglomerative hierarchical clustering algorithms with the centroid method based on it. Moreover, we show the effectiveness of the proposed method through numerical examples.
  • Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
    Proceedings - 2010 IEEE International Conference on Granular Computing, GrC 2010 IEEE Computer Society 188 - 193 2010 [Refereed]
     
    Recently, semi-supervised clustering has been noted and discussed in many research fields. In semi-supervised clustering, prior knowledge or information is often formulated as pairwise constraints, that is, must-link and cannot-link. Such pairwise constraints are frequently used in order to improve clustering properties. In this paper, we propose a new semi-supervised fuzzy c-means clustering by using clusterwise tolerance and pairwise constraints. First, the concepts of clusterwise tolerance and pairwise constraints are introduced. Second, the optimization problem of fuzzy c-means clustering using the clusterwise tolerance based pairwise constraint is formulated. In particular, the must-link constraint is considered and introduced as a pairwise constraint. Third, a new clustering algorithm is constructed based on the above discussions. Finally, the effectiveness of the proposed algorithm is verified through numerical examples. © 2010 IEEE.
  • Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
    2010 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE 2010) IEEE 1 - 6 1098-7584 2010 [Refereed]
     
    Cluster validity measures are used in order to determine an appropriate number of clusters and to evaluate cluster partitions obtained by clustering algorithms. When we handle a set of data, the data contain inherent uncertainty, e.g., errors, ranges, or missing values of attributes. The concept of tolerance has been proposed from the viewpoint of handling such uncertain data. In this paper, we introduce clustering algorithms for data with tolerance. Moreover, we propose five new measures for data with tolerance, that is, the determinants and the traces of fuzzy covariance matrices, the Xie-Beni index, the Fukuyama-Sugeno index, and the Davies-Bouldin index. We compare the performance of the conventional measures with that of their tolerance versions. We found that our proposed measures take smaller values than the conventional ones. These results indicate that the tolerance based clustering method is suitable for handling uncertain data.
  • Yasunori Endo; Kouta Kurihara; Sadaaki Miyamoto; Yukihiro Hamasuna
    2010 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE 2010) IEEE 1 - 8 1098-7584 2010 [Refereed]
     
    c-regression models are known as very useful tools in many fields. Until now, many attempts have been made to construct c-regression models for data with uncertainty in independent and dependent variables. However, there are currently few c-regression models for data with uncertainty in independent variables in comparison with dependent variables. The reason is as follows. The models are constructed using optimal solutions which are derived by solving an optimization problem "analytically". The problem for data with uncertainty in dependent variables can be easily solved, but it is very difficult to solve the problem for data with uncertainty in independent variables "analytically". Therefore, most of the models for data with uncertainty in independent variables are constructed so that the solutions are calculated "numerically". Meanwhile, we have proposed "tolerance" as a convenient tool to handle data with uncertainty [3] and applied it to some clustering algorithms [4]-[7]. This concept of tolerance is very useful because it allows us to handle data with uncertainty in the framework of optimization without introducing any particular measure between intervals. This tool is especially effective when we handle data with missing attribute values in the framework of optimization, as in fuzzy c-means clustering [6]. Besides, we think that the tolerance is also applicable when constructing a regression model for data with uncertainty in independent and dependent variables. In this paper, we first derive the optimal solutions for c-regression models for data with uncertainty in independent and dependent variables "analytically" by using the concept of tolerance. Second, we construct hard and fuzzy c-regression models for data with tolerance in independent and dependent variables. Moreover, we evaluate the effectiveness of the algorithms through some numerical examples.
  • Fuzzy c-Regression Model for Data with Tolerance
    Kouta Kurihara; Yasunori Endo; Yukihiro Hamasuna; Sadaaki Miyamoto
    The 6th International Conference on Modeling Decisions for Artificial Intelligence (MDAI2009) 2009/12 [Refereed]
  • On Hierarchical Clustering for Data with Tolerance
    Yasunori Endo; Yukihiro Hamasuna; Ayaka Tagaya
    The 6th International Conference on Modeling Decisions for Artificial Intelligence (MDAI2009) 2009/12 [Refereed]
  • Two types of Tolerant Hard c-Means Clustering
    Yukihiro Hamasuna; Yasunori Endo
    2009 International Symposium on Nonlinear Theory and its Applications (Nolta2009) 2009/10 [Refereed]
  • Makito Yamashiro; Yasunori Endo; Yukihiro Hamasuna
    Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 13 (4) 429 - 433 1883-8014 2009 [Refereed]
     
    The clustering algorithm we propose is based on probabilistic dissimilarity, which is formed by introducing the concept of probability into conventional dissimilarity. After defining probabilistic dissimilarity, we present examples of probabilistic dissimilarity functions. We then consider an objective function with probabilistic dissimilarity and construct a clustering algorithm based on probabilistic dissimilarity, using the optimal solutions that maximize the objective function. Numerical examples verify the effectiveness of our algorithm.
  • Hamasuna Yukihiro; Endo Yasunori; Miyamoto Sadaaki
    2009 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-3 IEEE 1125 - + 1098-7584 2009 [Refereed]
     
    In this paper, we propose two types of L1-norm based tolerant fuzzy c-means clustering (TFCM) from the viewpoint of handling data more flexibly. One is based on a constraint on the tolerance vector and the other is based on a regularization term. In these methods, a tolerance vector is attributed not only to each datum but also to each cluster. First, the concept of clusterwise tolerance is introduced into the optimization problems. Second, optimal solutions for these optimization problems are derived. Third, new clustering algorithms are constructed based on the explicit optimal solutions. Finally, the effectiveness of the proposed algorithms is verified through numerical examples.
  • Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
    PROCEEDINGS OF THE JOINT 2009 INTERNATIONAL FUZZY SYSTEMS ASSOCIATION WORLD CONGRESS AND 2009 EUROPEAN SOCIETY OF FUZZY LOGIC AND TECHNOLOGY CONFERENCE EUROPEAN SOC FUZZY LOGIC & TECHNOLOGY 1152 - 1157 2009 [Refereed]
     
    We have proposed tolerant fuzzy c-means clustering (TFCM) from the viewpoint of handling data more flexibly. This paper presents a new type of tolerant fuzzy c-means clustering with L1-regularization. L1-regularization is well known as one of the most successful techniques for inducing sparseness. The proposed algorithm differs in terms of the sparseness of the tolerance vector. In the original concept of tolerance, a tolerance vector is attributed to each datum. This paper develops the concept to handle data flexibly, that is, a tolerance vector is attributed not only to each datum but also to each cluster. First, the new concept of tolerance is introduced into optimization problems. These optimization problems are based on conventional fuzzy c-means clustering (FCM). Second, the optimization problems with tolerance are solved by using the Karush-Kuhn-Tucker conditions and an optimization method for L1-regularization. Third, new clustering algorithms are constructed based on the explicit optimal solutions. Finally, the effectiveness of the proposed algorithm is verified through some numerical examples.
  • Hamasuna Yukihiro; Endo Yasunori; Miyamoto Sadaaki
    2009 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING ( GRC 2009) IEEE 197 - + 2009 [Refereed]
     
    In this paper, we propose two types of tolerant fuzzy c-means clustering with regularization terms. One uses an L2-regularization term and the other an L1-regularization term for the tolerance vector. By introducing the concept of clusterwise tolerance, we have proposed tolerant fuzzy c-means clustering from the viewpoint of handling data more flexibly. In tolerant fuzzy c-means clustering, a constraint that restricts the upper bound of the tolerance vector is used. In this paper, regularization terms for the tolerance vector are used instead of the constraint. First, the concept of clusterwise tolerance is introduced. Second, optimization problems for tolerant fuzzy c-means clustering with a regularization term are formulated. Third, optimal solutions of these optimization problems are derived. Fourth, new clustering algorithms are constructed based on the explicit optimal solutions. Finally, the effectiveness of the proposed algorithms is verified through numerical examples.
  • Endo Yasunori; Hamasuna Yukihiro; Kanzawa Yuchi; Miyamoto Sadaaki
    2009 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING ( GRC 2009) IEEE 148 - + 2009 [Refereed]
     
    In recent years, data from many natural and social phenomena have been accumulated in huge databases across the worldwide network of computers. Advanced data analysis techniques are therefore required to extract valuable knowledge from data using today's computing power. Clustering is one of the unsupervised classification techniques of data analysis, and hard and fuzzy c-means clustering are its most typical methods. In clustering, information in a real space is transformed into data in a pattern space and then analyzed. However, data should often be represented not by a point but by a set because of uncertainty in the data, e.g., measurement error margins, data that cannot be regarded as one point, and missing values. Such uncertainty has been represented as interval ranges, and many clustering algorithms for interval data have been constructed. However, no guideline for selecting an appropriate distance in each case has been shown, so this selection problem is difficult. Therefore, methods that calculate the dissimilarity between such uncertain data without introducing a particular distance, e.g., the nearest-neighbor distance, have been strongly desired. From this viewpoint, we have proposed the concept of tolerance, which represents uncertain data not as an interval but as a point with a tolerance vector. In this paper, we remove the constraint on tolerance vectors by using quadratic regularization of a penalty vector, which is similar to a tolerance vector, and propose new clustering algorithms for uncertain data by considering the optimization problems and obtaining the optimal solutions, so as to handle such uncertainty more appropriately.
  • Endo Yasunori; Hamasuna Yukihiro; Yamashiro Makito; Miyamoto Sadaaki
    2009 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-3 IEEE 1119 - + 1098-7584 2009 [Refereed]
     
    There are two methods of pattern classification: supervised and unsupervised. Unsupervised classification, which is called clustering and classifies data without external criteria, is very useful among pattern classification methods and has been applied in many fields. There are two types of clustering: hierarchical and non-hierarchical. Hard c-means clustering (HCM) and fuzzy c-means clustering (FCM) are often used as typical methods of non-hierarchical clustering. Supervised classification can achieve practical classification results but cannot handle a large amount of data. On the other hand, unsupervised classification can handle a large amount of data, but the method is complex and the results sometimes look somewhat strange. Therefore, semi-supervised classification has recently been studied. This classification has the advantages of both of the above-mentioned methods, e.g., practical results, low cost, and short calculation time. In this paper, we propose new semi-supervised classification algorithms based on fuzzy c-means clustering in which some membership grades are given in advance as supervised membership grades.
  • Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
    Journal of Advanced Computational Intelligence and Intelligent Informatics Fuji Technology Press 13 (4) 421 - 428 1883-8014 2009 [Refereed]
     
    This paper presents a new type of clustering algorithm that uses a tolerance vector, called tolerant fuzzy c-means clustering (TFCM). In the proposed algorithms, the new concept of the tolerance vector plays a very important role. In the original concept of tolerance, a tolerance vector is attributed to each datum. This concept is developed to handle data flexibly, that is, a tolerance vector is attributed not only to each datum but also to each cluster. With the new concept, the tolerance lets us take into account the influence of clusters on each datum. First, the new concept of tolerance is introduced into optimization problems based on conventional fuzzy c-means clustering (FCM). Second, the optimization problems with tolerance are solved by using the Karush-Kuhn-Tucker conditions. Third, new clustering algorithms are constructed based on the explicit optimal solutions of the optimization problems. Finally, the effectiveness of the proposed algorithms is verified through numerical examples using the fuzzy classification function.
  • On Projection Correlation Proposal for a New Dissimilarity and Application to Hierarchical Clustering Algorithms
    Yasunori Endo; Fuyuki Uchida; Yukihiro Hamasuna
    Modeling Decisions for Artificial Intelligence (MDAI2008) 2008/10 [Refereed]
  • New Clustering Algorithms by using Tolerance Vector
    Yukihiro Hamasuna; Yasunori Endo
    Modeling Decisions for Artificial Intelligence (MDAI2008) 2008/10 [Refereed]
  • On a New Dissimilarity of Projection Correlation
    Yasunori Endo; Fuyuki Uchida; Yukihiro Hamasuna
    Joint 4th International Conference on Soft Computing and Intelligent Systems and 9th International Symposium on advanced Intelligent Systems (SCIS & ISIS 2008) 2008/09 [Refereed]
  • On Fuzzy c-Means for Data with Uncertainty using Spring Modulus
    Yasushi Hasegawa; Yasunori Endo; Yukihiro Hamasuna
    Joint 4th International Conference on Soft Computing and Intelligent Systems and 9th International Symposium on advanced Intelligent Systems (SCIS & ISIS 2008) 2008/09 [Refereed]
  • Yasunori Endo; Yasushi Hasegawa; Yukihiro Hamasuna; Sadaaki Miyamoto
    Journal of Advanced Computational Intelligence and Intelligent Informatics (JACIII) Fuji Technology Press (富士技術出版株式会社) 12 (5) 461 - 466 2008/09 [Refereed]
  • Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto; Yasushi Hasegawa
    Journal of Japan Society for Fuzzy Theory and Intelligent Informatics Japan Society for Fuzzy Theory and intelligent informatics 20 (3) 388 - 398 1881-7203 2008/06 [Refereed]
     
    In this paper, two clustering algorithms that handle data with tolerance are proposed. One is based on hard c-means (HCM) while the other is based on learning vector quantization (LVQC). We consider tolerance, a new concept for handling data with uncertainty such as errors, ranges, or a lost attribute of data in the optimization framework. The concept of tolerance is included in both algorithms. Dissimilarity in previous clustering algorithms is defined by using the nearest-neighbor, furthest-neighbor, or Hausdorff distance. On the other hand, dissimilarity in the proposed algorithms is defined by the squared L2 (Euclidean) norm, and the algorithms can handle data with uncertainty in strict optimization problems. First, the concept of tolerance, which covers errors, ranges, and the loss of attributes of data, is described. Optimization problems that take the tolerance into account are formulated. A unique and explicit optimal solution is given by the Karush-Kuhn-Tucker conditions. An alternate minimization algorithm and a learning algorithm are constructed. Moreover, the effectiveness of the proposed algorithms is verified through numerical examples.
  • Yukihiro Hamasuna; Yasunori Endo; Makito Yamashiro
    2008 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING, VOLS 1 AND 2 IEEE 244 - 247 2008 [Refereed]
     
    This paper presents a new type of clustering algorithm that uses a tolerance vector. In the proposed algorithm, the tolerance vector is considered from a new viewpoint: the vector shows a correlation between each datum and the cluster centers. First, a new concept of tolerance is introduced into the optimization problem. This optimization problem is based on entropy regularized fuzzy c-means. Second, the optimization problem with the tolerance is solved by using the Karush-Kuhn-Tucker conditions. Next, a new clustering algorithm is constructed based on the unique and explicit optimal solutions of the optimization problem. Finally, the effectiveness of the proposed algorithm is verified through some numerical examples.
  • Hamasuna Yukihiro; Endo Yasunori; Miyamoto Sadaaki
    2008 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-5 IEEE 750 - + 1098-7584 2008 [Refereed]
     
    This paper presents two new types of Support Vector Machine (SVM) algorithms: one is based on hard-margin SVM and the other on soft-margin SVM. These algorithms can handle data with tolerance, a concept that includes errors, ranges, or missing values in data. First, the concept of tolerance is introduced into the optimization problems of the Support Vector Machine. Second, the optimization problems with the tolerance are solved by using the Karush-Kuhn-Tucker conditions. Next, new algorithms are constructed based on the unique and explicit optimal solutions of the optimization problems. Finally, the effectiveness of the proposed algorithms is verified through some numerical examples on artificial data.
  • Clustering Algorithms Based on Tolerance Vector Concept
    Yasunori Endo; Yasushi Hasegawa; Yukihiro Hamasuna; Sadaaki Miyamoto
    Proc. 2007 International Symposium on Nonlinear Theory and Its Applications (Nolta2007) 2007/09 [Refereed]
  • Yasushi Hasegawa; Yasunori Endo; Yukihiro Hamasuna; Sadaaki Miyamoto
    MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE, PROCEEDINGS SPRINGER-VERLAG BERLIN 4617 237 - + 0302-9743 2007 [Refereed]
     
    The paper presents some new clustering algorithms that are based on fuzzy c-means. The algorithms can treat data with tolerance defined as a hyper-rectangle. First, the tolerance is introduced into the optimization problems of clustering. This is a generalization of calculation errors or missing values. Next, the problems are solved and some algorithms are constructed based on the results. Finally, the usefulness of the proposed algorithms is verified through numerical examples.
  • Endo Yasunori; Hamasuna Yukihiro; Miyamoto Sadaaki
    GRC: 2007 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING, PROCEEDINGS IEEE COMPUTER SOC 404 - 409 2007 [Refereed]
     
    This paper presents new clustering algorithms that are based on agglomerative hierarchical clustering (AHC) with the centroid method. The algorithms can handle data with tolerance, a concept that includes errors, ranges, or missing values in data. First, the tolerance is introduced into the optimization problems of clustering. Second, an objective function is introduced for calculating the centroid of a cluster, and the problem is solved using the Kuhn-Tucker conditions. Next, new algorithms are constructed based on the solution of the problem. Finally, the effectiveness of the proposed algorithms is verified through some numerical examples on artificial data.
  • Yukihiro Hamasuna; Yasunori Endo; Yasushi Hasegawa; Sadaaki Miyamoto
    2007 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-4 IEEE, ELECTRON DEVICES SOC & RELIABILITY GROUP 687 - + 1098-7584 2007 [Refereed]
     
    Two clustering algorithms that handle data with tolerance are proposed. One is based on hard c-means while the other uses learning vector quantization. First, the concept of tolerance, which covers errors, ranges, and the loss of attributes of data, is described. Optimization problems that take the tolerance into account are then formulated. Since the Kuhn-Tucker conditions give a unique and explicit optimal solution, an alternate minimization algorithm and a learning algorithm are constructed. Moreover, the effectiveness of the proposed algorithms is verified through numerical examples.
  • Sadaaki Miyamoto; Yasunori Endo; Koki Hanzawa; Yukihiro Hamasuna
    Journal of Advanced Computational Intelligence and Intelligent Informatics (JACIII) Fuji Technology Press (富士技術出版株式会社) 11 (1) 51 - 60 2007/01 [Refereed]
  • Yasunori Endo; Koki Hanzawa; Yukihiro Hamasuna
    Journal of Japan Society for Fuzzy Theory and Intelligent Informatics Japan Society for Fuzzy Theory and intelligent informatics 18 (6) 859 - 866 1881-7203 2006/12 [Refereed]
     
    A family of automatic container loading problems is studied and algorithms are proposed. The algorithms are constructed with metaheuristics and include flat and/or vertical loading schemes, loading efficiency, stability of loaded objects, and computational requirement. Handling groups of objects in a metaheuristic scheme is moreover considered. Numerical examples are given.
  • Metaheuristic Algorithms for Container Loading Problem Using Grouping Objects
    Yasunori Endo; Sadaaki Miyamoto; Koki Hanzawa; Yukihiro Hamasuna
    Proc. 2006 International Symposium on Nonlinear Theory and Its Applications (Nolta2006) 2006/09 [Refereed]
  • Container Loading Problem: Formulation, Knowledge Utilization, and Algorithms
    Yasunori Endo; Sadaaki Miyamoto; Koki Hanzawa; Yukihiro Hamasuna
    Modeling Decisions for Artificial Intelligence (MDAI2005) 2005/07 [Refereed]

MISC

  • A Study on Automatic Estimation of the Number of Clusters in Network Clustering
    Haruto Iwasaki; Yukihiro Hamasuna  2024/09
  • A Study on Sequential Extraction of Clusters using Noise Clustering based on Local Outlier Factor
    Yoshitomo Mori; Yukihiro Hamasuna  第40回ファジィシステムシンポジウム 講演論文集  2024/09
  • A Study on Initial Value Determination in Controlled Edge-Sized Network Clustering Using Structural Similarity
    Hiroto Migita; Yukihiro Hamasuna  第40回ファジィシステムシンポジウム 講演論文集  2024/09
  • A Study on Network Clustering with Node Embedding
    Taira Shimizu; Yukihiro Hamasuna  第40回ファジィシステムシンポジウム 講演論文集  2024/09
  • On Sequential Cluster Extraction Using Possibilistic Size Control Clustering
    Ryota Uto; Yukihiro Hamasuna  第39回ファジィシステムシンポジウム 講演論文集  2023/09
  • A Study on Cluster Validity Measures Based on Fuzzy Membership for Time-Series Data
    Kenshin Fujita; Yukihiro Hamasuna  第39回ファジィシステムシンポジウム 講演論文集  2023/09
  • A Study on Parameter Estimation in Gaussian Process based c-Regression Models
    Yuya Yokoyama; Yukihiro Hamasuna  第39回ファジィシステムシンポジウム 講演論文集  2023/09
  • A Study on Initial Value Determination Using k-medoids++ in Controlled Edge-Sized Network Clustering
    Hiroto Migita; Yukihiro Hamasuna  第39回ファジィシステムシンポジウム 講演論文集  2023/09
  • Noise Clustering based on Local Outlier Factor
    Yoshitomo Mori; Yukihiro Hamasuna  第39回ファジィシステムシンポジウム 講演論文集  2023/09
  • A Study on Network Clustering Using Similarity Based on Node Neighborhood Sets
    Katsumi Endo; Yukihiro Hamasuna  第39回ファジィシステムシンポジウム 講演論文集  2023/09
  • Comparison of Automatic Cluster Number Estimation Methods by Hierarchical Clustering
    Atusya Higashino; Yukihiro Hamasuna  第39回ファジィシステムシンポジウム 講演論文集  2023/09
  • Hyperparameter Optimization for Gaussian Process Sequential Regression Models
    Kaito Takegawa; Yukihiro Hamasuna  第39回ファジィシステムシンポジウム 講演論文集  2023/09
  • L1 ノルムを用いたサイズコントロール機能を持つファジィクラスタリングに関する一考察
    青木悠真; 濵砂 幸裕  計測自動制御学会 システム・情報部門 学術講演会 2022(SSI2022)  2022/11
  • Doc2Vec と階層的クラスタリングを用いたクラスタリングにおけるロバスト性に関する分析
    奥早和紀; 濵砂 幸裕  計測自動制御学会 システム・情報部門 学術講演会 2022(SSI2022)  2022/11
  • ネットワーククラスタリングにおけるエッジコントロールの検討
    Yota Echikawa; Yukihiro Hamasuna  計測自動制御学会 システム・情報部門 学術講演会 2022(SSI2022)  2022/11
  • ガウス過程に基づく逐次抽出型回帰モデルの検討
    武川海斗; 濵砂幸裕  第38回ファジィシステムシンポジウム (FSS2022)  2022/09
  • ガウス過程回帰に基づくc-回帰モデル
    横山裕哉; 濵砂幸裕  第38回ファジィシステムシンポジウム (FSS2022)  2022/09
  • 階層的クラスタリングを用いたクラスタ数の自動推定に関する検討
    東埜淳哉; 濵砂幸裕  第38回ファジィシステムシンポジウム (FSS2022)  2022/09
  • 階層的クラスタリングを用いたネットワークデータのクラスタ分割に関する考察
    遠藤克海; 濵砂幸裕  第38回ファジィシステムシンポジウム (FSS2022)  2022/09
  • 時系列データに対するクラスタ妥当性基準に関する一考察
    藤田憲伸; 濵砂幸裕  第38回ファジィシステムシンポジウム (FSS2022)  2022/09
  • 時系列データに対するWard法の検討
    大野淳寛; 濵砂幸裕  第38回ファジィシステムシンポジウム (FSS2022)  2022/09
  • サイズコントロール機能を持つ可能性クラスタリング
    宇戸涼太; 濵砂 幸裕  第30回インテリジェント・システム・シンポジウム (FAN2022)  35  -40  2022/09
  • Nobuhiko Tsuda; Yukihiro Hamasuna  Joint 11th International Conference on Soft Computing and Intelligent Systems and 21st International Symposium on Advanced Intelligent Systems(SCIS/ISIS)  417  -422  2020/09
  • Yuto Kingetsu; Yukihiro Hamasuna  第36回ファジィシステムシンポジウム  2020/09
  • Map Segmentation in RoboCupRescue Using Louvain Method and Its Application to Agent Control
    Soma Kitamura, Yoshihiro Nishimura, Yukihiro Hamasuna  第36回ファジィシステムシンポジウム  2020/09
  • JS ダイバージェンスを用いた k-medoids
    金月 優斗; 濵砂 幸裕  第35回ファジィシステムシンポジウム(FSS2019)  2019/08
  • ボロノイ図に基づくクラスタ分割の妥当性評価
    津田 暢彦; 濵砂 幸裕  第35回ファジィシステムシンポジウム(FSS2019)  2019/08
  • RoboCup 2D リーグにおけるパッキングレートを用いた評価
    大津 拓登; 北村 壮馬; 濵砂 幸裕  第35回ファジィシステムシンポジウム(FSS2019)  2019/08
  • RoboCup 2D リーグに対する 5 レーン理論の実装と評価
    北村 壮馬; 大津 拓登; 濵砂 幸裕  第35回ファジィシステムシンポジウム(FSS2019)  2019/08
  • クラスタ分割が重み付きアルファ複体とホモトピー同値になるようなクラスタリングについて
    星野 翔大; 遠藤 靖典; 濵砂 幸裕  第35回ファジィシステムシンポジウム(FSS2019)  2019/08
  • DP-means と階層的クラスタリングを用いた 2 段階クラスタリン
    大津拓登; 濵砂 幸裕  第23回曖昧な気持ちに挑むワークショップ (H&M2018)  2018/12
  • 重みなしネットワークデータに対するクラスタリングとその評価
    小林 大記; 濵砂 幸裕  第34回ファジィシステムシンポジウム(FSS2018)  2018/09
  • ノード数の制約に基づくネットワーククラスタリングの検討
    中野 秀亮; 濵砂 幸裕; 遠藤 靖典  第34回ファジィシステムシンポジウム(FSS2018)  2018/09
  • クラスタ分割が重み付きアルファ複体とホモトピー同値になるような目的関数最適化に基づくクラスタリングについて
    星野 翔太; 遠藤 靖典; 濵砂 幸裕  第34回ファジィシステムシンポジウム(FSS2018)  2018/09
  • ネットワークデータに対するクラスタ数推定アルゴリズムの検討
    尾﨑 稜; 濵砂 幸裕  第27回インテリジェント・システム・シンポジウム  2017/11
  • Hamasuna Yukihiro; Ozaki Ryo  Proceedings of the Japan Joint Automatic Control Conference  60-  (0)  1550  -1551  2017/11
  • 尾﨑 稜; 濵砂 幸裕  ファジィシステムシンポジウム講演論文集  33-  435  -440  2017/09
  • ネットワークデータに対する外れ値検出の検討
    濵砂 幸裕; 尾﨑 稜  第33回ファジィシステムシンポジウム(FSS2017)  2017/09
  • 濵砂 幸裕  システム制御情報学会研究発表講演会講演論文集  61-  6p  2017/05
  • カーネル法に基づく妥当性基準を用いた2段階クラスタリング
    尾﨑 稜; 濵砂 幸裕; 遠藤 靖典  第26回インテリジェント・システム・シンポジウム(FAN2016)  2016/10
  • グラフクラスタリングに対する妥当性基準に関する一考察
    藤澤 拓也; 尾﨑 稜; 濵砂 幸裕  第26回インテリジェント・システム・シンポジウム(FAN2016)  2016/10
  • カーネル関数を用いた逐次抽出型クラスタリングの検討
    濵砂 幸裕; 遠藤 靖典  第32回ファジィシステムシンポジウム(FSS2016)  2016/09
  • Hamasuna Yukihiro; Endo Yasunori  Proceedings of the Fuzzy System Symposium  31-  99  -100  2015/09
  • HAMASUNA Yukihiro  Systems, control and information  59-  (6)  240  -245  2015
  • 妥当性基準を用いたx-meansについて
    濵砂 幸裕; 遠藤 靖典  第6回コンピューテーショナル・インテリジェンス研究会  2014/12
  • Hamasuna Yukihiro; Endo Yasunori  Proceedings of the Fuzzy System Symposium  30-  450  -451  2014/09
  • L1正則化Assignment-Prototype Algorithmを用いたクラスタの逐次抽出
    濵砂 幸裕; 遠藤 靖典  2014/03
  • L1正則化を用いたエントロピー型可能性クラスタリングについて
    濵砂 幸裕; 遠藤 靖典  第29回ファジィシステムシンポジウム(FSS2013)  2013/09
  • Hamasuna Yukihiro; Endo Yasunori  Proceedings of the Fuzzy System Symposium  29-  163  -163  2013/09
  • 情報量基準を用いたファジィc-回帰モデルのクラスタ数推定
    濵砂 幸裕; 遠藤 靖典  第57回システム制御情報学会研究発表講演会(SCI'13)  2013/05
  • ファジィc-回帰モデルにおける最適クラスタ数の推定
    濵砂 幸裕; 遠藤 靖典  第39回ファジィワークショップ  2013/03
  • Yukihiro Hamasuna; Yasunori Endo  Proceedings of the Fuzzy System Symposium  28-  (0)  859  -862  2012/09
  • クラスタワイズ許容を用いた逐次抽出型ハードクラスタリングについて
    樋口 徹; 濵砂 幸裕; 遠藤 靖典  第56回システム制御情報学会研究発表講演会(SCI'12)  2012/05
  • クラスタワイズ許容による対制約を用いた半教師付き階層的クラスタリングについ
    中矢 亮祐; 濵砂 幸裕; 遠藤 靖典  第56回システム制御情報学会研究発表講演会(SCI'12)  2012/05
  • クラスタワイズ許容を用いた半教師付きc-平均法の性能比較
    濵砂 幸裕; 遠藤 靖典  第38回ファジィワークショップ  2012/03
  • 濵砂 幸裕; 遠藤 靖典  ファジィシステムシンポジウム講演論文集  27-  323  -326  2011/09
  • 髙山 勲; 遠藤 靖典; 濵砂 幸裕  ファジィシステムシンポジウム講演論文集  27-  327  -330  2011/09
  • 日置 彩子; 遠藤 靖典; 濵砂 幸裕  ファジィシステムシンポジウム講演論文集  27-  317  -322  2011/09
  • Yasunori Endo; Tatsuyoshi Tsuji; Yukihiro Hamasuna; Kota Kurihara  Proceedings - 2011 IEEE International Conference on Granular Computing, GrC 2011  27-  177  -182  2011
  • 宮本 智明; 浜砂 幸裕; 遠藤 靖典  第26回ファジィシステムシンポジウム (FSS2010)  2010/09
  • 遠藤 靖典; 濱砂 幸裕; 宮本 定明  リスク工学研究  4-  12  -15  2008/03

Research Themes

  • 日本学術振興会:科学研究費補助金 基盤研究(C)
    Date (from‐to) : 2019/04 -2022/03 
    Author : 濵砂 幸裕
  • 公益財団法人電気通信普及財団:研究調査助成
    Date (from‐to) : 2019/04 -2020/03 
    Author : 濵砂 幸裕
  • 構造的ゆらぎを伴うグラフデータに対するクラスタリング手法の確立
    Japan Society for the Promotion of Science:Grant-in-Aid for Young Scientists (B)
    Date (from‐to) : 2016/04 -2019/03 
    Author : HAMASUNA Yukihiro
  • グラフデータに対する知識融合型クラスタリング技法の開発
    公益財団法人電気通信普及財団:研究調査助成
    Date (from‐to) : 2016/04 -2017/03 
    Author : 濵砂 幸裕
  • 半教師付きスペクトラルクラスタリングの高度化~特に、ソーシャルデータの解析を目的として~
    公益財団法人電気通信普及財団:研究調査助成
    Date (from‐to) : 2013/04 -2014/03 
    Author : 濵砂 幸裕
  • 日本学術振興会:科学研究費助成事業
    Date (from‐to) : 2009 -2010 
    Author : 濱砂 幸裕
     
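    With the remarkable development of information and communication technology, data far larger and more complex than ever before are being accumulated, and the need to extract useful information from such data through flexible, human-like processing keeps growing. Clustering is one such data analysis method. It is an important method for extracting structure that is difficult for humans to find in large-scale, complex data, and it has been applied in various fields such as natural language and image recognition. The data handled in clustering are usually represented as points in a pattern space. However, when data carry inherent uncertainty such as errors or missing values, they are represented as intervals or ranges, which makes them difficult to handle with existing methods. This research project therefore aimed to build a methodology that can process uncertainty as flexibly as humans do, and worked on advancing clustering methods that treat the uncertainty of data as data with tolerance. As results of this project, clustering methods and regression analysis for data with tolerance were constructed, and clustering methods using clusterwise tolerance were established. We also attempted to extend validity measures, which evaluate the partitions obtained by clustering, to uncertain data, and newly constructed validity measures for data with tolerance. In parallel, regression analysis, a supervised learning method, was extended to uncertain data. From these results, we consider that a methodology for data analysis that handles uncertain data with the concept of tolerance has been established. In particular, the clustering methods for uncertain data differ greatly from conventional data analysis methods in that everything from the classification of data to its evaluation can be discussed within a unified framework based on the concept of tolerance. Furthermore, we were able to show various possibilities for further development, such as applying clusterwise tolerance, an extension of this project, to semi-supervised clustering methods, and we consider that the objectives of this project were fully achieved.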

Others

  • 2019/04 -2020/03  時系列データに対するクラスタリングの高度化 
    近畿大学学内研究助成金 奨励研究助成金 SR12 研究内容:時系列データに対するクラスタリングの高度化
  • 2016/04 -2017/03  大規模データに対するクラスタ数推定アルゴリズムの開発 
    近畿大学学内研究助成金 奨励研究助成金 SR01 研究内容:クラスタ構造のモデル化によるクラスタ数推定アルゴリズムの開発
  • 2013/04 -2013/04  クラスタ数自動推定アルゴリズムの開発 ~特に、情報量基準の観点から~ 
    近畿大学学内研究助成金 奨励研究助成金 SR04 研究内容:クラスタ数自動推定アルゴリズムの開発