Accepted Papers
- Data Mining In Healthcare For Diabetes Mellitus
Ravneet Jyot Singh and Williamjeet Singh, Punjabi University, India
ABSTRACT
Disease diagnosis is one of the applications in which data mining tools are producing successful results. Diabetes has been a leading cause of death worldwide in recent years. Several researchers rely on statistical data. The availability of huge amounts of medical data creates a need for powerful mining tools to help health care professionals diagnose diabetes. The use of data mining techniques in diagnosing diabetes has been comprehensively investigated and shows acceptable levels of accuracy. Recently, researchers have been investigating the effect of hybridizing more than one technique, with enhanced diagnostic results. However, using data mining techniques to identify a suitable treatment for diabetes patients has received less attention. This paper identifies gaps in the research on diabetes diagnosis and treatment. It aims to close those gaps systematically, to discover whether applying data mining techniques to diabetes treatment data can provide performance as reliable as that achieved in controlling and diagnosing the disease.
- Frequent-Pattern Tree Algorithm Application to S&P and Equity Indexes
E. Younsi, H. Andriamboavonjy, A. David, S. Dokou, B. Lemrabet, ECE Paris School of Engineering, France
ABSTRACT
Software and time optimization are very important in financial markets, which are competitive fields, and the emergence of new computer tools sharpens the challenge. In this context, any improvement to technical indicators that generate buy or sell signals is a major issue, and many tools have been created to make them more effective. This concern for efficiency leads the present paper to seek the best (and most innovative) way of achieving the largest improvement in these indicators. The approach attaches a signature to frequent market configurations by applying a frequent-pattern extraction method, which is the most appropriate here for optimizing investment strategies. The goal of the proposed trading algorithm is to find the most accurate signatures, using a backtesting procedure applied to technical indicators in order to improve their performance. The problem is then to determine the signatures which, combined with an indicator, outperform the indicator alone. For this purpose the FP-Tree algorithm has been preferred, as it appears to be the most efficient algorithm for the task.
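The frequent-pattern extraction step can be illustrated with a minimal FP-growth sketch in Python. This is not the authors' implementation: it assumes each trading day has already been reduced to a set of discrete market labels, and the labels (`rsi_low`, `vol_high`, `up`, `down`), the function names, and the support threshold are all hypothetical.

```python
from collections import defaultdict


class Node:
    """One node of the FP-tree: an item, a count, and links up and down."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_tree(weighted, min_support):
    """Build an FP-tree from (items, count) pairs; return header table and supports."""
    freq = defaultdict(int)
    for items, count in weighted:
        for item in set(items):
            freq[item] += count
    freq = {i: c for i, c in freq.items() if c >= min_support}

    root = Node(None, None)
    header = defaultdict(list)  # item -> every tree node holding that item
    for items, count in weighted:
        # Keep frequent items only, most frequent first (ties broken by name).
        kept = sorted((i for i in set(items) if i in freq),
                      key=lambda i: (-freq[i], i))
        node = root
        for item in kept:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, freq


def fp_growth(weighted, min_support, suffix=frozenset()):
    """Return {frozenset(pattern): support} for all frequent patterns."""
    header, freq = build_tree(weighted, min_support)
    patterns = {}
    for item, nodes in header.items():
        pattern = suffix | {item}
        patterns[pattern] = freq[item]
        # Conditional pattern base: the prefix path above each node of `item`.
        conditional = []
        for node in nodes:
            path = []
            parent = node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        patterns.update(fp_growth(conditional, min_support, pattern))
    return patterns


def mine_frequent_patterns(transactions, min_support):
    """Convenience wrapper: every transaction counts once."""
    return fp_growth([(t, 1) for t in transactions], min_support)


# Hypothetical market configurations, one per trading day.
days = [
    ["rsi_low", "vol_high", "up"],
    ["rsi_low", "vol_high", "up"],
    ["rsi_low", "up"],
    ["vol_high", "down"],
    ["rsi_low", "vol_high", "down"],
]
patterns = mine_frequent_patterns(days, min_support=3)
for pattern, support in sorted(patterns.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(pattern), support)
```

In this toy data the pattern `{rsi_low, up}` appears on three days, so it would be a candidate signature to combine with an indicator and evaluate by backtesting. How the paper actually encodes market configurations is not stated in the abstract.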
Keywords: Quantitative
- Applications of data mining in Integrated Circuits manufacturing
Sidda Reddy Kurakula, Applied Materials India (P) Ltd, India
ABSTRACT
Integrated circuits (a.k.a. chips or ICs) are among the most complex devices manufactured. Making chips is a lengthy process requiring hundreds of precisely controlled steps, such as film deposition, etching, and patterning of various materials, until the final device structure is realized. Each chip also goes through a huge number of complicated test and inspection steps to ensure quality. In IC manufacturing, yield is defined as the percentage of chips in a finished wafer that pass all tests and function properly. Yield improvement translates directly into increased revenue; e.g., a 1% improvement in yield in a fab running 40,000 wafer starts per month could equate to a $3 million to $30 million annual revenue gain, depending on the technology nodes being run. An enormous amount of data (terabytes per day) is logged from all the equipment used in the fab. This paper describes some applications of advanced data mining techniques used by chip makers and equipment suppliers to improve yield, match equipment, increase equipment output, and predict the change in equipment performance before and after maintenance.
- An effective Tokenization Algorithm for Information Retrieval Systems
Vikram Singh & Balwinder Saini, NIT Kurukshetra, India
ABSTRACT
On the web, the amount of operational data has been increasing exponentially over the past few decades, and the expectations of data users are changing proportionally. Data users expect deeper, more exact, and more detailed results. Retrieval of relevant results is always affected by the pattern in which documents are stored and indexed. Various techniques have been designed to index documents, and indexing is done on the tokens identified within the documents. The tokenization process is primarily effective in identifying tokens and their counts. In this paper, we propose an effective tokenization approach based on a training vector, and the results show the efficiency and effectiveness of the proposed algorithm. Tokenization of the given documents helps to satisfy the user's information need more precisely and reduces the search sharply, and it is regarded as a part of information retrieval. Tokenization involves pre-processing documents and generating their respective tokens; on the basis of these tokens, probabilistic IR generates its scoring and yields a reduced search space. The number of tokens generated is the parameter used for result analysis.
- COLOR IMAGE RETRIEVAL BASED ON FULL RANGE AUTOREGRESSIVE MODEL WITH LOW-LEVEL FEATURES
A. Annamalai Giri, Sri Kuvempu First Grade College, India; K. Seetharaman, Annamalai University, India
ABSTRACT
This paper proposes a novel method for color image retrieval based on the Full Range Autoregressive (FRAR) model with a Bayesian approach. The color image is segmented into various regions according to its structure and nature. The segmented image is modeled in the RGB color space. The model parameters are computed on each region and assembled into a feature vector for the image. The Hotelling T² statistic is applied to measure the distance between the query and target images. The results obtained are compared with those of existing methods and show that the proposed method outperforms them.
- COLOR IMAGE RETRIEVAL BASED ON NON-PARAMETRIC STATISTICAL TESTS OF HYPOTHESIS
R. Shekhar, Manonmaniam Sundaranar University, India; K. Seetharaman, Annamalai University, India
ABSTRACT
This paper proposes a novel method for color image retrieval based on statistical non-parametric tests, namely the two-sample Wald test for equality of variance and the Mann-Whitney U test. The proposed method first tests the deviation, i.e. the distance in terms of variance between the query and target images. If the images pass this test, the method proceeds to test the spectrum of energy, i.e. the distance between the mean values of the two images; otherwise, testing stops. If the query and target images pass both tests, it is inferred that the two images belong to the same class, i.e. the images are the same; otherwise, it is assumed that they belong to different classes, i.e. the images are different. The proposed method is robust to scaling and rotation, since it adjusts itself and treats either the query image or the target image as a sample of the other.
- A Study on Computational Intelligence Techniques to Data Mining
S. Selvi, V. Divya Bharathi, R. Priya, V. Anitha, Government College of Engineering, India
ABSTRACT
The rate of growth of data from various applications and resources is now increasing exponentially. Collections of different data sets are formulated into Big Data. These data sets are so complex and so large in volume that they are very difficult to handle with existing database management tools. Soft computing is an emerging technique in the study of computational intelligence. It includes fuzzy logic, neural networks, genetic algorithms, machine learning, and rough set theory, among others. Rough set theory is a tool used to derive knowledge from imprecise, imperfect, and incomplete data. This paper presents an evaluation of rough set theory applications to data mining techniques. Several rough set based systems developed for data mining are analyzed, such as Generalized Distribution Table and Rough Set System (GDT-RS), Rough Sets with Heuristics (RSH), Rough Sets and Boolean Reasoning (RSBR), the MapReduce technique, and Dynamic Data Mining. The models proposed and the techniques employed by the researchers in these methods are discussed.
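The central idea of rough set theory, approximating a concept from imprecise data, can be sketched as follows. This is a minimal illustration of the standard lower and upper approximations, not any of the systems surveyed in the paper. The patient records, attribute names, and the `approximations` function are hypothetical.

```python
from collections import defaultdict


def approximations(objects, attributes, target):
    """Return (lower, upper) approximations of `target`.

    objects:    dict mapping object name -> dict of attribute values
    attributes: attributes used to decide which objects are indiscernible
    target:     set of object names forming the concept to approximate
    """
    # Group objects that have identical values on the chosen attributes.
    classes = defaultdict(set)
    for name, row in objects.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(name)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # whole class lies inside the concept
            lower |= block
        if block & target:    # class overlaps the concept
            upper |= block
    return lower, upper


# Hypothetical records: p1 and p2 look identical, but only p1 is in the concept.
patients = {
    "p1": {"fever": "yes", "cough": "yes"},
    "p2": {"fever": "yes", "cough": "yes"},
    "p3": {"fever": "yes", "cough": "no"},
    "p4": {"fever": "no", "cough": "no"},
}
has_flu = {"p1", "p3"}

lower, upper = approximations(patients, ["fever", "cough"], has_flu)
boundary = upper - lower
print("certainly in concept:", sorted(lower))
print("possibly in concept: ", sorted(upper))
print("boundary region:     ", sorted(boundary))
```

Objects in the lower approximation certainly belong to the concept given the available attributes. Objects in the boundary region cannot be classified either way, which is how rough sets express the imprecision in the data.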