COMPUTER SCIENCE DEPARTMENT: 2017

Thursday, 7 December 2017

NEW TECHNOLOGY 2017

Microsoft Surface Laptop

Microsoft’s new Surface Laptop is a first for the Windows maker: a conventional laptop that runs a stripped-down but more battery-efficient operating system called Windows 10 S. The company is aiming the $999 Surface Laptop at the education market, but it’s bound to go toe-to-toe with Apple’s entry-level MacBook Air as well. Other hardware makers, meanwhile, will offer Windows 10 S on far cheaper devices that will compete with Google’s Chromebooks. Together, the Surface Laptop and the new version of Windows show Microsoft is willing to mix things up in the notebook world, to consumers’ benefit.

Wednesday, 15 November 2017

RECENT PROGRAMMING LANGUAGES

JavaScript
Java
Python
Elixir
Rust
Go
TypeScript
PHP
Ruby on Rails
C#
Swift

JavaScript

As Usersnap stated in its article on the best web development trends for 2017, JavaScript is the most commonly used programming language in the world. Its logic can be illustrated by “if this, then that”. The latest version of JavaScript (ES2017) was finalized in 2017, and developers are already excited about it. Learn jQuery once you understand JavaScript. It is a library, with many plugins available, that saves you time and makes it much easier to add a feature to your code. PS: JavaScript can also be a backend language, but to keep it simple I have just listed it in the front end section.

Java

No list would be complete without Java. In the long run, it’s always a great choice and the stats suggest it’s not going away anytime soon. It’s used on 15 billion (that’s not a typo) devices and over 10 million developers use Java worldwide! Learn Java if you are interested in creating Android apps, games, software and website content. Example sites that use Java are Amazon, LinkedIn, and eBay. Java 9 was released in September 2017, so definitely check it out.

Python

Python is an object-oriented language whose syntax closely resembles English, which makes it a great language for beginners as well as seasoned professionals. Example sites that use Python are Instagram, YouTube, Reddit, NASA, and Usersnap.

Python 3.6 was released in December 2016 with some awesome features.

Elixir

Elixir is a functional, dynamic language created for building scalable and maintainable applications. Concurrency is one of its main benefits. It’s great for large applications that handle a lot of tasks at the same time. Example sites that use Elixir are Pinterest, Moz, and Bleacher Report.

Rust

Rust was the most loved programming language in the Stack Overflow survey for 2016, which says a lot. It’s a general-purpose language for creating fast, secure applications that take advantage of the powerful features of modern multi-core processors. Example sites that use Rust are Dropbox and Coursera.

Go

Go (or Golang) was created by Google and it’s only going to grow in popularity in 2017. It has an excellent standard library and it compiles fast. It’s also great for concurrent tasks and programs. Example sites that use Go are Netflix, YouTube, and Adobe.

TypeScript

TypeScript is a statically typed language that compiles to JavaScript and it’s growing fast! The new version 2.1 includes all the new features of JavaScript with optional static types. Additional benefits are improved checks against bugs and typos in your code, async/await and more. It’s also the preferred language for writing Angular 2 apps.

PHP

PHP is the most popular server-side programming language in the world. It’s generally used as the foundation of content management systems such as WordPress and of websites like Wikipedia and Facebook. PHP 7.1 was released in December 2016.

Ruby on Rails

Ruby on Rails (a notable framework) is to Ruby what jQuery is to JavaScript. It makes it much easier to use Ruby, but it’s advised that you have a good understanding of Ruby before you use Rails. Rails is a popular choice because many businesses make use of it, among them Airbnb, Groupon, Twitter, and Shopify. Also, make sure you have a good understanding of JavaScript, as you will need it when you advance in Rails. Ruby on Rails 5.1 was released in December 2016, so take a look at the new features.

C#

C# (‘see-sharp’) is a widely used programming language. It’s not limited to Microsoft’s .NET Framework. It’s also used for iOS/Android apps with the technology from Xamarin, and for Windows applications. Version 7.0 was released in 2017 with some incredible features.

Swift

Swift is one of the fastest growing programming languages in history! It’s built by Apple (not the one you eat) and they have some big plans for it, so it would be good to take note of it now. If you’d like to become an iOS App Developer, learn Swift.

D.AISHWARYA

Sunday, 29 October 2017

COMTECH 2017-18 INTRA DEPARTMENT COMPETITION

Computer Science & Computer Application Department is going to organize an Intra Department Competition on 17.10.2017.

Resource Person : Dr. M. Senthil Raj, M.Com., M.B.A., M.S.W., M.Sc(Psy)., M.Ed., M.Phil(Com), M.Phil(Edu), Ph.D., D.Litt.,

Principal MKJC

Venue : New Seminar Hall

Time : 9.30a.m - 2.50p.m

Tuesday, 26 September 2017

Cloud computing
With cloud computing playing an even more important role in the management of Big Data, the number of servers worldwide is expected to grow tenfold and the amount of information managed directly by enterprise data centers to grow by a factor of 14. As the infrastructure of the digital universe becomes ever more connected, information won’t reside within the region where it is consumed, nor will it need to. By 2020, IDC estimates that nearly 40% of data will be “touched” by cloud computing (private and public), meaning that somewhere between a byte’s origination and consumption, it will be stored or processed in a cloud. IT professionals will be at the forefront of this changing technology landscape and will need to lead by evolving their skills and knowledge to take up the new roles emerging in the industry. EMC Corporation has taken the lead in helping industry and academia develop the next generation of IT professionals, with a strong understanding of virtualization and cloud computing technologies, through its Cloud Infrastructure and Services.
FUTURE OF CLOUD
Cloud presents the perfect solution and is revolutionizing the IT process by making it possible to run IT-as-a-Service. IDC projects that the digital universe will reach 40 zettabytes (ZB) by 2020, an amount that exceeds previous forecasts by 5 ZB, resulting in a 50-fold growth from the beginning of 2010. As a result, investment in IT hardware, software, services, telecommunications and staff that could be considered the “infrastructure” of the digital universe will grow by 40% between 2012 and 2020. Investment in targeted areas like storage management, security, Big Data, and cloud computing will grow considerably faster. According to an EMC-Zinnov study, the private cloud opportunity alone will create 100,000 jobs in India by 2015. The study also warns that companies are currently under-skilled in addressing cloud computing implementations. Besides technical jobs, marketing and selling of cloud-based solutions will also be required, which in turn will create a host of new positions. Cloud computing is also giving rise to a new generation of software product companies, which can now sell their products on an on-demand basis over the Internet. It is early days yet, but IT engineers who can write cloud-ready applications and provision and maintain the infrastructure at the back end appear to have an edge over their peers. The cloud's impact on other parts of the economy will also be significant. According to a paper by research firm IDC, “jobs will be created across functional areas such as marketing, sales, finance and administration, production and service.”

The Dept of Computer Science and Computer Application is going to attend a webinar on Internet of Things on 27th of September, 2017 from 11:00AM to 1:00PM.
All relevant faculty members/HODs from your college are invited. They can register using the steps mentioned below.
We request the college to arrange Webex access and a phone to dial out to an audio bridge number. (E.g. the college may arrange a conference/training room, or ask faculty members to bring their own laptops and provide an internet connection and a common phone.)

Steps and important points for registration:

1) Go to www.tcsionhub.in - a website with both free & paid courses and communities.

We encourage you to see all the other courses/communities that might interest you or your students. However, for the current webinar registration:

2) In the search box on the homepage, enter the words "Faculty Development". The result will have a stamp named "iON PRIDE Faculty Development Program" as well as a link to view the courses available online.

3) Click on the stamp for iON PRIDE Faculty Development Program to open the course page.

4) To register for the webinar, click Apply. Registration for the Sept 27, 2017 webinar will remain open till Sat, Sep 23, 2017 only.

5) Enter all your details, select the "I Agree" checkbox to confirm your attendance and click Submit. You will receive an email about your successful registration.

6) Note down your application number.

7) To change details in your registration, use the 'Sign In' link in the About the Course section of the course page in Digital Hub. The sign-in credentials will be your application number and mobile number. Click "Edit" to edit details.

The program is likely to cover: i) What is IOT? ii) How is it changing the industry landscape? iii) Skills needed for the students who want to work in IOT iv) Some projects that TCS is involved in or has completed v) Open source software/projects that can be given to the students vi) Q&A

Thursday, 7 September 2017

ROBOTICS CERTIFICATE COURSE

Department of Computer Science & Computer Application is going to organize a "ROBOTICS CERTIFICATE COURSE" on 2.8.2017

Resource Person:

Mr.G.KUMAR,

DIRECTOR OF ROBOTRICKS & CARDIO CARE,

CHENNAI,

Venue : New Seminar Hall

Time : 09.30am onwards

All are cordially Invited

Wednesday, 6 September 2017

ST-DBSCAN: An algorithm for clustering spatial–temporal data

Abstract

This paper presents a new density-based clustering algorithm, ST-DBSCAN, which is based on DBSCAN. We propose three marginal extensions to DBSCAN related with the identification of (i) core objects, (ii) noise objects, and (iii) adjacent clusters. In contrast to the existing density-based clustering algorithms, our algorithm has the ability of discovering clusters according to non-spatial, spatial and temporal values of the objects. In this paper, we also present a spatial–temporal data warehouse system designed for storing and clustering a wide range of spatial–temporal data. We show an implementation of our algorithm by using this data warehouse and present the data mining results.

Keywords:

Data mining; Cluster analysis; Spatial–temporal data; Cluster visualization; Algorithms

1. Introduction

Clustering is one of the major data mining methods for knowledge discovery in large databases. It is the process of grouping large data sets according to their similarity. Cluster analysis is a major tool in many areas of engineering and scientific applications including data segmentation, discretization of continuous attributes, data reduction, outlier detection, noise filtering, pattern recognition and image processing. In the field of Knowledge Discovery in Databases (KDD), cluster analysis is known as an unsupervised learning process, since there is no a priori knowledge about the data set.

Most studies in KDD [12] focus on discovering clusters from ordinary data (non-spatial and non-temporal data), so they are impractical to use for clustering spatial–temporal data. Spatial–temporal data refers to data which is stored as temporal slices of the spatial dataset. Knowledge discovery from spatial–temporal data is a very promising subfield of data mining because increasingly large volumes of spatial–temporal data are collected and need to be analyzed. The knowledge discovery process for spatial–temporal data is more complex than for non-spatial and non-temporal data, because spatial–temporal clustering algorithms have to consider the spatial and temporal neighbors of objects in order to extract useful knowledge. Clustering algorithms designed for spatial–temporal data can be used in many applications such as geographic information systems, medical imaging, and weather forecasting.

This paper presents a new density-based clustering algorithm, ST-DBSCAN, which is based on the algorithm DBSCAN (Density-Based Spatial Clustering of Applications with Noise) [5]. In DBSCAN, the density associated with a point is obtained by counting the number of points in a region of specified radius around the point. Points with a density above a specified threshold are constructed as clusters. Among the existing clustering algorithms, we have chosen the DBSCAN algorithm because it is able to discover clusters of arbitrary shape such as linear, concave, oval, etc. Furthermore, in contrast to some clustering algorithms, it does not require the number of clusters to be determined in advance. DBSCAN has proven its ability to process very large databases [3,5,6].

We have improved the DBSCAN algorithm in three important directions. First, unlike the existing density-based clustering algorithms, our algorithm can cluster spatial–temporal data according to its non-spatial, spatial and temporal attributes. Second, DBSCAN cannot detect some noise points when clusters of different densities exist. Our algorithm solves this problem by assigning to each cluster a density factor. Third, the values of border objects in a cluster may be very different from the values of border objects on the opposite side, if the non-spatial values of neighboring objects differ only slightly and the clusters are adjacent to each other. Our algorithm solves this problem by comparing the average value of a cluster with each newly arriving value.

In addition to the new clustering algorithm, this paper also presents a spatial data warehouse system designed for storing and clustering a wide range of spatial–temporal data. Environmental data, from a variety of sources, were integrated as coverages, grids, shape files, and tables. Special functions were developed for data integration, data conversion, visualization, analysis and management. User-friendly interfaces were also developed, allowing relatively inexperienced users to operate the system. In order to demonstrate the applicability of our algorithm to real world problems, we applied our algorithm to the data warehouse, and then presented and discussed the data mining results.

Spatial–temporal data is indexed and retrieved according to spatial and time dimensions. A time period attached to the spatial data expresses when it was valid or stored in the database. A temporal database may support valid time, transaction time or both. Valid time denotes the time period during which a fact is true with respect to the real world. Transaction time is the time period during which a fact is stored in the database. This study focuses on the valid time aspect of temporal data.

The rest of the paper is organized as follows. Section 2 summarizes the existing clustering algorithms and gives basic concepts of density-based clustering algorithms. Section 3 describes the drawbacks of existing density-based clustering algorithms and our efforts to overcome these problems. Section 4 explains our algorithm in detail and presents the performance of the algorithm. Section 5 presents three applications which are implemented to demonstrate its applicability to real world problems, and shows and discusses the data mining results. Finally, a conclusion and some directions for future work are given in Section 6.

2. Related works and basic concepts

This section summarizes and discusses the existing clustering algorithms and then gives basic concepts of density-based algorithms.

2.1. Density-based clustering

The problem of clustering can be defined as follows:

Definition 1. Given a database of n data objects D = {o₁, o₂, … , o_n}, the process of partitioning D into C = {C₁, C₂, … , C_k} based on a certain similarity measure is called clustering. The C_i are called clusters, where C_i ⊆ D (i = 1, 2, … , k), ⋂_{i=1}^{k} C_i = ∅ and ⋃_{i=1}^{k} C_i = D.

Clustering algorithms can be categorized into five main types [13]: Partitional, Hierarchical, Grid-based, Model-based and Density-based clustering algorithms. In Partitional algorithms, cluster similarity is measured in regard to the mean value of the objects in a cluster, its center of gravity (K-Means [19]), or each cluster is represented by one of the objects of the cluster located near its center (K-Medoid [26]). K is an input parameter for these algorithms; unfortunately, it is not available for many applications. CLARANS [20] is an improved version of the K-Medoid algorithm. Hierarchical algorithms such as CURE [9] and BIRCH [31] produce a set of nested clusters organized as a hierarchical tree. Each node of the tree represents a cluster of a database D. Grid-based algorithms such as STING [28] and WaveCluster [23] are based on a multiple-level grid structure on which all operations for clustering are performed. In Model-based algorithms (COBWEB [8], etc.), a model is hypothesized for each of the clusters and the idea is to find the best fit of the data to that model. They are often based on the assumption that the data are generated by a mixture of underlying probability distributions.

The density-based notion is a common approach for clustering. Density-based clustering algorithms are based on the idea that objects which form a dense region should be grouped together into one cluster. They use a fixed threshold value to determine dense regions. They search for regions of high density in a feature space that are separated by regions of lower density.

Density-based clustering algorithms such as DBSCAN [5], OPTICS [2], DENCLUE [15] and CURD [18] are to some extent capable of clustering databases [21]. One drawback of these algorithms is that they capture only certain kinds of noise points when clusters of different densities exist. Furthermore, they are adequate if the clusters are distant from each other, but not satisfactory when clusters are adjacent to each other. The detailed description of these problems and our solutions are given in Section 3.

In our study, we have chosen the DBSCAN algorithm because it is able to discover clusters of arbitrary shape such as linear, concave, oval, etc. Furthermore, in contrast to some clustering algorithms, it does not require the number of clusters to be determined in advance. DBSCAN has proven its ability to process very large databases [3,6].

In the literature, the DBSCAN algorithm was used in many studies. For example, the other popular density-based algorithm OPTICS (Ordering Points To Identify the Clustering Structure) [2] is based on the concepts of the DBSCAN algorithm and identifies nested clusters and the structure of clusters. The Incremental DBSCAN [7] algorithm is also based on the clustering algorithm DBSCAN and is used for incremental updates of a clustering after insertion of a new object into the database and deletion of an existing object from the database. Based on the formal notion of clusters, the incremental algorithm yields the same result as the non-incremental DBSCAN algorithm.

The SDBDC (Scalable Density-Based Distributed Clustering) [16] method also uses the DBSCAN algorithm on both local sites and a global site to cluster distributed objects. In this method, the DBSCAN algorithm is first carried out on each local site. Then, based on these local clustering results, cluster representatives are determined. Then, based on these local representatives, the standard DBSCAN algorithm is carried out on the global site to construct the distributed clustering. This study proposes the usage of different Eps-values for each local representative.

Wen et al. [29] adopted DBSCAN and Incremental DBSCAN as the core algorithms of their query clustering tool. They used DBSCAN to cluster frequently asked questions and most popular topics on a search engine. Spieth et al. [24] applied DBSCAN to identify solutions for the inference of regulatory networks. Finally, the SNN density-based clustering algorithm [25] is also based on DBSCAN and is applicable to high-dimensional data consisting of time series data of atmospheric pressure at various points on the earth.

2.2. Basic concepts

DBSCAN is designed to discover arbitrary-shaped clusters in any database D and at the same time can distinguish noise points. More specifically, DBSCAN accepts a radius value Eps (ε) based on a user-defined distance measure and a value MinPts for the minimal number of points that should occur within the Eps radius. Some concepts and terms to explain the DBSCAN algorithm can be defined as follows [5].

Definition 2 (Neighborhood). It is determined by a distance function (e.g., Manhattan Distance, Euclidean Distance) for two points p and q, denoted by dist(p, q).

Definition 3 (Eps-neighborhood). The Eps-neighborhood of a point p is defined by {q ∈ D | dist(p, q) ≤ Eps}.

Definition 4 (Core object). A core object is a point whose neighborhood of a given radius (Eps) contains at least a minimum number (MinPts) of other points (Fig. 1c).

Definition 5 (Directly density-reachable). An object p is directly density-reachable from the object q if p is within the Eps-neighborhood of q and q is a core object.

Definition 6 (Density-reachable). An object p is density-reachable from the object q with respect to Eps and MinPts if there is a chain of objects p₁, … , p_n, with p₁ = q and p_n = p, such that p_{i+1} is directly density-reachable from p_i with respect to Eps and MinPts, for 1 ≤ i < n, p_i ∈ D (Fig. 1a).

Definition 7 (Density-connected). An object p is density-connected to object q with respect to Eps and MinPts if there is an object o ∈ D such that both p and q are density-reachable from o with respect to Eps and MinPts (Fig. 1b).

Definition 8 (Density-based cluster). A cluster C is a non-empty subset of D satisfying the following “maximality” and “connectivity” requirements: (1) ∀p, q: if q ∈ C and p is density-reachable from q with respect to Eps and MinPts, then p ∈ C. (2) ∀p, q ∈ C: p is density-connected to q with respect to Eps and MinPts.

Definition 9 (Border object). An object p is a border object if it is not a core object but is density-reachable from a core object.

The algorithm starts with the first point p in database D, and retrieves all neighbors of point p within Eps distance. If the total number of these neighbors is greater than MinPts, that is, if p is a core object, a new cluster is created. The point p and its neighbors are assigned to this new cluster. Then, it iteratively collects the neighbors within Eps distance from the core points. The process is repeated until all of the points have been processed.
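The procedure described above can be sketched in Python. This is a minimal brute-force illustration under stated assumptions, not the implementation from [5]: the names `region_query` and `dbscan`, the use of -1 as the noise label, and the choice to count a point in its own neighborhood and test `>= min_pts` are all conventions picked for the example.

```python
import math

NOISE = -1        # label for points that end up in no cluster
UNVISITED = None  # label for points that have not been processed yet


def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx] (the point itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [UNVISITED] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border object
            continue
        # points[i] is a core object: create a cluster and grow it
        labels[i] = cluster_id
        seeds = [j for j in neighbors if j != i]
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border object reached from a core object
                continue
            if labels[j] is not UNVISITED:
                continue  # already assigned to a cluster
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is a core object, keep expanding
        cluster_id += 1
    return labels


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1),
           (10, 10), (10, 11), (11, 10), (11, 11),
           (50, 50)]
    print(dbscan(pts, eps=1.5, min_pts=3))
```

In this toy input the two squares of four points become clusters 0 and 1, and the isolated point (50, 50) is labeled as noise. The brute-force neighbor search is quadratic; a spatial index would be needed for large databases.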

3. Problems of existing approaches

3.1. Problem of clustering spatial–temporal data

In order to determine whether a set of points is similar enough to be considered a cluster or not, we need a distance measure dist(i, j) that tells how far points i and j are. The most common distance measures used are Manhattan distance, Euclidean distance, and Minkowski distance. Euclidean distance is defined as Eq. (1).

(1) dist(i, j) = √((x_i₁ − x_j₁)² + (x_i₂ − x_j₂)² + … + (x_in − x_jn)²)

where i = (x_i₁, x_i₂, … , x_in) and j = (x_j₁, x_j₂, … , x_jn) are two n-dimensional data objects. For example, the Euclidean distance between the two data objects A(1, 2) and B(5, 3) is 4.12.
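The worked example can be checked with a few lines of Python. The helper names `euclidean` and `manhattan` are chosen for this illustration only; the Manhattan variant is included because the text names it as an alternative measure.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension (Eq. 1)."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan distance: sum of absolute coordinate differences."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of dimensions")
    return sum(abs(x - y) for x, y in zip(a, b))


A = (1, 2)
B = (5, 3)
print(round(euclidean(A, B), 2))  # sqrt(4**2 + 1**2) = sqrt(17), prints 4.12
print(manhattan(A, B))            # |1 - 5| + |2 - 3|, prints 5
```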

The DBSCAN algorithm uses only one distance parameter Eps to measure similarity of spatial data with one dimension. In order to support two dimensional spatial data, we propose two distance metrics, Eps1 and Eps2, to define the similarity by a conjunction of two density tests. Eps1 is used for spatial values to measure the closeness of two points geographically. Eps2 is used to measure the similarity of non-spatial values. For example, let A(x1, y1) and B(x2, y2) be two points (spatial values), and let t1, t2 and t3, t4 be the temperature values (DayTimeTemperature, NightTimeTemperature) of A and B respectively (non-spatial values). In this example, Eps1 is used to measure the closeness of the two points geographically, while Eps2 is used to measure the similarity of their temperature values. If A(x1, y1, t1, t2) and B(x2, y2, t3, t4) are two points, Eps1 and Eps2 are calculated by the formulas in Eq. (2).

(2) Eps1 = √((x1 − x2)² + (y1 − y2)²)

Eps2 = √((t1 − t3)² + (t2 − t4)²)

In order to support temporal aspects, spatio-temporal data is first filtered by retaining only the temporal neighbors and their corresponding spatial values. Two objects are temporal neighbors if the values of these objects are observed in consecutive time units such as consecutive days in the same year or in the same day in consecutive years.
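The conjunction of the two density tests, combined with the temporal filter, might look as follows in Python. This is only a sketch: the record layout `Observation`, the function name `retrieve_neighbors`, and the simplification that temporal neighbors are observations whose integer `day` indices differ by at most one are assumptions made for the example, not details given in the paper.

```python
import math
from typing import NamedTuple


class Observation(NamedTuple):
    x: float        # spatial coordinate
    y: float        # spatial coordinate
    day: int        # index of the time unit (assumed to be consecutive integers)
    t_day: float    # non-spatial value, e.g. daytime temperature
    t_night: float  # non-spatial value, e.g. night-time temperature


def spatial_dist(a, b):
    """Distance compared against Eps1: geographic closeness."""
    return math.hypot(a.x - b.x, a.y - b.y)


def non_spatial_dist(a, b):
    """Distance compared against Eps2: similarity of the non-spatial values."""
    return math.hypot(a.t_day - b.t_day, a.t_night - b.t_night)


def retrieve_neighbors(data, idx, eps1, eps2):
    """Indices of temporal neighbors of data[idx] that pass both distance tests."""
    p = data[idx]
    return [
        j for j, q in enumerate(data)
        if j != idx
        and abs(p.day - q.day) <= 1         # simplified temporal-neighbor filter
        and spatial_dist(p, q) <= eps1      # close geographically
        and non_spatial_dist(p, q) <= eps2  # similar non-spatial values
    ]


if __name__ == "__main__":
    data = [
        Observation(0.0, 0.0, 1, 20.0, 10.0),
        Observation(0.5, 0.0, 1, 20.5, 10.2),  # near and similar
        Observation(0.5, 0.5, 2, 30.0, 18.0),  # near, but different temperatures
        Observation(9.0, 9.0, 1, 20.0, 10.0),  # similar, but far away
        Observation(0.2, 0.1, 5, 20.1, 10.1),  # near and similar, wrong time unit
    ]
    print(retrieve_neighbors(data, 0, eps1=1.0, eps2=1.0))  # prints [1]
```

Only the second observation passes all three conditions, which shows why a single Eps parameter is not enough: each of the other records would be accepted by at least one test taken on its own.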

3.2. Problem of identifying noise objects

From the view of a clustering algorithm, noise is a set of objects not located in clusters of a database. More formally, noise can be defined as follows:

Definition 10 (Noise). Let C₁, … , C_k be the clusters of the database D. Then the noise is the set of points in the database D not belonging to any cluster C_i, where i = 1, … , k, i.e., noise = {p ∈ D ∣ ∀i: p ∉ C_i}.
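The set expression in Definition 10 translates almost directly into code. In this small sketch the function name `noise_points` and the representation of points as hashable tuples are assumptions made for illustration.

```python
def noise_points(database, clusters):
    """Return the points of `database` that belong to none of the `clusters`."""
    clustered = set().union(*clusters)  # union of C_1, ..., C_k
    return [p for p in database if p not in clustered]


D = [(0, 0), (0, 1), (5, 5), (9, 9)]
C1 = {(0, 0), (0, 1)}
C2 = {(5, 5)}
print(noise_points(D, [C1, C2]))  # prints [(9, 9)]
```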

Existing density-based clustering algorithms [14] produce meaningful and adequate results under certain conditions, but their results are not satisfactory when clusters of different densities exist. To illustrate, consider the example given in Fig. 2.

Fig. 2. Clusters of different densities.

This is a simple dataset containing 52 objects. There are 25 objects in the first cluster C₁, 25 objects in the second cluster C₂, and two additional noise objects o₁ and o₂. In this example, C₂ forms a denser cluster than C₁. In other words, the densities of the clusters are different from each other. The DBSCAN algorithm identifies only one noise object, o₁, because for approximately every object p in C₁, the distance between p and its nearest neighbor is greater than the distance between o₂ and C₂. For this reason, we can’t determine an appropriate value for the input parameter Eps. If the Eps value is less than the distance between o₂ and C₂, some objects in C₁ are assigned as noise objects. If the Eps value is greater than the distance between o₂ and C₂, the object o₂ is not assigned as a noise object.
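The dilemma can be made concrete with a tiny numeric example. The one-dimensional coordinates below are invented for illustration and are not the data of Fig. 2; they only reproduce the situation in which a sparse cluster has larger internal gaps than the gap between a noise object and a dense cluster.

```python
def nearest_neighbor_distance(p, others):
    """Distance from p to the closest other value in `others`."""
    return min(abs(p - q) for q in others if q != p)


c1 = [0.0, 2.0, 4.0, 6.0, 8.0]        # sparse cluster: neighbors are 2.0 apart
c2 = [20.0, 20.25, 20.5, 20.75, 21.0]  # dense cluster: neighbors are 0.25 apart
o2 = 22.0                              # noise object close to the dense cluster

gap_in_c1 = max(nearest_neighbor_distance(p, c1) for p in c1)
gap_o2_c2 = min(abs(o2 - q) for q in c2)

print(gap_in_c1)  # prints 2.0
print(gap_o2_c2)  # prints 1.0

# Any Eps >= 2.0 keeps C1 together but also attaches o2 to C2;
# any Eps < 1.0 isolates o2 but splits C1 into noise.
print(gap_o2_c2 < gap_in_c1)  # prints True: no single Eps satisfies both goals
```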

4. ST-DBSCAN algorithm

4.1. The description of the algorithm

While the DBSCAN algorithm needs two inputs, our algorithm ST-DBSCAN requires four parameters, Eps1, Eps2, MinPts, and Δϵ, because of the extensions described in Section 3. Eps1 is the distance parameter for spatial attributes (latitude and longitude). Eps2 is the distance parameter for non-spatial attributes. A distance metric such as the Euclidean, Manhattan or Minkowski distance metric can be used for Eps1 and Eps2. MinPts is the minimum number of points within Eps1 and Eps2 distance of a point. If a region is dense, then it should contain more points than the MinPts value.

In [5], a simple heuristic is presented which is effective in many cases to determine the parameters Eps and MinPts. The heuristic suggests MinPts ≈ ln(n), where n is the size of the database, and Eps must be picked depending on the value of MinPts. The first step of the heuristic method is to determine the distances to the k-nearest neighbors for each object, where k is equal to MinPts. Then these k-distance values should be sorted in descending order. Then we should determine the threshold point, which is the first “valley” of the sorted graph. We should select Eps to be less than the distance defined by the first valley. The last parameter Δϵ is used to prevent the discovery of combined clusters caused by small differences in the non-spatial values of neighboring locations.

The algorithm starts with the first point p in database D and retrieves all points density-reachable from p with respect to Eps1 and Eps2. If p is a core object (see Definition 4), a cluster is formed. If p is a border object (see Definition 9), no points are density-reachable from p and the algorithm visits the next point of the database. The process is repeated until all of the points have been processed.
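The first two steps of the parameter heuristic (pick MinPts near ln(n), then compute and sort the k-distances) can be sketched as below. The function names `suggest_min_pts` and `sorted_k_distances` are hypothetical, the search is brute force, and locating the first “valley” is left to visual inspection of the sorted values, as in the description above.

```python
import math


def suggest_min_pts(n):
    """MinPts ≈ ln(n), rounded to the nearest integer and at least 1."""
    return max(1, round(math.log(n)))


def sorted_k_distances(points, k):
    """Distance from each point to its k-th nearest neighbor, in descending order."""
    if not 1 <= k < len(points):
        raise ValueError("k must be between 1 and len(points) - 1")
    k_dists = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        k_dists.append(dists[k - 1])
    return sorted(k_dists, reverse=True)


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1),
           (10, 10), (10, 11), (11, 10), (11, 11),
           (50, 50)]
    print(suggest_min_pts(1000))  # ln(1000) is about 6.9, prints 7
    for d in sorted_k_distances(pts, k=2):
        print(round(d, 2))
```

In the toy data the isolated point produces one very large k-distance followed by a flat run of small values; Eps would be chosen below the drop between them.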

4.2. Performance evaluation

The average runtime complexity of the DBSCAN algorithm is O(n ∗ log n), where n is the number of objects in the database. Our modifications do not change the runtime complexity of the algorithm. DBSCAN has been proven in its ability of processing very large databases. The paper [6] shows that the runtime of other clustering algorithms such as CLARANS [20], DBCLASD [30] is between 1.5 and 3 times the runtime of DBSCAN. This factor increases with increasing size of the database.

6. Conclusions and future work

Clustering is a main method in many areas, including data mining and knowledge discovery, statistics, and machine learning. This study presents a new density-based clustering algorithm, ST-DBSCAN, which is constructed by modifying the DBSCAN algorithm. The first modification makes it possible to discover clusters in spatial–temporal data. The second modification is necessary to find noise objects when clusters of different densities exist. We introduce a new concept: the density factor. We assign to each cluster a density factor, which is the degree of the density of the cluster. The third modification provides a comparison of the average value of a cluster with each newly arriving value. In order to demonstrate the applicability of our algorithm to real world problems, we present an application using a spatial–temporal data warehouse. Experimental results demonstrate that our modifications appear to be very promising when spatial–temporal data is clustered.

Very large databases need extreme computing power. In future studies, it is intended to run the algorithm in parallel in order to improve the performance. In addition, more useful heuristics may be found to determine the input parameters Eps and MinPts.

References

[1] T. Abraham, J.F. Roddick, Survey of spatio-temporal databases, GeoInformatica, Springer, 3 (1) (1999), pp. 61–99.

[2] M. Ankerst, M.M. Breunig, H.-P. Kriegel, J. Sander, OPTICS: Ordering points to identify the clustering structure, in: Proceedings of ACM SIGMOD International Conference on Management of Data, Philadelphia, PA, 1999, pp. 49–60.

[3] Z. Aoying, Z. Shuigeng, Approaches for scaling DBSCAN algorithm to large spatial database, Journal of Computer Science and Technology, 15 (6) (2000), pp. 509–526.

[4] C. Böhm, S. Berchtold, H.-P. Kriegel, U. Michel, Multidimensional index structures in relational databases, Journal of Intelligent Information Systems (JIIS), Springer, 15 (1) (2000), pp. 51–70.

[5] M. Ester, H.-P. Kriegel, J. Sander, X. Xu, A density-based algorithm for discovering clusters in large spatial databases with noise, in: Proceedings of Second International Conference on Knowledge Discovery and Data Mining, Portland, OR, 1996, pp. 226–231.

[6] M. Ester, H.-P. Kriegel, J. Sander, X. Xu, Clustering for mining in large spatial databases, KI-Journal (Artificial Intelligence), Special Issue on Data Mining, 12 (1) (1998), pp. 18–24.

[7] M. Ester, H.-P. Kriegel, J. Sander, M. Wimmer, X. Xu, Incremental clustering for mining in a data warehousing environment, in: Proceedings of International Conference on Very Large Databases (VLDB’98), New York, USA, 1998, pp. 323–333.

[8] D. Fisher, Knowledge acquisition via incremental conceptual clustering, Machine Learning, 2 (2) (1987), pp. 139–172.

L.Hemalatha

Asst.prof, Dept of Computer Science