<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-1245508957622778430</id><updated>2011-07-07T17:30:51.267-07:00</updated><title type='text'>Data Mining Reading Group</title><subtitle type='html'>This blog is about the Data Mining Reading Group formed by Jeff Bergman. The goal of this reading group is to review the journal papers and research articles from peer reviewed journals on the topic of KDD, Machine Learning and Data Mining.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://dmreadinggroup.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://dmreadinggroup.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Adnan Masood</name><uri>http://www.blogger.com/profile/06053395538661164636</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>13</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-1245508957622778430.post-1573106933055755245</id><published>2009-09-28T14:14:00.000-07:00</published><updated>2009-09-28T14:25:35.541-07:00</updated><title type='text'>Meeting 13 - Interestingness Measures for Data Mining: A Survey</title><content 
type='html'>This week's paper is '&lt;a href="http://portal.acm.org/citation.cfm?doid=1132960.1132963"&gt;Interestingness measures for data mining: A survey&lt;/a&gt;' by Liqiang Geng and Howard Hamilton of the University of Regina, Saskatchewan. The paper provides a great deal of information about the notion of 'interestingness' and the various ways of quantifying it. The authors note that "Measuring the interestingness of discovered patterns is an active and important area of data mining research" and acknowledge that "..there is no widespread agreement on a formal definition of interestingness..." Based on "definitions presented to-date", they treat interestingness as a "broad concept that emphasizes conciseness, coverage, reliability, peculiarity, diversity, novelty, surprisingness, utility, and actionability. These nine specific criteria are used to determine whether or not a pattern is interesting."&lt;br /&gt;&lt;br /&gt;We will discuss these criteria and the probability-based objective interestingness measures mentioned in the paper, such as Jaccard, Lift, Interestingness Weighting Dependency, Laplace Correction, Gini Index, Piatetsky-Shapiro, Cosine, and Information Gain, to name a few.&lt;br /&gt;&lt;br /&gt;The paper's abstract is as follows.&lt;br /&gt;&lt;br /&gt;Interestingness measures play an important role in data mining, regardless of the kind of patterns being mined. These measures are intended for selecting and ranking patterns according to their potential interest to the user. Good measures also allow the time and space costs of the mining process to be reduced. 
This survey reviews the interestingness measures for rules and summaries, classifies them from several perspectives, compares their properties, identifies their roles in the data mining process, gives strategies for selecting appropriate measures for applications, and identifies opportunities for future research in this area.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1245508957622778430-1573106933055755245?l=dmreadinggroup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dmreadinggroup.blogspot.com/feeds/1573106933055755245/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/09/meeting-13-interestingness-measures-for.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/1573106933055755245'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/1573106933055755245'/><link rel='alternate' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/09/meeting-13-interestingness-measures-for.html' title='Meeting 13 - Interestingness Measures for Data Mining: A Survey'/><author><name>Adnan Masood</name><uri>http://www.blogger.com/profile/06053395538661164636</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1245508957622778430.post-1521415192355905573</id><published>2009-07-23T10:31:00.000-07:00</published><updated>2009-09-28T14:14:22.261-07:00</updated><title type='text'>Meeting 12 - Improving Search In Social Networks by Agent Based Mining</title><content type='html'>This week's paper is Anıl Gursel and 
Sandip Sen's "&lt;a href="http://ijcai.org/papers09/Papers/IJCAI09-335.pdf"&gt;Improving Search In Social Networks by Agent Based Mining&lt;/a&gt;" from IJCAI 2009. The abstract of this paper is as follows.&lt;br /&gt;&lt;br /&gt;"The popularity of social networks have burgeoned in recent years. Users share and access large volumes of information on social networking sites like Facebook, Flickr, del.icio.us, etc. Whereas a few of these sites have generic, impersonal searching mechanisms, we have developed an agent-based framework that mines the social network of a user to improve search results. Our Social Network based Item Search (SNIS) system uses agents that utilize the connections of a user in the social network to facilitate the search for items of interest. Our approach generates targeted search results that can improve the precision of the result returned from a user’s query. We have implemented the SNIS agent-based framework in Flickr, a photosharing social network, for searching for photos by using tag lists as search queries. We discuss the architecture of SNIS, motivate the searching scheme used, and demonstrate the effectiveness of the SNIS approach by presenting results. 
We also show how SNIS can be utilized for expertise location."&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1245508957622778430-1521415192355905573?l=dmreadinggroup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dmreadinggroup.blogspot.com/feeds/1521415192355905573/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/07/improving-search-in-social-networks-by.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/1521415192355905573'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/1521415192355905573'/><link rel='alternate' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/07/improving-search-in-social-networks-by.html' title='Meeting 12 - Improving Search In Social Networks by Agent Based Mining'/><author><name>Adnan Masood</name><uri>http://www.blogger.com/profile/06053395538661164636</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1245508957622778430.post-2716046872205315306</id><published>2009-05-11T16:35:00.000-07:00</published><updated>2009-09-28T14:14:02.360-07:00</updated><title type='text'>Meeting 11 - Topic Models - Latent Dirichlet Allocation</title><content type='html'>One of the hot topics in the data mining world is how to discover the semantic meaning of a text document.  
Capturing semantic meaning means moving beyond treating a document as a bag of words and instead discovering its underlying topics.&lt;br /&gt;&lt;br /&gt;One of the landmark papers in this area is &lt;a href="http://www.cs.princeton.edu/%7Eblei/papers/BleiNgJordan2003.pdf"&gt;Latent Dirichlet Allocation&lt;/a&gt; by Blei, Ng, and Jordan, which introduces a generative topic model for text corpora, with applications such as document classification.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1245508957622778430-2716046872205315306?l=dmreadinggroup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dmreadinggroup.blogspot.com/feeds/2716046872205315306/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/05/topic-models-latent-dirichlet.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/2716046872205315306'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/2716046872205315306'/><link rel='alternate' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/05/topic-models-latent-dirichlet.html' title='Meeting 11 - Topic Models - Latent Dirichlet Allocation'/><author><name>Jeff</name><uri>http://www.blogger.com/profile/07905019217749368972</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1245508957622778430.post-2632592681309873325</id><published>2009-04-06T17:10:00.000-07:00</published><updated>2009-04-06T17:17:11.874-07:00</updated><title type='text'>Meeting 10 - Fast Mining of Distance-Based Outliers in High-Dimensional 
Datasets</title><content type='html'>This week, Adnan will be presenting "&lt;a href="http://www.siam.org/meetings/sdm06/proceedings/071ghotinga.pdf"&gt;Fast Mining of Distance-Based Outliers in High-Dimensional Datasets&lt;/a&gt;" by Amol Ghoting, Srinivasan Parthasarathy and Matthew Eric Otey of IBM, &lt;span class="Affiliation"&gt;Ohio State University and Google Inc respectively. &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Abstract: Defining outliers by their distance to neighboring data points has been shown to be an effective non-parametric approach to outlier detection. Existing algorithms for mining distance-based outliers do not scale to large, high-dimensional data sets. In this paper, we present RBRP, a fast algorithm for mining distance-based outliers, particularly targeted at high-dimensional data sets. RBRP scales log-linearly as a function of the number of data points and linearly as a function of the number of dimensions. Our empirical evaluation demonstrates that we outperform the state-of-the-art, often by an order of magnitude.&lt;br /&gt;&lt;br /&gt;Keywords: Outlier detection, high-dimensional data sets, approximate k-nearest neighbors, clustering.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1245508957622778430-2632592681309873325?l=dmreadinggroup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dmreadinggroup.blogspot.com/feeds/2632592681309873325/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/04/meeting-10-fast-mining-of-distance.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/2632592681309873325'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/2632592681309873325'/><link 
rel='alternate' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/04/meeting-10-fast-mining-of-distance.html' title='Meeting 10 - Fast Mining of Distance-Based Outliers in High-Dimensional Datasets'/><author><name>Adnan Masood</name><uri>http://www.blogger.com/profile/06053395538661164636</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1245508957622778430.post-6064857803196437776</id><published>2009-03-13T12:54:00.000-07:00</published><updated>2009-03-13T12:56:12.779-07:00</updated><title type='text'>Meeting 9 - Support Vector Machines for Pattern Recognition</title><content type='html'>Next week, Raja Peer is presenting the following paper on Support Vector Machines for the reading group.&lt;br /&gt;&lt;br /&gt;A tutorial on support vector machines for pattern recognition&lt;br /&gt;CJC Burges - Data mining and knowledge discovery, 1998 - Springer&lt;br /&gt;http://www.cmlab.csie.ntu.edu.tw/~cyy/learning/papers/SVM_Tutorial.pdf&lt;br /&gt;&lt;br /&gt;A very interesting topic, looking forward to it.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1245508957622778430-6064857803196437776?l=dmreadinggroup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dmreadinggroup.blogspot.com/feeds/6064857803196437776/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/03/meeting-9-support-vector-machines-for.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/6064857803196437776'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/1245508957622778430/posts/default/6064857803196437776'/><link rel='alternate' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/03/meeting-9-support-vector-machines-for.html' title='Meeting 9 - Support Vector Machines for Pattern Recognition'/><author><name>Adnan Masood</name><uri>http://www.blogger.com/profile/06053395538661164636</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1245508957622778430.post-7660708235069099303</id><published>2009-03-13T11:05:00.000-07:00</published><updated>2009-03-13T14:03:23.689-07:00</updated><title type='text'>Meeting 8 - AdaBoost and Logistic Regression</title><content type='html'>This week Jeff Bergman discussed Adaptive Boosting (AdaBoost) and its role as a technique for combining a number of “weak” classifiers into a “strong” classifier.&lt;br /&gt;&lt;br /&gt;The paper discussed was&lt;br /&gt;&lt;br /&gt;A Short Introduction to Boosting&lt;br /&gt;By Yoav Freund and Robert E. 
Schapire&lt;br /&gt;AT&amp;amp;T Labs Research&lt;br /&gt;www.site.uottawa.ca/~stan/csi5387/boost-tut-ppr.pdf&lt;br /&gt;&lt;br /&gt;and the following presentation was used as a reference.&lt;br /&gt;&lt;br /&gt;http://cmp.felk.cvut.cz/~sochmj1/adaboost_talk.pdf&lt;br /&gt;&lt;br /&gt;Also discussed was the successful application of AdaBoost to the problem of face detection in the classic paper &lt;a href="http://research.microsoft.com/en-us/um/people/viola/pubs/detect/violajones_cvpr2001.pdf"&gt;Rapid Object Detection using a Boosted Cascade of Simple Features&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1245508957622778430-7660708235069099303?l=dmreadinggroup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dmreadinggroup.blogspot.com/feeds/7660708235069099303/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/03/meeting-8-adaboost-and-logistic.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/7660708235069099303'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/7660708235069099303'/><link rel='alternate' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/03/meeting-8-adaboost-and-logistic.html' title='Meeting 8 - AdaBoost and Logistic Regression'/><author><name>Adnan Masood</name><uri>http://www.blogger.com/profile/06053395538661164636</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1245508957622778430.post-9074273518864702429</id><published>2009-02-26T13:12:00.000-08:00</published><updated>2009-02-26T13:17:14.759-08:00</updated><title type='text'>Meeting 7 - Algorithms for Mining Distance-Based Outliers in Large Datasets</title><content type='html'>This week our discussion is about the VLDB'98 conference paper "Algorithms for Mining Distance-Based Outliers in Large Datasets" by Edwin M. Knorr and Raymond T. Ng of the University of British Columbia.&lt;br /&gt;&lt;br /&gt;The paper's abstract is as follows.&lt;br /&gt;&lt;br /&gt;This paper deals with finding outliers (exceptions) in large, multidimensional datasets. The identification of outliers can lead to the discovery of truly unexpected knowledge in areas such as electronic commerce, credit card fraud, and even the analysis of performance statistics of professional athletes. Existing methods that we have seen for finding outliers in large datasets can only deal efficiently with two dimensions/attributes of a dataset. Here, we study the notion of DB- (Distance-Based) outliers. While we provide formal and empirical evidence showing the usefulness of DB-outliers, we focus on the development of algorithms for computing such outliers. 
(Proceedings of the 24th VLDB Conference)&lt;br /&gt;&lt;br /&gt;The paper can be downloaded from the VLDB conference website &lt;a href="http://www.vldb.org/conf/1998/p392.pdf"&gt;here&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1245508957622778430-9074273518864702429?l=dmreadinggroup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dmreadinggroup.blogspot.com/feeds/9074273518864702429/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/02/meeting-7-algorithms-for-mining.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/9074273518864702429'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/9074273518864702429'/><link rel='alternate' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/02/meeting-7-algorithms-for-mining.html' title='Meeting 7 - Algorithms for Mining Distance-Based Outliers in Large Datasets'/><author><name>Adnan Masood</name><uri>http://www.blogger.com/profile/06053395538661164636</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1245508957622778430.post-2788000652023255563</id><published>2009-02-19T18:18:00.000-08:00</published><updated>2009-02-19T18:23:41.055-08:00</updated><title type='text'>Meeting 6 - Privacy Preserving Mining of Association Rules</title><content type='html'>Paper:&lt;br /&gt;&lt;br /&gt;Privacy Preserving Mining of Association Rules&lt;br /&gt;By&lt;br /&gt;Alexandre Evfimievski&lt;br /&gt;Ramakrishnan Srikant&lt;br /&gt;Rakesh Agrawal&lt;br 
/&gt;Johannes Gehrke&lt;br /&gt;&lt;br /&gt;In this meeting we went through the paper, focusing mainly on two randomization techniques (select-a-size and cut-and-paste), and discussed the challenges of combining privacy preservation with data mining algorithms.&lt;br /&gt;&lt;br /&gt;A lot of information to digest in a single paper.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1245508957622778430-2788000652023255563?l=dmreadinggroup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dmreadinggroup.blogspot.com/feeds/2788000652023255563/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/02/meeting-6-privacy-preserving-mining-of.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/2788000652023255563'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/2788000652023255563'/><link rel='alternate' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/02/meeting-6-privacy-preserving-mining-of.html' title='Meeting 6 - Privacy Preserving Mining of Association Rules'/><author><name>Raja</name><uri>http://www.blogger.com/profile/02900144181691042339</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1245508957622778430.post-4039145889660853556</id><published>2009-01-30T15:09:00.000-08:00</published><updated>2009-02-02T15:59:45.819-08:00</updated><title type='text'>Meeting 5 - Probabilistic Inference - 2/5/2009</title><content type='html'>In the fifth meeting we plan to review probabilistic inference and 
modeling, focusing on Bayesian methods, including Bayesian Inference, Bayesian Networks, and Markov Random Fields, time permitting.&lt;br /&gt;&lt;br /&gt;These concepts are fundamental to understanding various data mining techniques. Jeff Bergman will present and review the paper &lt;a href="http://www.stat.ucla.edu/%7Eyuille/meetings/IPAM07/Probabilistic_Inference_GY.pdf"&gt;Technical Introduction: A Primer on Probabilistic Inference&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Another paper that gives a good overview is David Heckerman's &lt;a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.15.4522"&gt;A Tutorial on Learning With Bayesian Networks&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1245508957622778430-4039145889660853556?l=dmreadinggroup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dmreadinggroup.blogspot.com/feeds/4039145889660853556/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/01/meeting-5-probabilistic-inference.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/4039145889660853556'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/4039145889660853556'/><link rel='alternate' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/01/meeting-5-probabilistic-inference.html' title='Meeting 5 - Probabilistic Inference - 2/5/2009'/><author><name>Jeff</name><uri>http://www.blogger.com/profile/07905019217749368972</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1245508957622778430.post-3289573917914411636</id><published>2009-01-29T22:08:00.000-08:00</published><updated>2009-01-29T22:17:01.239-08:00</updated><title type='text'>Meeting 4 - Top Ten Data Mining Algorithms - 1/29/2009</title><content type='html'>In the fourth meeting of the Data Mining Reading Group, we reviewed the paper on the top ten data mining algorithms, based on the IEEE ICDM survey. Adnan Masood and Jeff Bergman co-presented the ten algorithms.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.cs.uvm.edu/%7Eicdm/algorithms/10Algorithms-08.pdf"&gt;Top 10 algorithms in data mining&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Abstract: This paper presents the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) in December 2006: C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, and CART. These top 10 algorithms are among the most influential data mining algorithms in the research community.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1245508957622778430-3289573917914411636?l=dmreadinggroup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dmreadinggroup.blogspot.com/feeds/3289573917914411636/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/01/meeting-4-top-ten-data-mining.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/3289573917914411636'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/3289573917914411636'/><link rel='alternate' type='text/html' 
href='http://dmreadinggroup.blogspot.com/2009/01/meeting-4-top-ten-data-mining.html' title='Meeting 4 - Top Ten Data Mining Algorithms - 1/29/2009'/><author><name>Adnan Masood</name><uri>http://www.blogger.com/profile/06053395538661164636</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1245508957622778430.post-5921507272521897712</id><published>2009-01-29T21:33:00.000-08:00</published><updated>2009-01-29T22:04:36.691-08:00</updated><title type='text'>Meeting 3 - Map Reduce - 1/22/2009</title><content type='html'>In the third meeting of the Data Mining Reading Group, we reviewed the Map-Reduce-Merge paper from ACM SIGMOD 2007. It was presented by Raja Peer.&lt;br /&gt;&lt;br /&gt;Map-reduce-merge: simplified relational data processing on large clusters&lt;br /&gt;Proceedings of the 2007 ACM SIGMOD international conference on Management of data&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1245508957622778430-5921507272521897712?l=dmreadinggroup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dmreadinggroup.blogspot.com/feeds/5921507272521897712/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/01/meeting-3-map-reduce-1222009.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/5921507272521897712'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/5921507272521897712'/><link rel='alternate' type='text/html' 
href='http://dmreadinggroup.blogspot.com/2009/01/meeting-3-map-reduce-1222009.html' title='Meeting 3 - Map Reduce - 1/22/2009'/><author><name>Adnan Masood</name><uri>http://www.blogger.com/profile/06053395538661164636</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1245508957622778430.post-7755327130194516907</id><published>2009-01-29T21:28:00.000-08:00</published><updated>2009-01-29T22:14:46.467-08:00</updated><title type='text'>Meeting 2 - Netflix Challenge &amp; Collaborative Filtering - 1/15/2009</title><content type='html'>In the 2nd data mining reading group meeting, we went over the Netflix Paper from KDD 2008. This week's presenter was Jeff Bergman.&lt;br /&gt;&lt;br /&gt;Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model&lt;br /&gt;by Yehuda Koren&lt;br /&gt;AT&amp;amp;T Labs – Research&lt;br /&gt;KDD’08, August 24–27, 2008, Las Vegas&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1245508957622778430-7755327130194516907?l=dmreadinggroup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dmreadinggroup.blogspot.com/feeds/7755327130194516907/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/01/meeting-2-page-rank-1152008.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/7755327130194516907'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/7755327130194516907'/><link rel='alternate' type='text/html' 
href='http://dmreadinggroup.blogspot.com/2009/01/meeting-2-page-rank-1152008.html' title='Meeting 2 - Netflix Challenge &amp; Collaborative Filtering - 1/15/2009'/><author><name>Adnan Masood</name><uri>http://www.blogger.com/profile/06053395538661164636</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1245508957622778430.post-4942033667081461833</id><published>2009-01-29T21:11:00.000-08:00</published><updated>2009-01-29T22:14:07.605-08:00</updated><title type='text'>Meeting 1 - Page Rank - 1/8/2009</title><content type='html'>In the first Data Mining Reading Group meeting, we went over the PageRank papers. This week's presenter was Adnan Masood.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://linkinghub.elsevier.com/retrieve/pii/S016975529800110X"&gt;The anatomy of a large-scale hypertextual Web search engine&lt;/a&gt;&lt;br /&gt;S Brin, L Page - Computer Networks and ISDN Systems, 1998 - Elsevier&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.cpe.ku.ac.th/%7Eanan/courses/phd-seminar/S11/S11-sum.doc"&gt;The PageRank citation ranking: Bringing order to the web&lt;/a&gt;&lt;br /&gt;L Page, S Brin, R Motwani, T Winograd - 1998 - cpe.ku.ac.th&lt;div 
class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1245508957622778430-4942033667081461833?l=dmreadinggroup.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dmreadinggroup.blogspot.com/feeds/4942033667081461833/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/01/data-mining-reading-group-meeting-1-thu.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/4942033667081461833'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1245508957622778430/posts/default/4942033667081461833'/><link rel='alternate' type='text/html' href='http://dmreadinggroup.blogspot.com/2009/01/data-mining-reading-group-meeting-1-thu.html' title='Meeting 1 - Page Rank - 1/8/2009'/><author><name>Adnan Masood</name><uri>http://www.blogger.com/profile/06053395538661164636</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
