Data Mining

Activity type

Interest groups

Date

08/01/08

Venue

Air Force House, Herzliya (map attached)

Speakers

Dr. Ran Wolf, Dr. Ilan Shimshoni, Shlomit Morad

Session description

The Mean Shift Data Analysis/Data Mining Technique: Theory & Applications

- Ilan Shimshoni

Mean shift is a well-known clustering method. This simple method is used to solve many data analysis problems, which fall into two types: robust parameter estimation, and segmentation/classification/clustering.
In my talk I will describe the algorithm and show several applications. When mean shift is applied to high-dimensional data (which happens frequently in practice), computational problems arise. I will show how a recently proposed approximation technique, locality-sensitive hashing (LSH), can be used to reduce the computational complexity of mean shift.
I use this algorithm and other data mining techniques in a research project conducted at Applied Materials.
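
For readers unfamiliar with the method, the following is a minimal sketch of mean shift clustering, not the speaker's implementation. It assumes a Gaussian kernel and a fixed bandwidth, and every function and parameter name (mean_shift_mode, mean_shift_cluster, merge_radius) is hypothetical. Each point is moved repeatedly to the kernel-weighted mean of the data until it settles on a density mode, and points that reach the same mode form one cluster.

```python
import math


def mean_shift_mode(start, points, bandwidth, tol=1e-6, max_iter=500):
    """Climb from `start` to a nearby density mode (Gaussian kernel assumed)."""
    x = list(start)
    for _ in range(max_iter):
        # Kernel weight of every data point relative to the current position.
        weights = [
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2 * bandwidth ** 2))
            for p in points
        ]
        total = sum(weights)
        # The mean shift step: move to the weighted mean of the data.
        new_x = [
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(len(x))
        ]
        shift = math.dist(x, new_x)
        x = new_x
        if shift < tol:
            break
    return x


def mean_shift_cluster(points, bandwidth, merge_radius=None):
    """Label each point by the mode its trajectory converges to."""
    if merge_radius is None:
        merge_radius = bandwidth / 2  # illustrative choice, not a standard value
    modes, labels = [], []
    for p in points:
        m = mean_shift_mode(p, points, bandwidth)
        for i, existing in enumerate(modes):
            if math.dist(m, existing) < merge_radius:
                labels.append(i)
                break
        else:
            # No existing mode is close enough, so this is a new cluster.
            modes.append(m)
            labels.append(len(modes) - 1)
    return modes, labels


# Toy data: two well-separated groups in the plane.
points = [(1.0, 1.1), (0.9, 1.0), (1.1, 0.9), (5.0, 5.1), (5.1, 4.9), (4.9, 5.0)]
modes, labels = mean_shift_cluster(points, bandwidth=0.5)
print(labels)
```

Every step in this sketch scans all the data points, which is the cost that grows painful for large, high-dimensional data sets. An LSH-based variant such as the one mentioned in the abstract would replace the full scan with an approximate lookup of nearby points; that part is not shown here.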


Data Mining for Misconfiguration Detection in Grid Systems

- Ran Wolf

Grid systems are highly complex distributed systems. Today's grid systems span from dozens to tens of thousands of computers. These machines are usually extremely heterogeneous and are typically partitioned among several administrative domains, which makes administering a grid system a daunting task. Consequently, a system usually contains many misconfigured machines.
In this work we describe a grid subsystem called GMS, which locates machines it suspects are misconfigured by mining the system logs that every component of the system generates anyway. GMS is entirely non-intrusive, executes like any other grid job, and improves the accuracy of its suspicions whenever machines become available. The system was tested on a production system of more than 50 CPUs. Of these, four were suspected of being misconfigured. Administrators then confirmed that three of the suspected machines were indeed misconfigured, while the problem with the fourth could not be reproduced.


Data Mining in Orbotech
- Shlomit Morad

Orbotech develops industrial machines for manufacturers of PCBs (printed circuit boards) and LCD monitors. These machines are extremely complicated and their maintenance is very costly. Their software applications are complicated as well, and operating them is far from trivial.
Five years ago, my development group was asked to help the Customer Support (CS) group improve its support mechanisms. We believed that the CS group should change its support concept to be proactive rather than reactive.
To be proactive, the CS group must receive an accurate and up-to-date analysis of the status of each machine. Since changing the machines' software was not an option, we examined the log files that the applications write for debugging purposes. We found many of them to be very informative and simple to analyse, and we felt capable of providing the CS people with tools that would make their work proactive.
Today, after five years of development and four years in production, we have a fully automated system that serves other departments as well: R&D, Integration and Marketing. The system receives the log files of 400 machines every week and sends out around 1000 analyses.

Agenda:

14:00-14:15

Opening - Moshe Salem, ILTAM

14:15-14:30

Introduction to Data Mining

Tsvi Kuflik, ILTAM/University of Haifa

14:30-15:00 

The Mean Shift Data Analysis/Data Mining
Technique: Theory & Applications

- Ilan Shimshoni, University of Haifa

15:00-15:30

Data Mining for Misconfiguration
Detection in Grid Systems

- Ran Wolf, University of Haifa

15:30-16:00

Applying Data Mining at Orbotech

- Shlomit Morad, Orbotech

16:00-17:00

Open discussion and brainstorming on activity directions of interest to the group

Notes

Kick-off meeting

Regards,

Moshe Salem - CEO, ILTAM
Dr. Tsvika Kuflik - Group Coordinator
