Challenges include designing and implementing Cayuga so that the system scales to very high event throughput while processing thousands of concurrently registered queries. In other research, Gehrke’s group is collaborating with companies such as Capital One to apply data-mining techniques to problems in finance and marketing, and he is working with scientists to apply data mining to scientific problems. For example, he is working with astronomers from the Arecibo radio telescope observatory on a new census of all pulsars in the Milky Way galaxy. His group has developed some of the fastest data-mining algorithms available today, and his current focus is on improving the quality and scalability of data-mining methods. His third research direction is data privacy: his group is developing practical methods for sharing and analyzing data without disclosing the private information it contains.