[6-23]TextScope: Enhance Human Perception via Text Mining

Published: 2017-06-22


  Title: TextScope: Enhance Human Perception via Text Mining



  Speaker: Prof. ChengXiang Zhai, University of Illinois at Urbana-Champaign


  Recent years have seen a dramatic growth of natural language text data. Similar to data generated by a computer system or sensors, text can be regarded as data generated by humans (as subjective, intelligent "sensors" of the world) to describe the observed world. As such, text data contain all kinds of knowledge about the world as well as human opinions and preferences, thus providing a great opportunity for mining large amounts of text data ("big text data") to discover useful knowledge that can support various tasks, especially those involving complex decision-making.

  In this talk, I will present the vision of developing an intelligent text mining tool, called TextScope, to support interactive text mining. Just as a microscope allows us to see things in the micro world, and a telescope allows us to see things far away, the TextScope would allow us to find useful hidden knowledge buried in large amounts of text data that would otherwise be unknown to us. As examples of techniques that can be used to build a TextScope, I will present a few general statistical text mining algorithms that we have developed for joint analysis of text and non-text data to discover interesting patterns and knowledge. I will also briefly discuss the challenges in developing a TextScope and some important directions for future research.


  Bio: ChengXiang Zhai is a Professor of Computer Science and a Willett Faculty Scholar at the University of Illinois at Urbana-Champaign, where he is also affiliated with the School of Information Sciences, the Carl R. Woese Institute for Genomic Biology, and the Department of Statistics. He received a Ph.D. in Computer Science from Nanjing University in 1990, and a Ph.D. in Language and Information Technologies from Carnegie Mellon University in 2002. He worked at Clairvoyance Corp. as a Research Scientist and a Senior Research Scientist from 1997 to 2000. His research interests are in the general area of intelligent information systems, including specifically intelligent information retrieval, data mining, natural language processing, machine learning, and their applications. He has published over 200 papers in these areas and a textbook on text data management and analysis. He is the America Editor of Springer Information Retrieval Book Series and an Associate Editor of BMC Medical Informatics and Decision Making, and previously served as an Associate Editor of ACM Transactions on Information Systems, Associate Editor of Elsevier Information Processing and Management, and Program Co-Chair of NAACL HLT 2007, ACM SIGIR 2009, and WWW 2015. He is an ACM Distinguished Scientist, and has received a number of awards, including the ACM SIGIR Test of Time Paper Award (three times), the 2004 Presidential Early Career Award for Scientists and Engineers (PECASE), an Alfred P. Sloan Research Fellowship, IBM Faculty Award, HP Innovation Research Award, Microsoft Beyond Search Research Award, Google Research Grant Award, Yahoo Faculty Research Engagement Program Award, UIUC Rose Award for Teaching Excellence, and UIUC Campus Award for Excellence in Graduate Student Mentoring. He has graduated 28 PhD students and over 40 MS students.


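  Speaker profile: Prof. ChengXiang Zhai is currently a professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign (UIUC), where he also holds affiliate appointments in the university's bioinformatics institute, its library and information science department, and its Department of Statistics. Prof. Zhai works mainly on research and development in information retrieval and large-scale information management. He has received a number of awards, including the 2004 US National Science Foundation early-career award (NSF CAREER), the 2004 Presidential Early Career Award for Scientists and Engineers (PECASE), which is the highest US honor for early-career scientists and engineers, a 2008 Sloan Research Fellowship, the 2009 ACM Distinguished Scientist award, the ACM SIGIR 2004 Best Paper Award, and the ACM CIKM 2011 Best Student Paper Award. His main research results include new information retrieval models, personalized information retrieval techniques, automatic text information extraction techniques, and algorithms for biological information analysis; some of these results are being turned into products. Prof. Zhai has also collaborated with the Institute of Computing Technology (计算所) for many years. In 2008 he served as a lecturer in the Dragon Star program course organized under the institute's lead, teaching the course “Introduction to Text Information Systems”, which was very well received. He is currently co-organizing the WSDM 2015 international conference with Research Professor Xueqi Cheng (程学旗).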