CS 780/880 Topics / Information Retrieval

Prof. Laura Dietz, Fall 2016

This course covers basic and advanced algorithms and techniques for web search engines, as well as text-based information retrieval in general.

After this course, you will be able to develop your own web search engine or customize existing retrieval frameworks such as Apache Lucene. Each week we take a close look at a different component of a web search engine.

The course focuses on index building, query processing, and document ranking. We will also touch on text-based machine learning methods such as classification and clustering, as well as crawling and link-based algorithms such as Google’s PageRank.

The course covers several algorithms and data structures with applications to web search, building on CS 515 “Data Structures”. Theoretical analyses of run-time performance, hands-on programming assignments, and a class project are all part of the course.

Information retrieval methods are an essential component of any text-based data analytics system, with applications ranging from text mining, machine learning, and natural language processing to knowledge management.

Prereqs: Data Structures (CS 515) or permission of instructor. Ability to independently write basic programs in Java, Python, or Scala.

For more information, contact Laura.Dietz at unh . edu

Official course listings: