The first part of the course introduces students to social network analytics (SNA) and its instrumental value for businesses and society. SNA encompasses techniques and methods for analyzing the constant flow of information over offline networks (e.g., networks of workers in labor markets or networks of organizations in product markets) and online networks (e.g., Facebook posts, Twitter feeds, or Foursquare check-ins), with the aim of identifying patterns of information propagation that are of interest to the analyst. The course will help students understand the opportunities, challenges, and threats that the use of social networks creates for businesses and for society at large. Innovation diffusion and the spread of information through networks will also be covered. Finally, students will be introduced to the concepts of the wisdom of crowds and social learning, and will investigate the conditions under which opinion convergence (asymptotic learning) or herding may occur in social networks.
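
As an informal illustration of opinion convergence, the sketch below simulates repeated weighted averaging of opinions among three agents. The update rule (a DeGroot-style averaging model), the trust matrix `TRUST`, and the starting opinions `INITIAL` are assumptions chosen for illustration; the course description does not specify a particular model or dataset.

```python
def degroot_step(weights, opinions):
    """One round: each agent replaces its opinion with a weighted
    average of the opinions of the agents it listens to."""
    return [sum(w * x for w, x in zip(row, opinions)) for row in weights]


def run_degroot(weights, opinions, tol=1e-9, max_rounds=10_000):
    """Iterate until no opinion changes by more than `tol` in a round."""
    for rounds in range(1, max_rounds + 1):
        updated = degroot_step(weights, opinions)
        if max(abs(a - b) for a, b in zip(updated, opinions)) < tol:
            return updated, rounds
        opinions = updated
    return opinions, max_rounds


# Hypothetical trust matrix: row i holds the weights agent i places on
# agents 0, 1 and 2. Each row sums to 1.
TRUST = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

# Hypothetical starting opinions on a 0-to-1 scale.
INITIAL = [1.0, 0.0, 0.0]

final, rounds = run_degroot(TRUST, INITIAL)
print(f"opinions after {rounds} rounds: {[round(x, 4) for x in final]}")
```

In this toy network every agent is connected, directly or indirectly, to every other agent and also places weight on its own opinion, so the opinions settle on a common value. That value is a weighted mix of the starting opinions in which the better-connected middle agent counts for more. Whether such a consensus is close to the truth is a separate question, which is where the conditions for learning versus herding come in.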

The second part of the course is concerned with extracting useful information from unstructured big data, mostly in the form of text and speech. It introduces core concepts, models, and algorithms from machine learning, natural language processing, and speech processing that can be used to recognize speech and to normalize, classify, cluster, tag, parse, disambiguate, and extract information from texts and spoken utterances. Several application areas will be considered, including filtering e-mails and social media messages, summarizing opinions and performing sentiment analysis (e.g., for particular products) in social media or discussion forums, monitoring spoken dialogues in call centers (e.g., to check for compliance with protocols), populating databases with information extracted from news feeds (e.g., company mergers and acquisitions), and finding answers to scientific questions in the research literature. Students will have the opportunity to learn how to use existing tools (e.g., machine learning, speech recognition, and natural language processing toolkits) by applying them to realistic datasets. Time permitting, key concepts and applications of multimodal content analytics will also be covered.
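
To give a flavor of the message-filtering application, the sketch below trains a very small text classifier on a handful of made-up messages. The choice of a multinomial naive Bayes model with add-one smoothing, the whitespace tokenizer, and the toy data in `TRAIN` are all assumptions made for illustration; in practice an existing machine learning toolkit and a realistic dataset would be used instead.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train(examples):
    """Count documents per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability under naive Bayes."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][tok] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training messages, invented for this example.
TRAIN = [
    ("win a free prize now", "spam"),
    ("free offer claim your prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
]

model = train(TRAIN)
print(predict(model, "claim your free prize"))
print(predict(model, "project meeting on monday"))
```

The same counting-and-scoring pattern carries over to sentiment analysis by swapping the labels (for example, positive and negative) and the training texts.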