Department of Informatics Business Intelligence Research Group

Workshop Research Talks

Visualizing the Invisible: Detecting and Visualizing Emotions in Event-Related Tweets
Presenter: Dr. Pearl Pu (11:15-12:00 AM Monday Feb. 15th 2016)
Abstract:

Spectators increasingly use social platforms to comment on big public events such as sports games and political debates. The volume of such data is too large for a human to process: during the 2012 Olympic Games, 150 million tweets were generated on Twitter alone. To understand the public's perception of these events, it is important to recognize the subjective content revealed in such "big data". This has motivated us to develop a system that automatically detects and visualizes the patterns and trends of user sentiments expressed in comments, and how these sentiments evolve over time. Previous work in opinion mining has addressed some of these issues, but most of it identifies only two categories of emotion, positive and negative, leaving a more detailed and insightful analysis to be desired. In this talk, I describe EmotionWatch, a data mining and visualization tool that helps people make sense of spectators' emotional reactions to public events using a fine-grained, multi-category emotion model.

Integration of Diverse Big Data Sources for Risks Supervision of Internet Finance
Presenter: Dr. Zhi Su (13:00-14:15 PM Monday Feb. 15th 2016)
Abstract:

With the sustained and rapid growth of e-commerce, measuring its development serves not only to quantify that development but also to reflect its economic and social impact. The E-commerce Development Index (EDI) covers four aspects (scale, potential, degree of penetration, and environmental support) and thus measures the level of e-commerce development in a comprehensive manner. Accordingly, the EDI comprises four sub-indices: a scale index, a potential index, a penetration index, and a support index. They include the scale, proportion, and growth rate of e-commerce transactions, online retail sales, the number of online shoppers, and the number of enterprises with e-commerce activities; the effects on the traditional economy; and supporting factors such as logistics, payment, practitioners, and technology. Together they reflect both e-commerce's own development and its external environment. The 2015 EDI report releases e-commerce scores and rankings for China's 31 provinces (Zhejiang, Guangdong, Beijing, Jiangsu, and Sichuan rank in the top five) and an e-commerce index for 22 major countries worldwide (China, the United States, and Japan rank in the top three). The 2015 EDI also ranks China's provinces and the 22 major countries at the sub-index level.

Big data analytics: Infrastructures, Models and Applications
Presenter: Dr. Philippe Cudré-Mauroux (15:30-16:15 PM Monday Feb. 15th 2016)
Abstract:

With the explosion of online data, scalable and efficient analytic platforms are becoming key in order to maximize the value of Big Data. Towards this goal, we are working on various innovative aspects of big data analytics, including new infrastructures, models, and applications for non-relational data. In this talk, we give a quick overview of some of our recent projects and then delve into one of our recent use cases, i.e., the large-scale analysis of human activity data from social media. We study the fusion of multi-dimensional social data to provide a deeper understanding of human activity patterns and enable personalized location-based services. In addition, we investigate the application of data sketching techniques to reduce the computational complexity of the personalisation process, and propose new privacy-preserving mechanisms for social data sharing.
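The abstract does not name the sketching technique used. As a hedged illustration of how a data sketch can cut the cost of comparing users, the following Python example uses MinHash, one common sketching method, to estimate the overlap between two sets of visited places from short signatures instead of the full sets. The function names, the place labels, and the choice of 64 hash functions are assumptions made for this example.

```python
import hashlib
from typing import Iterable, List


def minhash_signature(items: Iterable[str], num_hashes: int = 64) -> List[int]:
    """Build a MinHash signature: one minimum hash value per seeded hash function.

    Assumes `items` is non-empty.
    """
    items = list(items)
    signature = []
    for seed in range(num_hashes):
        # Simulate a family of hash functions by prefixing each item with a seed.
        signature.append(min(
            int.from_bytes(hashlib.sha1(f"{seed}:{item}".encode()).digest()[:8], "big")
            for item in items
        ))
    return signature


def estimate_jaccard(sig_a: List[int], sig_b: List[int]) -> float:
    """Estimate Jaccard similarity as the fraction of matching signature slots."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


# Hypothetical sets of places visited by three users.
user_a = {"cafe", "gym", "library", "park"}
user_b = {"cafe", "gym", "library", "cinema"}
user_c = {"airport", "stadium"}

sig_a = minhash_signature(user_a)
sig_b = minhash_signature(user_b)
sig_c = minhash_signature(user_c)

print(estimate_jaccard(sig_a, sig_b))  # estimate of the true overlap 3/5
print(estimate_jaccard(sig_a, sig_c))  # disjoint sets share no minimum
```

Each comparison touches only 64 integers regardless of how many places a user has visited, which is the kind of saving sketching offers; the estimate becomes more accurate as the number of hash functions grows.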

Cost-Effective Management of Big Data in the Cloud
Presenter: Dr. Heiko Schuldt (08:30-09:15 AM Tuesday Feb. 16th 2016)
Abstract:

In recent years, Cloud computing has attracted a large variety of applications that are deployed entirely on the resources of Cloud providers. As the management of big data is an essential part of these applications, Cloud providers have to deal with many different data management requirements, depending on the characteristics and guarantees these applications are supposed to have. With the pay-as-you-go cost model of the Cloud, literally every user action and resource usage has a price tag attached to it. For application providers in the Cloud, it is essential that the needs of their applications are met in a cost-optimized manner; at the same time, from the point of view of the users of a data Cloud, a high degree of data consistency and performance has to be achieved at optimized cost. In this talk, we present the PolarDBMS approach, which provides a flexible and dynamically adaptable system for managing data in the Cloud. This includes cost-based and adaptive concurrency control as well as cost-based data archiving.

Filtering Big Data in Social Media - Building an Early Warning System for Adverse Drug Reaction
Presenter: Dr. Ming Yang (11:15-12:00 AM Tuesday Feb. 16th 2016)
Abstract:

Adverse Drug Reactions (ADRs) are believed to be a leading cause of death worldwide. Pharmacovigilance systems aim at the early detection of ADRs. With the popularity of social media, Web forums and discussion boards have become important places for consumers to share their drug use experience and, as a result, may provide useful information on drugs and their adverse reactions. In this study, we propose an automated mechanism for filtering ADR-related posts using text classification methods. In real-life settings, ADR-related messages are highly distributed across social media, while non-ADR-related messages are unspecific and topically diverse. Manually labeling a large number of ADR-related messages (positive examples) and non-ADR-related messages (negative examples) to train classification systems is expensive. To mitigate this challenge, we examine the use of a partially supervised learning classification method to automate the process.
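The abstract does not say which partially supervised method is used. As an illustration only, one well-known family of approaches first extracts "reliable negatives" from the unlabeled posts and then trains an ordinary classifier on the positives versus those negatives. The Python sketch below shows only that first step, using a simple centroid comparison; the function names, the toy posts, and the cosine-similarity rule are all assumptions made for this example.

```python
import math
from collections import Counter
from typing import Iterable, List


def bag(text: str) -> Counter:
    """Lower-cased bag-of-words representation of a post."""
    return Counter(text.lower().split())


def centroid(bags: Iterable[Counter]) -> Counter:
    """Sum of word counts over a collection of posts."""
    total = Counter()
    for b in bags:
        total.update(b)
    return total


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def reliable_negatives(positive: List[str], unlabeled: List[str]) -> List[str]:
    """Keep unlabeled posts that look more like the unlabeled pool than like the positives."""
    pos_c = centroid(bag(t) for t in positive)
    unl_c = centroid(bag(t) for t in unlabeled)
    return [t for t in unlabeled if cosine(bag(t), unl_c) > cosine(bag(t), pos_c)]


# Hypothetical posts: a few labeled ADR reports and an unlabeled pool.
positive = [
    "this drug gave me a severe headache",
    "rash and nausea after taking the drug",
]
unlabeled = [
    "drug caused nausea and headache",
    "where can I buy cheap concert tickets",
    "forum rules please read before posting",
]

for post in reliable_negatives(positive, unlabeled):
    print(post)
```

On this toy data the two off-topic posts are selected as reliable negatives, while the unlabeled post describing nausea and headache is held back. A second step would train a classifier on the labeled positives against these negatives, so that only the positive class needs manual labeling.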

Deep Learning on Geometric Data
Presenter: Davide Boscaini (13:30-14:15 PM Tuesday Feb. 16th 2016)
Abstract:

The past decade of computer vision research has witnessed the re-emergence of "deep learning" and, in particular, of convolutional neural network techniques, which make it possible to learn task-specific features from examples and have achieved a breakthrough in performance in a wide range of applications. However, in the geometry processing and computer graphics communities, these methods are practically unknown. One reason is that 3D shapes (typically modeled as Riemannian manifolds) are not shift-invariant spaces, hence the very notion of convolution is rather elusive. In this talk, I will show some recent work from our group that tries to bridge this gap. Specifically, I will show the construction of intrinsic convolutional neural networks on meshes and point clouds, with applications such as defining local descriptors, finding dense correspondence between deformable shapes, and shape retrieval.

Sparse Similarity-Preserving Hashing for Large-scale Multimedia Retrieval
Presenter: Davide Eynard (14:15-15:00 PM Tuesday Feb. 16th 2016)
Abstract:

Similarity-preserving hashing has become paramount for large-scale multimedia retrieval. It aims at finding a hash function that maps high-dimensional input data to a low-dimensional binary code. Thanks to its compact representation, it allows for fast retrieval and efficient storage while retaining performance better than that of the original descriptors. One limitation of existing hashing techniques, however, is the intrinsic trade-off between performance and computational complexity: searching within a small radius in the Hamming space results in short search times, but dramatically lowers the recall. We propose a way to overcome this limitation by forcing the hash codes to be sparse. Sparse high-dimensional codes enjoy the low false positive rates typical of long hashes, while keeping the false negative rates similar to those of a shorter dense hashing scheme with an equal number of degrees of freedom. I will show examples in image retrieval, where our method outperforms other state-of-the-art approaches, and an application of these techniques to video retrieval, which makes it possible to correctly identify a video from a sequence of just a few of its frames.
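To make the radius trade-off concrete, here is a minimal Python sketch of Hamming-radius retrieval over binary codes. It illustrates only the search step, not the sparse hashing method presented in the talk; the helper names `hamming` and `radius_search`, the 8-bit codes, and the item labels are assumptions made for this example.

```python
from typing import Dict, List


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two codes stored as integers."""
    return bin(a ^ b).count("1")


def radius_search(query: int, database: Dict[str, int], radius: int) -> List[str]:
    """Return the keys whose codes lie within `radius` bits of the query."""
    return sorted(k for k, code in database.items() if hamming(query, code) <= radius)


# Hypothetical 8-bit codes for three items.
database = {
    "img_a": 0b10110000,
    "img_b": 0b10110001,  # one bit away from img_a
    "img_c": 0b01001111,  # every bit differs from img_a
}
query = 0b10110000

print(radius_search(query, database, radius=0))  # ['img_a']
print(radius_search(query, database, radius=1))  # ['img_a', 'img_b']
```

With radius 0 the near-duplicate `img_b` is missed, which is the loss of recall the abstract describes; widening the radius recovers it at the price of examining more candidate codes.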