Machine Learning Coffee Seminar: Professor Hiroshi Mamitsuka, Kyoto University

Weekly seminars held jointly by Aalto University and the University of Helsinki.

27.03.2017 / 09:15 - 10:00
Exactum D123, Gustaf Hällströmin katu 2, 02150, Helsinki, FI

Machine learning researchers in the Helsinki region start the week with an exciting machine learning talk. The aim is to gather people from different fields of science who share an interest in machine learning. Porridge and coffee are served at 9:00 and the talk begins at 9:15. The venue for this talk is Exactum D123, Kumpula.

Subscribe to the mailing list where seminar topics are announced beforehand.

Learning to Rank: Applications to Bioinformatics

Hiroshi Mamitsuka
Professor, Kyoto University

Abstract:

Learning to Rank (LTR) was developed in information retrieval to rank documents by their relevance to a given query. Typically, LTR builds a ranking model from given relevant (or irrelevant) query-document pairs. In some respects, LTR can be thought of as an attempt to solve a multilabel classification problem in which the queries are the labels. Many settings in bioinformatics can be cast as multilabel classification problems with similar properties. One typical example is biomedical document annotation. Currently PubMed, a database of 26 million biomedical citations, has around 30,000 keywords, called MeSH (Medical Subject Headings) terms, which serve as the labels in multilabel classification. The number of articles per MeSH term varies enormously, ranging from only 20 to more than eight million. This large, biased dataset already goes beyond the settings that regular multilabel classifiers are designed for. In this talk, I will start with an introduction to and a brief review of LTR. I will then present three bioinformatics multilabel classification problems that share practical properties, derived from real data, which hamper the application of regular multilabel classifiers. Finally, I will show that LTR nicely addresses such large-scale, challenging bioinformatics multilabel classification problems.
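To make the link between ranking and multilabel classification concrete, here is a minimal sketch in Python. It is not the method presented in the talk: it is a simple pairwise perceptron, chosen for brevity, that learns to rank candidate labels for each document so that relevant labels score above irrelevant ones. The toy documents, the label names ("proteins", "genes", "clinical"), and all function names are hypothetical and are not taken from PubMed or MeSH.

```python
# Toy sketch (assumption: NOT the speaker's method). A pairwise perceptron
# that ranks candidate labels for a document, so that every relevant label
# scores higher than every irrelevant one.

def score(weights, words, label):
    """Sum of the per-word weights that `label` assigns to the document's words."""
    return sum(weights[label].get(word, 0.0) for word in words)

def train_pairwise_ranker(docs, labels, epochs=10):
    """docs is a list of (set_of_words, set_of_relevant_labels) pairs."""
    weights = {label: {} for label in labels}
    for _ in range(epochs):
        for words, relevant_set in docs:
            relevant = [l for l in labels if l in relevant_set]
            irrelevant = [l for l in labels if l not in relevant_set]
            for pos in relevant:
                for neg in irrelevant:
                    # Update only on a mis-ordered (or tied) label pair.
                    if score(weights, words, pos) <= score(weights, words, neg):
                        for word in words:
                            weights[pos][word] = weights[pos].get(word, 0.0) + 1.0
                            weights[neg][word] = weights[neg].get(word, 0.0) - 1.0
    return weights

def rank_labels(weights, words):
    """Return all labels, best-scoring first, for a bag of words."""
    return sorted(weights, key=lambda label: score(weights, words, label), reverse=True)

# Hypothetical toy data: each document is a bag of words plus its relevant labels.
labels = ["proteins", "genes", "clinical"]
docs = [
    ({"protein", "binding", "kinase"}, {"proteins"}),
    ({"gene", "expression", "rna"}, {"genes"}),
    ({"protein", "gene", "mutation"}, {"proteins", "genes"}),
    ({"patient", "trial", "drug"}, {"clinical"}),
]

weights = train_pairwise_ranker(docs, labels)
print(rank_labels(weights, {"kinase", "binding"}))
```

Predicting the top-ranked labels for a new document then amounts to multilabel classification by ranking. A realistic system at PubMed scale would need far richer features and a stronger ranking model than this sketch.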

A large portion of this talk was presented at ISMB in 2015 and 2016.

See the next talks at the seminar webpage.

Please spread the news and join us for our weekly habit of beginning the week with an interesting machine learning talk!

Welcome!