Our mission is to advance cutting-edge research and applications of knowledge technologies that support the analysis, modeling and management of knowledge and data.

We have authored and edited numerous scientific books and coordinated several EU projects. Our technologies have been successfully applied to many practical problems.

We are active in education and transfer of knowledge, and act as a bridge between science and industry in Slovenia and abroad.

Department of KNOWLEDGE TECHNOLOGIES Jožef Stefan Institute

Contents

– Basic information
– Scientific highlights
– Relevance highlights
– Vision

Basic information

30 years of research tradition

– Founded as Department of Artificial Intelligence in 1979
– Department of Knowledge Technologies since 2004
– 30 researchers, 20 students/external, 5 support staff

Research areas

– Data Mining
– Text, Web and Multimedia Mining
– Semantic Web
– Human Language Technologies
– Decision Support
– Knowledge Management

Application areas

– Ecology, Geology
– Medicine, Health care
– Biomedicine, Systems biology
– Agriculture, Forestry
– Telecommunications
– Digital libraries
– Cultural heritage
– eGov, eBusiness, eLearning

Collaborations

In Slovenia

– Center for Knowledge Transfer in IT at Jožef Stefan Institute (JSI)
– Jožef Stefan International Postgraduate School
– Spin-offs: Temida and Quintelligence
– Cycorp Europe established in 2007 at JSI

International

– Collaboration with over 100 partners of EU projects, academic and industrial
– Strong ties with over 20 other partners, including CMU, Stanford University, NASA Ames, Microsoft Research and Osaka University

Industry

• British Telecom
• France Telecom
• Siemens Business Solutions
• Empolis/Bertelsmann
• Atos Origin
• Software AG
• UN FAO
• Dassault Aviation
• BRGM (Bureau de recherches géologiques et minières)
• SINTEF (Norway)
• iSOCO (Spain)
• Ontoprise (Germany)
• SIRMA (Bulgaria)
• Helsinki Institute of Technology
• FZI Karlsruhe
• CRF FIAT

Education and knowledge transfer

Teaching

– MSc and PhD courses in major research areas of knowledge technologies
– Supervision of BSc, MSc, and PhD students

Institutions

– Jožef Stefan International Postgraduate School
– Universities of Nova Gorica, Maribor, Ljubljana, Primorska, Graz University

videolectures.net

– World-leading video lectures Web portal

Summer schools

– Semantic Web Summer School 2004 – 100 attendees
– AI Summer School ACAI-05 – 100 attendees

High school student competitions

– Yearly Computer Science competitions
– Books of tasks and solutions

Selected publications before 2004

Scientific Highlights 2004-2008

Scientific results 2004 - 2008

Publications in prestigious journals

– Journal of Machine Learning Research (3), Machine Learning (6), Decision Support Systems
– Ecological Modelling (10), Journal of Biomedical Informatics (3)

Awards

– Two elected ECCAI fellows (2007, 2008)
– Prešeren BSc thesis award (2007)
– Best software award (ESWC-2006)

PC chairs of major scientific events

– DS-06, ESSLLI-07, ILP-08
– ECML/PKDD-07, ECML/PKDD-09

High SCI citations of group members

Editors/authors of numerous books and proceedings

Scientific highlight: Subgroup discovery

New methods and systems

– Discovering interesting subgroups in tabular data:
  • SD, CN2-SD, APRIORI-SD
– Discovering interesting subgroups in multi-relational data:
  • RSD

Breakthrough technology

– Effective method for using ontologies in relational data mining
  • Using GO, KEGG, ENTREZ to form relational features
  • Successful discovery of new scientific knowledge in functional genomics

Journal papers

– MLJ 2004a and 2004b, JMLR 2004, …, IEEE TSMC 2006, MLJ 2007, JBI 2007, JMLR 2008, JBI 2008

Scientific highlight: Equation discovery

New methods

– Integrating process-based domain knowledge and models

Breakthrough technology

– Integrating knowledge-based and data-driven modeling of dynamic systems

Systems

– LAGRAMGE 2.0, IPM

Numerous successful applications

– Modeling aquatic ecosystems
  • Lake Bled, Ohrid, Kasumigaura, Greifensee, Glumsoe

Journal papers in MLJ, Ecological Modelling, …

State-of-the-art survey book

– Computational Discovery of Scientific Knowledge

Scientific highlight: Text mining and visualization

New methods

– Text processing, clustering, SVM, ontology construction, …
– Graph and text visualization

Breakthrough technology

– Open source text mining software

Systems

– Text-Garden – text-mining library (http://www.textmining.net)
– Document-Atlas – text visualization (http://docatlas.ijs.si/)
– OntoGen – semi-automated ontology construction (http://ontogen.ijs.si/)
  • Award winner at ESWC 2006
– Content-Land

Scientific highlight: Qualitative decision support

New methods

– Qualitative DS modeling
– Truly hierarchical, probabilistic

Systems

– DEXi 2.0, proDEX

Monograph on Decision Support

Journal papers in Decision Support Systems, Journal of Operational Research, Ecological Modelling, …

Successful applications

– GM farming models and DS systems, Highway control, …
– Software tools: SIGMEA Maize Coexistence Advisor (SMAC Advisor), ECOGEN Soil Quality Index (ESQI)

[Figure: hierarchical attribute tree of a qualitative model, with groups including CONTEXT, ECOLOGY and CROP PROTECTION and attributes such as water quality, greenhouse gasses and soil state]

Relevance Highlights 2004-2008

Relevance highlight: European projects

“… Knowledge Technologies is the most successful Slovenian program in terms of EU projects.”
F. Demšar, National Research Fund director, Oct. 25, 2007

FP6: 20 EU projects

– 4 IP projects, 1 NoE, 3 SSA, 1 CA
– 11 STREP projects
– Coordination of one STREP project (IQ)

In FP6 we acquired ~ 25% of Slovenian FP6-IST funds (5.1+ Mio EUR), i.e. ~ 7% of Slovenian FP6 funds.

FP6 European projects

• SEKT - Semantically-Enabled Knowledge Technologies (IP, 2004–06)
• ECOLEAD - European Collaborative Networked Organizations Leadership Initiative (IP, 2004–08)
• NeOn - Lifecycle Support for Networked Ontologies (IP, 2006–10)
• Co-Extra - GM and non-GM Supply Chains: Their Co-existence and Traceability (IP, 2008–09)
• PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning (NoE, 2003–07)
• CEC-WYS - Central European Centre for Women and Youth in Science (SSA, 2004–07)
• IST-World - Knowledge Base for RTD Competencies (SSA, 2005–07)
• WS DEBATE - Stimulating Policy Debate on Women and Science Issues in Central Europe (SSA, 2006–08)
• KD-ubiq - A blueprint for ubiquitous knowledge discovery systems (CA, 2005–08)
• ALVIS - Superpeer Semantic Search Engine (STREP, 2004–06)
• SIGMEA - Sustainable Introduction of GMOs into European Agriculture (STREP, 2004–07)
• IMAGINATION - Image-based Navigation in Multimedia Archives (STREP, 2006–09)
• SMART - Statistical Multilingual Analysis for Retrieval and Translation (STREP, 2006–09)
• SWING - Semantic Web Services Interoperability for Geospatial Decision Making (STREP, 2006–09)
• TAO - Transitioning Applications to Ontologies (STREP, 2006–09)
• E.E.T Pipeline - European Embryonal Tumor Pipeline (STREP, 2007–09)
• E4 - Extended Enterprise management in Enlarged Europe (STREP, 2006–08)
• Tool-East - Open Source Enterprise Resource Planning and Order Management System for Eastern European Tool and Die Making Workshops (STREP, 2006–08)
• IQ - Inductive Queries for Mining Patterns and Models (STREP, 2005–08), Coordinator
• HEALTHREATS - Integrated Decision Support System for HEALTH THREATS and crises management (STREP, 2007–10)

FP7 European projects (in 2008)

FP7

– 3 IP, 2 STREP, 1 NoE, 1 CSA
– In FP7 we have acquired ~ 30% of Slovenian FP7-ICT funds (2.5+ Mio EUR)

Projects

– COIN - COllaboration and INteroperability for networked enterprises (IP, 2008–12)
– ACTIVE - Enabling the Knowledge Powered Enterprise (IP, 2008–11)
– EURIDICE - European Inter-Disciplinary Research on Intelligent Cargo for Efficient, Safe and Environment-friendly Logistics (IP, 2008–11)
– PASCAL2 - Pattern Analysis, Statistical Modelling and Computational Learning 2 (NoE, 2008–13)
– BISON - Bisociation Networks for Creative Information Discovery (FET, 2008–11)
– PHAGOSYS - Systems biology of phagosome formation and maturation modulation by intracellular pathogens (STREP, 2008–11)
– MONDILEX - Conceptual Modelling of Networking of Centres for High-Quality …

Industrial participation in European projects

We have helped 11 Slovenian companies to become partners of FP6 and FP7 EU projects. The total EC contribution to these industrial partners is more than 2 Mio EUR.

– Orodjarski grozd
– Avtomobilski grozd
– Grozd visokotehnološke opreme
– Kogast Grosuplje
– Emo orodjarna
– Valji Štore
– Tecos
– Quintelligence
– Cycorp
– Amebis
– Hermes Softlab

Relevance highlight: Slovene language and heritage

nl.ijs.si portal

Largest public repository of Slovene language resources

• ~ 10,000 requests/day
• Annotated language corpora
• Lexicons and dictionaries
• On-line tools for language processing: concordancers, lemmatisers, taggers
• eZISS digital library of critical editions of Slovene literature

Relevance highlight: Environmental data analysis

Applied projects

– Agriculture: modeling co-existence of genetically modified and conventional crops
– Forestry (automated forest mapping):
  • from satellite images instead of LIDAR
  • cost reduction: from 660 to 0.01 US$/km²
– Fire risk model: deployed in e-GIS UJME

Events

– ECEM/EAML-04: European Conf. on Ecological Modelling: Env. App. of Machine Learning
– Special issues of the Ecological Modelling journal

Postgraduate education

– University of Nova Gorica
– Jožef Stefan International Postgraduate School
– University of Trento

Relevance highlight: Healthcare data analysis

Projects MediNet and MediNet+ for the Slovenian Ministry of Health

– Qualifications of physicians
  • Modeling and exception finding
– Planning of needs for physicians
– Accessibility of primary healthcare

Relevance highlight: Public portals

videolectures.net

World-leading video lectures Web portal

• 4,500+ videos of 3,000+ lectures at 150+ events
• About 3,000 views/day
• Collaboration with CMU, Cambridge, Oxford, Max Planck, Berkeley, INRIA
• To include the entire MIT OpenCourseWare and CERN lecture base in 2008

Relevance highlight: Public portals

www.ist-world.org

World-leading Web portal for analyzing European science

• 90,000 RTD organizations, 68,000 RTD projects, 1.6 Mio experts and 2.5 Mio publications
• About 15,000 visits/day
• Extending coverage to Russia, India, SEE countries

Organization of events

Slovenian events

– Solomon seminar – regular public seminar, running for 9 years (200+ seminars)
– SiKDD – yearly Slovenian Conference on Data Mining
– Language Technologies – biennial conferences

International events

– ECEM/EAML-04: Eur. Conf. on Ecol. Modelling – 100 attendees
– European Semantic Web Conference 2006 – 350 attendees
– IDA 2007 – 100 attendees
– 10+ international meetings and workshops (~50 attendees)

International events planned

– ECML/PKDD 2009 – est. 400 attendees
– WWW 2012 (in process) – est. 1500 attendees; largest CS event to be organized in Slovenia

Vision

Future advances in basic research (1)

Data analytics

– Structured data analysis (structured prediction, bisociation analysis, …)
– Sensor network analysis, social network analysis (large graph data)
– Multi-modal data analysis (information fusion, different data types)
– Complex data visualization

Text analytics

– Extending TextGarden to multimedia mining (text, image, Web) and social network analysis
– Advancing information extraction, machine translation
– Ontologies and Semantic Web

Future advances in basic research (2)

Human language technologies

– Semantic annotation of Slovene language corpora
– Integrated digital library of Slovene text-critical editions
– Slovene cultural heritage – processing old (19th century) language

Decision support

– Integration of qualitative and quantitative methods
– Handling incompleteness, uncertainty and imprecision

Knowledge management

– Web 2.0, Semantic Web services
– Networked organizations
– eLearning – videolectures.net

Impact on other sciences through applied research

Environmental sciences

– Ecology (Aquatic, Modeling the response of ecosystems to climate change)
– Forestry
– Environmental epidemiology

Agriculture

Biomedicine

– Bioinformatics, Functional genomics, Systems biology
– Medicine

Linguistics, Humanities and Social sciences

Engineering

Impact will be achieved in collaboration with partners of EU projects

National relevance

Developing IT and building a knowledge-based society

– Through basic research
– Training competent researchers in this area
– Education (at graduate and post-graduate level)

Applied research

– Impact of the potential introduction of GM crops, environmental epidemiology of tick-borne diseases, introduction of ML technology for Slovene-English machine translation systems, analytic techniques for enterprise knowledge management, systems biology

Continue opening new high-tech jobs in Slovenia

– Through direct industrial applications
– Through inclusion of Slovenian industry in EU projects

Means for achieving our vision

• Clear scientific focus on knowledge technologies
• Excellent links with scientists abroad
• Excellent links with industry
• Young and visionary staff
• Available equipment
• Secured funding:
  – About 25 % from national long-term research program Knowledge Technologies and other national and international projects
  – About 75 % from EU funded projects

[Chart: funding per year, 2004–2008 (axis from 0 to 2.500.000), with series Total funding, EU projects, Knowledge Technologies, National projects]

Funding in 2008

– EU projects 71%
– Knowledge Technologies 17%
– Targeted research projects 7%
– Basic research projects 3%
– Applied research projects 2%

Principal researchers – project leaders

– Nada Lavrač, Head of Department
– Sašo Džeroski
– Tomaž Erjavec
– Dunja Mladenić
– Marko Bohanec
– Marko Debeljak
– Marko Grobelnik
– Mitja Jermol

Department members - Bohinj 2008

Jožef Stefan Institute and Postgraduate School

Jožef Stefan (1835-1893) was one of the most distinguished physicists of the 19th century. He originated the law of the total radiation from a black body.

$j = \sigma T^{4}$

Founded in 1949, Jožef Stefan Institute is the leading Slovenian scientific research institute, covering a broad spectrum of basic and applied research. The staff of more than 850 specializes in natural sciences, life sciences and engineering.

Department of Knowledge Technologies is one of the seven ICT departments of the institute. Other departments are in the area of chemistry, biochemistry, ecotechnology, nanotechnology, physics, nuclear technology and safety.

Founded in 2004, Jožef Stefan International Postgraduate School offers MSc and PhD programs: ICT, nanotechnology and ecotechnology. Courses are taught in English.