Dissertations & Diploma Theses
Dissertations
in Progress
|
Werner Klieber |
Automatic orchestration of knowledge discovery services
Knowledge discovery processes need to be flexibly configurable for various domain scenarios. This includes the specification of the tasks and their orchestration, integration, and optimisation. Modelling such processes allows implementation-independent and intuitive usage. For complex tasks, however, it becomes difficult to select the appropriate services and combine them into a correct workflow. Current research examines approaches such as service-oriented architectures, grid (parallel) computing, and semantic description languages to deal with complex tasks. The main goal of this work is to assess how well these approaches can be adapted to the knowledge discovery domain. Furthermore, it will examine to what extent processes based on semantic descriptions can be orchestrated automatically. |
|
Mark Kröll |
Discovery of Relations in Semi-Structured Datasets
Knowledge-intensive work plays an increasingly important role in organisations of all types. Knowledge workers contribute their effort to achieve a common purpose; they are part of (business) processes. Workflow management systems support them in their daily work by offering guidance and intelligent resource delivery. However, the emergence of richly structured, heterogeneous datasets requires a reassessment of existing mining techniques, which do not take possible relations between individual instances into account. Neglecting these relations might lead to inappropriate conclusions about the data. In order to uphold the support quality of knowledge workers, the application of mining [...] Structural information is obtained and maintained by representing user interaction patterns, e.g., relations between users, resources, and tasks, as graphs. It can be used to improve the predictive accuracy of learnt models: attributes of linked objects are often correlated, and links are more likely to exist between objects that have something in common. The graph structure itself may be an important feature to include in the mining procedure. In the course of this work, experiments will show which selection of features succeeds in the mining challenge. To sum up, this work aims at answering the following questions: Under what circumstances is there a need for a balance between content and structure features? Are certain graph mining techniques better suited than others?
|
Finished
|
Michael Granitzer |
KnowMiner: Conception and Development of a Generic Knowledge Discovery Framework
Steadily increasing amounts of information require new ways and techniques for making knowledge usable efficiently and in a goal-oriented way. The lack of structure in information, the incomplete assignment of metadata, and the vagueness of human language make things even more difficult. To make information usable for different user groups, various techniques from different domains must be combined: Knowledge Discovery and Information Retrieval provide approaches and techniques for the semantic enrichment of information and thus for a better utilisation of information that partly lies idle. Against this background, the work at hand deals with the conceptualisation and development of an integrative and flexible software framework. This framework supports the development of knowledge discovery and information retrieval services. The main goal of the work is the integration of different algorithms and techniques from the above-mentioned domains, whereby the framework should be applicable in different application scenarios. The analysis of processes, data flows, and application areas of Knowledge Discovery provides the conceptual foundation for the realisation of the software framework. The conceptualisation and realisation were divided into several iteration cycles, following the spiral model of software development. Each cycle ended with the application and evaluation of the framework in practical scenarios and projects. On the one hand, this allowed for the further development of the framework; on the other hand, it showed its applicability in different scenarios and technical areas. The resulting KnowMiner framework was successfully realised and applied in five large and a number of smaller projects. As experience shows, the framework can easily be integrated into various application areas, and techniques from Knowledge Discovery can easily be applied in practical scenarios. |
|
Mathias Lux |
The Role of Metadata in Knowledge Discovery
Metadata is a broad term, ranging from simple attribute structures describing data to huge and complex ontologies that try to formalize the knowledge about a resource. In knowledge discovery, the extracted knowledge has to be codified and used for retrieval and inference. One major aspect is the comparison of knowledge and the retrieval of similar codified knowledge. Different graph-based ways of representing knowledge exist, for instance RDF, which provides a foundation for the Semantic Web; the MPEG-7 semantic descriptor scheme, which allows the semantic description of multimedia resources; and conceptual graphs, which are a general model for formalizing knowledge. The PhD thesis concentrates on the retrieval and comparison of graph structures used to store semantic metadata, and on their application. |
Master Thesis
in Progress
|
Georg Öttl |
Recent information extraction (IE) systems extract named entities using rules, machine learning approaches, or a mixture of both. The main drawback of a rule-based approach is that it requires the manual adaptation of rules to a particular dataset. A machine learning algorithm, on the other hand, typically needs to be trained on a dataset. This study introduces mechanisms to support and improve the rule adaptation process by learning rules. An important detail of this rule learning process is the semi-automatic extension of the training dataset used. If the quality of the learned rules is good enough in terms of precision and recall, the resulting set of rules can be reused to create multiple instances of an ontology. The hybrid approach is evaluated by comparison with state-of-the-art machine learning algorithms and pure rule-based information extraction systems. |
Finished
|
Michael Granitzer |
Classification of Hierarchical Document Spaces Using Machine Learning Technologies
Due to the permanently growing amount of textual data, automatic methods for organizing the data are required. Automatic text classification is one of these methods. Based on the textual content of a document, it automatically assigns the document to one or more of a predefined set of classes. |
|
|
Philip Hofmair |
Asset- and Rightsmanagement in the context of digital libraries
One reaction to the ever-increasing flood of information, especially in the digital sector, is the growing desire to find better methods to organize and control it, be it documenting and coping with the general information appearing daily on billions of internet sites, or dealing with the highly specialized information found within schools and universities. Digital libraries provide a very good method of collecting and logging information in a controlled manner. However, if the volume of data in such libraries exceeds a certain limit and the libraries also contain highly confidential information, then the use of systems requiring access authorization becomes a must.
|
|
|
Mathias Lux |
Magick - A Tool for Cross-Media Clustering and Visualization
The high tide of digital information that is heading towards us in the 21st century provides ample motivation for research and development in the area of information retrieval. The conjunction of different media such as TV, radio, Internet, newspapers, and telephone leads to a heterogeneous information landscape in which uniform navigation and search are hard to apply. Information retrieval solves common problems in handling textual and image data, while metadata makes it possible to enrich data with computable semantic content descriptions, evaluations, and classifications that are independent of the actual media. The application Magick combines these well-known and tested techniques to allow cross-media retrieval and to reshape the information landscape in a more homogeneous way for the user. |
|
|
Vedran Sabol |
Visualisation Islands: Interactive Visualisation and Clustering of Search Result Sets
The amount of knowledge available electronically is increasing exponentially. Huge amounts of information are available over the Internet and searching for a specific topic often results in a large number of matches. A significant portion of hits is often not at all of interest and the retrieved information contains no explicit relations between different hits, making it hard to obtain an overview and find relevant information. |
|
|
Werner Klieber |
Using MPEG-7 for Multimedia retrieval
The aim of knowledge retrieval is to find knowledge efficiently in complex knowledge spaces. In this thesis, a multimedia query framework is realized that supports multimedia queries composed of different media types linked together into a single query request. The user does not have to enter low-level data; instead, the framework supports the direct use of multimedia data for query specification. The metadata standard MPEG-7 is used to ensure a uniform representation of the information and to make feature-based and semantic information explicitly available to the system. The multimedia query framework is integrated into an existing distributed Web environment based on XML. |
|
|
Thomas Neidhart |
Semiautomatic creation of knowledge maps using knowledge mining techniques
Given the abundance of existing information, the need for a suitable structuring of |
|
|
Andreas Juffinger |
Focused Crawling in the Context of Digital Libraries
Focused crawling is gathering increasing momentum, not only in combination with search engines but also in the context of digital libraries. Crawling is useful for discovering new documents on certain topics. Focused crawling can also help people find data on the World Wide Web by suggesting sites and pages of interest. The overall crawling process is split into a crawling part and a web mining part. |
|
|
Andreas Augustin |
Acquisition of semantic information from encyclopedic data
In recent years the amount of data has grown massively and keeps on growing; hence it has become necessary to develop new methods to cope with this large amount of data. Besides improvements in search capability, one of the main driving forces in current data mining research is the need to expose and understand the knowledge underlying the data. Encyclopedias are regarded as a reflection of a decade's knowledge. As encyclopedias have always been a great resource for people to gain common knowledge, there is a need to build such common knowledge for computer systems too. The primary objective of this thesis is to extract knowledge from the textual representation of those encyclopedias and to prepare the extracted knowledge for exploitation in different applications and domains. To implement this process, appropriate methods of ontology learning are applied to the text to create taxonomies and concept hierarchies. These structures are combined into a computer-processable ontology. Furthermore, the extracted information is evaluated and refined using additional methods such as online validation and clustering. Provided that suitable methods are used, this thesis shows that the semi-automatic extraction of high-quality semantic information from an encyclopedic dataset to build a base ontology is possible. |