Component Mining

Mahdi Cheraghchi-Bashi-Astaneh

[email protected]

Outline

- What is a component?
- Software reuse
- What is component retrieval?
- Pros and cons of reuse
- How to retrieve?
- Evaluation

What is a component?

- A part of the whole.
- “A piece of software small enough to create and maintain, big enough to deploy and support, and with standard interfaces for interoperability” (Jed Harris, President, CI Labs).
- Self-contained binary pieces of software, but not complete applications.
- Can be combined with other components to produce complete applications, regardless of the languages the components are implemented in or the platforms they run on.
- Object-oriented methods are often used for component development and reuse.

Some Examples in Practice

- Borland Delphi
- Borland C++ Builder
- Borland Kylix
- OLE / COM / ActiveX
- JavaBeans
- CORBA

Software Reuse

- Software reuse is the process of creating software systems from existing software rather than building software systems from scratch. [Krueger, 1992]
- Levels of software reuse: source code, algorithms, architectures, domain models, design, program transformations, documentation, … every possible aspect of a software system

What is Component Retrieval?

- The mere existence of a component library does not automatically entail its re-use.
- “Component Mining” is the deliberate, organized and automated process of extracting reusable components from an existing rich software base.
- Re-users need support in identifying components that suit their needs. This task is the topic of software component retrieval.
- The goal is to develop reusable, adaptable software components rather than large, monolithic applications.

Types of Reuse

- Black-Box Reuse: a client may reuse the retrieved components “as is.”
- Component-adaptive Grey-Box Reuse: a client may reuse the retrieved components without meeting any additional conditions, but only after interface-level modifications of the components.
- White-Box Reuse: arbitrary additions and modifications are required.

Pros and Cons of Reuse

- Advantages:
  1. Reduces time and cost spent on programming.
  2. Increases programmers’ productivity.
  3. Increases program quality and reliability.
  4. Expertise sharing.
- Problems:
  1. It is hard to find things, especially at large scale.
  2. Typically components are not (easily) modifiable.
  3. It is hard to manage a large pool of components.
  4. Reuse is only worthwhile if it is easier to locate and modify a reusable component than to write it from scratch.

How to Retrieve?

- Component retrieval is in fact a form of information retrieval. Despite this, “dedicated” component retrieval algorithms are being developed, since software is more than ordinary text.
- Component retrieval is a complex and heuristic process.
- It typically needs a well-structured repository of components.
- Methods of retrieval:
  1. Algorithms based on the meta-data accompanying software components.
  2. Algorithms based on the structure of the components.
- Exact retrieval versus approximate retrieval.

Retrieval by Meta-Data

- By meta-data we mean the documentation accompanying the component.
- This method relies on the existence and quality of the documentation and needs some preprocessing.
- How to find?
  1. Using full-text search on documents and program files: no cost, but inaccurate.
  2. By classification of the components, either automatically or manually (depending on the cost and accuracy we need).
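
As a rough illustration of the full-text option, the sketch below ranks components by how many query words occur in their documentation. The component names, the documentation strings, and the scoring rule are all made up for illustration; a real repository would need stemming, stop-word handling, and a proper index.

```python
import re
from collections import Counter

# Hypothetical component library: name -> accompanying documentation text.
DOCS = {
    "StringSorter": "Sorts a list of strings in alphabetical order.",
    "FileLogger": "Writes log messages to a text file.",
    "ListShuffler": "Shuffles the items of a list in random order.",
}

def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def full_text_search(query, docs):
    """Return component names ranked by the number of query-word hits."""
    query_words = set(tokenize(query))
    scores = {}
    for name, doc in docs.items():
        counts = Counter(tokenize(doc))
        score = sum(counts[w] for w in query_words)
        if score > 0:
            scores[name] = score
    # Highest score first; ties are broken alphabetically for a stable result.
    return sorted(scores, key=lambda n: (-scores[n], n))

print(full_text_search("sort a list", DOCS))
```

With this toy data the query "sort a list" also returns the unrelated FileLogger (it shares the word "a"), and "sort" fails to match "Sorts". Both effects are examples of the inaccuracy mentioned above.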

Retrieval by Structure

- Depends on the availability of the structure in some form (source code, interface, etc.)
- Depends on the availability of computer language processors.
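
One structure-based approach works on component interfaces alone. The sketch below compares a query signature with library signatures, first exactly and then with a relaxed match that ignores argument order. The library entries and type names are hypothetical, and this relaxation is only one of many possible.

```python
from collections import Counter

# Hypothetical library: component name -> (argument types, result type).
LIBRARY = {
    "repeat": (("str", "int"), "str"),
    "pad_left": (("int", "str"), "str"),
    "length": (("str",), "int"),
}

def exact_match(query, signature):
    """Exact match: same argument types in the same order, same result."""
    return query == signature

def reorder_match(query, signature):
    """Relaxed match: same result type, argument types equal up to order."""
    (q_args, q_result), (s_args, s_result) = query, signature
    return q_result == s_result and Counter(q_args) == Counter(s_args)

def retrieve(query, library, matcher):
    """Return the sorted names of all components whose signature matches."""
    return sorted(n for n, sig in library.items() if matcher(query, sig))

query = (("str", "int"), "str")
print(retrieve(query, LIBRARY, exact_match))    # exact retrieval
print(retrieve(query, LIBRARY, reorder_match))  # approximate retrieval
```

The exact matcher finds only `repeat`, while the relaxed matcher also returns `pad_left`. The relaxed form gives more candidates to inspect, at the price of possible false matches.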

Some Other Methods

- Formal component specification
  1. Domain theories: algebraic model, signatures, etc.
  2. Interface specifications
  3. Interface matching (automated theorem proving, etc.)
- Semantic Classification
- Feature-based methods (What possible features can a component have?)
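
For the feature-based idea, a minimal sketch: each component is described by a set of features, and candidates are ranked by Jaccard similarity to the requested features. The feature vocabulary, the components, and the choice of Jaccard similarity are assumptions made for illustration, not something the slides prescribe.

```python
# Hypothetical feature descriptions of three components.
FEATURES = {
    "QuickSorter": {"sorting", "in-memory", "unstable"},
    "MergeSorter": {"sorting", "in-memory", "stable"},
    "CsvReader": {"parsing", "file-io"},
}

def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rank_by_features(wanted, features):
    """Return (name, similarity) pairs with similarity > 0, best first."""
    scored = [(name, jaccard(wanted, fs)) for name, fs in features.items()]
    scored = [(n, s) for n, s in scored if s > 0]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))

ranking = rank_by_features({"sorting", "stable"}, FEATURES)
print(ranking)
```

Here `MergeSorter` ranks first because it shares both requested features, `QuickSorter` follows with one shared feature, and `CsvReader` is dropped because it shares none.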

Some Other Methods

- Deduction-Based Component Retrieval
  - It is the only method that retrieves only proven matches.
  - Suitable for the development of high-reliability or safety-critical applications, e.g. spacecraft control systems.

Searching and Browsing

- Searching: Software developers formulate a query, and the repository system returns components that match the query.
  - Problem: Formulating an effective query is a challenging task.
- Browsing: Developers determine the relevance of the components currently being displayed in terms of their development task, and traverse the associated links.
  - It is an incremental task, and is usually preferred.
  - Problem: The software developer may become confused.
- Context-Aware Browsing: Infers developers’ tasks by monitoring their interactions with the environment.
  - Similar to browsing, but results in a significantly smaller browsing space.
  - Uses learning methods to refine itself.
  - Problem: It is difficult to “understand” the content.

The Reuse Environment

- A component database.
- A library management system providing access to the database.
- A software component retrieval system (e.g. an ORB) that enables client applications to retrieve components from the library server.
- CBSE tools that support the integration of reused components into a new design.

Evaluation Measures

- Recall = ratio of the number of relevant components retrieved to the total number of relevant components in the repository
- Precision = ratio of the number of relevant components retrieved to the total number of components retrieved
- Response time
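
Both ratios are straightforward to compute once the set of relevant components is known. A small sketch with made-up component names follows; returning 0.0 for an empty set is a convention chosen here, not part of the definitions.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for one query.

    retrieved: set of components returned by the retrieval system.
    relevant:  set of components in the repository that are actually relevant.
    """
    hits = len(retrieved & relevant)
    # Assumed convention: a ratio with an empty denominator counts as 0.0.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical query result: 4 components retrieved, 3 of them relevant,
# out of 6 relevant components in the repository.
retrieved = {"A", "B", "C", "D"}
relevant = {"A", "B", "C", "E", "F", "G"}
print(precision_recall(retrieved, relevant))  # (0.75, 0.5)
```

Retrieving more components tends to raise recall and lower precision, which is why the two measures are reported together.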

Summary and Conclusion

- Software reuse is a crucial concern in today’s world of complex software products.
- The component-based development model plays an important role in software reuse.
- The component-based model is useful only when a satisfactory means of retrieval is available.
- No definitive way has yet been developed to describe components in unambiguous, classifiable terms.
- Component retrieval is a difficult problem, and more work is needed to find an efficient solution.

References

- D. Spinellis, K. Raptis, Component Mining: a process and its pattern language, Information and Software Technology 42 (2000) pp 609-617
- Hafedh Mili et al, An experiment in software component retrieval, Information and Software Technology 45 (2003) pp 633-649
- K. McArthur et al, An evaluation of the impact of component-based architectures on software reusability, Information and Software Technology 44 (2002) pp 351-359
- P.A.V. Hall, Architecture-driven component reuse, Information and Software Technology 41 (1999) pp 963-968
- I. Crnkovic, M. Larsson, Challenges of component-based development, The Journal of Systems and Software 61 (2002) pp 201-212
- Y. Ye, G. Fischer, Context-Aware Browsing of Large Component Repositories, IEEE 16th International Conference on Automated Software Engineering, 2001
- A. M. Zaremski, J. M. Wing, Signature Matching, A Key to Reuse
- B. Fischer, Deduction-Based Software Component Retrieval (Thesis)