Copyright Lombard Hill Group
Abstract
Keywords: cluster analysis, group technology, manufacturing concepts, domain analysis.
1. Background
Wemmerlov and Hyer [Wemm] highlight three ways that group technology benefits are achieved:
* By performing similar activities together, less time is wasted
in changing from one unrelated activity to the next
* By standardizing closely related items or activities,
unnecessary duplication of effort is avoided
* By efficiently storing and retrieving information related to
recurring problems, search time is reduced
2. Position
One means of identifying families in group technology is cluster analysis, which is "concerned with grouping of objects into homogeneous clusters (groups) based on the object features" [Kusi]. Cluster analysis was applied to software reuse at the Manufacturing Productivity (MP) section of the Software Technology Division of HP to solve a problem concerning the maintenance of its reusable assets, called "utilities".
The MP section produces large application software for manufacturing resource planning. The MP reuse program started in 1983 and continues to the present. The original motivation for pursuing reuse was to increase the productivity of the engineers in order to meet critical milestones [Nish]. MP has since discovered that reuse also eases the maintenance burden and supports product enhancement. Reuse was practised in the form of reusable assets (application/architecture utilities and shared files) and generated code. The 685 reusable assets totalled 55 thousand Non-Commented Source Statements (55 KNCSS). They were written in Pascal and SPL, the Systems Programming Language for the HP 3000 Computer System. The development and target operating system was the Multi-Programming Executive (MPE XL).
The utilities at MP are numerous (685) and small (from 14 to 619 Non-Commented Source Statements each). In manufacturing systems software developed by MP, a transaction is a cohesive set of activities used for inventory management and is a subunit of the manufacturing systems software.
Each transaction calls the utilities it needs, and those utilities are contained in an include file specific to the transaction. Because each transaction is usually created by a different engineer, this has led to a proliferation of different include files. When a utility is modified, every include file that contains it must be identified and updated with the new version, which has required considerable effort.
To reduce the effort required for future updates, cluster analysis was applied to the use of utilities by transactions.
First, a 13 x 11 matrix was created with transactions as rows and utilities as columns (Figure 1). A "1" indicates that the transaction calls the utility, and a "0" indicates that it does not.
Input Matrix
(Rows are transactions; columns are reusable assets)
1 0 1 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 1 0 0 1
0 0 1 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 1 0 0 1
0 0 1 0 0 0 0 0 0 0 1
0 0 1 1 0 0 0 0 0 0 1
0 0 1 0 0 0 0 0 0 0 0
0 1 0 0 1 0 1 0 0 0 1
0 1 0 0 1 0 1 0 0 0 0
0 1 0 0 1 1 1 0 1 1 1
0 1 0 1 1 1 1 0 1 1 1
0 1 0 1 0 0 1 0 0 0 1
0 1 0 1 1 0 1 0 0 0 1
Column (Reusable assets):
1=Adj-summary-qty
2=Autobofill
3=check'store'up
4=invalid-fld-check
5=potency-inv-qty
6=prep'for'pcm
7=print-document
8=report-neg-qty
9=send'to'pcm
10=update'pcm'buff
11=write-stock-file
Rows (Transactions):
Figure 1
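The matrix in Figure 1 can be held in a small data structure and queried directly. The following is a minimal sketch in Python, not part of the original MP tooling; the names ROWS, UTILITIES, callers and utilities_of are illustrative assumptions. It answers the maintenance question raised above: when a utility changes, which transactions (and hence which per-transaction include files) are affected?

```python
# Figure 1, one string per transaction (rows 1-13). Each character is one
# utility column (1-11); "1" means the transaction calls that utility.
ROWS = [
    "10100000000",
    "00100001001",
    "00100001000",
    "00000001001",
    "00100000001",
    "00110000001",
    "00100000000",
    "01001010001",
    "01001010000",
    "01001110111",
    "01011110111",
    "01010010001",
    "01011010001",
]

# Column names as listed under Figure 1.
UTILITIES = [
    "Adj-summary-qty", "Autobofill", "check'store'up", "invalid-fld-check",
    "potency-inv-qty", "prep'for'pcm", "print-document", "report-neg-qty",
    "send'to'pcm", "update'pcm'buff", "write-stock-file",
]


def callers(utility_number):
    """Return the 1-based transaction numbers that call the given utility."""
    col = utility_number - 1
    return [t for t, row in enumerate(ROWS, start=1) if row[col] == "1"]


def utilities_of(transaction_number):
    """Return the names of the utilities called by the given transaction."""
    row = ROWS[transaction_number - 1]
    return [UTILITIES[c] for c, bit in enumerate(row) if bit == "1"]


if __name__ == "__main__":
    for number, name in enumerate(UTILITIES, start=1):
        print(f"{number:2d} {name:18s} called by {callers(number)}")
```

For example, callers(11) lists nine transactions, so a change to write-stock-file touches nine include files if each transaction keeps its own.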
The matrix is then used as an input file to a clustering algorithm provided by Dr. Andrew Kusiak of the University of Iowa.
The output solution, shown in Figure 2, reorders the reusable assets into "clusters". The results suggest placing utilities (the columns) 1, 3 and 8 into a single include file for transactions 1 to 7.
Utilities 2, 5, 6, 7, 9 and 10 should be placed into another include file for transactions 8 to 13.
Utilities 4 and 11 are called by transactions in both groups, so they can either be placed in both include files or given a separate include file of their own.
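The grouping described above can be reproduced with a very simple procedure. The sketch below is not the clustering algorithm supplied by Dr. Kusiak, whose details are not given here. It is a plain connected-components grouping, and it assumes the shared utilities (4 and 11) are set aside by hand beforehand rather than detected automatically. The names ROWS, SHARED and cluster are illustrative.

```python
# Figure 1 incidence matrix: rows are transactions 1-13, columns utilities 1-11.
ROWS = [
    "10100000000",
    "00100001001",
    "00100001000",
    "00000001001",
    "00100000001",
    "00110000001",
    "00100000000",
    "01001010001",
    "01001010000",
    "01001110111",
    "01011110111",
    "01010010001",
    "01011010001",
]

# Assumption: utilities used across both groups, taken from the text above.
SHARED = {4, 11}


def cluster(rows, shared):
    """Group transactions that are linked through at least one non-shared utility."""
    # Utilities (1-based) called by each transaction, ignoring the shared ones.
    calls = [
        {c + 1 for c, bit in enumerate(row) if bit == "1"} - shared
        for row in rows
    ]
    clusters = []
    unassigned = set(range(1, len(rows) + 1))
    while unassigned:
        seed = min(unassigned)
        transactions, utilities = {seed}, set(calls[seed - 1])
        grew = True
        while grew:  # keep absorbing transactions until nothing new joins
            grew = False
            for t in sorted(unassigned - transactions):
                if calls[t - 1] & utilities:
                    transactions.add(t)
                    utilities |= calls[t - 1]
                    grew = True
        unassigned -= transactions
        clusters.append((sorted(transactions), sorted(utilities)))
    return clusters


if __name__ == "__main__":
    for transactions, utilities in cluster(ROWS, SHARED):
        print("transactions", transactions, "-> utilities", utilities)
```

On the Figure 1 data this yields transactions 1 to 7 with utilities 1, 3 and 8, and transactions 8 to 13 with utilities 2, 5, 6, 7, 9 and 10. Without setting utilities 4 and 11 aside, all thirteen transactions would collapse into one group, which is why overlapping columns need separate treatment.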
Cluster Analysis Solution
(Rows are transactions; columns are reusable assets)
Figure 2
Benefits of Cluster Analysis for Reuse
Cluster analysis helps determine which utilities belong together in an include file, which reduces the effort required to maintain the files. In our example with the MP section, prior to cluster analysis, thirteen individual include files were maintained, one for each transaction. By utilizing cluster analysis, we were able to identify the commonalities and differences within the thirteen include files and specify a core set of two include files. By reengineering the thirteen include files into two, the number of files to maintain can be reduced by about 85%.
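The reduction figure follows from simple arithmetic, shown here as a short Python check. The function name reduction is an illustrative assumption. The second call covers the alternative mentioned earlier, in which utilities 4 and 11 are given a separate third include file.

```python
def reduction(files_before, files_after):
    """Fraction by which the number of include files to maintain shrinks."""
    return (files_before - files_after) / files_before


# Thirteen per-transaction include files reengineered into two.
print(f"{reduction(13, 2):.1%}")  # 84.6%, i.e. about 85%

# If the shared utilities get their own include file, three files remain.
print(f"{reduction(13, 3):.1%}")  # 76.9%
```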
Cluster analysis also has further applications in software reuse. It may be used to identify "families of systems", i.e. systems that share the same features. For example, we can apply cluster analysis to a matrix in which the rows are software systems/products and the columns are their features. The analysis would cluster the features with the systems/products that have them, thereby helping to identify families of systems which share common features. This information may be useful in determining specific reusable assets to create.
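As a rough illustration of the families-of-systems idea, the sketch below groups products whose feature sets overlap strongly. Everything in it is an assumption made for illustration: the product and feature names are invented, and the similarity measure (Jaccard) and the 0.5 threshold are one simple choice among many rather than a method proposed in this paper.

```python
from itertools import combinations

# Hypothetical products and features, invented purely for illustration.
PRODUCTS = {
    "product_a": {"order_entry", "stock_ledger", "reporting"},
    "product_b": {"order_entry", "stock_ledger"},
    "product_c": {"scheduling", "capacity_plan", "reporting"},
    "product_d": {"scheduling", "capacity_plan"},
}


def jaccard(a, b):
    """Shared features divided by all features of the two products."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def families(products, threshold=0.5):
    """Single-link grouping: a product joins a family if it is similar
    enough to at least one member of that family."""
    names = sorted(products)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            n = parent[n]
        return n

    for x, y in combinations(names, 2):
        if jaccard(products[x], products[y]) >= threshold:
            parent[find(y)] = find(x)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())


def common_features(products, family):
    """Features shared by every member: candidates for reusable assets."""
    return sorted(set.intersection(*(products[n] for n in family)))


if __name__ == "__main__":
    for family in families(PRODUCTS):
        print(family, "share", common_features(PRODUCTS, family))
```

With this invented data, two families emerge, and the features common to each family are the natural first candidates for reusable assets.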
3. Comparison
Some researchers have utilized cluster analysis for the purposes
of reuse classification. For example, Maarek and Kaiser [Maar]
describe the use of conceptual clustering to classify software
components. Taxonomies of software components are created "by using a
classification scheme based on conceptual clustering, where the
physical closeness of components mirrors their conceptual
similarity." The objects are gathered into clusters where they are
more 'similar' to each other than to the members of other clusters.
Acknowledgements
My thanks to Dr. Andrew Kusiak, Dr. Sylvia Kwan and Alvina Nishimoto for their help and input to this paper.
[Kusi] Kusiak, Andrew and Wing Chow, "Decomposition of Manufacturing Systems", IEEE Journal of Robotics and Automation, vol. 4, no. 5, October 1988.
[Maar] Maarek, Yoelle and Gail Kaiser, "Using Conceptual Clustering for Classifying Reusable Ada Code", Using Ada: ACM SIGAda International Conference, December 9-11, 1987, ACM Press, New York, 1987.
[Nish] Nishimoto, Alvina, "Evolution of a Reuse Program in a Maintenance Environment", 2nd Irvine Software Symposium, March 1992.
[Wemm] Wemmerlov, Urban and Nancy Hyer, "Group Technology", chapter 17 in Handbook of Industrial Engineering, Gavriel Salvendy, ed., John Wiley & Sons, 1992.