Department of ECE, PSG College of Technology, Coimbatore-641004.
hssheshadri@hotmail.com
This paper reports experimental investigations of a computerized method for detecting malignant and nonmalignant breast masses in mammograms. The method was tested on mammograms collected from several clinics and hospitals in and around Coimbatore. It is capable of detecting masses as small as 4.5 mm in diameter. The technique therefore helps in the early detection of breast cancer and, as a consequence, increases the chance of cure. The method was developed using a macro option, which enables the execution of a series of operations according to a set of parameters. These parameters were obtained using image processing software, relying largely on trial and error. Experiments were conducted on about 80 mammograms to confirm the capability of the algorithm to correctly diagnose masses independently of their sources. The parameters of the algorithm were set to give the best matching ratio (defined as true positive regions, TPR) between the algorithm’s diagnosis and the radiologist’s diagnosis in terms of mass size and location. Percent matching was independent of mass size, which gave the algorithm high performance reliability because its performance was insensitive to mass size. When the opinion of a second radiologist was considered, TPR was also unchanged.
Keywords: Computer Aided Diagnosis, Malignant tumors, Tumor detection
Method Adopted.
The new algorithm detects small breast masses (malignant or
nonmalignant), which helps in the early detection of breast
cancer. A relatively large number of mammograms (Table 1) were
used in developing the algorithm. These mammograms were collected
from several hospitals and clinics, which enriched the study and
covered many conditions of breast masses.
Computerized detection of masses from mammograms can be used as a
tool to reduce the number of false diagnoses, which in turn
reduces the number of cases referred for biopsy. The new method is
based on the following three stages:
1. Collecting knowledge about the nature of the mammograms.
2. Obtaining the radiologist’s diagnosis.
3. Building the algorithm.
Table 1: Sources and numbers of mammograms used in the study.
Each mammogram was labeled with three tag points and then
superimposed on a transparency so that the radiologist could
perform his eye-aided diagnosis. The transparency carries the same
tag points for referencing purposes, as shown in Figure 1.
Each mammogram and its corresponding transparency were digitized
using a scanner at 360 dpi and 256 gray levels, which resulted in
digital images of about 128-256 KB. During digitization, only the
affected area was taken as the region of interest and stored in
computer memory, in order to reduce the memory consumed by these
images.
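As a rough check on these figures, the short sketch below converts the scanning resolution into a pixel pitch and estimates how many pixels span a mass of a given diameter. It also estimates the side length of a stored region, assuming (these assumptions are not stated in the text) that images are uncompressed, use one byte per pixel for 256 gray levels, and are roughly square.

```python
import math

MM_PER_INCH = 25.4


def pixel_pitch_mm(dpi: float) -> float:
    """Physical size of one pixel, in millimetres, at the given resolution."""
    return MM_PER_INCH / dpi


def pixels_across(length_mm: float, dpi: float) -> float:
    """Number of pixels spanning a feature of the given physical length."""
    return length_mm / pixel_pitch_mm(dpi)


def square_side_pixels(size_bytes: int, bytes_per_pixel: int = 1) -> int:
    """Side length of a square image of the given size.

    Assumes an uncompressed image with no header (an assumption).
    """
    return math.isqrt(size_bytes // bytes_per_pixel)


if __name__ == "__main__":
    print(f"pixel pitch at 360 dpi: {pixel_pitch_mm(360):.4f} mm")
    print(f"pixels across a 4.5 mm mass: {pixels_across(4.5, 360):.1f}")
    print(f"128 KB region: about {square_side_pixels(128 * 1024)} px per side")
    print(f"256 KB region: about {square_side_pixels(256 * 1024)} px per side")
```

Under these assumptions a 4.5 mm mass spans roughly 64 pixels, so it is well resolved at the stated scanning resolution.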
The new mass detection algorithm was built based on the following
three image processing steps:
1. Brightness and contrast enhancement.
2. Histogram equalization.
3. Thresholding.
In the output image, white was assigned to the tumor cells, and
black was assigned to normal breast tissue and the surrounding
air. Figure 2 shows a block diagram of the algorithm alongside,
for comparison, the method employed by a radiologist.

Figure 1: An example of a mammogram (a), and the eye-aided
diagnosis (b).


Figure 2: (A) block diagram of the radiologist’s diagnosis, (B) block diagram of algorithm.
The brightness of every image (82 images in total) was increased in steps of 10 (full scale = 200). Each time the brightness was adjusted, a printout of the image was presented to the radiologist for diagnosis. The diagnosis obtained at each brightness level was compared with the original radiologist diagnosis performed on the transparency. Figure 3 shows the number of images whose diagnosis agreed with the original transparency-based radiologist diagnosis versus brightness level. Since the distribution was close to normal, we used its mean of 40 as the brightness value in the algorithm. In a similar manner we obtained a value of 40 for contrast enhancement and 225 for thresholding, as shown in Figures 4 and 5 respectively.
The purpose of histogram equalization is to normalize the distribution of gray levels across all images. This was necessary because of the variety of sources and methods of mammography. In addition, it increases the sensitivity of subsequent processing functions.
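The three processing steps can be sketched as follows. This is a minimal illustration rather than the macro used in the study. How the software maps its brightness and contrast settings onto pixel values is not stated, so the sketch assumes brightness is an additive offset expressed as a fraction of full scale and contrast is a linear gain about mid-gray. The function names are hypothetical.

```python
import numpy as np


def adjust_brightness_contrast(img, brightness=40, contrast=40, full_scale=200):
    """Apply an offset and a gain to an 8-bit grayscale image.

    The mapping from the settings to pixel arithmetic is an assumption.
    """
    offset = 255.0 * brightness / full_scale   # assumed additive offset
    gain = 1.0 + contrast / full_scale         # assumed gain about mid-gray
    out = (img.astype(np.float64) - 127.5) * gain + 127.5 + offset
    return np.clip(out, 0, 255).astype(np.uint8)


def equalize_histogram(img):
    """Standard histogram equalization of an 8-bit grayscale image."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    total = cdf[-1]
    if total == cdf_min:                       # constant image: nothing to do
        return img.copy()
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[img]


def threshold(img, level=225):
    """White (255) at or above the level, black (0) below it."""
    return np.where(img >= level, 255, 0).astype(np.uint8)


def detect_mass(img, brightness=40, contrast=40, level=225):
    """Run the three steps in the order given in the text."""
    enhanced = adjust_brightness_contrast(img, brightness, contrast)
    return threshold(equalize_histogram(enhanced), level)


if __name__ == "__main__":
    demo = np.zeros((8, 8), dtype=np.uint8)    # synthetic image, not study data
    demo[2:4, 2:4] = 200                       # bright square stands in for a mass
    print(detect_mass(demo))
```

Each step is a separate function, which mirrors the point made in the Discussion that a radiologist may use the whole process or only part of it.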

Figure 3: Distribution of images
with coincided diagnosis (radiologist and algorithm diagnosis) vs.
brightness level.

Figure 4: Distribution of images with coincided diagnosis (radiologist and algorithm diagnosis) vs. contrast level.

Figure 5: Distribution of images with coincided diagnosis (radiologist and algorithm diagnosis) vs. threshold level.
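The choice of each parameter as the mean of its agreement distribution can be illustrated with a weighted mean over the tested levels. The counts below are made-up values chosen only to show the calculation. They are not the data behind Figures 3 to 5, and the function name is hypothetical.

```python
import numpy as np


def distribution_mean(levels, counts):
    """Mean setting, weighting each level by the number of agreeing images."""
    levels = np.asarray(levels, dtype=float)
    counts = np.asarray(counts, dtype=float)
    if levels.shape != counts.shape:
        raise ValueError("levels and counts must have the same length")
    if counts.sum() == 0:
        raise ValueError("at least one image must agree at some level")
    return float(np.average(levels, weights=counts))


if __name__ == "__main__":
    # Illustrative numbers only (assumption): brightness levels in steps of 10
    # and the number of images whose diagnosis agreed at each level.
    levels = [10, 20, 30, 40, 50, 60, 70]
    counts = [2, 7, 19, 26, 19, 7, 2]
    print(distribution_mean(levels, counts))
```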
Results and Data Analysis
Results are presented in two forms. The qualitative results are the digitized images obtained after applying the algorithm. The quantitative results are the data analysis and the results of the statistical tests, which compare the outcome of the algorithm with the diagnoses of two radiologists.
Qualitative results
To evaluate the algorithm’s performance, its outcome was compared
with the radiologist’s diagnosis using image subtraction. Defining
the image outcome of the algorithm as pc(i,j) and the image outcome
of the radiologist’s diagnosis as pd(i,j), the subtracted image is:
ps(i,j) = pd(i,j) - pc(i,j)
The subtracted image gives a primary evaluation of the algorithm in
terms of the size and location of the diagnosed mass. Figure 6
shows an example of a mammogram that contains a malignant mass as
proven by biopsy.
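A minimal sketch of the subtraction step follows, assuming both diagnoses are stored as binary 8-bit images of equal size with white = 255. A signed integer type is used so that pixels marked only by the algorithm (negative values) are not lost to unsigned wrap-around. How the original software handled negative differences is not stated.

```python
import numpy as np


def subtract_diagnoses(p_d, p_c):
    """Return p_d - p_c as a signed image.

    +255: marked by the radiologist only
    -255: marked by the algorithm only
       0: both agree (both white or both black)
    """
    p_d = np.asarray(p_d)
    p_c = np.asarray(p_c)
    if p_d.shape != p_c.shape:
        raise ValueError("images must have the same shape")
    return p_d.astype(np.int16) - p_c.astype(np.int16)


if __name__ == "__main__":
    # Tiny synthetic masks (not study data).
    radiologist = np.array([[255, 255, 0], [0, 0, 0]], dtype=np.uint8)
    algorithm = np.array([[0, 255, 255], [0, 0, 0]], dtype=np.uint8)
    print(subtract_diagnoses(radiologist, algorithm))
```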

Figure 6: An example of the qualitative results, (a) the original image, (b) the algorithm output, (c) radiologist diagnosis, (d) subtraction of images “b” and “c”.
Quantitative analysis
This analysis is based on mass location and size. These two
parameters are compared between the algorithm output image and the
radiologist diagnosis image. The concepts of true positive regions
(TPR) and false positive regions (FPR) are used as measures of the
comparison between the two diagnoses. These parameters are based
on the matched region PM, the total region PT, and the mismatched
region PMM, as described in Figure 7.

Figure 7: Concept of quantitative analysis. (a) PM, (b) PT, (c) PMM.
Here “W” is the mass size as diagnosed by the radiologist, that
is, the number of white pixels in the image pd. This parameter was
obtained from the image histogram. TPR and FPR take values of 100
and 0 respectively in the ideal case, which represents a total
match between the algorithm’s diagnosis and the first
radiologist’s diagnosis. The algorithm parameters were based on
the values of TPR1 and FPR1 between the algorithm’s diagnosis and
the first radiologist’s diagnosis. In this case a total of 86
mammograms were presented to the first radiologist prior to being
diagnosed by the algorithm.
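The sketch below shows one way such measures could be computed from two binary images. The definitions used are assumptions, since the defining expressions are not given in the text: PM is taken as the number of pixels that are white in both images, PMM as the number of pixels that are white in the algorithm output only, and both are expressed as a percentage of W. These choices reproduce the ideal values of 100 and 0 for identical images but may differ from the definitions behind Figure 7.

```python
import numpy as np


def tpr_fpr(p_d, p_c):
    """Return (TPR, FPR) in percent under the assumed definitions.

    p_d: radiologist diagnosis image (nonzero = mass)
    p_c: algorithm output image (nonzero = mass)
    """
    d = np.asarray(p_d) > 0
    c = np.asarray(p_c) > 0
    if d.shape != c.shape:
        raise ValueError("images must have the same shape")
    w = int(d.sum())                 # W: white pixels in the radiologist image
    if w == 0:
        raise ValueError("radiologist image contains no mass pixels")
    pm = int((d & c).sum())          # assumed PM: white in both images
    pmm = int((c & ~d).sum())        # assumed PMM: white in the algorithm only
    return 100.0 * pm / w, 100.0 * pmm / w


if __name__ == "__main__":
    # Tiny synthetic masks (not study data).
    radiologist = np.array([[255, 255, 255, 255, 0]], dtype=np.uint8)
    algorithm = np.array([[0, 255, 255, 255, 255]], dtype=np.uint8)
    print(tpr_fpr(radiologist, algorithm))
```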
In order to test the independence of the algorithm’s performance,
it was applied to a total of 23 mammograms prior to their being
presented to a second radiologist, which gave values of TPR2 and
FPR2. Values of TPR1, FPR1, TPR2, and FPR2 are shown in Tables 2
and 3.

Table 2: values of TPR1, TPR2, FPR1, FPR2.

Table 3: values of TPR, FPR when masses were categorized as
malignant and nonmalignant.
Discussion
The mammograms used to build this algorithm were obtained from three hospitals and clinics. The algorithm was able to detect masses as small as 4.8 mm in size, which corresponds to a mass age of 1-2 years depending on the type of tumor. This suggests that the algorithm may help in the early detection of breast cancer and, as a result, increase the chance of cure. To test the accuracy of the algorithm, 23 mammograms not used in building the algorithm were diagnosed using the algorithm prior to being presented to the first radiologist. The structure of the algorithm allows the user (i.e. the radiologist) to use the whole process or part of it as required. For example, an expert radiologist may only need to enhance brightness or contrast.
Radiologists in general include a safety factor when they diagnose mammograms. This factor represents an extra area around the mass. The first radiologist suggested that this factor is about 20% of mass size. On that basis, the parameters of the algorithm were chosen to give an outcome 20% smaller than the radiologist’s diagnosis. When using the algorithm to detect masses, the radiologist may then add this extra factor depending on the case.
All 82 mammograms were presented to the second radiologist to compare his diagnosis with that of the first radiologist. The outcome of this comparison was TPRd and FPRd. The mean and the standard deviation were calculated for all values of TPR and FPR for all combinations, as shown in Table 4.

Table 4: Values of the mean and standard deviation for TPR, FPR.
Statistically, there were no significant differences between the
values of TPR1 and TPR2.

Figure 8: Mass size according to the algorithm vs. mass size according to the first radiologist.
To test whether accuracy depends on mass size, we examined mass size according to the algorithm as a function of mass size according to the first and second radiologists, as shown in Figures 8 and 9 respectively.

Figure 9: Mass size according to the algorithm vs. mass size according to the second radiologist.
The information in these figures suggests that the relationships
are linear. The figures also include the linear regression lines
along with the regression parameters. Values of the correlation
coefficient (r) larger than 0.9 indicate strong linear
relationships. The slope of the linear regression represents the
constant fractional difference between the variables. This means
that the fractional difference between mass size according to the
algorithm and that according to the radiologist is independent of
mass size. Figure 10 shows mass size according to the first
radiologist as a function of mass size according to the algorithm
for the 23 mammograms that were diagnosed by the algorithm prior
to being presented to the radiologist.
The figure also includes the linear regression line along with
the regression parameters.
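The regression analysis described here can be sketched with an ordinary least-squares line and the Pearson correlation coefficient. The mass sizes in the example are invented to show the calculation and are not taken from Figures 8 to 10. The function name is hypothetical.

```python
import numpy as np


def fit_line(x, y):
    """Least-squares line y = slope * x + intercept, plus Pearson r."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape or x.size < 2:
        raise ValueError("x and y must have the same length, at least 2")
    slope, intercept = np.polyfit(x, y, 1)
    r = np.corrcoef(x, y)[0, 1]
    return float(slope), float(intercept), float(r)


if __name__ == "__main__":
    # Invented sizes in pixels (assumption): the algorithm reports about
    # 80% of the size marked by the radiologist, with small deviations.
    radiologist_size = [400, 650, 900, 1200, 1800, 2500]
    algorithm_size = [330, 510, 725, 950, 1450, 2010]
    slope, intercept, r = fit_line(radiologist_size, algorithm_size)
    print(f"slope={slope:.3f} intercept={intercept:.1f} r={r:.3f}")
```

A slope close to 0.8 with a small intercept would correspond to the 20% safety factor mentioned earlier, and a high r indicates that this ratio holds across mass sizes.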

Figure 10: Mass size according to the first radiologist vs. mass size according to the algorithm
With a correlation coefficient (r) larger than 0.9, the figure
indicates a strong linear relationship. This demonstrates the
capability of the algorithm to correctly diagnose mammograms that
were not involved in building the algorithm.
The relationship between TPR and mass size as determined by the
first radiologist is shown in Figure 11. The values of TPR are
independent of mass size.

Figure 11: The relationship between TPR and mass size as determined by the first radiologist.
Conclusion
The experimental method suggested above is simple and employs
direct image processing techniques to detect abnormalities in the
breast tissue. The radiologists who assessed this work felt that
this technique would give a better clue for early detection of
breast cancer. The method can be further employed for the analysis
of mammograms in a Computer Aided Diagnosis (CAD) system.
Our future work includes the development of better image
processing techniques to identify microcalcifications in the
breast tissue, and the development of a CAD system for early
detection of breast cancer.
