DESIGN AND IMPLEMENTATION OF A
SOFTWARE PACKAGE FOR TOXICITY
ANALYSIS OF WATER SAMPLES ON
FRESHWATER ORGANISMS
WITH JAVA
By
BEIZHU
Bachelor of Engineering
Shanghai Jiao Tong University
Shanghai, China
1991
Submitted to the Faculty of the
Graduate College of the
Oklahoma State University
in partial fulfillment of the
requirements for
the degree of
MASTER OF SCIENCE
July 1997
Thesis Approved:
Thesis Advisor
Dean of the Graduate College
ACKNOWLEDGEMENTS
I would like to express my sincere appreciation to my advisor, Dr. Huizhu Lu, for
her encouragement and excellent guidance throughout my thesis work. Without her
support, motivation and patience, it would have been difficult to complete this work.
My special thanks go to Dr. Kathleen Kaplan and Dr. Jacques LaFrance for serving
on my graduate committee. Their valuable suggestions and support were very helpful
throughout the study.
TABLE OF CONTENTS
Chapter
I. INTRODUCTION
Motivation
Major Approaches
Organization
II. LITERATURE REVIEW
Statistics Analysis
Introduction to the Programming Language
Reasons for Choosing Java
III. ANALYSIS ALGORITHMS
Test Flow
Algorithms of Hypothesis Tests
Shapiro-Wilk's Test [1]
Bartlett's Test [1]
Dunnett's Test [1]
Bonferroni's T-Test [1]
Steel's Many-One Rank Test [1]
Wilcoxon Rank Sum Test [1]
Fisher's Exact Test [1]
Algorithm of Point Estimation
Linear Interpolation Method [1]
Data Transformation
Arc Sine Square Root Transformation [1]
Minimum Significant Difference [1]
IV. DESIGN AND IMPLEMENTATION
Structure Design
Implementation
The Model Module
The View Module
The Controller Module
V. CONCLUSIONS
BIBLIOGRAPHY
APPENDIX
LIST OF TABLES
Table
I. Format for Contingency Table
II. A Description of Base Class
III. A Description of ShapiroWilk Class
IV. A Description of Bartlett Class
V. A Description of Bonferroni Class
VI. A Description of Dunnett Class
VII. A Description of Steel Class
VIII. A Description of Wilcoxon Class
IX. A Description of Fisher Class
X. A Description of Interpolation Class
LIST OF FIGURES
Figure
1. Flow Chart for Statistical Analysis of Fathead Minnow Embryo-Larval Data [1]
2. Flow Chart for Statistical Analysis of Ceriodaphnia Reproduction Data [1]
3. Flow Chart of Statistical Analysis of Fathead Minnow Larval Survival Data [1]
4. Flow Chart of Statistical Analysis of Fathead Minnow Larval Growth Data [1]
5. Flow Chart for Statistical Analysis of Algal Growth Response Data [1]
6. Flow Chart for Statistical Analysis of Ceriodaphnia Survival Data [1]
7. Connections between UI and Statistics
8. HTML Code for the Project
9. The Input Window
10. The Input Window with Data
11. The Report Window
12. The Chart Window
13. Choice Fields of Test Type
14. Choice Fields of Analysis Type
15. Load an Existing Data File
16. A Report for Fathead Minnow Larval Growth Test
17. A Chart for Fathead Minnow Larval Growth Test Result
Chapter 1
INTRODUCTION
Motivation
It is a national goal that the discharge of toxic pollutants in toxic amounts be
prohibited [1]. Extensive effluent toxicity tests should be conducted to achieve this goal.
Toxicity tests are used to measure effluent toxicity and to estimate the safe concentration of
toxic effluents in receiving water [1]. The Environmental Protection Agency (EPA) has
developed a set of standard methods to measure the toxicity of pollutants. It is therefore
useful to provide user-friendly analysis tools that automate this process and help
researchers obtain results accurately and efficiently.
Major Approaches
The objective of this thesis is to design and implement a software package with
Java to automate toxicity analysis. The implementation includes a user-friendly
graphical interface and a set of methods for toxicity analysis of water samples on freshwater
organisms. This enables average users who do not have much knowledge of statistics or
computer science to take full advantage of the tools. With the features of Java and today's
Internet and Intranet technologies, the analysis tools can be posted on a network. Users
who have a Java-capable Web browser installed can do the analysis without having the
tools installed locally. This feature makes the analysis software more portable, and also
greatly reduces the maintenance cost. These advantages will help the EPA to carry out its
policy much more easily than in the past.
Organization
Chapter 2 reviews the toxicity analysis methods and the Java programming language.
Chapter 3 provides flow charts of six analysis methods, together with a detailed
description of each test as documented by the EPA. Chapter 4 explains the design and
implementation of the program, including the design of the overall program structure and
the classes that implement the tests. Chapter 5 summarizes the thesis, including the major
advantages of the project.
Chapter 2
LITERATURE REVIEW
Statistics Analysis
According to Short-Term Methods for Estimating the Chronic Toxicity of Effluents
and Receiving Waters to Freshwater Organisms [1] published by the EPA, hypothesis tests
and point estimates are used to analyze toxicity test data. These tests are used to determine
the highest "safe" or "no-effect" concentration from the test data. The following endpoints
are used to determine it.
1. No-Observed-Effect-Concentration (NOEC): NOEC is defined as the highest
concentration of toxicant, to which organisms are exposed in a full life-cycle or partial
life-cycle test, that causes no observable adverse effects on the test organisms [1].
2. Lowest-Observed-Effect-Concentration (LOEC): LOEC is defined as the lowest
concentration of toxicant, to which organisms are exposed in a life-cycle or partial life-cycle
test, which causes adverse effects on the test organisms [1].
3. Effective Concentration (EC): EC is a point estimate of the toxicant concentration that
would cause an observable adverse effect on a quantal, "all or nothing," response in a
given percent of the test organisms. For example, EC10 is the point estimate that would
cause an adverse effect in 10% of the organisms.
4. Lethal Concentration (LC): LC is a point estimate of the toxicant concentration that would
cause an adverse effect in a given percent of the test population, where the adverse effect is
death. For example, LC20 is the point estimate that would cause death in 20% of the organisms.
5. Inhibition Concentration (IC): IC is a point estimate of the toxicant concentration that
would cause a given percent reduction in a nonquantal biological measurement.
NOEC and LOEC are determined by hypothesis tests: Dunnett's test, Bonferroni's
T-test, Steel's Many-One Rank test, and Wilcoxon Rank Sum test. LC, EC, and IC are
determined by point estimation techniques: Probit Analysis or Linear Interpolation Method.
Before a hypothesis test is run, it must be determined whether a parametric or a
nonparametric test applies. If the data are normally distributed and the variances are
homogeneous, a parametric test is performed. Otherwise, a nonparametric test is performed.
A parametric test is preferred because the analysis is performed on the observed data
rather than on ranks of the data, which increases its credibility [1].
Point estimation by probit analysis or linear interpolation is used to predict the
toxicant concentration at a predefined mortality or inhibition percentage [1].
Introduction to the Programming Language
The Internet, a global network of computers that communicate using a common
protocol, consists of millions of hosts around the world. It provides users with immediate
access to information and communication facilities, and it plays an important part in many
businesses. The Web consists of Web pages, which are located on Internet sites. The Web has
proved to be well suited to distributing information to widely dispersed users.
With additional graphics, image maps and forms, Web pages may become interactive, and
Java is an efficient language and tool to achieve this goal. A brief introduction to Java is
given below.
Java is an object-oriented programming language developed by Sun Microsystems,
Inc. Java is designed to be portable, i.e., Java executable files can be moved easily from
one computer system to another without recompiling. Platform independence is one of the
most significant advantages that Java has over other programming languages. More
importantly, it enables Java programs to work over a network with a Java-capable browser.
Java is a high-level programming language similar to C and C++, but it adds features that
C and C++ do not have, such as garbage collection and multithreading. Java also supports
most of what programmers require, which makes it suitable for almost any application
programming task [6]. Compared to C and C++, it eliminates most of their complex parts.
For example, there are no pointers in Java. Thus, it is easier to write and easier to
debug [6]. In addition, the Java virtual machine has built-in restrictions that prevent most
traditional ways of causing damage to client systems. Because Java is object-oriented,
programs are designed in terms of objects. Each object has a specific role in the program,
and objects talk to each other in predefined ways, which allows abstract data types to be
created and used easily. Java is capable of producing flexible, modular and reusable code.
Java also has classes to support user interface functions.
Java programs fall into two groups: applications and applets. Java applications are
general programs written in Java. Java applets are a special kind of Java program that can be
downloaded from a Web server and executed by a Java-capable Web browser on a client
machine. This gives Java applets the advantage of the structure a browser supports,
such as the graphics context, event handling, the existing window and the surrounding user
interface.
We are truly in an information society. Computers are more popular than ever. With
the increasing number of computers, the total cost of ownership is becoming a more sensitive
issue for large organizations. One way to address this problem is to construct a network with
one or more powerful centralized servers and relatively weak clients. Since the major
maintenance work is done on the server, the cost of ownership of the majority of clients is
low. Java fits into this model very well. A Java applet is installed and maintained on a
server. When other computer systems connected to the network need to execute such a Java
program, they download the program over the network and execute it locally with a
Java-capable browser. This also helps to offload computing tasks from the servers.
However, Java has its own disadvantages. Compared to C and C++, its execution
speed is relatively slow because the object code (bytecode) generated by a Java compiler is
intermediate code for the Java virtual machine, not for the real target machine. There
are several remedies, such as a just-in-time compiler, which converts Java
bytecode into native machine code as it loads on a client machine.
Reasons for Choosing Java
After analyzing the requirements from the EPA, the author chose Java as the
programming language for the implementation for the following reasons:
1. Java is an object-oriented language. The object-oriented approach attempts to
manage the complexity inherent in real-world problems by abstracting out knowledge
and encapsulating it within objects. The resulting programs are more natural to organize
and maintain than those written in traditional procedural languages.
2. Java includes a library called the Abstract Windowing Toolkit (AWT), which provides a
set of building blocks for user interfaces. It provides a common base across all
platforms that support Java. It enables programmers to write one version of a user
interface that appears identically on all client platforms.
3. Java is designed around a generic virtual machine. Its compiled code is stored as
bytecode of the virtual machine. Therefore, its object code is not tied to any particular
hardware and is platform independent. With this feature and networking technology,
it is possible to use Java as a means to deliver programs in their executable form,
namely by passing Java bytecode over the Web to a browser that is capable of executing
the program.
Chapter 3
ANALYSIS ALGORITHMS
Test Flow
The EPA has suggested the following six experiments for analyzing toxicity:
1. Fathead Minnow Embryo-Larval Survival and Teratogenicity test;
2. Ceriodaphnia Reproduction test;
3. Fathead Minnow Larval Survival test;
4. Fathead Minnow Larval Growth test;
5. Algal Growth test; and
6. Ceriodaphnia Survival test.
Their flow charts are presented in Figure 1 to Figure 6, followed by the algorithms for the
tests that appear in them. They are based on the EPA documentation, Short-Term Methods for
Estimating the Chronic Toxicity of Effluents and Receiving Waters to Freshwater
Organisms [1].
(Flow chart boxes: Total Mortality: Total Number of Dead Embryos, Dead Larvae and Deformed
Larvae; Linear Interpolation; Endpoint Estimate EC1, EC5, EC10, EC50; Arcsin Transformation;
Normal Distribution; Non-Normal Distribution; Homogeneous Variance; Heterogeneous Variance;
Dunnett's Test; T-Test with Bonferroni Adjustment; Steel's Many-One Rank Test; Wilcoxon Rank
Sum Test with Bonferroni Adjustment; Endpoint Estimates NOEC, LOEC.)
Figure 1: Flow Chart for Statistical Analysis of Fathead Minnow Embryo-Larval Data [1]
(Flow chart boxes: Reproduction Data: Number of Young Produced; Linear Interpolation;
Endpoint Estimate IC25, IC50; Hypothesis Testing (Excluding Concentrations Above NOEC for
Survival); Normal Distribution; Non-Normal Distribution; Homogeneous Variance;
Heterogeneous Variance; Dunnett's Test; T-Test with Bonferroni Adjustment; Steel's Many-One
Rank Test; Wilcoxon Rank Sum Test with Bonferroni Adjustment; Endpoint Estimates NOEC, LOEC.)
Figure 2: Flow Chart for Statistical Analysis of Ceriodaphnia Reproduction Data [1]
(Flow chart boxes: Survival Data: Proportion Surviving; Linear Interpolation; Endpoint
Estimate LC1, LC5, LC10, LC50; Arcsin Transformation; Normal Distribution; Non-Normal
Distribution; Homogeneous Variance; Heterogeneous Variance; Dunnett's Test; T-Test with
Bonferroni Adjustment; Steel's Many-One Rank Test; Wilcoxon Rank Sum Test with Bonferroni
Adjustment; Endpoint Estimates NOEC, LOEC.)
Figure 3: Flow Chart of Statistical Analysis of Fathead Minnow Larval Survival Data [1]

(Flow chart boxes: Growth Data: Mean Weight; Linear Interpolation; Endpoint Estimate IC25,
IC50; Hypothesis Testing; Normal Distribution; Non-Normal Distribution; Homogeneous
Variance; Heterogeneous Variance; Dunnett's Test; T-Test with Bonferroni Adjustment; Steel's
Many-One Rank Test; Wilcoxon Rank Sum Test with Bonferroni Adjustment; Endpoint Estimates
NOEC, LOEC.)
Figure 4: Flow Chart of Statistical Analysis of Fathead Minnow Larval Growth Data [1]

(Flow chart boxes: Growth Response Data: Cells/mL; Normal Distribution; Non-Normal
Distribution; Homogeneous Variance; Heterogeneous Variance; Dunnett's Test; T-Test with
Bonferroni Adjustment; Steel's Many-One Rank Test; Wilcoxon Rank Sum Test with Bonferroni
Adjustment; Endpoint Estimates NOEC, LOEC.)
Figure 5: Flow Chart for Statistical Analysis of Algal Growth Response Data [1]

(Flow chart boxes: Survival Data: Proportion Surviving; Linear Interpolation; Endpoint
Estimate LC1, LC5, LC10, LC50; Fisher's Exact Test; Endpoint Estimate NOEC, LOEC.)
Figure 6: Flow Chart for Statistical Analysis of Ceriodaphnia Survival Data [1]
Algorithms of Hypothesis Tests
Shapiro-Wilk's test, Bartlett's test, Dunnett's test, Bonferroni's T-test, Steel's Many-One
Rank test and Wilcoxon Rank Sum test with Bonferroni Adjustment are the hypothesis tests
used in Figure 1 to Figure 5. The algorithms for those tests are described
below according to the EPA document [1].
Shapiro-Wilk's Test [1]
Shapiro-Wilk's test is used to check whether the data are normally distributed or
not. The algorithm of the test is given below.
Step 1: For the total n observations, calculate the centered observations, $X_i$, $i = 1, 2, \ldots, n$,
by subtracting the mean of all observations within a concentration from each observation in
that concentration.
Step 2: Calculate the overall mean of the centered observations, $\bar{Y}$.
Step 3: Calculate the denominator, D, for the test statistic:
$$D = \sum_{i=1}^{n} (X_i - \bar{Y})^2,$$
where n is the total number of observations.
Step 4: Order the centered observations in ascending order and denote them as
$X^{(1)}, X^{(2)}, \ldots, X^{(n)}$, where $X^{(i)}$ is the ith ordered observation.
Step 5: Get the coefficients, $C_i$, $i = 1, 2, \ldots, k$, from the table, where k is approximately
equal to n/2.
Step 6: Compute the test statistic, W, as follows:
$$W = \frac{1}{D} \left[ \sum_{i=1}^{k} C_i \left( X^{(n-i+1)} - X^{(i)} \right) \right]^2,$$
where k is approximately n/2.
Step 7: Find the critical value at significance level 0.01 or 0.05 and total observations, n, in
the table given by the EPA. If the computed W is greater than or equal to the critical value,
then the data are normally distributed. Otherwise, the data are not normally distributed.
(Notes: The calculated W must be greater than zero and less than or equal to one. This test
is recommended for a sample size of 50 or less.)
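Steps 1 to 6 can be sketched in Java as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, the coefficients $C_i$ are passed in by the caller because they come from the EPA table, and the comparison with the critical value (Step 7) is left out.

```java
import java.util.Arrays;

// Hypothetical sketch of Steps 1 to 6; not the thesis implementation.
final class ShapiroWilkSketch {
    /**
     * groups[c] holds the replicate observations of concentration c.
     * coefficients holds C_1..C_k taken from the published table (k is about n/2).
     */
    static double wStatistic(double[][] groups, double[] coefficients) {
        int n = 0;
        for (double[] g : groups) n += g.length;

        // Step 1: center each observation on the mean of its own concentration.
        double[] x = new double[n];
        int pos = 0;
        for (double[] g : groups) {
            double mean = Arrays.stream(g).average().orElse(0.0);
            for (double v : g) x[pos++] = v - mean;
        }

        // Step 2: overall mean of the centered observations.
        double overallMean = Arrays.stream(x).average().orElse(0.0);

        // Step 3: denominator D (zero if all centered values are equal; not handled here).
        double d = 0.0;
        for (double v : x) d += (v - overallMean) * (v - overallMean);

        // Step 4: ascending order.
        Arrays.sort(x);

        // Steps 5 and 6: weighted differences of mirrored order statistics.
        double sum = 0.0;
        for (int i = 0; i < coefficients.length; i++) {
            sum += coefficients[i] * (x[n - 1 - i] - x[i]);
        }
        return sum * sum / d;
    }
}
```

A caller would compare the returned W with the tabulated critical value for n observations at the chosen significance level.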
Bartlett's Test [1]
Bartlett's test is used to determine the homogeneity of variance, and is based on the
assumption that the data are normally distributed. The algorithm of the test is described
below.
Notation:
$n_i$: the number of replicates for concentration i.
p: the number of concentrations including the control.
$S_i^2$: the variance of group i.
Step 1: Calculate
$$\mathit{sv} = \sum_{i=1}^{p} (n_i - 1), \quad \mathit{dsv} = \sum_{i=1}^{p} \frac{1}{n_i - 1}, \quad S^2 = \frac{\sum_{i=1}^{p} S_i^2 (n_i - 1)}{\mathit{sv}},$$
$$\mathit{cosv} = \sum_{i=1}^{p} (n_i - 1) \ln S_i^2, \quad C = 1 + \frac{1}{3(p-1)} \left( \mathit{dsv} - \frac{1}{\mathit{sv}} \right), \quad B = \frac{\mathit{sv} \cdot \ln S^2 - \mathit{cosv}}{C}.$$
Step 2: Find the critical value with significance level 0.01 or 0.05 and p-1 degrees of
freedom in the table given by the EPA. If B is less than the critical value, then the
homogeneity of variance is satisfied. Namely, the variances are equal.
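As an illustration, Step 1 might be coded as below. The class and method names are hypothetical, the group variances are computed from the raw replicates with the usual n-1 divisor, and the table lookup of Step 2 is not included.

```java
// Hypothetical sketch of Step 1 of Bartlett's test; not the thesis implementation.
final class BartlettSketch {
    /** Sample variance of one group, using the n-1 divisor. */
    static double sampleVariance(double[] g) {
        double mean = 0.0;
        for (double v : g) mean += v;
        mean /= g.length;
        double ss = 0.0;
        for (double v : g) ss += (v - mean) * (v - mean);
        return ss / (g.length - 1);
    }

    /** groups[i] holds the replicates of concentration i (control included). */
    static double bStatistic(double[][] groups) {
        int p = groups.length;
        double sv = 0.0;        // sum of (n_i - 1)
        double dsv = 0.0;       // sum of 1 / (n_i - 1)
        double pooledNum = 0.0; // sum of S_i^2 * (n_i - 1)
        double cosv = 0.0;      // sum of (n_i - 1) * ln S_i^2
        for (double[] g : groups) {
            double df = g.length - 1;
            double var = sampleVariance(g); // a zero variance would make ln undefined
            sv += df;
            dsv += 1.0 / df;
            pooledNum += var * df;
            cosv += df * Math.log(var);
        }
        double pooled = pooledNum / sv;
        double c = 1.0 + (dsv - 1.0 / sv) / (3.0 * (p - 1));
        return (sv * Math.log(pooled) - cosv) / c;
    }
}
```

When all group variances are equal, the numerator vanishes and B is zero, which is a convenient sanity check for an implementation.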
Dunnett's Test [1]
Dunnett's test is used to determine whether the mean for the ith concentration
differs from the mean for the control. Namely, it compares each concentration mean
with the control mean to decide whether any of the concentrations differs from the control.
This test can detect a significant reduction in mean weight if there is one. The test
requires that the data are normally distributed and that the variances of the data obtained
from each toxicant concentration and the control are equal [1]. The number of replicates
for each concentration is also required to be equal. Otherwise, Bonferroni's T-test is used
as an alternative. From the results of Dunnett's test, the NOEC and the LOEC for growth can
be determined.
The algorithm of Dunnett's test is presented below according to the EPA
document [1].
Notation:
$n_i$: the number of replicates for concentration i.
$T_i$: the total of the replicate measurements for concentration i.
G: the grand total of all sample observations.
N: the total sample size (the number of observations).
$Y_{ij}$: the jth observation for concentration i.
k: the number of groups including the control.
SST: total sum of squares.
SSB: between sum of squares.
SSW: within sum of squares.
Step 1: Calculate
$$T_i = \sum_{j=1}^{n_i} Y_{ij}, \quad G = \sum_{i=1}^{k} T_i, \quad SST = \sum_{i=1}^{k} \sum_{j=1}^{n_i} Y_{ij}^2 - \frac{G^2}{N},$$
$$SSB = \sum_{i=1}^{k} \frac{T_i^2}{n_i} - \frac{G^2}{N}, \quad SSW = SST - SSB, \quad S_W = \sqrt{\frac{SSW}{N-k}}.$$
Step 2: Calculate the statistic
$$t_i = \frac{\bar{Y}_1 - \bar{Y}_i}{S_W \sqrt{\frac{1}{n_1} + \frac{1}{n_i}}}$$
for each concentration i and the control, where $\bar{Y}_1$ is the control mean, $\bar{Y}_i$ is the
mean for concentration i, $S_W$ is defined as in Step 1, $n_1$ is the number of replicates in the
control, and $n_i$ is the number of replicates for concentration i.
Step 3: Get the critical value with significance level 0.01 or 0.05 and N-k degrees of
freedom in the table. For every $t_i$, if $t_i$ is greater than the critical value, then group i (i.e.,
concentration i) is significantly different from the control. Namely, concentration i has
significantly lower growth than the control.
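Steps 1 and 2 can be sketched as follows. The names are hypothetical, the control is assumed to be stored as the first group, and the critical-value lookup of Step 3 is left to the caller.

```java
// Hypothetical sketch of Steps 1 and 2; not the thesis implementation.
final class ManyOneT {
    /**
     * groups[0] is the control. Returns t_i for every treatment group;
     * element 0 is unused and stays 0.
     */
    static double[] tStatistics(double[][] groups) {
        int k = groups.length;
        int n = 0;                 // N: total sample size
        double grandTotal = 0.0;   // G
        double sumSquares = 0.0;   // sum of all Y_ij^2
        double ssb = 0.0;          // accumulates T_i^2 / n_i
        double[] means = new double[k];
        for (int i = 0; i < k; i++) {
            double total = 0.0;    // T_i
            for (double y : groups[i]) {
                total += y;
                sumSquares += y * y;
            }
            means[i] = total / groups[i].length;
            ssb += total * total / groups[i].length;
            grandTotal += total;
            n += groups[i].length;
        }
        double correction = grandTotal * grandTotal / n; // G^2 / N
        double sst = sumSquares - correction;
        ssb -= correction;
        double ssw = sst - ssb;
        double sw = Math.sqrt(ssw / (n - k));            // S_W

        double[] t = new double[k];
        for (int i = 1; i < k; i++) {
            double spread = Math.sqrt(1.0 / groups[0].length + 1.0 / groups[i].length);
            t[i] = (means[0] - means[i]) / (sw * spread);
        }
        return t;
    }
}
```

Because the statistic does not assume equal replicate counts, the same computation can serve both tests that use this formula; only the table of critical values differs.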
Bonferroni's T-Test [1]
Bonferroni's T-test is used as an alternative to Dunnett's test when the number of
replicates is not the same for all concentrations. Like Dunnett's test, Bonferroni's T-test is
based on the assumptions that (i) the data are normally distributed and (ii) the variances
are homogeneous. The result of Bonferroni's T-test is also used to determine the NOEC and LOEC.
The algorithm of Bonferroni's T-test is presented below. The notation used in this
algorithm is the same as in Dunnett's test.
Step 1: Calculate $T_i = \sum_{j=1}^{n_i} Y_{ij}$, $G = \sum_{i=1}^{k} T_i$, $N = \sum_{i=1}^{k} n_i$.
Step 2: Calculate $SST = \sum_{i=1}^{k} \sum_{j=1}^{n_i} Y_{ij}^2 - \frac{G^2}{N}$, $SSB = \sum_{i=1}^{k} \frac{T_i^2}{n_i} - \frac{G^2}{N}$, $SSW = SST - SSB$.
Step 3: Calculate $S_W = \sqrt{\frac{SSW}{N-k}}$.
Step 4: Calculate $t_i$, the statistic for each concentration and the control:
$$t_i = \frac{\bar{Y}_1 - \bar{Y}_i}{S_W \sqrt{\frac{1}{n_1} + \frac{1}{n_i}}}.$$
Step 5: Find the critical value with N-k degrees of freedom in the Bonferroni's T table. If $t_i$
is greater than the critical value, then group i is significantly different from the control.
Namely, the mean of concentration i is significantly less than the control mean.
Steel's Many-One Rank Test [1]
If the data are not normally distributed and/or the variances are not equal, then Dunnett's
test and Bonferroni's T-test cannot be used. In those cases, Steel's Many-One Rank test
can be carried out if the number of replicates is the same for each concentration.
Otherwise, Wilcoxon Rank Sum test is used.
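The selection rules stated so far (parametric tests require normality and homogeneous variances, and Dunnett's and Steel's tests additionally require equal replicate counts) can be summarized in a small decision function. The class, enum and parameter names are hypothetical; the three inputs are assumed to come from Shapiro-Wilk's test, Bartlett's test and a comparison of the replicate counts.

```java
// Hypothetical helper; names are illustrative and not taken from the thesis code.
final class TestSelector {
    enum HypothesisTest { DUNNETT, BONFERRONI_T, STEEL_MANY_ONE, WILCOXON_RANK_SUM }

    /**
     * Chooses the hypothesis test used to determine NOEC and LOEC.
     *
     * @param normal          outcome of the normality check
     * @param homogeneous     outcome of the homogeneity-of-variance check
     * @param equalReplicates true if every concentration has the same number of replicates
     */
    static HypothesisTest choose(boolean normal, boolean homogeneous, boolean equalReplicates) {
        if (normal && homogeneous) {
            // Parametric branch.
            return equalReplicates ? HypothesisTest.DUNNETT : HypothesisTest.BONFERRONI_T;
        }
        // Nonparametric branch.
        return equalReplicates ? HypothesisTest.STEEL_MANY_ONE : HypothesisTest.WILCOXON_RANK_SUM;
    }
}
```

Keeping the rule in one place means a change to the flow affects a single function rather than each analysis type.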
Steel's Many-One Rank test is a multiple comparison method for comparing
several treatments with a control. The data are ranked, and the analysis is performed on the
ranks rather than on the data themselves. At least four replicates per
toxicant concentration are needed to use Steel's test. This is a nonparametric test, and
therefore the assumptions of normality and homogeneity do not need to be met. The sensitivity
of the test cannot be stated in terms of the minimum difference between treatment means and the
control mean.
The algorithm of Steel's Many-One Rank test is described as follows [1].
Step 1: Combine the data and arrange the observations in order of size from the smallest to
the largest. Assign ranks to the ordered observations.
Step 2: Calculate the sum of the ranks, $R_i$, at each concentration and the control.
Step 3: For each $R_i$, if $R_i$ is less than or equal to the critical rank sum in the table of Steel's
Many-One Rank Sum test, then group i is significantly different from the control. Test
results are used to determine the NOEC and LOEC.
Wilcoxon Rank Sum Test [1]
Wilcoxon's Rank Sum Test is used as an alternative to Steel's Many-One Rank
Test when the numbers of replicates are not the same at each concentration. The Bonferroni
adjustment sets an upper bound of alpha on the overall error rate, in contrast to Steel's
Many-One Rank test. Thus, Steel's test is the more powerful test.
The algorithm is described as follows.
Step 1: Combine the data and arrange the observations in order of size from the smallest
to the largest. Assign ranks to the ordered observations.
Step 2: Calculate the sum of the ranks, $R_i$, at each concentration and the control.
Step 3: For each rank sum $R_i$, find the critical rank sum at the significance level 0.05 or 0.01
from the table of Wilcoxon Rank Sum test. If $R_i$ is less than or equal to the critical rank
sum, then group (concentration) i is significantly different from the control. The NOEC and
LOEC can be determined from the test result.
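Steps 1 and 2, which the two rank tests share, can be sketched as below. Two assumptions are made that the text does not state: the control is combined with one concentration at a time, and tied observations receive the average of the ranks they occupy. The names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch of the ranking steps; not the thesis implementation.
final class RankSums {
    /** Returns {rank sum of control, rank sum of treatment} for the combined sample. */
    static double[] rankSums(double[] control, double[] treatment) {
        int n = control.length + treatment.length;
        double[] all = new double[n];
        System.arraycopy(control, 0, all, 0, control.length);
        System.arraycopy(treatment, 0, all, control.length, treatment.length);

        double[] sorted = all.clone();
        Arrays.sort(sorted);

        double[] sums = new double[2];
        for (int i = 0; i < n; i++) {
            int group = i < control.length ? 0 : 1;
            sums[group] += averageRank(sorted, all[i]);
        }
        return sums;
    }

    /** Average of the 1-based positions that value occupies in the sorted array. */
    static double averageRank(double[] sorted, double value) {
        int first = -1;
        int last = -1;
        for (int i = 0; i < sorted.length; i++) {
            if (sorted[i] == value) {
                if (first < 0) first = i;
                last = i;
            }
        }
        return (first + last) / 2.0 + 1.0;
    }
}
```

The treatment rank sum returned here is the quantity compared with the tabulated critical rank sum in Step 3.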
Fisher's Exact Test [1]
Fisher's Exact test is a statistical method based on the hypergeometric probability
distribution that can be used to test whether the proportion of successes is the same in two
Bernoulli populations [1]. The test is used for Ceriodaphnia Survival data as shown in Figure 6.
To perform this test, each replicate value must be between 0 and 15.
            Number of Successes    Number of Failures    Number of Observations
Row 1       a                      A - a                 A
Row 2       b                      B - b                 B
Total       a + b                  (A + B) - a - b       A + B
Table I: Format for Contingency Table
Arrange the contingency table (Table I) so that the total number of observations for
row one is greater than or equal to the total number for row two ($A \ge B$). Categorize a
success such that the proportion of successes for row one is greater than or equal to the
proportion of successes for row two ($a/A \ge b/B$). Then find the critical value for A, B, a,
and the significance level 0.05 or 0.01 from the table. If b is less than or equal to the critical
value, then the group is significantly different from the control.
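The procedure above reads a critical value from a table. As a complementary illustration only, and not the table lookup itself, the one-sided probability behind such a table can be computed directly from the hypergeometric distribution with the row and column totals held fixed. The class and method names are hypothetical.

```java
// Hypothetical sketch; computes a tail probability instead of using the critical-value table.
final class FisherTail {
    /** Binomial coefficient C(n, r) as a double; adequate for small counts. */
    static double choose(int n, int r) {
        if (r < 0 || r > n) return 0.0;
        double c = 1.0;
        for (int i = 1; i <= r; i++) {
            c = c * (n - r + i) / i;
        }
        return c;
    }

    /**
     * Probability that row two shows b or fewer successes, given row sizes
     * bigA and bigB and a total of a + b successes.
     */
    static double lowerTail(int bigA, int bigB, int a, int b) {
        int successes = a + b;
        double total = choose(bigA + bigB, successes);
        double p = 0.0;
        for (int x = 0; x <= b; x++) {
            // x successes fall in row two, the remaining ones in row one.
            p += choose(bigB, x) * choose(bigA, successes - x) / total;
        }
        return p;
    }
}
```

A small tail probability corresponds to the situation in which b falls at or below the tabulated critical value.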
Algorithm of Point Estimation
The LC, EC, or IC is derived from a mathematical model that assumes a continuous
dose-response relationship. For this reason, any LC, EC, or IC value is an estimate
of some amount of adverse effect. Thus, using a point estimate such as an LC, EC, or IC to
determine a "safe" concentration would require biologists or toxicologists to specify
what level of adverse effect would be deemed acceptable or "safe". Point
estimation techniques have the advantage of providing a point estimate of the toxicant
concentration causing a given amount of adverse (inhibiting) effect.
The linear interpolation method is used for the point estimate of the effluent or
other toxicant concentration that causes a given percent reduction (e.g., 25%, 50%, etc.) in
the reproduction or growth of the test organisms (Inhibition Concentration, or IC). The
linear interpolation method assumes that the responses are monotonically nonincreasing,
that is, the mean response for each higher concentration is less than or equal to the mean
response for the previous concentration. If the data are not monotonically nonincreasing,
they are adjusted by smoothing (averaging). The IC is estimated by linear interpolation
between the two concentrations whose responses bracket the response of interest, the p percent
reduction from the control. To obtain the estimate, determine the concentrations $C_j$ and $C_{j+1}$
which bracket the response $M_1 (1 - p/100)$, where $M_1$ is the smoothed control mean response
and p is the percent reduction in response relative to the control response. The algorithm of
the linear interpolation method used for the point estimate is presented below according to the
EPA document [1].
Linear Interpolation Method [1]
Step 1: Calculate the smoothed means by averaging adjacent means as follows. Let $Y_1$ be the
control mean. If $Y_{i+1}$ is less than or equal to $Y_i$, then $Y_{i+1}$ is used unchanged. Otherwise,
the average of $Y_i$ and $Y_{i+1}$ is used as the new mean for groups i and i+1. If $Y_{i+2}$ is greater
than this new mean, the average of $Y_i$, $Y_{i+1}$ and $Y_{i+2}$ is used as the new mean for groups i,
i+1 and i+2. This continues until all means are in nonincreasing order.
Step 2: Calculate ICp as follows:
$$ICp = C_j + \left[ M_1 \left( 1 - \frac{p}{100} \right) - M_j \right] \cdot \frac{C_{j+1} - C_j}{M_{j+1} - M_j},$$
where
ICp: the estimated concentration causing a p percent reduction,
$C_j$: the tested concentration whose observed mean response is greater than $M_1 (1 - p/100)$,
$C_{j+1}$: the tested concentration whose observed mean response is less than $M_1 (1 - p/100)$,
$M_1$: the smoothed mean for the control,
$M_j$: the smoothed mean for concentration j,
$M_{j+1}$: the smoothed mean for concentration j+1.
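The two steps can be sketched as follows. The names are hypothetical, adjacent means are pooled with equal weights (which assumes equal replicate counts), and a mean equal to the target response is treated as the lower bracket so that the estimate falls exactly on a tested concentration.

```java
// Hypothetical sketch of smoothing and interpolation; not the thesis implementation.
final class LinearInterpolation {
    /** Step 1: pool adjacent means until the sequence is nonincreasing. */
    static double[] smooth(double[] means) {
        int n = means.length;
        double[] blockSum = new double[n];
        int[] blockSize = new int[n];
        int blocks = 0;
        for (double m : means) {
            blockSum[blocks] = m;
            blockSize[blocks] = 1;
            blocks++;
            // Merge while the newest block's average exceeds the one before it.
            while (blocks > 1
                    && blockSum[blocks - 1] / blockSize[blocks - 1]
                       > blockSum[blocks - 2] / blockSize[blocks - 2]) {
                blockSum[blocks - 2] += blockSum[blocks - 1];
                blockSize[blocks - 2] += blockSize[blocks - 1];
                blocks--;
            }
        }
        double[] out = new double[n];
        int idx = 0;
        for (int b = 0; b < blocks; b++) {
            double avg = blockSum[b] / blockSize[b];
            for (int c = 0; c < blockSize[b]; c++) out[idx++] = avg;
        }
        return out;
    }

    /**
     * Step 2: concentrations[0] and means[0] belong to the control.
     * Returns NaN when no pair of concentrations brackets the target response.
     */
    static double icp(double[] concentrations, double[] means, double p) {
        double[] m = smooth(means);
        double target = m[0] * (1.0 - p / 100.0);
        for (int j = 0; j + 1 < m.length; j++) {
            if (m[j] >= target && m[j + 1] < target) {
                return concentrations[j]
                        + (target - m[j]) * (concentrations[j + 1] - concentrations[j])
                          / (m[j + 1] - m[j]);
            }
        }
        return Double.NaN;
    }
}
```

For instance, with concentrations 0, 10, 20, 40 and mean responses 100, 90, 60, 20, the 25 percent target response of 75 lies between the responses at 10 and 20, and the interpolated estimate is 15.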

Data Transformation
When the assumptions of normality and homogeneity of variance are not met, the input
data may be transformed by an Arc Sine Square Root Transformation. The data may then be
analyzed by parametric procedures, rather than by a nonparametric technique such as Steel's
Many-One Rank Test or Wilcoxon's Rank Sum Test. After the data have been transformed,
Shapiro-Wilk's and Bartlett's tests should be performed on the transformed observations to
determine whether the assumptions of normality and/or homogeneity of variance are
met [1].
Arc Sine Square Root Transformation [1]
The Arc Sine Square Root Transformation is used in the Fathead Minnow Larval Survival analysis,
as shown in Figure 3, for hypothesis testing, which deals with survival proportions. For data
that are proportions between 0 and 1, the Arc Sine Square Root Transformation
($\arcsin \sqrt{P_i}$) is commonly used to stabilize the variance and satisfy the normality
requirements, where $P_i$ is the expected proportion (response/no response or live/dead) for
the treatment. A detailed explanation and examples of the Arc Sine Square Root Transformation
follow, based on the EPA document [1].
Step 1: Calculate the response proportion (RP) at each effluent concentration, where
RP = (number of dead or "affected" organisms) / (number exposed).
For example, if 8 of 20 animals in a given treatment die, RP = 8/20 = 0.4.
Step 2: Transform each RP to arc sine.
(1) If RP = 0.4, Angle = arc sine $\sqrt{0.4}$ = arc sine 0.6325 = 0.6847 radians.
(2) If RP = 0, a special modification is made as follows:
Angle (in radians) = arc sine $\sqrt{1/(4N)}$, where N = number of animals per treatment.
Assume 20 animals are used. Then, Angle = arc sine $\sqrt{1/80}$ = arc sine 0.1118 = 0.112
radians.
(3) If RP = 1, a special modification on the Angle is made as follows:
Angle = 1.5708 - (radians for RP = 0). Using the previous data, Angle = 1.5708 - 0.112 =
1.4588 radians.
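The transformation and its two special cases translate directly into code. In this sketch the class and method names are hypothetical, and the counts of affected and exposed organisms are taken as inputs so that the cases RP = 0 and RP = 1 can be detected exactly.

```java
// Hypothetical sketch of the transformation; not the thesis implementation.
final class ArcSineTransform {
    /** Returns the transformed angle in radians for affected out of exposed organisms. */
    static double angle(int affected, int exposed) {
        // Angle used when the response proportion is 0: arc sine of sqrt(1 / (4N)).
        double zeroAngle = Math.asin(Math.sqrt(1.0 / (4.0 * exposed)));
        if (affected == 0) {
            return zeroAngle;
        }
        if (affected == exposed) {
            // RP = 1: pi/2 (1.5708 radians) minus the angle for RP = 0.
            return Math.PI / 2.0 - zeroAngle;
        }
        return Math.asin(Math.sqrt((double) affected / exposed));
    }
}
```

With 20 exposed animals, the three worked cases above correspond to angle(8, 20), angle(0, 20) and angle(20, 20).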
Minimum Significant Difference [1]
The minimum significant difference (MSD) is used to determine the sensitivity of a test [1].
The MSD is calculated after either Dunnett's test or the T-test with Bonferroni Adjustment is used.
1. Calculate the MSD for each group: $MSD = c \, S_W \sqrt{1/N + 1/N_i}$, where c is the critical
value for either Dunnett's test or the T-test with Bonferroni Adjustment, $S_W$ is the square root
of the within mean square, N is the number of observations in the control, and $N_i$ is the number
of observations in group i.
2. If the data have been transformed, then the untransformed MSD, $MSD_u$, is calculated as in the
following example given in the EPA document [1].
An example: the transformed control mean is ControlMean = 0.714 and the MSD in
transformed units is 0.087.
(a) Calculate the control mean in untransformed units, UMSD.
UMSD = $[\sin(0.714)]^2$ = 0.429.
(b) Calculate the untransformed difference, DMSD.
DMSD = $[\sin(0.714 - 0.087)]^2$ = 0.344.
(c) Calculate the untransformed MSD, $MSD_u$.
$MSD_u$ = UMSD - DMSD = 0.429 - 0.344 = 0.085.
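Both calculations are short enough to sketch together. The names are hypothetical, and the critical value c is assumed to have been read from the appropriate table beforehand.

```java
// Hypothetical sketch of the MSD calculations; not the thesis implementation.
final class MsdSketch {
    /** MSD = c * S_W * sqrt(1/N + 1/N_i) for one treatment group. */
    static double msd(double criticalValue, double sw, int nControl, int nGroup) {
        return criticalValue * sw * Math.sqrt(1.0 / nControl + 1.0 / nGroup);
    }

    /**
     * Converts an MSD expressed in arc sine square root units back to a proportion,
     * following steps (a) to (c) of the example.
     */
    static double untransformedMsd(double transformedControlMean, double transformedMsd) {
        double control = Math.pow(Math.sin(transformedControlMean), 2);                 // (a)
        double reduced = Math.pow(Math.sin(transformedControlMean - transformedMsd), 2); // (b)
        return control - reduced;                                                        // (c)
    }
}
```

Calling untransformedMsd(0.714, 0.087) reproduces the worked example to the three decimal places shown there.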

Chapter 4
DESIGN AND IMPLEMENTATION
Structure Design
The thesis presents an object-oriented approach using Java for the toxicity analysis
of water samples on freshwater organisms. Object-oriented design decomposes a program
into objects. Each object knows how to perform its own operations and remembers its own
information. Objects also have a private side, which is not the
concern of other objects. Objects are therefore free to change their private sides without
affecting other objects. If the software has been designed with rigorous consistency,
interfaces can be extended and entities can be added. Programmers can add new entities
that respond to old requests in ways appropriate to the new system of which they are now
a part. If the interfaces between entities have been rigorously controlled, new portions of
the system can be created that use the same interfaces but do different things with them [9].
Thus, an object-oriented design can be easily reused, refined, tested, maintained, and
extended.
The whole project is divided into three independent parts: model (statistical
analysis), controller and view (graphical user interface) (Figure 7). According to the
heuristics implied by Jacobson's Objectory methodology, policy information should not
be placed inside the classes involved in the policy decision, because doing so renders them
unreusable by binding them to the domain that set the policy [9]. To realize this heuristic,
there should be a special type of class called a controller. It contains only behavior. It gets
its data from outside the class and is used to decouple classes from their policy. On the other
hand, controller classes do render their host classes more reusable, because those classes
are mindless. In our case, the design is as shown in Figure 7.
(Diagram boxes: View (User Interface); Controller; Model (Statistics Analysis). Labeled
operations: PassInput(), GetInput(), ValidateData().)
Figure 7: Connections between UI and Statistics
A controller handles the data passed between the view (user interface) and the model (statistics
analysis). For example, it checks the data received from the view and converts them into the
correct type (usually from string to float in this case). This type of design amounts to an
artificial separation of data and behavior in a bidirectionally related package. A controller
is clearly a useful facility. If changes are made to the statistical analysis part, only the
functions relating to it have to be changed, and the user interface part is left unchanged.
User interfaces display the internals of a model, allow a user to update those internals, and
put the internals back into the model.
27
The program is implemented in such a way that the same bytecode can be used as
either an applet or an application. An applet is a class derived from the Applet class. It
can only be run within a browser. An applet shares the same frame (window) with the
browser that launches it. The Java virtual machine first calls the init() method of the
applet, then start(). A Java application is executed in a different way. It uses a static
method main() as its entry point. It is not given a frame automatically. To let an applet
run in an application environment, we need to simulate the environment that a browser
provides. To do so, we create a frame in main(), then create an instance of the applet and
put it on the frame just created. main() should also call init() and start() of the applet.
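The start-up sequence described above can be sketched as follows. This is a minimal
sketch under stated assumptions, not the code of the package: the class name ToyPanel and
its fields are hypothetical, and java.awt.Panel stands in for Applet (Applet is itself a
subclass of Panel, and the applet API is deprecated in current JDKs). A browser would
call init() and start() itself; in application mode main() has to do it.

```java
import java.awt.BorderLayout;
import java.awt.Frame;
import java.awt.Panel;
import java.awt.event.WindowAdapter;
import java.awt.event.WindowEvent;

// Hypothetical stand-in for the applet class; Panel replaces Applet here.
class ToyPanel extends Panel {
    private boolean initialized;
    private boolean started;

    // One-time set-up, called first by the browser (or by main() below).
    public void init() {
        setLayout(new BorderLayout());
        initialized = true;
    }

    // Called after init(), once the component is ready to run.
    public void start() {
        if (!initialized) {
            throw new IllegalStateException("init() must be called before start()");
        }
        started = true;
    }

    public boolean isStarted() {
        return started;
    }

    // Application entry point: build the frame a browser would otherwise supply.
    public static void main(String[] args) {
        Frame frame = new Frame("Toy");
        ToyPanel toy = new ToyPanel();
        frame.add(toy, BorderLayout.CENTER);
        frame.setSize(600, 600);
        // Release the window when the user closes it.
        frame.addWindowListener(new WindowAdapter() {
            @Override
            public void windowClosing(WindowEvent e) {
                frame.dispose();
            }
        });
        // Reproduce the calls the browser would make, in the same order.
        toy.init();
        toy.start();
        frame.setVisible(true);
    }
}
```

Keeping init() and start() free of any reference to the frame is what lets the same
class be hosted by either a browser or main().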
Implementation
To run an applet, a Web page should be created. It tells the browser where and how
to load the program. The HTML file in the package is created for this purpose (Figure 8).
When running the application, we use the Java loader to load the program.
<html>
<head>
<title>Toy</title>
</head>
<body>
<hr>
<applet code=Toy.class id=Toy width=600 height=600>
</applet>
<hr>
</body>
</html>
Figure 8: HTML Code for the Project
Based on the structure design and the documents of the EPA, we create twelve
classes for the model module, five classes for the view module, and some other classes.
They are presented in detail below.
Toy: it is the entry point of the project.
AppFrame: when we run the project as an application, this is the frame object that is
created in main(). It is not used when we run the project as an applet.
The Model Module
The following twelve classes belong to the model module. Apart from BadDataException,
TestData, ModuleClass and Fisher, the remaining eight classes are BaseClass and the
classes derived from it, which specify the different tests.
1. BadDataException: this class defines an exception type for the project.
2. TestData: this class holds all test data. It also validates the test data when an instance of
the class is created.
3. ModuleClass: this class judges which single test should be used during the whole test
procedure. It controls the flow of the test.
4. BaseClass: it handles the basic functions of the statistical tests. For example,
GetInput(...) gets the input data, and DataTrans(...) handles the mathematical
transformation of test data before they are used for testing. All methods in BaseClass
can be inherited by its child classes. Table 2 is the description of BaseClass.
Methods                                            Attributes
BaseClass(int);                                    int lastRep;
int GetInput(float[], int, int, float, float[]);   int lastCct;
void DataTrans(int, int);                          int curRep;
int NumOfRep();                                    int curCct;
int CurRep();                                      int totalObs;
int CurCct();                                      float p_value;
int LastRep();                                     float mean[];
int LastCct();                                     float variance[];
float Mean(int);                                   float sumSquarerPerCct[];
float Variance(int);                               float squareSumPerCct[];
float SumSquare(int);                              float cctArray[];
float SquareSum(int);                              int missingDataFlagOn[];
int TotalNums();                                   float concctionLevel[];
int ConLevel(float[]);                             float inputData[][];
void SetCurRep(int);
void SetCurCct(int);
int ComputeGrandTotalObs();
float ComputeMean(int, int);
float ComputeVariance(int, int);
float ComputeSumSquare(int, int);
float ComputeSquareSum(int, int);
Table 2: A Description of Base Class
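The per-concentration statistics listed in Table 2 can be illustrated with a short
sketch. The class GroupStats and its signatures are hypothetical, and the reading of
"SumSquare" as the sum of the squared observations and "SquareSum" as the square of their
sum is an assumption drawn from the method names, not something the table states.

```java
// Hypothetical helper; names echo Table 2 but the signatures are assumed.
final class GroupStats {

    // Arithmetic mean of the replicates of one concentration.
    static double mean(double[] reps) {
        double total = 0.0;
        for (double x : reps) {
            total += x;
        }
        return total / reps.length;
    }

    // Sum of squares: x1^2 + x2^2 + ... (assumed meaning of "SumSquare").
    static double sumSquare(double[] reps) {
        double total = 0.0;
        for (double x : reps) {
            total += x * x;
        }
        return total;
    }

    // Square of the sum: (x1 + x2 + ...)^2 (assumed meaning of "SquareSum").
    static double squareSum(double[] reps) {
        double total = 0.0;
        for (double x : reps) {
            total += x;
        }
        return total * total;
    }

    // Sample variance with n - 1 in the denominator, built from the two sums.
    static double variance(double[] reps) {
        int n = reps.length;
        return (sumSquare(reps) - squareSum(reps) / n) / (n - 1);
    }
}
```

For the replicates 2, 4 and 6, the sum of squares is 56, the square of the sum is 144,
and the variance is (56 - 144/3) / 2 = 4.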
5. ShapiroWilk: it checks the normality of the test data. Two methods, ExistW(...) and
TestW(...), are used to check the normality. ExistW(...) gets the W value from the
existing table while TestW(...) calculates the W value from the input data. If
ExistW(...) is less than or equal to TestW(...), the test data are normal. Otherwise
they are not normal. Table 3 is the description of ShapiroWilk class.
Methods                                    Attributes
ShapiroWilkClass(int, int);                float orderedCenteredData[];
int Shaptest();                            float coeffs[];
float GetD();                              float centeredData[][];
void ShapQuick(float[], int);
void ShapQuicksort(float[], int, int);
void AscendingOrder();
void GetCoeff();
float TestW();
float ExistW();
Table 3: A Description of ShapiroWilk Class
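The calculation behind TestW(...) can be sketched as below. This is an illustrative
sketch, not the code of the package: the class ShapiroWilkSketch and its signatures are
hypothetical, and the tabled coefficients are passed in by the caller because the table
itself is not reproduced here. Each observation is centered on the mean of its own
concentration, the centered values are pooled and sorted, and W is the squared weighted
sum of the differences between the largest and smallest values divided by the sum of
squares D.

```java
import java.util.Arrays;

// Hypothetical sketch; method names echo Table 3 but the signatures are assumed.
final class ShapiroWilkSketch {

    // Computes W from the raw observations of every concentration.
    // coeffs holds the tabled coefficients a_1 .. a_(n/2) for the pooled size n.
    static double testW(double[][] groups, double[] coeffs) {
        int n = 0;
        for (double[] g : groups) {
            n += g.length;
        }

        // Center each observation on the mean of its own group, then pool.
        double[] centered = new double[n];
        int k = 0;
        for (double[] g : groups) {
            double mean = Arrays.stream(g).average().orElse(0.0);
            for (double y : g) {
                centered[k++] = y - mean;
            }
        }

        // D: sum of squares of the centered values (their overall mean is zero).
        double d = 0.0;
        for (double x : centered) {
            d += x * x;
        }

        // Weighted sum of (largest - smallest), (2nd largest - 2nd smallest), ...
        Arrays.sort(centered);
        double b = 0.0;
        for (int i = 0; i < n / 2; i++) {
            b += coeffs[i] * (centered[n - 1 - i] - centered[i]);
        }
        return b * b / d;
    }

    // Decision rule from the text: normal when the tabled W does not exceed the computed W.
    static boolean isNormal(double testW, double existW) {
        return existW <= testW;
    }
}
```

With a single group 1, 2, 3 and the tabled coefficient 0.7071 for n = 3, the computed W is
very close to 1, as expected for perfectly evenly spaced data.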
6. Bartlett: it determines whether the variances of the test data are equal. Its class
structure is similar to ShapiroWilk class. Two methods, ExistB(...) and TestB(...), are
used to judge the equality. ExistB(...) gets the B value from the existing table while
TestB(...) calculates the B value from the input data. If ExistB(...) is greater than or
equal to TestB(...), the test data have equal variances. Otherwise the variances are
unequal. Table 4 is the description of Bartlett class.
Methods              Attributes
Bartlett(int);       int sumOfV;
void SetV();         int v[];
int SumV();
float SetC();
float TestB();
float ExistB();
int Bartest();
Table 4: A Description of Bartlett Class
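The B statistic returned by TestB(...) can be sketched as follows, assuming the standard
form of Bartlett's statistic: B = [(sum of V_i) ln(pooled variance) - sum of
V_i ln(S_i^2)] / C, with C = 1 + [sum of 1/V_i - 1/(sum of V_i)] / [3(p - 1)], where V_i
is the degrees of freedom and S_i^2 the variance of concentration i, and p is the number
of concentrations. The class BartlettSketch and its signature are hypothetical.

```java
// Hypothetical sketch of the statistic computed by TestB(...); signature assumed.
final class BartlettSketch {

    // variances[i] is the sample variance of concentration i,
    // df[i] its degrees of freedom (number of replicates minus one).
    static double testB(double[] variances, int[] df) {
        int p = variances.length;
        double sumV = 0.0;          // sum of the degrees of freedom
        double weightedVar = 0.0;   // sum of V_i * S_i^2
        double weightedLog = 0.0;   // sum of V_i * ln(S_i^2)
        double sumInverseV = 0.0;   // sum of 1 / V_i
        for (int i = 0; i < p; i++) {
            sumV += df[i];
            weightedVar += df[i] * variances[i];
            weightedLog += df[i] * Math.log(variances[i]);
            sumInverseV += 1.0 / df[i];
        }
        double pooledVariance = weightedVar / sumV;
        double c = 1.0 + (sumInverseV - 1.0 / sumV) / (3.0 * (p - 1));
        return (sumV * Math.log(pooledVariance) - weightedLog) / c;
    }

    // Decision rule from the text: equal variances when the tabled value is not exceeded.
    static boolean hasEqualVariance(double testB, double existB) {
        return existB >= testB;
    }
}
```

When all variances are identical, the two logarithmic terms cancel and B is zero; the
more the variances differ, the larger B becomes.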
7. Bonferroni: it is the class used as a parametric test to calculate the values of NOEC and
LOEC when the test data have an unequal number of replicates. Several methods, such as
CalSST(...) and CalSSW(...), calculate the values of SST, SSW, etc. according to the
document of the EPA. From these, the values of NOEC and LOEC are obtained. Table 5 is
the description of Bonferroni class.
Methods                               Attributes
BonferroniClass(int);                 float totalSamples;
void SetSamp();                       int numOfObsPerCon[];
float CalSSB();                       float NOEC[];
float CalSST();                       float LOEC[];
float CalSSW();                       float MSD[];
float CalMSB();
float CalMSW();
float TestT(int);
float ExistT();
void Bonitest(float[], float[]);
Table 5: A Description of Bonferroni Class
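The sums of squares named in Table 5 feed a t statistic for each concentration. A minimal
sketch of that chain is given below, assuming the usual one-way analysis of variance
formulas: SST = sum of y^2 - G^2/N, SSB = sum of T_i^2/n_i - G^2/N, SSW = SST - SSB and
MSW = SSW/(N - p), where T_i is the total of group i, G the grand total, N the number of
observations and p the number of groups. The class AnovaSketch and its signatures are
hypothetical; groups[0] is taken to be the control.

```java
// Hypothetical sketch; groups[0] is assumed to hold the control replicates.
final class AnovaSketch {

    // Within-group mean square (MSW), obtained from SST and SSB.
    static double msw(double[][] groups) {
        double grandTotal = 0.0;
        double sumOfSquares = 0.0;
        double betweenTerm = 0.0;
        int totalObs = 0;
        for (double[] g : groups) {
            double groupTotal = 0.0;
            for (double y : g) {
                groupTotal += y;
                sumOfSquares += y * y;
            }
            grandTotal += groupTotal;
            betweenTerm += groupTotal * groupTotal / g.length;
            totalObs += g.length;
        }
        double correction = grandTotal * grandTotal / totalObs;
        double sst = sumOfSquares - correction;
        double ssb = betweenTerm - correction;
        double ssw = sst - ssb;
        return ssw / (totalObs - groups.length);
    }

    static double mean(double[] g) {
        double total = 0.0;
        for (double y : g) {
            total += y;
        }
        return total / g.length;
    }

    // t statistic comparing concentration i with the control.
    static double tStatistic(double[][] groups, int i) {
        double se = Math.sqrt(msw(groups) * (1.0 / groups[0].length + 1.0 / groups[i].length));
        return (mean(groups[0]) - mean(groups[i])) / se;
    }

    // Minimum significant difference for a given tabled critical value.
    static double msd(double criticalValue, double msw, int nControl, int nTreatment) {
        return criticalValue * Math.sqrt(msw) * Math.sqrt(1.0 / nControl + 1.0 / nTreatment);
    }
}
```

For a control of 10, 12 and one concentration of 6, 8, the sketch gives SST = 20,
SSB = 16, SSW = 4 and MSW = 2, so the t statistic is 4 divided by the square root of 2.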
8. Dunnett: the class structure is similar to Bonferroni class. It is used for parametric
testing when the number of replicates is equal. Table 6 is the description of Dunnett class.
9. Steel: this class is used for nonparametric testing when the test data have an equal
number of replicates. It also calculates the values of NOEC and LOEC. The class
structure is similar to Bonferroni class. Table 7 is the description of Steel class.
Methods                                                       Attributes
DunnettClass(int);                                            float totalSamples;
void SetSamp();                                               float MSW;
float GetSamp();                                              float meanControl;
void ObsPerCon();                                             int numOfObsPerCon[];
float CalSSB();                                               float NOEC[];
float CalSST();                                               float LOEC[];
float CalSSW();                                               float MSD[];
float CalMSB();
float CalMSW();
float TestT(int);
float ExistT();
void Dunntest(float[], float[], int, int, float[], float[]);
Table 6: A Description of Dunnett Class
Methods                           Attributes
SteelClass(int, int);             float group[];
void SArrayofRank();              float rankedArray[];
void SGroupNum(int, int);         float NOEC[];
void StQuicksort(int, int);       float LOEC[];
void StQuick(int);
void SRankNum(int);
void Stest(float[], float[]);
float TestSt(int);
float ExistSt();
Table 7: A Description of Steel Class
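The rank-based classes start from a joint ranking of the control and one concentration.
The sketch below shows one way to do this, with tied observations sharing the average of
the ranks they occupy. The class RankSumSketch and its methods are hypothetical and are
not taken from the package; a library sort stands in for the quicksort routines listed in
Table 7.

```java
import java.util.Arrays;

// Hypothetical sketch of the ranking step shared by the rank-based tests.
final class RankSumSketch {

    // Ranks control and treatment together and returns the rank sum of the treatment.
    static double rankSum(double[] control, double[] treatment) {
        double[] pooled = new double[control.length + treatment.length];
        System.arraycopy(control, 0, pooled, 0, control.length);
        System.arraycopy(treatment, 0, pooled, control.length, treatment.length);
        Arrays.sort(pooled);

        double sum = 0.0;
        for (double v : treatment) {
            sum += averageRank(pooled, v);
        }
        return sum;
    }

    // 1-based rank of value in the sorted array; ties share the average rank.
    static double averageRank(double[] sorted, double value) {
        int first = -1;
        int last = -1;
        for (int i = 0; i < sorted.length; i++) {
            if (sorted[i] == value) {
                if (first < 0) {
                    first = i;
                }
                last = i;
            }
        }
        return (first + last) / 2.0 + 1.0;
    }
}
```

For a control of 1, 2 and a concentration of 2, 3, the two tied values share rank 2.5, so
the rank sum of the concentration is 2.5 + 4 = 6.5. The rank sum is then compared with
the tabled critical value.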
10. Wilcoxon: this class is used for nonparametric testing when the test data have an
unequal number of replicates. Values of NOEC and LOEC can be determined from the
results. Table 8 is the description of Wilcoxon class.
Methods                           Attributes
WilcoxonClass(int, int);          float group[];
void WarrayofRank();              float rankedArray[];
void WgroupNum(int, int[]);       float NOEC[];
void WilQuicksort(int, int);      float LOEC[];
void WilQuick(int);               float MSD[];
void WrankNum(int);
void Wtest(float[], float[]);
float WrankSum(int);
int WrepPerCct(int);
float ExistW(int, int);
Table 8: A Description of Wilcoxon Class
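The choice among the four hypothesis tests in items 7 to 10 can be sketched as a small
decision function. This is a sketch under an assumption: it follows the usual EPA
decision flow, in which normal data with equal variances go to a parametric test,
everything else goes to a rank test, and Bonferroni's t-test and the Wilcoxon rank sum
test handle unequal replicate counts. The class TestSelectorSketch and its enum are
hypothetical and do not reproduce ModuleClass.

```java
// Hypothetical sketch of the decision made when selecting a hypothesis test.
final class TestSelectorSketch {

    enum Procedure { DUNNETT, BONFERRONI, STEEL, WILCOXON }

    // normal:          outcome of the Shapiro-Wilk check
    // equalVariance:   outcome of the Bartlett check
    // equalReplicates: whether every concentration has the same number of replicates
    static Procedure choose(boolean normal, boolean equalVariance, boolean equalReplicates) {
        boolean parametric = normal && equalVariance;
        if (parametric) {
            return equalReplicates ? Procedure.DUNNETT : Procedure.BONFERRONI;
        }
        return equalReplicates ? Procedure.STEEL : Procedure.WILCOXON;
    }
}
```

Failing either the normality check or the equal-variance check is enough to send the data
to a nonparametric test.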
11. Fisher: this is the only test class that is not derived from BaseClass. This test
determines whether each concentration group mean differs from the control group mean, and
from that determines NOEC and LOEC. The method GetB(...) gets the b value from the
statistics table. The method TestB(...) returns the b value calculated from the test data.
By comparing the two b values, we get the values of NOEC and LOEC. Table 9 is the
description of Fisher class.
12. Interpolation: it is a point estimation class. The concentration means must be
nonincreasing to perform the test. If monotonicity is not met, the means can be smoothed.
Methods such as CalSmoothedMean(...), CalBaseMean(...) and CalICp(...) are
employed to calculate the values of IC, EC and LC. Table 10 is the description of
Interpolation class.
Methods                                                 Attributes
FisherClass(int);                                       int numOfCct;
void GetInput(float, float[], int[]);                   float significantLevel;
int GetB(int, int, int);                                int deadData[];
String Ftest(float, float[], int[], int);               int aliveData[];
String TestB(int, int, int[], int[], int[], int[]);     int totalForDeadAlive[];
                                                        float effluentCct[];
Table 9: A Description of Fisher Class
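The Fisher class compares a computed b value with a tabled one. As an illustration of
what that table encodes, the sketch below computes the one-sided probability directly
from the hypergeometric distribution. This is a swapped-in technique shown for
explanation only, not the table lookup used by the package; the class FisherSketch and
its signatures are hypothetical.

```java
// Hypothetical sketch: exact one-sided probability instead of a table lookup.
final class FisherSketch {

    // Binomial coefficient C(n, k) as a double; zero outside 0..n.
    static double choose(int n, int k) {
        if (k < 0 || k > n) {
            return 0.0;
        }
        double result = 1.0;
        for (int i = 1; i <= k; i++) {
            result = result * (n - k + i) / i;
        }
        return result;
    }

    // Probability of seeing aliveTreatment or fewer survivors in the treatment group,
    // given the group sizes and the total number of survivors in both groups.
    static double lowerTail(int aliveControl, int nControl, int aliveTreatment, int nTreatment) {
        int totalAlive = aliveControl + aliveTreatment;
        double denominator = choose(nControl + nTreatment, totalAlive);
        double p = 0.0;
        for (int x = 0; x <= aliveTreatment; x++) {
            p += choose(nTreatment, x) * choose(nControl, totalAlive - x) / denominator;
        }
        return p;
    }
}
```

A concentration would be judged different from the control when this probability falls
below the chosen significance level.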
Methods                                                 Attributes
InterpolationClass(int);                                float smoothedMean[];
void CalSmoothedMean();                                 float percentile;
float CalBaseMean(float, int[], float[], float[]);      int neededIndex[];
int ConcentNum(float[], float[], float);                float lowLimit[];
float CalICp(float);                                    float upLimit[];
float Itest(float, float);
Table 10: A Description of Interpolation Class
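The two steps of the interpolation class, smoothing and interpolating, can be sketched
as follows. The sketch assumes equal replicate counts, so that pooled means are plain
averages, and the usual linear interpolation formula ICp = C_j + [M_1(1 - p/100) - M_j]
(C_(j+1) - C_j) / (M_(j+1) - M_j), where M_1 is the smoothed control mean. The class
InterpolationSketch and its signatures are hypothetical.

```java
// Hypothetical sketch of smoothing followed by linear interpolation.
final class InterpolationSketch {

    // Makes the means nonincreasing by averaging adjacent means that violate the order.
    static double[] smooth(double[] means) {
        int n = means.length;
        double[] blockSum = new double[n];
        int[] blockSize = new int[n];
        int blocks = 0;
        for (double m : means) {
            blockSum[blocks] = m;
            blockSize[blocks] = 1;
            blocks++;
            // Merge while the newest block's average exceeds the one before it.
            while (blocks > 1
                    && blockSum[blocks - 1] / blockSize[blocks - 1]
                       > blockSum[blocks - 2] / blockSize[blocks - 2]) {
                blockSum[blocks - 2] += blockSum[blocks - 1];
                blockSize[blocks - 2] += blockSize[blocks - 1];
                blocks--;
            }
        }
        double[] smoothed = new double[n];
        int k = 0;
        for (int b = 0; b < blocks; b++) {
            for (int j = 0; j < blockSize[b]; j++) {
                smoothed[k++] = blockSum[b] / blockSize[b];
            }
        }
        return smoothed;
    }

    // Concentration at which the response is reduced by p percent of the control mean.
    // Returns NaN when the reduction is not reached within the tested concentrations.
    static double icp(double[] concentrations, double[] smoothed, double p) {
        double target = smoothed[0] * (1.0 - p / 100.0);
        for (int j = 0; j + 1 < smoothed.length; j++) {
            double upper = smoothed[j];
            double lower = smoothed[j + 1];
            if (upper >= target && target >= lower && upper != lower) {
                return concentrations[j]
                        + (target - upper) * (concentrations[j + 1] - concentrations[j])
                          / (lower - upper);
            }
        }
        return Double.NaN;
    }
}
```

For concentrations 0, 10, 20 with means 100, 80, 40, a 25 percent reduction corresponds
to a target mean of 75, which lies between 80 and 40, giving 10 + (75 - 80)(20 - 10) /
(40 - 80) = 11.25.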
The View Module
The following five classes belong to the view module.
1. InputWindow: the function of InputWindow is to receive input (Figure 9).
Figure 9: The Input Window (window title: Toy - A Toxicity Analysis Utility)
InputWindow contains the following fields:
(1) Test Type: a choice field that contains a list of test types for users to choose from.
(2) P-Value: a text field in which users enter the specific p-value.
(3) Animals/Treatment: a text field in which users enter the number of animals per
treatment.
(4) Analysis Type: a choice field for users to specify the analysis type.
(5) File: a text field for users to enter the name of the file to be loaded from or saved to.
(6) Load: a button for loading a data file whose name is specified in the File field.
(7) Save: a button for saving data into a data file whose name is specified in the File field.
(8) Row++: a button for adding a row at the bottom of the table.
(9) Col++: a button for adding a column at the right of the table.
(10) Row--: a button for deleting a row at the bottom of the table.
(11) Col--: a button for deleting a column at the right of the table.
(12) Clear All: a button for users to clear all data in the data table.
(13) Data Table: a table for users to enter test data. The initial window contains a table with
six columns and six rows. The number of rows and columns can be changed
dynamically by users.
(14) Report: a button for users to output the test result to a report.
(15) Chart: a button for users to get the test result in a chart.
Figure 10 shows a sample of the InputWindow with test data loaded.
Figure 10: The Input Window with Data
2. OutputWindow: OutputWindow provides the skeleton of an output window. It is used with
other components, such as TextArea and Chart, to construct a complete output window.
If the Analysis Type is NOEC and LOEC, the report window contains a trace of all
tests and their results, as shown in Figure 11. The values of NOEC, LOEC and MSD
(Minimum Significant Difference) are included at the end of the report. If the Analysis
Type is EC, IC and LC, the report window displays the interpolation values of the test
data.
----Shapiro-Wilk's test----
Normal this time!
----Bartlett's test----
Equal variances this time!
----Dunnett's test----
< NOEC , LOEC > = < 5.0 , 10.0 >
Minimum significant difference MSD = 223.57672
Figure 11: The Report Window
[Chart: Growth Response, vertical axis from 0.0 to 1300.0, horizontal axis from 0.0 to 80.0]
Figure 12: The Chart Window
3. Chart: it displays the critical value and the replicate means as two curves (Figure 12).
4. InputTable: it is a class used to implement an input table. It contains an ObjectTable.
ObjectTable contains a vector, so objects can be dynamically inserted into or deleted
from it. InputTable specializes ObjectTable to be a table of text fields. It is also
responsible for displaying the table.
InputTable and ObjectTable use an observer/observable synchronization
model. ObjectTable is an object that can be observed. Whenever a change is made to the
ObjectTable, it notifies its observer, which is the InputTable in this case. When InputTable
is notified, it updates the screen according to the current state of the ObjectTable.
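The synchronization between the two classes can be sketched with java.util.Observable and
java.util.Observer, which existed in the Java library of the time (both are deprecated
since Java 9 but still available). The classes ObjectTableSketch and InputTableSketch are
hypothetical stand-ins: the real InputTable redraws text fields, whereas the sketch only
records that a redraw was requested.

```java
import java.util.Observable;
import java.util.Observer;
import java.util.Vector;

// Hypothetical observable table: a vector of rows, each row a vector of cells.
@SuppressWarnings("deprecation")
final class ObjectTableSketch extends Observable {
    private final Vector<Vector<String>> rows = new Vector<>();
    private final int cols;

    ObjectTableSketch(int initialRows, int cols) {
        this.cols = cols;
        for (int r = 0; r < initialRows; r++) {
            rows.add(emptyRow());
        }
    }

    private Vector<String> emptyRow() {
        Vector<String> row = new Vector<>();
        for (int c = 0; c < cols; c++) {
            row.add("");
        }
        return row;
    }

    // Corresponds to the Row++ button: append a row and tell the observers.
    void addRow() {
        rows.add(emptyRow());
        setChanged();
        notifyObservers();
    }

    // Corresponds to the Row-- button: drop the last row and tell the observers.
    void deleteRow() {
        if (rows.isEmpty()) {
            return;
        }
        rows.remove(rows.size() - 1);
        setChanged();
        notifyObservers();
    }

    int rowCount() {
        return rows.size();
    }
}

// Hypothetical observer: records what it would redraw instead of painting.
@SuppressWarnings("deprecation")
final class InputTableSketch implements Observer {
    int redraws;
    int shownRows;

    @Override
    public void update(Observable source, Object arg) {
        shownRows = ((ObjectTableSketch) source).rowCount();
        redraws++;
    }
}
```

The table never calls the display code directly; it only announces that it changed, which
keeps ObjectTable reusable with any other kind of observer.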
The Controller Module
The controller is a coordinator between the model and view modules. It controls the
program flow. When the view module receives input, the controller is the one that contacts
the model for further operations. For example, if the view wants to save a data file, it
passes the input data, together with a file name, to the controller. The controller
converts the input data to the correct type and invokes the model module to validate the
input data. If there is any error in the input data, the controller generates an exception
and returns it to the view. The view then displays an error message.
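The save request described above can be sketched as follows. All names here are
hypothetical (BadDataSketchException stands in for BadDataException), the validation rule
is a placeholder because the actual rules are not given in the text, and writing the file
is left out.

```java
// Hypothetical exception type standing in for BadDataException.
class BadDataSketchException extends Exception {
    BadDataSketchException(String message) {
        super(message);
    }
}

final class ControllerSketch {

    // Converts the table cells typed by the user from strings to floats.
    static float[][] convert(String[][] cells) throws BadDataSketchException {
        float[][] data = new float[cells.length][];
        for (int r = 0; r < cells.length; r++) {
            data[r] = new float[cells[r].length];
            for (int c = 0; c < cells[r].length; c++) {
                try {
                    data[r][c] = Float.parseFloat(cells[r][c].trim());
                } catch (NumberFormatException e) {
                    throw new BadDataSketchException("row " + (r + 1) + ", column " + (c + 1)
                            + ": \"" + cells[r][c] + "\" is not a number");
                }
            }
        }
        return data;
    }

    // Placeholder for the model's validation; the real rules are not given in the text.
    static void validate(float[][] data) throws BadDataSketchException {
        if (data.length == 0) {
            throw new BadDataSketchException("the data table is empty");
        }
    }

    // Entry point for a save request coming from the view.
    static void save(String[][] cells, String fileName) throws BadDataSketchException {
        float[][] data = convert(cells);
        validate(data);
        // Writing data to fileName is omitted from this sketch.
    }
}

final class ViewSketch {

    // Returns the message the view would show after the user clicks Save.
    static String onSave(String[][] cells, String fileName) {
        try {
            ControllerSketch.save(cells, fileName);
            return "Saved " + fileName;
        } catch (BadDataSketchException e) {
            return "Error: " + e.getMessage();
        }
    }
}
```

Because the view only catches the exception and shows its message, the conversion and
validation rules can change without touching the user interface.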
Chapter V
CONCLUSIONS
In this thesis, the design and implementation of a software package for toxicity
analysis with Java are presented. The analysis methods are based on documents provided
by the United States Environmental Protection Agency (EPA).
The software package uses an object-oriented design, which makes it easier to extend
and maintain. The graphical user interface is user-friendly and provides a means for users
to interact with the program. The software package handles a set of statistical tests.
Since the whole package is written in Java, it can be posted on a Web server.
Anyone who has a Java-compatible browser installed can download it from the server and
run it locally. This approach reduces local maintenance cost. On the other hand, the
package can also be installed locally and run on a Java virtual machine.
BIBLIOGRAPHY
1. United States Environmental Protection Agency. Short-Term Methods for Estimating
the Chronic Toxicity of Effluents and Receiving Waters to Freshwater Organisms, second
ed. United States Environmental Protection Agency, 1989.
2. Sun Microsystems, Inc. The Java Language Specification, version 1.0 beta. Sun
Microsystems, Inc., 1995.
3. Gosling, J. and McGilton, H. The Java Language Environment: A White Paper. Sun
Microsystems, Inc., 1995.
4. Sun Microsystems, Inc. The Java Virtual Machine Specification, release 1.0 beta. Sun
Microsystems, Inc., 1995.
5. Lemay, L. and Perkins, C. L. Teach Yourself Java in 21 Days. Sams.net Publishing,
1996.
6. Wirfs-Brock, R., Wilkerson, B., and Wiener, L. Designing Object-Oriented Software.
Prentice-Hall, 1990.
7. Lemay, L. Teach Yourself Web Publishing with HTML 3.0 in a Week, second ed.
Sams.net Publishing, 1996.
8. Kehoe, B. Zen and the Art of the Internet, a Beginner's Guide to the Internet, fourth ed.
Prentice-Hall, 1996.
9. Jacobson, I., Ericsson, M., and Jacobson, A. The Object Advantage: Business
Process Re-Engineering with Object Technology. Addison-Wesley, 1995.
10. Riel, A. J. Object-Oriented Design Heuristics. Addison-Wesley, 1995.
APPENDIX
User's Manual
The software package implements a set of toxicity analysis tests documented in
Short-Term Methods for Estimating the Chronic Toxicity of Effluents and Receiving Waters
to Freshwater Organisms by the Environmental Protection Agency (EPA). Any platform
with Java capability can run this tool.
Installation
The program can be run as an application or an applet. Client-based installation is
used when the tool is run as an application. Server-based installation is used when the
tool is run as an applet.
For a client-based installation, copy the whole software package into the local
storage of the client machine. For a server-based installation, a Web server is needed to
serve all clients. A link to the Java program should be placed on a Web page. When a
client browser loads the page, it loads the program automatically. The advantage of
server-based installation is that no matter how many clients run the tool, only one
installation on the server is needed.
Usage
Starting the program
1. Server-based
Download the program from the server and run it locally. To run the program in this way,
users should have a Java-compatible browser installed. Type the correct URL to start the
program.
2. Client-based
Install the program (tool) locally and run it with a Java loader; that is, use a Java
loader to load the tool and run it on a Java virtual machine.
After the software starts (the start-up window is shown in Figure 9), the following steps
apply to both the server-based and client-based cases.
Entering Data
1. Choose Test Type. Users can choose the test type by clicking the choice field drop-list
button (Figure 13).
2. Enter the p-value in the P-Value text field.
3. Enter the number of animals per treatment in the Animals/Treatment text field.
4. Select Analysis Type. Users can choose the analysis type by clicking the choice field
drop-list button (Figure 14).
Figure 13: Choice Fields of Test Type
Figure 14: Choice Fields of Analysis Type
5. Enter test data in the input table. The initial size of the table is six by six (6x6). Users
can adjust the size of the table to meet their needs by clicking the Row++, Col++, Row--
or Col-- button.
6. Instead of typing data into the input table, users who have an existing data file can
enter the data file name in the File text field and click the Load button. The data file
will be loaded, and the size of the table will be adjusted automatically.
Figure 15: Load an Existing Data File
7. Users can also save the data table into a file. (Type the file name in the File text
field and click Save. The data file will be saved with the name in the File text field.)
8. To clear the data in the input table, click the Clear All button. The data in the input
table will be erased.
9. Users can get the test report by clicking the Report button. If the test data are not
valid, an error message will be displayed.
10. To display the test results in a chart, click the Chart button.
Figure 15 is the window after all fields have been entered for the Fathead Minnow Larval
Growth test, Figure 16 is its corresponding report, and Figure 17 is its corresponding chart.
----Shapiro-Wilk's test----
Normal this time!
----Bartlett's test----
Equal variances this time!
----Dunnett's test----
< NOEC , LOEC > = < 64.0 , 128.0 >
Minimum significant difference MSD = 0.08740206
Figure 16: A Report for Fathead Minnow Larval Growth Test
[Chart: Mean Weight (vertical axis, 0.0 to 7.7) against Concentrations (horizontal axis, 0.0 to 275.0)]
Figure 17: A Chart for Fathead Minnow Larval Growth Test Result
Exiting from the program
Users who run this software package in a browser can exit from the program by closing the
browser. Users who run the application version locally can exit from the program by
clicking the x at the upper-right corner of the window.
VITA
Bei Zhu
Candidate for the Degree of
Master of Science
Thesis: DESIGN AND IMPLEMENTATION OF A SOFTWARE PACKAGE FOR
TOXICITY ANALYSIS OF WATER SAMPLES ON FRESHWATER
ORGANISMS WITH JAVA
Major Field: Computer Science
Biographical:
Personal Data: Born in Shanghai, China, in 1968.
Education: Graduated from Shanghai Jiao Tong University, Shanghai, China in
1991; received Bachelor of Engineering degree in Naval Architecture.
Completed the requirements for the Master of Science degree with a major in
Computer Science at Oklahoma State University in July 1997.
Experience: University Computer Center Lab assistant, Oklahoma State University,
January 1995 to May 1995; Software design engineer/test, Volt Computer
Services, November 1996 to present.