{VERSION 2 3 "IBM INTEL NT" "2.3" }
{USTYLETAB {CSTYLE "Maple Input" -1 0 "Courier" 0 1 255 0 0 1 0 1 0 0
1 0 0 0 0 }{CSTYLE "2D Math" -1 2 "Times" 0 1 0 0 0 0 0 0 2 0 0 0 0 0
0 }{CSTYLE "2D Comment" 2 18 "" 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 }
{CSTYLE "2D Output" 2 20 "" 0 1 0 0 255 1 0 0 0 0 0 0 0 0 0 }{CSTYLE "
" -1 256 "" 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 }{CSTYLE "" -1 257 "" 0 1 0
0 0 0 1 0 0 0 0 0 0 0 0 }{CSTYLE "" -1 258 "" 0 1 0 0 0 0 0 1 0 0 0 0
0 0 0 }{CSTYLE "" -1 259 "" 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 }{CSTYLE ""
-1 260 "" 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 }{CSTYLE "" -1 261 "" 0 1 0 0
0 0 1 1 0 0 0 0 0 0 0 }{CSTYLE "" -1 262 "" 0 1 0 0 0 0 1 0 0 0 0 0 0
0 0 }{CSTYLE "" -1 263 "" 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 }{CSTYLE ""
-1 264 "" 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 }{CSTYLE "" -1 265 "" 0 1 0 0
0 0 0 1 0 0 0 0 0 0 0 }{CSTYLE "" -1 266 "" 0 1 0 0 0 0 0 1 0 0 0 0 0
0 0 }{CSTYLE "" -1 267 "" 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 }{PSTYLE "Norm
al" -1 0 1 {CSTYLE "" -1 -1 "" 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 }0 0 0
-1 -1 -1 0 0 0 0 0 0 -1 0 }{PSTYLE "Heading 1" 0 3 1 {CSTYLE "" -1 -1
"" 1 18 0 0 0 0 0 1 0 0 0 0 0 0 0 }1 0 0 0 6 6 0 0 0 0 0 0 -1 0 }
{PSTYLE "Maple Output" 0 11 1 {CSTYLE "" -1 -1 "" 0 1 0 0 0 0 0 0 0 0
0 0 0 0 0 }3 3 0 -1 -1 -1 0 0 0 0 0 0 -1 0 }{PSTYLE "" 11 12 1
{CSTYLE "" -1 -1 "" 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 }1 0 0 -1 -1 -1 0 0
0 0 0 0 -1 0 }{PSTYLE "Title" 0 18 1 {CSTYLE "" -1 -1 "" 1 18 0 0 0 0
0 1 1 0 0 0 0 0 0 }3 0 0 -1 12 12 0 0 0 0 0 0 19 0 }{PSTYLE "" 11 256
1 {CSTYLE "" -1 -1 "" 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 }1 0 0 -1 -1 -1 0
0 0 0 0 0 -1 0 }{PSTYLE "" 0 257 1 {CSTYLE "" -1 -1 "" 0 1 0 0 0 0 1
0 0 0 0 0 0 0 0 }2 0 0 -1 -1 -1 0 0 0 0 0 0 -1 0 }}
{SECT 0 {EXCHG {PARA 0 "" 0 "" {TEXT -1 88 "File: depart\\math\\maple
\\misc\\lgtrudsm.mws Date: 2-feb-1998 By Les Wright to Bob Jantzen" }
}}{PARA 257 "" 0 "" {TEXT 266 15 "29 January 1998" }}{PARA 18 "" 0 ""
{TEXT -1 73 "True Least Squares Logistic Regression with Downhill Simp
lex Minimization" }}{EXCHG {PARA 0 "> " 0 "" {MPLTEXT 1 0 8 "restart:
" }}}{SECT 1 {PARA 3 "" 0 "" {TEXT -1 12 "Introduction" }}{PARA 0 ""
0 "" {TEXT 265 9 "Dear Bob:" }}{PARA 0 "" 0 "" {TEXT -1 100 "As I shar
ed with you, Francis Wright, a computer algebraist at the University o
f London, ported the " }{TEXT 256 17 "Numerical Recipes" }{TEXT -1 51
" Downhill Simplex Minimization code (Press et al., " }{TEXT 257 27 "N
umerical Recipes in Pascal" }{TEXT -1 260 ", CUP, 1989, pp. 326-330, w
hich is the version I have) from Fortran and Pascal into Maple.  I ha
ve found that it works very nicely in solving the least-squares logis
tic regression problem we have been discussing.  I apply it here to t
he same problem as in our " }{TEXT 258 12 "logistru.mws" }{TEXT -1 42
 " workshe
et. To use it, you also need the " }{TEXT 260 10 "downhill.m" }{TEXT
-1 36 " file, which basically includes the " }{XPPEDIT 18 0 "amoeba" "
I'amoebaG6\"" }{TEXT -1 125 " procedure in Francis's worksheet. I gav
e you the URL of that sheet in an earlier e-mail, but don't recall it \+
at the moment." }}{PARA 0 "" 0 "" {TEXT -1 0 "" }}{PARA 0 "" 0 ""
{TEXT -1 41 "Looking forward to your feedback. Enjoy!" }}{PARA 0 ""
0 "" {TEXT -1 0 "" }}{PARA 0 "" 0 "" {TEXT 259 10 "Les Wright" }}
{PARA 0 "" 0 "" {TEXT -1 0 "" }}{PARA 0 "" 0 "" {TEXT -1 85 "As before
, we will attempt to fit a logistic curve of the following form to our
data:" }}{EXCHG {PARA 0 "> " 0 "" {MPLTEXT 1 0 38 "logistic:=(x,a,b,c
)->c/(1+exp(a-b*x)):" }}{PARA 0 "> " 0 "" {MPLTEXT 1 0 21 "u:=logistic
(x,a,b,c);" }}{PARA 11 "" 1 "" {XPPMATH 20 "6#>%\"uG*&%\"cG\"\"\",&F'F
'-%$expG6#,&%\"aGF'*&%\"bGF'%\"xGF'!\"\"F'F1" }}}}{SECT 1 {PARA 3 ""
0 "" {TEXT -1 33 "Least-squares Logistic Regression" }}{PARA 0 "" 0 "
" {TEXT -1 157 "To perform the regression, we construct the function--
the sum-of-squares of deviations of data points from the fitted curve-
-that we plan to minimize. Here " }{XPPEDIT 18 0 "x " "I\"xG6\"" }
{TEXT -1 5 " and " }{XPPEDIT 18 0 "y" "I\"yG6\"" }{TEXT -1 76 " repres
ent the lists of independent and dependent values, respectively, and \+
" }{XPPEDIT 18 0 "n" "I\"nG6\"" }{TEXT -1 30 " is the number of data p
oints:" }}{EXCHG {PARA 0 "> " 0 "" {MPLTEXT 1 0 68 "dev:=(y[i]-subs(x=
x[i],u))^2:\nssq:=unapply(sum(dev,i=1..n),(x,y,n));" }}{PARA 11 "" 1 "
" {XPPMATH 20 "6#>%$ssqG:6%%\"xG%\"yG%\"nG6\"6$%)operatorG%&arrowGF*-%
$sumG6$*$,&&9%6#%\"iG\"\"\"*&%\"cGF7,&F7F7-%$expG6#,&%\"aGF7*&%\"bGF7&
9$F5F7!\"\"F7FDFD\"\"#/F6;F79&F*F*" }}}{PARA 0 "" 0 "" {TEXT -1 0 "" }
}{PARA 0 "" 0 "" {TEXT -1 44 "Next, let's enter the same sample data f
rom " }{TEXT 264 12 "logistru.mws" }{TEXT -1 67 ". You will note that
I have scaled the second list by a factor of " }{XPPEDIT 18 0 "1/10^3
" "*&\"\"\"\"\"\"*$\"#5\"\"$!\"\"" }{TEXT -1 419 ". I do this because
it makes no difference to the final result (aside from the multiplica
tive factor), and the downhill simplex technique seems to work more fl
uidly when the magnitude of the numbers in question is smaller. It ca
n be demonstrated algebraically that the solutions of the scaled and n
on-scaled problems are equivalent (apart from a multiplicative constan
t), but in the meantime just take my word for it:" }}{EXCHG {PARA 0 ">
" 0 "" {MPLTEXT 1 0 64 "Lx:=[1,2,3,4,5,6,7];\nLy:=[158,317,793,1823,3
170,5310,5944]/1000;" }}{PARA 11 "" 1 "" {XPPMATH 20 "6#>%#LxG7)\"\"\"
\"\"#\"\"$\"\"%\"\"&\"\"'\"\"(" }}{PARA 11 "" 1 "" {XPPMATH 20 "6#>%#L
yG7)#\"#z\"$+&#\"$<$\"%+5#\"$$zF+#\"%B=F+#F*\"$+\"#\"$J&F1#\"$V(\"$D\"
" }}}{PARA 256 "" 1 "" {TEXT -1 0 "" }}{PARA 0 "" 0 "" {TEXT -1 90 "Ne
xt, we substitute the data lists and construct the function we intend \+
to minimize, with " }{XPPEDIT 18 0 "a" "I\"aG6\"" }{TEXT -1 2 ", " }
{XPPEDIT 18 0 "b" "I\"bG6\"" }{TEXT -1 6 ", and " }{XPPEDIT 18 0 "c" "
I\"cG6\"" }{TEXT -1 27 " as the unknown parameters:" }}{EXCHG {PARA 0
"> " 0 "" {MPLTEXT 1 0 40 "f:=unapply(ssq(Lx,Ly,nops(Lx)),(a,b,c));" }
}{PARA 12 "" 1 "" {XPPMATH 20 "6#>%\"fG:6%%\"aG%\"bG%\"cG6\"6$%)operat
orG%&arrowGF*,0*$,&#\"#z\"$+&\"\"\"*&9&F4,&F4F4-%$expG6#,&9$F49%!\"\"F
4F>F>\"\"#F4*$,&#\"$<$\"%+5F4*&F6F4,&F4F4-F96#,&FF>F?F4*$
,&#\"$$zFDF4*&F6F4,&F4F4-F96#,&FF>F?F4*$,&#\"%B=FDF4*&F6F
4,&F4F4-F96#,&FF>F?F4*$,&#FC\"$+\"F4*&F6F4,&F4F4-F96#,&F<
F4F=!\"&F4F>F>F?F4*$,&#\"$J&F\\oF4*&F6F4,&F4F4-F96#,&FF>F
?F4*$,&#\"$V(\"$D\"F4*&F6F4,&F4F4-F96#,&FF>F?F4F*F*" }}}
{PARA 0 "" 0 "" {TEXT -1 0 "" }}{PARA 0 "" 0 "" {TEXT -1 200 "We are n
ow ready to find a minimum for this function.  First, read in the proc
edure from the associated file.  You will have to change the path in t
his line to match where the file is on your machine:" }}{EXCHG
{PARA 0 "> " 0 "" {MPLTEXT 1 0 49 "read `f:\\\\depart\\\\math\\\\maple
\\\\misc\\\\downhill.m`;" }}}{PARA 0 "" 0 "" {TEXT -1 0 "" }}{PARA 0 "
" 0 "" {TEXT -1 153 "We are now ready to solve.  My choices for starti
ng values are roughly those given by the first approximation linear fi
t of transformed data way back in " }{TEXT 267 12 "logistic.mws" }
{TEXT -1 262 ". The scale factor is quite arbitrary, and begs to be e
xperimented with. The level of accuracy is also an arbitrary choice, \+
as the algorithm seems not to converge when this value is much smaller
. You will note in the output I have multiplied the estimate for " }
{XPPEDIT 18 0 "c" "I\"cG6\"" }{TEXT -1 4 " by " }{XPPEDIT 18 0 "1000"
"\"%+5" }{TEXT -1 296 " in order to compensate for the division discus
sed earlier. The second line of output is the estimated minimum of th
e function (again note the multiplicative factor, this time squared, t
o compensate for the earlier division), and the third line is the numb
er of iterations made by the algorithm:" }}{EXCHG {PARA 0 "> " 0 ""
{MPLTEXT 1 0 54 "start:=[5,1,7000/1000]: #note scaling of 3rd param
!" }}{PARA 0 "> " 0 "" {MPLTEXT 1 0 166 "amoeba(f,start,10,1e-12):\nso
l:=[a=\"[1],b=\"[2],c=\"[3]*1000]:sol;\n #no
te multiplication here\n`amoeba func`(\"\"\")*1e6;`amoeba iter`; #and
here!" }}{PARA 11 "" 1 "" {XPPMATH 20 "6#7%/%\"aG$\"1hw+\"G\"f\\_!#:/
%\"bG$\"1>)[$zo=]5F(/%\"cG$\"+;(*)G!o!\"'" }}{PARA 11 "" 1 ""
{XPPMATH 20 "6#$\"+\\yH7:!\"%" }}{PARA 11 "" 1 "" {XPPMATH 20 "6#\"$3
\"" }}}{PARA 0 "" 0 "" {TEXT -1 0 "" }}{PARA 0 "" 0 "" {TEXT -1 115 "N
ow substituting the results for the coefficients back into the general
logistic function, we get our fitted curve:" }}{EXCHG {PARA 0 "> " 0
"" {MPLTEXT 1 0 14 "y=subs(sol,u);" }}{PARA 11 "" 1 "" {XPPMATH 20 "6#
/%\"yG,$*$,&\"\"\"F(-%$expG6#,&$\"1hw+\"G\"f\\_!#:F(%\"xG$!1>)[$zo=]5F
/F(!\"\"$\"+;(*)G!o!\"'" }}}{PARA 0 "" 0 "" {TEXT -1 0 "" }}{PARA 0 "
" 0 "" {TEXT -1 68 "Note that this compares very favourably with the r
esult obtained in " }{TEXT 263 12 "logistru.mws" }{TEXT -1 81 ", both w
ith respect to the parameter estimates and the associated sum-of-squar
es:
" }}{EXCHG {PARA 0 "> " 0 "" {MPLTEXT 1 0 64 "y = 6802.889625/(1+exp(5
.249591086-1.050186859*x));\n151229.7856;" }}{PARA 11 "" 1 ""
{XPPMATH 20 "6#/%\"yG,$*$,&\"\"\"F(-%$expG6#,&$\"+'3\"f\\_!\"*F(%\"xG$
!+fo=]5F/F(!\"\"$\"+D'*)G!o!\"'" }}{PARA 11 "" 1 "" {XPPMATH 20 "6#$\"
+cyH7:!\"%" }}}}{SECT 1 {PARA 3 "" 0 "" {TEXT -1 8 "A Caveat" }}{PARA
0 "" 0 "" {TEXT 261 8 "Warning!" }{TEXT -1 147 " As Francis Wright ha
s astutely indicated, the algorithm can give a spurious solution, espe
cially if the initial parameter estimates are way off, " }{TEXT 262 4
"viz." }{TEXT -1 1 ":" }}{EXCHG {PARA 0 "> " 0 "" {MPLTEXT 1 0 19 "sta
rt:=[100,12,17]:" }}{PARA 0 "> " 0 "" {MPLTEXT 1 0 100 "amoeba(f,start
,50,1e-6):\nsol:=[a=\"[1],b=\"[2],c=\"[3]*1000]:sol;\n`amoeba func`(\"
\"\")*1e6;`amoeba iter`;" }}{PARA 11 "" 1 "" {XPPMATH 20 "6#7%/%\"aG$
\"18B!**\\y+&)*!#9/%\"bG$\"1J$*e)RCPP\"F(/%\"cG$\"+y#*[pn!\"&" }}
{PARA 11 "" 1 "" {XPPMATH 20 "6#$\"+^oDKU!\"#" }}{PARA 11 "" 1 ""
{XPPMATH 20 "6#\"#Q" }}}{PARA 0 "" 0 "" {TEXT -1 374 "This result is o
bviously out to lunch (easy to say since we know what the right answer
 is!).  I have no idea why this happens or what this output represents
.  Evidently, one should make \"decent\" initial estimates, though I s
uspect that this approach is not quite as fussy as Alsholm's, provided
 the guess is in the right ballpark.  Experiment and let me know what
 you think." }}}{EXCHG {PARA 0 "> " 0 "" {MPLTEXT 1 0 0 "" }}}}{MARK "5"
0 }{VIEWOPTS 1 1 0 3 4 1802 }