gam                   package:mgcv                   R Documentation

_G_e_n_e_r_a_l_i_z_e_d _A_d_d_i_t_i_v_e _M_o_d_e_l_s _u_s_i_n_g _p_e_n_a_l_i_z_e_d _r_e_g_r_e_s_s_i_o_n _s_p_l_i_n_e_s _a_n_d 
_G_C_V

_D_e_s_c_r_i_p_t_i_o_n:

     Fits the specified  generalized additive model (GAM) to data.
     'gam()' is not a clone of what S-PLUS provides. Smooth terms are
     represented using penalized regression splines with smoothing
     parameters selected by GCV/UBRE or by regression splines with
     fixed degrees of freedom (mixtures of the two are permitted).
     Multi-dimensional smooths are available using penalized thin plate
     regression splines (isotropic) or tensor product splines (when an
     isotropic smooth is inappropriate).  For more on specifying models
     see 'gam.models'. For more on model  selection see
     'gam.selection'.

_U_s_a_g_e:

     gam(formula,family=gaussian(),data=list(),weights=NULL,subset=NULL,
         na.action,control=gam.control(),scale=0,knots=NULL,sp=NULL,
         min.sp=NULL,H=NULL,gamma=1,fit=TRUE,G=NULL,...)

_A_r_g_u_m_e_n_t_s:

 formula: A GAM formula (see also 'gam.models'). This is exactly like
          the formula for a GLM except that smooth terms can be added
          to the right hand side of the formula (and a formula of the
          form 'y ~ .' is not allowed). Smooth terms are specified by
          expressions of the form: 
           's(var1,var2,...,k=12,fx=FALSE,bs="tp",by=a.var)' where
          'var1', 'var2', etc. are the covariates which the smooth is a
          function of and 'k' is the dimension of the basis used to
          represent the smooth term. If 'k' is not specified then
          'k=10*3^(d-1)' is used where 'd' is the number of covariates
          for this term. 'fx' is used to indicate whether or not this
          term has a fixed number of degrees of freedom ('fx=FALSE' to
          select d.f. by GCV/UBRE). 'bs' indicates the basis to use,
          with '"cr"' indicating cubic regression spline, '"cc"' a
          cyclic cubic regression spline, and '"tp"' indicating thin
          plate regression spline: '"cr"' and '"cc"' can only be used
          with 1-d smooths. 'by' can be used to specify a variable by
          which the smooth should be multiplied. For example
          'gam(y~z+s(x,by=z))' would specify a model E(y)=f(x)z where
          f(.) is a smooth function (the formula is 'y~x+s(x,by=z)'
          rather than 'y~s(x,by=z)' because the smooths are always set
          up to sum to zero over the covariate values). The 'by' option
          is particularly useful for models in which different
          functions of the same variable are required for each level of
          a factor: see 's'. 

          An alternative for specifying smooths of more than one
          covariate is e.g.: 
           'te(x,z,bs=c("tp","tp"),m=c(2,3),k=c(5,10))' which would
          specify a tensor product  smooth of the two covariates 'x'
          and 'z' constructed from marginal t.p.r.s. bases  of
          dimension 5 and 10 with marginal penalties of order 2 and 3.
          Any combination of basis types is  possible, as is any number
          of covariates.

          Formulae can involve nested or ``overlapping'' terms such as 
           'y~s(x)+s(z)+s(x,z)' or 'y~s(x,z)+s(z,v)': see
          'gam.side.conditions' for further details and examples. Note
          that  nesting with 'te' terms is not supported. 

  family: This is a family object specifying the distribution and link
          to use in fitting etc. See 'glm' and 'family' for more
          details. The negative binomial families provided by the MASS
          library  can be used, with or without known theta parameter:
          see 'gam.neg.bin' for details. 

    data: A data frame containing the model response variable and 
          covariates required by the formula. By default the variables
          are taken  from 'environment(formula)': typically the
          environment from  which 'gam' is called.

 weights: prior weights on the data.

  subset: an optional vector specifying a subset of observations to be
          used in the fitting process.

na.action: a function which indicates what should happen when the data
          contain `NA's.  The default is set by the `na.action' setting
          of `options', and is `na.fail' if that is unset.  The
          ``factory-fresh'' default is `na.omit'.

 control: A list of fit control parameters returned by  'gam.control'.

   scale: If this is zero then GCV is used for all distributions except
          Poisson and binomial where UBRE is used with scale parameter
          assumed to be 1. If this is greater than 1 it is assumed to
          be the scale parameter/variance and UBRE is used: to use the
          negative binomial in this case theta must be known. If
          'scale' is negative  GCV  is always used, which means that
          the scale parameter will be estimated by GCV and the Pearson 
          estimator, or in the case of the negative binomial theta will
          be estimated  in order to force the GCV/Pearson scale
          estimate to unity (if this is possible). For binomial models
          in  particular, it is probably worth  comparing UBRE and GCV
          results; for ``over-dispersed Poisson'' GCV is probably more
          appropriate than UBRE.

   knots: this is an optional list containing user specified knot
          values to be used for basis construction.  For the 'cr' and
          'cc' bases the user simply supplies the knots to be used, and
          there must be the same number as the basis dimension, 'k',
          for the smooth concerned. For the 'tp' basis 'knots' has two
          uses. Firstly, for large datasets  the calculation of the
          'tp' basis can be time-consuming. The user can retain most of
          the advantages of the t.p.r.s.  approach by supplying  a
          reduced set of covariate values from which to obtain the
          basis -  typically the number of covariate values used will
          be substantially  smaller than the number of data, and
          substantially larger than the basis dimension, 'k'. The
          second possibility  is to avoid the eigen-decomposition used
          to find the t.p.r.s. basis altogether and simply use  the
          basis implied by the chosen knots: this will happen if the
          number of knots supplied matches the  basis dimension, 'k'.
          For a given basis dimension the second option is  faster, but
          gives poorer results (and the user must be quite careful in
          choosing knot locations).  Different terms can use different 
          numbers of knots, unless they share a covariate. 

      sp: A vector of smoothing parameters for each term can be
          provided here. Smoothing parameters must  be supplied in the
          order that the smooth terms appear in the model  formula.
          With fit method '"magic"' (see 'gam.control'  and 'magic')
          then negative elements indicate that the  parameter should be
          estimated, and hence a mixture of fixed and estimated 
          parameters is possible. With fit method '"mgcv"', if 'sp' is 
          supplied then all its elements must be positive. Note that
          'fx=TRUE'  in a smooth term over-rides what is supplied here
          effectively setting the  smoothing parameter to zero.

  min.sp: for fit method '"magic"' only, lower bounds can be  supplied
          for the smoothing parameters. Note that if this option is
          used then the smoothing parameters 'sp', in the returned
          object, will need to be added to what is supplied here to get
          the actual smoothing parameters. Lower bounds on the
          smoothing  parameters can sometimes help stabilize otherwise
          divergent P-IRLS iterations.

       H: With fit method '"magic"' a user supplied fixed quadratic 
          penalty on the parameters of the  GAM can be supplied, with
          this as its coefficient matrix. A common use of this term is 
          to add a ridge penalty to the parameters of the GAM in
          circumstances in which the model is close to un-identifiable
          on the scale of the linear predictor, but perfectly well
          defined on the response scale.

   gamma: It is sometimes useful to inflate the model degrees of 
          freedom in the GCV or UBRE score by a constant multiplier.
          This allows  such a multiplier to be supplied if fit method
          is '"magic"'.

     fit: If this argument is 'TRUE' then 'gam' sets up the model and
          fits it, but if it is 'FALSE' then the model is set up and an
          object 'G' is returned which is the output from  'gam.setup'
          plus some extra items required to complete the GAM fitting
          process.

       G: Usually 'NULL', but may contain the object returned by a
          previous call to 'gam' with  'fit=FALSE', in which case all
          other arguments are ignored except for 'gamma', 'family',
          'control' and 'fit'.

     ...: further arguments for  passing on e.g. to 'gam.fit'

_D_e_t_a_i_l_s:

     A generalized additive model (GAM) is a generalized linear model
     (GLM) in which the linear  predictor is given by a user specified
     sum of smooth functions of the covariates plus a  conventional
     parametric component of the linear predictor. A simple example is:

                   log(E(y_i))=f_1(x_1i)+f_2(x_2i)

     where the (independent) response variables y_i~Poi, and f_1 and
     f_2 are smooth functions of covariates x_1 and  x_2. The log is an
     example of a link function. 

     If absolutely any smooth functions were allowed in model fitting
     then maximum likelihood  estimation of such models would
     invariably result in complex overfitting estimates of  f_1  and
     f_2. For this reason the models are usually fit by  penalized
     likelihood  maximization, in which the model (negative log)
     likelihood is modified by the addition of  a penalty for each
     smooth function, penalizing its `wiggliness'. To control the
     tradeoff  between penalizing wiggliness and penalizing badness of
     fit each penalty is multiplied by  an associated smoothing
     parameter: how to estimate these parameters, and  how to
     practically represent the smooth functions are the main
     statistical questions  introduced by moving from GLMs to GAMs. 

     The 'mgcv' implementation of 'gam' represents the smooth functions
     using  penalized regression splines, and by default uses basis
     functions for these splines that  are designed to be optimal,
     given the number basis functions used. The smooth terms can be 
     functions of any number of covariates and the user has some
     control over how smoothness of  the functions is measured. 

     'gam' in 'mgcv' solves the smoothing parameter estimation problem
     by using the  Generalized Cross Validation (GCV) criterion or an
     Un-Biased Risk Estimator criterion  (UBRE) which is in practice an
     approximation to AIC. Smoothing parameters are chosen to  minimize
     the GCV or UBRE score for the model, and the main computational
     challenge solved  by the 'mgcv' package is to do this efficiently
     and reliably. Two alternative  numerical methods are provided, see
     'mgcv', 'magic' and  'gam.control'. 

     Broadly 'gam' works by first constructing basis functions and one
     or more quadratic penalty  coefficient matrices for each smooth
     term in the model formula, obtaining a model matrix for  the
     strictly parametric part of the model formula, and combining these
     to obtain a  complete model matrix (/design matrix) and a set of
     penalty matrices for the smooth terms.  Some linear
     identifiability constraints are also obtained at this point. The
     model is  fit using 'gam.fit', a modification of 'glm.fit'. The
     GAM  penalized likelihood maximization problem is solved by
     penalized Iteratively  Reweighted  Least Squares (IRLS) (see e.g.
     Wood 2000). At each iteration a penalized  weighted least squares
     problem is solved, and the smoothing parameters of that problem
     are  estimated by GCV or UBRE. Eventually both model parameter
     estimates and smoothing  parameter estimates converge. 

     The fitting approach just described, in which the smoothing
     parameters are estimated for  each approximating linear model of
     the IRLS process was suggested by Chong Gu (see, e.g.  Gu 2002),
     and is very computationally efficient. However, because the
     approach neglects  the dependence of the iterative weights on the
     smoothing parameters, it is usually  possible to find smoothing
     parameters which actually yield slightly lower GCV/UBRE score 
     estimates than those resulting from this `performance iteration'.
     'gam' therefore also  allows the user to `improve' the smoothing
     parameter estimates, by using O'Sullivan's  (1986) suggested
     method, in which for each trial set of smoothing parameters the
     IRLS is  iterated to convergence before the UBRE/GCV score is
     evaluated. This requires much less  efficient minimisation of the
     power iteration based on 'nlm', and is  therefore quite slow. 

     Three alternative bases are built in for representing model
     smooths, but alternatives can easily be added (see
     'smooth.construct' which uses p-splines to illustrate how to add
     new smooths).  The built in alternatives for univariate smooths
     terms are: a conventional penalized cubic regression spline basis,
     parameterized in terms of the function values at the knots;  a
     cyclic cubic spline with a similar parameterization and thin plate
     regression splines.  The cubic spline bases are computationally
     very efficient, but require `knot' locations to be  chosen
     (automatically by default). The thin plate regression splines are
     optimal low rank  smooths which do not have knots, but are more
     computationally costly to set up. Multivariate terms can be
     represented using thin plate regression splines, or tensor
     products of any available basis  including user defined bases
     (tensor product penalties are obtained automatically form  the
     marginal basis penalties). The t.p.r.s. basis is isotropic, so if
     this is not appropriate tensor  product terms should be used.
     Tensor product smooths have one penalty and smoothing parameter
     per marginal  basis, which means that the relative scaling of
     covariates is essentially determined automatically by GCV/UBRE. 

     For any  basis the user specifies the dimension of the basis for
     each smooth term. The dimension of the basis is one more than the
     maximum degrees of freedom that the  term can have, but usually
     the term will be fitted by penalized maximum likelihood estimation
     and the actual degrees of freedom will be chosen by GCV. However,
     the user can choose to fix the degrees of freedom of a term, in
     which case the actual degrees of freedom will be one less than the
     basis dimension.

     Thin plate regression splines are constructed by starting with the
     basis for a full thin plate spline and then truncating this basis
     in an optimal manner, to obtain a low rank smoother. Details are
     given in Wood (2003). One key advantage of the approach is that it
     avoids the knot placement problems of conventional regression
     spline modelling, but it also has the advantage that smooths of
     lower rank are nested within smooths of higher rank, so that it is
     legitimate to use conventional hypothesis testing methods to
     compare models based on pure regression splines. The t.p.r.s.
     basis can become expensive to calculate for large datasets. In
     this case the user can supply a reduced  set of knots to use in
     basis construction (see knots, in the argument list), or  use
     tensor products of cheaper bases.

     In the case of the cubic regression spline basis, knots  of the
     spline are placed evenly throughout the covariate values to which
     the term refers:  For example, if fitting 101 data with an 11 knot
     spline of 'x' then there would be a knot at every 10th (ordered) 
     'x' value. The parameterization used represents the spline in
     terms of its values at the knots. The values at neighbouring knots
     are connected by sections of  cubic polynomial constrained to be 
     continuous up to and including second derivative at the knots. The
     resulting curve is a natural cubic  spline through the values at
     the knots (given two extra conditions specifying  that the second
     derivative of the curve should be zero at the two end  knots).
     This parameterization gives the parameters a nice
     interpretability. 

     Details of '"mgcv"' GCV/UBRE minimization method are given in Wood
     (2000):  the basis of the approach  is to alternate efficient
     global optimization with respect to one overall smoothing 
     parameter with Newton updates of a set of relative smoothing
     parameters for each smooth term. The Newton updates are backed up
     by steepest descent, since the GCV/UBRE score functions are not
     positive definite everywhere.

_V_a_l_u_e:

     If 'fit = FALSE' the function returns a list 'G' of items needed
     to fit a GAM, but doesn't actually fit it. 

     Otherwise the function returns an object of class '"gam"' which
     has the following elements: 

boundary: did parameters end up at boundary of parameter space?

    call: the matched call (allows 'update' to be used with 'gam'
          objects, for example). 

coefficients: the coefficients of the fitted model. Parametric
          coefficients are  first, followed  by coefficients for each
          spline term in turn.

converged: indicates whether or not the iterative fitting method
          converged.

deviance: (unpenalized)

 df.null: null degrees of freedom.

     edf: estimated degrees of freedom for each coefficient.

  family: family object specifying distribution and link used.

fit.method: The underlying multiple GCV/UBRE method used: '"magic"' 
          for the new more stable method, '"mgcv"' for the Wood (2000)
          method.

fitted.values: fitted model predictions of expected value for each
          datum.

 formula: the model formula.

full.formula: the model formula with each smooth term fully expanded
          and with option arguments given explicitly (i.e. not with
          reference to other variables) - useful for later prediction
          from the model.

gcv.ubre: The minimized GCV or UBRE score.

     hat: array of elements from the leading diagonal of the `hat' (or
          `influence') matrix.  Same length as response data vector.

    iter: number of iterations of P-IRLS taken to get convergence.

linear.predictor: fitted model prediction of link function of expected
          value for  each datum.

  method: Either '"GCV"' or '"UBRE"', depending on smoothing parameter
          selection method used  (or appropriate, if none used).

mgcv.conv: A list of convergence diagnostics relating to smoothing
          parameter estimation. Differs for method '"magic"' and
          '"mgcv"'. Here is  the '"mgcv"' version:

     _s_c_o_r_e corresponding to edf, an array of GCV or UBRE scores for the
          model given the final  estimated relative smoothing
          parameters.

     _g the gradient of the GCV/UBRE score w.r.t. the relative smoothing
          parameters at termination.

     _h the second derivatives corresponding to 'g' above - i.e. the
          leading diagonal of the Hessian.

     _e the eigen-values of the Hessian. All non-negative indicates a
          positive definite Hessian.

     _i_t_e_r the number of iterations taken.

     _i_n._o_k 'TRUE' if the second smoothing parameter guess improved the
          GCV/UBRE score.

     _s_t_e_p._f_a_i_l 'TRUE' if the algorithm terminated by failing to improve
          the GCV/UBRE score rather than by `converging'.  Not
          necessarily a problem, but check the above derivative
          information quite carefully.

          In the case of '"magic"' the items are:

     _f_u_l_l._r_a_n_k The apparent rank of the problem given the model matrix
          and  constraints.

     _r_a_n_k The numerical rank of the problem.

     _f_u_l_l_y._c_o_n_v_e_r_g_e_d 'TRUE' is multiple GCV/UBRE converged by meeting 
          convergence criteria. 'FALSE' if method stopped with a
          steepest descent step  failure.

     _h_e_s_s._p_o_s._d_e_f Was the hessian of the GCV/UBRE score positive
          definite at  smoothing parameter estimation convergence?

     _i_t_e_r How many iterations were required to find the smoothing
          parameters?

     _s_c_o_r_e._c_a_l_l_s and how many times did the GCV/UBRE score have to be
          evaluated?

     _r_m_s._g_r_a_d root mean square of the gradient of the GCV/UBRE score at
           convergence.

 min.edf: Minimum possible degrees of freedom for whole model.

   model: model frame containing all variables needed in original model
          fit.

    nsdf: number of parametric, non-smooth, model terms including the
          intercept.

null.deviance: deviance for single parameter model.

prior.weights: prior weights on observations.

residuals: the deviance residuals for the fitted model.

    sig2: estimated or supplied variance/scale parameter.

  smooth: list of smooth objects, containing the basis information for
          each term in the  model formula in the order in which they
          appear. These smooth objects are what gets returned by the
          'smooth.construct' objects.

      sp: smoothing parameter for each smooth.

      Vp: estimated covariance matrix for the parameters. This is a
          Bayesian posterior covariance matrix that results from
          adopting a particular Bayesian model of the smoothing
          process.

 weights: final weights used in IRLS iteration.

       y: response data.

_W_A_R_N_I_N_G_S:

     The '"mgcv"' code does not check for rank deficiency of the model
     matrix that may result from lack of identifiability between the
     parametric and smooth components of the model. 

     You must have more unique combinations of covariates than the
     model has total parameters. (Total parameters is sum of basis
     dimensions plus sum of non-spline  terms less the number of spline
     terms). 

     Automatic smoothing parameter selection is not likely to work well
     when  fitting models to very few response data.

     With large datasets (more than a few thousand data) the '"tp"'
     basis gets very slow to use: use the 'knots' argument as discussed
     above and  shown in the examples. Alternatively, for 1-d smooths 
     you can use the '"cr"' basis and  for multi-dimensional smooths
     use 'te' smooths.

     For data with many  zeroes clustered together in the covariate
     space it is quite easy to set up  GAMs which suffer from
     identifiability problems, particularly when using Poisson or
     binomial families. The problem is that with e.g. log or logit
     links, mean value zero corresponds to an infinite range on the
     linear predictor scale. Some regularization is possible in such
     cases: see  'gam.control' for details.

_A_u_t_h_o_r(_s):

     Simon N. Wood simon@stats.gla.ac.uk

_R_e_f_e_r_e_n_c_e_s:

     Key References:

     Wood, S.N. (2000)  Modelling and Smoothing Parameter Estimation
     with Multiple  Quadratic Penalties. J.R.Statist.Soc.B
     62(2):413-428

     Wood, S.N. (2003) Thin plate regression splines. J.R.Statist.Soc.B
     65(1):95-114

     Wood, S.N. (in press) Stable and efficient multiple smoothing
     parameter estimation for generalized additive models. J. Amer.
     Statist. Ass.

     Background References:

     Green and Silverman (1994) Nonparametric Regression and
     Generalized  Linear Models. Chapman and Hall.

     Gu and Wahba (1991) Minimizing GCV/GML scores with multiple
     smoothing parameters via the Newton method. SIAM J. Sci. Statist.
     Comput. 12:383-398

     Gu (2002) Smoothing Spline ANOVA Models, Springer.

     Hastie and Tibshirani (1990) Generalized Additive Models. Chapman
     and Hall.

     O'Sullivan, Yandall and Raynor (1986) Automatic smoothing of
     regression functions in generalized linear models. J. Am.
     Statist.Ass. 81:96-103 

     Wahba (1990) Spline Models of Observational Data. SIAM 

     Wood (2001) mgcv:GAMs and Generalized Ridge Regression for R. R
     News 1(2):20-25

     Wood and Augustin (2002) GAMs with integrated model selection
     using penalized regression splines and applications  to
     environmental modelling. Ecological Modelling 157:157-177

     <URL: http://www.stats.gla.ac.uk/~simon/>

_S_e_e _A_l_s_o:

     'gam.models', 's', 'predict.gam', 'plot.gam', 'summary.gam',
     'gam.side.conditions', 'gam.selection','mgcv', 'gam.control'
     'gam.check', 'gam.neg.bin', 'magic','vis.gam'

_E_x_a_m_p_l_e_s:

     library(mgcv)
     set.seed(0) 
     n<-400
     sig2<-4
     x0 <- runif(n, 0, 1)
     x1 <- runif(n, 0, 1)
     x2 <- runif(n, 0, 1)
     x3 <- runif(n, 0, 1)
     pi <- asin(1) * 2
     f <- 2 * sin(pi * x0)
     f <- f + exp(2 * x1) - 3.75887
     f <- f+0.2*x2^11*(10*(1-x2))^6+10*(10*x2)^3*(1-x2)^10-1.396
     e <- rnorm(n, 0, sqrt(abs(sig2)))
     y <- f + e
     b<-gam(y~s(x0)+s(x1)+s(x2)+s(x3))
     summary(b)
     plot(b,pages=1)
     # same fit in two parts .....
     G<-gam(y~s(x0)+s(x1)+s(x2)+s(x3),fit=FALSE)
     b<-gam(G=G)
     # an extra ridge penalty (useful with convergence problems) ....
     bp<-gam(y~s(x0)+s(x1)+s(x2)+s(x3),H=diag(0.5,41)) 
     print(b);print(bp);rm(bp)
     # set the smoothing parameter for the first term, estimate rest ...
     bp<-gam(y~s(x0)+s(x1)+s(x2)+s(x3),sp=c(0.01,-1,-1,-1))
     plot(bp,pages=1);rm(bp)
     # set lower bounds on smoothing parameters ....
     bp<-gam(y~s(x0)+s(x1)+s(x2)+s(x3),min.sp=c(0.001,0.01,0,10)) 
     print(b);print(bp);rm(bp)

     # now a GAM with 3df regression spline term & 2 penalized terms
     b0<-gam(y~s(x0,k=4,fx=TRUE,bs="tp")+s(x1,k=12)+s(x2,k=15))
     plot(b0,pages=1)
     # now fit a 2-d term to x0,x1
     b1<-gam(y~s(x0,x1)+s(x2)+s(x3))
     par(mfrow=c(2,2))
     plot(b1)
     par(mfrow=c(1,1))
     # now simulate poisson data
     g<-exp(f/5)
     y<-rpois(rep(1,n),g)
     b2<-gam(y~s(x0)+s(x1)+s(x2)+s(x3),family=poisson)
     plot(b2,pages=1)
     # and a pretty 2-d smoothing example....
     test1<-function(x,z,sx=0.3,sz=0.4)  
     { (pi**sx*sz)*(1.2*exp(-(x-0.2)^2/sx^2-(z-0.3)^2/sz^2)+
       0.8*exp(-(x-0.7)^2/sx^2-(z-0.8)^2/sz^2))
     }
     n<-500
     old.par<-par(mfrow=c(2,2))
     x<-runif(n);z<-runif(n);
     xs<-seq(0,1,length=30);zs<-seq(0,1,length=30)
     pr<-data.frame(x=rep(xs,30),z=rep(zs,rep(30,30)))
     truth<-matrix(test1(pr$x,pr$z),30,30)
     contour(xs,zs,truth)
     y<-test1(x,z)+rnorm(n)*0.1
     b4<-gam(y~s(x,z))
     fit1<-matrix(predict.gam(b4,pr,se=FALSE),30,30)
     contour(xs,zs,fit1)
     persp(xs,zs,truth)
     vis.gam(b4)
     par(old.par)
     # very large dataset example using knots
     n<-10000
     x<-runif(n);z<-runif(n);
     y<-test1(x,z)+rnorm(n)
     ind<-sample(1:n,1000,replace=FALSE)
     b5<-gam(y~s(x,z,k=50),knots=list(x=x[ind],z=z[ind]))
     vis.gam(b5)
     # and a pure "knot based" spline of the same data
     b6<-gam(y~s(x,z,k=100),knots=list(x= rep((1:10-0.5)/10,10),
             z=rep((1:10-0.5)/10,rep(10,10))))
     vis.gam(b6,color="heat")

