As a first step towards some usable code for spline interpolation/approximation in R, I set out to do polynomial interpolation and see how I get along. It's not that there is no spline interpolation software for R, but I find it a bit limited. splinefun, for example, can only do one-dimensional interpolation. interp{akima} can do bicubic splines, i.e. it can fit a spline surface to a set of gridded points (x, y, f(x,y)). However, x and y need to have the same length. There's no apparent reason for this, and it can be inefficient at times, so I'll be looking at changing that.
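For reference, here is the one-dimensional case that splinefun already handles well (a minimal sketch; the sine data is purely illustrative):

# one-dimensional spline interpolation with stats::splinefun
x <- seq(0, pi, length.out = 6)
f <- splinefun(x, sin(x))  # returns an interpolating *function*
f(pi/3)                    # evaluate anywhere; close to sin(pi/3)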

As a first step, I'll implement a polynomial interpolation algorithm. The basic principle is that of taking convex combinations of points in a given space: by repeatedly combining neighbouring points, one builds up the approximating curve. The theoretical background can be looked up here. The algorithm I implement is a variant of the de Boor triangular scheme. I am not claiming this is the most efficient way to compute it; it is just a first try, with way too many loops.
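To fix ideas, here is a minimal scalar version of the triangular scheme (the function name neville and the toy data are illustrative only): each stage replaces neighbouring values by a weighted combination, and after degree stages a single value remains.

# minimal sketch of the triangular recursion for scalar data
neville <- function(t, params, y) {
  n <- length(y)
  for (k in 1:(n - 1)) {     # stage k of the triangle
    for (i in 1:(n - k)) {   # combine neighbours i and i+1
      w <- (params[i + k] - t) / (params[i + k] - params[i])
      y[i] <- w * y[i] + (1 - w) * y[i + 1]
    }
  }
  y[1]  # value of the interpolating polynomial at t
}
neville(0.5, params = c(0, 1, 2), y = c(0, 1, 0))  # 0.75, since p(t) = 2t - t^2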

So: I sample different numbers of points on a half circle and construct an interpolating polynomial. You can see the result in the graph below; the red line is the polynomial approximation. The code is added below the graph.

[Graph: four panels, one per degree (2, 6, 10, 14), showing the sampled semicircle points with the red polynomial interpolant overlaid.]
Here is the code:


# test polynomial interpolation code by interpolating points on a semi-circle,
# using a different number of points each time
 
# contents:
# 1) make.circle(degree) generates n = degree+1 data points (x,y) on a
#    semicircle. this is the "data" we want to interpolate.
# 2) pinterp() performs polynomial interpolation of degree "degree" on
#    "data", using degree+1 parameters
# 3) run() is a wrapper that runs the interpolation
 
# note: polynomial interpolation uses n = degree+1 points and
# T = degree+1 parameters.
#
# output:
# plot of the interpolation overlaid with the data points used
 
rm(list=ls(all=TRUE))
setwd("~/dropbox/Rstuff/splines")
 
require(plotrix)
# this package contains a function to draw a circle;
# you could generate any other data as well
 
# task: run this approximation for 3, 7, 11 and 15 sample points
degrees <- c(2, 6, 10, 14)
 
 
##################
# define functions
##################
make.circle <- function(degree){
  # makes a matrix of degree+1 points (rows) on a semi-circle
  plot(-2:2, -2:2, type="n", xlab="", ylab="",
       main="make.circle output. disregard.")
  vals <- draw.circle(0, 0, radius=1, nv=2*degree+1)
  data <- array(unlist(vals), dim=c(2*degree+1, 2))
  data <- data[data[ ,2] >= 0, ]  # keep only the upper half of the circle
  return(data)
}
 
pinterp <- function(degree, params, data, t){
  # polynomial interpolation on data, evaluated at point t
 
  # x is a list whose entries are matrices of intermediate points of the
  # triangular scheme at t: each row of x[[ideg]] holds the (x,y) coords
  # of one point. x[[ideg]] loses one row per stage, so the final entry
  # x[[degree+1]] is a single (1,2) point: the value of the interpolant.
  x <- vector("list", degree+1)
  x[[1]] <- data        # stage 1 holds the raw data (degree zero)
  n <- degree+1         # number of points we're interpolating on
  for (ideg in 2:n){    # recursively build interpolations of increasing degree
    nn <- n - ideg + 1
    x[[ideg]] <- array(data=NA, dim=c(nn, 2))
    for (i in 1:nn){
      x[[ideg]][i, ] <-
        (params[i+ideg-1] - t)/(params[i+ideg-1] - params[i]) * x[[ideg-1]][i, ] +
        (t - params[i])/(params[i+ideg-1] - params[i]) * x[[ideg-1]][i+1, ]
    }
  }
  return(x[[degree+1]]) # return only the final stage, i.e. the result
}
 
run <- function(degrees, t){
  # runs the interpolation once for each entry of degrees,
  # evaluating at all points in t
  nd <- length(degrees)
  rval <- vector("list", nd)
  origdata <- vector("list", nd)
  for (j in 1:nd){
    degree <- degrees[j]
    params <- seq(0, 5, length=degree+1)
    nt <- length(t)
    rval[[j]] <- array(data=NA, dim=c(nt, 2))
    data <- make.circle(degree)
    origdata[[j]] <- data
    for (it in 1:nt) rval[[j]][it, ] <- pinterp(degree, params, data, t[it])
  }
  return(list(interpol=rval, orig=origdata))
}
 
###################
# produce output
###################
 
outrun <- run(degrees, t=seq(1,5,length=20))
rval <- outrun$interpol
orig <- outrun$orig
png(file="graphs/ex1.png")
par(mfrow=c(2,2))
for (j in 1:length(degrees)){
  plot(x=orig[[j]][ ,1], y=orig[[j]][ ,2], type="o",
       main=paste("degree: ", degrees[j], ", points: ", degrees[j]+1),
       xlab="x", ylab="y", ylim=c(0,1.5))
  lines(x=rval[[j]][ ,1], y=rval[[j]][ ,2], col="red")
}
dev.off()
par(mfrow=c(1,1))
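On the "way too many loops" point: the inner loop of pinterp() could be collapsed into one vectorized update per stage, since the weights vary with i only through params. A sketch (pinterp_vec_step is a hypothetical helper, not used by the script above):

# one vectorized stage of the triangular scheme:
# computes all rows of x[[ideg]] from x[[ideg-1]] at once
pinterp_vec_step <- function(xprev, params, ideg, t) {
  nn <- nrow(xprev) - 1
  i  <- seq_len(nn)
  w  <- (params[i + ideg - 1] - t) / (params[i + ideg - 1] - params[i])
  # w recycles down the columns, so row i gets weight w[i]
  w * xprev[i, , drop = FALSE] + (1 - w) * xprev[i + 1, , drop = FALSE]
}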


Comments:

  1. There are also penalized smoothing splines of various kinds (pspline, cobs, lmeSplines, modreg), and a bunch of interpolating splines in the base packages. What do you find wrong with akima?

  2. Hi there!
    Thanks for your comment. My main problem is that I need a spline surface (i.e. a spline approximating {x, y, f(x,y)}), and I need it as an interpolating function such as splinefun{stats}. akima only gives me a list of interpolated values. Apart from that, I think it requires me to have the same number of points in the x and y directions, which is inefficient for my application. Any suggestions? "splinefun{akima}" would be great.

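For what it's worth, here is a sketch of how one might wrap akima into a splinefun-style object, assuming its pointwise interpolator interpp behaves as documented (make_surface_fun is an illustrative name; linear = FALSE requests the non-linear fit):

library(akima)
# wrap pointwise bivariate interpolation in a closure so the result
# can be called like a function, splinefun-style
make_surface_fun <- function(x, y, z) {
  function(x0, y0) interpp(x, y, z, xo = x0, yo = y0, linear = FALSE)$z
}
# usage: fsurf <- make_surface_fun(x, y, z); fsurf(0.3, 0.7)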
