I just pushed the most recent version of the PSID panel data builder I introduced a little while ago. I got some user feedback and made some improvements. The package is hosted on GitHub.

News:
  • I added a reproducible example using artificial data which you can run by calling 'example(build.panel)'. This means you can try out the package before bothering to download anything and it provides a simple test of the main function.
  • I've included a suggestion to use the R survey package to analyse this dataset and made it explicit in the examples how to obtain the desired weights for each wave. Note that your results are invalid in the majority of cases if you ignore the survey design (i.e. the weights).
  • I got some useful comments from Anthony Damico (thanks!) and integrated the SAScii package (check out his tutorials at http://www.asdfree.com/). This allows one to download the data directly from the PSID server into R, thereby removing any dependency on Stata or SAS to preprocess the raw data. (As is common with large datasets, the raw data come in an ASCII format that needs to be parsed into rows and columns.) The downside is that downloading directly takes a rather long time: downloading FAM1985ER, FAM1986ER and the index IND2009ER took three and a half hours.
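
To see why the weights matter, here is a minimal sketch on made-up data (not PSID values). The column names `income` and `weight` are hypothetical and are not the names psidR produces, and `ids = ~1` assumes no clustering, which is a simplification made only to keep the example short. The survey part only runs if the package is installed.

```r
# Made-up data for illustration only; these are not PSID values.
# Low-income respondents are oversampled, so each one represents
# fewer people in the population (smaller weight).
d <- data.frame(
  income = c(rep(20, 60), rep(80, 40)),  # hypothetical income, in thousands
  weight = c(rep(1, 60), rep(4, 40))     # hypothetical sampling weights
)

# Ignoring the design: every respondent counts equally.
unweighted <- mean(d$income)

# Using the weights: every respondent counts for `weight` people.
weighted <- weighted.mean(d$income, w = d$weight)

cat("unweighted mean:", unweighted, "\n")
cat("weighted mean:  ", round(weighted, 2), "\n")

# The same point estimate via the survey package, if it is installed.
# ids = ~1 declares no clustering, an assumption made to keep the sketch short.
if (requireNamespace("survey", quietly = TRUE)) {
  des <- survey::svydesign(ids = ~1, weights = ~weight, data = d)
  print(survey::svymean(~income, des))
}
```

The unweighted mean is pulled toward the oversampled group. Beyond the point estimate, the survey package also reports design-based standard errors, which `weighted.mean` does not.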
Hopefully I can get another round of feedback before submitting to CRAN, particularly from a Windows user: working on a Unix system, I could not test whether all the paths are written correctly on Windows.


Comments:

  1. I saw Anthony Damico's comments on a previous post of yours, and am thrilled to see that this has led to removing a dependency (the extra download time is, in my opinion, a fine trade-off). Your work will be extremely valuable to the R user community: there is a real need for asdfree-like code and packages that bring large social science datasets to R, so thanks to you both are in order!

  2. Hey Fr, good to get some feedback. Yes, Anthony enlightened me with his SAScii package: I've been looking for something like it for a long time (but didn't quite know what to look for). The stuff he put up at adsfree.com is superb, and on a completely different scale (and scope) than this package. That MonetDB connection they've got? Boy...
    I wrote this package to deal with the PSID once you have it on your disk, at which point you still have a couple of problems to solve. Nothing a more or less experienced useR couldn't deal with, but it would be case by case and involve truckloads of unnecessary overhead. Hopefully it'll be useful to some. Oh, please let me know how you get along if you try it out.

  3. Of course Anthony's site is www.asdfree.com (and not adsfree.com)! I always type it wrong the first time round. It's an interesting typo to make.


UPDATE: Following up on a comment below, I used another data source, ONS JOBS02 for the labor market statistics. I report the findings below.

I read a series of articles about the goings-on in the UK housing market, the likely effects of the new Help To Buy scheme, the 10% increase in the mean London house price over the last year, and employment statistics. I failed to reproduce some numbers cited in The Economist (below). This post is about that.

It all starts with this blog post on The Economist:

http://www.economist.com/blogs/buttonwood/2013/09/house-prices

It talks about many things, amongst which employment and housing completions, and how the UK seems likely to be embarking on another round of debt-fueled growth.

For lack of a better place, I'll store my recipe for homemade pizza dough here. This will make dough for 14 people.

  • 2kg strong white flour (no self-raising or other extras)
  • 4 sachets of yeast, 7g each (not the super fast bicarbonate stuff)
  • Salt
  • Sugar
  • Olive oil
  • Water

The main problem is to get the right consistency, i.e. how much water to add. You'll have to do some experiments here.

I've got some questions regarding this issue, maybe someone out there has a clue.

The Austrian contingent was 300 out of 1000 soldiers

source: http://www.guardian.co.uk/world/2013/jun/06/israel-angry-austria-golan-heights.

Inspired by the Institute for Fiscal Studies' "Where do you fit in" application, where people can find out their position in the UK's income distribution, I wanted to find out what the picture in London looks like. Quite different. If you are in a very high percentile nationwide, the high incomes of mainly financial sector employees in London make sure that you find yourself a couple of ranks further down.


I got intrigued by the numbers presented in this news article about the re-trial in the Amanda Knox case. The defendants, accused and initially convicted of murder, were acquitted on appeal when the judge ruled that the forensic evidence was insufficiently conclusive. The appeals judge ignored the forensic scientist's advice to retest a DNA sample, because

"The sum of the two results, both unreliable… cannot give a reliable result," he wrote.

I just finished reading the extraordinary book Tomorrow's Table by P. Ronald and Raoul Adamchak. (I linked to Ronald's blog.) In this post I wanted to quickly redo a calculation Adamchak does on page 16, where he explains to his students how much energy is required to produce the fertilizer used to grow one acre of corn using conventional agriculture (as opposed to organic methods).

Economists frequently use public datasets. One frequently used dataset is the Panel Study of Income Dynamics, PSID for short, maintained by the Institute for Social Research at the University of Michigan.

I'm introducing psidR here, a small helper package for R that makes constructing panels from the PSID a bit easier.

One potential difficulty with the PSID is constructing a longitudinal dataset, i.e. one where individuals are followed over several survey waves.

I was recently asked by a friend whether it's worth buying a house in the UK. That is, assuming they could put down the money, whether it was worth buying as opposed to renting. Apart from obvious things like the expected length of stay in one place, the interest on mortgages and how prices might develop, they were interested in particular in the transaction costs they were likely to face: fees, taxes and so forth.

So Paul Krugman laments in his post that policy makers across Europe have blindly signed up to the "Austerity only" ticket. He cites some evidence which I find fairly convincing. I just want to raise the point that what he says cannot be used as a critique of the Monti government.

Basically what he's saying is that Monti was installed as a puppet of European creditor nations to make sure that austerity would be imposed and the country's government debt would continue to be serviced.