Some time ago I got inspired by a post on r-bloggers.com, showing the housing bubble in several US cities, nicely done with ggplot. I extended this to incorporate two measures of problems in the consumer credit markets: the percentage of people with a new bankruptcy, and the percentage of people with a new foreclosure, in each quarter from 2006 up to the end of 2011. The data are public (S&P Case-Shiller and NY Fed credit data).
I know this relationship is kind of common knowledge - at least for the foreclosure part - but I was surprised by how pronounced it is. I did this for two groups of states. In both groups, by and large, states whose house prices came down from a higher level have more people getting into credit difficulties. (I am not trying to establish a causal relationship here.)
I often find that preparing the data is much more demanding than actually producing the plot (which tells you something about the quality of ggplot). For this plot I had to learn some new things (date formatting, time series aggregation from {zoo}), all of which I found on the web (stackoverflow mostly), so thanks to all for sharing. I post my code below, maybe somebody finds it useful.
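To give an idea of the {zoo} aggregation step I mean, here is a minimal sketch (not my original code) that collapses a monthly series to quarterly averages; the series itself is made up:

```r
library(zoo)

# invented monthly series: 12 observations starting Jan 2006
prices <- zoo(1:12, as.yearmon(2006) + seq(0, 11) / 12)

# aggregate to quarters by averaging the three months in each quarter
quarterly <- aggregate(prices, as.yearqtr, mean)
quarterly
# 2006 Q1 2006 Q2 2006 Q3 2006 Q4
#       2       5       8      11
```

The same pattern works for sums or end-of-quarter values by swapping `mean` for `sum` or `last`.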

UPDATE: I got a useful comment suggesting another visualization: phase plots, i.e. plotting bankruptcy/default rates directly against the house price index. Time enters as an additional layer in the plot (I label some points with their date). Here is what you get:








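A phase plot like this can be sketched in ggplot2 roughly as follows; the data frame here is invented for illustration, not the actual Case-Shiller/NY Fed series:

```r
library(ggplot2)

# invented quarterly data: a falling house price index and rising foreclosures
df <- data.frame(
  date  = seq(as.Date("2006-01-01"), by = "quarter", length.out = 8),
  hpi   = c(180, 185, 175, 160, 140, 130, 125, 120),
  forec = c(0.2, 0.25, 0.4, 0.7, 1.1, 1.4, 1.5, 1.6)
)

p <- ggplot(df, aes(x = hpi, y = forec)) +
  geom_path() +    # connect observations in time order, tracing the "phase" path
  geom_point() +
  geom_text(data = df[c(1, nrow(df)), ],   # label first and last quarter
            aes(label = format(date, "%b %Y")), vjust = -1) +
  labs(x = "House price index", y = "New foreclosures (%)")
p
```

`geom_path()` (rather than `geom_line()`) is the key choice: it connects points in data order, so the curve traces out time even though time is not on an axis.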
These were the original plots:


Here is the second group. Nevada seems to be a case in point. But Texas and Ohio also fit the pattern: house prices there moved much less than elsewhere, and so did the percentage of people getting into trouble.


There are other ways to look at the same relationship. There is a variable in the NYFed data called "more than 90 days late" on either mortgage or balance repayments. Doing the same analysis as above, we get 


and for the second group:


Finally, we can look at a measure that cumulates all new foreclosures and bankruptcies from the first plot. That is just asking "how many bankruptcies/foreclosures have we accumulated since 2006?". Given that those events have some bearing on behaviour over several years, such a measure makes some sense if we want to know how many people are in a "default state":
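The cumulated measure itself is just a running sum of the quarterly rates; a toy sketch with invented numbers:

```r
# invented quarterly rates: % of people with a new foreclosure in each quarter
newfc <- c(0.2, 0.3, 0.5, 0.8, 1.0, 0.9)

# running total since the first quarter: everyone who has had the event so far
cumfc <- cumsum(newfc)
cumfc
# 0.2 0.5 1.0 1.8 2.8 3.7
```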



The code for this is in a gist on GitHub.




  1. Nice work.

    You should check out both plyr and reshape2 packages for data manipulation (both by Hadley, and already installed in your system since you are using ggplot2)

  3. hey chemman,
thanks for your comment. did you look at my code? I actually wanted to use ddply to get quarterly out of monthly data, but found in the end that aggregate was easier. but maybe I just didn't get it. I also used melt{reshape}. did you have something different in mind?
    cheers


UPDATE: Following up on a comment below, I used another data source, ONS JOBS02 for the labor market statistics. I report the findings below.

I read a series of articles related to the goings-on in the UK housing market: the likely effects of the new Help To Buy scheme, the 10% increase in the mean London house price over the last year, and employment statistics. I failed to reproduce some numbers cited in The Economist (below). This post talks about that.

It all starts with this blog post on The Economist:

http://www.economist.com/blogs/buttonwood/2013/09/house-prices

It talks about many things, among them employment and housing completions, and how the UK seems likely to be embarking on another round of debt-fueled growth.

For lack of a better place, I'll store my recipe for homemade pizza dough here. This will make dough for 14 people.

- 2kg strong white flour (no self-raising or other extras)
- 4 sachets of yeast, 7g each (not the super fast bicarbonate stuff)
- Salt
- Sugar
- Olive oil
- Water

The main problem is to get the right consistency, i.e. how much water to add. You'll have to do some experiments here.

I've got some questions regarding this issue; maybe someone out there has a clue.

The Austrian contingent was 300 out of 1,000 soldiers.

source: http://www.guardian.co.uk/world/2013/jun/06/israel-angry-austria-golan-heights.

Inspired by the Institute for Fiscal Studies' "Where do you fit in" application, where people can find out their position in the UK's income distribution, I wanted to find out what the picture looks like in London. Quite different. If you are in a very high percentile nationwide, the high incomes of mainly financial-sector employees in London mean that you find yourself a couple of ranks further down.

I just pushed the most recent version of the PSID panel data builder introduced a little while ago. Got some user feedback and made some improvements. The package is hosted on github.

News:

I added a reproducible example using artificial data, which you can run by calling 'example(build.panel)'. This means you can try out the package before bothering to download any data, and it provides a simple test of the main function.

I got intrigued by the numbers presented in this news article talking about the re-trial in the Amanda Knox case. The defendants, accused and initially convicted of murder, were acquitted on appeal when the judge ruled that the forensic evidence was insufficiently conclusive. The appeals judge ignored the forensic scientist's advice to retest a DNA sample, because

"The sum of the two results, both unreliable… cannot give a reliable result," he wrote.

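As a toy illustration of why two agreeing but individually unreliable tests can still be informative, here is a small Bayesian calculation. All numbers are invented for illustration; none come from the case:

```r
# Toy assumptions: the test flags a true match with probability 0.95 and a
# non-match with probability 0.30, starting from a 50/50 prior.
sens  <- 0.95   # P(test positive | true match)
fpr   <- 0.30   # P(test positive | no match)
prior <- 0.5

posterior <- function(k) {
  # posterior probability of a true match after k independent positive tests
  num <- prior * sens^k
  num / (num + (1 - prior) * fpr^k)
}

posterior(1)  # 0.76  -- one unreliable test
posterior(2)  # ~0.91 -- two agreeing unreliable tests say more than one alone
```

The point is only qualitative: if the tests are independent, the likelihood ratios multiply, so two weak results pointing the same way are stronger evidence than either on its own.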
I just finished reading the extraordinary book Tomorrow's Table by Pamela Ronald and Raoul Adamchak (I linked to Ronald's blog). In this post I want to quickly redo a calculation Adamchak does on page 16, where he explains to his students how much energy is required to produce the fertilizer used to grow one acre of corn with conventional agriculture (as opposed to organic methods).

Economists frequently use public datasets. One frequently used dataset is the Panel Study of Income Dynamics, PSID for short, maintained by the Institute for Social Research at the University of Michigan.

I'm introducing psidR, a small helper package for R that makes constructing panels from the PSID a bit easier.

One potential difficulty with the PSID is to construct a longitudinal dataset, i.e. one where individuals are followed over several survey waves.

I was recently asked by a friend whether it's worth buying a house in the UK. That is, assuming they could put down the money, whether buying was worth it as opposed to renting. Apart from obvious things like the expected length of stay in one place, the interest on mortgages, and how prices might develop, they were interested in particular in the amount of transaction costs they were likely to face: fees, taxes and so forth.

So Paul Krugman laments in his post that policy makers across Europe have blindly signed up to the "Austerity only" ticket. He cites some evidence which I find fairly convincing. I just want to raise the point that what he says cannot be used as a critique against the Monti government.

Basically what he's saying is that Monti was installed as a puppet of European creditor nations to make sure that austerity would be imposed and the country's government debt would continue to be serviced.