I know this relationship is more or less common knowledge, at least for the foreclosure part, but I was surprised by how pronounced it is. I did this for two groups of states. In both groups, states whose house prices came down from a higher level generally have more people getting into credit difficulties. (I am not trying to establish a causal relationship here.)
I often find that preparing the data is much more demanding than actually producing the plot (which tells you something about the quality of ggplot). For this plot I had to learn some new things (date formatting, time series aggregation from {zoo}), all of which I found on the web (mostly Stack Overflow), so thanks to all for sharing. I post my code below; maybe somebody finds it useful.
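The monthly-to-quarterly step can be sketched roughly like this. It is not the code from the gist: it uses base R's `aggregate` instead of {zoo}, and the data frame `monthly`, its column names, and the numbers are made up purely for illustration.

```r
# Hypothetical monthly house price index for two states (made-up numbers).
monthly <- data.frame(
  state = rep(c("NV", "TX"), each = 6),
  date  = rep(seq(as.Date("2006-01-01"), by = "month", length.out = 6), times = 2),
  hpi   = c(100, 102, 104, 103, 101, 99,
            100, 100, 101, 101, 102, 102)
)

# Date formatting: build a "2006 Q1"-style label from each date.
monthly$quarter <- paste(format(monthly$date, "%Y"), quarters(monthly$date))

# Aggregate monthly values to quarterly means, separately for each state.
quarterly <- aggregate(hpi ~ state + quarter, data = monthly, FUN = mean)
quarterly <- quarterly[order(quarterly$state, quarterly$quarter), ]
print(quarterly)
```

With {zoo}, `as.yearqtr` gives a proper quarterly index instead of a text label, which is handier if the result should stay a time series.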
UPDATE: I got a useful comment suggesting another visualization: using phase plots, that is, plotting bankruptcy/default rates directly against the house price index. Here I add the time dimension as an additional layer in the plot (I label some points with their date). Here is what you get:
These were the original plots:
Here is the second group. Nevada seems to be a case in point. But also Texas and Ohio fit the pattern: the house price there moved much less than elsewhere, and so did the percentage of people getting into trouble.
There are other ways to look at the same relationship. There is a variable in the NYFed data called "more than 90 days late" on either mortgage or balance repayments. Doing the same analysis as above, we get:
and for the second group:
Finally, we can look at a measure that cumulates all new foreclosures and bankruptcies from the first plot. That's just saying "how many bankruptcies/foreclosures have we accumulated since 2006". Given that those events have some bearing on behaviour over several years, such a measure makes some sense if we want to know how many people are in "default state":
The code for this is in a gist on GitHub.
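For the cumulative measure, a rough sketch of the idea (again separate from the gist) is a running sum of new cases within each state. The data frame `events`, its columns, and the rates below are invented for illustration only.

```r
# Hypothetical quarterly rates of new foreclosures/bankruptcies (made-up numbers).
events <- data.frame(
  state    = rep(c("NV", "TX"), each = 4),
  quarter  = rep(c("2006 Q1", "2006 Q2", "2006 Q3", "2006 Q4"), times = 2),
  new_rate = c(0.2, 0.3, 0.5, 0.8,
               0.1, 0.1, 0.2, 0.2)
)

# Make sure rows are in time order within each state before accumulating.
events <- events[order(events$state, events$quarter), ]

# Running total per state: "how much have we accumulated since 2006".
events$cum_rate <- ave(events$new_rate, events$state, FUN = cumsum)
print(events)
```

`ave` keeps the original row layout, so the cumulative column can go straight into a ggplot call next to the raw rates.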
Nice work.
You should check out both the plyr and reshape2 packages for data manipulation (both by Hadley, and already installed on your system since you are using ggplot2).
hey chemman,
thanks for your comment. did you look at my code? I actually wanted to use ddply for getting quarterly out of monthly data, but found in the end that aggregate was easier. but maybe i just didn't get it. also used melt{reshape}. did you have something different in mind?
cheers