There is now a separate Statistical Computing Blog for everyone who comes here for statistics, econometrics, etc.

For everyone else, this blog will continue with the theme of discussing insurance, life insurance for example.

Enjoy!

# Insurance Blog

This is a blog about insurance, focusing mostly on the quant side of insurance. I hope to write about models for pricing insurance contracts, analyse some anecdotes about getting cheaper insurance and some other insurance-related bits that I cannot think of now.

## Wednesday, March 28, 2012

## Friday, March 23, 2012

### The Julia Language

The purpose of this post is to mention the Julia language, a new language for technical computing. Its main selling point is speed: the project's own benchmarks claim it runs faster than R, MATLAB, etc. The code is compiled just-in-time. On the back end, among other things, it uses LAPACK and ARPACK.

So check out http://julialang.org/


## Wednesday, March 21, 2012

### R Programming Syntax Reference To Get You Started Quickly So You Can Start Implementing Your Insurance Models, Probably Life Insurance

**EDIT:** This article has been rewritten and updated on my analytics blog: R Programming Syntax Quickstart

If you have ANY programming experience in other languages, this guide will get you started in R very quickly and then you can start implementing your favourite insurance models. A special mention goes out to my friend who is currently working on a health insurance project.

## Logic Operators

| Expression | Meaning |
| --- | --- |
| a == b | a equals b |
| a != b | a is not equal to b |
| a > b | a is greater than b |
| a < b | a is less than b |
| a >= b | a is greater than OR equal to b |
| a <= b | a is less than OR equal to b |
| (condition 1) & (condition 2) | (condition 1) AND (condition 2) |
| (condition 1) \| (condition 2) | (condition 1) OR (condition 2) |

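Also, try the following to understand "&&" and "||". Unlike "&" and "|", which compare element by element, "&&" and "||" are meant for single TRUE/FALSE values. The "&&" and "||" results shown here come from older versions of R, which silently used only the first element of each vector; from R 4.3.0 onwards these two calls raise an error instead.

> a<-c(1:10)

> b<-a

> c<-b

> c[1:4]<-.5

> (a == b) && (a > c)

[1] TRUE

> (a == b) & (a > c)

[1] TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE

> (a == b) | (a > c)

[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

> (a == b) || (a > c)

[1] TRUE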
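When a whole-vector comparison has to feed a single decision (for example inside an if statement), a common approach is to collapse it first with all() or any(). The following is a minimal sketch reusing the vectors above; the variable name elementwise is only illustrative.

```r
a <- 1:10
b <- a
c <- b
c[1:4] <- 0.5

# Element-by-element AND: one TRUE/FALSE per position
elementwise <- (a == b) & (a > c)
print(elementwise)

# Collapse to a single value before using it as a condition
print(all(elementwise))  # TRUE only if every element is TRUE
print(any(elementwise))  # TRUE if at least one element is TRUE

# "&&" is safe here because both sides have length one
if (all(a == b) && any(a > c)) {
  print("a equals b everywhere, and a exceeds c somewhere")
}
```

This works the same way in old and new versions of R, because "&&" only ever sees single values.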

## IF statements

The general example:

if( *condition* ) {

} else if( *other condition* ) {

} else {

}

The specific example:

a<-55

if( a <= 54.9 ) {

print("a is less than or equal to 54.9")

} else if( a == 55 ){

print("a equals 55")

} else {

print("a is greater than 54.9 and not 55")

}

## For Loops

The general example:

for(*variable* in *vector*) {

}

Specific examples:

#example 1

for(i in 1:10) {

print(i)

}

#example 2

index.vector<-c(4,3,7,5)

numberz<-runif(10)

print(numberz)

for(i in index.vector) {

print(numberz[i])

}

#example 3

for(i in 1:10) {

if(i == 3) {

next

} else if(i == 7) {

break

}

print(i)

}

#example 4

mat<-matrix(0,3,4)

print(mat)

for(i in 1:3) {

for(j in 1:4) {

mat[i,j]<-rnorm(1)

}

}
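Example 4 fills the matrix one cell at a time. As a sketch of a more idiomatic alternative, assuming you only need independent standard normal draws in every cell, the whole matrix can be generated in one call. The name mat2 and the seed value are just illustrative choices.

```r
set.seed(1)  # fixed seed so the draws are reproducible

# 12 standard normal draws arranged into 3 rows and 4 columns
mat2 <- matrix(rnorm(3 * 4), nrow = 3, ncol = 4)
print(mat2)
print(dim(mat2))
```

The nested loop is still the right tool when each cell depends on its row or column index in a way that cannot be vectorised easily.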

## While Loops

General Example:

while(*condition*) {

}

Note that you must write something within the while loop that updates at least one of the variables in the condition. Otherwise, you could end up with an infinite loop.

Specific Example:

i<- -1

while( i < 10) {

print(i)

i<-i+1

}

## Repeat Loop

In a repeat loop, you must not only update the variables explicitly but also test the exit condition explicitly.

Specific Example:

i<- -1

repeat{

print(i)

i<-i+1

if( i == 10) {

break

}

}
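To confirm that the while and repeat versions above visit the same values, a small sketch can store the values instead of printing them. The vector names vals.while and vals.repeat are assumptions made for this illustration.

```r
# while loop: the condition is tested before each pass
vals.while <- c()
i <- -1
while (i < 10) {
  vals.while <- c(vals.while, i)
  i <- i + 1
}

# repeat loop: the exit test is written explicitly inside the body
vals.repeat <- c()
i <- -1
repeat {
  vals.repeat <- c(vals.repeat, i)
  i <- i + 1
  if (i == 10) {
    break
  }
}

print(vals.while)
print(identical(vals.while, vals.repeat))
```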

## Functions

For example, you could have a function that evaluates an insurance contract pricing equation. A function could return the amount of excess that must be paid on an insurance claim. A function can call other functions.

General Example:

*function_name*<-function(*parameters*) {

return(*return_variable*)

}

Specific Example:

calcQuadratic<-function(a, b, c, x) {

y<-a*x*x+b*x+c

return(y)

}

calcQuadratic(2,3,5,.07)

my.var<-calcQuadratic(3.32,7.6,5.999,3.2)

print(my.var)
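As mentioned at the start of this section, a function could return the excess payable on a claim. Here is a minimal sketch under the assumption that the customer pays the claim amount up to a fixed excess and the insurer pays the rest. The names calcExcessPaid and calcInsurerPayout are hypothetical and not part of any package.

```r
# Amount the customer pays: the claim itself, capped at the excess
calcExcessPaid <- function(claim, excess) {
  return(pmin(claim, excess))
}

# Amount the insurer pays: whatever is left of the claim.
# This function calls calcExcessPaid, showing one function calling another.
calcInsurerPayout <- function(claim, excess) {
  return(claim - calcExcessPaid(claim, excess))
}

claims <- c(0, 300, 700, 6000)
print(calcExcessPaid(claims, 500))
print(calcInsurerPayout(claims, 500))
```

Using pmin() rather than an if statement lets the same function handle a single claim or a whole vector of claims.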

Easy as purchasing life insurance without asking any questions that strain the mind of the salesperson, right!? :-P Just not as pointless.

## Thursday, August 25, 2011

### Testing for seasonal unit roots in R

**EDIT:** This article has been rewritten and updated on my analytics blog: Testing for seasonal unit roots in R

Suppose that for our new life insurance product, we want to model and forecast accidental deaths in the US. Suppose that our dataset is seasonal and that we intend to use a seasonal ARIMA model. We need to test our time series to see if it is seasonally integrated. This will be the topic of this insurance quant blog post.

We will be using R and I will assume that the reader knows about R and how it could be applied in insurance. Briefly, R is very similar to MATLAB, SAS, etc. The website is http://www.r-project.org

I know that I have not written a "formal introduction" to R or how it can be used to model insurance, but that will have to wait because I deem it more important to document new packages/features of those packages as they come out.

Version 3 of the "forecast" R package was published yesterday. It has a new function for testing for seasonal unit roots. The function is nsdiffs().

R also comes with a US Accidental Deaths dataset, which we will use in this insurance blog post for our example life insurer problem. Right, so we are starting a life insurance business and we want to forecast accidental deaths.

So to follow along, open up R and type the following:

USAccDeaths

You will then see the US Accidental Deaths dataset. You can see that it is monthly.

Now install the "forecast" R package from CRAN. Then load it. By the time you read this, forecast version 3.01 (at least) should be available. Version 3.00 would also be sufficient to work through this post, but I strongly recommend 3.01.

To view the help file for nsdiffs(), type:

?nsdiffs

It will bring up a page that is for both nsdiffs and ndiffs.

There are two tests that have been implemented in nsdiffs: the OCSB test (default) and the Canova-Hansen test. You can also specify the seasonal period of your dataset. USAccDeaths is a ts object, and the seasonal period or "frequency" is stored as part of that object.
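To see the seasonal period that is stored with the dataset, you can inspect it with base R functions. This sketch uses only functions that ship with R and does not need the forecast package.

```r
# USAccDeaths ships with R in the "datasets" package
print(class(USAccDeaths))      # the time series class
print(frequency(USAccDeaths))  # observations per year (the seasonal period)
print(start(USAccDeaths))      # first year and month
print(end(USAccDeaths))        # last year and month
```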

To perform the OCSB test:

nsdiffs(USAccDeaths)

To perform the Canova-Hansen test:

nsdiffs(USAccDeaths, test="ch")

The output is the number of seasonal differences required: "1" means that there is a seasonal unit root and "0" that there is no seasonal unit root.

You will notice that the two tests give different answers. This is because the Canova-Hansen test is less likely to decide in favour of a seasonal unit root than the OCSB test: unlike the Canova-Hansen test, the OCSB test has a null hypothesis of a seasonal unit root. Further, Osborn (1990) writes that when in doubt, it's better to seasonally difference.
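If you decide to seasonally difference, base R's diff() can do it. The sketch below assumes one seasonal difference at the series' own frequency; the name deaths.sdiff is illustrative.

```r
m <- frequency(USAccDeaths)  # seasonal period of the series

# One seasonal difference: each month minus the same month a year earlier
deaths.sdiff <- diff(USAccDeaths, lag = m)

print(length(USAccDeaths))
print(length(deaths.sdiff))  # shorter by one seasonal period
print(head(deaths.sdiff))
```

The differenced series loses its first m observations, which is worth keeping in mind with a short dataset like this one.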

Enjoy this life insurance related post! :-)

**Bibliography:**

Osborn DR (1990) "A survey of seasonality in UK macroeconomic variables", International Journal of Forecasting 6(3):327-336.

Osborn DR, Chui APL, Smith J, and Birchenhall CR (1988) "Seasonality and the order of integration for consumption", Oxford Bulletin of Economics and Statistics 50(4):361-377.

Canova F and Hansen BE (1995) "Are Seasonal Patterns Constant over Time? A Test for Seasonal Stability", Journal of Business and Economic Statistics 13(3):237-252.

## Monday, August 15, 2011

### A model for Insurance Losses

Our first insurance model: a mathematical model for insurance losses.

**Firstly, some notation:**

E(X) = The expected value of X

Var(X) = The variance of X

StDev(X) = The standard deviation of X

SQRT(X) = The square root of X

**Individual Insurance Losses**

Let's use car insurance as an example because a lot of people have had first-hand experience with it. Most Australians are forced to drive and hence, reluctantly, experience auto insurance.

Suppose that there is a 70% chance that I will not make an insurance claim, presumably because I will not have a car accident. Then say there is a 20% chance that I will make an insurance claim for something small: $700, something mostly cosmetic and perhaps only involving my car.

Suppose that there is a 5% chance that I will make a $6000 insurance claim.

A 3% chance that I will cause some appreciable damage to the tune of $18,000.

A 1.5% chance that I will write-off two moderately priced cars for a total of $60,000.

Finally, a 0.5% chance of causing a catastrophic accident with a $350,000 damage bill.

Suppose that the amount the insurance company must pay (the insurance loss) is a random variable, X. From the above, assuming that the insurance excess is zero, we can deduce a discrete distribution for X, f(x).

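The discrete distribution for the insurance payout is tabulated below, with the probability of each payout on the left and the amount of the payout on the right.

| Probability | Payout |
| --- | --- |
| 0.70 | $0 |
| 0.20 | $700 |
| 0.05 | $6,000 |
| 0.03 | $18,000 |
| 0.015 | $60,000 |
| 0.005 | $350,000 |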

So E(X) = .7*0+.2*700+.05*6000+.03*18000+.015*60000+.005*350000=$3,630
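The same expectation can be checked in R. This is a sketch using the probabilities and payouts from this post; the vector names prob, payout and expected.payout are assumptions made for illustration.

```r
prob   <- c(0.70, 0.20, 0.05, 0.03, 0.015, 0.005)
payout <- c(0, 700, 6000, 18000, 60000, 350000)

# The probabilities of a discrete distribution must sum to one
print(sum(prob))

# E(X) is the probability-weighted sum of the payouts
expected.payout <- sum(prob * payout)
print(expected.payout)
```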

If there is an insurance excess, you can subtract the excess from each insurance payout, with the result floored at zero so that a payout is never negative.

**Homework Exercises:**

1. Find Var(X) and the standard deviation of X.

2. Suppose that there is an insurance excess of $500. What would the mean insurance payout and its standard deviation be then?
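If you want to check your homework answers in R, a sketch like the following may help. It relies on the identity Var(X) = E(X^2) - E(X)^2 and assumes the excess is simply subtracted from each payout with a floor at zero. discreteMean, discreteSd and the other names are hypothetical helpers, not functions from any package.

```r
prob   <- c(0.70, 0.20, 0.05, 0.03, 0.015, 0.005)
payout <- c(0, 700, 6000, 18000, 60000, 350000)

# Mean of a discrete distribution with values x and probabilities p
discreteMean <- function(x, p) {
  return(sum(p * x))
}

# Standard deviation via Var(X) = E(X^2) - E(X)^2
discreteSd <- function(x, p) {
  variance <- sum(p * x^2) - discreteMean(x, p)^2
  return(sqrt(variance))
}

# Exercise 1: no excess
print(discreteMean(payout, prob))
print(discreteSd(payout, prob))

# Exercise 2: a $500 excess reduces every payout, but not below zero
payout.after.excess <- pmax(payout - 500, 0)
print(discreteMean(payout.after.excess, prob))
print(discreteSd(payout.after.excess, prob))
```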

The distribution of insurance losses (insurance payouts) can also be continuous. However, that will be covered in a later post.

**Collective insurance payouts**

The random variable X is the insurance payout for one individual. Now the expected insurance payout for all individuals/customers in a given time period is the sum of their individual means. In this post, we will assume that everyone has the same insurance payout distribution. We will denote the total/collective insurance payout as Y.

Suppose that there are n insurance customers. Then:

E(Y) = the sum of the expected insurance payouts for each insurance customer = n*E(X).

If we explicitly assume that all X's are independent, Var(Y) = n*Var(X). (Remember that we are assuming that every insurance customer has the same distribution of X.)

Then StDev(Y) = StDev(X)*SQRT(n)

The standard deviation is a measure of risk. For a large n, StDev(Y) is much less than the sum of the individual standard deviations, n*StDev(X).
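To see the effect of pooling, here is a sketch that applies the formulas above for a few portfolio sizes. It assumes independent customers who all share the payout distribution from this post; the portfolio sizes in n and all variable names are arbitrary choices for illustration.

```r
prob   <- c(0.70, 0.20, 0.05, 0.03, 0.015, 0.005)
payout <- c(0, 700, 6000, 18000, 60000, 350000)

mean.x <- sum(prob * payout)
sd.x   <- sqrt(sum(prob * payout^2) - mean.x^2)

n <- c(1, 100, 10000)  # hypothetical numbers of customers

mean.y <- n * mean.x      # E(Y) = n*E(X)
sd.y   <- sqrt(n) * sd.x  # StDev(Y) = StDev(X)*SQRT(n)

# Risk relative to the expected total payout shrinks like 1/SQRT(n)
print(data.frame(n = n, mean.y = mean.y, sd.y = sd.y,
                 sd.over.mean = sd.y / mean.y))
```

The last column is the standard deviation as a fraction of the expected total payout, which is one way to see why a larger pool of independent customers is relatively less risky.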

**Bibliography:**

Course notes for MTH3251/ETC3510/ETC5351 at Monash University by Fima Klebaner, Semester 1, 2009.

## Thursday, July 21, 2011

### First insurance blog post!

HELLO! This is the first post and also a description of what this site will be about. This will be a site about insurance, probably focusing more on the quant side of insurance than other areas such as where to buy insurance; the internet should have many other places where you can search for "buy auto insurance" :-P

However, I do hope to also write about how you could get cheaper insurance and the mathematical reasoning behind it. For example, I have heard anecdotally that insurance for a Mercedes Benz is cheaper than insurance for a Holden Commodore. Why? The anecdote says that it is "because there are a lot of kids who drive Commodores and do stupid stuff, whereas a Mercedes Benz is driven mostly by older people who are more careful". Would the insurance company really think like that? How about Pr(Crash | Commodore) > Pr(Crash | Mercedes Benz)? Now to make this fit a little better with the insurance model that will be published in the next post: (mean payout for a Commodore) > (mean payout for a Mercedes Benz). However, whether this insurance contract pricing anecdote is true can only be verified with actual data.

Have fun,

Insurance Blog
