lm - Removing a character-level outlier in R


I have a linear model, model1 <- lm(divorce_rate ~ marriage_rate + median_age + population). The leverage plot shows an outlier at observation 28 (state variable ID "Nevada"). I'd like to specify the model without Nevada in the dataset. I tried the following but got stuck.

data<-read.dta("census.dta")
attach(data)
data1<-data.frame(pop,divorce,marriage,popurban,medage,divrate,marrate)
attach(data1)
model1<-lm(divrate~marrate+medage+pop,data=data1)
summary(model1)
layout(matrix(1:4,2,2))
plot(model1)
dfbetaPlots(lm(divrate~marrate+medage+pop),id.n=50)
vif(model1)

datanv<-data[!data$state == "Nevada",]
attach(datanv)
model3<-lm(divrate~marrate+medage+pop,data=datanv)

The last line of the above code gives me:

Error in model.frame.default(formula = divrate ~ marrate + medage + pop,  :
  variable lengths differ (found 'medage')


I suspect you have a glitch in your code such that you have attach()ed copies still lying around in your environment; that's why it's best practice not to use attach(). The following code works for me:

library(foreign)
## best not to call your data 'data'
mydata <- read.dta("http://www.stata-press.com/data/r8/census.dta")

I didn't find divrate or marrate in the data set; I'm going to speculate that you want per capita rates:

## best practice is to use a new name rather than transforming 'in place'
mydata2 <- transform(mydata, marrate=marriage/pop, divrate=divorce/pop)
model1 <- lm(divrate~marrate+medage+pop, data=mydata2)
library(car)
plot(model1)
dfbetaPlots(model1)
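If the derived rate variables exist only as attach()ed copies and not as columns of the subsetted data frame, that would be one plausible way to get a "variable lengths differ" error. The sketch below reproduces that mechanism on a made-up five-row data frame; `raw`, `derived`, `rate` and all the values are hypothetical stand-ins, not the census data.

```r
## Made-up toy data (hypothetical values, not the census data)
raw <- data.frame(state = c("a", "b", "c", "d", "e"),
                  x     = c(1, 2, 3, 4, 5),
                  count = c(2, 4, 6, 8, 10),
                  pop   = c(1, 1, 2, 2, 5))

## A derived variable that lives in a separate, attach()ed data frame
derived <- data.frame(rate = raw$count / raw$pop)
attach(derived)                      # 'rate' (length 5) is now on the search path

## Drop one row from the raw data; note that raw_sub has no 'rate' column
raw_sub <- raw[raw$state != "e", ]   # 4 rows

## lm() takes 'x' from raw_sub (length 4) but falls back to the
## attached 'rate' (length 5), so the lengths disagree
res <- try(lm(rate ~ x, data = raw_sub), silent = TRUE)
detach(derived)

## Without attach(): keep the derived variable inside the data frame
raw2 <- transform(raw, rate = count / pop)
fit  <- lm(rate ~ x, data = subset(raw2, state != "e"))
```

Here `res` holds the error rather than a fitted model, while `fit` is estimated on the four remaining rows because every variable in the formula comes from the same data frame.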

This works fine for me in a clean session:

datanv <- subset(mydata2, state != "Nevada")
## update() may be nice to avoid repeating the details of
##   the model specification (not necessary in this case)
model3 <- update(model1, data=datanv)

Or you can use the subset argument:

model4 <- update(model1, subset=(state != "Nevada"))
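As a quick check that the two update() routes refit the same model, here is a minimal sketch on R's built-in mtcars data instead of the census file. The column name `car` and the choice of "Toyota Corolla" as the row to drop are arbitrary, for illustration only.

```r
## Built-in data; put the row names in a column so rows can be dropped by name
cars_named <- transform(mtcars, car = rownames(mtcars))

full <- lm(mpg ~ wt + hp, data = cars_named)

## Route 1: build the reduced data frame first, then swap it in
dropped  <- subset(cars_named, car != "Toyota Corolla")
fit_data <- update(full, data = dropped)

## Route 2: keep the same data and pass a subset expression
fit_subset <- update(full, subset = (car != "Toyota Corolla"))

## Both refits use one observation fewer and give the same coefficients
same_coefs <- isTRUE(all.equal(coef(fit_data), coef(fit_subset)))
c(nobs(full), nobs(fit_data), nobs(fit_subset))
```

The subset route avoids creating a second data frame, while the data route keeps the reduced data around for later plots or diagnostics.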
