ここでは71羽のニワトリについて、エサ(6種類)と体重のデータを使う。
連続変数weight
をカテゴリー変数feed
で回帰すると、デフォルトではcasein
がreferenceになっている。
> data(chickwts) > table(chickwts$feed) casein horsebean linseed meatmeal soybean sunflower 12 10 12 11 14 12 > fit1 <- lm(weight~feed, data=chickwts) > summary(fit1) ~~~ Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 323.583 15.834 20.436 < 2e-16 *** feedhorsebean -163.383 23.485 -6.957 2.07e-09 *** feedlinseed -104.833 22.393 -4.682 1.49e-05 *** feedmeatmeal -46.674 22.896 -2.039 0.045567 * feedsoybean -77.155 21.578 -3.576 0.000665 *** feedsunflower 5.333 22.393 0.238 0.812495 ~~~
出力結果は一部省略している。
relevel( )を使って基準だけを指定する
sunflower
をreferenceに指定する。
fit2 <- lm(weight~relevel(feed, ref="sunflower"), data=chickwts)
> summary(fit2) ~~~ Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 328.917 15.834 20.773 < 2e-16 *** relevel(feed, ref = "sunflower")casein -5.333 22.393 -0.238 0.812495 relevel(feed, ref = "sunflower")horsebean -168.717 23.485 -7.184 8.20e-10 *** relevel(feed, ref = "sunflower")linseed -110.167 22.393 -4.920 6.21e-06 *** relevel(feed, ref = "sunflower")meatmeal -52.008 22.896 -2.271 0.026435 * relevel(feed, ref = "sunflower")soybean -82.488 21.578 -3.823 0.000298 *** ~~~
カテゴリーの順序を全て指定する
デフォルトの逆順にしてみる。
fit3 <- lm(weight~factor(feed, levels=c("sunflower", "soybean", "meatmeal", "linseed", "horsebean", "casein")), data=chickwts)
> summary(fit3) ~~~ Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 328.917 15.834 20.773 < 2e-16 *** factor(feed, levels = c("sunflower", "soybean", "meatmeal", "linseed", "horsebean", "casein"))soybean -82.488 21.578 -3.823 0.000298 *** factor(feed, levels = c("sunflower", "soybean", "meatmeal", "linseed", "horsebean", "casein"))meatmeal -52.008 22.896 -2.271 0.026435 * factor(feed, levels = c("sunflower", "soybean", "meatmeal", "linseed", "horsebean", "casein"))linseed -110.167 22.393 -4.920 6.21e-06 *** factor(feed, levels = c("sunflower", "soybean", "meatmeal", "linseed", "horsebean", "casein"))horsebean -168.717 23.485 -7.184 8.20e-10 *** factor(feed, levels = c("sunflower", "soybean", "meatmeal", "linseed", "horsebean", "casein"))casein -5.333 22.393 -0.238 0.812495 ~~~
カテゴリー名が長くなって見にくい。
同じ要領で新しい変数を作成してからなら短くて済む。
chickwts$x_ <- factor(feed, levels=c("sunflower", "soybean", "meatmeal", "linseed", "horsebean", "casein")) fit4 <- lm(weight~x_, data=chickwts)
~~~ Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 328.917 15.834 20.773 < 2e-16 *** x_soybean -82.488 21.578 -3.823 0.000298 *** x_meatmeal -52.008 22.896 -2.271 0.026435 * x_linseed -110.167 22.393 -4.920 6.21e-06 *** x_horsebean -168.717 23.485 -7.184 8.20e-10 *** x_casein -5.333 22.393 -0.238 0.812495 ~~~
おわりに
- lm( )に入れる前にデータ整形しておけば済む話かも。
- キャットタワーのバスケットがお気に入りみたいですが、たわみ方が半端ないです。