Publication: Boylamsal verilerde semiparametrik karma etki modelleri ve bir uygulama
Abstract
BOYLAMSAL VERİLERDE SEMİPARAMETRİK KARMA ETKİ MODELLERİ VE BİR UYGULAMA
Boylamsal veriler, aynı birimlere ait özelliklerin zaman içerisinde tekrarlı olarak ölçülmesi ile elde edilen verilerdir. Boylamsal verilerin analizi, klasik regresyon modelleri ile gerçekleşemediğinden bu veriler için özel regresyon modelleri geliştirilmiştir. Bu modellerden en sık kullanılanları parametrik regresyon modellerinden Karma Etki Modelleri (KEM) ve Genelleştirilmiş Tahmin Denklemleridir (GTD). KEM’de sabit ve tesadüfi etkiler modele birlikte dahil edilir. GTD ise bağımlı değişkenin tekrarlı ölçümleri arasındaki korelasyonu hesaba katarak anakütle ortalama değerindeki marjinal değişimi ortaya çıkarır. Her iki yöntem de bağımlı değişken ile bağımsız değişken(ler) arasındaki ilişkinin doğrusal olması veya ilişkinin bilinen parametrik fonksiyonlarla ifade edilmesi temeline dayanmaktadır. Bu durumda da gerçek ilişki yapısı ortaya çıkarılamamaktadır. Özellikle sosyal bilimlerde bu durum güvenilir ve mantıklı sonuçlara ulaşılmasını engelleyecektir. Boylamsal verilerde bağımlı değişken ile bağımsız değişken(ler) arasındaki ilişkinin daha karmaşık olduğu durumlarda, parametrik olmayan regresyon modeli kullanılmaktadır. Bu modellerde birden fazla bağımsız değişken olması durumunda, hesaplamalarda ve yorumlama aşamasında sıkıntılar ortaya çıkmaktadır. Bu durumda da semiparametrik regresyon modelleri devreye girmektedir. Bu modellerde bağımlı değişken ile bazı bağımsız değişken(ler) arasındaki ilişki doğrusal ya da bilinen fonksiyonlarla ifade edilen bir yapıda iken, bazı bağımsız değişken(ler) ile doğrusal olmayan ya da bilinen fonksiyonlarla ifade edilemeyen bir yapıda olabilir. Bu durumda, boylamsal veriler söz konusu olduğundan, birim-özel etkilerin de dahil edildiği Semiparametrik Karma Etki Modelleri (SKEM) en uygun sonucu vermektedir. Bu çalışmada, bir partinin oy oranını etkileyen faktörlerin belirlenmesi amacıyla iki boylamsal veri seti oluşturulmuştur. Birinci veri seti 2007 ve 2011 yıllarını, ikinci veri seti ise 2002, 2007 ve 2011 yıllarını kapsamaktadır. Her iki veri seti için uygun SKEM oluşturulmuş ve elde edilen sonuçlar yorumlanmıştır.
SEMIPARAMETRIC MIXED EFFECTS MODELS IN LONGITUDINAL DATA AND AN APPLICATION
Longitudinal data are obtained by repeatedly measuring characteristics of the same units over time. Because classical regression models cannot be used to analyze such data, specialized regression models have been developed for them. The most frequently used of these are two parametric regression models: Mixed-Effects Models (MEM) and Generalized Estimating Equations (GEE). In MEM, fixed and random effects are included in the model together. GEE, in turn, accounts for the correlation between repeated measurements of the dependent variable and reveals the marginal change in the population mean. Both methods assume that the relation between the dependent variable and the independent variable(s) is linear or can be expressed through known parametric functions. Under this restriction, the actual structure of the relation cannot be revealed, which prevents reliable and reasonable conclusions, particularly in the social sciences. When the relation between the dependent variable and the independent variable(s) in longitudinal data is more complicated, nonparametric regression models are used. With more than one independent variable, however, these models become difficult to compute and to interpret. Semiparametric regression models address this difficulty. In these models, the relation between the dependent variable and some of the independent variables is linear or expressed through known functions, while the relation with other independent variables may be nonlinear or not expressible through known functions. Because the data here are longitudinal, Semiparametric Mixed-Effects Models (SMEM), which also include subject-specific effects, yield the most suitable results. In this study, two longitudinal data sets are constructed to identify the factors that affect the vote share of a party. The first data set covers the years 2007 and 2011, and the second covers the years 2002, 2007 and 2011. A suitable SMEM is fitted to each data set, and the results are interpreted.
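The abstract gives no notation, so the following is only an illustrative sketch of how a semiparametric mixed-effects model of the kind described above is commonly written. All symbols are assumptions introduced here, not the author's: i indexes units, j indexes measurement occasions, x_ij collects the covariates that enter linearly, t_ij is a covariate that enters through an unspecified smooth function f, and b_i is the subject-specific random effect.

```latex
% Illustrative general form only; the notation is assumed, not taken from the thesis.
\begin{equation}
  y_{ij} = x_{ij}^{\top}\beta + f(t_{ij}) + z_{ij}^{\top} b_i + \varepsilon_{ij},
  \qquad
  b_i \sim N(0, D), \quad \varepsilon_{ij} \sim N(0, \sigma^{2}),
\end{equation}
% x_{ij}^T beta : parametric (linear) part with fixed effects beta
% f(t_{ij})     : nonparametric part, an unknown smooth function
% z_{ij}^T b_i  : subject-specific random effects with covariance matrix D
```

Under these assumptions, setting f to zero gives a linear mixed-effects model, while dropping both the linear term and the random effects leaves a purely nonparametric regression.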
