
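Topic: [Original] A possible basis for choosing a pandemic-response model (know yourself and know your enemy, and you will never be defeated) -- 学步桥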

On the method for calculating excess mortality -- supplementary post

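As I understand it, inferring unknown data from known data starts with finding the "invariance within change" (the phrase in quotation marks is a well-known saying of an elderly scholar). That invariance need not be a simple constant; it may not even be linear.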

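That is why there are different models for estimating, from historical deaths, the number of deaths expected in the absence of a specific new factor: each model embodies a different understanding of what is "invariant." More finely broken-down inputs bring the invariance a model assumes closer to the real world. For example, assuming "everyone has the same probability of dying" is less accurate than assuming "everyone of the same age has the same probability of dying," which in turn is less accurate than assuming "everyone of the same age and the same income has the same probability of dying."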
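To make the comparison concrete, here is a minimal Python sketch of the first two assumptions. All numbers are made up for illustration and the function names are hypothetical; none of this comes from the quoted paper.

```python
# Hypothetical baseline year: age group -> (population, deaths).
baseline = {"0-59": (800_000, 800), "60+": (200_000, 4_000)}
# Hypothetical target year: same total population, but older.
target_pop = {"0-59": 750_000, "60+": 250_000}


def expected_uniform(baseline, target_pop):
    """Assume everyone has the same probability of dying (one crude rate)."""
    total_pop = sum(pop for pop, _ in baseline.values())
    total_deaths = sum(deaths for _, deaths in baseline.values())
    crude_rate = total_deaths / total_pop
    return crude_rate * sum(target_pop.values())


def expected_by_age(baseline, target_pop):
    """Assume everyone of the same age group has the same probability of dying."""
    return sum(deaths / pop * target_pop[group]
               for group, (pop, deaths) in baseline.items())


print(round(expected_uniform(baseline, target_pop)))
print(round(expected_by_age(baseline, target_pop)))
```

With these invented numbers the uniform assumption expects about 4,800 deaths and the age-specific one about 5,750. The whole gap comes from the population getting older, which the cruder model cannot see.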

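But the more precise the model, the more detailed the input data it needs. Because the paper quoted in the main post compares locations worldwide, it may not have had access to very detailed data, such as the distribution of the population by age, income, underlying disease, and level of medical care. Many forum members have pointed out that frail people with serious underlying conditions already died in the first year, so the second year's data look different; that is one example of the problem. With limited data, the paper's approach is to bring in several models, test each one's predictions against observed data, and take a weighted average according to how well each model predicted. The idea is simple, but it does have a number of advantages.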
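The point about the frail group can be illustrated with a toy calculation. This is only a sketch under strong assumptions: two groups with invented sizes and death probabilities, a first-year shock that doubles the frail group's death probability, and nobody entering or leaving either group except by dying.

```python
def one_year(frail, robust, p_frail, p_robust):
    """Return (deaths, surviving_frail, surviving_robust) after one year."""
    d_frail = frail * p_frail
    d_robust = robust * p_robust
    return d_frail + d_robust, frail - d_frail, robust - d_robust


# Invented numbers, for illustration only.
P_FRAIL, P_ROBUST = 0.10, 0.005
frail0, robust0 = 100_000, 900_000

# A normal year, used as the naive "same as before" baseline.
normal_deaths, _, _ = one_year(frail0, robust0, P_FRAIL, P_ROBUST)
# Year 1: the shock doubles the death probability of the frail group.
shock_deaths, frail1, robust1 = one_year(frail0, robust0, 2 * P_FRAIL, P_ROBUST)
# Year 2: the shock is gone, but fewer frail people are left.
year2_deaths, _, _ = one_year(frail1, robust1, P_FRAIL, P_ROBUST)

print(normal_deaths, shock_deaths, year2_deaths)
```

In this toy setup a normal year has about 14,500 deaths, the shock year about 24,500, and the following year only about 12,478, so a baseline that assumes "the same as a normal year" would report negative excess deaths in year 2 even though nothing protective happened.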

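Many forum members are now estimating excess deaths from the data. I suggest consulting the method in that paper, quoted below, rather than simply assuming that the same number of people should die every year, or that deaths each year should follow the same trend.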

To estimate expected mortality, we developed six models, each fit separately by location. The first four models were based on first estimating the weekly (or monthly) seasonal pattern of mortality and then estimating the time trend in weekly or monthly mortality not explained by seasonality. We used a Bayesian spline to estimate the weekly seasonal pattern for each location using data from 2010, or the earliest year after 2010 when such data first became available, until around February, 2020, when the COVID-19 pandemic started for each location (appendix p 48). Second, using the same Bayesian spline, we estimated the time trend in the residuals (additional details provided in the appendix, pp 38–40). By combining the seasonal and secular trends, we generated predictions of the expected level of mortality in 2020 and 2021.
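The two-step idea in this paragraph (seasonal pattern first, then a time trend in what seasonality does not explain) can be sketched in plain Python. Note the substitution: the paper uses a Bayesian spline for both steps, while this sketch swaps in a per-position average for the seasonal pattern and an ordinary least-squares line for the residual trend. The function name and the toy series are hypothetical.

```python
from statistics import mean


def fit_seasonal_plus_trend(series, period):
    """Fit a seasonal average plus a linear trend in the residuals.

    series: death counts at regular intervals (weekly or monthly).
    period: number of intervals per year.
    Returns a function predict(t) for any time index t, including future ones.
    """
    n = len(series)
    # Step 1: seasonal pattern = average count at each position within the year.
    seasonal = [mean(series[i] for i in range(pos, n, period))
                for pos in range(period)]
    # Step 2: residuals not explained by seasonality.
    resid = [series[t] - seasonal[t % period] for t in range(n)]
    # Least-squares straight line through the residuals (stand-in for the spline).
    t_bar = (n - 1) / 2
    r_bar = mean(resid)
    sxx = sum((t - t_bar) ** 2 for t in range(n))
    slope = sum((t - t_bar) * (resid[t] - r_bar) for t in range(n)) / sxx
    intercept = r_bar - slope * t_bar

    def predict(t):
        # Expected count = seasonal component + extrapolated trend component.
        return seasonal[t % period] + intercept + slope * t

    return predict


# Toy quarterly series over three years with a rising level (invented numbers).
pattern = [120, 100, 90, 110]
history = [p + 10 * year for year in range(3) for p in pattern]
predict = fit_seasonal_plus_trend(history, period=4)
print(predict(12))  # expected count for the first quarter of year four
```

Because the seasonal averages in step 1 already absorb part of any trend, this crude version tends to understate a steady rise. How flexibly the trend is allowed to bend near the end of the input data matters a great deal, which is the sensitivity the paper addresses with its four knot placements.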

The specification of the spline can have a sizeable impact on the estimated expected mortality for a particular location. To make the results more robust to model specification, we included in our ensemble four variants according to where the second to last knot in the spline was placed: 6 months, 12 months, 18 months, and 24 months before the end of the period for the input data before the COVID-19 pandemic started for each location. We also included in the ensemble a Poisson model with fixed effects on week and year, and a model that assumed that expected mortality for 2020 and 2021 was the same as the corresponding weekly mortality observed in 2019. To derive weights for the different models in the ensemble, we assessed how each model performed in an out-of-sample predictive validity test. We fit the model to all data prior to March 1, 2019 and then evaluated how each model performed in predicting mortality between March, 2019, and February, 2020, compared with observed mortality in the same time period. We then weighted component models in the ensemble using 1 over the root mean squared error (RMSE) of the predictions for each component to down-weight component models with larger RMSE (and thus less accurate predictions) in the ensemble. A global weighting scheme was used for all locations. The distribution of RMSE by location for each of the six models included in the model ensemble and examples of the estimated excess mortality for each component model are provided in the appendix (p 49). Expected mortality from the ensemble model was subtracted from observed mortality in 2020 and 2021 to estimate excess mortality due to the COVID-19 pandemic.
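The weighting step described above can be sketched as follows. The paper states that component models are weighted by 1 over their out-of-sample RMSE; normalizing those weights so they sum to 1 is my assumption, made so that the result is a weighted average. Model names and all numbers are invented for illustration.

```python
from math import sqrt


def rmse(pred, obs):
    """Root mean squared error of predictions against observations."""
    return sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def inverse_rmse_weights(holdout_preds, holdout_obs):
    """Weight each model by 1/RMSE on the held-out period, normalized to sum to 1.

    Assumes no model has an RMSE of exactly zero.
    """
    inv = {name: 1.0 / rmse(pred, holdout_obs)
           for name, pred in holdout_preds.items()}
    total = sum(inv.values())
    return {name: value / total for name, value in inv.items()}


def ensemble_expected(weights, preds):
    """Weighted average of the component models' predictions."""
    n = len(next(iter(preds.values())))
    return [sum(weights[name] * preds[name][i] for name in preds)
            for i in range(n)]


# Held-out period: each model was fit without these observations.
holdout_obs = [100, 110, 120]
holdout_preds = {"model_a": [101, 109, 121], "model_b": [104, 106, 124]}
weights = inverse_rmse_weights(holdout_preds, holdout_obs)

# Pandemic period: expected deaths from each model, then observed deaths.
pandemic_preds = {"model_a": [100, 100], "model_b": [110, 110]}
observed = [130, 125]
expected = ensemble_expected(weights, pandemic_preds)
excess = [o - e for o, e in zip(observed, expected)]
print(weights, expected, excess)
```

Here model_a has a held-out RMSE of 1 and model_b of 4, so the weights come out at 0.8 and 0.2, the ensemble expects about 102 deaths per period, and the estimated excess is about 28 and 23.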


Copyright © cchere 西西河