
Latest publications in Statistical Analysis and Data Mining

Error-controlled feature selection for ultrahigh-dimensional and highly correlated feature space using deep learning
IF 1.3 · CAS Zone 4 (Mathematics) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-03-05 · DOI: 10.1002/sam.11664
Arkaprabha Ganguli, Tapabrata Maiti, David Todem
Deep learning has been at the center of analytics in recent years due to its impressive empirical success in analyzing complex data objects. Despite this success, most existing tools behave like black-box machines; hence the increasing interest in interpretable, reliable, and robust deep learning models applicable to a broad class of applications. Feature-selected deep learning has emerged as a promising tool in this realm. However, recent developments do not accommodate ultrahigh-dimensional and highly correlated features or high noise levels. In this article, we propose a novel screening and cleaning method, aided by deep learning, for a data-adaptive multi-resolutional discovery of highly correlated predictors with a controlled error rate. Extensive empirical evaluations over a wide range of simulated scenarios and several real datasets demonstrate the effectiveness of the proposed method in achieving high power while keeping the false discovery rate at a minimum.
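The screening-and-cleaning procedure itself is not spelled out in the abstract. As a minimal reference point for what "controlled error rate" means in feature selection, here is the classic Benjamini-Hochberg step-up rule, which caps the false discovery rate over a set of p-values (a standard sketch, not the authors' deep-learning method):

```python
def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up: 0-based indices of the hypotheses
    rejected while controlling the false discovery rate at level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    n_reject = 0
    for rank, i in enumerate(order, start=1):  # ranks 1..m in increasing p
        if pvals[i] <= q * rank / m:
            n_reject = rank                    # largest rank under the BH line
    return sorted(order[:n_reject])
```

On the textbook example of 15 p-values from Benjamini and Hochberg's original paper, the rule at q = 0.05 rejects exactly the four smallest.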
Citations: 0
Marginal clustered multistate models for longitudinal progressive processes with informative cluster size
IF 1.3 · CAS Zone 4 (Mathematics) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-03-04 · DOI: 10.1002/sam.11668
Sean Xinyang Feng, Aya A. Mitani
Informative cluster size (ICS) is a phenomenon where cluster size is related to the outcome. While multistate models can be applied to characterize the unit-level transition process for clustered interval-censored data, there is a research gap in addressing ICS within this framework. We propose two extensions of the multistate model that account for ICS to make marginal inference: one incorporating within-cluster resampling and another constructing cluster-weighted score functions. We evaluate the performance of the proposed methods through simulation studies and apply them to the Veterans Affairs Dental Longitudinal Study (VADLS) to understand the effect of risk factors on periodontal disease progression. ICS occurs frequently in dental data, particularly in the study of periodontal disease, as people with fewer teeth due to the disease are more susceptible to disease progression. According to the simulation results, the mean estimates of the parameters obtained from the proposed methods are close to the true values, but methods that ignore ICS can lead to substantial bias. Our proposed methods for clustered multistate models are able to appropriately take ICS into account when making marginal inference about a typical unit from a randomly sampled cluster.
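Of the two extensions, within-cluster resampling is the easier one to picture: repeatedly draw one unit per cluster, compute the statistic on each resample, and average over resamples, so that each cluster contributes equally regardless of its size. A toy sketch for a marginal mean (hypothetical data; the paper applies the idea to multistate score equations, not to a simple mean):

```python
import random
import statistics

def wcr_mean(clusters, n_resamples=2000, seed=1):
    """Within-cluster resampling estimate of a marginal mean:
    draw one unit uniformly from each cluster, average the draws,
    and repeat; averaging over resamples weights clusters equally,
    removing the bias induced by informative cluster size."""
    rng = random.Random(seed)
    draws = [statistics.mean(rng.choice(c) for c in clusters)
             for _ in range(n_resamples)]
    return statistics.mean(draws)
```

With clusters [[1, 1, 1], [5]], the naive pooled mean is 2.0, while the within-cluster-resampling mean is 3.0: under ICS the large cluster no longer dominates.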
Citations: 0
A novel two-step extrapolation-insertion risk model based on the Expectile under the Pareto-type distribution
IF 1.3 · CAS Zone 4 (Mathematics) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-02-21 · DOI: 10.1002/sam.11665
Ziwen Geng
Developing a catastrophe loss model is a challenging problem in the insurance industry. In the context of Pareto-type distributions, measuring risk at the extreme right tail has become a major focus of academic research. The quantile and Expectile of a distribution are useful descriptors of its tail, in the same way as the median and mean are related to its central behavior. In this article, a novel two-step extrapolation-insertion method is introduced by modifying the existing far-right tail numerical model using the risk measures of Expectile and Expected Shortfall (ES), and its advantages of lower bias and variance are proved theoretically through asymptotic normality. In addition, another solution for obtaining the ES is proposed based on the fitted extreme distribution, which is demonstrated to have superior unbiased statistical properties. Uniting these two methods provides numerical upper and lower interval bounds for capturing the real quantile-based ES commonly used in insurance. The numerical simulation and the empirical analysis of Danish reinsurance claim data indicate that these methods offer high prediction accuracy in catastrophe risk management applications.
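For orientation, both tail measures mentioned here have closed forms under an exact Pareto tail: with scale x_m and shape alpha, the quantile (VaR) at level q is x_m (1-q)^(-1/alpha), and the ES beyond it is alpha/(alpha-1) times that quantile, finite only for alpha > 1. A small sketch of these standard formulas (background only, not the paper's extrapolation-insertion estimator):

```python
def pareto_var(q, alpha, xm=1.0):
    """Value-at-Risk (the q-quantile) of a Pareto(alpha, xm) loss:
    VaR_q = xm * (1 - q)^(-1/alpha)."""
    return xm * (1.0 - q) ** (-1.0 / alpha)

def pareto_es(q, alpha, xm=1.0):
    """Expected Shortfall: mean loss beyond VaR_q, which for a Pareto
    tail is ES_q = alpha / (alpha - 1) * VaR_q; finite only if alpha > 1."""
    if alpha <= 1.0:
        raise ValueError("ES is infinite for alpha <= 1")
    return alpha / (alpha - 1.0) * pareto_var(q, alpha, xm)
```

For example, with alpha = 2 and xm = 1 the 99% VaR is 10 and the 99% ES is 20, illustrating how heavy tails push ES well beyond the quantile.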
Citations: 0
Bayesian inference for nonprobability samples with nonignorable missingness
IF 1.3 · CAS Zone 4 (Mathematics) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-02-21 · DOI: 10.1002/sam.11667
Zhan Liu, Xuesong Chen, Ruohan Li, Lanbao Hou
Nonprobability samples, especially web survey data, have become available in many different fields. However, nonprobability samples suffer from selection bias, which yields biased estimates. Moreover, missingness, especially nonignorable missingness, may also be encountered in nonprobability samples. Thus, it is a challenging task to make inference from nonprobability samples with nonignorable missingness. In this article, we propose a Bayesian approach to infer the population based on nonprobability samples with nonignorable missingness. In our method, different logistic regression models are employed to estimate the selection probabilities and the response probabilities; the superpopulation model is used to explain the relationship between the study variable and covariates. Further, Bayesian and approximate Bayesian methods are proposed to estimate the response model parameters and the superpopulation model parameters, respectively. Specifically, the estimating functions for the response model parameters and superpopulation model parameters are utilized to derive the approximate posterior distribution in superpopulation model estimation. Simulation studies are conducted to investigate the finite-sample performance of the proposed method. Data from the Pew Research Center and the Behavioral Risk Factor Surveillance System are used to show the better performance of our proposed method over the other approaches.
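Once selection probabilities are estimated, the basic correction for selection bias is inverse-probability weighting. A minimal sketch of the Hajek (ratio-form) weighted mean with the probabilities taken as given (the paper embeds this idea in a Bayesian procedure with logistic models; the function below is background only):

```python
def ipw_mean(values, selection_probs):
    """Hajek inverse-probability-weighted mean: units unlikely to enter
    the nonprobability sample get larger weights (1/p), correcting the
    selection bias of an unweighted average."""
    weights = [1.0 / p for p in selection_probs]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)
```

For instance, if a unit with value 10 entered the sample with probability 0.5 and a unit with value 20 with probability 1.0, the weighted mean is 40/3 rather than the naive 15, pulling the estimate toward the under-sampled unit.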
Citations: 0
Modeling matrix variate time series via hidden Markov models with skewed emissions
IF 1.3 · CAS Zone 4 (Mathematics) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-02-20 · DOI: 10.1002/sam.11666
Michael P. B. Gallaugher, Xuwen Zhu
Data collected today have become increasingly complex and cannot be analyzed using regular statistical methods. Matrix variate time series data, where the observations in the time series are matrices, are one such example. Herein, we introduce a set of three hidden Markov models using skewed matrix variate emission distributions for modeling matrix variate time series data. Compared to the hidden Markov model with matrix variate normal emissions, the proposed models present greater flexibility and are capable of modeling skewness in time series data. Parameter estimation is performed using an expectation-maximization algorithm. We then look at both simulated data and salary data for public Texas universities.
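The skewed matrix-variate emission densities are the paper's contribution; the HMM machinery underneath is generic. As background, here is the forward recursion that evaluates an HMM likelihood, shown with discrete emissions for brevity (the paper's emissions are continuous matrix-variate densities):

```python
def hmm_forward(pi, A, B, obs):
    """Forward recursion: total likelihood of an observation sequence.
    pi[i]: initial state probabilities; A[i][j]: transition probability
    from state i to j; B[i][o]: probability that state i emits symbol o."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]  # initialize at t=0
    for o in obs[1:]:
        # propagate through the transition matrix, then weight by emission
        alpha = [B[j][o] * sum(alpha[i] * A[i][j] for i in range(n))
                 for j in range(n)]
    return sum(alpha)
```

Replacing the lookup `B[i][o]` with a skewed matrix-variate density evaluated at an observed matrix gives the likelihood the paper's EM algorithm maximizes.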
Citations: 0
Subsampling under distributional constraints
IF 1.3 · CAS Zone 4 (Mathematics) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-02-09 · DOI: 10.1002/sam.11661
Florian Combes, Ricardo Fraiman, Badih Ghattas
Some complex models are frequently employed to describe physical and mechanical phenomena. In this setting, we have an input X in a general space and an output Y = f(X), where f is a very complex function that is very costly, possibly very expensive, to evaluate for each new input. We are given two observed sets of X, S1 and S2, of different sizes, for which only f(S1) is available. We tackle the problem of selecting a subset S3 ⊂ S2 of smaller size on which to run the complex model f, such that the empirical distribution of f(S3) is close to that of f(S1). We propose three algorithms to solve this problem and show their efficiency using simulated datasets and the Airfoil self-noise dataset.
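Since f cannot be evaluated on S2, any selection has to work in input space. A naive greedy sketch for one-dimensional inputs, using the two-sample Kolmogorov-Smirnov distance as the matching criterion (an illustration of the problem setup, not one of the paper's three algorithms):

```python
def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov distance between 1-D samples:
    the largest gap between the two empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    def ecdf(s, x):
        return sum(v <= x for v in s) / len(s)
    return max(abs(ecdf(sa, x) - ecdf(sb, x)) for x in sa + sb)

def greedy_subsample(s1, s2, k):
    """Greedily grow a subset of s2 of size k whose empirical law tracks
    that of s1; matching happens in input space because f(s2) is
    unavailable by assumption."""
    chosen, pool = [], list(s2)
    for _ in range(k):
        best = min(pool, key=lambda x: ks_distance(s1, chosen + [x]))
        chosen.append(best)
        pool.remove(best)
    return chosen
```

Each step costs one KS evaluation per remaining candidate, so this is quadratic in |S2|; the point is only to make the "empirical distribution of the subsample should match" objective concrete.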
Citations: 0
Rarity updated ensemble with oversampling: An ensemble approach to classification of imbalanced data streams
IF 1.3 · CAS Zone 4 (Mathematics) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-02-09 · DOI: 10.1002/sam.11662
Zahra Nouri, Vahid Kiani, Hamid Fadishei
Today's ever-increasing generation of streaming data demands novel data mining approaches tailored to mining dynamic data streams. Data streams are non-static in nature, continuously generated, and endless. They often suffer from class imbalance and undergo temporal drift. To address the classification of consecutive data instances within imbalanced data streams, this research introduces a new ensemble classification algorithm called Rarity Updated Ensemble with Oversampling (RUEO). The RUEO approach is specifically designed to exhibit robustness against class imbalance by incorporating an imbalance-specific criterion to assess the efficacy of the base classifiers and employing an oversampling technique to reduce the imbalance in the training data. The RUEO algorithm was evaluated on a set of 20 data streams and compared against 14 baseline algorithms. On average, the proposed RUEO algorithm achieves an average-accuracy of 0.69 on the real-world data streams, while the chunk-based algorithms AWE, AUE, and KUE achieve average-accuracies of 0.48, 0.65, and 0.66, respectively. The statistical analysis, conducted using the Wilcoxon test, reveals a statistically significant improvement in average-accuracy for the proposed RUEO algorithm when compared to 12 out of the 14 baseline algorithms. The source code and experimental results of this research work will be publicly available at https://github.com/vkiani/RUEO.
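RUEO's imbalance-specific criterion is not detailed in the abstract, but the oversampling step it relies on can be sketched simply: within each training chunk of the stream, replicate minority-class rows until the classes balance (plain random oversampling; the paper's exact scheme may differ):

```python
import random

def oversample_chunk(chunk, seed=0):
    """Random oversampling within one stream chunk: replicate rows of
    each minority class (with replacement) until every class reaches
    the size of the largest one, then shuffle."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in chunk:
        by_class.setdefault(y, []).append((x, y))
    target = max(len(rows) for rows in by_class.values())
    balanced = []
    for rows in by_class.values():
        balanced.extend(rows)
        balanced.extend(rng.choice(rows) for _ in range(target - len(rows)))
    rng.shuffle(balanced)
    return balanced
```

A base classifier trained on the balanced chunk then joins the ensemble; applying this per chunk keeps the procedure compatible with temporal drift, since each chunk is rebalanced independently.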
Citations: 0
A deep learning approach for the comparison of handwritten documents using latent feature vectors
IF 1.3 · CAS Zone 4 (Mathematics) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-02-07 · DOI: 10.1002/sam.11660
Juhyeon Kim, Soyoung Park, Alicia Carriquiry
Forensic questioned document examiners still largely rely on visual assessments and expert judgment to determine the provenance of a handwritten document. Here, we propose a novel approach to objectively compare two handwritten documents using a deep learning algorithm. First, we implement a bootstrapping technique to segment document data into smaller units, as a means to enhance the efficiency of the deep learning process. Next, we use a transfer learning algorithm to systematically extract document features. The unique characteristics of the document data are then represented as latent vectors. Finally, the similarity between two handwritten documents is quantified via the cosine similarity between the two latent vectors. We illustrate the use of the proposed method by implementing it on a variety of collections of handwritten documents with different attributes, and show that in most cases, we can accurately classify pairs of documents into same or different author categories.
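The final comparison step is plain cosine similarity between the two latent vectors (the vectors below are stand-ins for the learned document features):

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two latent feature vectors:
    1 for identical directions, 0 for orthogonal vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

Because the measure depends only on direction, two documents whose feature vectors differ merely in scale (e.g., different amounts of handwriting) still score as highly similar.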
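The final step of the pipeline described above — quantifying similarity as the cosine between two latent vectors — is simple to sketch. The three-dimensional vectors here are toy stand-ins for the transfer-learned document features.

```python
import math

def cosine_similarity(u, v):
    """u·v / (||u|| ||v||): 1 means identical direction, 0 means orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Toy latent vectors: doc_a and doc_b point the same way, doc_c does not.
doc_a = [0.9, 0.1, 0.4]
doc_b = [0.8, 0.2, 0.5]
doc_c = [-0.5, 0.9, -0.1]
same_author = cosine_similarity(doc_a, doc_b)
different = cosine_similarity(doc_a, doc_c)
print(same_author > different)  # → True
```

A threshold on this score (or a classifier trained on it) then turns the continuous similarity into a same-author/different-author decision.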
{"title":"A deep learning approach for the comparison of handwritten documents using latent feature vectors","authors":"Juhyeon Kim, Soyoung Park, Alicia Carriquiry","doi":"10.1002/sam.11660","DOIUrl":"https://doi.org/10.1002/sam.11660","url":null,"abstract":"Forensic questioned document examiners still largely rely on visual assessments and expert judgment to determine the provenance of a handwritten document. Here, we propose a novel approach to objectively compare two handwritten documents using a deep learning algorithm. First, we implement a bootstrapping technique to segment document data into smaller units, as a means to enhance the efficiency of the deep learning process. Next, we use a transfer learning algorithm to systematically extract document features. The unique characteristics of the document data are then represented as latent vectors. Finally, the similarity between two handwritten documents is quantified via the cosine similarity between the two latent vectors. We illustrate the use of the proposed method by implementing it on a variety of collections of handwritten documents with different attributes, and show that in most cases, we can accurately classify pairs of documents into same or different author categories.","PeriodicalId":48684,"journal":{"name":"Statistical Analysis and Data Mining","volume":"136 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139769432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Sparse Bayesian variable selection in high-dimensional logistic regression models with correlated priors
IF 1.3 | Mathematics (CAS Tier 4) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-01-30 | DOI: 10.1002/sam.11663
Zhuanzhuan Ma, Zifei Han, Souparno Ghosh, Liucang Wu, Min Wang
In this paper, we propose a sparse Bayesian procedure with global and local (GL) shrinkage priors for the problems of variable selection and classification in high-dimensional logistic regression models. In particular, we consider two types of GL shrinkage priors for the regression coefficients, the horseshoe (HS) prior and the normal-gamma (NG) prior, and then specify a correlated prior for the binary vector to distinguish models with the same size. The GL priors are then combined with mixture representations of logistic distribution to construct a hierarchical Bayes model that allows efficient implementation of a Markov chain Monte Carlo (MCMC) to generate samples from posterior distribution. We carry out simulations to compare the finite sample performances of the proposed Bayesian method with the existing Bayesian methods in terms of the accuracy of variable selection and prediction. Finally, two real-data applications are provided for illustrative purposes.
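Of the two global–local priors named in the abstract, the horseshoe represents each coefficient as beta_j = lambda_j * tau * z_j, with half-Cauchy local scales lambda_j and global scale tau. A short prior-simulation sketch makes the shrinkage behavior concrete (this draws from the prior only — it is not the paper's MCMC for the posterior, and tau_scale = 0.1 is an illustrative choice):

```python
import math
import random

def half_cauchy(scale=1.0):
    """Inverse-CDF draw from a half-Cauchy(0, scale)."""
    return scale * math.tan(math.pi * random.random() / 2)

def horseshoe_draw(p, tau_scale=1.0):
    """One prior draw of p coefficients: beta_j = lambda_j * tau * z_j,
    with lambda_j ~ C+(0, 1), tau ~ C+(0, tau_scale), z_j ~ N(0, 1)."""
    tau = half_cauchy(tau_scale)
    return [half_cauchy() * tau * random.gauss(0, 1) for _ in range(p)]

random.seed(1)
betas = [b for _ in range(2000) for b in horseshoe_draw(p=10, tau_scale=0.1)]
near_zero = sum(abs(b) < 0.05 for b in betas) / len(betas)
largest = max(abs(b) for b in betas)
print(near_zero, largest)
```

The two numbers illustrate the horseshoe's signature: a large share of draws piles up near zero (aggressive shrinkage of noise coefficients) while the Cauchy tails still produce occasional very large values (signals escape the shrinkage).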
{"title":"Sparse Bayesian variable selection in high-dimensional logistic regression models with correlated priors","authors":"Zhuanzhuan Ma, Zifei Han, Souparno Ghosh, Liucang Wu, Min Wang","doi":"10.1002/sam.11663","DOIUrl":"https://doi.org/10.1002/sam.11663","url":null,"abstract":"In this paper, we propose a sparse Bayesian procedure with global and local (GL) shrinkage priors for the problems of variable selection and classification in high-dimensional logistic regression models. In particular, we consider two types of GL shrinkage priors for the regression coefficients, the horseshoe (HS) prior and the normal-gamma (NG) prior, and then specify a correlated prior for the binary vector to distinguish models with the same size. The GL priors are then combined with mixture representations of logistic distribution to construct a hierarchical Bayes model that allows efficient implementation of a Markov chain Monte Carlo (MCMC) to generate samples from posterior distribution. We carry out simulations to compare the finite sample performances of the proposed Bayesian method with the existing Bayesian methods in terms of the accuracy of variable selection and prediction. Finally, two real-data applications are provided for illustrative purposes.","PeriodicalId":48684,"journal":{"name":"Statistical Analysis and Data Mining","volume":"22 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139646214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Considerations in Bayesian agent-based modeling for the analysis of COVID-19 data
IF 1.3 | Mathematics (CAS Tier 4) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-01-25 | DOI: 10.1002/sam.11655
Seungha Um, Samrachana Adhikari
Agent-based models (ABMs) have been widely used to study infectious disease transmission by simulating the behaviors and interactions of autonomous individuals called agents. In an ABM, agent states, for example infected or susceptible, are assigned according to a set of simple rules, and the complex dynamics of disease transmission are described by the collective states of the agents over time. Despite their flexibility for real-world modeling, ABMs have received less attention from statisticians because their likelihood functions are intractable, which makes it difficult to estimate parameters and to quantify uncertainty around model outputs. To overcome this limitation, a Bayesian framework that treats the entire ABM as a hidden Markov model has previously been proposed. However, the existing approach is limited by computational inefficiency and parameter unidentifiability. We extend the ABM approach within the Bayesian framework to study infectious disease transmission while addressing these limitations. We estimate the hidden states, represented by individual agents' states over time, and the model parameters by applying an improved particle Markov chain Monte Carlo algorithm that improves computational efficiency. We further evaluate the performance of the approach for parameter recovery and prediction, along with its sensitivity to prior assumptions, under various simulation conditions. Finally, we apply the proposed approach to the study of the COVID-19 outbreak on the Diamond Princess cruise ship. We examine differences in transmission by key demographic characteristics, while considering two different networks and the limited COVID-19 testing on the cruise.
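The basic mechanics of such an agent-based model — each agent carries a state (susceptible, infectious, or recovered) and simple contact rules update the collective state at every step — can be sketched in a few lines. This toy simulation only illustrates ABM dynamics; it is not the paper's hidden-Markov formulation or particle MCMC, and the population size and rates are made-up values.

```python
import random

def step(states, p_infect, p_recover):
    """One ABM update: each susceptible agent contacts one random agent and
    becomes infectious with prob p_infect if that contact is infectious;
    infectious agents recover with prob p_recover."""
    new = list(states)
    for i, s in enumerate(states):
        if s == "S":
            contact = states[random.randrange(len(states))]
            if contact == "I" and random.random() < p_infect:
                new[i] = "I"
        elif s == "I" and random.random() < p_recover:
            new[i] = "R"
    return new

random.seed(42)
states = ["I"] * 5 + ["S"] * 95  # 5 initial infections in a population of 100
for _ in range(100):
    states = step(states, p_infect=0.5, p_recover=0.05)
print(states.count("S"), states.count("I"), states.count("R"))
```

In the Bayesian treatment described above, the sequence of collective states plays the role of the hidden Markov chain, and parameters such as `p_infect` and `p_recover` are the targets of posterior inference.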
{"title":"Considerations in Bayesian agent-based modeling for the analysis of COVID-19 data","authors":"Seungha Um, Samrachana Adhikari","doi":"10.1002/sam.11655","DOIUrl":"https://doi.org/10.1002/sam.11655","url":null,"abstract":"Agent-based model (ABM) has been widely used to study infectious disease transmission by simulating behaviors and interactions of autonomous individuals called agents. In the ABM, agent states, for example infected or susceptible, are assigned according to a set of simple rules, and a complex dynamics of disease transmission is described by the collective states of agents over time. Despite the flexibility in real-world modeling, ABMs have received less attention by statisticians because of the intractable likelihood functions which lead to difficulty in estimating parameters and quantifying uncertainty around model outputs. To overcome this limitation, a Bayesian framework that treats the entire ABM as a Hidden Markov Model has been previously proposed. However, existing approach is limited due to computational inefficiency and unidentifiability of parameters. We extend the ABM approach within Bayesian framework to study infectious disease transmission addressing these limitations. We estimate the hidden states, represented by individual agent's states over time, and the model parameters by applying an improved particle Markov Chain Monte Carlo algorithm, that accounts for computing efficiency. We further evaluate the performance of the approach for parameter recovery and prediction, along with sensitivity to prior assumptions under various simulation conditions. Finally, we apply the proposed approach to the study of COVID-19 outbreak on Diamond Princess cruise ship. 
We examine the differences in transmission by key demographic characteristics, while considering two different networks and limited COVID-19 testing in the cruise.","PeriodicalId":48684,"journal":{"name":"Statistical Analysis and Data Mining","volume":"4 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139582015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0