<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article
  PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD with MathML3 v1.2 20190208//EN" "JATS-journalpublishing1-2-mathml3.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ali="http://www.niso.org/schemas/ali/1.0/" article-type="research-article" dtd-version="1.2" xml:lang="en">
<front>
<journal-meta><journal-id journal-id-type="publisher-id">QCMB</journal-id><journal-id journal-id-type="nlm-ta">Quant Comput Methods Behav Sci</journal-id>
<journal-title-group>
<journal-title>Quantitative and Computational Methods in Behavioral Sciences</journal-title><abbrev-journal-title abbrev-type="pubmed">Quant. Comput. Methods Behav. Sci.</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub">2699-8432</issn>
<publisher><publisher-name>PsychOpen</publisher-name></publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">qcmb.14891</article-id>
<article-id pub-id-type="doi">10.5964/qcmb.14891</article-id>
<article-categories>
<subj-group subj-group-type="heading"><subject>Method Dissemination Article</subject></subj-group>

<subj-group subj-group-type="badge">
<subject>Data</subject>
<subject>Code</subject>
<subject>Materials</subject>
</subj-group>

</article-categories>
<title-group>
	<article-title>Linear Classification Methods for Multivariate Repeated Measures Data — A Simulation Study</article-title>
	<alt-title alt-title-type="right-running">Multivariate Repeated Measures Data Classification</alt-title>
	<alt-title specific-use="APA-reference-style" xml:lang="en">Linear classification methods for multivariate repeated measures data — A simulation study</alt-title>
</title-group>
<contrib-group content-type="authors">
	<contrib id="author-1" contrib-type="author" corresp="yes"><contrib-id contrib-id-type="orcid" authenticated="false">https://orcid.org/0000-0002-0149-479X</contrib-id><name name-style="western"><surname>Graf</surname><given-names>Ricarda</given-names></name><xref ref-type="corresp" rid="cor1">*</xref><xref ref-type="aff" rid="aff1">1</xref></contrib>
	<contrib id="author-2" contrib-type="author"><contrib-id contrib-id-type="orcid" authenticated="false">https://orcid.org/0000-0003-0172-9904</contrib-id><name name-style="western"><surname>Zeldovich</surname><given-names>Marina</given-names></name><xref ref-type="aff" rid="aff2">2</xref><xref ref-type="aff" rid="aff3">3</xref></contrib>
	<contrib id="author-3" contrib-type="author"><contrib-id contrib-id-type="orcid" authenticated="false">https://orcid.org/0000-0003-0291-4378</contrib-id><name name-style="western"><surname>Friedrich</surname><given-names>Sarah</given-names></name><xref ref-type="aff" rid="aff1">1</xref><xref ref-type="aff" rid="aff4">4</xref></contrib>
<contrib contrib-type="editor">
<name>
	<surname>Karch</surname>
	<given-names>Julian</given-names>
</name>
<xref ref-type="aff" rid="aff5"/>
</contrib>
<aff id="aff1"><label>1</label><institution>Department of Mathematics, University of Augsburg</institution>, <addr-line>Augsburg</addr-line>, <country country="DE">Germany</country></aff>
	<aff id="aff2"><label>2</label><institution>Institute of Psychology, University of Innsbruck</institution>, <addr-line>Innsbruck</addr-line>, <country country="AT">Austria</country></aff>
<aff id="aff3"><label>3</label><institution>Faculty of Psychotherapy Science, Sigmund Freud University Vienna</institution>, <addr-line>Vienna</addr-line>, <country country="AT">Austria</country></aff>
<aff id="aff4"><label>4</label><institution>Centre for Advanced Analytics and Predictive Sciences (CAAPS), University of Augsburg</institution>, <addr-line>Augsburg</addr-line>, <country country="DE">Germany</country></aff>
	<aff id="aff5"><institution>Leiden University</institution>, <addr-line>Leiden</addr-line>, <country country="NL">the Netherlands</country></aff>
</contrib-group>
	
	<author-notes>
		<corresp id="cor1"><label>*</label>Department of Mathematics, University of Augsburg, Universitätsstraße 2, 86159 Augsburg, Germany. <email xlink:href="ricarda.graf@math.uni-augsburg.de">ricarda.graf@math.uni-augsburg.de</email></corresp>
	</author-notes>
	
	
<pub-date pub-type="epub"><day>10</day><month>07</month><year>2025</year></pub-date>
<pub-date pub-type="collection" publication-format="electronic"><year>2025</year></pub-date>
<volume>5</volume>
	<elocation-id>e14891</elocation-id>
<history>
<date date-type="received">
<day>19</day>
<month>06</month>
<year>2024</year>
</date>
<date date-type="accepted">
<day>19</day>
<month>05</month>
<year>2025</year>
</date>
<date date-type="corrected">
<day>23</day>
<month>07</month>
<year>2025</year>
</date>
</history>
<permissions><copyright-year>2025</copyright-year><copyright-holder>Graf, Zeldovich, &amp; Friedrich</copyright-holder><license license-type="open-access" specific-use="CC BY 4.0" xlink:href="https://creativecommons.org/licenses/by/4.0/"><ali:license_ref>https://creativecommons.org/licenses/by/4.0/</ali:license_ref><license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution (CC BY) 4.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p></license></permissions>
<abstract>
<p>Researchers in the behavioral and social sciences use linear discriminant analysis (LDA) to predict group membership (classification) and to identify the variables most relevant to group separation among a set of continuous correlated variables (description). In these and other disciplines, longitudinal data are often collected, which provide additional temporal information. Linear classification methods for repeated measures data can be more sensitive to actual group differences because they take the complex correlations between time points and variables into account, yet they are rarely discussed in the literature. Moreover, psychometric data rarely fulfill the multivariate normality assumption.</p>
<p>In this paper, we compare existing linear classification algorithms for nonnormally distributed multivariate repeated measures data in a simulation study based on psychological questionnaire data comprising Likert scales. The results show that for data without any specific assumed structure and with larger sample sizes, the robust alternatives to standard repeated measures LDA may not be needed. To our knowledge, this is one of the few studies discussing repeated measures classification techniques, and the first one comparing multiple alternatives among each other.</p>
</abstract>
<kwd-group kwd-group-type="author"><kwd>Likert-type data</kwd><kwd>linear classification</kwd><kwd>multivariate repeated measures data</kwd><kwd>nonnormality</kwd><kwd>robustness</kwd></kwd-group>

</article-meta>
</front>
<body>
	<sec sec-type="intro" id="intro"><title/>	
		<p id="S1.p1">In psychology and the social sciences, discriminant analysis (DA) has traditionally been applied to classification tasks in data with continuous variables since its introduction by <xref ref-type="bibr" rid="bib25">Fisher (1936)</xref>. Based on estimates of the group means and the pooled covariance matrix, a classification rule can be derived (predictive DA) or relative variable weights can be computed (descriptive DA). Its importance for the behavioral sciences has often been emphasized in reviews, tutorials, and textbooks (<xref ref-type="bibr" rid="bib7">Betz, 1987</xref>; <xref ref-type="bibr" rid="bib8">Boedeker &amp; Kearns, 2019</xref>; <xref ref-type="bibr" rid="bib24">Field, 2017</xref>; <xref ref-type="bibr" rid="bib26">Fletcher et al., 1978</xref>; <xref ref-type="bibr" rid="bib29">Garrett, 1943</xref>; <xref ref-type="bibr" rid="bib36">Huberty &amp; Olejnik, 2006</xref>; <xref ref-type="bibr" rid="bib67">Sherry, 2006</xref>). It has been applied to a large number of problems in experimental and applied psychology, for class prediction as well as description (<xref ref-type="bibr" rid="bib1">Aggarwala et al., 2022</xref>; <xref ref-type="bibr" rid="bib45">Kumpulainen et al., 2021</xref>; <xref ref-type="bibr" rid="bib46">Langlois et al., 2000</xref>; <xref ref-type="bibr" rid="bib55">O’Brien et al., 2009</xref>; <xref ref-type="bibr" rid="bib60">Rogge &amp; Bradbury, 1999</xref>; <xref ref-type="bibr" rid="bib68">Shinba et al., 2021</xref>; <xref ref-type="bibr" rid="bib70">Stoyanov et al., 2022</xref>).</p>
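The classification rule built from group means and a pooled covariance matrix can be sketched in a few lines. The following Python snippet is only an illustrative two-group Fisher LDA under equal priors; it is not the implementation used in this study, and all function names are our own.

```python
import numpy as np

def fisher_lda_fit(X0, X1):
    """Fit a two-group Fisher linear discriminant.

    X0, X1: (n_i, p) arrays of continuous predictors per group.
    Returns the weight vector w and the midpoint threshold c,
    assuming equal priors and a pooled covariance estimate.
    """
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    n0, n1 = len(X0), len(X1)
    # Pooled (within-group) covariance matrix
    S = ((n0 - 1) * np.cov(X0, rowvar=False)
         + (n1 - 1) * np.cov(X1, rowvar=False)) / (n0 + n1 - 2)
    w = np.linalg.solve(S, m1 - m0)   # discriminant weights (descriptive DA)
    c = w @ (m0 + m1) / 2             # decision threshold at the midpoint
    return w, c

def fisher_lda_predict(X, w, c):
    # Assign to Group 1 when the discriminant score exceeds the threshold
    return (X @ w > c).astype(int)
```

The weight vector doubles as the set of relative variable weights used in descriptive DA, which is why the same fit serves both purposes mentioned above.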
		<p id="S1.p2">In contrast to multivariate data measured at a single time point, longitudinal data provide additional information about temporal changes, which is why they are collected in various disciplines, including psychology and the social sciences (<xref ref-type="bibr" rid="bib3">Banks et al., 2021</xref>; <xref ref-type="bibr" rid="bib40">Jensen et al., 2021</xref>; <xref ref-type="bibr" rid="bib50">McLanahan et al., 2019</xref>). Despite these potential applications for repeated measures DA or alternative linear classification techniques, textbooks discussing DA do not mention corresponding repeated measures approaches (<xref ref-type="bibr" rid="bib48">Lix &amp; Sajobi, 2010</xref>).</p>
		<p id="S1.p3">To complicate matters further, many classification approaches for continuous multivariate repeated measures data assume multivariate normality (<xref ref-type="bibr" rid="bib31">Gupta, 1986</xref>; <xref ref-type="bibr" rid="bib63">Roy &amp; Khattree, 2005a</xref>, <xref ref-type="bibr" rid="bib64">2005b</xref>; <xref ref-type="bibr" rid="bib75">Tomasko et al., 2010</xref>), but this assumption is rarely fulfilled by psychological datasets and hard to verify for small sample sizes (<xref ref-type="bibr" rid="bib6">Beaumont et al., 2006</xref>; <xref ref-type="bibr" rid="bib20">Delacre et al., 2017</xref>; <xref ref-type="bibr" rid="bib53">Neto et al., 2016</xref>; <xref ref-type="bibr" rid="bib56">Rausch &amp; Kelley, 2009</xref>). Psychological data, especially those obtained using patient-reported instruments, are often characterized by skewness.</p>
		<p id="S1.p4">There are only a few alternative repeated measures approaches which relax or overcome the multivariate normality assumption while taking the complex correlation structure between time points and variables into account. The aim of this manuscript is to compare these approaches in an extensive simulation study. In particular, we consider two modifications of repeated measures LDA by <xref ref-type="bibr" rid="bib11">Brobbey (2021)</xref> and <xref ref-type="bibr" rid="bib12">Brobbey et al. (2022)</xref> that are more robust to deviations from multivariate normality, and the generalization of the support vector machine classifier by <xref ref-type="bibr" rid="bib17">Chen and Bowman (2011)</xref> to longitudinal data, which is a nonparametric linear classifier when used with a linear kernel. We compare these methods’ performance among each other and choose more general, realistic simulation settings, including unequal sample sizes, unstructured covariance matrices, and correlations varying over time instead of assuming any specific pattern.</p>
		<p id="S1.p5"><xref ref-type="bibr" rid="bib11">Brobbey (2021)</xref> compares the standard repeated measures LDA (assuming multivariate normality and homoscedasticity) to its performance after preceding multivariate outlier removal based on two trimming algorithms (<xref ref-type="bibr" rid="bib61">Rousseeuw, 1985</xref>). <xref ref-type="bibr" rid="bib12">Brobbey et al. (2022)</xref> compare the performance of the standard approach to that of the same approach based on parsimonious Kronecker product structure covariance matrix estimates from the Generalized Estimating Equations (GEE) model (<xref ref-type="bibr" rid="bib38">Inan, 2015</xref>). The longitudinal support vector machine classifier by <xref ref-type="bibr" rid="bib17">Chen and Bowman (2011)</xref> uses a weighted combination of the multivariate measurements taken at several time points as input in order to represent the data structure more realistically.</p>
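The input construction of the longitudinal support vector machine can be illustrated with a short Python sketch. Note that this only forms the weighted combination of time-point measurements with fixed, arbitrary weights; in the actual method of Chen and Bowman (2011), the weights are optimized jointly with the classifier, and the data here are synthetic.

```python
import numpy as np

# Hypothetical longitudinal data: (n subjects, p variables, T time points)
rng = np.random.default_rng(3)
X = rng.normal(size=(100, 4, 3))

# Illustrative time-point weights (fixed here, summing to one);
# each subject's trajectory is collapsed into one weighted profile
beta = np.array([0.2, 0.3, 0.5])
X_combined = np.tensordot(X, beta, axes=([2], [0]))  # shape (n, p)
```

The collapsed array `X_combined` then serves as the input to an ordinary linear classifier, which is what makes the approach linear in the original measurements.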
<p id="S1.p6">Thus, this paper provides a neutral comparison study which evaluates the performance of the standard repeated measures LDA, its robust and nonparametric alternatives, as well as all possible combinations thereof, in linear classification problems of multivariate repeated measures data, and investigates their robustness when data deviate from multivariate normality. In order to mimic realistic datasets, we base the simulations on unstructured means and covariance matrices estimated from psychometric reference datasets which differ in sample size, sample size ratios, class overlap, temporal variation, and number of measurement occasions. In addition to method comparisons using data simulations, we evaluate the algorithms’ performance in the reference data using a nonparametric bootstrap approach which provides confidence intervals for the performance measures (<xref ref-type="bibr" rid="bib79">Wahl et al., 2016</xref>).</p>
<p id="S1.p7">The paper is organized as follows. In the <xref ref-type="sec" rid="s2">Data</xref> section, we explain the general structure of Likert-type data and its analysis, and describe the characteristics of the five reference datasets, which are based on Likert-type data. In the <xref ref-type="sec" rid="s3">Method</xref> section, we introduce the classification algorithms whose performance we compare, as well as the two comparison approaches based on the reference data and on data simulations, respectively. In the <xref ref-type="sec" rid="s4">Results and Discussion</xref> section, we present and discuss the results and provide recommendations based on the findings. <xref ref-type="sec" rid="s5">Conclusions</xref> are drawn in the final section.</p></sec>
<sec id="s2" sec-type="Data"><title>Data</title>
	<p id="S2.p1">Questionnaires using Likert-type response data are a typical example of psychological data to which LDA is applied. In the <xref ref-type="sec" rid="s2_1">Psychological Questionnaires Using Likert-Type Scales</xref> section, we describe the general data structure and how LDA is used for validating the importance of a particular subset of variables with the aim of distinguishing two groups; some sources explicitly call for longitudinal techniques, underlining the importance of discussing the available methods. In <xref ref-type="sec" rid="s2_2">Reference Datasets</xref>, we present the two reference datasets for which individuals completed standardized questionnaires using Likert-type responses. In order to examine the methods’ performance in further relevant scenarios, we additionally considered multiple modifications of these datasets, which are also described.</p>
<sec id="s2_1"><title>Psychological Questionnaires Using Likert-Type Scales</title>
	<p id="S2_1.p1S2.SS1.Px1.p1">In psychological and social science research, behaviour is most often assessed by self-report questionnaires using Likert scales (<xref ref-type="bibr" rid="bib5">Baumeister et al., 2007</xref>; <xref ref-type="bibr" rid="bib18">Clark &amp; Watson, 2019</xref>; <xref ref-type="bibr" rid="bib71">Sullivan &amp; Artino, 2013</xref>). It is common practice to create pools of Likert items to form subscales which each represent an aspect of the overall construct that the questionnaire is intended to investigate. Single Likert items (i.e., questions) are not considered to sufficiently capture these aspects (<xref ref-type="bibr" rid="bib18">Clark &amp; Watson, 2019</xref>; <xref ref-type="bibr" rid="bib58">Rickards et al., 2012</xref>) and are therefore summarized into subscales by taking either the sum or the average of subgroups of Likert items. The development and best practices of constructing questionnaires using Likert-type responses are discussed in the methodological psychology literature (<xref ref-type="bibr" rid="bib39">Jebb et al., 2021</xref>). <xref ref-type="bibr" rid="bib47">Likert (1932)</xref> developed the typical 5- or 7-point ordinal scale on which single items are measured, e.g., ranging from “Strongly approve” to “Strongly disapprove”. He suggests assigning numerical values to the answer choices in the same order as they are ranked. However, he does not suggest that these ordinal values must necessarily be translated into an equidistant scale, and states that the same results will be obtained as long as the rank order is preserved. This translation of an ordinal scale into a numerical scale conditional on rank preservation is considered legitimate elsewhere (<xref ref-type="bibr" rid="bib69">Silan, 2020</xref>). 
In conclusion, the distances between the numerical values are irrelevant to the analysis (<xref ref-type="bibr" rid="bib28">Gaito, 1980</xref>), which complies with the ordinal measurement scale of the Likert items, where distances between answer choices cannot be measured. <xref ref-type="bibr" rid="bib47">Likert (1932)</xref> suggests subsequently taking the sum or mean of the transformed values, which he assumes to be normally distributed. There is a long-standing debate about how Likert-type scales should appropriately be analysed, but the prevailing opinion, based on vast empirical evidence (<xref ref-type="bibr" rid="bib14">Carifio &amp; Perla, 2007</xref>; <xref ref-type="bibr" rid="bib54">Norman, 2010</xref>), is that survey scales, as opposed to single Likert items, may be treated as interval data such that means and standard deviations can be computed and parametric methods applied (<xref ref-type="bibr" rid="bib15">Carifio &amp; Perla, 2008</xref>; <xref ref-type="bibr" rid="bib58">Rickards et al., 2012</xref>; <xref ref-type="bibr" rid="bib71">Sullivan &amp; Artino, 2013</xref>). Specific examples of the application of LDA to questionnaire data based on Likert-type scales are <xref ref-type="bibr" rid="bib42">Knowles et al. (2000)</xref>, <xref ref-type="bibr" rid="bib43">Kristjansdottir et al. (2018)</xref>, <xref ref-type="bibr" rid="bib78">Veronese and Pepe (2017)</xref>, and <xref ref-type="bibr" rid="bib80">Wang et al. (2016)</xref>. In all of these studies, the authors computed Fisher discriminant function coefficients (descriptive DA) for the subscales of the considered psychological questionnaires using Likert-type responses and showed the validity of these coefficients, i.e., their discriminative ability, by subsequent linear classification (predictive DA). In particular, <xref ref-type="bibr" rid="bib80">Wang et al. (2016)</xref> examine a longitudinal data set but restrict their analysis to time point one when applying LDA. 
<xref ref-type="bibr" rid="bib78">Veronese and Pepe (2017)</xref> emphasize the need to explore the dynamic relations between their chosen subscales over time and point out their restriction to cross-sectional data in their LDA as a considerable limitation.</p></sec>
<sec id="s2_2"><title>Reference Datasets</title>
<p id="S2.SS2.Px1.p1">Two datasets differing in the number of repeated measurement occasions, as well as two modifications thereof, are used as reference datasets. Each original dataset comprises measurements of four continuous predictor variables, measured at two time points (CORE-OM dataset) and four time points (CASP-19 dataset), respectively. The binary outcome variable represents the group (<inline-formula><mml:math id="math-1" display="inline"><mml:mi>y</mml:mi><mml:mo>∈</mml:mo><mml:mo stretchy="false">{</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">}</mml:mo></mml:math></inline-formula>). Both of these standardized psychological questionnaires consist of Likert-type questions measured on a 5-point and 4-point Likert scale, respectively. Following the developers’ scoring guidelines, we used the mean score of multiple Likert items in case of the CORE-OM dataset, and the sum score in case of the CASP-19 dataset, as the basis for parameter estimation and subsequent data simulation.</p>
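The two scoring rules can be illustrated in a minimal Python sketch; the item responses below are hypothetical, coded as on a 0–4 Likert scale.

```python
import numpy as np

# Hypothetical responses of 3 participants to a 4-item subscale (coded 0-4)
items = np.array([
    [0, 1, 2, 1],
    [3, 4, 3, 2],
    [1, 1, 0, 0],
])

mean_score = items.mean(axis=1)  # CORE-OM-style mean score per person
sum_score = items.sum(axis=1)    # CASP-19-style sum score per person
```

Both scores preserve the same ordering of participants within a subscale; they differ only in scale, which matters when comparing subscales with different numbers of items.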
<p id="S2.SS1.Px1.p2">We created reference datasets from these data in order to compare the methods’ performance in different, near-realistic settings, not in order to draw any substantive conclusions about the data themselves. The datasets differ, among other things, in sample sizes, sample size ratios, class overlap, temporal variation, and number of measurement occasions.</p>
<p id="S2.SS1.Px1.p3">The first dataset (<xref ref-type="bibr" rid="bib85">Zeldovich, 2018</xref>) is based on the CORE-OM (Clinical Outcomes in Routine Evaluation-Outcome Measure; <xref ref-type="bibr" rid="bib4">Barkham et al., 1998</xref>), a self-report questionnaire of psychological distress. It assesses the progress of psychological or psychotherapeutic treatment using four domains (subjective well-being, problems/symptoms, life functioning, risk/harm) measured on a 5-point Likert scale (0: not at all, 1: only occasionally, 2: sometimes, 3: often, 4: most or all the time). Our dataset uses the binary variable hospitalisation as the group variable and is denoted as “Dataset 1” in the following. Non-hospitalised participants represent Group 0 (<inline-formula><mml:math id="math-2" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>42</mml:mn></mml:math></inline-formula>) and hospitalised ones Group 1 (<inline-formula><mml:math id="math-3" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>142</mml:mn></mml:math></inline-formula>).</p>
<p id="S2.SS1.Px1.p4">The second dataset is a self-report questionnaire of quality of life developed for adults aged 60 and older abbreviated to CASP-19 (<xref ref-type="bibr" rid="bib37">Hyde et al., 2003</xref>). The dataset on CASP-19 is derived from Waves 2, 3, 4, and 5 of the English Longitudinal Study of Ageing (ELSA) (<xref ref-type="bibr" rid="bib3">Banks et al., 2021</xref>). The CASP-19 questionnaire comprises four subdomains (Control, Autonomy, Self-realization, Pleasure) measured on a 4-point Likert scale (0: <italic>Often</italic>, 1: <italic>Sometimes</italic>, 2: <italic>Not often</italic>, 3: <italic>Never</italic>; reversed scale for some items). Loneliness as one of the factors affecting quality of life (<xref ref-type="bibr" rid="bib72">Talarska et al., 2018</xref>) is chosen as the group variable. For this purpose, the sample was dichotomized at a score value of three determined from two questions related to loneliness (“Old age is a time of loneliness”, “As I get older, I expect to become more lonely”), answered on a 5-point Likert scale (1: <italic>Strongly agree</italic>, 5: <italic>Strongly disagree</italic>) by the participants during Wave 2. Persons who feel less lonely represent Group 0 (<inline-formula><mml:math id="math-4" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>948</mml:mn></mml:math></inline-formula>) and those who feel more lonely represent Group 1 (<inline-formula><mml:math id="math-5" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1682</mml:mn></mml:math></inline-formula>). Since the group differences in these original data were only marginal, we modified them: all individuals of Group 1 were included in our reference dataset, but Group 0 was restricted to individuals whose scores on the variables “control” and “self-realization” lay above the respective 0.2 quantiles. The dataset is referred to as “Dataset 2” in the following.</p>
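The dichotomization and filtering steps can be sketched as follows. All data and variable names are illustrative assumptions, as are the direction and exact form of the loneliness cutoff; the sketch is not based on the ELSA data themselves.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
# Hypothetical Wave-2 data: two loneliness items (1-5) and two subscale scores
lonely1 = rng.integers(1, 6, n)
lonely2 = rng.integers(1, 6, n)
control = rng.normal(9, 2, n)
self_real = rng.normal(8, 2, n)

# Dichotomize at a combined loneliness score of three
# (lower item values = stronger agreement = feeling more lonely)
group = (lonely1 + lonely2 <= 3).astype(int)  # 1 = feels more lonely

# Keep all of Group 1; restrict Group 0 to individuals above the
# 0.2 quantiles of "control" and "self-realization"
is_g0 = group == 0
q_control = np.quantile(control[is_g0], 0.2)
q_self = np.quantile(self_real[is_g0], 0.2)
keep = ~is_g0 | ((control > q_control) & (self_real > q_self))
```

The boolean mask `keep` then selects the rows of the modified reference dataset.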
<p id="S2.SS1.Px1.p5">Answers to the questions of each subdomain in these questionnaires are summarized in a score, where a higher mean score corresponds to a higher level of distress (Dataset 1) and a higher sum score indicates a better quality of life (Dataset 2). Data simulations are based on these scores. Boxplots in <xref ref-type="fig" rid="fig-1">Figure 1a</xref> and <xref ref-type="fig" rid="fig-1">1b</xref> show the scores’ distributions in Reference Data 1 and 2, respectively. They indicate that, on average, individuals in one group obtain higher (or lower) scores than those in the other group irrespective of time point and variable, presumably facilitating classification in these datasets. Also, temporal variation in Dataset 2 is rather modest.</p>
	<p id="S2.SS1.Px1.p6">Therefore, we considered further scenarios beyond these two original datasets (Dataset 1: CORE-OM, Dataset 2: CASP-19). To test the methods under different conditions, we created three modified versions of these datasets (Dataset 3: modified CORE-OM with equal group means collapsed over time points and opposite temporal trends of the group means; Dataset 4: modified CASP-19, Time Points 1 and 2 only, with identical means but heterogeneous covariance matrices; Dataset 5: modified CASP-19, Time Points 1 and 2 only, with balanced class sizes obtained by random undersampling of Group 1). Dataset 3 was created by adding a constant specific to each variable to the data of Group 0 such that the collapsed means of both groups became equal, while maintaining the original boundaries of the measurement intervals. Then we swapped the data of the two time points for Variables 1, 2, and 3 in Group 0, such that the means of Group 0 follow an upward temporal trend compared to the downward temporal trend of the measurements in Group 1. For Dataset 4, only Time Points 1 and 2 are considered. We adjusted the means of Group 0 per time point such that they equal those of Group 1. For Dataset 5, also only Time Points 1 and 2 are considered, and a random subset of the larger Group 1 equalling the sample size of Group 0 was chosen in order to obtain a balanced scenario. The corresponding <monospace>R</monospace> code can be found in <xref ref-type="bibr" rid="bib31.5">Graf et al. (2025</xref>, code “availability”).</p>
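The mean-shift, time-point-swap, and undersampling modifications can be sketched in Python on synthetic data; the published <monospace>R</monospace> code remains authoritative, and the interval-boundary adjustment described above is omitted here for brevity.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical scores: (subjects, variables, time points)
X0 = rng.normal(1.5, 0.5, size=(42, 4, 2))   # Group 0
X1 = rng.normal(2.5, 0.5, size=(142, 4, 2))  # Group 1

# Dataset-3-style step 1: shift Group 0 by a per-variable constant so that
# the group means collapsed over time points coincide
shift = X1.mean(axis=(0, 2)) - X0.mean(axis=(0, 2))
X0_mod = X0 + shift[None, :, None]

# Step 2: swap the two time points for Variables 1-3 in Group 0
# to induce an opposite temporal trend (collapsed means are unchanged)
X0_mod[:, :3, :] = X0_mod[:, :3, ::-1].copy()

# Dataset-5-style step: random undersampling of the larger group
idx = rng.choice(len(X1), size=len(X0), replace=False)
X1_bal = X1[idx]
```

Swapping the time points changes only the temporal trend of Group 0, not its collapsed means, which is what keeps the Dataset 3 scenario focused on temporal structure.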
	<p id="S2.SS1.Px1.p7">With Dataset 4, the aim was to create data whose groups differ only in their covariance matrices. Homogeneity of covariance matrices can be tested using the well-known Box’s M test (<xref ref-type="bibr" rid="bib9">Box, 1949</xref>), but its reliability suffers when the multivariate normality assumption is even slightly violated (<xref ref-type="bibr" rid="bib73">Tiku &amp; Balakrishnan, 1984</xref>). For Dataset 4, the <inline-formula><mml:math id="math-6" display="inline"><mml:mi>p</mml:mi></mml:math></inline-formula>-value of the approximate <inline-formula><mml:math id="math-7" display="inline"><mml:msup><mml:mrow><mml:mi>χ</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> test statistic of the Box’s M test is <inline-formula><mml:math id="math-8" display="inline"><mml:mo>&lt;</mml:mo><mml:mo>.</mml:mo><mml:mn>001</mml:mn></mml:math></inline-formula> (<inline-formula><mml:math id="math-9" display="inline"><mml:msup><mml:mrow><mml:mi>χ</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula>(36) = 1789.9), but this significant test result may indicate a violation of normality rather than inequality of covariance matrices. 
Therefore, since the data significantly differ from multivariate normality (Table S4 in <xref ref-type="bibr" rid="bib31.5">Graf et al., 2025</xref>), we visually assessed the covariance matrices’ heterogeneity based on the components used for Box’s M test, i.e., the log determinants of the pooled and group covariance matrices (<inline-formula><mml:math id="math-10" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mspace width="0.3em"/><mml:mtext>and</mml:mtext><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula>), which equal the sum of the respective log eigenvalues. We use plots of log determinants with 95% confidence intervals and plots of log eigenvalues of the covariance matrices as suggested by <xref ref-type="bibr" rid="bib27">Friendly and Sigal (2020)</xref>. From <xref ref-type="fig" rid="fig-2">Figure 2</xref> we conclude substantial heterogeneity of the group covariance matrices <inline-formula><mml:math id="math-11" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="math-12" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> in Dataset 4. 
For Dataset 4, simulations are based on the estimates of <inline-formula><mml:math id="math-13" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="math-14" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula>, whereas for Datasets 1–3 and 5 they are based on the estimate of <inline-formula><mml:math id="math-15" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula> such that the LDA assumption of homogeneous covariance matrices holds. Figure S1 in <xref ref-type="bibr" rid="bib31.5">Graf et al. (2025)</xref> shows the plots for inspecting heterogeneity of covariance matrices for the other reference datasets as well. The homogeneity assumption is not fulfilled in any of the reference datasets. Boxplots in <xref ref-type="fig" rid="fig-1">Figure 1c–1e</xref> show the scores’ distributions in Reference Data 3–5.</p>
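The quantities underlying this visual check can be computed directly. The following Python sketch (our own illustration, assuming two groups' raw score matrices) obtains the log determinants of the group and pooled covariance matrices via the sum of log eigenvalues.

```python
import numpy as np

def log_det_cov(X):
    """Log determinant of the sample covariance matrix of X (n x p),
    computed as the sum of the log eigenvalues, which is numerically
    safer than log(det(S)) for large p."""
    S = np.cov(X, rowvar=False)
    eigvals = np.linalg.eigvalsh(S)   # symmetric matrix -> real eigenvalues
    return np.sum(np.log(eigvals))

def pooled_cov(X0, X1):
    """Pooled covariance matrix of two groups."""
    n0, n1 = len(X0), len(X1)
    return ((n0 - 1) * np.cov(X0, rowvar=False)
            + (n1 - 1) * np.cov(X1, rowvar=False)) / (n0 + n1 - 2)
```

Comparing `log_det_cov` of each group with that of the pooled matrix gives the same components that Box's M test aggregates, without relying on the test's normality-sensitive p-value.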
	<p id="S2.SS1.Px1.p8">We chose reference datasets with moderate temporal and cross-sectional correlations. Correlation matrices are shown in Table S1a–S1e of <xref ref-type="bibr" rid="bib31.5">Graf et al. (2025)</xref>. In this case, analyzing the data separately per time point or focusing on measurements of single variables over multiple time points, respectively, would ignore these correlations and yield less reliable results if, in fact, affiliation to one of the groups is affected by multiple correlated variables and/or time points (e.g., <xref ref-type="bibr" rid="bib30">Gnanadesikan &amp; Kettenring, 1984)</xref>.</p>
<table-wrap id="tab_1" position="anchor" orientation="portrait">
<label>Table 1</label><caption><title>Some Properties of the Reference Datasets and the Corresponding Simulation Scenarios Considered in the Simulation Study</title>
</caption>
	<table frame="hsides" rules="groups" style="compact-1; striped-#f3f3f3"><colgroup span="1">
<col width="" align="left" valign="bottom"/>
<col width=""/>
<col width=""/>
<col width="" align="left"/>
<col width=""/>
<col width="" align="left"/></colgroup>
<thead>
<tr>
<th valign="bottom">Dataset</th>
<th valign="bottom"># Variables</th>
<th valign="bottom"># Time Points</th>
<th valign="bottom">Sample Size</th>
<th valign="bottom">Simulation Covariance Matrix</th>
<th valign="bottom">Simulation Scenario Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><inline-formula><mml:math id="math-19" display="inline"><mml:mtext mathvariant="italic">Dataset 1</mml:mtext></mml:math></inline-formula></td>
<td>4</td>
<td>2</td>
<td><inline-formula><mml:math id="math-20" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>42</mml:mn><mml:mo>,</mml:mo><mml:mspace width="0.3em"/><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>142</mml:mn></mml:math></inline-formula></td>
<td><inline-formula><mml:math id="math-21" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula></td>
<td>unbalanced sample sizes, homogeneous covariance matrices, same temporal trends of group means</td>
</tr>
<tr>
<td><inline-formula><mml:math id="math-22" display="inline"><mml:mtext mathvariant="italic">Dataset 2</mml:mtext></mml:math></inline-formula></td>
<td>4</td>
<td>4</td>
<td><inline-formula><mml:math id="math-23" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>948</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1682</mml:mn></mml:math></inline-formula></td>
<td><inline-formula><mml:math id="math-24" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula></td>
<td>unbalanced sample sizes, homogeneous covariance matrices, same temporal trends of group means</td>
</tr>
<tr>
<td><inline-formula><mml:math id="math-25" display="inline"><mml:mtext mathvariant="italic">Dataset 3</mml:mtext></mml:math></inline-formula></td>
<td>4</td>
<td>2</td>
<td><inline-formula><mml:math id="math-26" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>42</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>142</mml:mn></mml:math></inline-formula></td>
<td><inline-formula><mml:math id="math-27" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula></td>
<td>unbalanced sample sizes, homogeneous covariance matrices, same group means collapsed over time, opposite temporal trends</td>
</tr>
<tr>
<td><inline-formula><mml:math id="math-28" display="inline"><mml:mtext mathvariant="italic">Dataset 4</mml:mtext></mml:math></inline-formula></td>
<td>4</td>
<td>2</td>
<td><inline-formula><mml:math id="math-29" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>948</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1682</mml:mn></mml:math></inline-formula></td>
<td><inline-formula><mml:math id="math-30" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula></td>
<td>unbalanced sample sizes, heterogeneous covariance matrices, same group means</td>
</tr>
<tr>
<td><inline-formula><mml:math id="math-31" display="inline"><mml:mtext mathvariant="italic">Dataset 5</mml:mtext></mml:math></inline-formula></td>
<td>4</td>
<td>2</td>
<td><inline-formula><mml:math id="math-32" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>948</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>948</mml:mn></mml:math></inline-formula></td>
<td><inline-formula><mml:math id="math-33" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula></td>
<td>balanced sample sizes, homogeneous covariance matrices, same temporal trends of group means</td>
</tr>
</tbody>
	
</table>
	<table-wrap-foot>
		<p><italic>Note.</italic> <inline-formula><mml:math id="math-16" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula> = pooled covariance matrix, <inline-formula><mml:math id="math-17" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> = covariance matrix Group 0, <inline-formula><mml:math id="math-18" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> = covariance matrix Group 1.</p>
	</table-wrap-foot>
</table-wrap>
	
	<fig id="fig-1" position="anchor" orientation="portrait"><label>Figure 1</label><caption><title>Boxplots Showing the Variables’ Distribution in the Reference Datasets</title>
<p><italic>Note. </italic>(a) Dataset 1: CORE-OM dataset, group variable <italic>hospitalisation</italic> (<inline-formula><mml:math id="math-34" display="inline"><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>42</mml:mn></mml:math></inline-formula>, <inline-formula><mml:math id="math-35" display="inline"><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>142</mml:mn></mml:math></inline-formula>, non-hospitalised individuals represent Group 0 and hospitalised individuals represent Group 1).</p>
<p>(b) Dataset 2: CASP-19 dataset, group variable <italic>loneliness</italic> (<inline-formula><mml:math id="math-36" display="inline"><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>948</mml:mn></mml:math></inline-formula>, <inline-formula><mml:math id="math-37" display="inline"><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1682</mml:mn></mml:math></inline-formula>, participants who feel less lonely represent Group 0 and participants who feel more lonely represent Group 1).</p>
<p>(c) Dataset 3 (modified Dataset 1): same collapsed means, group means with opposite temporal trends.</p>
<p>(d) Dataset 4 (modified Dataset 2, Time Points 1 &amp; 2): same means, group covariance matrices differ.</p>
<p>(e) Dataset 5 (modified Dataset 2, Time Points 1 &amp; 2): balanced class sizes by random undersampling of Group 1.</p></caption><graphic mimetype="image" mime-subtype="png" xlink:href="qcmb.14891-f1.png" position="anchor" orientation="portrait"/></fig><fig id="fig-2" position="anchor" orientation="portrait"><label>Figure 2</label><caption><title>Plots of the Components of Box’s M Test for Dataset 4</title><p><italic>Note. </italic>Left: log determinants of covariance matrices with asymptotic 95% confidence intervals (CI). Right: scree plots of log eigenvalues of the covariance matrices. Less overlap of CIs and higher differences between log eigenvalues, respectively, correspond to a higher degree of heterogeneity of the (group) covariance matrices. The figures indicate (significant) heterogeneity of covariance matrices.</p> </caption><graphic mimetype="image" mime-subtype="png" xlink:href="qcmb.14891-f2.png" position="anchor" orientation="portrait"/></fig></sec></sec>
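The two components of Box's M test shown in Figure 2 can be illustrated with standard numerical tools. The following sketch is our own illustration, not the authors' code; the simulated two-group data and all variable names are assumptions. It computes the quantities plotted in the two panels: the log determinants and the (sorted) log eigenvalues of the group covariance matrices.

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated stand-in for two groups measured on d*t = 8 features;
# Group 1's scatter is inflated so the covariance matrices are heterogeneous.
X0 = rng.normal(size=(200, 8))
X1 = 1.5 * rng.normal(size=(300, 8))

S0 = np.cov(X0, rowvar=False)
S1 = np.cov(X1, rowvar=False)

# Left panel of Figure 2: log determinants of the group covariances.
logdet0 = np.linalg.slogdet(S0)[1]
logdet1 = np.linalg.slogdet(S1)[1]

# Right panel: scree plots of the log eigenvalues (eigvalsh returns them
# in ascending order for symmetric matrices).
log_eig0 = np.log(np.linalg.eigvalsh(S0))
log_eig1 = np.log(np.linalg.eigvalsh(S1))
```

Clearly separated log determinants and diverging log-eigenvalue profiles correspond to heterogeneous group covariances, as in Dataset 4.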
<sec id="s3" sec-type="Method"><title>Method</title>
<p id="S3.p1">In the following section, we will describe the traditional repeated measures LDA, which relies on the multivariate normality assumption, its robust versions, and the nonparametric longitudinal SVM for the classification of nonnormally distributed repeated measures data. We will compare the performance of these methods in a neutral comparison study with respect to multiple performance measures. An overview of the considered methods is given in <xref ref-type="table" rid="tab_2">Table 2</xref>. Each classification method is considered both with and without preceding outlier removal by trimming algorithms. An overview of the steps in the simulation study is shown in <xref ref-type="table" rid="tab_3">Table 3</xref>. Further details are included in the <xref ref-type="sec" rid="s3_4">Simulation Study</xref> section.</p>
<p id="S3.p2">We consider a situation with a categorical outcome variable <inline-formula><mml:math id="math-38" display="inline"><mml:mi>y</mml:mi><mml:mo>∈</mml:mo><mml:mo stretchy="false">{</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">}</mml:mo></mml:math></inline-formula>, where measurements of <inline-formula><mml:math id="math-39" display="inline"><mml:mi>d</mml:mi></mml:math></inline-formula> variables are taken at <inline-formula><mml:math id="math-40" display="inline"><mml:mi>t</mml:mi></mml:math></inline-formula> consecutive time points instead of only a single time point in <inline-formula><mml:math id="math-41" display="inline"><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> individuals. We consider complete data, i.e., for each individual <inline-formula><mml:math id="math-42" display="inline"><mml:mi>j</mml:mi><mml:mo>∈</mml:mo><mml:mo stretchy="false">{</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>…</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">}</mml:mo></mml:math></inline-formula>, each measurement <inline-formula><mml:math id="math-43" display="inline"><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>…</mml:mo><mml:mo>,</mml:mo><mml:mi>d</mml:mi></mml:math></inline-formula> is taken at each time point <inline-formula><mml:math id="math-44" display="inline"><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>…</mml:mo><mml:mo>,</mml:mo><mml:mi>t</mml:mi></mml:math></inline-formula>. 
The aim is to estimate a classification rule from the (training) data that can classify new observations (from separate independent test data) into one of two groups.</p>
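As a concrete illustration of this setup (the array layout, names, and use of Dataset 1's group sizes are our own choices for the example, not the authors' code), each individual's <italic>d</italic> measurements at <italic>t</italic> time points can be stacked into a single vector of length <italic>dt</italic>:

```python
import numpy as np

rng = np.random.default_rng(1)
d, t = 4, 2            # variables and time points (as in Dataset 1)
n0, n1 = 42, 142       # group sizes, n = n0 + n1
n = n0 + n1

# Each row stacks one individual's d x t measurements into a dt-vector;
# y in {0, 1} holds the group label to be predicted for new observations.
X = rng.normal(size=(n, d * t))
y = np.concatenate([np.zeros(n0, dtype=int), np.ones(n1, dtype=int)])
```

A classification rule estimated on such training data is then evaluated on a separate, independently generated test array of the same shape.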
<table-wrap id="tab_2" position="float" orientation="portrait">
<label>Table 2</label><caption><title>Overview of the Considered Linear Classification Methods for Nonnormally Distributed Multivariate Repeated Measures Data</title></caption>
	<table frame="hsides" rules="groups" style="striped-#f3f3f3"><colgroup span="1">
<col width="" align="left"/>
<col width="" align="left"/>
<col width="" align="left"/></colgroup>
<thead>
<tr>
<th valign="bottom">Linear Classification Method</th>
<th valign="bottom">Description</th>
<th valign="bottom">Abbreviation</th>
</tr>
</thead>
<tbody>
<tr>
	<td>Repeated measures linear discriminant analysis (LDA)<sup>a</sup></td>
<td>Parametric method depending on estimates of the group means and common covariance matrix</td>
<td/>
</tr>
<tr>
<td>1) standard/traditional</td>
<td>(unstructured) pooled covariance matrix, requires multivariate normality (<xref ref-type="bibr" rid="bib48">Lix &amp; Sajobi, 2010</xref>)</td>
<td>LDA(<inline-formula><mml:math id="math-45" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>)</td>
</tr>
<tr>
<td>2) robust</td>
<td>a) (parsimonious) Kronecker product covariance estimated by flip-flop algorithm (<xref ref-type="bibr" rid="bib11">Brobbey, 2021</xref>)</td>
<td>LDA(<inline-formula><mml:math id="math-46" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>KP</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>)</td>
</tr>
<tr>
<td/>
<td>b) (unstructured) covariance matrix estimated using the joint Generalized Estimating Equations model (<xref ref-type="bibr" rid="bib12">Brobbey et al., 2022</xref>)</td>
<td>LDA(GEE)</td>
</tr>
<tr>
	<td>Longitudinal Support Vector Machine (SVM) using a linear kernel<sup>b</sup></td>
	<td>Nonparametric method independent of distributional assumptions (<xref ref-type="bibr" rid="bib17">Chen &amp; Bowman, 2011</xref>)</td>
<td>SVM</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
	<p><italic>Note.</italic> The performance of each classification method is estimated both with and without preceding multivariate outlier removal (using either the Minimum Volume Ellipsoid (MVE) or the Minimum Covariance Determinant (MCD) algorithm).</p><p><sup>a</sup>see the <xref ref-type="sec" rid="s3_1">Multivariate Repeated Measures LDA</xref> sub-section.</p><p><sup>b</sup>see the <xref ref-type="sec" rid="s3_2">Longitudinal Support Vector Machine</xref> sub-section.</p>
	</table-wrap-foot>
</table-wrap>
<sec id="s3_1"><title>Multivariate Repeated Measures LDA</title>
<p id="S3.SS1.Px1.p1">For LDA, the unknown parameters <inline-formula><mml:math id="math-47" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>∈</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>, i.e., the group-specific mean vectors, and <inline-formula><mml:math id="math-48" display="inline"><mml:mi mathvariant="bold">Σ</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi><mml:mo>×</mml:mo><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>, i.e., the pooled covariance matrix, need to be estimated from the data</p> 

	<p><disp-formula><mml:math id="math-49"><mml:msub><mml:mfenced open="{" close="}"><mml:mrow><mml:msubsup><mml:mi mathvariant="bold">X</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi><mml:mn>1</mml:mn></mml:mrow><mml:mi>T</mml:mi></mml:msubsup><mml:mo>,</mml:mo><mml:mo>⋯</mml:mo><mml:mo>,</mml:mo><mml:msubsup><mml:mi mathvariant="bold">X</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mi>T</mml:mi></mml:msubsup></mml:mrow></mml:mfenced><mml:munder accentunder="false"><mml:mrow><mml:mi>i</mml:mi><mml:mo>∈</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>⋯</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>n</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:munder></mml:msub><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>×</mml:mo><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msup></mml:math>,</disp-formula></p>

<p>where <inline-formula><mml:math id="math-50" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mo>∈</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> are continuous measurements. Here, <inline-formula><mml:math id="math-51" display="inline"><mml:mi>i</mml:mi><mml:mo>∈</mml:mo><mml:mo stretchy="false">{</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">}</mml:mo></mml:math></inline-formula> represents the group label, <inline-formula><mml:math id="math-52" display="inline"><mml:mi>j</mml:mi><mml:mo>∈</mml:mo><mml:mo stretchy="false">{</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>…</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">}</mml:mo></mml:math></inline-formula> the patient, <inline-formula><mml:math id="math-53" display="inline"><mml:mi>k</mml:mi><mml:mo>∈</mml:mo><mml:mo stretchy="false">{</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>…</mml:mo><mml:mo>,</mml:mo><mml:mi>t</mml:mi><mml:mo stretchy="false">}</mml:mo></mml:math></inline-formula> the time point, and <inline-formula><mml:math id="math-54" display="inline"><mml:mi>d</mml:mi></mml:math></inline-formula> the number of variables. The total sample size is denoted by <inline-formula><mml:math id="math-55" display="inline"><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula>. 
The covariance matrix <inline-formula><mml:math id="math-56" display="inline"><mml:mi mathvariant="bold">Σ</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi><mml:mo>×</mml:mo><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> is assumed to be positive definite. The traditional LDA assumes multivariate normality of the data,</p> 
	
	<p><disp-formula><mml:math id="math-57"><mml:msub><mml:mrow><mml:mi mathvariant="bold">X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mover><mml:mrow><mml:mo>∼</mml:mo></mml:mrow><mml:mrow><mml:mrow><mml:mtext>iid</mml:mtext></mml:mrow></mml:mrow></mml:mover><mml:msub><mml:mrow><mml:mi mathvariant="script">N</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="bold">Σ</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:math>,</disp-formula></p>
	
	<p>as well as equality of group covariance matrices (homoscedasticity), <inline-formula><mml:math id="math-58" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi mathvariant="bold">Σ</mml:mi></mml:math></inline-formula>. <xref ref-type="bibr" rid="bib11">Brobbey (2021)</xref> and <xref ref-type="bibr" rid="bib12">Brobbey et al. (2022)</xref> developed two approaches for robust LDA (when data deviate from multivariate normality) based on the Kronecker product estimate of the covariance matrix <inline-formula><mml:math id="math-59" display="inline"><mml:mi mathvariant="bold">Σ</mml:mi></mml:math></inline-formula> that will be described in the <xref ref-type="sec" rid="s3_1_1">Robust Trimmed Likelihood LDA for Multivariate Repeated Measures Data</xref> section and in the <xref ref-type="sec" rid="s3_1_2">Generalized Estimation Equations (GEE) Discriminant Analysis for Repeated Measures Data</xref> section. Here, we will briefly explain the rationale behind these modified LDA approaches and introduce the general LDA classification rule.</p>
	<p id="S3.SS1.Px1.p2">Assuming that <inline-formula><mml:math id="math-60" display="inline"><mml:mi mathvariant="bold">Σ</mml:mi></mml:math></inline-formula> is unstructured, all distinct correlations between each pair of the <inline-formula><mml:math id="math-61" display="inline"><mml:mi>d</mml:mi></mml:math></inline-formula> variables and each combination of the <inline-formula><mml:math id="math-62" display="inline"><mml:mi>t</mml:mi></mml:math></inline-formula> time points must be estimated. If the dataset is small, i.e., if <inline-formula><mml:math id="math-64" display="inline"><mml:mi>n</mml:mi><mml:mo>≤</mml:mo><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:math></inline-formula>, the estimate <inline-formula><mml:math id="math-63" display="inline"><mml:mover accent="false"><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:math></inline-formula> becomes singular. In order to reduce the complexity of <inline-formula><mml:math id="math-65" display="inline"><mml:mi mathvariant="bold">Σ</mml:mi></mml:math></inline-formula> or to estimate <inline-formula><mml:math id="math-66" display="inline"><mml:mi mathvariant="bold">Σ</mml:mi></mml:math></inline-formula> more efficiently, a reduced number of parameters can be considered by assuming, for example, a Kronecker product structure <inline-formula><mml:math id="math-67" display="inline"><mml:mi mathvariant="bold">Σ</mml:mi><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mo>×</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>⊗</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mo>×</mml:mo><mml:mi>d</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>. 
Here, <inline-formula><mml:math id="math-68" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mo>×</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>∈</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mo>×</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> comprises the correlations between the <inline-formula><mml:math id="math-69" display="inline"><mml:mi>t</mml:mi></mml:math></inline-formula> time points and <inline-formula><mml:math id="math-70" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mo>×</mml:mo><mml:mi>d</mml:mi></mml:mrow></mml:msub><mml:mo>∈</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mo>×</mml:mo><mml:mi>d</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> comprises the correlations between the <inline-formula><mml:math id="math-71" display="inline"><mml:mi>d</mml:mi></mml:math></inline-formula> variables. 
The number of unknown parameters reduces from <inline-formula><mml:math id="math-72" display="inline"><mml:mo stretchy="false">(</mml:mo><mml:mi>d</mml:mi><mml:mi>t</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>d</mml:mi><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo><mml:mo>/</mml:mo><mml:mn>2</mml:mn><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula> for an unstructured covariance matrix to <inline-formula><mml:math id="math-73" display="inline"><mml:mi>d</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>d</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo><mml:mo>/</mml:mo><mml:mn>2</mml:mn><mml:mo>+</mml:mo><mml:mi>t</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo><mml:mo>/</mml:mo><mml:mn>2</mml:mn></mml:math></inline-formula> for a Kronecker product covariance matrix (<xref ref-type="bibr" rid="bib52">Naik &amp; Rao, 2001</xref>). It can be estimated by the flip-flop algorithm, which gives maximum likelihood estimates of <inline-formula><mml:math id="math-74" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mo>×</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="math-75" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mo>×</mml:mo><mml:mi>d</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> (<xref ref-type="bibr" rid="bib49">Lu &amp; Zimmerman, 2005</xref>). The flip-flop algorithm is suitable when each observation can be separated with respect to two factors, such as the time points and variables in the case of multivariate longitudinal data.</p>
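The alternating structure of the flip-flop estimator can be sketched as follows. This is our own simplified illustration (centered data, a fixed iteration count, no convergence check or scale-identifiability handling), not the implementation used in the study; the full estimator is described by Lu and Zimmerman (2005).

```python
import numpy as np

def flip_flop(Xs, n_iter=100):
    """Alternate closed-form updates for Sigma_{t x t} and Sigma_{d x d}
    in the Kronecker product model Sigma = Sigma_{t x t} (x) Sigma_{d x d}.
    Xs: centered observation matrices of shape (n, d, t)."""
    n, d, t = Xs.shape
    sigma_t = np.eye(t)                       # initial value for Sigma_{t x t}
    for _ in range(n_iter):
        inv_t = np.linalg.inv(sigma_t)
        # Update Sigma_{d x d} holding Sigma_{t x t} fixed ...
        sigma_d = sum(x @ inv_t @ x.T for x in Xs) / (n * t)
        inv_d = np.linalg.inv(sigma_d)
        # ... then Sigma_{t x t} holding Sigma_{d x d} fixed.
        sigma_t = sum(x.T @ inv_d @ x for x in Xs) / (n * d)
    return sigma_t, sigma_d

# Parameter counts for d = 4, t = 2: unstructured dt(dt+1)/2 = 36
# versus d(d+1)/2 + t(t+1)/2 = 13 under the Kronecker product structure.
```

Each update is the maximum likelihood estimate of one factor given the other, which is why iterating the two steps converges to the joint MLE under the Kronecker product model.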
<p id="S3.SS1.Px1.p3">The LDA classification rule states that a new observation <inline-formula><mml:math id="math-76" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>∈</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> is assigned to class 0 if
<disp-formula id="e_1">
<mml:math id="x4" display="block"><mml:mtable columnalign="left"><mml:mtr><mml:mtd columnalign="right"><mml:msup><mml:mrow><mml:mfenced separators="" open="(" close=")"><mml:mrow><mml:msub><mml:mrow><mml:mi mathvariant="bold">X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>−</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:mfrac></mml:mrow></mml:mfenced></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msup><mml:msup><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>−</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo stretchy="false">)</mml:mo><mml:mo>&gt;</mml:mo><mml:mo>log</mml:mo><mml:mfenced separators="" open="(" close=")"><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>π</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>π</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mrow></mml:mfenced></mml:mtd></mml:mtr></mml:mtable></mml:math>
</disp-formula></p>
	<p id="S3.SS1.Px1.p4">where <inline-formula><mml:math id="math-77" display="inline"><mml:msub><mml:mrow><mml:mi>π</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mo>∈</mml:mo><mml:mo stretchy="false">{</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">}</mml:mo><mml:mo>,</mml:mo></mml:math></inline-formula> is the prior probability of class <inline-formula><mml:math id="math-78" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="math-79" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="math-80" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> the respective group means, and <inline-formula><mml:math id="math-81" display="inline"><mml:msup><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> is the inverse covariance matrix (<xref ref-type="bibr" rid="bib48">Lix &amp; Sajobi, 2010</xref>). In the methods by <xref ref-type="bibr" rid="bib11">Brobbey (2021)</xref> and <xref ref-type="bibr" rid="bib12">Brobbey et al. 
(2022)</xref>, <inline-formula><mml:math id="math-82" display="inline"><mml:msup><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> is replaced by <inline-formula><mml:math id="math-83" display="inline"><mml:msubsup><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mo>×</mml:mo><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>⊗</mml:mo><mml:msubsup><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mo>×</mml:mo><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula>.</p>
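The classification rule translates directly into code. The helper below is our own sketch, not library code or the authors' implementation; for the robust Kronecker variant, <monospace>sigma_inv</monospace> would be built as <monospace>np.kron(inv(sigma_t), inv(sigma_d))</monospace>.

```python
import numpy as np

def lda_classify(x, mu0, mu1, sigma_inv, pi0=0.5, pi1=0.5):
    """Assign x to Group 0 iff
    (x - (mu0 + mu1)/2)^T Sigma^{-1} (mu0 - mu1) > log(pi1 / pi0)."""
    score = (x - (mu0 + mu1) / 2) @ sigma_inv @ (mu0 - mu1)
    return 0 if score > np.log(pi1 / pi0) else 1

# With identity covariance and equal priors, a point at mu0 is assigned
# to Group 0 and a point at mu1 to Group 1.
mu0, mu1 = np.array([1.0, 0.0]), np.array([-1.0, 0.0])
assert lda_classify(mu0, mu0, mu1, np.eye(2)) == 0
assert lda_classify(mu1, mu0, mu1, np.eye(2)) == 1
```

Unequal priors shift the threshold log(π₁/π₀) away from zero, so the boundary moves toward the less prevalent group.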
<sec id="s3_1_1"><title>Robust Trimmed Likelihood LDA for Multivariate Repeated Measures Data</title>
<p id="S3.SS1.SSS1.Px1.p1">The rationale behind robust trimmed likelihood LDA for multivariate repeated measures data (<xref ref-type="bibr" rid="bib11">Brobbey, 2021</xref>) is to use more robust estimators of the sample mean and covariance matrix in order to increase the accuracy of LDA predictions in new data. It can also be used as a supporting analysis alongside the traditional LDA to show that the results are not severely affected by outliers.</p>
	<p id="S3.SS1.SSS1.Px1.p2">Many estimators of these sample statistics are particularly prone to outliers, which are hard to detect in multivariate data with <inline-formula><mml:math id="math-84" display="inline"><mml:mi>d</mml:mi><mml:mo>&gt;</mml:mo><mml:mn>2</mml:mn></mml:math></inline-formula> variables. A popular measure of robustness, the finite sample breakdown point by <xref ref-type="bibr" rid="bib21">Donoho (1982)</xref> and <xref ref-type="bibr" rid="bib22">Donoho and Huber (1983)</xref>, is the smallest number or fraction of extremely small or large values that must be added to the original sample to drive the statistic to an arbitrarily large value. While many estimators of multivariate location and scatter break down when adding <inline-formula><mml:math id="math-85" display="inline"><mml:mi>n</mml:mi><mml:mo>/</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mi>d</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula> outliers (<xref ref-type="bibr" rid="bib21">Donoho, 1982</xref>), estimators based on the Minimum Volume Ellipsoid (MVE) and Minimum Covariance Determinant (MCD) algorithms (<xref ref-type="bibr" rid="bib61">Rousseeuw, 1985</xref>) have a substantially higher breakdown point of <inline-formula><mml:math id="math-86" display="inline"><mml:mo stretchy="false">(</mml:mo><mml:mo>⌊</mml:mo><mml:mi>n</mml:mi><mml:mo>/</mml:mo><mml:mn>2</mml:mn><mml:mo>⌋</mml:mo><mml:mo>−</mml:mo><mml:mi>d</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo><mml:mo>/</mml:mo><mml:mi>n</mml:mi></mml:math></inline-formula> (<xref ref-type="bibr" rid="bib62">Rousseeuw &amp; Driessen, 1999</xref>; <xref ref-type="bibr" rid="bib83">Woodruff &amp; Rocke, 1993</xref>). 
The high-breakdown linear discriminant analysis (<xref ref-type="bibr" rid="bib34">Hawkins &amp; McLachlan, 1997</xref>) for cross-sectional data, for example, is based on the MCD algorithm and has already been implemented in the <monospace>R</monospace> package <monospace>rrcov</monospace> (<xref ref-type="bibr" rid="bib74">Todorov, 2022</xref>).</p>
	<p id="S3.SS1.SSS1.Px1.p3">The MCD is statistically more efficient than the MVE algorithm because it is asymptotically normal (<xref ref-type="bibr" rid="bib13">Butler et al., 1993</xref>), and its distances are more precise, i.e., it is better at detecting outliers (<xref ref-type="bibr" rid="bib62">Rousseeuw &amp; Driessen, 1999</xref>). The MCD algorithm takes subsets of size <inline-formula><mml:math id="math-87" display="inline"><mml:mo stretchy="false">(</mml:mo><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mi>d</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo><mml:mo>/</mml:mo><mml:mn>2</mml:mn><mml:mo>≤</mml:mo><mml:mi>h</mml:mi><mml:mo>≤</mml:mo><mml:mi>n</mml:mi></mml:math></inline-formula> of the dataset (for <inline-formula><mml:math id="math-88" display="inline"><mml:mi>h</mml:mi><mml:mo>&gt;</mml:mo><mml:mi>d</mml:mi></mml:math></inline-formula>) and determines the particular subset of <inline-formula><mml:math id="math-89" display="inline"><mml:mi>h</mml:mi></mml:math></inline-formula> observations out of the <inline-formula><mml:math id="math-90" display="inline"><mml:mfenced separators="" open="(" close=")"><mml:mfrac linethickness="0.0pt"><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>h</mml:mi></mml:mrow></mml:mfrac></mml:mfenced></mml:math></inline-formula> possible subsets for which the determinant of the sample covariance <inline-formula><mml:math id="math-91" display="inline"><mml:mover accent="false"><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:math></inline-formula> becomes minimal. The MVE algorithm chooses the subset of <inline-formula><mml:math id="math-92" display="inline"><mml:mi>h</mml:mi></mml:math></inline-formula> observations for which the volume of the ellipsoid containing all <inline-formula><mml:math id="math-93" display="inline"><mml:mi>h</mml:mi></mml:math></inline-formula> data points is minimal.</p>
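To make the subset-search idea concrete, here is a deliberately naive exhaustive MCD. This is our own illustration of the criterion only: practical implementations, such as FAST-MCD (Rousseeuw &amp; Driessen, 1999) as used in the R package rrcov, avoid enumerating all subsets and use iterative concentration steps instead.

```python
from itertools import combinations
import numpy as np

def mcd_exhaustive(X, h):
    """Return the indices of the h-subset of rows of X whose sample
    covariance has the smallest determinant (feasible only for tiny n)."""
    best_idx, best_det = None, np.inf
    for idx in combinations(range(len(X)), h):
        det = np.linalg.det(np.cov(X[list(idx)], rowvar=False))
        if det < best_det:
            best_idx, best_det = idx, det
    return list(best_idx)

# A gross outlier inflates the covariance determinant of any subset that
# contains it, so it is excluded from the minimizing subset.
rng = np.random.default_rng(2)
X = rng.normal(size=(10, 2))
X[3] = [100.0, 100.0]          # plant one outlier at index 3
assert 3 not in mcd_exhaustive(X, h=9)
```

Location and scatter are then estimated from the retained h observations, which is what makes the resulting LDA inputs robust.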
<p id="S3.SS1.SSS1.Px1.p4"><xref ref-type="bibr" rid="bib11">Brobbey (2021)</xref> suggests estimating the class means <inline-formula><mml:math id="math-94" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="math-95" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> as well as the common covariance matrix <inline-formula><mml:math id="math-96" display="inline"><mml:mi mathvariant="bold">Σ</mml:mi></mml:math></inline-formula> from the reduced dataset obtained after applying the MCD or MVE algorithm, respectively. She furthermore suggests estimating a Kronecker product structure for the covariance matrix, since it is more parsimonious than the unstructured equivalent, which may not be estimable for small sample sizes. We apply both versions: once we estimate the unstructured pooled covariance matrix
<disp-formula id="e_2">
	<mml:math id="x1" display="block"><mml:mtable columnalign="left"><mml:mtr><mml:mtd columnalign="right"><mml:mstyle mathvariant="bold-italic"><mml:mover accent="false"><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mstyle><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>−</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mover accent="false"><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mstyle></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>−</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mover accent="false"><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mstyle></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>−</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo><mml:mo>+</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>−</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math>
</disp-formula>
</p>
	<p id="S3.SS1.SSS1.Px1.p5">and once the Kronecker product covariance <inline-formula><mml:math id="math-97" display="inline"><mml:mstyle mathvariant="bold-italic"><mml:mover accent="false"><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mstyle><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mover accent="false"><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mo>×</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>⊗</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mover accent="false"><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mstyle></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mo>×</mml:mo><mml:mi>d</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, where <inline-formula><mml:math id="math-98" display="inline"><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mover accent="false"><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mo>×</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="math-99" display="inline"><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mover accent="false"><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mstyle></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mo>×</mml:mo><mml:mi>d</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> are the pooled covariances between the <inline-formula><mml:math id="math-100" display="inline"><mml:mi>t</mml:mi></mml:math></inline-formula> time points and <inline-formula><mml:math id="math-101" display="inline"><mml:mi>d</mml:mi></mml:math></inline-formula> variables, respectively. 
The flip-flop algorithm (<xref ref-type="bibr" rid="bib49">Lu &amp; Zimmerman, 2005</xref>) is used to estimate <inline-formula><mml:math id="math-102" display="inline"><mml:msubsup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mover accent="false"><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mo>×</mml:mo><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> and <inline-formula>
		<mml:math id="math-103" display="inline"><mml:msubsup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mover accent="false"><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mstyle></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mo>×</mml:mo><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mo>∈</mml:mo><mml:mo stretchy="false">{</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">}</mml:mo></mml:math></inline-formula> from the data.</p></sec>
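A minimal numpy sketch of the flip-flop iteration for a Kronecker product covariance, assuming complete data arranged as n matrices of dimension t × d (the function name and initialization are our illustrative choices; note that the two Kronecker factors are only identified up to a scalar):

```python
import numpy as np

def flip_flop(X, tol=1e-4, max_iter=100):
    """Alternately update the t x t (time) and d x d (variable) factors
    of a Kronecker product covariance until the Frobenius norm of the
    change in kron(S_t, S_d) falls below tol.
    X has shape (n, t, d): n subjects, t time points, d variables."""
    n, t, d = X.shape
    Xc = X - X.mean(axis=0)             # center each time/variable cell
    S_t, S_d = np.eye(t), np.eye(d)     # simple starting values
    kron_old = np.kron(S_t, S_d)
    for _ in range(max_iter):
        S_d_inv = np.linalg.inv(S_d)
        S_t = sum(Xi @ S_d_inv @ Xi.T for Xi in Xc) / (n * d)
        S_t_inv = np.linalg.inv(S_t)
        S_d = sum(Xi.T @ S_t_inv @ Xi for Xi in Xc) / (n * t)
        kron_new = np.kron(S_t, S_d)
        if np.linalg.norm(kron_new - kron_old, "fro") <= tol:
            break
        kron_old = kron_new
    return S_t, S_d

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3, 2))         # 40 subjects, t = 3, d = 2
S_t, S_d = flip_flop(X)
Sigma = np.kron(S_t, S_d)               # (t*d) x (t*d) covariance
```

The stopping rule mirrors the Frobenius-norm criterion described later for the LDA(Σ_KP) algorithm.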
<sec id="s3_1_2"><title>Generalized Estimating Equations (GEE) Discriminant Analysis for Repeated Measures Data</title>
	<p id="S1.SS2.Px1.p1">Joint Generalized Estimating Equations (GEEs) are another way to derive more robust estimates of the sample means and covariance matrix from multivariate longitudinal data (<xref ref-type="bibr" rid="bib12">Brobbey et al., 2022</xref>; <xref ref-type="bibr" rid="bib38">Inan, 2015</xref>). GEEs provide population-level parameter estimates, which are consistent and asymptotically normally distributed even if the working correlation structure of the outcome variables is misspecified. The covariance matrix is estimated by a robust sandwich estimator (<xref ref-type="bibr" rid="bib32">Hardin &amp; Hilbe, 2013</xref>). <xref ref-type="bibr" rid="bib12">Brobbey et al. (2022)</xref> proposed the use of GEEs for multivariate repeated measures data in the context of repeated measures LDA as implemented by <xref ref-type="bibr" rid="bib38">Inan (2015)</xref>. The population-level estimates (<inline-formula><mml:math id="math-104" display="inline"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi mathvariant="bold-italic">μ</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi mathvariant="bold-italic">μ</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mover accent="false"><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:math></inline-formula>) of the GEE model are plugged into the repeated measures LDA classification rule. 
For parsimony, the joint GEE model by <xref ref-type="bibr" rid="bib38">Inan (2015)</xref> uses a decomposition of the working correlation matrix into a <inline-formula><mml:math id="math-105" display="inline"><mml:mi>t</mml:mi><mml:mo>×</mml:mo><mml:mi>t</mml:mi></mml:math></inline-formula> within- and a <inline-formula><mml:math id="math-106" display="inline"><mml:mi>d</mml:mi><mml:mo>×</mml:mo><mml:mi>d</mml:mi></mml:math></inline-formula> between-multivariate response correlation matrix through the Kronecker product. We fitted the joint GEE model by <xref ref-type="bibr" rid="bib38">Inan (2015)</xref> to the data of each group <inline-formula><mml:math id="math-107" display="inline"><mml:mi>i</mml:mi><mml:mo>∈</mml:mo><mml:mo stretchy="false">{</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">}</mml:mo></mml:math></inline-formula> to obtain the class-specific means and covariance matrix estimates, which we subsequently pooled to obtain the common covariance matrix of the entire dataset. Further details on the approach are given in Supplementary Material S.1 (see <xref ref-type="bibr" rid="bib31.5">Graf et al., 2025)</xref>.</p></sec></sec>
<sec id="s3_2"><title>Longitudinal Support Vector Machine</title>
	<p id="S3.SS2.Px1.p1">The original linear SVM for cross-sectional data and linearly separable classes (<xref ref-type="bibr" rid="bib77">Vapnik, 1982</xref>) has been modified to allow some overlap between the samples of the two classes (<xref ref-type="bibr" rid="bib19">Cortes &amp; Vapnik, 1995</xref>). <xref ref-type="bibr" rid="bib17">Chen and Bowman (2011)</xref> further generalized this SVM classifier so that it is applicable to longitudinal data. In their longitudinal SVM algorithm, temporal changes are modeled by considering a linear combination of the observations <inline-formula><mml:math id="math-108" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold">x</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>∈</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> and a parameter vector <inline-formula><mml:math id="math-109" display="inline"><mml:mi mathvariant="bold-italic">β</mml:mi><mml:mo>=</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>β</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>…</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>β</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula>, which represents the coefficients for each time point <inline-formula><mml:math id="math-110" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>. 
Then the combined observations <inline-formula><mml:math id="math-111" display="inline"><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi mathvariant="bold">x</mml:mi></mml:mrow><mml:mo>~</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold">x</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mrow><mml:mi>β</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi mathvariant="bold">x</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mo>⋯</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mrow><mml:mi>β</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi mathvariant="bold">x</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> are provided as input to the traditional SVM. Combining the <inline-formula><mml:math id="math-112" display="inline"><mml:mi>d</mml:mi></mml:math></inline-formula> observations from all <inline-formula><mml:math id="math-113" display="inline"><mml:mi>t</mml:mi></mml:math></inline-formula> time points in a single vector assumes equally spaced time points. The approach also assumes a fixed number of <inline-formula><mml:math id="math-114" display="inline"><mml:mi>d</mml:mi></mml:math></inline-formula> observations per time point <inline-formula><mml:math id="math-115" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula> (complete data), just as LDA does. 
Although this SVM classifier can also estimate nonlinear decision boundaries, depending on the type of kernel matrix that is used, we apply a linear kernel: this allows a direct comparison with the other linear classifiers, and for a linear kernel the absolute values of the weight vector can be interpreted as variable importances.</p>
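The temporal weighting step can be illustrated as follows (an illustrative sketch, not Chen and Bowman's implementation); the collapsed vectors can then be fed to any standard linear soft-margin SVM:

```python
import numpy as np

def collapse_time(X, beta_free):
    """Collapse repeated measures into one vector per subject:
    x_tilde_j = x_j1 + beta_1 * x_j2 + ... + beta_{t-1} * x_jt.
    X has shape (n, t, d); beta_free holds the t - 1 free
    coefficients (the first time point's coefficient is fixed at 1)."""
    beta = np.concatenate(([1.0], np.asarray(beta_free, dtype=float)))
    # weighted sum over the time axis -> shape (n, d)
    return np.einsum("t,ntd->nd", beta, X)

# two subjects, t = 3 time points, d = 2 variables
X = np.arange(12, dtype=float).reshape(2, 3, 2)
x_tilde = collapse_time(X, beta_free=[0.5, 0.25])
# x_tilde has shape (2, 2): one d-dimensional vector per subject
```

In the full algorithm, the SVM parameters and the temporal coefficients β are optimized alternately rather than β being fixed in advance.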
<p id="S2.SS2.Px1.p2">Although the SVM algorithm does not make any distributional assumptions, the regularization parameter <italic>C</italic> needs to be optimized. We use the SSVMP algorithm (<xref ref-type="bibr" rid="bib66">Sentelle et al., 2016</xref>), a modification of the SVMpath algorithm (<xref ref-type="bibr" rid="bib33">Hastie et al., 2004</xref>), to find the optimal value of <italic>C</italic>. In contrast to the original version by <xref ref-type="bibr" rid="bib33">Hastie et al. (2004)</xref>, the SSVMP algorithm is applicable to unequal class sizes and semidefinite kernel matrices. The path algorithm finds the optimal value <inline-formula><mml:math id="math-116" display="inline"><mml:msup><mml:mrow><mml:mi>λ</mml:mi></mml:mrow><mml:mrow><mml:mtext>SVM</mml:mtext></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>/</mml:mo><mml:mi>C</mml:mi></mml:math></inline-formula> with high accuracy, since it considers all possible values of <italic>C</italic>, while remaining computationally efficient compared to the generally recommended grid search. It has been shown that the choice of <italic>C</italic> can be critical for the generalizability of the SVM model (<xref ref-type="bibr" rid="bib33">Hastie et al., 2004</xref>).</p>
<p id="S2.SS2.Px1.p3">The SSVMP algorithm (<xref ref-type="bibr" rid="bib65">Sentelle, 2015</xref>; <xref ref-type="bibr" rid="bib66">Sentelle et al., 2016</xref>) optimizes the inverse of the regularization parameter, <inline-formula><mml:math id="math-117" display="inline"><mml:msup><mml:mrow><mml:mi>λ</mml:mi></mml:mrow><mml:mrow><mml:mtext>SVM</mml:mtext></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>/</mml:mo><mml:mi>C</mml:mi></mml:math></inline-formula>. Starting with a high value of <inline-formula><mml:math id="math-118" display="inline"><mml:msup><mml:mrow><mml:mi>λ</mml:mi></mml:mrow><mml:mrow><mml:mtext>SVM</mml:mtext></mml:mrow></mml:msup></mml:math></inline-formula> such that all samples lie within the margin of the SVM, it successively determines a strictly decreasing sequence of <inline-formula><mml:math id="math-119" display="inline"><mml:msup><mml:mrow><mml:mi>λ</mml:mi></mml:mrow><mml:mrow><mml:mtext>SVM</mml:mtext></mml:mrow></mml:msup></mml:math></inline-formula> values for which the set of support vectors changes for each <inline-formula><mml:math id="math-120" display="inline"><mml:msup><mml:mrow><mml:mi>λ</mml:mi></mml:mrow><mml:mrow><mml:mtext>SVM</mml:mtext></mml:mrow></mml:msup></mml:math></inline-formula> value, and it stops if no more observations are left inside of the margin (linearly separable case) or if the next <inline-formula><mml:math id="math-121" display="inline"><mml:msup><mml:mrow><mml:mi>λ</mml:mi></mml:mrow><mml:mrow><mml:mtext>SVM</mml:mtext></mml:mrow></mml:msup></mml:math></inline-formula> value would be zero.</p>
	<p id="S2.SS2.Px1.p4">The longitudinal SVM algorithm by <xref ref-type="bibr" rid="bib17">Chen and Bowman (2011)</xref> requires specifying a maximum number of iterations used for finding the optimal separating hyperplane parameters. In our case, the iterative algorithm for optimization of the Lagrange multipliers <inline-formula><mml:math id="math-122" display="inline"><mml:mi mathvariant="bold-italic">α</mml:mi></mml:math></inline-formula> and temporal change parameters <inline-formula><mml:math id="math-123" display="inline"><mml:mi mathvariant="bold-italic">β</mml:mi></mml:math></inline-formula> in the longitudinal SVM is repeated until the Euclidean distance between two consecutive estimates of <inline-formula><mml:math id="math-124" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">α</mml:mi></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> becomes less than <inline-formula><mml:math id="math-125" display="inline"><mml:mn>1</mml:mn><mml:mi>E</mml:mi><mml:mo>−</mml:mo><mml:mn>08</mml:mn></mml:math></inline-formula> or the maximum number of 100 iterative steps is reached. A summary of the longitudinal SVM algorithm using the linear soft-margin approach can be found in Supplementary Material S.2 (see <xref ref-type="bibr" rid="bib31.5">Graf et al., 2025</xref>).</p>
<sec id="s3_2_1"><title>Nonparametric Bootstrap Approach</title>
	<p id="S2.SS2.SSS1.Px1.p1">The nonparametric bootstrap approach for point estimates by <xref ref-type="bibr" rid="bib79">Wahl et al. (2016)</xref> is an extension of the algorithm by <xref ref-type="bibr" rid="bib41">Jiang et al. (2008)</xref> and is based on the .632+ bootstrap method (<xref ref-type="bibr" rid="bib23">Efron &amp; Tibshirani, 1997</xref>), and thus assumes independence of observations. It yields the <inline-formula><mml:math id="math-126" display="inline"><mml:mo>.</mml:mo><mml:mn>632</mml:mn><mml:mo>+</mml:mo></mml:math></inline-formula> bootstrap estimate (<inline-formula><mml:math id="math-127" display="inline"><mml:msup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>𝜃</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mo>.</mml:mo><mml:mn>632</mml:mn><mml:mo>+</mml:mo></mml:mrow></mml:msup></mml:math></inline-formula>) of the respective performance measure together with a 95% confidence interval.</p>
<p id="S2.SS2.SSS1.Px1.p2">The .632+ bootstrap estimate is computed as a weighted average of the apparent performance <inline-formula><mml:math id="math-128" display="inline"><mml:msup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>𝜃</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> (training and test data given by the original dataset) and the average “out-of-bag” (OOB) performance <inline-formula><mml:math id="math-129" display="inline"><mml:msup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>𝜃</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>b</mml:mi><mml:mi>o</mml:mi><mml:mi>o</mml:mi><mml:mi>t</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>p</mml:mi><mml:mo>,</mml:mo><mml:mi>O</mml:mi><mml:mi>O</mml:mi><mml:mi>B</mml:mi></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>B</mml:mi></mml:mrow></mml:mfrac><mml:munderover><mml:mrow><mml:mo>∑</mml:mo></mml:mrow><mml:mrow><mml:mi>b</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>B</mml:mi></mml:mrow></mml:munderover><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>𝜃</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mrow><mml:mi>b</mml:mi><mml:mi>o</mml:mi><mml:mi>o</mml:mi><mml:mi>t</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>p</mml:mi><mml:mo>,</mml:mo><mml:mi>O</mml:mi><mml:mi>O</mml:mi><mml:mi>B</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> computed from <italic>B</italic> bootstrap datasets (training data given by the bootstrap dataset, and test data given by the samples not present in the bootstrap dataset). The formula is:
<disp-formula id="e_3">
<mml:math id="x2" display="block"><mml:mtable columnalign="left"><mml:mtr><mml:mtd columnalign="right"><mml:msup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>𝜃</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mo>.</mml:mo><mml:mn>632</mml:mn><mml:mo>+</mml:mo></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:mi>w</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo>⋅</mml:mo><mml:msup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>𝜃</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi></mml:mrow></mml:msup><mml:mo>+</mml:mo><mml:mi>w</mml:mi><mml:mo>⋅</mml:mo><mml:msup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>𝜃</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>b</mml:mi><mml:mi>o</mml:mi><mml:mi>o</mml:mi><mml:mi>t</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>p</mml:mi><mml:mo>,</mml:mo><mml:mi>O</mml:mi><mml:mi>O</mml:mi><mml:mi>B</mml:mi></mml:mrow></mml:msup><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
</disp-formula>
</p>
<p id="S2.SS2.SSS1.Px1.p3">where <inline-formula><mml:math id="math-130" display="inline"><mml:mi>w</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>0.632</mml:mn></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:mn>0.368</mml:mn><mml:mo>⋅</mml:mo><mml:mtext>r</mml:mtext></mml:mrow></mml:mfrac></mml:math></inline-formula> and <inline-formula><mml:math id="math-131" display="inline"><mml:mi>r</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>𝜃</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>b</mml:mi><mml:mi>o</mml:mi><mml:mi>o</mml:mi><mml:mi>t</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>p</mml:mi><mml:mo>,</mml:mo><mml:mi>O</mml:mi><mml:mi>O</mml:mi><mml:mi>B</mml:mi></mml:mrow></mml:msup><mml:mo>−</mml:mo><mml:msup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>𝜃</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:msup><mml:mrow><mml:mi>𝜃</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mi>o</mml:mi><mml:mi>i</mml:mi><mml:mi>n</mml:mi><mml:mi>f</mml:mi><mml:mi>o</mml:mi></mml:mrow></mml:msup><mml:mo>−</mml:mo><mml:msup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>𝜃</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:mfrac></mml:math></inline-formula>. 
The value of <inline-formula><mml:math id="math-132" display="inline"><mml:msup><mml:mrow><mml:mi>𝜃</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mi>o</mml:mi><mml:mi>i</mml:mi><mml:mi>n</mml:mi><mml:mi>f</mml:mi><mml:mi>o</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> is 0.5 for predictive accuracy, sensitivity, and specificity. For the Youden index, this value is 0.</p>
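Given the apparent performance, the average OOB performance, and the no-information value, the point estimate follows directly. A minimal sketch (the function name is ours; clipping r to [0, 1], which is common practice, keeps the weight w within [0.632, 1]):

```python
def bootstrap_632_plus(theta_orig, theta_oob, theta_noinfo=0.5):
    """Weighted average of apparent and out-of-bag performance
    following the .632+ rule of Efron & Tibshirani (1997)."""
    # relative overfitting rate; clipped to [0, 1] so that the
    # weight w stays between 0.632 and 1
    r = (theta_oob - theta_orig) / (theta_noinfo - theta_orig)
    r = min(max(r, 0.0), 1.0)
    w = 0.632 / (1.0 - 0.368 * r)
    return (1.0 - w) * theta_orig + w * theta_oob

# apparent accuracy 0.90, mean OOB accuracy 0.80, no-info value 0.5
est = bootstrap_632_plus(0.90, 0.80)
```

The stronger the apparent overfitting (larger r), the more weight is shifted from the apparent performance toward the OOB performance.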
<p id="S2.SS2.SSS1.Px1.p4">Then each bootstrap dataset is assigned a weight <inline-formula><mml:math id="math-133" display="inline"><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>b</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>𝜃</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mrow><mml:mi>b</mml:mi><mml:mi>o</mml:mi><mml:mi>o</mml:mi><mml:mi>t</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>p</mml:mi><mml:mo>,</mml:mo><mml:mi>b</mml:mi><mml:mi>o</mml:mi><mml:mi>o</mml:mi><mml:mi>t</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>p</mml:mi></mml:mrow></mml:msubsup><mml:mo>−</mml:mo><mml:msup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>𝜃</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>, where <inline-formula><mml:math id="math-134" display="inline"><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>𝜃</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mrow><mml:mi>b</mml:mi><mml:mi>o</mml:mi><mml:mi>o</mml:mi><mml:mi>t</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>p</mml:mi><mml:mo>,</mml:mo><mml:mi>b</mml:mi><mml:mi>o</mml:mi><mml:mi>o</mml:mi><mml:mi>t</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>p</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> is the value of the performance measure, when the bootstrap dataset <inline-formula><mml:math id="math-135" display="inline"><mml:mi>b</mml:mi><mml:mo>∈</mml:mo><mml:mo 
stretchy="false">{</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>⋯</mml:mo><mml:mspace width="0.3em"/><mml:mo>,</mml:mo><mml:mi>B</mml:mi><mml:mo stretchy="false">}</mml:mo></mml:math></inline-formula> is used as training as well as test dataset. The <inline-formula><mml:math id="math-136" display="inline"><mml:mfrac><mml:mrow><mml:msup><mml:mrow><mml:mi>α</mml:mi></mml:mrow><mml:mrow><mml:mo>∗</mml:mo></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:mfrac></mml:math></inline-formula> and <inline-formula><mml:math id="math-137" display="inline"><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:mfrac><mml:mrow><mml:msup><mml:mrow><mml:mi>α</mml:mi></mml:mrow><mml:mrow><mml:mo>∗</mml:mo></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:mfrac></mml:math></inline-formula> percentiles of the empirical distribution of these weights, <inline-formula><mml:math id="math-138" display="inline"><mml:msub><mml:mrow><mml:mi>ξ</mml:mi></mml:mrow><mml:mrow><mml:mfrac><mml:mrow><mml:msup><mml:mrow><mml:mi>α</mml:mi></mml:mrow><mml:mrow><mml:mo>∗</mml:mo></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:mfrac></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="math-139" display="inline"><mml:msub><mml:mrow><mml:mi>ξ</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:mfrac><mml:mrow><mml:msup><mml:mrow><mml:mi>α</mml:mi></mml:mrow><mml:mrow><mml:mo>∗</mml:mo></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:mfrac></mml:mrow></mml:msub></mml:math></inline-formula>, give the confidence interval of <inline-formula><mml:math id="math-140" display="inline"><mml:msup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>𝜃</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mo>.</mml:mo><mml:mn>632</mml:mn><mml:mo>+</mml:mo></mml:mrow></mml:msup></mml:math></inline-formula>:
<disp-formula id="e_4">
<mml:math id="x3" display="block"><mml:mtable columnalign="left"><mml:mtr><mml:mtd columnalign="right"><mml:mo stretchy="false">[</mml:mo><mml:msup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>𝜃</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mo>.</mml:mo><mml:mn>632</mml:mn><mml:mo>+</mml:mo></mml:mrow></mml:msup><mml:mo>−</mml:mo><mml:msub><mml:mrow><mml:mi>ξ</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:mfrac><mml:mrow><mml:msup><mml:mrow><mml:mi>α</mml:mi></mml:mrow><mml:mrow><mml:mo>∗</mml:mo></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:mfrac></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>𝜃</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mo>.</mml:mo><mml:mn>632</mml:mn><mml:mo>+</mml:mo></mml:mrow></mml:msup><mml:mo>+</mml:mo><mml:msub><mml:mrow><mml:mi>ξ</mml:mi></mml:mrow><mml:mrow><mml:mfrac><mml:mrow><mml:msup><mml:mrow><mml:mi>α</mml:mi></mml:mrow><mml:mrow><mml:mo>∗</mml:mo></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:mfrac></mml:mrow></mml:msub><mml:mo stretchy="false">]</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
</disp-formula>
</p></sec>
<sec id="s3_3"><title>Performance Measures</title>
<p id="S3.SS3.Px1.p1">In order to compare class prediction of the classification algorithms in the independent test data, we used predictive accuracy, the Youden index, sensitivity, and specificity as measures of discrimination. Predictive accuracy is the number of correctly classified samples divided by the total number of samples. Sensitivity, or true positive rate, is the proportion of individuals truly belonging to class 1 that are correctly predicted to belong to class 1. Specificity, or true negative rate, is the proportion of individuals truly belonging to class 0 that are correctly predicted to belong to class 0. The Youden index (<xref ref-type="bibr" rid="bib84">Youden, 1950</xref>) combines sensitivity and specificity of the classification model into a single measure (Youden index = |Sensitivity + Specificity - 1|).</p>
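These four measures can be computed from true and predicted binary labels as follows (a plain illustration; the function and variable names are ours):

```python
def discrimination_measures(y_true, y_pred):
    """Accuracy, sensitivity (TPR), specificity (TNR), and Youden
    index for binary class labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    n1 = sum(1 for t in y_true if t == 1)   # truly class 1
    n0 = len(y_true) - n1                   # truly class 0
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / n1
    specificity = tn / n0
    youden = abs(sensitivity + specificity - 1)
    return accuracy, sensitivity, specificity, youden

y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]   # e.g., output of a classifier
acc, sens, spec, youden = discrimination_measures(y_true, y_pred)
```

For a degenerate classifier that always predicts the larger class, sensitivity is 0 and specificity is 1 (or vice versa), so the Youden index takes its minimum of zero even though accuracy may be high.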
<p id="S3.SS3.Px1.p2">Conclusions based on these measures can differ considerably. The predictive accuracy of an algorithm may be high in data with highly unbalanced classes if the label of the larger class is predicted for all samples; in this case, the Youden index takes its minimum value of zero. It is therefore reasonable to consider both measures, predictive accuracy and the Youden index.</p></sec></sec>
<sec id="s3_4"><title>Simulation Study Approach and Software</title>
	<p id="S3.SS4.Px1.p1">Our simulation study aims to mimic reference datasets from psychological applications. See the <xref ref-type="sec" rid="s2_2">Reference Datasets</xref> sub-section for a detailed description of these datasets. A brief overview of the steps in the simulation study is given in <xref ref-type="fig" rid="fig-3">Figure 3</xref>. For each scenario, 2000 datasets are simulated. Sample sizes for the training data are chosen identical to the sample sizes of the reference datasets. Sample sizes for the test data are, for each group, 10 times the respective original group sample size in order to maintain the group size ratio. A larger test sample size can be chosen in simulations because the test data do not rely on actual observations; this decreases the variance of the performance estimates. Data are simulated from the multivariate normal distribution (as a reference), from the multivariate truncated normal distribution, which only takes on values within specified boundaries, similar to the sum or mean scores in the reference data, and from the multivariate lognormal distribution in order to include an extremely skewed distribution (overview in <xref ref-type="table" rid="tab_3">Table 3</xref>). 
Parameters needed for data simulations are estimated from the reference datasets (i.e., the pooled covariance matrix <inline-formula><mml:math id="math-141" display="inline"><mml:mi mathvariant="bold">Σ</mml:mi></mml:math></inline-formula>, or the group covariance matrices <inline-formula><mml:math id="math-142" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="math-143" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula>, respectively, group means <inline-formula><mml:math id="math-144" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula>, and <inline-formula><mml:math id="math-145" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula>, and the lower and upper boundaries, <bold>a</bold> and <bold>b</bold>, of the sum or mean score, respectively). Training data are either not trimmed or trimmed using the MCD and the MVE algorithm, respectively, keeping 90% of the samples, before applying the classification algorithms. In contrast to <xref ref-type="bibr" rid="bib11">Brobbey (2021)</xref> and <xref ref-type="bibr" rid="bib12">Brobbey et al. (2022)</xref>, we did not use the restrictive assumption of a Kronecker product covariance structure for simulating the data. In contrast to <xref ref-type="bibr" rid="bib17">Chen and Bowman (2011)</xref>, the datasets to which we applied the method are not balanced in sample size. We would like to examine the methods’ performance in more general simulation settings.</p>
<p id="S3.SS4.Px1.p2">Since the SVM algorithm relies on the Euclidean distance to determine the optimal decision boundary, standardization is required as a data-preprocessing step. We standardized the data variable-wise (across time points) before applying the method. Centering and scaling are done using the <monospace>preProcess</monospace> function in the <monospace>R</monospace> package <monospace>caret</monospace> (<xref ref-type="bibr" rid="bib44">Kuhn et al., 2024</xref>). More specifically, each training dataset is centered and scaled to unit variance, and the same parameters are then used to standardize the corresponding test dataset in the same way (<xref ref-type="bibr" rid="bib35">Hsu et al., 2003</xref>). Machine-learning algorithms generally require the optimization of hyperparameters. Applying the linear SVM algorithm requires finding the optimal value of the hyperparameter <italic>C</italic>, which determines the maximum amount of overlap allowed between samples of both classes. We applied the simple SVM path (SSVMP) algorithm by <xref ref-type="bibr" rid="bib66">Sentelle et al. (2016)</xref>, as suggested by <xref ref-type="bibr" rid="bib17">Chen and Bowman (2011)</xref>, to determine the optimal regularization parameter <italic>C</italic>. It is available as <monospace>MATLAB</monospace> code (<xref ref-type="bibr" rid="bib65">Sentelle, 2015</xref>), which we rewrote in <monospace>R</monospace>. The longitudinal SVM results, including the optimal <italic>C</italic>, could only be computed for the two smaller datasets (Dataset 1 and Dataset 3) due to the method’s computational complexity.</p>
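<p>The train-then-apply standardization step can be illustrated in a few lines. This is a minimal sketch of the centering and scaling logic (the behavior of <monospace>caret</monospace>’s <monospace>preProcess</monospace> followed by <monospace>predict</monospace>), not the actual implementation used; the data are toy placeholders.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=2.0, size=(100, 8))  # toy training data
X_test = rng.normal(loc=5.0, scale=2.0, size=(50, 8))    # toy test data

# Center and scale each column using *training* statistics only,
# then reuse those same parameters for the test data.
mu = X_train.mean(axis=0)
sd = X_train.std(axis=0, ddof=1)
X_train_std = (X_train - mu) / sd
X_test_std = (X_test - mu) / sd
```

Reusing the training means and standard deviations for the test set avoids information leaking from the test data into the preprocessing step.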
	<p id="S3.SS4.Px1.p3">The flip-flop algorithm (<xref ref-type="bibr" rid="bib49">Lu &amp; Zimmerman, 2005</xref>) used by <xref ref-type="bibr" rid="bib11">Brobbey (2021)</xref> for estimating the Kronecker product structure of the covariance matrix from the training data (for the LDA(<inline-formula><mml:math id="math-146" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>KP</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) algorithm) was iterated until the Frobenius norm of the difference between two consecutive Kronecker product covariance matrices became less than or equal to <inline-formula><mml:math id="math-147" display="inline"><mml:mn>1</mml:mn><mml:mi>E</mml:mi><mml:mo>−</mml:mo><mml:mn>04</mml:mn></mml:math></inline-formula>, a stopping criterion proposed by <xref ref-type="bibr" rid="bib16">Castañeda Garcia and Nossek (2014)</xref>.</p>
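<p>A compact sketch of the flip-flop iteration with this stopping criterion might look as follows. The function below is an illustrative reimplementation assuming centered <italic>t</italic>-by-<italic>d</italic> data matrices, not the exact code used in the study; note that the Kronecker factors are only identified up to a scale factor.</p>

```python
import numpy as np

def flip_flop(samples, tol=1e-4, max_iter=100):
    """Estimate the factors of a Kronecker product covariance from a list of
    centered t-by-d data matrices by alternating between the two factors
    (a sketch of the algorithm described by Lu & Zimmerman, 2005)."""
    n = len(samples)
    t, d = samples[0].shape
    Sigma_t, Sigma_d = np.eye(t), np.eye(d)
    prev = np.kron(Sigma_t, Sigma_d)
    for _ in range(max_iter):
        # Update each factor given the current estimate of the other.
        inv_t = np.linalg.inv(Sigma_t)
        Sigma_d = sum(X.T @ inv_t @ X for X in samples) / (n * t)
        inv_d = np.linalg.inv(Sigma_d)
        Sigma_t = sum(X @ inv_d @ X.T for X in samples) / (n * d)
        cur = np.kron(Sigma_t, Sigma_d)
        # Stop once consecutive Kronecker products differ by at most tol
        # in Frobenius norm.
        if np.linalg.norm(cur - prev, "fro") <= tol:
            break
        prev = cur
    return Sigma_t, Sigma_d
```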
	<p id="S3.SS4.Px1.p4">We used the following software for the data simulations. We implemented the longitudinal SVM in <monospace>R</monospace> using the <monospace>R</monospace> package <monospace>Rcplex</monospace> (<xref ref-type="bibr" rid="bib10">Bravo et al., 2021</xref>). We used the implementations of the MVE and MCD algorithms from the <monospace>R</monospace> package <monospace>MASS</monospace> (<xref ref-type="bibr" rid="bib59">Ripley et al., 2022</xref>), the joint GEE model as implemented in the <monospace>R</monospace> package <monospace>JGEE</monospace> (<xref ref-type="bibr" rid="bib38">Inan, 2015</xref>), and implemented in <monospace>R</monospace> the version of the flip-flop algorithm described in <xref ref-type="bibr" rid="bib49">Lu and Zimmerman (2005)</xref>. For simulating multivariate normally, lognormally, and truncated normally distributed data, we used the respective functions from the <monospace>R</monospace> packages <monospace>MASS</monospace> (<xref ref-type="bibr" rid="bib59">Ripley et al., 2022</xref>), <monospace>compositions</monospace> (<xref ref-type="bibr" rid="bib76">van den Boogaart et al., 2022</xref>), and <monospace>tmvtnorm</monospace> (<xref ref-type="bibr" rid="bib82">Wilhelm &amp; Manjunath, 2022</xref>). For the truncated normal distribution, the rejection method (the default) was used.</p>
<table-wrap id="tab_3" position="anchor" orientation="portrait">
<label>Table 3</label><caption><title>Parameterizations of the Multivariate Distributions for Group <inline-formula><mml:math id="math-148" display="inline"><mml:mi>i</mml:mi><mml:mo>∈</mml:mo><mml:mo stretchy="false">{</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">}</mml:mo></mml:math></inline-formula></title></caption>
	<table frame="hsides" rules="groups" style="striped-#f3f3f3"><colgroup span="1">
<col width="" align="left"/>
<col width="" align="left"/></colgroup>
<thead>
<tr>
<th>Distribution</th>
<th>Parameterization</th>
</tr>
</thead>
<tbody>
<tr>
<td>Multivariate normal</td>
<td><inline-formula><mml:math id="math-153" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="script">N</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="bold">Σ</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula></td>
</tr>
<tr>
<td>Multivariate lognormal</td>
<td><inline-formula><mml:math id="math-154" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="script">L</mml:mi><mml:mi mathvariant="script">N</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="bold">Σ</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula></td>
</tr>
<tr>
<td>Multivariate truncated normal</td>
<td><inline-formula><mml:math id="math-155" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="script">T</mml:mi><mml:mi mathvariant="script">N</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="bold">Σ</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold">a</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold">b</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula></td>
</tr>
</tbody>
</table>
	<table-wrap-foot>
		<p><italic>Note.</italic> 	The multivariate truncated normal distribution is defined by lower and upper boundaries, <inline-formula><mml:math id="math-149" display="inline"><mml:mi mathvariant="bold">a</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> and <inline-formula><mml:math id="math-150" display="inline"><mml:mi mathvariant="bold">b</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>, respectively, in addition to the mean (<inline-formula><mml:math id="math-151" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">μ</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>) and covariance (<inline-formula><mml:math id="math-152" display="inline"><mml:mi mathvariant="bold">Σ</mml:mi></mml:math></inline-formula>) parameters.</p>
	</table-wrap-foot>	
</table-wrap><fig id="fig-3" position="float" orientation="portrait"><label>Figure 3</label><caption><title>Overview of the Steps in the Simulation Study for a Particular Reference Dataset</title><p><italic>Note.</italic> <inline-formula><mml:math id="math-156" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="script">N</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> — multivariate normal distribution <inline-formula><mml:math id="math-157" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="script">L</mml:mi><mml:mi mathvariant="script">N</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> — multivariate lognormal distribution, <inline-formula><mml:math id="math-158" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="script">T</mml:mi><mml:mi mathvariant="script">N</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> — multivariate truncated normal distribution, <inline-formula><mml:math id="math-159" display="inline"><mml:mi>d</mml:mi></mml:math></inline-formula> — # variables, <inline-formula><mml:math id="math-160" display="inline"><mml:mi>t</mml:mi></mml:math></inline-formula> — # time points, LDA(<inline-formula><mml:math id="math-161" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) — Linear Discriminant Analysis (pooled covariance matrix), LDA(<inline-formula><mml:math id="math-162" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>KP</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) — Linear Discriminant Analysis (Kronecker product covariance matrix), LDA(GEE) — Linear Discriminant Analysis (covariance matrix based on generalized estimating 
equations estimates), SVM — longitudinal Support Vector Machine, MVE — Minimum Volume Ellipsoid algorithm, MCD — Minimum Covariance Determinant algorithm.</p></caption><graphic mimetype="image" mime-subtype="png" xlink:href="qcmb.14891-f3.png" position="float" orientation="portrait"/></fig></sec></sec>
<sec id="s4" sec-type="Results|discussion"><title>Results and Discussion</title>
<sec id="s4_1"><title>Performance in the Reference Data</title>
	<p id="S4.SS1.Px1.p1">For computing point estimates of the performance measures, including confidence intervals, in the reference data, we used the bootstrap approach described in the <xref ref-type="sec" rid="s3_2_1">Nonparametric Bootstrap Approach</xref> sub-section. Estimates of predictive performance and the Youden index are shown in <xref ref-type="fig" rid="fig-4">Figure 4</xref>; those of sensitivity and specificity can be found in Figure S2 (see <xref ref-type="bibr" rid="bib31.5">Graf et al., 2025</xref>). The bootstrap estimates and their respective confidence intervals are also shown in Table S3 (see <xref ref-type="bibr" rid="bib31.5">Graf et al., 2025</xref>).</p>
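<p>For readers unfamiliar with the .632+ estimator, the combination rule underlying such bootstrap estimates can be sketched as follows. This is the standard Efron–Tibshirani form stated in terms of error rates, given as an illustrative sketch under those conventions; it is not the implementation by Wahl et al. (2016).</p>

```python
def err632plus(err_app, err_oob, err_noinfo):
    """Combine the apparent (resubstitution) error and the out-of-bag
    bootstrap error into the .632+ estimate (Efron-Tibshirani form).
    err_noinfo is the no-information error rate."""
    # Relative overfitting rate R, clipped to [0, 1] by construction.
    R = 0.0
    if err_oob > err_app and err_noinfo > err_app:
        R = (err_oob - err_app) / (err_noinfo - err_app)
    # Weight w grows from 0.632 (no overfitting) toward 1 (full overfitting).
    w = 0.632 / (1.0 - 0.368 * R)
    return (1.0 - w) * err_app + w * min(err_oob, err_noinfo)
```

With no overfitting (`err_oob == err_app`) the rule reduces to the plain .632 estimator; with maximal overfitting it returns the out-of-bag error alone.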
	<p id="S4.SS1.Px1.p2"><xref ref-type="fig" rid="fig-4">Figure 4</xref> shows that the two methods LDA(<inline-formula><mml:math id="math-163" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) and LDA(<inline-formula><mml:math id="math-164" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>KP</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) perform very similarly in all scenarios and generally perform best. As Figure S2 in <xref ref-type="bibr" rid="bib31.5">Graf et al. (2025)</xref> also shows, these two methods tend to retain more moderate values of sensitivity and specificity even for highly imbalanced datasets (Datasets 1, 2, and 3). However, similar to LDA(GEE), they are (almost) incapable of accurately predicting the correct class for individuals from the minority class when group means are identical and only group covariance matrices differ (Dataset 4). All three methods, LDA(<inline-formula><mml:math id="math-165" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>), LDA(<inline-formula><mml:math id="math-166" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>KP</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>), and LDA(GEE), predominantly assign individuals to the majority class in this scenario, probably because the majority class’s covariance matrix receives greater weight when the pooled covariance matrix for the classification rule is computed and inverted (<xref ref-type="sec" rid="s3_1">Multivariate Repeated Measures LDA</xref> sub-section). 
In comparison, LDA(GEE) and SVM perform worse for unequal class sizes, with SVM performing worse than LDA(GEE), particularly because its specificity (prediction of the minority class) is very low. Comparing the performance for Dataset 1 (same temporal trends of group means) and Dataset 3 (opposite temporal trends of group means), all performance measures considerably improve for LDA(<inline-formula><mml:math id="math-167" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) and LDA(<inline-formula><mml:math id="math-168" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>KP</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>). For LDA(GEE) there is almost no change (a very slight improvement), and overall no difference for the SVM. Results for Dataset 5 show that using balanced instead of imbalanced data (Dataset 2) increases the specificity of all LDA methods, but particularly of LDA(GEE), resulting in a higher Youden index. Trimming the training data improves the performance in the test data only in some cases. 
A simultaneous slight improvement of predictive performance and Youden index can only be observed in some cases: for LDA(<inline-formula><mml:math id="math-169" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) when applied to Dataset 1 (after MVE trimming), Dataset 3 (after MVE and MCD trimming), and Dataset 5 (after MCD trimming); for LDA(<inline-formula><mml:math id="math-170" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>KP</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) when applied to Dataset 1 (MVE trimming) and Dataset 3 (MVE and MCD trimming); and for LDA(GEE) when applied to Dataset 1 (MCD trimming).</p><fig id="fig-4" position="anchor" orientation="portrait"><label>Figure 4</label><caption><title>Performance in the Reference Data</title>
		<p><italic>Note.</italic> The bootstrap approach by <xref ref-type="bibr" rid="bib79">Wahl et al. (2016)</xref> with 2000 bootstrap datasets was used to compute <inline-formula><mml:math id="math-171" display="inline"><mml:msup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>𝜃</mml:mi></mml:mrow><mml:mo>̂</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mo>.</mml:mo><mml:mn>632</mml:mn><mml:mo>+</mml:mo></mml:mrow></mml:msup></mml:math></inline-formula> estimates and respective 95% confidence intervals for the performance measures predictive accuracy and Youden index.</p><p>(a) Dataset 1: CORE-OM dataset, group variable <italic>hospitalisation</italic> (<inline-formula><mml:math id="math-172" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>42</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>142</mml:mn></mml:math></inline-formula>).</p>
		<p>(b) Dataset 2: CASP-19 dataset, group variable <italic>loneliness</italic> (<inline-formula><mml:math id="math-173" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>948</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1682</mml:mn></mml:math></inline-formula>).</p>
		<p>(c) Dataset 3: modified Dataset 1, such that group means collapsed over time points are equal, and group means have opposite temporal trends.</p>
		<p>(d) Dataset 4: modified Dataset 2 (Time Points 1 &amp; 2), such that group means are equal, and group covariance matrices differ.</p>
		<p>(e) Dataset 5: modified Dataset 2 (Time Points 1 &amp; 2), balanced class sizes by random undersampling of Group 1.</p></caption><graphic mimetype="image" mime-subtype="png" xlink:href="qcmb.14891-f4.png" position="anchor" orientation="portrait"/></fig></sec>
<sec id="s4_2"><title>Performance in the Simulated Data</title>
	<p id="S4.SS2.Px1.p1">For the data simulations we assumed homogeneity of covariance matrices (an LDA assumption) for data generation based on Datasets 1, 2, 3, and 5, despite heterogeneity in the reference datasets. Figure S1 in <xref ref-type="bibr" rid="bib31.5">Graf et al. (2025)</xref> shows plots comparing the components of Box’s M test, which is known to be very sensitive to violations of the normality assumption, so its results may not be reliable. The log determinants and log eigenvalues of the covariance matrices differ from each other, suggesting heterogeneity of covariances in the reference data. Only for Dataset 4 did we assume heterogeneous covariance matrices for data generation, in order to compare the methods’ performance when this assumption is violated while group means are identical at the same time.</p>
	<p id="S4.SS2.Px1.p2">The second LDA assumption is multivariate normality of the data. Table S4 in <xref ref-type="bibr" rid="bib31.5">Graf et al. (2025)</xref> shows that the lognormally distributed multivariate data differ most from multivariate normality according to the Mardia measure of multivariate skewness (highest number of significant test results). Truncated normally distributed data deviate more from multivariate normality for larger sample sizes and/or a higher number of measurement occasions. Especially for Datasets 2 and 4, trimming the data with the MCD algorithm notably decreases the deviation from multivariate normality in the truncated normally distributed data; the same holds for Datasets 1 and 3 when the MCD algorithm is applied to the lognormally distributed data. This effect is weaker for the MVE algorithm. This suggests at least that the MCD algorithm, which has been found to be more suitable for outlier detection than the MVE algorithm (<xref ref-type="bibr" rid="bib62">Rousseeuw &amp; Driessen, 1999</xref>), may be useful when outliers or non-normality are assumed to bias parameter estimates. On the other hand, the trimming proportion has to be chosen so that valuable observations are not removed from the data; there currently are no general guidelines.</p>
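<p>The Mardia measure of multivariate skewness referenced here can be computed directly from the sample. The following is a minimal illustrative sketch, not the implementation used for Table S4; under multivariate normality the test statistic is asymptotically chi-squared distributed.</p>

```python
import numpy as np

def mardia_skewness(X):
    """Mardia's multivariate skewness b_{1,d} for an n-by-d sample,
    plus the test statistic n * b_{1,d} / 6, which under normality is
    asymptotically chi^2 with d(d+1)(d+2)/6 degrees of freedom."""
    n, d = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n                  # ML estimate of the covariance matrix
    G = Xc @ np.linalg.inv(S) @ Xc.T   # matrix of scaled inner products g_ij
    b1 = (G ** 3).sum() / n ** 2       # b_{1,d} = (1/n^2) sum_ij g_ij^3
    stat = n * b1 / 6.0
    return b1, stat
```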
	<p id="S4.SS2.Px1.p3"><xref ref-type="table" rid="tab_4">Table 4</xref> shows the computational times per algorithm, averaged over scenarios with different data distributions and trimming approaches. The method LDA(<inline-formula><mml:math id="math-174" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) has the advantage of low computational times. For LDA(<inline-formula><mml:math id="math-175" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>KP</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) especially, computational time increases sharply with larger sample size and/or a higher number of measurement occasions. In comparison, the computational time of LDA(GEE) seems to be less affected by larger sample sizes than by higher dimensionality (number of time points and variables). Computation of the SVM results is the most time-consuming, and the algorithm does not always converge within 100 iterations (Table S6 in <xref ref-type="bibr" rid="bib31.5">Graf et al., 2025</xref>).</p>
<table-wrap id="tab_4" position="anchor" orientation="portrait">
<label>Table 4</label><caption><title>Computational Times (Hours) per Algorithm Averaged Over the Simulated Datasets per Reference Dataset</title>
</caption>
<table frame="hsides" rules="groups"><colgroup span="1">
<col width="" align="left"/>
<col width=""/>
<col width=""/>
<col width=""/>
<col width=""/></colgroup>
<thead>
<tr>
<th>Dataset</th>
<th>LDA(<inline-formula><mml:math id="math-180" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>)</th>
<th>LDA(<inline-formula><mml:math id="math-181" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>KP</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>)</th>
<th>LDA(GEE)</th>
<th>SVM</th>
</tr>
</thead>
<tbody>
<tr>
<td><inline-formula><mml:math id="math-182" display="inline"><mml:mtext mathvariant="italic">Dataset 1</mml:mtext></mml:math></inline-formula></td>
<td align="char" char=".">0.08</td>
<td align="char" char=".">1.05</td>
<td align="char" char=".">0.34</td>
<td align="char" char=".">64.29</td>
</tr>
<tr>
<td><inline-formula><mml:math id="math-183" display="inline"><mml:mtext mathvariant="italic">Dataset 2</mml:mtext></mml:math></inline-formula></td>
<td align="char" char=".">1.40</td>
<td align="char" char=".">29.62</td>
<td align="char" char=".">26.71</td>
	<td>—</td>
</tr>
<tr>
<td><inline-formula><mml:math id="math-184" display="inline"><mml:mtext mathvariant="italic">Dataset 3</mml:mtext></mml:math></inline-formula></td>
<td align="char" char=".">0.11</td>
<td align="char" char=".">1.29</td>
<td align="char" char=".">0.39</td>
<td align="char" char=".">61.63</td>
</tr>
<tr>
<td><inline-formula><mml:math id="math-185" display="inline"><mml:mtext mathvariant="italic">Dataset 4</mml:mtext></mml:math></inline-formula></td>
<td align="char" char=".">0.93</td>
<td align="char" char=".">16.99</td>
<td align="char" char=".">6.90</td>
	<td>—</td>
</tr>
<tr>
<td><inline-formula><mml:math id="math-186" display="inline"><mml:mtext mathvariant="italic">Dataset 5</mml:mtext></mml:math></inline-formula></td>
<td align="char" char=".">0.79</td>
<td align="char" char=".">10.57</td>
<td align="char" char=".">4.12</td>
	<td>—</td>
</tr>
</tbody>
</table>
	<table-wrap-foot><p><italic>Note.</italic> Irrespective of the data distribution and irrespective whether trimming has been done before application of the classification algorithm.</p><p>Dataset 1: CORE-OM dataset, group variable <italic>hospitalisation</italic> (<inline-formula><mml:math id="math-176" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>42</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>142</mml:mn></mml:math></inline-formula>)</p>
		<p>Dataset 2: CASP-19 dataset, group variable <italic>loneliness</italic> (<inline-formula><mml:math id="math-177" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>948</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1682</mml:mn></mml:math></inline-formula>).</p>
		<p>Dataset 3: modified Dataset 1, such that group means collapsed over time points are equal, and group means have opposite temporal trends.</p>
		<p>Dataset 4: modified Dataset 2 (Time Points 1 &amp; 2), such that group means are equal, and group covariance matrices differ.</p>
		<p>Dataset 5: modified Dataset 2 (Time Points 1 &amp; 2), balanced class sizes by random undersampling of Group 1.</p>
		<p>LDA(<inline-formula><mml:math id="math-178" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) — Linear discriminant analysis (pooled covariance matrix), LDA(<inline-formula><mml:math id="math-179" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>KP</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) — Linear discriminant analysis (Kronecker product covariance matrix), LDA(GEE) — Linear discriminant analysis (covariance matrix based on generalized estimating equations estimates), SVM — Support vector machine.</p></table-wrap-foot>
</table-wrap>
	<p id="S4.SS2.Px1.p4"><xref ref-type="fig" rid="fig-5">Figures 5</xref> and <xref ref-type="fig" rid="fig-6">6</xref> show the distribution of the estimates of predictive accuracy and the Youden index, respectively, in the simulated data. Plots for sensitivity and specificity are shown in Figures S3 and S4, respectively. Means (standard errors) of the performance measures are also shown in Tables S5a–S5e (see <xref ref-type="bibr" rid="bib31.5">Graf et al., 2025</xref>) for Datasets 1–5. A first finding from <xref ref-type="fig" rid="fig-5">Figures 5</xref> and <xref ref-type="fig" rid="fig-6">6</xref> is that deviation from normality (in the multivariate lognormally distributed data) in some cases increases (Dataset 1) or decreases (Datasets 2 and 5) the algorithms’ predictive performance and Youden index, and in some cases does not have a considerable effect (Datasets 3 and 4). For the scenarios with smaller sample sizes (<inline-formula><mml:math id="math-187" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>42</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>142</mml:mn></mml:math></inline-formula>), no negative effect could be determined, whereas for the scenarios with much larger sample sizes (<inline-formula><mml:math id="math-188" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>948</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1682</mml:mn></mml:math></inline-formula> and <inline-formula><mml:math id="math-189" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>948</mml:mn></mml:math></inline-formula>) there is a clear decrease in predictive accuracy and Youden index. The effect is approximately the same for all three repeated-measures LDA methods. A second finding is that predictive accuracy and Youden index for the SVM are visibly worse than for the LDA methods at these imbalanced sample sizes. The SVM has a sensitivity close to 1 but a specificity close to 0, and thus mostly predicts the majority class.</p>
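<p>The link between a majority-class predictor and a vanishing Youden index can be made concrete. The helper below is an illustrative sketch with toy labels, coding the majority class as 1 and the minority class as 0 as in the example above.</p>

```python
import numpy as np

def youden_index(y_true, y_pred):
    """Youden index J = sensitivity + specificity - 1 for binary labels,
    with class 1 treated as the positive class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    sens = np.mean(y_pred[y_true == 1] == 1)  # true positive rate
    spec = np.mean(y_pred[y_true == 0] == 0)  # true negative rate
    return sens + spec - 1.0
```

A classifier that always predicts the majority class attains a sensitivity of 1 but a specificity of 0, so its Youden index is exactly 0 regardless of how high its raw accuracy looks.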
<p id="S4.SS2.Px1.p5">With respect to predictive accuracy, LDA(<inline-formula><mml:math id="math-190" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) without prior trimming usually performs best. Only for Dataset 5 (balanced class sizes) does LDA(GEE) with prior trimming (MCD algorithm) have a marginally better predictive performance in the lognormally distributed data. LDA(GEE) matches the other two LDA methods on both measures, predictive performance and Youden index, only for Dataset 5 (equal sample sizes), Dataset 4 (where all methods perform poorly), and the lognormally distributed data simulated based on Dataset 2 (where all methods also perform poorly). For Dataset 1 (unbalanced classes, same temporal trends of group means), the Youden index of LDA(GEE) is higher than the values for LDA(<inline-formula><mml:math id="math-191" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) and LDA(<inline-formula><mml:math id="math-192" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>KP</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) for multivariate normally and truncated normally distributed data, especially when no trimming is applied to the training data; the boxes overlap only slightly or not at all. The reason is its higher specificity (prediction of the minority class), although its sensitivity is comparably lower. 
For the lognormally distributed data generated based on Dataset 1, the Youden index of LDA(<inline-formula><mml:math id="math-193" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>KP</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>), especially without prior trimming, is higher compared to the other methods, which is also due to higher specificity.</p>
<p id="S4.SS2.Px1.p6">It is not clear in which of the presented simulation scenarios trimming for outlier removal may help, not least because there is no scenario in which we explicitly simulated outliers. Both predictive accuracy and the Youden index increase somewhat (from a rather low performance level) for all three LDA methods in the lognormally distributed data for Dataset 5 when the training data are trimmed.</p><fig id="fig-5" position="anchor" orientation="portrait"><label>Figure 5</label><caption><title>Boxplots Showing the Distribution of Predictive Accuracy Estimated in the 2000 Simulated Datasets</title>
<p><italic>Note.</italic> Distribution for the multivariate normal (left), for the multivariate lognormal (center), and for the multivariate truncated normal distribution (right). Results with the highest median value are highlighted in darker colours.</p><p>(a) Dataset 1: CORE-OM dataset, group variable <italic>hospitalisation</italic> (<inline-formula><mml:math id="math-194" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>42</mml:mn></mml:math></inline-formula>, <inline-formula><mml:math id="math-195" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>142</mml:mn></mml:math></inline-formula>).</p>
<p>(b) Dataset 2: CASP-19 dataset, group variable <italic>loneliness</italic> (<inline-formula><mml:math id="math-196" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>948</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1682</mml:mn></mml:math></inline-formula>).</p>
<p>(c) Dataset 3: modified Dataset 1, such that group means collapsed over time points are equal, and group means have opposite temporal trends.</p>
<p>(d) Dataset 4: modified Dataset 2 (Time Points 1 &amp; 2), such that group means are equal, and group covariance matrices differ.</p>
<p>(e) Dataset 5: modified Dataset 2 (Time Points 1 &amp; 2), balanced class sizes by random undersampling of Group 1.</p>
	<p>Abbreviations: LDA(<inline-formula><mml:math id="math-197" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) — Linear Discriminant Analysis (pooled covariance matrix), LDA(<inline-formula><mml:math id="math-198" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>KP</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) — Linear Discriminant Analysis (Kronecker product covariance matrix), LDA(GEE) — Linear Discriminant Analysis (covariance matrix based on Generalized Estimating Equations estimates), SVM — Support vector machine, MVE — Minimum Volume Ellipsoid algorithm, MCD — Minimum Covariance Determinant algorithm.</p></caption><graphic mimetype="image" mime-subtype="png" xlink:href="qcmb.14891-f5.png" position="anchor" orientation="portrait"/></fig></sec>
<sec id="s4_3"><title>Recommendations</title>
<p id="S4.SS3.Px1.p1">Generally, in these simulations the traditional LDA(<inline-formula><mml:math id="math-199" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) performs best or reasonably well with respect to predictive performance and Youden index, irrespective of smaller or larger sample size, differing group size ratios, number of measurement occasions, similar or opposite temporal trends in group means. None of the LDA methods works well for identical group means but heterogeneous covariance matrices, where they predominantly assign new observations to the majority class, and the Youden index is close to zero. The same is the case for multivariate lognormally distributed data when sample sizes are large, i.e., for an extremely evident violation of multivariate normality corresponding to extremely high values of the Mardia measure of multivariate skewness test statistic (approximately above 100).</p>
<p id="S4.SS3.Px1.p2">We did not explicitly generate outliers from a different distribution than the actual data, but there may have been some random outliers. In this case, trimming for outlier removal had no effect except a minor effect on the Youden index for all LDA methods in the scenario with balanced group sizes and same temporal trends per group when data were generated from lognormally distributed data. In this case, the LDA methods still did not perform reasonably well. Multivariate trimming in the training data can be tried as a sensitivity analysis if the presence of outliers is suspected. Especially the MCD algorithm has already been recommended in the literature.</p>
<p id="S4.SS3.Px1.p3">In our simulations no Kronecker product covariance matrices and group means are assumed in the reference data. We used unstructured estimates of the pooled covariance matrix and group means. In our simulations, there is only an advantage of the alternative LDA(<inline-formula><mml:math id="math-200" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>KP</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) and LDA(GEE) with respect to the Youden index for data with imbalanced class sizes and comparably smaller (but not small) sample sizes. The advantage of these methods, even if no underlying Kronecker product structure of the parameters can be assumed, may become more evident for smaller sample sizes. They may provide more exact estimates due to their parsimonious number of values that have to be estimated.</p>
	<p id="S4.SS3.Px1.p4">Application of repeated-measures techniques should be preferred in order to incorporate the additional information about temporal trends and in order to obtain more reliable results by including data of multiple time points in the analysis provided that moderate correlations between data of different variables and times points exist. Multicollinearity among time points and/or variables would require removal of respective time points or variables, respectively. In case of independence between time points/variables, univariate techniques can be used. According to the psychometric literature, multivariate data are very common. An example are the widely applied questionnaires using Likert-type responses where multiple correlated aspects related to an overall topic are measured. In order to assess the usefulness of different sets of variables for distinguishing two classes of individuals, LDA can be applied for class prediction and its performance for different sets of variables can subsequently be compared to determine the most relevant variables. Usually, for LDA applied to cross-sectional data, Fisher discriminant function coeﬃcients (<xref ref-type="bibr" rid="bib25">Fisher, 1936</xref>) are computed in order to assess relative variable importance within a particular set. The method can in principle also be applied to repeated measures data. It does not assume multivariate normality although it requires homogeneity of covariance matrices.</p><fig id="fig-6" position="anchor" orientation="portrait"><label>Figure 6</label><caption><title>Boxplots Showing the Distribution of Youden Index Estimated in the 2000 Simulated Datasets</title><p><italic>Note.</italic> Distribution for the multivariate normal (left), for the multivariate lognormal (center), and for the multivariate truncated normal distribution (right). Results with the highest median value are highlighted in darker colours.</p>
<p>(a) Dataset 1: CORE-OM dataset, group variable <italic>hospitalisation</italic> (<inline-formula><mml:math id="math-201" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>42</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>142</mml:mn></mml:math></inline-formula>).</p>
<p>(b) Dataset 2: CASP-19 dataset, group variable <italic>loneliness</italic> (<inline-formula><mml:math id="math-202" display="inline"><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>948</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1682</mml:mn></mml:math></inline-formula>).</p>
<p>(c) Dataset 3: modified Dataset 1, such that group means collapsed over time points are equal, and group means have opposite temporal trends.</p>
<p>(d) Dataset 4: modified Dataset 2 (Time Points 1 &amp; 2), such that group means are equal, and group covariance matrices differ.</p>
<p>(e) Dataset 5: modified Dataset 2 (Time Points 1 &amp; 2), balanced class sizes by random undersampling of Group 1.</p>
		<p>Abbreviations: LDA(<inline-formula><mml:math id="math-203" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>pooled</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) — Linear Discriminant Analysis (pooled covariance matrix), LDA(<inline-formula><mml:math id="math-204" display="inline"><mml:msub><mml:mrow><mml:mi mathvariant="normal">Σ</mml:mi></mml:mrow><mml:mrow><mml:mtext>KP</mml:mtext></mml:mrow></mml:msub></mml:math></inline-formula>) — Linear Discriminant Analysis (Kronecker product covariance matrix), LDA(GEE) — Linear Discriminant Analysis (covariance matrix based on Generalized Estimating Equations estimates), SVM — Support Vector Machine, MVE — Minimum Volume Ellipsoid algorithm, MCD — Minimum Covariance Determinant algorithm.</p></caption><graphic mimetype="image" mime-subtype="png" xlink:href="qcmb.14891-f6.png" position="float" orientation="portrait"/></fig></sec></sec>
<sec id="s5" sec-type="Conclusion"><title>Conclusion</title>
<p id="S5.p1">Longitudinal studies are conducted in psychology and other disciplines. Data in psychology and the social sciences are often characterized by nonnormal distributions, especially skewness. LDA is widely applied as a standard technique in these fields, e.g., to questionnaire data where answers are measured on Likert scales that are summarized in subscales based on means or sums of multiple Likert items (i.e., single questions), either for classification tasks or for identifying variables most relevant to group separation. Repeated measures techniques are preferable for the analysis of data that are collected repeatedly over time compared to conducting several independent analyses for each time point in case temporal correlations exist.</p>
	<p id="S5.p2">We compared the performance of robust repeated measures DA techniques proposed by <xref ref-type="bibr" rid="bib11">Brobbey (2021)</xref> and <xref ref-type="bibr" rid="bib12">Brobbey et al. (2022)</xref> and the longitudinal SVM by <xref ref-type="bibr" rid="bib17">Chen and Bowman (2011)</xref> using multiple performance measures. We based these comparisons on real psychometric datasets which differ with respect to sample size, sample size ratio, class overlap, temporal variation, number of repeated measurement occasions, and properties of group means and covariance matrices. We thus considered additional scenarios to those in <xref ref-type="bibr" rid="bib11">Brobbey (2021)</xref> and <xref ref-type="bibr" rid="bib12">Brobbey et al. (2022)</xref>, where Kronecker product structures of means and covariances and thus constant correlations and means of the variables over time were assumed. We also compared several alternative methods among each other in contrast to comparing a particular alternative to the standard method at a time. We included the longitudinal SVM because it is similar to repeated measures LDA in that they are both linear classifiers for which variable weights can additionally be computed and temporal correlations are considered in the analysis. We did not consider extensions of other supervised machine learning algorithms for classification since they usually assume independence between time points (<xref ref-type="bibr" rid="bib57">Ribeiro &amp; Freitas, 2019</xref>) and do not have a comparably intuitive interpretation of variable weights as the linear SVM.</p>
<p id="S5.p3">We followed the guidelines for neutral comparison studies by <xref ref-type="bibr" rid="bib81">Weber et al. (2019)</xref> and the general design of simulation studies by <xref ref-type="bibr" rid="bib51">Morris et al. (2019)</xref>. We found that the alternative robust methods may not be required for suﬃciently large sample sizes and absence of outliers. Limitations of our simulation study are that only a limited number of scenarios and datasets are considered. Further examination in data with smaller sample sizes and in data containing outliers from a different distribution would be helpful. In this context, the influence of different choices for the trimming parameter when applying one of the trimming algorithms for outlier removal may also be examined. To date, no recommendations on the choice of the trimming parameter for multivariate data exist. Therefore, for an actual dataset, multiple values should be tried. Moreover, due to availability of suitable datasets in particular given data protection policies, and limited number of scenarios considered in every simulation study in general, further conclusions may be possible when applying the methods to other datasets. As with any simulation study, our results can therefore not be generalized beyond the considered scenarios. We found that none of the LDA methods did work well for extreme deviations from normality, and heterogeneity of covariance matrices when group means were identical, respectively. Conclusions based on the performance in the reference datasets and based on data simulations, respectively, are similar.</p></sec>
</body>
<back>
<ref-list><title>References</title>
	<ref id="bib1"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Aggarwala</surname>, <given-names>J.</given-names></string-name>, <string-name name-style="western"><surname>Garg</surname>, <given-names>R.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Chatterjee</surname>, <given-names>S.</given-names></string-name> (<year>2022</year>). <article-title>Linear discriminant analysis of various physiological and psychological parameters among Indian elite male athletes of different types of sports</article-title>. <source><italic>Sport Mont</italic></source>, <volume>20</volume>(<issue>3</issue>), <fpage>53</fpage>–<lpage>60</lpage>. <pub-id pub-id-type="doi">10.26773/smj.221009</pub-id></mixed-citation></ref>
	<ref id="bib3"><mixed-citation publication-type="data"><string-name name-style="western"><surname>Banks</surname>, <given-names>J.</given-names></string-name>, <string-name name-style="western"><surname>Batty</surname>, <given-names>G.</given-names></string-name>, <string-name name-style="western"><surname>Breedvelt</surname>, <given-names>J.</given-names></string-name>, <string-name name-style="western"><surname>Coughlin</surname>, <given-names>K.</given-names></string-name>, <string-name name-style="western"><surname>Crawford</surname>, <given-names> R.</given-names></string-name>, <string-name name-style="western"><surname>Marmot</surname>, <given-names>M.</given-names></string-name>, <string-name name-style="western"><surname>Nazroo</surname>, <given-names>J.</given-names></string-name>, <string-name name-style="western"><surname>Oldfield</surname>, <given-names>Z.</given-names></string-name>, <string-name name-style="western"><surname>Steel</surname>, <given-names>N.</given-names></string-name>, <string-name name-style="western"><surname>Steptoe</surname>, <given-names>A.</given-names></string-name>, <string-name name-style="western"><surname>Wood</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Zaninotto</surname>, <given-names>P.</given-names></string-name> (<year>2021</year>). <italic>English longitudinal study of ageing: Waves 0–9, 1998–2019</italic> [36<sup>th</sup> edition] <comment>SN: 5050</comment>. <publisher-name>UK Data Service</publisher-name>. <pub-id pub-id-type="doi">10.5255/ukda-sn-5050-24</pub-id></mixed-citation></ref>
	<ref id="bib4"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Barkham</surname>, <given-names>M.</given-names></string-name>, <string-name name-style="western"><surname>Evans</surname>, <given-names>C.</given-names></string-name>, <string-name name-style="western"><surname>Margison</surname>, <given-names>F.</given-names></string-name>, <string-name name-style="western"><surname>Mcgrath</surname>, <given-names>G.</given-names></string-name>, <string-name name-style="western"><surname>Mellor-Clark</surname>, <given-names>J.</given-names></string-name>, <string-name name-style="western"><surname>Milne</surname>, <given-names>D.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Connell</surname>, <given-names>J.</given-names></string-name> (<year>1998</year>). <article-title>The rationale for developing and implementing core batteries in service settings and psychotherapy outcome research</article-title>. <source><italic>Journal of Mental Health</italic></source>, <volume>7</volume>(<issue>1</issue>), <fpage>35</fpage>–<lpage>47</lpage>. <pub-id pub-id-type="doi">10.1080/09638239818328</pub-id></mixed-citation></ref>
	<ref id="bib5"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Baumeister</surname>, <given-names>R.</given-names></string-name>, <string-name name-style="western"><surname>Vohs</surname>, <given-names>K.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Funder</surname>, <given-names>D.</given-names></string-name> (<year>2007</year>). <article-title>Psychology as the science of self-reports and finger movements whatever happened to actual behavior?</article-title> <source><italic>Perspectives on Psychological Science</italic></source>, <volume>2</volume>(<issue>4</issue>), <fpage>396</fpage>–<lpage>403</lpage>. <pub-id pub-id-type="doi">10.1111/j.1745-6916.2007.0005</pub-id></mixed-citation></ref>
	<ref id="bib6"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Beaumont</surname>, <given-names>J.</given-names></string-name>, <string-name name-style="western"><surname>Lix</surname>, <given-names>L.</given-names></string-name>, <string-name name-style="western"><surname>Yost</surname>, <given-names>K.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Hahn</surname>, <given-names>E.</given-names></string-name> (<year>2006</year>). <article-title>Application of robust statistical methods for sensitivity analysis of health-related quality of life outcomes</article-title>. <source><italic>Quality of Life Research</italic></source>, <volume>15</volume>(<issue>3</issue>), <fpage>349</fpage>–<lpage>356</lpage>. <pub-id pub-id-type="doi">10.1007/s11136-005-2293-1</pub-id></mixed-citation></ref>
	<ref id="bib7"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Betz</surname>, <given-names>N. E.</given-names></string-name> (<year>1987</year>). <article-title>Use of discriminant analysis in counseling psychology research</article-title>. <source><italic>Journal of Counseling Psychology</italic></source>, <volume>34</volume>(<issue>4</issue>), <fpage>393</fpage>–<lpage>403</lpage>. <pub-id pub-id-type="doi">10.1037/0022-0167.34.4.393</pub-id></mixed-citation></ref>
	<ref id="bib8"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Boedeker</surname>, <given-names>P.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Kearns</surname>, <given-names>N.</given-names></string-name> (<year>2019</year>). <article-title>Linear discriminant analysis for prediction of group membership: A user-friendly primer</article-title>. <source><italic>Advances in Methods and Practices in Psychological Science</italic></source>, <volume>2</volume>(<issue>3</issue>), <fpage>250</fpage>–<lpage>263</lpage>. <pub-id pub-id-type="doi">10.1177/2515245919849378</pub-id></mixed-citation></ref>
	<ref id="bib9"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Box</surname>, <given-names>G. E. P.</given-names></string-name> (<year>1949</year>). <article-title>A general distribution theory for a class of likelihood criteria</article-title>. <source><italic>Biometrika</italic></source>, <volume>36</volume>(<issue>3–4</issue>), <fpage>317</fpage>–<lpage>346</lpage>. <pub-id pub-id-type="doi">10.1093/biomet/36.3-4.317</pub-id></mixed-citation></ref>
	<ref id="bib10"><mixed-citation publication-type="book"><string-name name-style="western"><surname>Bravo</surname>, <given-names>H. C.</given-names></string-name>, <string-name name-style="western"><surname>Hornik</surname>, <given-names>K.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Theussl</surname>, <given-names>S.</given-names></string-name> (<year>2021</year>). <source><italic>Rcplex: R interface to CPLEX</italic></source>. <publisher-name>Comprehensive R Archive Network</publisher-name>. <pub-id pub-id-type="doi">10.32614/CRAN.package.Rcplex</pub-id></mixed-citation></ref>
	<ref id="bib11"><mixed-citation publication-type="other"><string-name name-style="western"><surname>Brobbey</surname>, <given-names>A.</given-names></string-name> (<year>2021</year>). <source><italic>Classification models for multivariate non-normal repeated measures data</italic></source>. [Doctoral thesis, University of Calgary]. University of Calgary Repository. <ext-link ext-link-type="uri" xlink:href="http://hdl.handle.net/1880/112972">http://hdl.handle.net/1880/112972</ext-link></mixed-citation></ref>
	<ref id="bib12"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Brobbey</surname>, <given-names>A.</given-names></string-name>, <string-name name-style="western"><surname>Wiebe</surname>, <given-names>S.</given-names></string-name>, <string-name name-style="western"><surname>Nettel-Aguirre</surname>, <given-names>A.</given-names></string-name>, <string-name name-style="western"><surname>Josephson</surname>, <given-names>C.</given-names></string-name>, <string-name name-style="western"><surname>Williamson</surname>, <given-names>T.</given-names></string-name>, <string-name name-style="western"><surname>Lix</surname>, <given-names>L.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Sajobi</surname>, <given-names>T.</given-names></string-name> (<year>2022</year>). <article-title>Repeated measures discriminant analysis using multivariate generalized estimation equations</article-title>. <source><italic>Statistical Methods in Medical Research</italic></source>, <volume>31</volume>(<issue>4</issue>), <fpage>646</fpage>–<lpage>657</lpage>. <pub-id pub-id-type="doi">10.1177/09622802211032705</pub-id></mixed-citation></ref>
	<ref id="bib13"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Butler</surname>, <given-names>R. W.</given-names></string-name>, <string-name name-style="western"><surname>Davies</surname>, <given-names>P. L.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Jhun</surname>, <given-names>M.</given-names></string-name> (<year>1993</year>). <article-title>Asymptotics for the minimum covariance determinant estimator</article-title>. <source><italic>Annals of Statistics</italic></source>, <volume>21</volume>(<issue>3</issue>), <fpage>1385</fpage>–<lpage>1400</lpage>. <pub-id pub-id-type="doi">10.1214/aos/1176349264</pub-id></mixed-citation></ref>
	<ref id="bib14"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Carifio</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Perla</surname>, <given-names>R. J.</given-names></string-name> (<year>2007</year>). <article-title>Ten common misunderstandings, misconceptions, persistent myths and urban legends about Likert scales and Likert response formats and their antidotes</article-title>. <source><italic>Journal of Social Sciences</italic></source>, <volume>3</volume>(<issue>3</issue>), <fpage>106</fpage>–<lpage>116</lpage>. <pub-id pub-id-type="doi">10.3844/jssp.2007.106.116</pub-id></mixed-citation></ref>
	<ref id="bib15"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Carifio</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Perla</surname>, <given-names>R. J.</given-names></string-name> (<year>2008</year>). <article-title>Resolving the 50-year debate around using and misusing Likert scales</article-title>. <source><italic>Medical Education</italic></source>, <volume>42</volume>(<issue>12</issue>), <fpage>1150</fpage>–<lpage>1152</lpage>. <pub-id pub-id-type="doi">10.1111/j.1365-2923.2008.03172.x</pub-id></mixed-citation></ref>
	<ref id="bib16"><mixed-citation publication-type="other"><string-name name-style="western"><surname>Castañeda Garcia</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Nossek</surname>, <given-names>J.</given-names></string-name> (<year>2014</year>). <article-title>Estimation of rank deficient covariance matrices with Kronecker structure</article-title>. <source><italic>ICASSP — IEEE International Conference on Acoustics, Speech and Signal Processing — Proceedings</italic></source>, (pp. 394–398). Curran Associates.</mixed-citation></ref>
	<ref id="bib17"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Chen</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Bowman</surname>, <given-names>F. D.</given-names></string-name> (<year>2011</year>). <article-title>A novel support vector classifier for longitudinal high-dimensional data and its application to neuroimaging data</article-title>. <source><italic>Statistical Analysis and Data Mining</italic></source>, <volume>4</volume>(<issue>6</issue>), <fpage>604</fpage>–<lpage>611</lpage>. <pub-id pub-id-type="doi">10.1002/sam.10141</pub-id></mixed-citation></ref>
	<ref id="bib18"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Clark</surname>, <given-names>L. A.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Watson</surname>, <given-names>D.</given-names></string-name> (<year>2019</year>). <article-title>Constructing validity: New developments in creating objective measuring instruments</article-title>. <source><italic>Psychological Assessment</italic></source>, <volume>31</volume>(<issue>12</issue>), <fpage>1412</fpage>–<lpage>1427</lpage>. <pub-id pub-id-type="doi">10.1037/pas0000626</pub-id></mixed-citation></ref>
	<ref id="bib19"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Cortes</surname>, <given-names>C.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Vapnik</surname>, <given-names>V. N.</given-names></string-name> (<year>1995</year>). <article-title>Support-Vector Networks</article-title>. <source><italic>Machine Learning</italic></source>, <volume>20</volume>, <fpage>273</fpage>–<lpage>297</lpage>. <pub-id pub-id-type="doi">10.1007/BF00994018</pub-id></mixed-citation></ref>
	<ref id="bib20"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Delacre</surname>, <given-names>M.</given-names></string-name>, <string-name name-style="western"><surname>Lakens</surname>, <given-names>D.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Leys</surname>, <given-names>C.</given-names></string-name> (<year>2017</year>). <article-title>Why psychologists should by default use Welch’s <italic>t</italic>-test instead of Student’s <italic>t</italic>-test</article-title>. <source><italic>International Review of Social Psychology</italic></source>, <volume>30</volume>(<issue>1</issue>), <fpage>92</fpage>–<lpage>101</lpage>. <pub-id pub-id-type="doi">10.5334/irsp.82</pub-id></mixed-citation></ref>
<ref id="bib21"><mixed-citation publication-type="other"><string-name name-style="western"><surname>Donoho</surname>, <given-names>D.</given-names></string-name> (<year>1982</year>). <source><italic>Breakdown properties of multivariate location estimators</italic></source> [Unpublished doctoral dissertation]. Harvard University.</mixed-citation></ref>
	<ref id="bib22"><mixed-citation publication-type="book"><string-name name-style="western"><surname>Donoho</surname>, <given-names>D. L.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Huber</surname>, <given-names>P. J.</given-names></string-name> (<year>1983</year>). <chapter-title>The notion of breakdown point</chapter-title>. In P. J. Bickel, K. A. Doksum, &amp; J. L. Hodges, Jr. (Eds.), <source><italic>A Festschrift for Erich Lehmann</italic></source> (pp. 157–184). <publisher-name>Wadsworth</publisher-name>.</mixed-citation></ref>
	<ref id="bib23"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Efron</surname>, <given-names>B.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Tibshirani</surname>, <given-names>R.</given-names></string-name> (<year>1997</year>). <article-title>Improvements on cross-validation: The .632+ bootstrap method</article-title>. <source><italic>Journal of the American Statistical Association</italic></source>, <volume>92</volume>(<issue>438</issue>), <fpage>548</fpage>–<lpage>560</lpage>. <pub-id pub-id-type="doi">10.1080/01621459.1997.10474007</pub-id></mixed-citation></ref>
<ref id="bib24"><mixed-citation publication-type="book"><string-name name-style="western"><surname>Field</surname>, <given-names>A.</given-names></string-name> (<year>2017</year>). <source><italic>Discovering statistics using IBM SPSS Statistics</italic></source>. <publisher-name>SAGE Publications</publisher-name>.</mixed-citation></ref>
	<ref id="bib25"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Fisher</surname>, <given-names>R. A.</given-names></string-name> (<year>1936</year>). <article-title>The use of multiple measurements in taxonomic problems</article-title>. <source><italic>Annals of Human Genetics</italic></source>, <volume>7</volume>(<issue>2</issue>), <fpage>179</fpage>–<lpage>188</lpage>. <pub-id pub-id-type="doi">10.1111/j.1469-1809.1936.tb02137.x</pub-id></mixed-citation></ref>
	<ref id="bib26"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Fletcher</surname>, <given-names>J. M.</given-names></string-name>, <string-name name-style="western"><surname>Rice</surname>, <given-names>W. J.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Ray</surname>, <given-names>R. M.</given-names></string-name> (<year>1978</year>). <article-title>Linear discriminant function analysis in neuropsychological research: Some uses and abuses</article-title>. <source>Cortex: A Journal Devoted to the Study of the Nervous System and Behavior</source>, <volume>14</volume>(<issue>4</issue>), <fpage>564</fpage>–<lpage>577</lpage>. <pub-id pub-id-type="doi">10.1016/S0010-9452(78)80031-8</pub-id></mixed-citation></ref>	
	<ref id="bib27"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Friendly</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Sigal</surname>, <given-names>M.</given-names></string-name> (<year>2020</year>). <article-title>Visualizing tests for equality of covariance matrices</article-title>. <source><italic> American Statistician</italic></source>, <volume>74</volume>(<issue>2</issue>), <fpage>144</fpage>–<lpage>155</lpage>. <pub-id pub-id-type="doi">10.1080/00031305.2018.1497537</pub-id></mixed-citation></ref>
	<ref id="bib28"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Gaito</surname>, <given-names>J.</given-names></string-name> (<year>1980</year>). <article-title>Measurement scales and statistics: Resurgence of an old misconception</article-title>. <source><italic>Psychological Bulletin</italic></source>, <volume>87</volume>(<issue>3</issue>, <fpage>564</fpage>–<lpage>567</lpage>. <pub-id pub-id-type="doi">10.1037/0033-2909.87.3.564</pub-id></mixed-citation></ref>
	<ref id="bib29"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Garrett</surname>, <given-names>H. E.</given-names></string-name> (<year>1943</year>). <article-title>The discriminant function and its use in psychology</article-title>. <source><italic>Psychometrika</italic></source>, <volume>8</volume>(<issue>2</issue>), <fpage>65</fpage>–<lpage>79</lpage>. <pub-id pub-id-type="doi">10.1007/BF02288691</pub-id></mixed-citation></ref>
	<ref id="bib30"><mixed-citation publication-type="book"><string-name name-style="western"><surname>Gnanadesikan</surname>, <given-names>R.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Kettenring</surname>, <given-names>J. R.</given-names></string-name> (<year>1984</year>). <chapter-title>A pragmatic review of multivariate methods in applications</chapter-title>. In W. A. David &amp; H. T. David (Eds.), <italic>Statistics: An appraisal</italic> (pp. 309–337). <publisher-name>Iowa State University Press</publisher-name>.</mixed-citation></ref>
	<ref id="bib31.5"><mixed-citation publication-type="data"><string-name name-style="western"><surname>Graf</surname>, <given-names>R.</given-names></string-name>, <string-name name-style="western"><surname>Zeldovich</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Friedrich</surname>, <given-names>S.</given-names></string-name> (<year>2025</year>). <italic>Linear classification methods for multivariate repeated measures data — A simulation study</italic> [Code, Data, Supplementary Materials]. Figshare. <ext-link ext-link-type="uri" xlink:href="https://figshare.com/s/104aeb2a870a810f80bd">https://figshare.com/s/104aeb2a870a810f80bd</ext-link></mixed-citation></ref>
	<ref id="bib31"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Gupta</surname>, <given-names>A. K.</given-names></string-name> (<year>1986</year>). <article-title>On a classification rule for multiple measurements</article-title>. <source><italic>Computers &amp; Mathematics with Applications</italic></source>, <volume>12</volume>(<issue>2A</issue>), <fpage>301</fpage>–<lpage>308</lpage>. <pub-id pub-id-type="doi">10.1016/0898-1221(86)90082-9</pub-id></mixed-citation></ref>
	<ref id="bib32"><mixed-citation publication-type="book"><string-name name-style="western"><surname>Hardin</surname>, <given-names>J. W.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Hilbe</surname>, <given-names>J. M.</given-names></string-name> (<year>2013</year>). <source><italic>Generalized estimating equations</italic></source>. <publisher-name>CRC Press</publisher-name>.</mixed-citation></ref>
	<ref id="bib33"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Hastie</surname>, <given-names>T. J.</given-names></string-name>, <string-name name-style="western"><surname>Rosset</surname>, <given-names>S.</given-names></string-name>, <string-name name-style="western"><surname>Tibshirani</surname>, <given-names>R.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Zhu</surname>, <given-names>J.</given-names></string-name> (<year>2004</year>). <article-title>The entire regularization path for the support vector machine</article-title>. <source><italic>Journal of Machine Learning Research</italic></source>, <volume>5</volume>, <fpage>1391</fpage>–<lpage>1415</lpage>.</mixed-citation></ref>
	<ref id="bib34"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Hawkins</surname>, <given-names>D. M.</given-names></string-name>, &amp; <string-name name-style="western"><surname>McLachlan</surname>, <given-names>G. J.</given-names></string-name> (<year>1997</year>). <article-title>High-breakdown linear discriminant analysis</article-title>. <source><italic>Journal of the American Statistical Association</italic></source>, <volume>92</volume>(<issue>437</issue>), <fpage>136</fpage>–<lpage>143</lpage>. <pub-id pub-id-type="doi">10.1080/01621459.1997.10473610</pub-id></mixed-citation></ref>
	<ref id="bib35"><mixed-citation publication-type="report"><string-name name-style="western"><surname>Hsu</surname>, <given-names>C.-W.</given-names></string-name>, <string-name name-style="western"><surname>Chang</surname>, <given-names>C.-C.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Lin</surname>, <given-names>C.-J.</given-names></string-name> (<year>2003</year>). <italic>A practical guide to support vector classification</italic> [Technical Report] (pp. 1–12). Department of Computer Science and Information Engineering, National Taiwan University.</mixed-citation></ref> 
	<ref id="bib36"><mixed-citation publication-type="book"><string-name name-style="western"><surname>Huberty</surname>, <given-names>C.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Olejnik</surname>, <given-names>S.</given-names></string-name> (<year>2006</year>). <source><italic>Applied MANOVA and discriminant analysis</italic></source>. <publisher-name>John Wiley &amp; Sons</publisher-name>.</mixed-citation></ref>
	<ref id="bib37"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Hyde</surname>, <given-names>M.</given-names></string-name>, <string-name name-style="western"><surname>Wiggins</surname>, <given-names>R.</given-names></string-name>, <string-name name-style="western"><surname>Higgs</surname>, <given-names>P.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Blane</surname>, <given-names>D.</given-names></string-name> (<year>2003</year>). <article-title>A measure of quality of life in early old age: The theory, development and properties of a needs satisfaction model (CASP-19)</article-title>. <source><italic>Aging &amp; Mental Health</italic></source>, <volume>7</volume>(<issue>3</issue>), <fpage>186</fpage>–<lpage>194</lpage>. <pub-id pub-id-type="doi">10.1080/1360786031000101157</pub-id></mixed-citation></ref>
	<ref id="bib38"><mixed-citation publication-type="other"><string-name name-style="western"><surname>Inan</surname>, <given-names>G.</given-names></string-name> (<year>2015</year>). <source>JGEE: Joint Generalized Estimating Equation solver</source>. Comprehensive R Archive Network. <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/src/contrib/Archive/JGEE/">https://cran.r-project.org/src/contrib/Archive/JGEE/</ext-link></mixed-citation></ref>
	<ref id="bib39"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Jebb</surname>, <given-names>A.</given-names></string-name>, <string-name name-style="western"><surname>Ng</surname>, <given-names>V.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Tay</surname>, <given-names>L.</given-names></string-name> (<year>2021</year>). <article-title>A review of key Likert scale development advances: 1995–2019</article-title>. <source><italic>Frontiers in Psychology</italic></source>, <volume>12</volume>, <elocation-id>637547</elocation-id>. <pub-id pub-id-type="doi">10.3389/fpsyg.2021.637547</pub-id></mixed-citation></ref>
	<ref id="bib40"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Jensen</surname>, <given-names>E.</given-names></string-name>, <string-name name-style="western"><surname>Pfleger</surname>, <given-names>A.</given-names></string-name>, <string-name name-style="western"><surname>Lorenz</surname>, <given-names>L.</given-names></string-name>, <string-name name-style="western"><surname>Jensen</surname>, <given-names>A.</given-names></string-name>, <string-name name-style="western"><surname>Wagoner</surname>, <given-names>B.</given-names></string-name>, <string-name name-style="western"><surname>Watzlawik</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Herbig</surname>, <given-names>L.</given-names></string-name> (<year>2021</year>). <article-title>A repeated measures dataset on public responses to the COVID-19 pandemic: Social norms, attitudes, behaviors, conspiracy thinking, and (mis)information</article-title>. <source><italic>Frontiers in Communication</italic></source>, <volume>6</volume>, <elocation-id>678335</elocation-id>. <pub-id pub-id-type="doi">10.3389/fcomm.2021.678335</pub-id></mixed-citation></ref>
	<ref id="bib41"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Jiang</surname>, <given-names>B.</given-names></string-name>, <string-name name-style="western"><surname>Zhang</surname>, <given-names>X.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Cai</surname>, <given-names>T.</given-names></string-name> (<year>2008</year>). <article-title>Estimating the confidence interval for prediction errors of support vector machine classifiers</article-title>. <source><italic>Journal of Machine Learning Research</italic></source>, <volume>9</volume>(<issue>17</issue>), <fpage>521</fpage>–<lpage>540</lpage>.</mixed-citation></ref>
	<ref id="bib42"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Knowles</surname>, <given-names>C.</given-names></string-name>, <string-name name-style="western"><surname>Eccersley</surname>, <given-names>A.</given-names></string-name>, <string-name name-style="western"><surname>Scott</surname>, <given-names>M.</given-names></string-name>, <string-name name-style="western"><surname>Walker</surname>, <given-names>S.</given-names></string-name>, <string-name name-style="western"><surname>Reeves</surname>, <given-names>B.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Lunniss</surname>, <given-names>P.</given-names></string-name> (<year>2000</year>). <article-title>Linear discriminant analysis of symptoms in patients with chronic constipation</article-title>. <source><italic>Diseases of the Colon &amp; Rectum</italic></source>, <volume>43</volume>, <fpage>1419</fpage>–<lpage>1426</lpage>. <pub-id pub-id-type="doi">10.1007/BF02236639</pub-id></mixed-citation></ref>
	<ref id="bib43"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Kristjansdottir</surname>, <given-names>H.</given-names></string-name>, <string-name name-style="western"><surname>Erlingsdóttir</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Saavedra</surname>, <given-names>J.</given-names></string-name> (<year>2018</year>). <article-title>Psychological skills, mental toughness and anxiety in elite handball players</article-title>. <source><italic>Personality and Individual Differences</italic></source>, <volume>134</volume>, <fpage>125</fpage>–<lpage>130</lpage>. <pub-id pub-id-type="doi">10.1016/j.paid.2018.06.011</pub-id></mixed-citation></ref>
	<ref id="bib44"><mixed-citation publication-type="other"><string-name name-style="western"><surname>Kuhn</surname>, <given-names>M.</given-names></string-name>, <string-name name-style="western"><surname>Wing</surname>, <given-names>J.</given-names></string-name>, <string-name name-style="western"><surname>Weston</surname>, <given-names>S.</given-names></string-name>, <string-name name-style="western"><surname>Williams</surname>, <given-names>A.</given-names></string-name>, <string-name name-style="western"><surname>Keefer</surname>, <given-names>C.</given-names></string-name>, <string-name name-style="western"><surname>Engelhardt</surname>, <given-names>A.</given-names></string-name>, <string-name name-style="western"><surname>Cooper</surname>, <given-names>T.</given-names></string-name>, <string-name name-style="western"><surname>Mayer</surname>, <given-names>Z.</given-names></string-name>, <string-name name-style="western"><surname>Kenkel</surname>, <given-names>B.</given-names></string-name>, <collab>R Core Team</collab>, <string-name name-style="western"><surname>Benesty</surname>, <given-names>M.</given-names></string-name>, <string-name name-style="western"><surname>Lescarbeau</surname>, <given-names>R.</given-names></string-name>, <string-name name-style="western"><surname>Ziem</surname>, <given-names>A.</given-names></string-name>, <string-name name-style="western"><surname>Scrucca</surname>, <given-names>L.</given-names></string-name>, <string-name name-style="western"><surname>Tang</surname>, <given-names>Y.</given-names></string-name>, <string-name name-style="western"><surname>Candan</surname>, <given-names>C.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Hunt</surname>, <given-names>T.</given-names></string-name> (<year>2024</year>). <source><italic>caret: Classification and Regression Training</italic></source>. <comment>Comprehensive R Archive Network</comment>. <ext-link ext-link-type="uri" xlink:href="https://CRAN.R-project.org/package=caret">https://CRAN.R-project.org/package=caret</ext-link></mixed-citation></ref>
	<ref id="bib45"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Kumpulainen</surname>, <given-names>P.</given-names></string-name>, <string-name name-style="western"><surname>Cardó</surname>, <given-names>A. V.</given-names></string-name>, <string-name name-style="western"><surname>Somppi</surname>, <given-names>S.</given-names></string-name>, <string-name name-style="western"><surname>Törnqvist</surname>, <given-names>H.</given-names></string-name>, <string-name name-style="western"><surname>Väätäjä</surname>, <given-names>H.</given-names></string-name>, <string-name name-style="western"><surname>Majaranta</surname>, <given-names>P.</given-names></string-name>, <string-name name-style="western"><surname>Surakka</surname>, <given-names>V.</given-names></string-name>, <string-name name-style="western"><surname>Vainio</surname>, <given-names>O.</given-names></string-name>, <string-name name-style="western"><surname>Kujala</surname>, <given-names>M. V.</given-names></string-name>, <string-name name-style="western"><surname>Gizatdinova</surname>, <given-names>Y.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Vehkaoja</surname>, <given-names>A.</given-names></string-name> (<year>2021</year>). <article-title>Dog behaviour classification with movement sensors placed on the harness and the collar</article-title>. <source><italic>Applied Animal Behaviour Science</italic></source>, <volume>241</volume>, <elocation-id>105393</elocation-id>. <pub-id pub-id-type="doi">10.1016/j.applanim.2021.105393</pub-id></mixed-citation></ref>
	<ref id="bib46"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Langlois</surname>, <given-names>F.</given-names></string-name>, <string-name name-style="western"><surname>Freeston</surname>, <given-names>M. H.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Ladouceur</surname>, <given-names>R.</given-names></string-name> (<year>2000</year>). <article-title>Differences and similarities between obsessive intrusive thoughts and worry in a non-clinical population: Study 2</article-title>. <source><italic>Behaviour Research and Therapy</italic></source>, <volume>38</volume>(<issue>2</issue>), <fpage>175</fpage>–<lpage>189</lpage>. <pub-id pub-id-type="doi">10.1016/s0005-7967(99)00028-5</pub-id></mixed-citation></ref>
	<ref id="bib47"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Likert</surname>, <given-names>R.</given-names></string-name> (<year>1932</year>). <article-title>A technique for the measurement of attitudes</article-title>. <source><italic>Archives of Psychology</italic></source>, <volume>140</volume>, <fpage>1</fpage>–<lpage>55</lpage>.</mixed-citation></ref>
	<ref id="bib48"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Lix</surname>, <given-names>L.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Sajobi</surname>, <given-names>T.</given-names></string-name> (<year>2010</year>). <article-title>Discriminant analysis for repeated measures data: A review</article-title>. <source><italic>Frontiers in Psychology</italic></source>, <volume>1</volume>, <elocation-id>146</elocation-id>. <pub-id pub-id-type="doi">10.3389/fpsyg.2010.00146</pub-id></mixed-citation></ref>
	<ref id="bib49"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Lu</surname>, <given-names>N.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Zimmerman</surname>, <given-names>D.</given-names></string-name> (<year>2005</year>). <article-title>The likelihood ratio test for a separable covariance matrix</article-title>. <source><italic>Statistics &amp; Probability Letters</italic></source>, <volume>73</volume>(<issue>4</issue>), <fpage>449</fpage>–<lpage>457</lpage>. <pub-id pub-id-type="doi">10.1016/j.spl.2005.04.020</pub-id></mixed-citation></ref>
	<ref id="bib50"><mixed-citation publication-type="data"><string-name name-style="western"><surname>McLanahan</surname>, <given-names>S.</given-names></string-name>, <string-name name-style="western"><surname>Garfinkel</surname>, <given-names>I.</given-names></string-name>, <string-name name-style="western"><surname>Edin</surname>, <given-names>K.</given-names></string-name>, <string-name name-style="western"><surname>Waldfogel</surname>, <given-names>J.</given-names></string-name>, <string-name name-style="western"><surname>Hale</surname>, <given-names>L.</given-names></string-name>, <string-name name-style="western"><surname>Buxton</surname>, <given-names>O. M.</given-names></string-name>, <string-name name-style="western"><surname>Mitchell</surname>, <given-names>C.</given-names></string-name>, <string-name name-style="western"><surname>Hyde</surname>, <given-names>L. W.</given-names></string-name>, <string-name name-style="western"><surname>Notterman</surname>, <given-names>D. A.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Monk</surname>, <given-names>C. S.</given-names></string-name> (<year>2019</year>). <italic>Fragile families and child wellbeing study, public use, United States, 1998–2017.</italic> Inter-University Consortium for Political and Social Research. <pub-id pub-id-type="doi">10.3886/ICPSR31622.v4</pub-id></mixed-citation></ref>
	<ref id="bib51"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Morris</surname>, <given-names>T. P.</given-names></string-name>, <string-name name-style="western"><surname>White</surname>, <given-names>I. R.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Crowther</surname>, <given-names>M. J.</given-names></string-name> (<year>2019</year>). <article-title>Using simulation studies to evaluate statistical methods</article-title>. <source><italic>Statistics in Medicine</italic></source>, <volume>38</volume>(<issue>11</issue>), <fpage>2074</fpage>–<lpage>2102</lpage>. <pub-id pub-id-type="doi">10.1002/sim.8086</pub-id></mixed-citation></ref>
	<ref id="bib52"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Naik</surname>, <given-names>D. N.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Rao</surname>, <given-names>S. S.</given-names></string-name> (<year>2001</year>). <article-title>Analysis of multivariate repeated measures data with a Kronecker product structured covariance matrix</article-title>. <source><italic>Journal of Applied Statistics</italic></source>, <volume>28</volume>(<issue>1</issue>), <fpage>91</fpage>–<lpage>105</lpage>. <pub-id pub-id-type="doi">10.1080/02664760120011626</pub-id></mixed-citation></ref>
	<ref id="bib53"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Neto</surname>, <given-names>E.</given-names></string-name>, <string-name name-style="western"><surname>Biessmann</surname>, <given-names>F.</given-names></string-name>, <string-name name-style="western"><surname>Aurlien</surname>, <given-names>H.</given-names></string-name>, <string-name name-style="western"><surname>Nordby</surname>, <given-names>H.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Eichele</surname>, <given-names>T.</given-names></string-name> (<year>2016</year>). <article-title>Regularized linear discriminant analysis of EEG features in dementia patients</article-title>. <source><italic>Frontiers in Aging Neuroscience</italic></source>, <volume>8</volume>, <elocation-id>273</elocation-id>. <pub-id pub-id-type="doi">10.3389/fnagi.2016.00273</pub-id></mixed-citation></ref>
	<ref id="bib54"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Norman</surname>, <given-names>G.</given-names></string-name> (<year>2010</year>). <article-title>Likert scales, levels of measurement and the “laws” of statistics</article-title>. <source><italic>Advances in Health Sciences Education</italic></source>, <volume>15</volume>(<issue>5</issue>), <fpage>625</fpage>–<lpage>632</lpage>. <pub-id pub-id-type="doi">10.1007/s10459-010-9222-y</pub-id></mixed-citation></ref>
	<ref id="bib55"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>O’Brien</surname>, <given-names>J.</given-names></string-name>, <string-name name-style="western"><surname>Tsermentseli</surname>, <given-names>S.</given-names></string-name>, <string-name name-style="western"><surname>Cummins</surname>, <given-names>O.</given-names></string-name>, <string-name name-style="western"><surname>Happé</surname>, <given-names>F.</given-names></string-name>, <string-name name-style="western"><surname>Heaton</surname>, <given-names>P.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Spencer</surname>, <given-names>J. V.</given-names></string-name> (<year>2009</year>). <article-title>Discriminating children with autism from children with learning difficulties with an adaptation of the short sensory profile</article-title>. <source><italic>Early Child Development and Care</italic></source>, <volume>179</volume>(<issue>4</issue>), <fpage>383</fpage>–<lpage>394</lpage>.</mixed-citation></ref>
	<ref id="bib56"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Rausch</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Kelley</surname>, <given-names>K.</given-names></string-name> (<year>2009</year>). <article-title>A comparison of linear and mixture models for discriminant analysis under nonnormality</article-title>. <source><italic>Behavior Research Methods</italic></source>, <volume>41</volume>, <fpage>85</fpage>–<lpage>98</lpage>. <pub-id pub-id-type="doi">10.3758/BRM.41.1.85</pub-id></mixed-citation></ref>
	<ref id="bib57"><mixed-citation publication-type="other"><string-name name-style="western"><surname>Ribeiro</surname>, <given-names>C. E.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Freitas</surname>, <given-names>A.</given-names></string-name> (<year>2019</year>). <italic>A mini-survey of supervised machine learning approaches for coping with ageing-related longitudinal datasets</italic>. Third Workshop on AI for Aging, Rehabilitation and Independent Assisted Living (ARIAL) — IJCAI-2019.</mixed-citation></ref>
	<ref id="bib58"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Rickards</surname>, <given-names>G.</given-names></string-name>, <string-name name-style="western"><surname>Magee</surname>, <given-names>C.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Artino</surname>, <given-names>A.</given-names></string-name> (<year>2012</year>). <article-title>You can’t fix by analysis what you’ve spoiled by design: Developing survey instruments and collecting validity evidence</article-title>. <source><italic>Journal of Graduate Medical Education</italic></source>, <volume>4</volume>(<issue>4</issue>), <fpage>407</fpage>–<lpage>410</lpage>. <pub-id pub-id-type="doi">10.4300/JGME-D-12-00239.1</pub-id></mixed-citation></ref>
	<ref id="bib59"><mixed-citation publication-type="other"><string-name name-style="western"><surname>Ripley</surname>, <given-names>B.</given-names></string-name>, <string-name name-style="western"><surname>Venables</surname>, <given-names>B.</given-names></string-name>, <string-name name-style="western"><surname>Bates</surname>, <given-names>D. M.</given-names></string-name>, <string-name name-style="western"><surname>Hornik</surname>, <given-names>K.</given-names></string-name>, <string-name name-style="western"><surname>Gebhardt</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Firth</surname>, <given-names>D.</given-names></string-name> (<year>2022</year>). <source><italic>MASS: Support functions and datasets for Venables and Ripley’s MASS</italic></source>. <comment>Comprehensive R Archive Network</comment>. <ext-link ext-link-type="uri" xlink:href="https://CRAN.R-project.org/package=MASS">https://CRAN.R-project.org/package=MASS</ext-link></mixed-citation></ref>
	<ref id="bib60"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Rogge</surname>, <given-names>R. D.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Bradbury</surname>, <given-names>T. N.</given-names></string-name> (<year>1999</year>). <article-title>Till violence does us part: The differing roles of communication and aggression in predicting adverse marital outcomes</article-title>. <source><italic>Journal of Consulting and Clinical Psychology</italic></source>, <volume>67</volume>(<issue>3</issue>), <fpage>340</fpage>–<lpage>351</lpage>. <pub-id pub-id-type="doi">10.1037/0022-006X.67.3.340</pub-id></mixed-citation></ref>
	<ref id="bib61"><mixed-citation publication-type="book"><string-name name-style="western"><surname>Rousseeuw</surname>, <given-names>P.</given-names></string-name> (<year>1985</year>). <chapter-title>Multivariate estimation with high breakdown point</chapter-title>. In W. Grossmann, G. Pflug, I. Vincze, &amp; W. Wertz (Eds.), <italic>Mathematical statistics and applications</italic> (pp. 283–297). <publisher-name>Reidel Publishing Company</publisher-name>.</mixed-citation></ref>
	<ref id="bib62"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Rousseeuw</surname>, <given-names>P.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Van Driessen</surname>, <given-names>K.</given-names></string-name> (<year>1999</year>). <article-title>A fast algorithm for the minimum covariance determinant estimator</article-title>. <source><italic>Technometrics</italic></source>, <volume>41</volume>(<issue>3</issue>), <fpage>212</fpage>–<lpage>223</lpage>. <pub-id pub-id-type="doi">10.1080/00401706.1999.10485670</pub-id></mixed-citation></ref>
	<ref id="bib63"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Roy</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Khattree</surname>, <given-names>R.</given-names></string-name> (<year>2005a</year>). <article-title>Discrimination and classification with repeated measures data under different covariance structures</article-title>. <source><italic>Communications in Statistics – Simulation and Computation</italic></source>, <volume>34</volume>(<issue>1</issue>), <fpage>167</fpage>–<lpage>178</lpage>. <pub-id pub-id-type="doi">10.1081/SAC-200047072</pub-id></mixed-citation></ref>
	<ref id="bib64"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Roy</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Khattree</surname>, <given-names>R.</given-names></string-name> (<year>2005b</year>). <article-title>On discrimination and classification with multivariate repeated measures data</article-title>. <source><italic>Journal of Statistical Planning and Inference</italic></source>, <volume>134</volume>(<issue>2</issue>), <fpage>462</fpage>–<lpage>485</lpage>. <pub-id pub-id-type="doi">10.1016/j.jspi.2004.04.012</pub-id></mixed-citation></ref>
	<ref id="bib65"><mixed-citation publication-type="other"><string-name name-style="western"><surname>Sentelle</surname>, <given-names>C.</given-names></string-name> (<year>2015</year>). <italic>simplesvmpath.</italic> GitHub. <ext-link ext-link-type="uri" xlink:href="https://github.com/csentelle/simplesvmpath.git">https://github.com/csentelle/simplesvmpath.git</ext-link></mixed-citation></ref>
	<ref id="bib66"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Sentelle</surname>, <given-names>C.</given-names></string-name>, <string-name name-style="western"><surname>Anagnostopoulos</surname>, <given-names>G. C.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Georgiopoulos</surname>, <given-names>M.</given-names></string-name> (<year>2016</year>). <article-title>A simple method for solving the SVM regularization path for semidefinite kernels</article-title>. <source><italic>IEEE Transactions on Neural Networks and Learning Systems</italic></source>, <volume>27</volume>(<issue>4</issue>), <fpage>709</fpage>–<lpage>722</lpage>. <pub-id pub-id-type="doi">10.1109/TNNLS.2015.2427333</pub-id></mixed-citation></ref>
	<ref id="bib67"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Sherry</surname>, <given-names>A.</given-names></string-name> (<year>2006</year>). <article-title>Discriminant analysis in counseling psychology research</article-title>. <source><italic>The Counseling Psychologist</italic></source>, <volume>34</volume>(<issue>5</issue>), <fpage>661</fpage>–<lpage>683</lpage>. <pub-id pub-id-type="doi">10.1177/0011000006287103</pub-id></mixed-citation></ref>
	<ref id="bib68"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Shinba</surname>, <given-names>T.</given-names></string-name>, <string-name name-style="western"><surname>Murotsu</surname>, <given-names>K.</given-names></string-name>, <string-name name-style="western"><surname>Usui</surname>, <given-names>Y.</given-names></string-name>, <string-name name-style="western"><surname>Andow</surname>, <given-names>Y.</given-names></string-name>, <string-name name-style="western"><surname>Terada</surname>, <given-names>H.</given-names></string-name>, <string-name name-style="western"><surname>Kariya</surname>, <given-names>N.</given-names></string-name>, <string-name name-style="western"><surname>Tatebayashi</surname>, <given-names>Y.</given-names></string-name>, <string-name name-style="western"><surname>Matsuda</surname>, <given-names>Y.</given-names></string-name>, <string-name name-style="western"><surname>Mugishima</surname>, <given-names>G.</given-names></string-name>, <string-name name-style="western"><surname>Shinba</surname>, <given-names>Y.</given-names></string-name>, <string-name name-style="western"><surname>Sun</surname>, <given-names>G.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Matsui</surname>, <given-names>T.</given-names></string-name> (<year>2021</year>). <article-title>Return-to-work screening by linear discriminant analysis of heart rate variability indices in depressed subjects</article-title>. <source><italic>Sensors</italic></source>, <volume>21</volume>(<issue>15</issue>), <elocation-id>5177</elocation-id>. <pub-id pub-id-type="doi">10.3390/s21155177</pub-id></mixed-citation></ref>
	<ref id="bib69"><mixed-citation publication-type="other"><string-name name-style="western"><surname>Silan</surname>, <given-names>M. A. A.</given-names></string-name> (<year>2020</year>). <italic>When can we treat Likert type data as interval?</italic> PsyArXiv. <ext-link ext-link-type="uri" xlink:href="https://osf.io/preprints/psyarxiv/wvkyu_v1">https://osf.io/preprints/psyarxiv/wvkyu_v1</ext-link></mixed-citation></ref>
	<ref id="bib70"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Stoyanov</surname>, <given-names>D. S.</given-names></string-name>, <string-name name-style="western"><surname>Khorev</surname>, <given-names>V. S.</given-names></string-name>, <string-name name-style="western"><surname>Paunova</surname>, <given-names>R.</given-names></string-name>, <string-name name-style="western"><surname>Kandilarova</surname>, <given-names>S.</given-names></string-name>, <string-name name-style="western"><surname>Simeonova</surname>, <given-names>D.</given-names></string-name>, <string-name name-style="western"><surname>Badarin</surname>, <given-names>A. A.</given-names></string-name>, <string-name name-style="western"><surname>Hramov</surname>, <given-names>A. E.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Kurkin</surname>, <given-names>S. A.</given-names></string-name> (<year>2022</year>). <article-title>Resting-state functional connectivity impairment in patients with major depressive episode</article-title>. <source><italic>International Journal of Environmental Research and Public Health</italic></source>, <volume>19</volume>(<issue>21</issue>), <elocation-id>14045</elocation-id>. <pub-id pub-id-type="doi">10.3390/ijerph192114045</pub-id></mixed-citation></ref>
	<ref id="bib71"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Sullivan</surname>, <given-names>G.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Artino</surname>, <given-names>A.</given-names></string-name> (<year>2013</year>). <article-title>Analyzing and interpreting data from Likert-type scales</article-title>. <source><italic>Journal of Graduate Medical Education</italic></source>, <volume>5</volume>(<issue>4</issue>), <fpage>541</fpage>–<lpage>542</lpage>. <pub-id pub-id-type="doi">10.4300/JGME-5-4-18</pub-id></mixed-citation></ref>
	<ref id="bib72"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Talarska</surname>, <given-names>D.</given-names></string-name>, <string-name name-style="western"><surname>Tobis</surname>, <given-names>S.</given-names></string-name>, <string-name name-style="western"><surname>Kotkowiak</surname>, <given-names>M.</given-names></string-name>, <string-name name-style="western"><surname>Strugała</surname>, <given-names>M.</given-names></string-name>, <string-name name-style="western"><surname>Stanisławska</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Wieczorowska-Tobis</surname>, <given-names>K.</given-names></string-name> (<year>2018</year>). <article-title>Determinants of quality of life and the need for support for the elderly with good physical and mental functioning</article-title>. <source><italic>Medical Science Monitor</italic></source>, <volume>24</volume>, <fpage>1604</fpage>–<lpage>1613</lpage>. <pub-id pub-id-type="doi">10.12659/msm.907032</pub-id></mixed-citation></ref>
	<ref id="bib73"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Tiku</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Balakrishnan</surname>, <given-names>N.</given-names></string-name> (<year>1984</year>). <article-title>Testing equality of population variances the robust way</article-title>. <source><italic>Communications in Statistics – Theory and Methods</italic></source>, <volume>13</volume>(<issue>17</issue>), <fpage>2143</fpage>–<lpage>2159</lpage>. <pub-id pub-id-type="doi">10.1080/03610928408828818</pub-id></mixed-citation></ref>
	<ref id="bib74"><mixed-citation publication-type="other"><string-name name-style="western"><surname>Todorov</surname>, <given-names>V.</given-names></string-name> (<year>2022</year>). <source><italic>rrcov: Scalable robust estimators with high breakdown point</italic></source>. <comment>Comprehensive R Archive Network</comment>. <ext-link ext-link-type="uri" xlink:href="https://CRAN.R-project.org/package=rrcov">https://CRAN.R-project.org/package=rrcov</ext-link></mixed-citation></ref>
	<ref id="bib75"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Tomasko</surname>, <given-names>L.</given-names></string-name>, <string-name name-style="western"><surname>Helms</surname>, <given-names>R. W.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Snapinn</surname>, <given-names>S. M.</given-names></string-name> (<year>1999</year>). <article-title>A discriminant analysis extension to mixed models</article-title>. <source><italic>Statistics in Medicine</italic></source>, <volume>18</volume>(<issue>10</issue>), <fpage>1249</fpage>–<lpage>1260</lpage>.</mixed-citation></ref>
	<ref id="bib76"><mixed-citation publication-type="other"><string-name name-style="western"><surname>van den Boogaart</surname>, <given-names>K. G.</given-names></string-name>, <string-name name-style="western"><surname>Tolosana-Delgado</surname>, <given-names>R.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Bren</surname>, <given-names>M.</given-names></string-name> (<year>2022</year>). <source><italic>compositions: Compositional data analysis</italic></source>. <comment>Comprehensive R Archive Network</comment>. <ext-link ext-link-type="uri" xlink:href="https://CRAN.R-project.org/package=compositions">https://CRAN.R-project.org/package=compositions</ext-link></mixed-citation></ref>
<ref id="bib77"><mixed-citation publication-type="book"><string-name name-style="western"><surname>Vapnik</surname>, <given-names>V.</given-names></string-name> (<year>1982</year>). <source><italic>Estimation of dependences based on empirical data: Empirical inference science</italic></source>. <publisher-name>Springer</publisher-name>.</mixed-citation></ref>
	<ref id="bib78"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Veronese</surname>, <given-names>G.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Pepe</surname>, <given-names>A.</given-names></string-name> (<year>2017</year>). <article-title>Life satisfaction and trauma in clinical and non-clinical children living in a war-torn environment: A discriminant analysis</article-title>. <source><italic>Journal of Health Psychology</italic></source>, <volume>25</volume>(<issue>4</issue>), <fpage>459</fpage>–<lpage>471</lpage>. <pub-id pub-id-type="doi">10.1177/1359105317720004</pub-id></mixed-citation></ref>
	<ref id="bib79"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Wahl</surname>, <given-names>S.</given-names></string-name>, <string-name name-style="western"><surname>Boulesteix</surname>, <given-names>A.-L.</given-names></string-name>, <string-name name-style="western"><surname>Zierer</surname>, <given-names>A.</given-names></string-name>, <string-name name-style="western"><surname>Thorand</surname>, <given-names>B.</given-names></string-name>, &amp; <string-name name-style="western"><surname>van de Wiel</surname>, <given-names>M. A.</given-names></string-name> (<year>2016</year>). <article-title>Assessment of predictive performance in incomplete data by combining internal validation and multiple imputation</article-title>. <source><italic>BMC Medical Research Methodology</italic></source>, <volume>16</volume>, <elocation-id>144</elocation-id>. <pub-id pub-id-type="doi">10.1186/s12874-016-0239-7</pub-id></mixed-citation></ref>
	<ref id="bib80"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Wang</surname>, <given-names>K.</given-names></string-name>, <string-name name-style="western"><surname>Shi</surname>, <given-names>H.-S.</given-names></string-name>, <string-name name-style="western"><surname>Geng</surname>, <given-names>F.-L.</given-names></string-name>, <string-name name-style="western"><surname>Zou</surname>, <given-names>L.-Q.</given-names></string-name>, <string-name name-style="western"><surname>Tan</surname>, <given-names>S.-P.</given-names></string-name>, <string-name name-style="western"><surname>Wang</surname>, <given-names>Y.</given-names></string-name>, <string-name name-style="western"><surname>Neumann</surname>, <given-names>D. L.</given-names></string-name>, <string-name name-style="western"><surname>Shum</surname>, <given-names>D. H. K.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Chan</surname>, <given-names>R. C. K.</given-names></string-name> (<year>2016</year>). <article-title>Cross-cultural validation of the Depression Anxiety Stress Scale-21 in China</article-title>. <source><italic>Psychological Assessment</italic></source>, <volume>28</volume>(<issue>5</issue>), <fpage>e88</fpage>–<lpage>e100</lpage>. <pub-id pub-id-type="doi">10.1037/pas0000207</pub-id></mixed-citation></ref>
	<ref id="bib81"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Weber</surname>, <given-names>L.</given-names></string-name>, <string-name name-style="western"><surname>Saelens</surname>, <given-names>W.</given-names></string-name>, <string-name name-style="western"><surname>Cannoodt</surname>, <given-names>R.</given-names></string-name>, <string-name name-style="western"><surname>Soneson</surname>, <given-names>C.</given-names></string-name>, <string-name name-style="western"><surname>Hapfelmeier</surname>, <given-names>A.</given-names></string-name>, <string-name name-style="western"><surname>Gardner</surname>, <given-names>P.</given-names></string-name>, <string-name name-style="western"><surname>Boulesteix</surname>, <given-names>A.-L.</given-names></string-name>, <string-name name-style="western"><surname>Saeys</surname>, <given-names>Y.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Robinson</surname>, <given-names>M.</given-names></string-name> (<year>2019</year>). <article-title>Essential guidelines for computational method benchmarking</article-title>. <source><italic>Genome Biology</italic></source>, <volume>20</volume>, <elocation-id>125</elocation-id>. <pub-id pub-id-type="doi">10.1186/s13059-019-1738-8</pub-id></mixed-citation></ref>
	<ref id="bib82"><mixed-citation publication-type="other"><string-name name-style="western"><surname>Wilhelm</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Manjunath</surname>, <given-names>B.</given-names></string-name> (<year>2022</year>). <source><italic>tmvtnorm: Truncated multivariate normal and Student t distribution</italic></source>. <comment>Comprehensive R Archive Network</comment>. <ext-link ext-link-type="uri" xlink:href="https://CRAN.R-project.org/package=tmvtnorm">https://CRAN.R-project.org/package=tmvtnorm</ext-link></mixed-citation></ref>
	<ref id="bib83"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Woodruff</surname>, <given-names>D. L.</given-names></string-name>, &amp; <string-name name-style="western"><surname>Rocke</surname>, <given-names>D. M.</given-names></string-name> (<year>1993</year>). <article-title>Heuristic search algorithms for the minimum volume ellipsoid</article-title>. <source><italic>Journal of Computational and Graphical Statistics</italic></source>, <volume>2</volume>(<issue>1</issue>), <fpage>69</fpage>–<lpage>95</lpage>. <pub-id pub-id-type="doi">10.1080/10618600.1993.10474600</pub-id></mixed-citation></ref>
<ref id="bib84"><mixed-citation publication-type="journal"><string-name name-style="western"><surname>Youden</surname>, <given-names>W. J.</given-names></string-name> (<year>1950</year>). <article-title>Index for rating diagnostic tests</article-title>. <source><italic>Cancer</italic></source>, <volume>3</volume>(<issue>1</issue>), <fpage>32</fpage>–<lpage>35</lpage>. <pub-id pub-id-type="doi">10.1002/1097-0142(1950)3:1&lt;32::AID-CNCR2820030106&gt;3.0.CO;2-3</pub-id></mixed-citation></ref>
	<ref id="bib85"><mixed-citation publication-type="other"><string-name name-style="western"><surname>Zeldovich</surname>, <given-names>M.</given-names></string-name> (<year>1982</year>). <source><italic>Outcome measurement in Russian clinical praxis: Clinical outcome in routine evaluation – outcome measure (CORE-OM)</italic></source> [Doctoral dissertation, Universität Klagenfurt]. AAU Open-Access publications. <ext-link ext-link-type="uri" xlink:href="https://netlibrary.aau.at/obvuklhs/content/titleinfo/5370233">https://netlibrary.aau.at/obvuklhs/content/titleinfo/5370233</ext-link></mixed-citation></ref>
	</ref-list>
	<sec sec-type="data-availability" id="das"><title>Data Availability</title>
		<p>The code, data, and supplementary materials are available at <xref ref-type="bibr" rid="bib31.5">Graf et al. (2025)</xref>.</p>
	</sec>	
	
	
	<sec sec-type="supplementary-material" id="sp1"><title>Supplementary Materials</title>
		<p>For this article, the following Supplementary Materials are available:
			<list list-type="bullet">
				<list-item><p>R Code. (<xref ref-type="bibr" rid="bib31.5">Graf et al., 2025</xref>)</p></list-item>
				<list-item><p>Data. (<xref ref-type="bibr" rid="bib31.5">Graf et al., 2025</xref>)</p></list-item>
				
				<list-item><p>Study materials. (<xref ref-type="bibr" rid="bib31.5">Graf et al., 2025</xref>)</p></list-item>
			</list></p>
	</sec>
			

<fn-group>
<fn fn-type="financial-disclosure"><p>The authors have no funding to report.</p></fn>
</fn-group>
<fn-group>
<fn fn-type="conflict"><p>The authors have declared that no competing interests exist.</p></fn>
</fn-group>
<ack>
<p>The authors gratefully acknowledge the computational resources provided by the LiCCA HPC cluster of the University of Augsburg, co-funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 499211671.</p>
</ack>
	<notes>
		<title>Publisher Note</title>
		<p>This Corrected Version of Record (CVoR) differs from the original Version of Record (VoR), published on July 10, 2025, by correcting an error within the affiliations section. This correction was made on July 23, 2025.</p>
	</notes>
</back>
</article>
