<?xml version="1.0" encoding="UTF-8"?><?xml-model type="application/xml-dtd" href="http://jats.nlm.nih.gov/publishing/1.1d3/JATS-journalpublishing1.dtd"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1d3 20150301//EN" "http://jats.nlm.nih.gov/publishing/1.1d3/JATS-journalpublishing1.dtd">
<article xmlns:ali="http://www.niso.org/schemas/ali/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" dtd-version="1.1d3" specific-use="Marcalyc 1.2" article-type="research-article" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="redalyc">3442</journal-id>
<journal-title-group>
<journal-title specific-use="original" xml:lang="es">TecnoLógicas</journal-title>
</journal-title-group>
<issn pub-type="ppub">0123-7799</issn>
<issn pub-type="epub">2256-5337</issn>
<publisher>
<publisher-name>Instituto Tecnológico Metropolitano</publisher-name>
<publisher-loc>
<country>Colombia</country>
<email>tecnologicas@itm.edu.co</email>
</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="art-access-id" specific-use="redalyc">344268257011</article-id>
<article-id pub-id-type="doi">https://doi.org/10.22430/22565337.2088</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Artículos de investigación</subject>
</subj-group>
</article-categories>
<title-group>
<article-title xml:lang="en">Support Vector Machines for Biomarkers Detection in <italic>in vitro</italic> and <italic>in vivo</italic> Experiments of Organochlorines Exposure</article-title>
<trans-title-group>
<trans-title xml:lang="es">Máquinas de vectores de soporte para detección de biomarcadores en experimentos <italic>in vitro</italic> e <italic>in vivo</italic> de exposición a organoclorados</trans-title>
</trans-title-group>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="no">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-2824-4678</contrib-id>
<name name-style="western">
<surname>Lopera-Rodríguez</surname>
<given-names>Jorge Alejandro</given-names>
</name>
<xref ref-type="aff" rid="aff1"/>
<email>alejandrolopera@itm.edu.co</email>
</contrib>
<contrib contrib-type="author" corresp="no">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0003-1720-8476</contrib-id>
<name name-style="western">
<surname>Zuluaga</surname>
<given-names>Martha</given-names>
</name>
<xref ref-type="aff" rid="aff2"/>
<email>martha.zuluaga@unad.edu.co</email>
</contrib>
<contrib contrib-type="author" corresp="no">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0003-3195-7588</contrib-id>
<name name-style="western">
<surname>Jaramillo-Garzón</surname>
<given-names>Jorge Alberto</given-names>
</name>
<xref ref-type="aff" rid="aff3"/>
<email>jorge.jaramillo@ucaldas.edu.co</email>
</contrib>
</contrib-group>
<aff id="aff1">
<institution content-type="original">Instituto Tecnológico Metropolitano, Medellín-Colombia,  alejandrolopera@itm.edu.co</institution>
<institution content-type="orgname">Instituto Tecnológico Metropolitano</institution>
<country country="CO">Colombia</country>
</aff>
<aff id="aff2">
<institution content-type="original">Universidad Nacional Abierta y a Distancia,   Dosquebradas-Colombia, martha.zuluaga@unad.edu.co</institution>
<institution content-type="orgname">Universidad Nacional Abierta y a Distancia</institution>
<country country="CO">Colombia</country>
</aff>
<aff id="aff3">
<institution content-type="original">Universidad de Caldas, Manizales-Colombia,  jorge.jaramillo@ucaldas.edu.co</institution>
<institution content-type="orgname">Universidad de Caldas</institution>
<country country="CO">Colombia</country>
</aff>
<pub-date pub-type="epub-ppub">
<season>Septiembre-Diciembre</season>
<year>2021</year>
</pub-date>
<volume>24</volume>
<issue>52</issue>
<elocation-id>e2088</elocation-id>
<history>
<date date-type="received" publication-format="dd mes yyyy">
<day>21</day>
<month>07</month>
<year>2021</year>
</date>
<date date-type="accepted" publication-format="dd mes yyyy">
<day>22</day>
<month>11</month>
<year>2021</year>
</date>
<date date-type="pub" publication-format="dd mes yyyy">
<day>16</day>
<month>12</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-year>2017</copyright-year>
<copyright-holder>Instituto Tecnológico Metropolitano</copyright-holder>
<ali:free_to_read/>
<license xlink:href="https://creativecommons.org/licenses/by-nc-sa/4.0/">
<ali:license_ref>https://creativecommons.org/licenses/by-nc-sa/4.0/</ali:license_ref>
<license-p>Esta obra está bajo una Licencia Creative Commons Atribución-NoComercial-CompartirIgual 4.0 Internacional.</license-p>
</license>
</permissions>
<abstract xml:lang="en">
<title>Abstract</title>
<p>Metabolomic studies generate large amounts of data, whose complexity increases when they are derived from <italic>in vivo</italic> experiments. As a result, analysis methods widely used in metabolomics, such as Partial Least Squares Discriminant Analysis (PLS-DA), can have particular difficulties with this type of data. However, there is evidence that Support Vector Machines (SVMs) can better deal with complex data. On the other hand, chronic exposure to organochlorines is a public health problem that has been associated with diseases such as cancer. Therefore, identifying such exposure is relevant to reducing its impact on human health. This study explores the performance of SVMs in classifying metabolic profiles and identifying relevant metabolites in studies of exposure to organochlorines. For this purpose, two experiments were conducted: in the first one, organochlorine exposure was evaluated in HepG2 cells; and, in the second one, it was evaluated in serum samples of agricultural workers exposed to pesticides. The performance of SVMs was compared with that of PLS-DA. Four kernel functions were assessed in SVMs, and the accuracy of both methods was evaluated using a k-fold cross-validation test. In order to identify the most relevant metabolites, Recursive Feature Elimination (RFE) was used in SVMs and Variable Importance in Projection (VIP) in PLS-DA. The results show that SVMs exhibit higher accuracy with fewer training samples and better performance in classifying the samples from the exposed agricultural workers. Finally, a workflow based on SVMs for the identification of biomarkers in samples with high biological complexity is proposed.</p>
</abstract>
<trans-abstract xml:lang="es">
<title>Resumen</title>
<p>Los estudios en metabolómica generan gran cantidad de datos cuya complejidad aumenta si surgen de experimentos <italic>in vivo</italic>. A pesar de esto, métodos ampliamente usados en metabolómica como el análisis discriminante por mínimos cuadrados parciales (PLS-DA) tienen dificultades con este tipo de datos; sin embargo, hay evidencia de que las máquinas de vectores de soporte (SVM) pueden tener un mejor desempeño. Por otro lado, la exposición crónica a organoclorados es un problema de salud pública. Esta se asocia a enfermedades como el cáncer. Identificar la exposición es relevante para disminuir su impacto. Este estudio tuvo como objetivo explorar el rendimiento de las SVM en la clasificación de perfiles metabolómicos e identificación de metabolitos relevantes en estudios de exposición a organoclorados. Se realizaron dos experimentos: primero se evaluó la exposición a organoclorados en células HepG2. Luego, se evaluó la exposición a pesticidas en muestras de suero de trabajadores agrícolas. El rendimiento de las SVM se comparó con PLS-DA. Se evaluaron cuatro funciones kernel en SVM y la precisión de ambos métodos se evaluó mediante prueba de validación cruzada k-fold. Para identificar los metabolitos relevantes, se utilizó eliminación recursiva de características (RFE) en SVM y la proyección de importancia de variables (VIP) se usó en PLS-DA. Los resultados mostraron que las SVM tuvieron mayor precisión en la clasificación de los trabajadores agrícolas expuestos usando menos muestras de entrenamiento. Se propone un flujo de trabajo basado en SVM que permita la identificación de biomarcadores en muestras con alta complejidad biológica.</p>
</trans-abstract>
<kwd-group xml:lang="en">
<title>Keywords</title>
<kwd>Organochlorines</kwd>
<kwd>Pesticides</kwd>
<kwd>Recursive feature elimination</kwd>
<kwd>Multivariate statistical methods</kwd>
<kwd>Support vector machines</kwd>
<kwd>Metabolomics</kwd>
</kwd-group>
<kwd-group xml:lang="es">
<title>Palabras clave</title>
<kwd>Organoclorados</kwd>
<kwd>Eliminación Recursiva de Características</kwd>
<kwd>Estadística Multivariada</kwd>
<kwd>Máquinas de Vectores de Soporte</kwd>
<kwd>Metabolómica</kwd>
</kwd-group>
<counts>
<fig-count count="3"/>
<table-count count="3"/>
<equation-count count="0"/>
<ref-count count="31"/>
</counts>
<custom-meta-group>
<custom-meta>
<meta-name>How to cite / Cómo citar</meta-name>
<meta-value>J. A. Lopera-Rodríguez; M. Zuluaga; J. A. Jaramillo-Garzón, “Support Vector Machines for Biomarkers Detection in in vitro and in vivo Experiments of Organochlorines Exposure”, <italic>TecnoLógicas</italic>, vol. 24, nro. 52, e2088, 2021. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.22430/22565337.2088">https://doi.org/10.22430/22565337.2088</ext-link>
</meta-value>
</custom-meta>
</custom-meta-group>
</article-meta>
</front>
<body>
<sec>
<title>
<bold>Highlights</bold>
</title>
<p>
<list list-type="simple">
<list-item>
<p>This study sought to describe the ability of SVMs to handle complex data.</p>
</list-item>
<list-item>
<p>PLS-DA accuracy decreased when analyzing data from <italic>in vivo</italic> studies with few samples.</p>
</list-item>
<list-item>
<p>The linear and sigmoid kernels showed the best performance.</p>
</list-item>
<list-item>
<p>SVMs are robust methods for data analysis in <italic>in vivo</italic> experiments on organochlorine exposure.</p>
</list-item>
</list>
</p>
</sec>
<sec>
<title>
<bold>1.     INTRODUCTION</bold>
</title>
<p>Modern analytical technologies such as mass spectrometry, nuclear magnetic resonance, and tandem mass spectrometry facilitate the study of the metabolome. Metabolomics is defined as the quantitative and comprehensive study of metabolites in a biological system [<xref ref-type="bibr" rid="redalyc_344268257011_ref1">1</xref>]. Metabolomic studies produce large amounts of data on metabolites present in a specific biological scenario, which has been termed “metabolic profile” [<xref ref-type="bibr" rid="redalyc_344268257011_ref2">2</xref>].</p>
<p>The complexity of metabolic profiles depends on the conditions in which the data are generated. For example, metabolic profiles from <italic>in vitro</italic> experiments show low variability, while those from <italic>in vivo</italic> studies (e.g., with humans) might be highly variable between individuals. This complexity affects the ability of statistical algorithms to make accurate predictions based on metabolic profiles.</p>
<p>Methods such as Principal Component Analysis (PCA), Partial Least Squares Discriminant Analysis (PLS-DA), and Orthogonal PLS-DA (OPLS-DA) are commonly used to analyze metabolomics data. However, some studies have found that their classification capacity can be suboptimal under real-life conditions, where several variables cannot be controlled and the data can have a nonlinear distribution [<xref ref-type="bibr" rid="redalyc_344268257011_ref3">3</xref>].</p>
<p>Support Vector Machines (SVMs) are a supervised learning method that generates a model able to map a training dataset with two categories into a higher-dimensional space in order to separate them by a margin as large as possible [<xref ref-type="bibr" rid="redalyc_344268257011_ref4">4</xref>]. Additionally, SVMs use kernel functions to deal with nonlinear distributions [<xref ref-type="bibr" rid="redalyc_344268257011_ref4">4</xref>], [<xref ref-type="bibr" rid="redalyc_344268257011_ref5">5</xref>] and can thus work with a large number of variables and few samples. Some studies have shown that, in experiments with complex samples such as blood, SVMs can identify relevant metabolites that PLS-DA fails to detect [<xref ref-type="bibr" rid="redalyc_344268257011_ref3">3</xref>], [<xref ref-type="bibr" rid="redalyc_344268257011_ref6">6</xref>]. For example, a study published in 2008 [<xref ref-type="bibr" rid="redalyc_344268257011_ref3">3</xref>] revealed that PLS-DA omitted creatinine, an important feature to differentiate females from males, whereas SVMs did not.</p>
<p>Recent studies have also compared PLS-DA with other methods, including SVMs. Mendez <italic>et al.</italic> [<xref ref-type="bibr" rid="redalyc_344268257011_ref7">7</xref>] evaluated the classification performance of PLS-DA, logistic regression of principal components, SVMs, Random Forest (RF), and Artificial Neural Networks (ANNs) in metabolomics studies. The results of that study showed that SVMs and ANNs achieved an improvement in predictive performance over PLS-DA, which did not occur with RF. Gromski <italic>et al.</italic> [<xref ref-type="bibr" rid="redalyc_344268257011_ref8">8</xref>] compared the capabilities of techniques such as discriminant function analysis of principal components, PLS-DA, RF, and SVMs and found that SVMs handle outliers well and resist overfitting.</p>
<p>As in PLS-DA, a list of the most relevant metabolites can be generated with SVMs by means of SVM-Recursive Feature Elimination (SVM-RFE) [<xref ref-type="bibr" rid="redalyc_344268257011_ref9">9</xref>]. This method employs a loop in which an SVM is trained with a linear kernel and the feature with the lowest decision value in the model is eliminated. Hence, features are sorted according to their decision value [<xref ref-type="bibr" rid="redalyc_344268257011_ref6">6</xref>], [<xref ref-type="bibr" rid="redalyc_344268257011_ref9">9</xref>]–[<xref ref-type="bibr" rid="redalyc_344268257011_ref11">11</xref>]. Among the techniques that have been implemented to identify relevant metabolites, SVM-RFE has been reported to be the most robust [<xref ref-type="bibr" rid="redalyc_344268257011_ref6">6</xref>], [<xref ref-type="bibr" rid="redalyc_344268257011_ref10">10</xref>], [<xref ref-type="bibr" rid="redalyc_344268257011_ref11">11</xref>]. For these reasons, SVMs can be a useful method in the analysis of metabolomics data obtained from complex samples.</p>
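<p>For illustration only, the elimination loop described above can be sketched in a few lines of Python. This sketch is not the implementation used in this study: it assumes that features are ranked by the squared weight of a linear model, and it replaces a full SVM solver with a simple batch subgradient descent on the regularized hinge loss. The function names (<monospace>train_linear_svm</monospace>, <monospace>svm_rfe</monospace>), the hyperparameter values, and the toy data are hypothetical.</p>
<preformat preformat-type="code">
```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit weights w and bias b by batch subgradient descent on the
    L2-regularized hinge loss (a stand-in for a real SVM solver)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if 1.0 > margin:  # this sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_rfe(X, y):
    """Return feature indices ordered from most to least relevant."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > 1:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        # Assumed ranking criterion: squared weight of each feature.
        worst = min(range(len(remaining)), key=lambda p: w[p] ** 2)
        eliminated.append(remaining.pop(worst))
    eliminated.extend(remaining)
    return eliminated[::-1]


# Hypothetical toy data: two classes (+1 / -1), three features.
X = [[2.0, 0.5, 0.0], [2.0, 0.1, 0.0], [-2.0, -0.1, 0.0], [-2.0, -0.5, 0.0]]
y = [1, 1, -1, -1]
ranking = svm_rfe(X, y)
```
</preformat>
<p>In this toy example, the first feature separates the two classes, the second is only weakly related to the labels, and the third is constant, so the returned ranking places them in that order.</p>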
<p>On the other hand, organochlorines are a group of pesticides used to control pests [<xref ref-type="bibr" rid="redalyc_344268257011_ref12">12</xref>]. However, acute exposure to them can cause death, chronic exposure can cause serious diseases such as cancer, and there is no antidote [<xref ref-type="bibr" rid="redalyc_344268257011_ref13">13</xref>]. In addition, they can persist in the environment and enter the food chain. Chronic human exposure to organochlorines can go unnoticed until it is too late [<xref ref-type="bibr" rid="redalyc_344268257011_ref14">14</xref>]. Hence, new diagnostic methods should be developed, and potential biomarkers in humans should be identified. Metabolomics studies can help in this regard, which makes data analysis methods with good performance key to drawing reliable conclusions.</p>
<p>Thus, the aim of this study was to describe the discriminant ability of SVMs to handle samples from both <italic>in vitro</italic> and <italic>in vivo</italic> studies and compare their results with those obtained with PLS-DA. In addition, the capacity of SVMs to propose metabolites as candidate biomarkers in the context of organochlorine pesticide exposure was explored.</p>
</sec>
<sec>
<title>
<bold>2.     METHODS</bold>
</title>
<sec>
<title>
<bold>2.1.  Sample preparation - <italic>in vitro</italic> study</bold>
</title>
<p>A secondary dataset from a study published in 2016 [<xref ref-type="bibr" rid="redalyc_344268257011_ref15">15</xref>] was used here. In that study, HepG2 cell cultures were exposed to four different organochlorines (i.e., aldrin, DDT, endosulfan, and lindane) at concentrations below the cytotoxicity index 50 in order to establish which concentration would be sufficient to induce a metabolic response without causing cell destruction, while maintaining cell viability above 70 %. Additionally, a Dimethyl Sulfoxide (DMSO) control was included. Each exposure was repeated six times under the same cell passage to avoid genetic variation. The pesticide concentrations employed to assess cell viability were 5 µM, 10 µM, 25 µM, 50 µM, and 100 µM of endosulfan and lindane; 30 µM, 60 µM, 150 µM, 300 µM, and 600 µM of aldrin; and 2.5 µM, 5 µM, 10 µM, 25 µM, and 50 µM of DDT. The concentrations that achieved the desired results were 100 µM of endosulfan and lindane, 50 µM of DDT, and 150 µM of aldrin.</p>
<p>Subsequently, 36 samples of HepG2 cells were assigned to six treatments, with six samples per treatment: the four organochlorine solutions (i.e., 100 µM of endosulfan, 100 µM of lindane, 50 µM of DDT, and 150 µM of aldrin), a mixture treatment at equimolar concentration, and the controls with DMSO (1 % v/v). The cells were incubated for 24 hours with 5 % CO<sub>2</sub> at 37 °C. After this exposure period, cellular metabolism was inactivated, and endogenous metabolites were extracted adopting the quenching methodology previously published in [<xref ref-type="bibr" rid="redalyc_344268257011_ref15">15</xref>]. Then, the extracts were derivatized using methoxyamine hydrochloride and N-methyl-N-(trimethylsilyl) trifluoroacetamide (MSTFA) and analyzed via Gas Chromatography combined with Time-Of-Flight Mass Spectrometry (GC/TOF-MS) following the protocols established by the West Coast Metabolomics Center of the University of California, Davis [<xref ref-type="bibr" rid="redalyc_344268257011_ref16">16</xref>].</p>
<p>The information was processed as follows. First, the signals were automatically deconvolved using ChromaTOF software. Then, the data were extracted without smoothing, and peaks were detected at signal/noise ratios of 5:1 and a peak width of 3 s [<xref ref-type="bibr" rid="redalyc_344268257011_ref15">15</xref>]. Subsequently, the retention peak width was filtered and calculated by means of the BinBase algorithm [<xref ref-type="bibr" rid="redalyc_344268257011_ref17">17</xref>] and cross-checked with the Fiehn mass spectral library.</p>
<p>Finally, 1081 signals were deconvolved. Those with more than 30 % missing values were discarded, leaving 399 signals related to potential metabolites, out of which 153 were identified and 246 remained unidentified. The dataset obtained was composed of 6 classes (aldrin, DDT, endosulfan, lindane, mixture, and DMSO) and 153 features.</p>
</sec>
<sec>
<title>
<bold>2.2.  Sample preparation - <italic>in vivo</italic> study</bold>
</title>
<p>A secondary dataset from a study of agricultural workers exposed to different pesticides was used here. In that study, plasma samples were collected from 100 agricultural workers on coffee plantations. This process was led by the Laboratorio de Pesticidas of Universidad del Quindío (Colombia). In addition, a negative control group of thirty volunteers who had not been exposed to pesticides was included.</p>
<p>All the participants signed an informed consent form (previously approved by the ethical committee) to take part in the study. The inclusion criteria were being male, being aged 18 or older, and living in the Colombian Coffee Region.</p>
<p>A blood sample was taken from each participant and processed to obtain blood plasma. Each plasma sample was analyzed to evaluate the presence and concentration of pesticides using Gas Chromatography with Flame Ionization Detector (GC-FID). Out of the 100 cases, 27 were found to be below the detection limit and considered negative cases, while 73 were found to be above the detection limit and considered positive cases.</p>
<p>In the plasma of the 73 positive cases, the presence of six organochlorine pesticides (i.e., endosulfan, endrin, heptachlor, DDT, methoxychlor, and lindane) and chlorpyrifos (an organophosphorus pesticide) was identified. Furthermore, to assess the metabolic profile, the samples were processed and derivatized following the same protocol used for cell extracts [<xref ref-type="bibr" rid="redalyc_344268257011_ref16">16</xref>]. Then, they were analyzed using a GC/MS single quadrupole, thus obtaining 478 signals. The dataset obtained was composed of 8 classes (endosulfan, endrin, heptachlor, DDT, methoxychlor, lindane, chlorpyrifos, and control) and 478 features.</p>
</sec>
<sec>
<title>
<bold>2.3.  Statistical analysis</bold>
</title>
<p>PLS-DA and SVM-RFE were performed here to evaluate the metabolic profiles taken from the <italic>in vitro</italic> and <italic>in vivo</italic> studies. In the <italic>in vitro</italic> data, a subset with 153 metabolites was identified. Five groups were defined, one for each organochlorine: aldrin, DDT, endosulfan, lindane, and the mixture. Each group was compared with the control; hence, each test consisted of six experimental replicates.</p>
<p>Regarding the <italic>in vivo</italic> data, the plasma samples were classified into seven groups according to the pesticide found in them: 5 samples in endosulfan, 31 in endrin, 28 in heptachlor, 3 in DDT, 4 in methoxychlor, 35 in lindane, and 18 in chlorpyrifos. Each group was then compared with the negative control.</p>
<p>MetaboAnalyst 4.0 was used to perform PLS-DA [<xref ref-type="bibr" rid="redalyc_344268257011_ref18">18</xref>]. For this purpose, the data were normalized with logarithmic transformation and scaled using the Pareto method. Subsequently, PLS-DA was applied to each group. Its accuracy in predicting each metabolic profile was assessed with the k-fold cross-validation method for groups with at least ten samples, while the Leave-One-Out Cross-Validation (LOOCV) technique was employed for those with fewer than ten samples. Parameters R2 and Q2 were also measured.</p>
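<p>The normalization described above (logarithmic transformation followed by Pareto scaling) can be illustrated with a minimal Python sketch. It assumes a plain base-10 logarithm and strictly positive intensities; the transformation actually applied by MetaboAnalyst may differ, for example in how zeros and missing values are handled. The function name <monospace>log_pareto</monospace> and the toy matrix are hypothetical.</p>
<preformat preformat-type="code">
```python
import math


def log_pareto(matrix):
    """Log10-transform a samples-by-features matrix, then Pareto-scale
    each feature: mean-center and divide by sqrt(standard deviation)."""
    logged = [[math.log10(v) for v in row] for row in matrix]
    n = len(logged)
    scaled_cols = []
    for col in zip(*logged):
        mean = sum(col) / n
        # Sample standard deviation (n - 1 in the denominator).
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        # Constant features are only centered to avoid dividing by zero.
        denom = math.sqrt(sd) if sd > 0 else 1.0
        scaled_cols.append([(v - mean) / denom for v in col])
    return [list(row) for row in zip(*scaled_cols)]


# Hypothetical intensities: three samples, two features.
profiles = [[10.0, 100.0], [100.0, 100.0], [1000.0, 100.0]]
scaled = log_pareto(profiles)
```
</preformat>
<p>Compared with unit-variance scaling, dividing by the square root of the standard deviation reduces the dominance of high-intensity signals while keeping part of the original variance structure.</p>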
<p>The list of the ten most relevant metabolites was compiled using the Variable Importance in Projection (VIP) score [<xref ref-type="bibr" rid="redalyc_344268257011_ref19">19</xref>], attaching greater relevance to those with higher VIP values. The SVM method was implemented in R language [<xref ref-type="bibr" rid="redalyc_344268257011_ref20">20</xref>] using the RStudio platform [<xref ref-type="bibr" rid="redalyc_344268257011_ref21">21</xref>] and the e1071 library [<xref ref-type="bibr" rid="redalyc_344268257011_ref22">22</xref>]. Four kernels (linear, polynomial, sigmoid, and radial) were evaluated using four different margin penalties (1e<sup>100</sup>, 1e<sup>10</sup>, 1, and 1e<sup>-10</sup>).</p>
<p>Incremental training was carried out with 20 %, 40 %, 60 %, and 80 % of the samples in order to identify the lowest number of samples needed to achieve 100 % accuracy (measured by k-fold cross-validation). SVM training was conducted with both normalized and raw data. The kernel with the best performance and minimum sample size required for training was employed to implement the SVM-RFE algorithm, but, in this case, using 100 % of the available samples. The lists with the ten most relevant metabolites were obtained for each metabolic profile.</p>
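<p>The validation scheme used in this section (k-fold cross-validation for groups with at least ten samples and LOOCV otherwise) can be sketched in Python as follows. This is an illustrative sketch rather than the code used in the study: the number of folds (k = 5) is an assumption, since k is not specified here; the function name <monospace>cv_folds</monospace> is hypothetical; and no shuffling or stratification is applied.</p>
<preformat preformat-type="code">
```python
def cv_folds(n_samples, k=5, min_for_kfold=10):
    """Return a list of (train_indices, test_indices) pairs.

    Groups with at least min_for_kfold samples are split into k folds;
    smaller groups fall back to leave-one-out cross-validation.
    """
    if n_samples >= min_for_kfold:
        n_folds = k
    else:
        n_folds = n_samples  # LOOCV: one fold per sample
    indices = list(range(n_samples))
    folds = []
    for f in range(n_folds):
        test = indices[f::n_folds]  # every n_folds-th sample goes to fold f
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        folds.append((train, test))
    return folds


# Hypothetical group sizes: one below and one above the ten-sample threshold.
small_group = cv_folds(4)   # leave-one-out
large_group = cv_folds(31)  # 5-fold
```
</preformat>
<p>Each pair of index lists can then be used to train a classifier on the training samples and score it on the held-out ones; the mean score over all folds gives the cross-validated accuracy.</p>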
</sec>
<sec>
<title>
<bold>2.4.  Comparative analysis</bold>
</title>
<p>The two methods were compared based on the accuracy results of the k-fold cross-validation and the position and inclusion of metabolites in the lists obtained by both.</p>
</sec>
</sec>
<sec>
<title>
<bold>3.     RESULTS</bold>
</title>
<p>Data normalization with log transformation, Pareto scaling, and PLS-DA were performed using MetaboAnalyst 4.0. PLS-DA was conducted with normalized data.</p>
<p>Regarding PLS-DA, although accuracy was measured by k-fold cross-validation, it should be noted that MetaboAnalyst 4.0 requires a minimum of ten samples to apply that method. This criterion was not fulfilled by the DDT and methoxychlor samples in the <italic>in vivo</italic> study. In those cases, accuracy was validated using the LOOCV algorithm. Note that the results shown in <xref ref-type="fig" rid="gf1">Figure 1</xref> represent the first principal component (the component with the best score).</p>
<p>
<fig id="gf1">
<label>Figure 1.</label>
<caption>
<title>PLS-DA classification performance. Accuracy, R2, and Q2 values in PLS-DA are presented for each type of experiment (<italic>in vivo</italic> and <italic>in vitro</italic>). All the values of the pesticides were obtained by k-fold cross-validation, except for DDT in the <italic>in vivo</italic> experiment. Values are presented as percentages</title>
</caption>
<alt-text>Figure 1. PLS-DA classification performance. Accuracy, R2, and Q2 values in PLS-DA are presented for each type of experiment (in vivo and in vitro). All the values of the pesticides were obtained by k-fold cross-validation, except for DDT in the in vivo experiment. Values are presented as percentages</alt-text>
<graphic xlink:href="344268257011_gf2.png" position="anchor" orientation="portrait"/>
<attrib>Source: Created by the authors.</attrib>
</fig>
</p>
<p>From <xref ref-type="fig" rid="gf1">Figure 1</xref>, we observe that, when PLS-DA was implemented using the data from the <italic>in vitro</italic> study, R2 was above 95 %; and Q2, above 85 %.</p>
<p>However, when implemented using the data from the <italic>in vivo</italic> study, its accuracy decreased in those groups in which there were fewer samples. R2 fell to 63.4 % (endrin). In addition, Q2 was also affected; it fell to 50.05 % (lindane) and did not exceed 81.9 % (endosulfan).</p>
<p>SVMs were applied to both normalized and raw data. The best results were achieved with normalized data; they are shown in <xref ref-type="fig" rid="gf2">Figures 2</xref> and <xref ref-type="fig" rid="gf3">3</xref>. In both figures, SVM training was performed with 80 % of the data, and the remaining 20 % was used to test the SVM model obtained. Four kernels were evaluated in terms of SVM training: linear, polynomial, sigmoid, and radial. Four cost margin penalties were implemented in each kernel: 1e<sup>100</sup>, 1e<sup>10</sup>, 1, and 1e<sup>-10</sup>. The kernel and cost used in each model are specified in the figures. Bar sizes represent the prediction accuracy obtained by each SVM model on a scale between 0 % and 100 %.</p>
<p>
<fig id="gf2">
<label>Figure 2.</label>
<caption>
<title>Accuracy of the SVMs trained with the <italic>in vitro</italic> data using different types of kernel and margin penalties</title>
</caption>
<alt-text>Figure 2. Accuracy of the SVMs trained with the in vitro data using different types of kernel and margin penalties</alt-text>
<graphic xlink:href="344268257011_gf3.png" position="anchor" orientation="portrait"/>
<attrib>Source: Created by the authors.</attrib>
</fig>
</p>
<p>
<fig id="gf3">
<label>Figure 3.</label>
<caption>
<title>Accuracy of the SVMs trained with the <italic>in vivo</italic> data using different types of kernel and margin penalties</title>
</caption>
<alt-text>Figure 3. Accuracy of the SVMs trained with the in vivo data using different types of kernel and margin penalties</alt-text>
<graphic xlink:href="344268257011_gf4.png" position="anchor" orientation="portrait"/>
<attrib>Source: Created by the authors.</attrib>
</fig>
</p>
<p>With the normalized data from the <italic>in vitro</italic> study, all kernels except the polynomial one performed well, reaching 100 % accuracy when 80 % of the samples were used for training. With the normalized data from the <italic>in vivo</italic> study, there was a slight decrease in accuracy, especially in the groups with fewer than ten samples (DDT and methoxychlor). However, these groups achieved 100 % accuracy in some scenarios of the sigmoid kernel.</p>
<p>With respect to raw data, the best performance was achieved using 80 % of the samples for training, and the accuracy was between 90.63 % and 93.15 %. In addition to the linear kernel, the polynomial and radial kernels showed good performance. The polynomial kernel in particular yielded an accuracy of 93.1 % using a margin penalty of 1e<sup>-10</sup>. This result raises the possibility of overfitting.</p>
<p>In PLS-DA, the relevant features were identified by means of VIP scores, while, in SVM, the SVM-RFE technique was employed for such purpose. Both scenarios used normalized data. <xref ref-type="table" rid="gt1">Tables 1</xref> and <xref ref-type="table" rid="gt2">2</xref> show the features proposed by both methods for the <italic>in vitro</italic> study.</p>
<p>
<table-wrap id="gt1">
<label>Table 1</label>
<caption>
<title>Top ten metabolites obtained by PLS-DA</title>
</caption>
<alt-text>Table 1 Top ten metabolites obtained by PLS-DA</alt-text>
<alternatives>
<graphic xlink:href="344268257011_gt2.png" position="anchor" orientation="portrait"/>
<table style="width:468.25pt;border-collapse:collapse;border:none;" id="gt2-526564616c7963">
<tbody>
<tr style="   height:17.0pt">
<td style="width:92.0pt;border-top:solid windowtext 1.0pt;   border-left:none;border-bottom:solid windowtext 1.0pt;border-right:none;      padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">
<bold>Aldrin</bold>
</td>
<td style="width:98.55pt;border-top:solid windowtext 1.0pt;   border-left:none;border-bottom:solid windowtext 1.0pt;border-right:none;      padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">
<bold>DDT</bold>
</td>
<td style="border-top:solid windowtext 1.0pt;border-left:none;border-bottom:   solid windowtext 1.0pt;border-right:none;   padding:0cm 1.4pt 0cm 1.4pt;   height:17.0pt">
<bold>Endosulfan</bold>
</td>
<td style="width:103.9pt;border-top:solid windowtext 1.0pt;   border-left:none;border-bottom:solid windowtext 1.0pt;border-right:none;      padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">
<bold>Lindane</bold>
</td>
<td style="width:83.55pt;border-top:solid windowtext 1.0pt;   border-left:none;border-bottom:solid windowtext 1.0pt;border-right:none;      padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">
<bold>Mixture</bold>
</td>
</tr>
<tr style="height:17.0pt">
<td style="width:92.0pt;border:none;   padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Phosphoethanolamine</td>
<td style="width:98.55pt;border:none;   padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Phosphoethanolamine</td>
<td style="border:none;padding:0cm 1.4pt 0cm 1.4pt;   height:17.0pt">N-acetyl aspartate</td>
<td style="width:103.9pt;border:none;   padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Taurine</td>
<td style="width:83.55pt;border:none;   padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Glucose-6-phosphate</td>
</tr>
<tr style="height:17.0pt">
<td style="width:92.0pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Phosphogluconic acid</td>
<td style="width:98.55pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Cytidine-5'-monophosphate</td>
<td style="padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Citric acid</td>
<td style="width:103.9pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Phosphogluconic acid</td>
<td style="width:83.55pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Phosphoethanolamine</td>
</tr>
<tr style="height:17.0pt">
<td style="width:92.0pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Cytosine</td>
<td style="width:98.55pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Phosphogluconic acid</td>
<td style="padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Taurine</td>
<td style="width:103.9pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Gluconic acid</td>
<td style="width:83.55pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Citric acid</td>
</tr>
<tr style="height:17.0pt">
<td style="width:92.0pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Cysteine</td>
<td style="width:98.55pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Glutathione</td>
<td style="padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Glucose-6-phosphate</td>
<td style="width:103.9pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Alpha-ketoglutarate</td>
<td style="width:83.55pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Isocitric acid</td>
</tr>
<tr style="height:17.0pt">
<td style="width:92.0pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Gluconic acid</td>
<td style="width:98.55pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">2,5-dihydroxypyrazine</td>
<td style="padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Phosphogluconic acid</td>
<td style="width:103.9pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">N-acetyl mannosamine</td>
<td style="width:83.55pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Phosphogluconic acid</td>
</tr>
<tr style="height:17.0pt">
<td style="width:92.0pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Ribulose-5-phosphate</td>
<td style="width:98.55pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">5'-deoxy-5'-methylthioadenosine</td>
<td style="padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Phosphoethanolamine</td>
<td style="width:103.9pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Glutaric acid</td>
<td style="width:83.55pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Ribose</td>
</tr>
<tr style="height:17.0pt">
<td style="width:92.0pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Hypoxanthine</td>
<td style="width:98.55pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Cytosine</td>
<td style="padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Alpha-ketoglutarate</td>
<td style="width:103.9pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">2,5-dihydroxypyrazine</td>
<td style="width:83.55pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Aspartic acid</td>
</tr>
<tr style="height:17.0pt">
<td style="width:92.0pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Alpha-ketoglutarate</td>
<td style="width:98.55pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Gluconic acid</td>
<td style="padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Isocitric acid</td>
<td style="width:103.9pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Alpha-aminoadipic acid</td>
<td style="width:83.55pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Hypoxanthine</td>
</tr>
<tr style="height:17.0pt">
<td style="width:92.0pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Fructose-1-phosphate</td>
<td style="width:98.55pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Hexitol</td>
<td style="padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Hexose-6-phosphate</td>
<td style="width:103.9pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Adenine</td>
<td style="width:83.55pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Hexose-6-phosphate</td>
</tr>
<tr style="height:17.0pt">
<td style="width:92.0pt;border:none;border-bottom:solid windowtext 1.0pt;   padding:0cm 1.4pt 0cm 1.4pt;   height:17.0pt">Aspartic acid</td>
<td style="width:98.55pt;border:none;border-bottom:solid windowtext 1.0pt;   padding:0cm 1.4pt 0cm 1.4pt;   height:17.0pt">Sulfuric acid</td>
<td style="border:none;border-bottom:solid windowtext 1.0pt;padding:0cm 1.4pt 0cm 1.4pt;height:17.0pt">Cysteine</td>
<td style="width:103.9pt;border:none;border-bottom:solid windowtext 1.0pt;   padding:0cm 1.4pt 0cm 1.4pt;   height:17.0pt">Ribulose-5-phosphate</td>
<td style="width:83.55pt;border:none;border-bottom:solid windowtext 1.0pt;   padding:0cm 1.4pt 0cm 1.4pt;   height:17.0pt">Alpha-ketoglutarate</td>
</tr>
</tbody>
</table>
</alternatives>
<attrib>Source: Created by the authors.</attrib>
</table-wrap>
</p>
<p>
<table-wrap id="gt2">
<label>Table 2</label>
<caption>
<title>Top ten metabolites obtained by SVM-RFE</title>
</caption>
<alt-text>Table 2 Top ten metabolites obtained by SVM-RFE</alt-text>
<alternatives>
<graphic xlink:href="344268257011_gt3.png" position="anchor" orientation="portrait"/>
<table style="width:469.8pt;border-collapse:collapse;border:none;" id="gt3-526564616c7963">
<tbody>
<tr style="   height:20.05pt">
<td style="width:77.75pt;border-top:solid windowtext 1.0pt;   border-left:none;border-bottom:solid windowtext 1.0pt;border-right:none;      padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">
<bold>Aldrin</bold>
</td>
<td style="width:121.45pt;border-top:solid windowtext 1.0pt;   border-left:none;border-bottom:solid windowtext 1.0pt;border-right:none;      padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">
<bold>DDT</bold>
</td>
<td style="width:80.35pt;border-top:solid windowtext 1.0pt;   border-left:none;border-bottom:solid windowtext 1.0pt;border-right:none;      padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">
<bold>Endosulfan</bold>
</td>
<td style="width:90.65pt;border-top:solid windowtext 1.0pt;   border-left:none;border-bottom:solid windowtext 1.0pt;border-right:none;      padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">
<bold>Lindane</bold>
</td>
<td style="width:99.6pt;border-top:solid windowtext 1.0pt;   border-left:none;border-bottom:solid windowtext 1.0pt;border-right:none;      padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">
<bold>Mixture</bold>
</td>
</tr>
<tr style="height:20.05pt">
<td style="width:77.75pt;border:none;   padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Phosphogluconic acid</td>
<td style="width:121.45pt;border:none;   padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Phosphoethanolamine</td>
<td style="width:80.35pt;border:none;   padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Citric acid</td>
<td style="width:90.65pt;border:none;   padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Alpha ketoglutarate</td>
<td style="width:99.6pt;border:none;   padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Glucose 6 phosphate</td>
</tr>
<tr style="height:20.05pt">
<td style="width:77.75pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Alpha aminoadipic acid</td>
<td style="width:121.45pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Cytosine</td>
<td style="width:80.35pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Phosphogluconic acid</td>
<td style="width:90.65pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Phosphogluconic acid</td>
<td style="width:99.6pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Phosphogluconic acid</td>
</tr>
<tr style="height:20.05pt">
<td style="width:77.75pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Phosphoethanolamine</td>
<td style="width:121.45pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">2,5 dihydroxypyrazine</td>
<td style="width:80.35pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Isocitric acid</td>
<td style="width:90.65pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Taurine</td>
<td style="width:99.6pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Phosphoethanolamine</td>
</tr>
<tr style="height:20.05pt">
<td style="width:77.75pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Cysteine</td>
<td style="width:121.45pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Alpha aminoadipic acid</td>
<td style="width:80.35pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Alpha ketoglutarate</td>
<td style="width:90.65pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">N acetylmannosamine</td>
<td style="width:99.6pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Citric acid</td>
</tr>
<tr style="height:20.05pt">
<td style="width:77.75pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Gluconic acid</td>
<td style="width:121.45pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Phosphogluconic acid</td>
<td style="width:80.35pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Hexose 6 phosphate</td>
<td style="width:90.65pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Glycerol alpha phosphate</td>
<td style="width:99.6pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Ribose</td>
</tr>
<tr style="height:20.05pt">
<td style="width:77.75pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Cytosine</td>
<td style="width:121.45pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Gluconic acid</td>
<td style="width:80.35pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Aspartic acid</td>
<td style="width:90.65pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Gluconic acid</td>
<td style="width:99.6pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Cytidine 5 monophosphate</td>
</tr>
<tr style="height:20.05pt">
<td style="width:77.75pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Hypoxanthine</td>
<td style="width:121.45pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Aspartic acid</td>
<td style="width:80.35pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">N acetylaspartate</td>
<td style="width:90.65pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Xylitol</td>
<td style="width:99.6pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Isocitric acid</td>
</tr>
<tr style="height:20.05pt">
<td style="width:77.75pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Alpha ketoglutarate</td>
<td style="width:121.45pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Cytidine 5 monophosphate</td>
<td style="width:80.35pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Hypoxanthine</td>
<td style="width:90.65pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Creatinine</td>
<td style="width:99.6pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Hypoxanthine</td>
</tr>
<tr style="height:20.05pt">
<td style="width:77.75pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">3 phosphoglycerate</td>
<td style="width:121.45pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Ribulose 5 phosphate</td>
<td style="width:80.35pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">3 phosphoglycerate</td>
<td style="width:90.65pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">2,5 dihydroxypyrazine</td>
<td style="width:99.6pt;padding:0cm 1.4pt 0cm 1.4pt;height:20.05pt">Alpha ketoglutarate</td>
</tr>
<tr style="height:20.05pt">
<td style="width:77.75pt;border:none;border-bottom:solid windowtext 1.0pt;   padding:0cm 1.4pt 0cm 1.4pt;   height:20.05pt">Malic acid</td>
<td style="width:121.45pt;border:none;border-bottom:solid windowtext 1.0pt;   padding:0cm 1.4pt 0cm 1.4pt;   height:20.05pt">Glutathione</td>
<td style="width:80.35pt;border:none;border-bottom:solid windowtext 1.0pt;   padding:0cm 1.4pt 0cm 1.4pt;   height:20.05pt">Aconitic acid</td>
<td style="width:90.65pt;border:none;border-bottom:solid windowtext 1.0pt;   padding:0cm 1.4pt 0cm 1.4pt;   height:20.05pt">Asparagine</td>
<td style="width:99.6pt;border:none;border-bottom:solid windowtext 1.0pt;   padding:0cm 1.4pt 0cm 1.4pt;   height:20.05pt">Cysteine</td>
</tr>
</tbody>
</table>
</alternatives>
<attrib>Source: Created by the authors.</attrib>
</table-wrap>
</p>
<p>The results of the <italic>in vivo</italic> study are not reported because no identification of the compounds was performed in that case. However, the percentage of coincidence between the two methods (i.e., PLS-DA and SVM-RFE) in the two studies was calculated here (<xref ref-type="table" rid="gt3">Table 3</xref>).</p>
<p>
<table-wrap id="gt3">
<label>Table 3</label>
<caption>
<title>Comparison of top ten metabolites. Results in percentage</title>
</caption>
<alt-text>Table 3 Comparison of top ten metabolites. Results in percentage</alt-text>
<alternatives>
<graphic xlink:href="344268257011_gt4.png" position="anchor" orientation="portrait"/>
<table style="width:457.7pt;border-collapse:collapse;border:none;" id="gt4-526564616c7963">
<tbody>
<tr style="   height:14.2pt">
<td style="border:none;border-top:solid #7F7F7F 1.0pt;   padding:0cm 5.4pt 0cm 5.4pt;   height:14.2pt" colspan="3">Data from the <italic>in vitro</italic> study</td>
</tr>
<tr style="height:14.2pt">
<td style="border:none;border-bottom:solid windowtext 1.0pt;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">Pesticide</td>
<td style="border:none;border-bottom:solid windowtext 1.0pt;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">Included in the top ten</td>
<td style="border:none;border-bottom:solid windowtext 1.0pt;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">Same position in the top ten</td>
</tr>
<tr style="height:14.2pt">
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;   height:14.2pt">Aldrin</td>
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;   height:14.2pt">70.00</td>
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;   height:14.2pt">40.00</td>
</tr>
<tr style="height:14.2pt">
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">DDT</td>
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">60.00</td>
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">10.00</td>
</tr>
<tr style="height:14.2pt">
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">Endosulfan</td>
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">60.00</td>
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">0.00</td>
</tr>
<tr style="height:14.2pt">
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">Lindane</td>
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">60.00</td>
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">10.00</td>
</tr>
<tr style="height:14.2pt">
<td style="border:none;border-bottom:solid windowtext 1.0pt;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">Mixture</td>
<td style="border:none;border-bottom:solid windowtext 1.0pt;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">80.00</td>
<td style="border:none;border-bottom:solid windowtext 1.0pt;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">20.00</td>
</tr>
<tr style="height:14.2pt">
<td style="border:none;border-bottom:solid windowtext 1.0pt;      padding:0cm 5.4pt 0cm 5.4pt;   height:14.2pt" colspan="3">Data from the <italic>in vivo</italic> study</td>
</tr>
<tr style="height:14.2pt">
<td style="border:none;border-bottom:solid windowtext 1.0pt;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">Pesticide</td>
<td style="border:none;border-bottom:solid windowtext 1.0pt;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">Included in the top ten</td>
<td style="border:none;border-bottom:solid windowtext 1.0pt;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">Same position in the top ten</td>
</tr>
<tr style="height:14.2pt">
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;   height:14.2pt">Chlorpyrifos</td>
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;   height:14.2pt">50.00</td>
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;   height:14.2pt">20.00</td>
</tr>
<tr style="height:14.2pt">
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">DDT</td>
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">30.00</td>
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">0.00</td>
</tr>
<tr style="height:14.2pt">
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">Endosulfan</td>
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">50.00</td>
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">0.00</td>
</tr>
<tr style="height:14.2pt">
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">Endrin</td>
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">70.00</td>
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">0.00</td>
</tr>
<tr style="height:14.2pt">
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">Heptachlor</td>
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">80.00</td>
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">20.00</td>
</tr>
<tr style="height:14.2pt">
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">Lindane</td>
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">80.00</td>
<td style="border:none;padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">10.00</td>
</tr>
<tr style="height:14.2pt">
<td style="border:none;border-bottom:solid #7F7F7F 1.0pt;      padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">Methoxychlor</td>
<td style="border:none;border-bottom:solid #7F7F7F 1.0pt;      padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">60.00</td>
<td style="border:none;border-bottom:solid #7F7F7F 1.0pt;      padding:0cm 5.4pt 0cm 5.4pt;height:14.2pt">10.00</td>
</tr>
</tbody>
</table>
</alternatives>
<attrib>Source: Created by the authors.</attrib>
</table-wrap>
</p>
<p>The comparative analysis reveals that, in the <italic>in vitro</italic> study, aldrin and the mixture showed the highest coincidence between the relevant metabolites identified by the two methods (70 % and 80 %, respectively). This picture changes in the <italic>in vivo</italic> study, where heptachlor and lindane exhibited the highest number of coincidences (80 % each).</p>
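<p>The percentages in <xref ref-type="table" rid="gt3">Table 3</xref> can be reproduced with a few lines of code. The following sketch is illustrative only and is not the procedure used by the authors. It assumes that "included in the top ten" is the share of metabolites common to both lists and that "same position" is the share of ranks occupied by the same metabolite. The function names are hypothetical, and the two lists are the first columns of Tables 1 and 2, taken here to be the aldrin rankings from PLS-DA and SVM-RFE, with hyphens treated as spaces so that spelling variants match.</p>
<code language="python">
```python
def normalize(name):
    """Lower-case a metabolite name and treat hyphens as spaces."""
    return " ".join(name.lower().replace("-", " ").split())


def coincidence(ranking_a, ranking_b):
    """Return (shared %, same-position %) for two equally long rankings."""
    a = [normalize(n) for n in ranking_a]
    b = [normalize(n) for n in ranking_b]
    shared = len(set(a).intersection(b))
    same_position = sum(x == y for x, y in zip(a, b))
    return 100.0 * shared / len(a), 100.0 * same_position / len(a)


# First column of Table 1 (assumed: aldrin, PLS-DA), in rank order.
pls_da_aldrin = [
    "Phosphoethanolamine", "Phosphogluconic acid", "Cytosine", "Cysteine",
    "Gluconic acid", "Ribulose-5-phosphate", "Hypoxanthine",
    "Alpha ketoglutarate", "Fructose 1 phosphate", "Aspartic acid",
]
# First column of Table 2 (aldrin, SVM-RFE), in rank order.
svm_rfe_aldrin = [
    "Phosphogluconic acid", "Alpha aminoadipic acid", "Phosphoethanolamine",
    "Cysteine", "Gluconic acid", "Cytosine", "Hypoxanthine",
    "Alpha ketoglutarate", "3 phosphoglycerate", "Malic acid",
]

included, same_pos = coincidence(pls_da_aldrin, svm_rfe_aldrin)
print(included, same_pos)  # prints 70.0 40.0
```
</code>
<p>Under these assumptions, the sketch yields 70 % and 40 %, which agrees with the aldrin row of the <italic>in vitro</italic> data.</p>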
</sec>
<sec>
<title>
<bold>4.     DISCUSSION</bold>
</title>
<p>In this study, PLS-DA proved to be a good method for analyzing data from <italic>in vitro</italic> studies, as its R<sup>2</sup> and Q<sup>2</sup> values were close to ideal. However, its accuracy decreased for data from <italic>in vivo</italic> studies in scenarios with few samples. Conversely, SVMs achieved 100 % accuracy in all the scenarios (<italic>in vitro</italic> and <italic>in vivo</italic>), although this required testing the performance of the different kernels. The linear and sigmoid kernels performed well with margin penalties of 1e100, 1e10, and 1, whereas the radial and polynomial kernels performed poorly.</p>
<p>Accordingly, the accuracy of both PLS-DA and SVMs can be affected by conditions such as high variability and few samples, like those found in <italic>in vivo</italic> studies. Nonetheless, it is possible to identify the kernels that perform best on data from <italic>in vivo</italic> studies and use them in SVMs, thus allowing a better classification. Another advantage of SVMs is that they can achieve an accuracy of 100 % with fewer training samples: in this study, SVMs were trained on 80 % of the samples, while PLS-DA required all of them.</p>
<p>Furthermore, a comparison of the ten most relevant features selected by each method (SVM-RFE and PLS-DA) for each profile shows that the two methods largely agreed in the <italic>in vitro</italic> study (coincidences equal to or greater than 60 %), whereas the results were heterogeneous in the <italic>in vivo</italic> study. The group with the fewest coincidences was DDT in the <italic>in vivo</italic> study (30 %), which raises the question of whether the number of samples could have influenced these results.</p>
<p>This study identified an improvement in the predictive performance of SVMs over PLS-DA in the analysis of data from <italic>in vivo</italic> experiments, as previously described by Mahadevan <italic>et al.</italic> [<xref ref-type="bibr" rid="redalyc_344268257011_ref3">3</xref>] and Mendez <italic>et al.</italic> [<xref ref-type="bibr" rid="redalyc_344268257011_ref7">7</xref>]. Although Gromski <italic>et al.</italic> [<xref ref-type="bibr" rid="redalyc_344268257011_ref8">8</xref>] reported some shortcomings of SVMs in dealing with missing values and assessing the importance of compounds, we consider that these problems could be overcome by implementing SVM-RFE. Gromski <italic>et al.</italic> also reported problems in visualizing, interpreting, reducing dimensions, and selecting parameters, which could be addressed with an appropriate kernel selection.</p>
<p>In this study, the linear and sigmoid kernels showed the best performance. Although the radial kernel did not perform adequately here, it is one of the most widely employed [<xref ref-type="bibr" rid="redalyc_344268257011_ref23">23</xref>], [<xref ref-type="bibr" rid="redalyc_344268257011_ref24">24</xref>]. In some studies, it has even outperformed other popular predictors such as Naive Bayes, linear discriminant analysis, and quadratic discriminant analysis [<xref ref-type="bibr" rid="redalyc_344268257011_ref25">25</xref>]. The present results may be explained by the fact that the data had previously been normalized; the effect of data normalization on kernel performance has been analyzed by Wan <italic>et al.</italic> [<xref ref-type="bibr" rid="redalyc_344268257011_ref26">26</xref>].</p>
<p>Although this study focused on the classic SVM kernels, new kernels have been proposed, such as the Hermite orthogonal polynomial kernel, which makes it possible to use fewer support vectors for classification and has been reported to achieve better error-rate performance [<xref ref-type="bibr" rid="redalyc_344268257011_ref27">27</xref>]. Another new kernel is the weighted variable kernel, whose implementation in SVMs has been reported to outperform classification methods such as RF [<xref ref-type="bibr" rid="redalyc_344268257011_ref28">28</xref>]. Other SVM-based techniques, such as least-squares SVM, have been proposed for medical image analysis [<xref ref-type="bibr" rid="redalyc_344268257011_ref29">29</xref>], [<xref ref-type="bibr" rid="redalyc_344268257011_ref30">30</xref>]. These approaches could be evaluated for use in metabolomics.</p>
<p>SVM-RFE was employed here to select a list of relevant features. For this purpose, we suggest first implementing SVMs with a kernel that has an optimal margin penalty and only then applying SVM-RFE. In this study, the linear and sigmoid kernels presented a margin penalty that was optimal for most scenarios. Nevertheless, for scenarios with few samples, such as DDT, the sigmoid kernel was the only one that showed optimal performance. Although there were enough samples for data comparison in the <italic>in vitro</italic> study, some scenarios in the <italic>in vivo</italic> study, such as DDT and methoxychlor, had few samples, which entailed a risk of overfitting in both methods.</p>
<p>Furthermore, it should be noted that, in the <italic>in vivo</italic> study, among the 73 cases with proven pesticide exposure, some agricultural workers had been exposed to more than one pesticide, which could have influenced the metabolic profiles and, hence, the performance of each technique.</p>
<p>Although we identified 153 metabolites from the spectrometry signals obtained in the <italic>in vitro</italic> experiment, no such identification was performed for the <italic>in vivo</italic> study. It remains to be done in order to define the biological impact in each scenario.</p>
<p>In addition to SVM-RFE, another strategy that has been proposed to identify relevant features is multiclass SVM using L1-norm [<xref ref-type="bibr" rid="redalyc_344268257011_ref10">10</xref>] and L2-norm, the latter exhibiting greater stability [<xref ref-type="bibr" rid="redalyc_344268257011_ref31">31</xref>]. Thus, it may be interesting to explore these options in future studies.</p>
<p>In summary, according to the findings of this work and those of the other studies mentioned here, SVMs are robust methods suitable for data derived from <italic>in vivo</italic> experiments and exhibit good classification performance even with few samples. SVMs also offer advantages in terms of handling outliers, predictive power, and resistance to overfitting. However, their performance depends on the hyperparameters and kernels used.</p>
<p>Therefore, to make the most of the analysis with SVM-RFE and the “kernel trick”, we recommend first evaluating each kernel under the different margin penalty scenarios. Performance should also be evaluated as a function of the percentage of samples used for training in order to avoid overfitting. In this study, 80 % of the samples were needed for most scenarios, although this may vary depending on the number of features and samples available. The last step is to implement SVM-RFE with the best kernel identified.</p>
<p>Finally, it is necessary to clarify that the results obtained from either method should be validated in future biological experiments to determine the biological impact of exposure to pesticides.</p>
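<p>The recommended sequence (kernel and margin penalty screening, an 80 % training split, then SVM-RFE) can be sketched as follows. This is a minimal illustration under several assumptions and not the pipeline used in this study: it relies on scikit-learn, which is our choice for the example; the data are synthetic; the penalty values are small illustrative ones rather than those tested here; and all variable names are hypothetical. Because the <monospace>RFE</monospace> class in scikit-learn ranks features by the weights of a fitted model, the elimination step in this sketch is restricted to the linear kernel.</p>
<code language="python">
```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for a metabolite intensity matrix (samples x features).
X, y = make_classification(
    n_samples=80, n_features=40, n_informative=8, random_state=0
)

# Hold out 20 % of the samples; the remaining 80 % are used for training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

kernels = ["linear", "sigmoid", "rbf", "poly"]
penalties = [0.1, 1.0, 10.0]  # illustrative margin penalties (C), assumed

# Step 1: cross-validated accuracy for every kernel / penalty pair.
cv_accuracy = {}
for kernel in kernels:
    for c in penalties:
        scores = cross_val_score(SVC(kernel=kernel, C=c), X_train, y_train, cv=5)
        cv_accuracy[(kernel, c)] = scores.mean()

best_kernel, best_c = max(cv_accuracy, key=cv_accuracy.get)

# Step 2: SVM-RFE. RFE needs per-feature weights (coef_), which SVC exposes
# only for the linear kernel, so the best penalty found for it is used.
linear_c = max(penalties, key=lambda c: cv_accuracy[("linear", c)])
selector = RFE(SVC(kernel="linear", C=linear_c), n_features_to_select=10, step=1)
selector.fit(X_train, y_train)

# Indices of the ten retained features and accuracy on the held-out 20 %.
top_features = [i for i, keep in enumerate(selector.support_) if keep]
test_accuracy = selector.score(X_test, y_test)
print(best_kernel, best_c, top_features, round(test_accuracy, 2))
```
</code>
<p>With real data, the screening step would be repeated for each exposure scenario, and the training percentage could be varied in the same loop to check how sensitive the accuracy is to the number of samples.</p>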
</sec>
<sec>
<title>
<bold>5.     CONCLUSIONS</bold>
</title>
<p>In this study, SVMs and PLS-DA proved to be appropriate methods for analyzing data from <italic>in vitro</italic> studies with controlled conditions, but PLS-DA presented difficulties with data from <italic>in vivo</italic> studies (non-controlled conditions and non-linear data) in the context of organochlorine exposure.</p>
<p>Regarding class prediction in data from <italic>in vivo</italic> studies, SVMs exhibited greater predictive power than PLS-DA. Moreover, the best-performing kernel identified by SVM analysis can be used in SVM-RFE to obtain an adequate list of the most relevant features in the context of pesticide exposure. Additionally, the computational cost of SVMs is low.</p>
<p>SVM-RFE is becoming a useful tool for biomarker identification, even when there are few samples. In addition, it is considered a robust method to analyze data derived from <italic>in vivo</italic> and <italic>in vitro</italic> studies.</p>
</sec>
</body>
<back>
<ack>
<title>Acknowledgments</title>
<p>The authors wish to thank the B2SLab of the Universitat Politècnica de Catalunya for providing the equipment to conduct the analyses, the Instituto Tecnológico Metropolitano (ITM) for funding the exchange program that led to the results of this study, and ITM Language Center for proofreading the manuscript.</p>
</ack>
<ref-list>
<title>REFERENCES</title>
<ref id="redalyc_344268257011_ref1">
<mixed-citation>[1] J. C. Lindon, J. K. Nicholson, E. Holmes, <italic>The Handbook of Metabonomics and Metabolomics</italic>. Elsevier, 2007.</mixed-citation>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Lindon</surname>
<given-names>J. C.</given-names>
</name>
<name>
<surname>Nicholson</surname>
<given-names>J. K.</given-names>
</name>
<name>
<surname>Holmes</surname>
<given-names>E.</given-names>
</name>
</person-group>
<source>The Handbook of Metabonomics and Metabolomics</source>
<year>2007</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref2">
<mixed-citation>[2] E. C. Horning, M. G. Horning, “Human Metabolic Profiles Obtained by GC and GC/MS,”<italic> J. Chromatogr. Sci.</italic>, vol. 9, no. 3, pp. 129–140, Mar. 1971. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1093/chromsci/9.3.129">https://doi.org/10.1093/chromsci/9.3.129</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Horning</surname>
<given-names>E. C.</given-names>
</name>
<name>
<surname>Horning</surname>
<given-names>M. G.</given-names>
</name>
</person-group>
<article-title>Human Metabolic Profiles Obtained by GC and GC/MS</article-title>
<source>J. Chromatogr. Sci.</source>
<year>1971</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref3">
<mixed-citation>[3] S. Mahadevan, S. L. Shah, T. J. Marrie, C. M. Slupsky, “Analysis of Metabolomic Data Using Support Vector Machines,” <italic>Anal. Chem</italic>., vol. 80, no. 19, pp. 7562–7570, Sep. 2008. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1021/ac800954c">https://doi.org/10.1021/ac800954c</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mahadevan</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Shah</surname>
<given-names>S. L.</given-names>
</name>
<name>
<surname>Marrie</surname>
<given-names>T. J.</given-names>
</name>
<name>
<surname>Slupsky</surname>
<given-names>C. M.</given-names>
</name>
</person-group>
<article-title>Analysis of Metabolomic Data Using Support Vector Machines</article-title>
<source>Anal. Chem</source>
<year>2008</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref4">
<mixed-citation>[4] C. Cortes, V. Vapnik, “Support-vector networks,”<italic> Mach. Learn</italic>., vol. 20, no. 3, pp. 273–297, Sep. 1995. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/BF00994018">https://doi.org/10.1007/BF00994018</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cortes</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Vapnik</surname>
<given-names>V.</given-names>
</name>
</person-group>
<article-title>Support-vector networks</article-title>
<source>Mach. Learn</source>
<year>1995</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref5">
<mixed-citation>[5] A. Alonso, S. Marsal, A. Julià, “Analytical Methods in Untargeted Metabolomics: State of the Art in 2015,” <italic>Front. Bioeng. Biotechnol</italic>., vol. 3, p. 23, Mar. 2015. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3389/fbioe.2015.00023">https://doi.org/10.3389/fbioe.2015.00023</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Alonso</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Marsal</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Julià</surname>
<given-names>A.</given-names>
</name>
</person-group>
<article-title>Analytical Methods in Untargeted Metabolomics: State of the Art in 2015</article-title>
<source>Front. Bioeng. Biotechnol</source>
<year>2015</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref6">
<mixed-citation>[6] J. Heinemann, A. Mazurie, M. Tokmina-Lukaszewska, G. J. Beilman, B. Bothner, “Application of support vector machines to metabolomics experiments with limited replicates,” <italic>Metabolomics</italic>, vol. 10, no. 6, pp. 1121–1128, Dec. 2014. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/s11306-014-0651-0">https://doi.org/10.1007/s11306-014-0651-0</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Heinemann</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Mazurie</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Tokmina-Lukaszewska</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Beilman</surname>
<given-names>G. J.</given-names>
</name>
<name>
<surname>Bothner</surname>
<given-names>B.</given-names>
</name>
</person-group>
<article-title>Application of support vector machines to metabolomics experiments with limited replicates</article-title>
<source>Metabolomics</source>
<year>2014</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref7">
<mixed-citation>[7] K. M. Mendez, S. N. Reinke, D. I. Broadhurst, “A comparative evaluation of the generalised predictive ability of eight machine learning algorithms across ten clinical metabolomics data sets for binary classification,” <italic>Metabolomics</italic>, vol. 15, no. 12, p. 150, Nov. 2019. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/s11306-019-1612-4">https://doi.org/10.1007/s11306-019-1612-4</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mendez</surname>
<given-names>K. M.</given-names>
</name>
<name>
<surname>Reinke</surname>
<given-names>S. N.</given-names>
</name>
<name>
<surname>Broadhurst</surname>
<given-names>D. I.</given-names>
</name>
</person-group>
<article-title>A comparative evaluation of the generalised predictive ability of eight machine learning algorithms across ten clinical metabolomics data sets for binary classification</article-title>
<source>Metabolomics</source>
<year>2019</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref8">
<mixed-citation>[8] P. S. Gromski et al., “A tutorial review: Metabolomics and partial least squares-discriminant analysis – a marriage of convenience or a shotgun wedding,” <italic>Anal. Chim. Acta</italic>, vol. 879, pp. 10–23, Jun. 2015. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.aca.2015.02.012">https://doi.org/10.1016/j.aca.2015.02.012</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gromski</surname>
<given-names>P. S.</given-names>
</name>
</person-group>
<article-title>A tutorial review: Metabolomics and partial least squares-discriminant analysis – a marriage of convenience or a shotgun wedding</article-title>
<source>Anal. Chim. Acta</source>
<year>2015</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref9">
<mixed-citation>[9] I. Guyon, J. Weston, S. Barnhill, V. Vapnik, “Gene selection for cancer classification using support vector machines,” <italic>Mach. Learn</italic>., vol. 46, no. 1, pp. 389–422, Jan. 2002. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1023/A:1012487302797">https://doi.org/10.1023/A:1012487302797</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Guyon</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Weston</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Barnhill</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Vapnik</surname>
<given-names>V.</given-names>
</name>
</person-group>
<article-title>Gene selection for cancer classification using support vector machines</article-title>
<source>Mach. Learn</source>
<year>2002</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref10">
<mixed-citation>[10] W. Guan et al., “Ovarian cancer detection from metabolomic liquid chromatography/mass spectrometry data by support vector machines,” <italic>BMC Bioinformatics</italic>, vol. 10, no. 259, Aug. 2009. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1186/1471-2105-10-259">https://doi.org/10.1186/1471-2105-10-259</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Guan</surname>
<given-names>W.</given-names>
</name>
</person-group>
<article-title>Ovarian cancer detection from metabolomic liquid chromatography/mass spectrometry data by support vector machines</article-title>
<source>BMC Bioinformatics</source>
<year>2009</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref11">
<mixed-citation>[11] X. Lin et al., “A support vector machine-recursive feature elimination feature selection method based on artificial contrast variables and mutual information,” <italic>J. Chromatogr. B</italic>, vol. 910, pp. 149–155, Dec. 2012. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.jchromb.2012.05.020">https://doi.org/10.1016/j.jchromb.2012.05.020</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lin</surname>
<given-names>X.</given-names>
</name>
</person-group>
<article-title>A support vector machine-recursive feature elimination feature selection method based on artificial contrast variables and mutual information</article-title>
<source>J. Chromatogr. B</source>
<year>2012</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref12">
<mixed-citation>[12] M. Abdollahi, A. Ranjbar, S. Shadnia, S. Nikfar, A. Rezaiee, “Pesticides and oxidative stress: a review,” <italic>Med. Sci. Monit</italic>., vol. 10, no. 6, Jun. 2004. <ext-link ext-link-type="uri" xlink:href="https://pubmed.ncbi.nlm.nih.gov/15173684/">https://pubmed.ncbi.nlm.nih.gov/15173684/</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Abdollahi</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Ranjbar</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Shadnia</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Nikfar</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Rezaiee</surname>
<given-names>A.</given-names>
</name>
</person-group>
<article-title>Pesticides and oxidative stress: a review</article-title>
<source>Med. Sci. Monit</source>
<year>2004</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref13">
<mixed-citation>[13] V. Moses, J. V. Peter, “Acute intentional toxicity: endosulfan and other organochlorines,” <italic>Clin. Toxicol</italic>., vol. 48, no. 6, pp. 539–544, Jul. 2010. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3109/15563650.2010.494610">https://doi.org/10.3109/15563650.2010.494610</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Moses</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Peter</surname>
<given-names>J. V.</given-names>
</name>
</person-group>
<article-title>Acute intentional toxicity: endosulfan and other organochlorines</article-title>
<source>Clin. Toxicol</source>
<year>2010</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref14">
<mixed-citation>[14] R. Jayaraj, P. Megha, P. Sreedev, “Organochlorine pesticides, their toxic effects on living organisms and their fate in the environment,” <italic>Interdiscip. Toxicol</italic>., vol. 9, no. 3–4, pp. 90–100, Dec. 2016. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1515/intox-2016-0012">https://doi.org/10.1515/intox-2016-0012</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jayaraj</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Megha</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Sreedev</surname>
<given-names>P.</given-names>
</name>
</person-group>
<article-title>Organochlorine pesticides, their toxic effects on living organisms and their fate in the environment</article-title>
<source>Interdiscip. Toxicol</source>
<year>2016</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref15">
<mixed-citation>[15] M. Zuluaga, J. J. Melchor, F. A. Tabares-Villa, G. Taborda, J. C. Sepúlveda-Arias, “Metabolite Profiling to Monitor Organochlorine Pesticide Exposure in HepG2 Cell Culture,” <italic>Chromatographia</italic>, vol. 79, no. 17–18, pp. 1061–1068, Sep. 2016. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/s10337-016-3031-2">https://doi.org/10.1007/s10337-016-3031-2</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zuluaga</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Melchor</surname>
<given-names>J. J.</given-names>
</name>
<name>
<surname>Tabares-Villa</surname>
<given-names>F. A.</given-names>
</name>
<name>
<surname>Taborda</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Sepúlveda-Arias</surname>
<given-names>J. C.</given-names>
</name>
</person-group>
<article-title>Metabolite Profiling to Monitor Organochlorine Pesticide Exposure in HepG2 Cell Culture</article-title>
<source>Chromatographia</source>
<year>2016</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref16">
<mixed-citation>[16] O. Fiehn, T. Kind, “Metabolite Profiling in Blood Plasma,” in <italic>Metabolomics</italic>, Springer, 2007, pp. 3–17. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/978-1-59745-244-1_1">https://doi.org/10.1007/978-1-59745-244-1_1</ext-link>
</mixed-citation>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Fiehn</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Kind</surname>
<given-names>T.</given-names>
</name>
</person-group>
<source>Metabolomics</source>
<year>2007</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref17">
<mixed-citation>[17] O. Fiehn et al., “Quality control for plant metabolomics: reporting MSI-compliant studies,” <italic>Plant J</italic>., vol. 53, no. 4, pp. 691–704, Feb. 2008. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1111/j.1365-313X.2007.03387.x">https://doi.org/10.1111/j.1365-313X.2007.03387.x</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fiehn</surname>
<given-names>O.</given-names>
</name>
</person-group>
<article-title>Quality control for plant metabolomics: reporting MSI-compliant studies</article-title>
<source>Plant J</source>
<year>2008</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref18">
<mixed-citation>[18] J. Chong, D. S. Wishart, J. Xia, “Using MetaboAnalyst 4.0 for Comprehensive and Integrative Metabolomics Data Analysis,” <italic>Curr. Protoc. Bioinforma</italic>., vol. 68, no. 1, p. e86, Sep. 2019. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1002/cpbi.86">https://doi.org/10.1002/cpbi.86</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chong</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Wishart</surname>
<given-names>D. S.</given-names>
</name>
<name>
<surname>Xia</surname>
<given-names>J.</given-names>
</name>
</person-group>
<article-title>Using MetaboAnalyst 4.0 for Comprehensive and Integrative Metabolomics Data Analysis</article-title>
<source>Curr. Protoc. Bioinforma</source>
<year>2019</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref19">
<mixed-citation>[19] L. Eriksson, <italic>Introduction to multi-and megavariate data analysis using projection methods (PCA &amp; PLS)</italic>. Umetrics AB, 1999.</mixed-citation>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Eriksson</surname>
<given-names>L.</given-names>
</name>
</person-group>
<source>Introduction to multi-and megavariate data analysis using projection methods (PCA &amp; PLS)</source>
<year>1999</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref20">
<mixed-citation>[20] R Core Team, “R: A language and environment for statistical computing,” 2013. <ext-link ext-link-type="uri" xlink:href="https://www.yumpu.com/en/document/read/6853895/r-a-language-and-environment-for-statistical-computing">https://www.yumpu.com/en/document/read/6853895/r-a-language-and-environment-for-statistical-computing</ext-link>
</mixed-citation>
<element-citation publication-type="webpage">
<person-group person-group-type="author">
<collab>R Core Team</collab>
</person-group>
<source>R: A language and environment for statistical computing</source>
<year>2013</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref21">
<mixed-citation>[21] M. Campbell, “RStudio Projects,” in <italic>Learn RStudio IDE</italic>, Berkeley, CA: Apress, 2019, pp. 39–48. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/978-1-4842-4511-8_4">https://doi.org/10.1007/978-1-4842-4511-8_4</ext-link>
</mixed-citation>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Campbell</surname>
<given-names>M.</given-names>
</name>
</person-group>
<source>Learn RStudio IDE</source>
<year>2019</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref22">
<mixed-citation>[22] D. Meyer et al., “Package ‘e1071, Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien’”, version 1.7-9, <italic>R J</italic>., 2019. http://sunsite2.icm.edu.pl/pub/unix/math/cran/web/packages/e1071/e1071.pdf</mixed-citation>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Meyer</surname>
<given-names>D.</given-names>
</name>
</person-group>
<source>Package ‘e1071, Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien’</source>
<year>2019</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref23">
<mixed-citation>[23] H. Zheng et al., “Predictive diagnosis of major depression using NMR-based metabolomics and least-squares support vector machine,” <italic>Clin. Chim. Acta</italic>, vol. 464, pp. 223–227, Jan. 2017. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.cca.2016.11.039">https://doi.org/10.1016/j.cca.2016.11.039</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zheng</surname>
<given-names>H.</given-names>
</name>
</person-group>
<article-title>Predictive diagnosis of major depression using NMR-based metabolomics and least-squares support vector machine</article-title>
<source>Clin. Chim. Acta</source>
<year>2017</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref24">
<mixed-citation>[24] B. Feizizadeh, M. S. Roodposhti, T. Blaschke, J. Aryal, “Comparing GIS-based support vector machine kernel functions for landslide susceptibility mapping,” <italic>Arab. J. Geosci</italic>., vol. 10, no. 122, Mar. 2017. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/s12517-017-2918-z">https://doi.org/10.1007/s12517-017-2918-z</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Feizizadeh</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Roodposhti</surname>
<given-names>M. S.</given-names>
</name>
<name>
<surname>Blaschke</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Aryal</surname>
<given-names>J.</given-names>
</name>
</person-group>
<article-title>Comparing GIS-based support vector machine kernel functions for landslide susceptibility mapping</article-title>
<source>Arab. J. Geosci.</source>
<year>2017</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref25">
<mixed-citation>[25] M. A. Horaira, M. S. Ahmed, M. H. Kabir, M. N. H. Mollah, M. A. Rahman Shah, “Colon Cancer Prediction from Gene Expression Profiles Using Kernel Based Support Vector Machine,” in <italic>2018 International Conference on Computer, Communication, Chemical, Material and Electronic Engineering (IC4ME2)</italic>, Feb. 2018, pp. 1–4. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/IC4ME2.2018.8465636">https://doi.org/10.1109/IC4ME2.2018.8465636</ext-link>
</mixed-citation>
<element-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Horaira</surname>
<given-names>M. A.</given-names>
</name>
<name>
<surname>Ahmed</surname>
<given-names>M. S.</given-names>
</name>
<name>
<surname>Kabir</surname>
<given-names>M. H.</given-names>
</name>
<name>
<surname>Mollah</surname>
<given-names>M. N. H.</given-names>
</name>
<name>
<surname>Rahman Shah</surname>
<given-names>M. A.</given-names>
</name>
</person-group>
<source>Colon Cancer Prediction from Gene Expression Profiles Using Kernel Based Support Vector Machine</source>
<year>2018</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref26">
<mixed-citation>[26] V. Wan, W. M. Campbell, “Support vector machines for speaker verification and identification,” in <italic>Neural Networks for Signal Processing X. Proceedings of the 2000 IEEE Signal Processing Society Workshop (Cat. No.00TH8501)</italic>, vol. 2, pp. 775–784. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/NNSP.2000.890157">https://doi.org/10.1109/NNSP.2000.890157</ext-link>
</mixed-citation>
<element-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Wan</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Campbell</surname>
<given-names>W. M.</given-names>
</name>
</person-group>
<source>Support vector machines for speaker verification and identification</source>
<year>2000</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref27">
<mixed-citation>[27] V. Hooshmand Moghaddam, J. Hamidzadeh, “New Hermite orthogonal polynomial kernel and combined kernels in Support Vector Machine classifier,” <italic>Pattern Recognit</italic>., vol. 60, pp. 921–935, Dec. 2016. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.patcog.2016.07.004">https://doi.org/10.1016/j.patcog.2016.07.004</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hooshmand Moghaddam</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Hamidzadeh</surname>
<given-names>J.</given-names>
</name>
</person-group>
<article-title>New Hermite orthogonal polynomial kernel and combined kernels in Support Vector Machine classifier</article-title>
<source>Pattern Recognit</source>
<year>2016</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref28">
<mixed-citation>[28] X. Huang, Q.-S. Xu, Y.-H. Yun, J.-H. Huang, Y.-Z. Liang, “Weighted variable kernel support vector machine classifier for metabolomics data analysis,” <italic>Chemom. Intell. Lab. Syst</italic>., vol. 146, pp. 365–370, Aug. 2015. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.chemolab.2015.06.009">https://doi.org/10.1016/j.chemolab.2015.06.009</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>Q.-S.</given-names>
</name>
<name>
<surname>Yun</surname>
<given-names>Y.-H.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>J.-H.</given-names>
</name>
<name>
<surname>Liang</surname>
<given-names>Y.-Z.</given-names>
</name>
</person-group>
<article-title>Weighted variable kernel support vector machine classifier for metabolomics data analysis</article-title>
<source>Chemom. Intell. Lab. Syst.</source>
<year>2015</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref29">
<mixed-citation>[29] D. A. López-Sarmiento, H. C. Manta-Caro, N. E. Vera-Parra, “Clasificador basado en una máquina de vectores de soporte de mínimos cuadrados frente a un clasificador por regresión logística ante el reconocimiento de dígitos numéricos,” <italic>TecnoLógicas</italic>, no. 31, pp. 37–51, Nov. 2011. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.22430/22565337.99">https://doi.org/10.22430/22565337.99</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>López-Sarmiento</surname>
<given-names>D. A.</given-names>
</name>
<name>
<surname>Manta-Caro</surname>
<given-names>H. C.</given-names>
</name>
<name>
<surname>Vera-Parra</surname>
<given-names>N. E.</given-names>
</name>
</person-group>
<article-title>Clasificador basado en una máquina de vectores de soporte de mínimos cuadrados frente a un clasificador por regresión logística ante el reconocimiento de dígitos numéricos</article-title>
<source>TecnoLógicas</source>
<year>2011</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref30">
<mixed-citation>[30] L. A. Muñoz-Bedoya, L. E. Mendoza, H. J. Velandia-Villamizar, “Segmentación de Imágenes de Resonancia Magnética IRM utilizando LS-SVM y Análisis Multiresolución Wavelet,” <italic>TecnoLógicas</italic>, pp. 681-693, Nov. 2013. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.22430/22565337.381">https://doi.org/10.22430/22565337.381</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Muñoz-Bedoya</surname>
<given-names>L. A.</given-names>
</name>
<name>
<surname>Mendoza</surname>
<given-names>L. E.</given-names>
</name>
<name>
<surname>Velandia-Villamizar</surname>
<given-names>H. J.</given-names>
</name>
</person-group>
<article-title>Segmentación de Imágenes de Resonancia Magnética IRM utilizando LS-SVM y Análisis Multiresolución Wavelet</article-title>
<source>TecnoLógicas</source>
<year>2013</year>
</element-citation>
</ref>
<ref id="redalyc_344268257011_ref31">
<mixed-citation>[31] M. Moon, K. Nakai, “Stable feature selection based on the ensemble L1-norm support vector machine for biomarker discovery,” <italic>BMC Genomics</italic>, vol. 17, no. s13, Dec. 2016. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1186/s12864-016-3320-z">https://doi.org/10.1186/s12864-016-3320-z</ext-link>
</mixed-citation>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Moon</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Nakai</surname>
<given-names>K.</given-names>
</name>
</person-group>
<article-title>Stable feature selection based on the ensemble L1-norm support vector machine for biomarker discovery</article-title>
<source>BMC Genomics</source>
<year>2016</year>
</element-citation>
</ref>
</ref-list>
<fn-group>
<title>Notes</title>
<fn id="fn4" fn-type="other">
<label>-</label>
<p>
<bold>CONFLICTS OF INTEREST</bold>
</p>
<p>The authors declare that they have no financial, professional, or personal conflicts of interest that could inappropriately influence the results of this study.</p>
</fn>
<fn id="fn5" fn-type="other">
<label>-</label>
<p>
<bold>AUTHOR CONTRIBUTIONS</bold>
</p>
<p>
<list list-type="simple">
<list-item>
<p>Jorge Alejandro Lopera-Rodríguez developed the research idea, conducted the analysis using SVM-RFE and PLS-DA, implemented the methods in R, carried out the comparative analysis, wrote the article (introduction, methods, results, discussion, conclusions, and tables), and translated it.</p>
</list-item>
<list-item>
<p>Martha Zuluaga conducted the in vitro and in vivo experiments, designed the figures, wrote the article (methods, results, and discussion), and translated it.</p>
</list-item>
<list-item>
<p>Jorge Alberto Jaramillo-Garzón assessed the PLS-DA and SVM-RFE analysis methods, supervised their application, and supervised and corrected the article.</p>
</list-item>
</list>
</p>
</fn>
</fn-group>
</back>
</article>