Comparison of data summarization and feature selection techniques for in-process spectral data

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

In this work, approaches to data summarization and feature selection are assessed for predicting the mechanical properties of a polymer product from complex, heterogeneous in-process data. Pressure and temperature data, as well as Near Infrared (NIR) spectroscopy data, were captured at different sampling frequencies during the process and used to predict the yield strength of the product. Direct interpretation of NIR spectra is recognized as an intractable problem in material processing, so chemometric approaches are applied to build models, which must be calibrated against lab-characterized response data. The low sampling rate of such lab characterization relative to in-process data capture raises the question of how best to summarize the process data when predicting the material properties. Further, the conventional chemometric methods, Principal Component Regression (PCR) and Partial Least Squares (PLS) regression, yield models that lack interpretability and provide little insight into how best to control the process. We compare two approaches to data summarization and two Recursive Feature Elimination (RFE) methods for feature selection. It is shown that RFE using Random Forest regression, with data summarized over the entire production run, yields the best predictive performance. It also delivers a sparse model in the original features, which facilitates interpretation of physico-chemical changes in the material and provides useful insight for process control.
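As a rough illustration of the summarization step only, the sketch below collapses each run's in-process channels, which may be sampled at different rates, into one value per channel so that each run lines up with a single lab measurement. It is not the authors' implementation: the function name `summarize_run`, the channel names, the use of the mean as the summary statistic, the `window` alternative, and all data values are assumptions made for illustration.

```python
from statistics import mean


def summarize_run(channels, window=None):
    """Collapse one production run into a single value per channel.

    channels: dict mapping a channel name to its list of readings.
              Channels may have different lengths (different sampling rates).
    window:   None -> average over the entire run;
              int  -> average over only the last `window` readings
                      (a hypothetical alternative, not taken from the abstract).
    """
    return {
        name: mean(series if window is None else series[-window:])
        for name, series in channels.items()
    }


# Made-up toy data: two runs, a pressure channel and one NIR band,
# deliberately sampled at different rates.
runs = {
    "run_a": {"pressure": [1.0, 2.0, 3.0], "nir_band": [10.0, 20.0, 30.0, 40.0]},
    "run_b": {"pressure": [4.0, 4.0, 7.0], "nir_band": [40.0, 20.0]},
}

# One summarized feature row per run, ready to pair with one lab value.
whole_run = {run_id: summarize_run(ch) for run_id, ch in runs.items()}
last_two = {run_id: summarize_run(ch, window=2) for run_id, ch in runs.items()}
```

Summarizing per channel avoids having to align streams recorded at different frequencies. The resulting rows could then be passed to a feature selection method such as RFE.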

Original language: English
Title of host publication: 2021 32nd Irish Signals and Systems Conference, ISSC 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665434294
Publication status: Published - 10 Jun 2021
Event: 32nd Irish Signals and Systems Conference, ISSC 2021 - Athlone, Ireland
Duration: 10 Jun 2021 - 11 Jun 2021

Publication series

Name: 2021 32nd Irish Signals and Systems Conference, ISSC 2021

Conference

Conference: 32nd Irish Signals and Systems Conference, ISSC 2021
Country/Territory: Ireland
City: Athlone
Period: 10/06/21 - 11/06/21

Keywords

  • NIR
  • PLA
  • bagging
  • random forest
  • recursive feature elimination
