New Website is Here!
Visit our newly redesigned website for resources, solutions, downloads of Infometrix commercial software products and much more. Browse through the plethora of information available. If you don't find what you are looking for, contact us and we'll do our best to point you to a resource on our site or elsewhere on the web.
In May 2014, Infometrix released updated versions of all of its software offerings. In particular, Pirouette added jaggedness (a new tool to help determine the optimal number of factors in regression), a lack-of-fit measure for ALS, and the ability to save and use LWR (locally weighted regression) models.
New features were added to InStep, including the option to add control limits for outlier diagnostics to reports and a change of the method and format files to an ini-style layout. In addition, the report itself can now be written in ini-style, which makes it easier for third-party software to parse.
It has always been possible to invoke the LineUp executable from the command line; command-line access to the LineUpGUI has now been implemented as well.
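Because the report can be written in ini-style, third-party software can read it with any standard ini parser. The following is a minimal sketch using Python's standard `configparser` module. The report text, section names, and key names (`Prediction`, `Outlier Diagnostics`, `f_ratio_limit`, and so on) are invented for illustration and are not the actual InStep report layout.

```python
import configparser

# Hypothetical report text: the section and key names below are
# assumptions for illustration, not the actual InStep report layout.
REPORT_TEXT = """
[Sample]
name = run_0412
[Prediction]
property = octane
value = 91.7
[Outlier Diagnostics]
f_ratio = 1.8
f_ratio_limit = 3.0
"""

def read_report(text):
    """Parse ini-style text into a dict of {section: {key: value}}."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}

def exceeds_limit(report, section, key):
    """Return True when a diagnostic value is above its '<key>_limit' entry."""
    values = report[section]
    return float(values[key]) > float(values[key + "_limit"])

report = read_report(REPORT_TEXT)
print(report["Prediction"]["value"])
print(exceeds_limit(report, "Outlier Diagnostics", "f_ratio"))
```

All values come back as strings, so the consuming program decides which entries to convert to numbers.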
With these new changes to LineUp and InStep, it becomes possible to have a fully turnkey multivariate analysis system tied to chromatographic analysis. Look for more details on this capability in upcoming newsletters.
Spectroscopy Best Practices
To build robust chemometric regression models, particularly for instruments employing optical spectroscopy, there are best practices that should be followed in the modeling tasks, whatever the industry or application. The list below serves as a checklist to follow when modeling with any chemometric software. Not every step is necessary every time, but you should consider the information yielded by each diagnostic before deciding that your modeling task is complete. These 10 items are our definition of Best Practices, each tied to a diagnostic derived during model construction, validation, or prediction. Following these steps guides you toward an optimized model that can be integrated into any legacy system; no additional hardware or software is needed.
- Randomized t-test and regression vector analysis – a model diagnostic to help determine the relevant number of factors to include in the model;
- Comparison of predicted versus known – a model, a sample, and a prediction diagnostic to look for systematic patterns that deviate from the ideal line;
- Sample consistency – a sample diagnostic based on robust multivariate analysis, including score distance (within-model or Mahalanobis distance), orthogonal distance (F-ratio) and studentized residual (concentration residual), to identify samples that are inconsistent with the model;
- Scores analysis – a sample diagnostic to determine the completeness of the training set and to identify constraints that should be placed on the model;
- Concentration residual analysis – a variable, a model and a sample diagnostic to look for systematic deviations and correlations with other parameters, identify non-instrument measurements that should be added to the model, and variables that are unusual across all samples (e.g., excessive noise);
- Studentized residual and leverage – a sample and a validation diagnostic used to identify unusual samples or samples that have undue impact on the model;
- Loadings analysis – both a variable and a model diagnostic to support the evaluation of the degree of complexity (number of factors) to include in the model and to assess important and unusual variables;
- Measurement residual evaluation – a sample and a prediction diagnostic to identify outliers and look for structured patterns that may represent unmodeled chemical information;
- Root mean square error of prediction – a model diagnostic and estimate of bias to establish the point where errors are comparable to known (reference) values; and
- Fisher ratio – a sample and a prediction diagnostic that identifies unusual validation samples and ties closely to the reliability of future predictions.
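As a small illustration of the root mean square error of prediction item above, the sketch below computes RMSEP together with the bias and the bias-corrected standard error of prediction (SEP) from paired predicted and reference values. These are the textbook definitions, not necessarily the exact formulas used in any particular software package, and the function name and sample values are made up for the example.

```python
import math

def prediction_errors(predicted, reference):
    """Return (RMSEP, bias, SEP) for paired predicted and reference values."""
    if len(predicted) != len(reference) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    # Residual = predicted minus known (reference) value for each sample.
    residuals = [p - r for p, r in zip(predicted, reference)]
    n = len(residuals)
    bias = sum(residuals) / n                      # mean residual
    rmsep = math.sqrt(sum(e * e for e in residuals) / n)
    # SEP removes the bias before measuring the spread of the residuals.
    sep = math.sqrt(sum((e - bias) ** 2 for e in residuals) / (n - 1))
    return rmsep, bias, sep

# Made-up validation values for illustration only.
reference = [10.0, 12.0, 14.0, 16.0]
predicted = [10.5, 12.5, 14.5, 16.5]
rmsep, bias, sep = prediction_errors(predicted, reference)
print(rmsep, bias, sep)
```

In this example every prediction is offset by the same 0.5 units, so the whole error shows up as bias and the SEP is zero. That is the kind of systematic pattern the predicted versus known comparison is meant to expose.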
Words from the President
Best Practices?
In my travels, I get a chance to talk to a great many folks in the lab as well as analysts from the process side. Because Infometrix focuses on supplying chemometrics solutions for chromatography and spectroscopy, much of the discussion centers on how best to build the chemometric models that manage the interpretation step.
In a recent visit to a refinery, I met a technician who had been newly charged with developing chemometric models for an FTNIR system. It turns out that the instrument had originally been calibrated by someone who was no longer at that location (or even with the company), so the task was reassigned. He was frustrated that the calibrations were not as robust as he would like and wanted to see what he could do to lengthen the time between calibration tasks.
The basic problem was that the technician was overfitting: his approach was to examine the predicted versus actual plot and pick the smallest number of factors beyond which the scatter about the line no longer seemed to change much. His training consisted of a few hours working with the instrument company's sales rep.
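One common guard against this kind of overfitting is to choose the number of factors from validation error rather than from the fit to the calibration samples, because calibration error keeps shrinking as factors are added. The sketch below applies a generic parsimony rule: take the fewest factors whose validation error is within a small tolerance of the lowest one. The error values are made up, the 5% tolerance is an arbitrary assumption, and this rule is a stand-in for illustration, not the randomized t-test or the jaggedness tool.

```python
def pick_factors(validation_rmse, tolerance=0.05):
    """Return the smallest factor count whose validation error is within
    `tolerance` (as a fraction) of the lowest validation error.

    validation_rmse[i] is the error for a model with i + 1 factors.
    """
    if not validation_rmse:
        raise ValueError("need at least one error value")
    threshold = min(validation_rmse) * (1.0 + tolerance)
    for index, error in enumerate(validation_rmse):
        if error <= threshold:
            return index + 1

# Made-up numbers for illustration only.
calibration_rmse = [0.90, 0.55, 0.40, 0.33, 0.28, 0.24, 0.21]
validation_rmse = [0.95, 0.62, 0.48, 0.47, 0.50, 0.56, 0.63]

print(pick_factors(validation_rmse))    # validation error turns back up
print(pick_factors(calibration_rmse))   # calibration error never does
```

With these numbers the validation curve points to 3 factors, while the same rule applied to the calibration curve runs out to all 7, which is the overfitting trap.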
“Common practices” need to give way to “Best Practices”. In this issue of the Infometrix newsletter, we remind you of the steps for doing a chemometric calibration.
Brian G. Rohrback