The Premier Chemometrics Company

November 2014
No. 4

New Website is Here!
Visit our newly redesigned website for resources, solutions, downloads of Infometrix commercial software products and much more. Browse through the plethora of information available. If you don't find what you are looking for, contact us and we'll do our best to point you to a resource on our site or elsewhere on the web.

Product Releases
In May 2014, Infometrix released updated versions of all of its software offerings. In particular, Pirouette added jaggedness (a new tool to help determine the optimal number of factors in regression), a lack-of-fit measure for ALS, and the ability to save and use LWR (locally weighted regression) models.

New features were added to InStep, including the option to add control limits for outlier diagnostics to reports and a switch to an ini-style layout for the method and format files. In addition, the report itself can now be formatted in ini-style, which makes it easier for third-party software to parse.
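
As a rough illustration of what parsing an ini-style report from third-party code might look like, here is a minimal Python sketch using the standard-library configparser module. The section names, keys, and values in the sample text are invented for this example and are assumptions, not the actual InStep report layout; consult a real report for the names it uses.

```python
import configparser

# Hypothetical report text. The section and key names below are
# illustrative assumptions only, not the real InStep report layout.
REPORT_TEXT = """
[Sample]
name = run_0042

[Prediction]
octane = 91.3

[Diagnostics]
mahalanobis_distance = 2.7
mahalanobis_limit = 3.0
"""


def read_report(text):
    """Parse ini-style text into a plain dict of {section: {key: value}}."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


def is_outlier(report):
    """Flag the sample when its distance exceeds the reported control limit."""
    diag = report["Diagnostics"]
    return float(diag["mahalanobis_distance"]) > float(diag["mahalanobis_limit"])


report = read_report(REPORT_TEXT)
print(report["Prediction"]["octane"], is_outlier(report))
```

All values come back as strings, so the calling code converts them to numbers where needed. Keeping the control limit in the report alongside the diagnostic lets downstream software make the outlier decision without hard-coding thresholds.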

It has always been possible to invoke the LineUp executable from the command line; now command-line access to the LineUpGUI has been implemented as well.

With these new changes to LineUp and InStep, it becomes possible to have a fully turnkey multivariate analysis system tied to chromatographic analysis. Look for more details on this capability in upcoming newsletters.

Spectroscopy Best Practices
To build robust chemometric regression models, particularly for instruments employing optical spectroscopy, certain best practices apply to the modeling tasks in any industry or application. The list below serves as a checklist to follow when modeling with any chemometric software. Not every step is necessary every time, but you should consider the information each diagnostic yields before deciding that your modeling task is complete. These 10 items are our definition of Best Practices; each is tied to a diagnostic derived during model construction, validation, or prediction. Following these steps leads toward an optimized model that can be integrated into any legacy system; no additional hardware or software is needed.
  1. Randomized t-test and regression vector analysis – a model diagnostic to help determine the relevant number of factors to include in the model;
  2. Comparison of predicted versus known – a model, a sample, and a prediction diagnostic to look for systematic patterns that deviate from the ideal line;
  3. Sample consistency – a sample diagnostic based on robust multivariate analysis, including score distance (within model or Mahalanobis distance), orthogonal distance (F-ratio) and studentized residual (concentration residual), to identify samples that are inconsistent with the model;
  4. Scores analysis – a sample diagnostic to determine the completeness of the training set and to identify constraints that should be placed on the model;
  5. Concentration residual analysis – a variable, a model and a sample diagnostic to look for systematic deviations and correlations with other parameters, identify non-instrument measurements that should be added to the model, and variables that are unusual across all samples (e.g., excessive noise);
  6. Studentized residual and leverage – a sample and a validation diagnostic used to identify unusual samples or samples that have undue impact on the model;
  7. Loadings analysis – both a variable and a model diagnostic to support the evaluation of the degree of complexity (number of factors) to include in the model and to assess important and unusual variables;
  8. Measurement residual evaluation – a sample and a prediction diagnostic to identify outliers and look for structured patterns that may represent unmodeled chemical information;
  9. Root mean square error of prediction – a model diagnostic and estimate of bias to establish the point where errors are comparable to known (reference) values; and
  10. Fisher ratio – a sample and a prediction diagnostic that identifies unusual validation samples and ties closely to the reliability of future predictions.
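
To make item 9 concrete, the sketch below computes the root mean square error of prediction (RMSEP) and the bias for a small validation set, using the usual textbook definitions: RMSEP is the square root of the mean squared difference between predicted and reference values, and bias is the mean of those differences. The numbers are made up for illustration, and this is not a description of how any particular chemometric package computes or reports these statistics.

```python
import math


def rmsep(predicted, reference):
    """Root mean square error of prediction over a validation set."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("predicted and reference must be non-empty and equal in length")
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def bias(predicted, reference):
    """Mean signed error; a value far from zero suggests a systematic offset."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("predicted and reference must be non-empty and equal in length")
    return sum(p - r for p, r in zip(predicted, reference)) / len(predicted)


# Illustrative (invented) validation data: lab reference values and model predictions.
reference_values = [1.0, 2.0, 3.0, 4.0]
predicted_values = [1.1, 1.9, 3.2, 4.2]

print(f"RMSEP = {rmsep(predicted_values, reference_values):.3f}")
print(f"bias  = {bias(predicted_values, reference_values):.3f}")
```

Comparing RMSEP with the known uncertainty of the reference method indicates whether the model error has reached the level of the reference measurements; adding factors beyond that point is unlikely to give a real improvement.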

Words from the President
Best Practices? In my travels, I get a chance to talk to a great many folks in the lab as well as analysts from the process side. Because Infometrix focuses on supplying chemometrics solutions for chromatography and spectroscopy, much of the discussion centers on how to best build the chemometric models to manage the interpretation step.

On a recent visit to a refinery, I met a technician who had been newly charged with developing chemometric models for an FTNIR system. It turns out that the instrument was originally calibrated by someone who was no longer at that location (or even still within the company), and so the task was reassigned. He was frustrated that the calibrations were not as robust as he would like and wanted to see what he could do to lengthen the time between calibration tasks.

The basic problem was that the technician was overfitting: his approach was to examine the predicted versus actual plot and pick the smallest number of factors beyond which the scatter about the line no longer seemed to change much. His training consisted of a few hours working with the instrument company sales rep.

“Common practices” need to give way to “Best Practices”. In this issue of the Infometrix newsletter, we remind you of the steps for doing a chemometric calibration.

Brian G. Rohrback

In This Newsletter
-New Website is Here!
-Product Releases
-Spectroscopy Best Practices
-Upcoming Events
-Tech Tip: Cloaking
-Words from the President

Upcoming Events
Eastern Analytical Symposium and Exposition
November 17-19, 2014

IFPAC Annual Meeting
January 25-28, 2015

Pittcon 2015
March 8-12, 2015

Chemometrics Training Course
April 8-10, 2015

ISA-Analysis Division Symposium
April 26-30, 2015

Tech Tip: Cloaking
Visualization is a process we emphasize for understanding your data. In Pirouette, two complementary features facilitate visualization: dynamic linking and cloaking. You are probably already familiar with dynamic linking: if you highlight a sample or group of samples in one view, say a 2D scatter plot, those samples will appear highlighted in any other sample-oriented plot, including table views and line plots.

Sometimes, the plots are too busy to see where the highlighted samples are located. Click on the Cloak button in the ribbon (just to the left of the label button with the upper case A). The cloaking button is a 3-way toggle: the first time you click on it, it will show only the highlighted samples; the second time, only those samples not highlighted; the third time, all samples will be shown, back in the initial state.

By toggling the cloak state, you can see where the highlighted samples fall without the clutter of the rest, and then see the remaining samples without the highlighted ones. This may reveal more nuance about the differences between the selected and unselected samples; the view showing only the unhighlighted samples can be just as informative. Try this with the data in scatter plot views and in a line plot as well.

"There is a great satisfaction in building good tools for other people to use."

- Freeman Dyson (1923- )

The Infometrix mission is to provide high quality, easy-to-use software for the handling of multivariate data.

Copyright © 2014 Infometrix, Inc., All rights reserved.