Looking to Fall 2015.
We hope all of you are enjoying your summer, wherever you may be. With just a few weeks of summer left, now is the time to plan for fall. Remember that the Infometrix October training course is not far away. To save a spot, register by completing an application form and sending it to firstname.lastname@example.org. Also visit our website, as new information is added from time to time. We recently added a page with videos on related topics, as well as an Infometrix local area information page with travel guides, information, and links.
In this edition, look for a Tech Tip on Outlier Diagnostics, a feature article on Topological Quantitation, and the schedule of upcoming events.
Recent Happenings at Infometrix
We apply the principles embodied in standard operating procedures for much of what we do in industry. One exception: chemometrics modeling. Every chemometrics model has a bit of art project to it. The question is: can we succeed at doing multivariate calibrations in an optimized way and yet uniformly across practitioners and locations? A consistent approach to addressing the topology of the data set is a good place to start.
From the newsroom: Infometrix has always tried to be more generous than, say, Microsoft with our support. As Windows 10 gains its footing, we find it more difficult to support Windows XP as new products roll out. Our current line of products supports XP, but conflicts have arisen. If we drop support for anything pre-Vista in the next version, will that be a cause for concern? If so, let us know!
Topology is a branch of mathematics that deals with continuity and connectivity in a data set; it traces its roots back to Leonhard Euler in 1736. Several texts are devoted to this topic. [1, 2]
In cases where optical spectroscopy is used to characterize substances with variability in source such as crude oil characterization or the monitoring of refinery fractions, a single multivariate model is often not adequate to handle the complexity and the non-linearity of the sample/instrument combination. As a result, the process should be monitored in a way that allows a more localized approach. [3, 4] This is often referred to as topological quantitation or topological mapping, as it is designed to follow the variations inherent in the data.
There are three mechanisms for handling this problem.
- One is to employ quantitation algorithms that are built for non-linear applications. Examples include non-linear regression models fitted with methods such as Gauss-Newton or gradient descent. The advantage is that we can deploy a single algorithm to cover a variety of cases. The disadvantage is that the non-linear approach can overfit the data, producing models that fit the calibration set well but can underperform in routine use.
- The second mechanism is to use a succession of linear regression models in a hierarchy. This concept was first commercially introduced in the early 1990s to improve the performance of PLS predictions for the octane rating of gasoline. Here all the models are fixed and one regression is used to find the best subsequent models for a “fine-tuning” of the assessment. 
- The third approach is to have the situation dictate which samples will be used for a more localized assessment. In the literature, this is known as Locally Weighted Regression (LWR). Where the hierarchical approach transitions from one fixed model to the next, the LWR technique uses the current spectrum as the center point and chooses spectra from a database (model) that are similar, building a localized model on demand and immediately using this model for prediction. This approach has the advantage of not having to prepare models ahead of time, but still has the attribute of following the topology of the multivariate space. [3, 4]
The latter two approaches are covered in the Infometrix software suite. The hierarchical approach is embodied in the product InStep. An LWR system is handled with the IPAK DLL. Prediction using LWR is available in the algorithm DLL and used by Pirouette, InStep, and third-party applications.
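To make the locally weighted idea concrete, here is a minimal Python sketch of the steps described above: find the database spectra most similar to the current spectrum, fit a small model to just those neighbors, and predict immediately. It is an illustration only and is not the Infometrix implementation. The function name lwr_predict, the Euclidean similarity measure, the tricube weighting, and the neighbor count k are all assumptions chosen for brevity. In practice the local model is usually a PCR or PLS regression on full spectra; plain weighted least squares is used here to keep the example short.

```python
import numpy as np

def lwr_predict(query, spectra, values, k=20):
    """Predict a property for one spectrum from a model built on its neighbors.

    query   : 1-D array, the current spectrum (hypothetical input)
    spectra : 2-D array, one database spectrum per row
    values  : 1-D array, the reference property for each database spectrum
    k       : number of similar spectra used for the local model (assumed)
    """
    # Distance from the current spectrum to every spectrum in the database.
    dist = np.linalg.norm(spectra - query, axis=1)
    # The k most similar spectra form the local calibration set.
    nearest = np.argsort(dist)[:k]
    X, y, d = spectra[nearest], values[nearest], dist[nearest]
    # Tricube weights: the closest neighbors influence the fit the most.
    scale = d.max() if d.max() > 0 else 1.0
    w = (1.0 - (d / (scale * 1.0001)) ** 3) ** 3
    # Weighted least squares with an intercept, solved via lstsq.
    A = np.hstack([np.ones((len(nearest), 1)), X])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    # Apply the local model to the current spectrum right away.
    return float(np.concatenate(([1.0], query)) @ coef)

# Synthetic two-variable "spectra" with a non-linear response (made-up data).
rng = np.random.default_rng(0)
database = rng.uniform(-2, 2, size=(500, 2))
reference = np.sin(database[:, 0]) + database[:, 1] ** 2
sample = np.array([0.5, 0.5])
print("LWR prediction:", lwr_predict(sample, database, reference))
print("True value:    ", np.sin(0.5) + 0.25)
```

Because the model is rebuilt around each new sample, nothing has to be calibrated ahead of time. The cost is that the full database must be available at prediction time.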
- Bourbaki; Elements of Mathematics: General Topology, Addison–Wesley (1966).
- Ryszard Engelking, General Topology, Heldermann Verlag, Sigma Series in Pure Mathematics, December 1989.
- Naes, T.; Isaksson, T.; Kowalski, B., Locally weighted regression and scatter correction for near-infrared reflectance data. Anal. Chem., 1990.
- Bouveresse, E.; Massart, D.L.; Dardenne, P., Modified algorithm for standardization of near-infrared spectrometric instruments. Anal. Chem., 1995.
- G.A.F Seber and C.J. Wild, Nonlinear Regression. John Wiley and Sons, 1989.
- InStep Manual. Infometrix, Inc., first edition 1993.
After Further Review...
The past couple of years have allowed Infometrix to participate in a very diverse set of projects. We will write more about these activities in future newsletters.
- We refined the optimization process for on-line spectrometers, eliminating roughly 50% of the prediction errors in a study that spanned a dozen separate processing plants, several analyzer technologies, and multiple parameters.
- We had the opportunity over the last five years to prepare a series of chemometric models that are used in tandem to assess the quality in a batch manufacturing process for the pharmaceutical industry.
- We participated in the development of a new gas chromatograph for which we have integrated chemometrics (both alignment and pattern recognition technologies) directly into the control system. This allows anyone to convert the instrument into an application-specific appliance quickly and inexpensively.
- We constructed an on-line multi-terabyte, centralized database that pulls in analytical data from diverse locations, automates the multivariate quality assessment, distills the critical information content, and delivers the results in real-time.
The best part about these projects is that they have led us to a better understanding of how and where to do the chemometric processing for maximum impact. Chemometrics is too often confined to R&D activities and, even when models are deployed in a process setting, the implementations are not easily maintained.
Change is in the air.
Brian G. Rohrback
To leave comments or questions on any of the topics presented in this newsletter, please visit the Discuss page on our website and look under the title of this newsletter.
Thank you for your support and continued readership.