10 COMPUTER-ASSISTED HPLC AND KNOWLEDGE MANAGEMENT

Yuri Kazakevich, Michael McBrien, and Rosario LoBrutto

HPLC for Pharmaceutical Scientists, Edited by Yuri Kazakevich and Rosario LoBrutto. Copyright © 2007 by John Wiley & Sons, Inc.

10.1 INTRODUCTION

In modern high-performance liquid chromatography (HPLC), computers in a broad sense are used in every instrumental module and at every stage of analysis. Computers control the flow rate, eluent composition, temperature, injection volume, and injection process. The detector output signal is converted from analog form into a digital representation to recognize the presence of peaks, and then, at a higher level of computer analysis, a chromatogram is obtained. All these computer-based functions are performed in the background, and the chromatographer usually does not think about them.

The second level of computer utilization in HPLC is the extraction of valuable analytical and physicochemical information from the chromatogram. This includes standard analytical procedures of peak integration, calibration, and quantitation, as well as more complex correlation of the retention dependencies with variation of selected parameters.

At the third (and probably highest) level, a computer is used for the sophisticated analysis of many different experimental results stored in databases. This level is usually regarded as the knowledge management level and can have a variety of goals:

• Selection of the starting conditions for method development by using information from similar separations
• Optimization of the existing method to speed up the analysis, increase the ruggedness of the chromatographic method, and so on
• Review of a multitude of data from different experiments and their correlation with information from other physicochemical methods
• Cross-laboratory information exchange (early drug discovery, preformulation groups, drug metabolism and pharmacokinetic groups, drug substance and drug product groups)

In this chapter the third level of computer-assisted HPLC, namely the use of expert systems (such as DryLab [1], AutoChrom™ [2], and ChromSword® [3]) for effective method development, is discussed.

Computer-assisted method development has received a great deal of attention from management within the pharmaceutical industry, mainly from the perspective of cost savings associated with faster and more efficient development. Adoption and incorporation of the tools in day-to-day workflows has been relatively limited, due in part to a reluctance of chromatographers to believe that computers can replace the intuition of the expert chromatographer. With the present state of the art, there is little question that computers can play a role in efficient method development. However, it must be accepted that computers are a supplement to, rather than a replacement for, the knowledge of the method development chromatographer.

Two main types of software tools exist that are directly applicable to the problem of chromatographic method development:

1. Optimization or experimental design software packages for modeling the chromatographic response as a function of one or more method variables. These can also play a key role in data management of the considerable information that results from rigorous method development exercises.
2. Structure-based prediction software predicts retention times or important physicochemical processes based on chemical structures.
Application databases store chromatographic methods for later retrieval and adaptation to new samples with similar structures and physicochemical parameters.

10.2 PREDICTION OF RETENTION AND SIMULATION OF PROFILES

In Chapters 2, 3, and 4, all aspects of analyte retention on the HPLC column are discussed. There are many mathematical functions describing retention dependencies versus various parameters (organic composition, temperature, pH, etc.). Most of these dependencies rely on empirical coefficients.

Analyte retention is a function of many factors: analyte interactions with the stationary and mobile phases; analyte structure and chemical properties; structure and geometry of the column packing material; and many other parameters. The theoretical functional description of the influence of the eluent composition, mobile-phase pH, salt concentration, and temperature, as well as the influence of the type of organic modifier and type of salt added to the mobile phase, is discussed in detail in Chapters 2 and 4.

Currently, eluent composition, column temperature, and eluent pH are the only continuous parameters used as the arguments in functional optimization of HPLC retention. However, other parameters such as ionic strength, buffer concentration, and concentration of salts and/or ion-pairing reagents can be taken into account, and mathematical functions for these can be constructed and employed.

The simplest and most widely used forms of retention time prediction for analytical-scale HPLC are based on the empirical linear dependence of the logarithm of the retention factor on the eluent composition.

10.2.1 General Thermodynamic Basis

Association of the chromatographic retention factor with the equilibrium constant is the basis for all optimization or prediction algorithms. As was shown in Chapter 2, this association is only very approximate and should be used with caution.
In short, an approximate mathematical description of the dependence of the retention factor on the eluent composition and temperature is written in the form

$$k = \exp\left(\frac{\sum \Delta G_{\mathrm{an.frag.}}}{RT} - \phi\,\frac{\Delta G_{\mathrm{el.}}}{RT}\right) \tag{10-1}$$

where φ is the molar fraction of the organic eluent modifier; $\Delta G_{\mathrm{el.}}$ is the Gibbs free energy of the interaction of the organic eluent modifier with the stationary phase; R is the gas constant; T is the absolute temperature; and $\Delta G_{\mathrm{an.frag.}}$ is the Gibbs free energy of the interactions of structural analyte fragments with the stationary phase.

Equation (10-1) is based on the assumption of simple additivity of all interactions and a competitive nature of analyte/eluent interactions with the stationary phase. The paradox is that, although these assumptions are usually acceptable only as a first approximation, their application in HPLC sometimes allows the description and prediction of the analyte retention versus the variation in eluent composition or temperature. For the most demanding separations, where discrimination of related components is necessary, the accuracy of such prediction is not acceptable. It is obvious from the exponential nature of equation (10-1) that any minor errors in the estimation of interaction energy, or simple underestimation of the mutual influence of molecular fragments (neglected in this model), will generate significant deviations from the predicted retention factors.

10.2.2 Structure–Retention Relationships

Many attempts to correlate the analyte structure with its HPLC behavior have been made in the past [4–6]. The quantitative structure–retention relationship (QSRR) theory was introduced as a theoretical approach for the prediction of HPLC retention in combination with the adaptation by Abraham and co-workers of the linear solvation energy relationship (LSER) theory to chromatographic retention [7, 8].
The basis of all these theories is the assumption of the energetic additivity of interactions of analyte structural fragments with the mobile phase and the stationary phase, and the assumption of a single-process, partitioning-type HPLC retention mechanism. These assumptions allow mathematical representation of the logarithm of the retention factor as a linear function of most continuous parameters (see Chapter 2). Unfortunately, the coefficients of these functions are mainly empirical, and the description of the analyte retention behavior is usually acceptable only if the coefficients are obtained for structurally similar components on the same column with the same mobile phase. To date, the shortcomings in the theoretical [22] and functional description of HPLC column properties make all these theories insufficient for practical application to HPLC method design and selection.

In the past, several theoretical models were proposed for the description of the reversed-phase retention process. Some theories based on the detailed consideration of the analyte retention mechanism give a realistic physicochemical description of the chromatographic system, but are practically inapplicable for routine computer-assisted optimization or prediction due to their complexity [9, 10]. Others allow retention optimization and prediction within a narrow range of conditions and require extensive experimental data for the retention of model compounds at specified conditions [11].

Probably the most widely studied is the solvophobic theory [12], which is based on the assumption of a single partitioning retention mechanism and essentially uses equation (10-1) to calculate the analyte retention. Carr and co-workers adapted the solvophobic theory [12, 13] and LSER theory [11, 14–17] to elucidate the retention of solutes in a reversed-phase HPLC system on nonpolar stationary phases.
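The sensitivity of equation (10-1) to small errors in the interaction energies can be illustrated with a minimal Python sketch. It assumes that equation (10-1) has the form k = exp[(ΣΔG_an.frag. − φΔG_el.)/RT]; the fragment energies, the eluent energy, and the function name `retention_factor` are hypothetical and chosen only for illustration, not taken from this chapter.

```python
import math

R = 8.314  # gas constant, J/(mol*K)


def retention_factor(fragment_energies, eluent_energy, phi, temperature=298.15):
    """Assumed form of equation (10-1):
    k = exp((sum(dG_fragments) - phi * dG_eluent) / (R * T)).
    Energies are in J/mol; all inputs here are illustrative assumptions."""
    total = sum(fragment_energies) - phi * eluent_energy
    return math.exp(total / (R * temperature))


# Hypothetical fragment contributions (J/mol); not measured values.
fragments = [4000.0, 3500.0, 2500.0]
eluent = 9000.0  # hypothetical modifier/stationary-phase interaction energy
phi = 0.5        # fraction of organic modifier

k_ref = retention_factor(fragments, eluent, phi)

# Inflate every fragment energy by 5% to mimic an estimation error.
k_err = retention_factor([e * 1.05 for e in fragments], eluent, phi)
relative_change = (k_err - k_ref) / k_ref

print(f"k = {k_ref:.2f}, k with 5% energy error = {k_err:.2f}, "
      f"relative change = {relative_change:.1%}")
```

With these assumed numbers, a 5% error in the summed fragment energy (500 J/mol) changes the predicted retention factor by roughly 22%, which is the kind of amplification the exponential form implies.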
The free energy of transfer of a molecule from the mobile phase to the stationary phase, $\Delta G$, can be regarded as a linear combination of the free energies of retention, $\Delta G_i$, arising from various molecular subunits (solvatochromic parameters). Solvatochromic parameters for a number of analytes can be found in the literature [18–21]. The signs and magnitudes of the coefficients depict the direction and relative strength of the different kinds of solute/stationary-phase and solute/mobile-phase interactions contributing to the retention in the investigated matrix [11–15]. The most influential factors governing RP-HPLC retention on alkyl and phenyl-type bonded phases were determined to be hydrogen bonding and the solute molecular volume [12, 13, 20, 23]. The hydrogen bonding is measured as the effect of complexation between hydrogen-bond acceptor (HBA) solutes and hydrogen-bond donor (HBD) bulk phases [24]. The solute molecular volume term comprises two contributions: one measures the cohesiveness of the chromatographic phases (both the mobile and stationary phases), and the other is the dispersive term that measures the ability of the chromatographic phases to interact with solutes via dispersive forces.

10.3 OPTIMIZATION OF HPLC METHODS

10.3.1 Off-Line Optimization

The most common software tools used for chromatographic method development are optimization packages. All of these tools take advantage of the fact that the retention of a given compound will change in a predictable manner as a function of virtually any continuous chromatographic variable.

The classic example (and certainly the most common application) of computer-assisted chromatographic optimization is eluent composition, commonly called solvent strength optimization. The chromatographer performs at least two experiments, varying the gradient slope for gradient separations or the concentration of organic modifier for isocratic separations, at a certain temperature.
The system is then modeled for any gradient or concentration of organic modifier. A simplified description of the chromatographic zone migration through the column under gradient conditions is given in Chapter 2. Under isocratic conditions the linear dependence of the logarithm of the retention factor on the eluent composition is used for optimization:

$$\ln k = A + B\phi \tag{10-2}$$

where k is the retention factor of the compound, φ is the fraction of organic solvent in the mobile phase, and A and B are constants for a given compound, chromatographic column, and solvent system. Based on a few experiments, the constants in the expression can be extracted, and the retention of each compound can be predicted. This optimization approach can be used to model both retention times and selectivities because both the A and B terms are unique for a given analyte.

The typical output from method optimization software is a resolution map, as shown in Figure 10-1. The map shows the resolution of the critical pair (the two closest-eluting peaks) as a function of the parameter(s). The example shows resolution as a function of gradient time (slope of the gradient). The resolution map has several advantages as an experimental display tool: it forms a concise summary of the experiments performed, it allows the chromatographer to select areas of interest and communicate the expected result, and it facilitates the viewing of data that would allow for a more robust separation.

Optimization of the eluent composition is commonly based on the linear relationship of ln k to φ (10-4) and is generally applicable to ideal chromatographic systems with nonionizable analytes in methanol/water mixtures. It is commonly assumed that:

• A single partitioning-like equilibrium process dominates in the retention mechanism.
• Analyte ionization changes do not occur in the pertinent solvent range.
• Column property changes do not occur over the course of the experiment.
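The extraction of the constants A and B of equation (10-2) from a small number of isocratic runs can be sketched as follows. The retention factors are made-up illustrative numbers, and the helper names `fit_ln_k` and `predict_k` are hypothetical; this is a plain least-squares sketch under the linearity assumptions listed above, not the algorithm of any particular commercial package.

```python
import math


def fit_ln_k(phis, ks):
    """Least-squares fit of ln k = A + B*phi; returns (A, B)."""
    ys = [math.log(k) for k in ks]
    n = len(phis)
    mean_x = sum(phis) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in phis)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(phis, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_k(a, b, phi):
    """Retention factor predicted by the linear model at organic fraction phi."""
    return math.exp(a + b * phi)


# Hypothetical training runs: retention factors at 40% and 60% organic.
runs = {
    "analyte_1": ([0.4, 0.6], [8.0, 2.0]),
    "analyte_2": ([0.4, 0.6], [10.0, 1.8]),
}

predicted = {}
for name, (phis, ks) in runs.items():
    a, b = fit_ln_k(phis, ks)
    predicted[name] = predict_k(a, b, 0.5)  # interpolate at 50% organic
    print(f"{name}: A = {a:.2f}, B = {b:.2f}, k(0.5) = {predicted[name]:.2f}")

# Selectivity of the pair at the interpolated condition.
k_low, k_high = sorted(predicted.values())
print(f"selectivity at phi = 0.5: {k_high / k_low:.3f}")
```

Because each analyte has its own A and B, the predicted selectivity changes with φ, which is why the same two training runs can be used to model both retention and selectivity. The prediction at φ = 0.5 is an interpolation; extrapolating outside the range of the training runs is less reliable.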
As with any optimization tool, the chromatographer should be wary of extrapolation beyond the scope of the training experiments. The behavior of certain parameters, such as temperature and solvent strength, is fairly easily modeled. Other parameters, such as buffer concentration and pH, can be much more difficult to model. In these cases, interpolation between fairly closely spaced points (actual experiments that were performed) is most appropriate.

Figure 10-2 shows a resolution map for a two-dimensional system in which solvent composition and trifluoroacetic acid (TFA) concentration are simultaneously optimized. The chromatographer has collected systematic experiments at TFA concentrations of 5, 9, 13, and 17 mM and acetonitrile concentrations of 30, 50, and 70 v/v% for a series of small molecules on a Primesep 100 column.

Figure 10-1. DryLab® software version 3.0 modeling the separation of a mixture of naphthalenes. Resolution of the critical pair (the two peaks that elute closest together) is denoted as a function of gradient time. Experimental runs are shown as solid lines on the resolution map; the selected prediction is a dashed line.

Note. The type and concentration of the organic eluent can cause a pH shift of the aqueous portion of the mobile phase as well as change the ionization state of the analyte in a particular hydro-organic mixture. Temperature can also lead to changes in the ionization constants of analytes.

Even when chromatographers are careful to keep buffer strengths constant while modifying the organic solvent strength, the effective analyte pKa and the mobile-phase pH shift with solvent strength, which can change the ionization state of compounds and/or the behavior of chromatographic columns [25]. Departures from linearity can be particularly striking in acetonitrile as opposed to methanol.
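A one-dimensional resolution map can be sketched by scanning one parameter and recording the resolution of the critical pair at each point. The sketch below is a simplification of what Figures 10-1 and 10-2 show: it assumes isocratic conditions, the linear model ln k = A + Bφ for each analyte, a fixed plate number, and the standard fundamental resolution equation Rs = (√N/4)·((α − 1)/α)·(k₂/(1 + k₂)). The (A, B) coefficients are invented for illustration and the function names are hypothetical.

```python
import math


def resolution(k1, k2, plates):
    """Fundamental resolution equation for two adjacent peaks
    (assumes equal plate numbers for both peaks)."""
    k_lo, k_hi = sorted((k1, k2))
    if k_lo <= 0:
        return 0.0
    alpha = k_hi / k_lo  # selectivity, always >= 1 after sorting
    return (math.sqrt(plates) / 4.0) * ((alpha - 1.0) / alpha) * (k_hi / (1.0 + k_hi))


def critical_resolution(coeffs, phi, plates=10000):
    """Resolution of the worst-separated adjacent pair at organic fraction phi."""
    ks = sorted(math.exp(a + b * phi) for a, b in coeffs)
    return min(resolution(k1, k2, plates) for k1, k2 in zip(ks, ks[1:]))


# Hypothetical (A, B) coefficients for three analytes.
coeffs = [(5.0, -8.0), (5.6, -9.5), (3.5, -5.0)]

# Scan phi from 0.30 to 0.70 in steps of 0.01.
grid = [0.30 + 0.01 * i for i in range(41)]
res_map = [(phi, critical_resolution(coeffs, phi)) for phi in grid]

best_phi, best_rs = max(res_map, key=lambda point: point[1])
print(f"best critical-pair resolution {best_rs:.2f} at phi = {best_phi:.2f}")
```

With these assumed coefficients the retention lines of different analytes cross inside the scanned range, so the map drops toward zero near each crossing; such valleys are the regions a chromatographer would avoid when choosing robust conditions.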
For systems in which the greatest possible quality of method is required in terms of resolution, run time, and robustness, the results from predictions should be verified against experimental data and, where necessary, nonlinear predictions should be used to refine the model and to locate the optimal conditions.

Computer-assisted optimization of parameters has not been universally accepted, primarily due to a lack of ease of use. All compounds must be tracked across all experiments, and the retention times of each component must be entered into the system. This is sometimes difficult because significant variations in retention and elution order can be observed for certain analytes. With diode array detection, even if the different analytes have distinct diode array profiles, analytes present at low concentration in the mixture may still be difficult to track. MS detection can assist in tracking the peaks across the different experiments, provided that the analytes are not isomers of each other. Software vendors have begun to address much of this with the implementation of automated peak-tracking systems (see Section 10.3.4.2) and direct transfer of experimental information from chromatography data systems.

Figure 10-2. ACD/LC Simulator™ 9.0 modeling the separation of a series of compounds as a function of solvent composition and TFA concentration (mM). Experiments are shown as white dots on the resolution map, with the predicted optimal method shown in yellow. See color plate.

Advantages of this technique are the efficiency of method development, structured development profiles, and effective reporting of what was performed during the different method development iterations.
In addition, it is possible to model the effect of parameter variation on the robustness of methods as well as on general chromatographic figures of merit: apparent efficiency, tailing, resolution of critical pairs, system backpressure, and total run time.

10.3.2 On-Line Optimization

Recently there has been renewed interest in automated method development in which the optimization software directly interfaces with the instrument in order to run or suggest new experiments based on the prior results that generated the initial resolution maps. In the late 1980s, a number of approaches to this problem were attempted, but none of these tools prevailed, due in part to the challenges of tracking peaks between experiments.

The current second-generation tools offer more promise due to (a) a focus on secondary detection techniques for peak tracking and (b) better automation tools offered by instrument vendors.

The main advantage of on-line automation is the saving in chromatographic method development time. The software can make decisions at any time of the day or night and can immediately communicate this information to the instrument after the completion of the experiment. There is also a more subtle benefit to the link of optimization software to the chromatography data system. Method development “wizards” with drop-down menus and user-defined fields can simplify the process of configuring the instrument sequence and method prior to a method development session.

Disadvantages of on-line optimization lie primarily in the maturity of this technology. Whereas manual method development is based on experience and intuition, automated method development should in principle follow the logic of chromatographic theory, which unfortunately is not yet developed enough to provide a logical guide for automated optimization.
Software and instrument vendors are relying on statistical optimization with minimal use of available theoretical developments, and only on the level of a simple partitioning mechanism and energetic additivity. The capacity of software innovators to address detection limit, peak-tracking, and artificial intelligence issues remains in question at present, but the considerable commitment by instrument and software vendors points to the future value of these tools. As spectroscopic peak-tracking algorithms mature, the effectiveness of the tools will grow considerably.

10.3.3 Method Screening

There are some chromatographic parameters that do not readily lend themselves to optimization. There have been some efforts to quantify the selectivity in chromatographic columns [26, 27], but it is often difficult to achieve targeted values for each of the parameters involved without custom preparation of materials. Experimental mobile-phase pH values must typically be very close together in order to enable subsequent pH optimization. Column and pH choice are critical to the selectivity of a given system, so it is clear that their effects should not be ignored. One solution to this problem is to screen different columns and pH values prior to commencing any kind of optimization. The screening results are reviewed, and optimization systems at a particular pH are designed accordingly.

With the advent of column switchers and more reproducible alternative column materials, it is now quite feasible to screen multiple pH values (for example, high, medium, and low pH) using scouting gradients in order to choose the column and pH at which to perform further optimization experiments. This is a particularly tempting scenario when few or no chemical structures are available for the synthetic by-products or degradation products in the sample, or when samples are particularly complex.
Recently there has been considerable development of systems for the concomitant selection of optimal pH and type of column [28].

For complex samples, it can be time-consuming and challenging to review all the results of system screens objectively. In addition, on-line optimization precludes the direct involvement of the chromatographer. For this reason, it is desirable to use some numerical description of the potential effectiveness of a given set of conditions so that the on-line optimization software can trigger further separations on the chromatographic system.

Screening review tools cannot work solely on the basis of the venerable “resolution of the critical pair” approach; the rating of an initial screen must be able to give a nonzero result even when two components co-elute, in which case the resolution of the critical pair will, of course, be zero.

Additionally, a suitability approach involving criteria related to run time is unwise, since run time can be fine-tuned based on solvent strength or flow rate in final optimization. Rather, at the screening stage, the chromatographer should be focused on sufficient selectivity to form the basis of an eventual successful separation, and then fine-tuning can be performed. There are a number of different measures of the desirability of an initial screen, including average resolution, resolution of the critical pair, selectivity of critical pairs, and so on.

The chromatographer need not be intimately familiar with the nuances of every rating system available. The only key is to be certain that appropriate rating systems are used at appropriate times. Table 10-1 shows some common approaches to the rating of chromatographic column screens [29].

10.3.4 Method Optimization

All approaches to method optimization based on multiple experiments have the requirement that all components be detected and that they be tracked between runs.
For complex samples, this is typically the most labor-intensive aspect of method development. For unattended method development, the instrument is required to monitor the change in retention of each component automatically. The historical limitations to this technology have been a key stumbling block in the widespread adoption of automated method development.

10.3.4.1 Peak Matching in Method Optimization. An initial solution to the problem of peak tracking across multiple experiments was the isolation of each impurity on a preparative or semiprep scale, followed by injection of each component individually. The chromatographic world has essentially rejected this concept outright. Very few chromatographers have the time or willingness to isolate standards for each component. The use of crude samples and mother

TABLE 10-1. Numerical Approaches to Ranking Separations

Approach | Basis | Application
Minimum resolution (resolution of the critical pair) | Resolution of closest-eluting peaks (R_CP) | Final model of separation
Method suitability | Product or minimum of various criteria: run time, resolution of critical pair, and resistance of viability to small changes in conditions | Final model of separation (customizable)
Mean resolution | Average resolution | Assessment of selectivity
Run time (RT) versus Target (t) and Maximum (M) | N = 1 if RT < t; N = 0 if RT > M; N = 1 − (RT − t)/(M − t) otherwise | Evaluating suitability of solvent strength and column choice
Equidistance | Deviation from equal peak resolution | Comparison of starting systems
Resolution score | Average value of normalized resolutions between all the peaks detected on a chromatogram | Comparison of starting systems

[...]
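Several of the ratings in Table 10-1 are simple enough to sketch directly. The Python functions below compute the minimum resolution, the mean resolution, and the run-time rating N as described in the table; the function names and the example screen results are hypothetical, and the resolution score is left out because its normalization is not spelled out here.

```python
def min_resolution(resolutions):
    """Resolution of the critical pair: the smallest adjacent-peak resolution."""
    return min(resolutions)


def mean_resolution(resolutions):
    """Average adjacent-peak resolution; stays nonzero when one pair co-elutes."""
    return sum(resolutions) / len(resolutions)


def run_time_rating(run_time, target, maximum):
    """Run-time rating N from Table 10-1:
    1 below the target, 0 above the maximum, linear in between."""
    if maximum <= target:
        raise ValueError("maximum run time must exceed the target run time")
    if run_time < target:
        return 1.0
    if run_time > maximum:
        return 0.0
    return 1.0 - (run_time - target) / (maximum - target)


# Hypothetical screening results: adjacent-peak resolutions and run time (min).
screens = {
    "column A, low pH": {"resolutions": [0.0, 2.4, 3.1], "run_time": 12.0},
    "column B, high pH": {"resolutions": [1.2, 1.4, 1.3], "run_time": 24.0},
}

for name, result in screens.items():
    rs = result["resolutions"]
    rating = run_time_rating(result["run_time"], target=15.0, maximum=30.0)
    print(f"{name}: min Rs = {min_resolution(rs):.2f}, "
          f"mean Rs = {mean_resolution(rs):.2f}, N = {rating:.2f}")
```

In this made-up example the first screen has a co-eluting pair, so its critical-pair resolution is zero even though its mean resolution is the higher of the two, which illustrates why a screening review should not rely on the critical pair alone.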
daunting. Before embarking on choosing the optimal conditions for optimization, a pH screen (at least five pH values) in either gradient or isocratic mode is generally performed to determine the most suitable pH ranges for the active pharmaceutical ingredient (at least one unit below or above the target analyte pKa in a particular hydro-organic system). This results in at least five experi- ...

... the neutral form of the basic compound and the neutral form of the acidic compound. For basic compounds (or basic functionalities), the lower the pH, the more the ionic equilibrium is shifted toward the protonated form of the analyte, which continually increases its concentration in the aqueous phase and decreases its content in the oil phase. Therefore, there is no plateau region at low pH. However, for an acidic ...

... data for original traces is sorted by the chromatographic method, tracing for which sample/condition set the data were collected. In the project architecture, information is grouped according to experimental conditions, or “experiments.” Multiple detector traces are arranged for each subsample, with subsamples organized by experiment. Experiments are grouped according to waves that are designed for optimization ...

... could be defined, then scientists can save time in their method development journey. Programs that allow for structure or partial-structure searching can be used to assist with the selection of starting points. These data could be easily searched for. The method development work that a chromatographer plans to employ may have been performed prior in early ...
pointer to new opportunities. Structure-based separation databases integrated with other analytical and pharmaceutical information provide a basis for a significant increase in development efficiency. If analytical chemists from the various areas of drug development (drug metabolism, preformulation, formulation, drug substance) enter their separations of the target compounds into the database and link ...

... be made between chemical formulae and chemical structures. For databases with any type of diversity to be realized, the chemical formula cannot provide effective retrieval of compounds. Structure-based searches can take three different approaches:

• Structure
• Substructure
• Structure similarity [35]

Structure searches look for molecules that are identical ...

... liquors enriched with synthetic byproducts for initial method development of drug substance is recommended. Another approach is to look at the molecule of interest, predict the most probable degradation product(s), and use forced degraded samples for initial method development. For example, if a compound contains ester functionality, ...

... to use methods developed in the past as a knowledge base for the determination of a starting point. Stored methods are retrieved, and method development sessions can be designed based on the past work performed in different line units of the organization (early drug discovery, preformulation group, DS and DP groups). A key point here is the need for chemical structures to assist with locating similar ...
separate components in the forced degradation samples as if they were all present in the same sample. The development of methods for these “composite samples” is typically required to be exceedingly rigorous. Columns, solvent systems, and pH values will be screened, and multidimensional optimization performed. The software tools that have been discussed in this chapter are invaluable for this kind of project ...

... setup of the system. If the approach is not combined with instrument control, then a process must be devised for efficient transfer of information to the data system.

10.4 STRUCTURE-BASED TOOLS

It is uncommon for the method development chromatographer to have absolutely zero information with regard to the chemical structures present in a given sample. Typically, at least one or more ...