An Integrated Platform for Automatic Design and Screening of Virtual Mutants Based on 3D-QSAR Analysis
CHALLENGE - An innovative application of 3D-QSAR methodology to the rational design of enzymes is reported here. The introduction of amidase activity into the scaffold of lipase B from Candida antarctica (CaLB) was studied, and 3D-QSAR models were constructed to correlate the structures of a set of CaLB mutants with their experimentally measured activities. The models disclose rationales for driving enzyme engineering and make a priori evaluation of new virtual candidate mutants feasible. In that respect, the whole procedure for producing virtual mutants and scoring their activity was automated within a workflow built with the modeFRONTIER package.
SOLUTION - The problem of engineering amidase activity into a lipase (CaLB) scaffold was addressed by taking into account that each mutation affects the properties of a mutant at multiple levels, leading to a complex, nonlinear combination of factors. Consequently, simple visual inspection was considered inadequate for identifying the structural motifs responsible for a given property. Instead, multivariate statistical analysis (partial least squares, PLS) was applied because it introduces no conceptual bias into the interpretation of the data and simplifies the representation of results. Regarding the applicability of the 3D-QSAR approach to in silico screening of mutants, the model reported here is able to discriminate between poor and good mutants. More importantly, integrating all computational and statistical procedures inside a modeFRONTIER workflow demonstrated that the whole methodology can be automated. This is of major importance for a realistic application of the virtual screening process at industrial scale.
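To make the PLS step more concrete, the following is a minimal sketch of a one-component PLS1 regression that relates a descriptor matrix to measured activities and then ranks two candidate mutants. It is an illustration only: the descriptor values, the activities, and the function names `fit_pls1` and `predict` are hypothetical, and the actual model was built on GRID descriptors with a statistical setup that is not detailed here.

```python
import math

def fit_pls1(X, y):
    """Fit a one-component PLS1 model (a single NIPALS step) on plain lists."""
    n, p = len(X), len(X[0])
    # Center descriptors and activities.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in descriptor space with maximal covariance with activity.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Latent scores: projection of each mutant onto the weight vector.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Regress centered activity on the latent score.
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    return {"x_mean": x_mean, "y_mean": y_mean, "w": w, "q": q}

def predict(model, x):
    """Predict the activity of one mutant from its descriptor vector."""
    score = sum((x[j] - model["x_mean"][j]) * model["w"][j] for j in range(len(x)))
    return model["y_mean"] + model["q"] * score

# Hypothetical training set: 5 mutants, 3 descriptors each, with made-up activities.
X = [[0.1, 1.0, 0.5],
     [0.4, 0.8, 0.4],
     [0.9, 0.3, 0.6],
     [1.2, 0.1, 0.5],
     [0.6, 0.6, 0.5]]
y = [0.2, 0.5, 1.1, 1.4, 0.8]

model = fit_pls1(X, y)
candidate_a = [1.0, 0.2, 0.5]   # hypothetical virtual mutant
candidate_b = [0.2, 0.9, 0.5]   # hypothetical virtual mutant
ranking = sorted([("A", predict(model, candidate_a)),
                  ("B", predict(model, candidate_b))],
                 key=lambda item: item[1], reverse=True)
```

In this toy data set the first descriptor rises with activity and the second falls, so the weight vector picks up that pattern without any manual inspection of the structures, which is the property the text attributes to the multivariate approach.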
BENEFITS - modeFRONTIER provides an environment for automating different procedures or software packages. In the present case, modeFRONTIER integrates the following programs in a single workflow: PyMOL (generation of the 3D structures of mutants), GROMACS (molecular dynamics, structure minimization and equilibration) and GRID (calculation of molecular descriptors); finally, the mathematical equation of the 3D-QSAR model was also integrated. modeFRONTIER randomly generated a first generation of 20 mutants (the number of mutants per generation can be modified by the user), which were scored. The scoring results of the first generation were then used to compute the next 20 mutants by means of a genetic algorithm already integrated into the modeFRONTIER software (NSGA-II). The genetic algorithm enables modeFRONTIER to learn, generation after generation, the empirical rules needed to increase the amidase activity of CaLB (hydrolysis of N-benzyl-2-chloroacetamide). The automatic workflow generates and scores each virtual mutant in 2 hours on a normal workstation and, in principle, modeFRONTIER can compute generations of mutants until the established convergence criteria are met.
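The generate, score and evolve loop described above can be sketched as follows. This is a simplified single-objective genetic algorithm with elitism, not NSGA-II (which is a multi-objective algorithm based on non-dominated sorting) and not the modeFRONTIER API. The function `score_mutant` is a hypothetical placeholder for the real chain of structure generation, molecular dynamics, descriptor calculation and 3D-QSAR scoring, and the number of mutable positions and the toy scoring rule are assumptions made only for illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
POSITIONS = 4      # hypothetical number of mutable positions
POP_SIZE = 20      # mutants per generation, as in the text

def score_mutant(mutant):
    """Placeholder for the real pipeline (structure, MD, descriptors, QSAR model).
    Toy rule: count matches against an arbitrary reference string."""
    reference = "GALS"  # arbitrary stand-in, not a real CaLB result
    return sum(a == b for a, b in zip(mutant, reference))

def random_mutant(rng):
    """Draw a random residue for every mutable position."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(POSITIONS))

def next_generation(scored, rng, mutation_rate=0.1):
    """Build a new population from (score, mutant) pairs using tournament
    selection, one-point crossover and point mutation; the best mutant is kept."""
    def pick():
        a, b = rng.sample(scored, 2)
        return max(a, b)[1]

    children = [max(scored)[1]]  # elitism: carry over the current best mutant
    while len(children) < POP_SIZE:
        p1, p2 = pick(), pick()
        cut = rng.randrange(1, POSITIONS)
        child = p1[:cut] + p2[cut:]
        child = "".join(
            rng.choice(AMINO_ACIDS) if rng.random() < mutation_rate else aa
            for aa in child
        )
        children.append(child)
    return children

def run(generations=10, seed=0):
    """Score each generation, record its best score, then evolve the next one."""
    rng = random.Random(seed)
    population = [random_mutant(rng) for _ in range(POP_SIZE)]
    best_per_generation = []
    for _ in range(generations):
        scored = [(score_mutant(m), m) for m in population]
        best_per_generation.append(max(scored)[0])
        population = next_generation(scored, rng)
    return population, best_per_generation
```

Calling `run()` returns the final population and the best score of each generation. A fixed generation count stands in for the convergence criteria here. Because the best mutant is always carried over, the best score never decreases from one generation to the next.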