Welcome all readers, viewers, researchers and aspirants to this site for upgrading knowledge and aptitude in pharmaceutical industry oriented clinical research. "If you think Research is Expensive, Try Ineffectiveness in Illness"
Saturday, 29 June 2013
Thursday, 20 June 2013
Fundamental Of Trial Design : Randomized Controlled Trials
INTRODUCTION
Randomized clinical trials are scientific investigations that examine
and evaluate the safety and efficacy of new drugs or therapeutic
procedures using human subjects. The results that these studies generate
are considered to be the most valued data in the era of evidence-based
medicine. Understanding the principles behind clinical trials enables an
appreciation of the validity and reliability of their results.
What is a randomized clinical trial?
A clinical trial evaluates the effect of a new drug (or device or
procedure) on human volunteers. These trials can be used to evaluate the
safety of a new drug in healthy human volunteers, or to assess
treatment benefits in patients with a specific disease. Clinical trials
can compare a new drug against existing drugs or against dummy
medications (placebo) or they may not have a comparison arm. A large
proportion of clinical trials are sponsored by pharmaceutical or
biotechnology companies who are developing the new drug, but some
studies using older drugs in new disease areas are funded by health
related government agencies, or through charitable grants.
In a randomized clinical trial, patients are allocated to treatment
groups at random. In a blinded trial, patients and trial personnel are
also deliberately kept unaware of which patient is on the new drug. This
minimizes bias in the later evaluation so that the initial blind random
allocation of patients to one or other treatment group is preserved
throughout the trial. Clinical trials must be designed in an ethical
manner so that patients are not denied the benefit of usual treatments.
Patients must give voluntary informed consent, confirming that they
understand the purpose of the trial. Several key guidelines regarding the
ethics, conduct, and reporting of clinical trials have been constructed to
ensure that a patient’s rights and safety are not compromised by
participating in clinical trials.
Are there different types of clinical trials?
Clinical trials vary depending on who is conducting the trial.
Pharmaceutical companies typically conduct trials involving new drugs or
established drugs in disease areas where their drug may gain a new
license. Device manufacturers use trials to prove the safety and
efficacy of their new device. Clinical trials conducted by clinical
investigators unrelated to pharmaceutical companies might have other
aims. They might use established or older drugs in new disease areas,
often without commercial support, given that older drugs are
unlikely to generate much profit. Clinical investigators might also:
- look at the best way to give or withdraw drugs
- investigate the best duration of treatment to maximize outcome
- assess the benefits of prevention with vaccination or screening programs
Thus, different types of trials are needed to cover these needs; these can be classified under the following headings:
Phases:
The pharmaceutical industry has adopted a specific trial classification
based on the four clinical phases of development of a particular drug (Phases I–IV). In Phase I,
manufacturers usually test the effects of a new drug in healthy
volunteers or patients unresponsive to usual therapies. They look at how
the drug is handled in the human body
(pharmacokinetics/pharmacodynamics), particularly with respect to the
immediate short-term safety of higher doses. Clinical trials in Phase II
examine dose–response curves in patients and what benefits might be
seen in a small group of patients with a particular disease. In Phase III,
a new drug is tested in a controlled fashion in a large patient
population against a placebo or standard therapy. This is a key phase,
where a drug will either make or break its reputation with respect to
safety and efficacy before marketing begins. A positive study in Phase III
is often known as a landmark study for a drug, through which it might
gain a license to be prescribed for a specific disease. A study in Phase
IV is often called a post-marketing study as the drug has already been
granted regulatory approval/license. These studies are crucial for
gathering additional safety information from a larger group of patients
in order to understand the long-term safety of the drug and appreciate
drug interactions.
Trial design:
Trials can be further classified by design. This classification is more
descriptive in terms of how patients are randomized to treatment. The
most common design is the parallel-group trial. Patients are randomized
to the new treatment or to the standard treatment and followed up to
determine the effect of each treatment in parallel groups. Other trial
designs include, amongst others, crossover trials, factorial trials, and
cluster randomized trials.
Crossover trials randomize patients to different sequences of
treatments, but all patients eventually get all treatments in varying
order, i.e., the patient is his/her own control. Factorial trials
assign patients to more than one treatment-comparison group. These are
randomized in one trial at the same time, i.e., while drug A is being
tested against placebo, patients are re-randomized to drug B or placebo,
making four possible treatment combinations in total. Cluster randomized trials
are performed when larger groups (e.g., patients of a single
practitioner or hospital) are randomized instead of individual patients.
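The factorial design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `factorial_allocation`, the arm labels, and the use of simple (unrestricted) randomization are hypothetical choices, and real trials often use blocked or stratified schemes managed by a dedicated randomization service.

```python
import random
from collections import Counter

def factorial_allocation(n_patients, seed=0):
    """Randomize each patient twice: drug A vs placebo, and drug B vs placebo.

    Hypothetical helper using simple (unrestricted) randomization.
    """
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    allocations = []
    for patient_id in range(1, n_patients + 1):
        arm_a = rng.choice(["A", "placebo A"])
        arm_b = rng.choice(["B", "placebo B"])
        allocations.append((patient_id, arm_a, arm_b))
    return allocations

allocations = factorial_allocation(200)

# Count how many patients fall into each of the four treatment combinations.
counts = Counter((arm_a, arm_b) for _, arm_a, arm_b in allocations)
for combination, n in sorted(counts.items()):
    print(combination, n)
```

Because the two randomizations are independent, every patient contributes to both the A-versus-placebo and the B-versus-placebo comparison, which is why the design yields four treatment combinations from a single trial.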
Number of centers:
Clinical trials can also be classified as single-center or multicenter studies according to the number of sites involved. While single-center studies are mainly used for Phase I and II studies, multicenter studies can be carried out at any stage of clinical development. Multicenter studies are necessary for two
major reasons:
- to evaluate a new medication or procedure more efficiently in terms of accruing sufficient subjects over a shorter period of time
- to provide a better basis for the subsequent generalization of the trial’s findings, i.e., the effects of the treatment are evaluated in many types of centers.
CLINICAL TRIAL PROTOCOL DEVELOPMENT AND KEY COMPONENTS OF TRIAL PROTOCOL
CLINICAL TRIAL PROTOCOL DEVELOPMENT
Once a clinical question has been postulated, the first step in the
conception of a clinical trial to answer that question is to develop a
trial protocol. A well-designed protocol reflects the scientific and
methodological integrity of a trial. Protocol development has evolved in
a complex way over the last 20 years to reflect the care and attention
given to undertaking clinical experiments with human volunteers,
reflecting the high standards of safety and ethics involved as well as
the complex statistical issues.
Questions addressed by a protocol:
- What is the clinical question being asked by the trial?
- How should it be answered, in compliance with the standard ethical and regulatory requirements?
- What analyses should be performed in order to produce meaningful results?
- How will the results be presented?
Qualities of a good protocol:
- Clear, comprehensive, easy to navigate, and unambiguous.
- Designed in accordance with the current principles of Good Clinical Practice and other regulatory requirements.
- Gives a sound scientific background of the trial.
- Clearly identifies the benefits and risks of being recruited into the trial.
- Plainly describes trial methodology and practicalities.
- Ensures that the rights, safety, and well-being of trial participants are not unduly compromised.
- Gives enough relevant information to make the trial and its results reproducible.
- Indicates all features that assure the quality of every aspect of the trial.
CLINICAL TRIAL PROTOCOL
The contents of a trial protocol should generally include the following topics. However, site specific information may be provided on separate protocol page(s), or addressed in a separate agreement, and some of the information listed below may be contained in other protocol referenced documents, such as an Investigator’s Brochure.
General Information:
- Protocol title, protocol identifying number, and date. Any amendment(s) should also bear the amendment number(s) and date(s).
- Name and address of the sponsor and monitor (if other than the sponsor).
- Name and title of the person(s) authorized to sign the protocol and the protocol amendment(s) for the sponsor.
- Name, title, address, and telephone number(s) of the sponsor's medical expert (or dentist when appropriate) for the trial.
- Name and title of the investigator(s) who is (are) responsible for conducting the trial, and the address and telephone number(s) of the trial site(s).
- Name, title, address, and telephone number(s) of the qualified physician (or dentist, if applicable), who is responsible for all trial-site related medical (or dental) decisions (if other than investigator).
- Name(s) and address(es) of the clinical laboratory(ies) and other medical and/or technical department(s) and/or institutions involved in the trial.
Background Information:
- Name and description of the investigational product(s).
- A summary of findings from nonclinical studies that potentially have clinical significance and from clinical trials that are relevant to the trial.
- Summary of the known and potential risks and benefits, if any, to human subjects.
- Description of and justification for the route of administration, dosage, dosage regimen, and treatment period(s).
- A statement that the trial will be conducted in compliance with the protocol, GCP and the applicable regulatory requirement(s).
- Description of the population to be studied.
- References to literature and data that are relevant to the trial, and that provide background for the trial.
Trial Objectives and Purpose:
- A detailed description of the objectives and the purpose of the trial.
Trial Design:
The scientific integrity of the trial and the credibility of the data from the trial depend
substantially on the trial design. A description of the trial design should include:
- A specific statement of the primary endpoints and the secondary endpoints, if any, to be measured during the trial.
- A description of the type/design of trial to be conducted (e.g. double-blind, placebo-controlled, parallel design) and a schematic diagram of trial design, procedures and stages.
- A description of the measures taken to minimize/avoid bias, including:
(a) Randomization.
(b) Blinding.
- A description of the trial treatment(s) and the dosage and dosage regimen of the investigational product(s). Also include a description of the dosage form, packaging, and labelling of the investigational product(s).
- The expected duration of subject participation, and a description of the sequence and duration of all trial periods, including follow-up, if any.
- A description of the "stopping rules" or "discontinuation criteria" for individual subjects, parts of trial and entire trial.
- Accountability procedures for the investigational product(s), including the placebo(s) and comparator(s), if any.
- Maintenance of trial treatment randomization codes and procedures for breaking codes.
- The identification of any data to be recorded directly on the CRFs (i.e. no prior written or electronic record of data), and to be considered to be source data.
Selection and Withdrawal of Subjects:
- Subject inclusion criteria.
- Subject exclusion criteria.
- Subject withdrawal criteria (i.e. terminating investigational product treatment/trial treatment) and procedures specifying:
- When and how to withdraw subjects from the trial/ investigational product treatment.
- The type and timing of the data to be collected for withdrawn subjects.
- Whether and how subjects are to be replaced.
- The follow-up for subjects withdrawn from investigational product treatment/trial treatment.
Treatment of Subjects:
- The treatment(s) to be administered, including the name(s) of all the product(s), the dose(s), the dosing schedule(s), the route/mode(s) of administration, and the treatment period(s), including the follow-up period(s) for subjects for each investigational product treatment/trial treatment group/arm of the trial.
- Medication(s)/treatment(s) permitted (including rescue medication) and not permitted before and/or during the trial.
- Procedures for monitoring subject compliance.
Assessment of Efficacy:
- Specification of the efficacy parameters.
- Methods and timing for assessing, recording, and analyzing efficacy parameters.
Assessment of Safety:
- Specification of safety parameters.
- The methods and timing for assessing, recording, and analyzing safety parameters.
- Procedures for eliciting reports of and for recording and reporting adverse events and intercurrent illnesses.
- The type and duration of the follow-up of subjects after adverse events.
Statistics:
- A description of the statistical methods to be employed, including timing of any planned interim analysis(ses).
- The number of subjects planned to be enrolled. In multicentre trials, the numbers of enrolled subjects projected for each trial site should be specified. Reason for choice of sample size, including reflections on (or calculations of) the power of the trial and clinical justification.
- The level of significance to be used.
- Criteria for the termination of the trial.
- Procedure for accounting for missing, unused, and spurious data.
- Procedures for reporting any deviation(s) from the original statistical plan (any deviation(s) from the original statistical plan should be described and justified in protocol and/or in the final report, as appropriate).
- The selection of subjects to be included in the analyses (e.g. all randomized subjects, all dosed subjects, all eligible subjects, evaluable subjects).
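As an illustration of the sample size and power items above, the following sketch applies the standard normal-approximation formula for comparing two group means with equal allocation and a two-sided test. The function name `sample_size_per_group` and the example numbers are assumptions chosen for illustration only; an actual protocol would justify its own effect size, variability, and method.

```python
import math
from statistics import NormalDist

def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate patients per group for a two-arm comparison of means.

    delta: smallest clinically relevant difference between group means
    sigma: assumed common standard deviation of the outcome
    alpha: two-sided significance level
    power: desired probability of detecting a true difference of size delta
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
    return math.ceil(n)  # round up to whole patients

# Hypothetical example: detect a 5-unit difference when the SD is 10 units.
print(sample_size_per_group(delta=5, sigma=10))
print(sample_size_per_group(delta=5, sigma=10, power=0.90))
```

Raising the power or lowering the significance level increases the required number of subjects, which is why the protocol is expected to state both alongside the clinical justification.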
Direct Access to Source Data/Documents:
- The sponsor should ensure that it is specified in the protocol or other written agreement that the investigator(s)/institution(s) will permit trial-related monitoring, audits, IRB/IEC review, and regulatory inspection(s), providing direct access to source data/documents.
Quality Control and Quality Assurance:
Ethics:
- Description of ethical considerations relating to the trial.
Data Handling and Record Keeping:
Financing and Insurance:
- Financing and insurance if not addressed in a separate agreement.
Publication Policy:
- Publication policy, if not addressed in a separate agreement.
Supplements:
(NOTE: Since the protocol and the clinical trial/study report are closely related, further relevant information can be found in the ICH Guideline for Structure and Content of Clinical Study Reports.)
Key components of a trial protocol
The trial protocol is a comprehensive document and the core structure of the protocol should be adapted according to the type of trial. ICH–GCP can be used as a reference document when developing a protocol for pharmaceutical clinical trials (Phase I to Phase IV) involving a pharmaceutical substance (the investigational medicinal product [IMP]). Most institutions and pharmaceutical companies use a standard set of rules to define the main protocol outline, structure, format, and naming/numbering methods for their trials. In this section, we briefly describe the main components of a typical protocol.
Protocol information page:
The front page gives the:
- trial title
- trial identification number
- protocol version number
- date prepared
The descriptive title of the protocol should be kept as short as possible, but at the same time it should reflect the design, type of population, and aim of the trial. ICH–GCP suggests that the title of a pharmaceutical trial should additionally include the medicinal product(s), the nature of the treatment (eg, treatment, prophylaxis, diagnosis, radiosensitizer), any comparator(s) and/or placebo(s), indication, and setting (outpatient or inpatient). The key investigational site, investigator, and sponsor should also be detailed on the title page.
Trial summary or synopsis:
A synopsis should provide the key aspects of the protocol in no more than two pages, and can be prepared in a table format. The main components of the protocol summary include:
- full title
- principal investigator
- planned study dates
- objectives
- study design
- study population
- treatments
- procedures
- sample size
- outcome measures
- statistical methods
Friday, 14 June 2013
Docking (molecular)
In the field of molecular modeling, docking is a method which predicts the preferred orientation of one molecule to a second when bound to each other to form a stable complex.[1] Knowledge of the preferred orientation in turn may be used to predict the strength of association or binding affinity between two molecules using for example scoring functions.
The associations between biologically relevant molecules such as proteins, nucleic acids, carbohydrates, and lipids play a central role in signal transduction. Furthermore, the relative orientation of the two interacting partners may affect the type of signal produced (e.g., agonism vs antagonism). Therefore docking is useful for predicting both the strength and type of signal produced.
Docking is frequently used to predict the binding orientation of small molecule drug candidates to their protein targets in order to in turn predict the affinity and activity of the small molecule. Hence docking plays an important role in the rational design of drugs.[2] Given the biological and pharmaceutical significance of molecular docking, considerable efforts have been directed towards improving the methods used to predict docking.
Definition of problem
Molecular docking can be thought of as a problem of “lock-and-key”, where one is interested in finding the correct relative orientation of the “key” which will open up the “lock” (where on the surface of the lock is the key hole, which direction to turn the key after it is inserted, etc.). Here, the protein can be thought of as the “lock” and the ligand can be thought of as a “key”. Molecular docking may be defined as an optimization problem, which would describe the “best-fit” orientation of a ligand that binds to a particular protein of interest. However, since both the ligand and the protein are flexible, a “hand-in-glove” analogy is more appropriate than “lock-and-key”.[3] During the course of the process, the ligand and the protein adjust their conformation to achieve an overall “best-fit”, and this kind of conformational adjustment resulting in the overall binding is referred to as “induced-fit”.[4]
The focus of molecular docking is to computationally simulate the molecular recognition process. The aim of molecular docking is to achieve an optimized conformation for both the protein and ligand and relative orientation between protein and ligand such that the free energy of the overall system is minimized.
Docking approaches
Two approaches are particularly popular within the molecular docking community. One approach uses a matching technique that describes the protein and the ligand as complementary surfaces.[5][6][7] The second approach simulates the actual docking process in which the ligand-protein pairwise interaction energies are calculated.[8] Both approaches have significant advantages as well as some limitations. These are outlined below.
Shape complementarity
Geometric matching/shape complementarity methods describe the protein and ligand as a set of features that make them dockable.[9] These features may include molecular surface/complementary surface descriptors. In this case, the receptor’s molecular surface is described in terms of its solvent-accessible surface area and the ligand’s molecular surface is described in terms of its matching surface description. The complementarity between the two surfaces amounts to the shape matching description that may help in finding the complementary pose of docking the target and the ligand molecules. Another approach is to describe the hydrophobic features of the protein using turns in the main-chain atoms. Yet another approach is to use a Fourier shape descriptor technique.[10][11][12] Whereas the shape complementarity based approaches are typically fast and robust, they cannot usually model the movements or dynamic changes in the ligand/protein conformations accurately, although recent developments allow these methods to investigate ligand flexibility. Shape complementarity methods can quickly scan through several thousand ligands in a matter of seconds and actually figure out whether they can bind at the protein’s active site, and are usually scalable to even protein-protein interactions. They are also much more amenable to pharmacophore based approaches, since they use geometric descriptions of the ligands to find optimal binding.
Simulation
The simulation of the docking process as such is a much more complicated process. In this approach, the protein and the ligand are separated by some physical distance, and the ligand finds its position in the protein’s active site after a certain number of “moves” in its conformational space. The moves incorporate rigid body transformations such as translations and rotations, as well as internal changes to the ligand’s structure including torsion angle rotations. Each of these moves in the conformation space of the ligand induces a total energetic cost of the system, and hence after every move the total energy of the system is calculated. The obvious advantage of the method is that it is more amenable to incorporating ligand flexibility into its modeling, whereas shape complementarity techniques have to use some ingenious methods to incorporate flexibility in ligands. Another advantage is that the process is physically closer to what happens in reality, when the protein and ligand approach each other after molecular recognition. A clear disadvantage of this technique is that it takes longer to evaluate the optimal pose of binding, since it has to explore a rather large energy landscape. However, grid-based techniques as well as fast optimization methods have significantly ameliorated these problems.
Mechanics of docking
To perform a docking screen, the first requirement is a structure of the protein of interest. Usually the structure has been determined using a biophysical technique such as X-ray crystallography, or less often, NMR spectroscopy. This protein structure and a database of potential ligands serve as inputs to a docking program. The success of a docking program depends on two components: the search algorithm and the scoring function.
Search algorithm
The search space in theory consists of all possible orientations and conformations
of the protein paired with the ligand. However in practice with current
computational resources, it is impossible to exhaustively explore the
search space—this would involve enumerating all possible distortions of
each molecule (molecules are dynamic and exist in an ensemble of
conformational states) and all possible rotational and translational orientations of the ligand relative to the protein at a given level of granularity.
Most docking programs in use account for a flexible ligand, and several
attempt to model a flexible protein receptor. Each "snapshot" of the
pair is referred to as a pose.
A variety of conformational search strategies have been applied to the ligand and to the receptor. These include:
- systematic or stochastic torsional searches about rotatable bonds
- molecular dynamics simulations
- genetic algorithms to "evolve" new low energy conformations
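To make the idea of a stochastic torsional search concrete, here is a minimal sketch in Python. It is a toy under loud assumptions: `toy_energy` is a made-up threefold torsional term and not a real force field, the ligand is reduced to a list of torsion angles, the receptor is ignored entirely, and the Metropolis acceptance rule is just one common choice for this kind of search.

```python
import math
import random

def toy_energy(torsions):
    """Hypothetical energy: one threefold term per rotatable bond (minimum 0 each)."""
    return sum(1.0 + math.cos(3.0 * angle) for angle in torsions)

def stochastic_torsion_search(n_bonds, n_steps=5000, step=0.3, temperature=0.5, seed=0):
    """Random-perturbation search over torsion angles with Metropolis acceptance."""
    rng = random.Random(seed)
    current = [rng.uniform(-math.pi, math.pi) for _ in range(n_bonds)]
    current_e = toy_energy(current)
    best, best_e = list(current), current_e
    for _ in range(n_steps):
        trial = list(current)
        bond = rng.randrange(n_bonds)            # pick one rotatable bond
        trial[bond] += rng.uniform(-step, step)  # small random twist
        trial_e = toy_energy(trial)
        delta = trial_e - current_e
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = list(current), current_e
    return best, best_e

best_torsions, best_energy = stochastic_torsion_search(n_bonds=3)
print(best_torsions, best_energy)
```

Accepting some uphill moves lets the search escape local minima, which a purely greedy search over the same torsions could not do. A docking program would evaluate each trial conformation with its scoring function rather than with this toy energy.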
Ligand flexibility
Conformations of the ligand may be generated in the absence of the receptor and subsequently docked[13] or conformations may be generated on-the-fly in the presence of the receptor binding cavity,[14] or with full rotational flexibility of every dihedral angle using fragment based docking.[15] Force field energy evaluations are most often used to select energetically reasonable conformations,[16] but knowledge-based methods have also been used.[17]
Receptor flexibility
Computational capacity has increased dramatically over the last decade, making possible the use of more sophisticated and computationally intensive methods in computer-assisted drug design. However, dealing with receptor flexibility in docking methodologies is still a thorny issue. The main reason behind this difficulty is the large number of degrees of freedom that have to be considered in these kinds of calculations. Neglecting it, however, leads to poor docking results in terms of binding pose prediction.[18]
Multiple static structures experimentally determined for the same protein in different conformations are often used to emulate receptor flexibility.[19] Alternatively, rotamer libraries of amino acid side chains that surround the binding cavity may be searched to generate alternate but energetically reasonable protein conformations.[20][21]
Scoring function
The scoring function takes a pose as input and returns a number
indicating the likelihood that the pose represents a favorable binding
interaction.
Most scoring functions are physics-based molecular mechanics force fields that estimate the energy of the pose; a low (negative) energy indicates a stable system and thus a likely binding interaction. An alternative approach is to derive a statistical potential for interactions from a large database of protein-ligand complexes, such as the Protein Data Bank, and evaluate the fit of the pose according to this inferred potential.
There are a large number of structures from X-ray crystallography for complexes between proteins and high affinity ligands, but comparatively fewer for low affinity ligands, as the latter complexes tend to be less stable and therefore more difficult to crystallize. Scoring functions trained with these data can dock high affinity ligands correctly, but they will also give plausible docked conformations for ligands that do not bind. This gives a large number of false positive hits, i.e., ligands predicted to bind to the protein that actually don't when placed together in a test tube.
One way to reduce the number of false positives is to recalculate the energy of the top scoring poses using (potentially) more accurate but computationally more intensive techniques such as Generalized Born or Poisson-Boltzmann methods.[8]
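A physics-based scoring function of the kind described above can be sketched as a sum of pairwise interaction energies. The example below uses only a Lennard-Jones 12-6 term. The parameter values `epsilon` and `sigma`, the function names, and the atom coordinates are illustrative assumptions rather than values from any published force field, and real scoring functions add further terms such as electrostatics, hydrogen bonding, and desolvation.

```python
import math

def lennard_jones(r, epsilon=0.2, sigma=3.5):
    """Lennard-Jones 12-6 energy for two atoms separated by distance r.

    epsilon (well depth) and sigma (zero-energy distance) are illustrative values.
    """
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 ** 2 - ratio6)

def score_pose(ligand_atoms, protein_atoms):
    """Sum the pairwise energies between every ligand atom and every protein atom."""
    total = 0.0
    for ligand_xyz in ligand_atoms:
        for protein_xyz in protein_atoms:
            total += lennard_jones(math.dist(ligand_xyz, protein_xyz))
    return total

# Hypothetical coordinates in angstroms: one protein atom, two candidate poses.
protein = [(0.0, 0.0, 0.0)]
good_pose = [(2 ** (1 / 6) * 3.5, 0.0, 0.0)]  # at the energy minimum
clashing_pose = [(2.0, 0.0, 0.0)]             # atoms overlap

print(score_pose(good_pose, protein))      # negative: favorable contact
print(score_pose(clashing_pose, protein))  # large positive: steric clash
```

The pose at the optimal contact distance scores negative and the overlapping pose scores strongly positive, matching the convention that a low (negative) energy indicates a likely binding interaction.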
Applications
A binding interaction between a small molecule ligand and an enzyme protein may result in activation or inhibition of the enzyme. If the protein is a receptor, ligand binding may result in agonism or antagonism. Docking is most commonly used in the field of drug design; most drugs are small organic molecules, and docking may be applied to:
- hit identification – docking combined with a scoring function can be used to quickly screen large databases of potential drugs in silico to identify molecules that are likely to bind to the protein target of interest (see virtual screening).
- lead optimization – docking can be used to predict where and in which relative orientation a ligand binds to a protein (also referred to as the binding mode or pose). This information may in turn be used to design more potent and selective analogs.
- bioremediation – protein-ligand docking can also be used to predict pollutants that can be degraded by enzymes.[22]
Tutorials: the use of AutoDock and AutoDock Vina is illustrated in a couple of tutorials prepared by Prof. Rino Ragno at Sapienza University. The tutorials can be downloaded from www.rcmd.it
Chemometrics
Chemometrics is the science of extracting information from
chemical systems by data-driven means. It is a highly interfacial
discipline, using methods frequently employed in core data-analytic
disciplines such as multivariate statistics, applied mathematics, and computer science, in order to address problems in chemistry, biochemistry, medicine, biology and chemical engineering. In this way, it mirrors several other interfacial ‘-metrics’ such as psychometrics and econometrics.
Introduction
Chemometrics is applied to solve both descriptive and predictive problems in experimental life sciences, especially in chemistry. In descriptive applications, properties of chemical systems are modeled with the intent of learning the underlying relationships and structure of the system (i.e., model understanding and identification). In predictive applications, properties of chemical systems are modeled with the intent of predicting new properties or behavior of interest. In both cases, the datasets can be small but are often very large and highly complex, involving hundreds to thousands of variables, and hundreds to thousands of cases or observations.
Chemometric techniques are particularly heavily used in analytical chemistry and metabolomics, and the development of improved chemometric methods of analysis also continues to advance the state of the art in analytical instrumentation and methodology. It is an application-driven discipline, and thus while the standard chemometric methodologies are very widely used industrially, academic groups are dedicated to the continued development of chemometric theory, methods and applications.
Origins
Although one could argue that even the earliest analytical experiments in chemistry involved a form of chemometrics, the field is generally recognized to have emerged in the 1970s as computers became increasingly exploited for scientific investigation. The term ‘chemometrics’ was coined by Svante Wold in a 1971 grant application,[1] and the International Chemometrics Society was formed shortly thereafter by Svante Wold and Bruce Kowalski, two pioneers in the field. Wold was a professor of organic chemistry at Umeå University, Sweden, and Kowalski was a professor of analytical chemistry at University of Washington, Seattle.
Many early applications involved multivariate classification, numerous quantitative predictive applications followed, and by the late 1970s and early 1980s a wide variety of data- and computer-driven chemical analyses were occurring.
Multivariate analysis was a critical facet even in the earliest applications of chemometrics. The data resulting from infrared and UV/visible spectroscopy often number in the thousands of measurements per sample. Mass spectrometry, nuclear magnetic resonance, atomic emission/absorption and chromatography experiments are also all by nature highly multivariate. The structure of these data was found to be conducive to using techniques such as principal components analysis (PCA) and partial least squares (PLS). This is primarily because, while the datasets may be highly multivariate, there is strong and often linear low-rank structure present. PCA and PLS have proven over time to be very effective at empirically modeling the more chemically interesting low-rank structure, exploiting the interrelationships or ‘latent variables’ in the data, and providing alternative compact coordinate systems for further numerical analysis such as regression, clustering, and pattern recognition. Partial least squares in particular was heavily used in chemometric applications for many years before it began to find regular use in other fields.
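To make the low-rank point concrete, the sketch below extracts the first principal component of a small synthetic data set by power iteration on the mean-centered data. Everything in it is assumed for illustration: the five-channel "pure spectrum", the concentrations and the noise-free mixing are invented, and a real analysis would normally use a numerical library rather than hand-written loops.

```python
import math


def mean_center(rows):
    """Subtract the column mean from every column."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores, explained variance fraction) for PC1."""
    X = mean_center(rows)
    p = len(X[0])
    v = [1.0 / math.sqrt(p)] * p  # starting guess for the loading vector
    for _ in range(iterations):
        # One power-iteration step on X^T X: t = X v, then w = X^T t
        t = [sum(x * vi for x, vi in zip(row, v)) for row in X]
        w = [sum(ti * row[j] for ti, row in zip(t, X)) for j in range(p)]
        norm = math.sqrt(sum(wj * wj for wj in w))
        v = [wj / norm for wj in w]
    scores = [sum(x * vi for x, vi in zip(row, v)) for row in X]
    total_variation = sum(x * x for row in X for x in row)
    explained = sum(s * s for s in scores) / total_variation
    return v, scores, explained


# Synthetic spectra: one absorbing species measured at five channels
pure_spectrum = [0.1, 0.5, 1.0, 0.5, 0.1]
concentrations = [1.0, 2.0, 3.0, 4.0, 5.0]
spectra = [[c * a for a in pure_spectrum] for c in concentrations]

loadings, scores, explained = first_principal_component(spectra)
print("explained variance fraction:", round(explained, 6))
```

Because every spectrum is a multiple of the same band shape, one component captures essentially all of the variation, the loadings follow the shape of the pure spectrum, and the scores order the samples by concentration.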
Through the 1980s three dedicated journals appeared in the field: Journal of Chemometrics, Chemometrics and Intelligent Laboratory Systems, and Journal of Chemical Information and Modeling. These journals continue to cover both fundamental and methodological research in chemometrics. At present, most routine applications of existing chemometric methods are commonly published in application-oriented journals (e.g., Applied Spectroscopy, Analytical Chemistry, Anal. Chim. Acta., Talanta). Several important books/monographs on chemometrics were also first published in the 1980s, including the first edition of Malinowski’s "Factor Analysis in Chemistry",[2] Sharaf, Illman and Kowalski’s "Chemometrics",[3] Massart et al. "Chemometrics: a textbook",[4] and "Multivariate Calibration" by Martens and Naes.[5]
Some large chemometric application areas have gone on to represent new domains, such as molecular modeling and QSAR, cheminformatics, the ‘-omics’ fields of genomics, proteomics, metabonomics and metabolomics, process modeling and process analytical technology.
An account of the early history of chemometrics was published as a series of interviews by Geladi and Esbensen.[6][7]
Techniques
Multivariate calibration
Many chemical problems and applications of chemometrics involve calibration. The objective is to develop models which can be used to predict properties of interest based on measured properties of the chemical system, such as pressure, flow, temperature, infrared, Raman, NMR spectra and mass spectra. Examples include the development of multivariate models relating 1) multi-wavelength spectral response to analyte concentration, 2) molecular descriptors to biological activity, 3) multivariate process conditions/states to final product attributes. The process requires a calibration or training data set, which includes reference values for the properties of interest for prediction, and the measured attributes believed to correspond to these properties. For case 1), for example, one can assemble data from a number of samples, including concentrations for an analyte of interest for each sample (the reference) and the corresponding infrared spectrum of that sample. Multivariate calibration techniques such as partial least squares regression or principal component regression (and near countless other methods) are then used to construct a mathematical model that relates the multivariate response (spectrum) to the concentration of the analyte of interest, and such a model can be used to efficiently predict the concentrations of new samples.
Techniques in multivariate calibration are often broadly categorized as classical or inverse methods.[5][8] The principal difference between these approaches is that in classical calibration the models are solved such that they are optimal in describing the measured analytical responses (e.g., spectra) and can therefore be considered optimal descriptors, whereas in inverse methods the models are solved to be optimal in predicting the properties of interest (e.g., concentrations, optimal predictors).[9] Inverse methods usually require less physical knowledge of the chemical system, and at least in theory provide superior predictions in the mean-squared error sense,[10][11][12] and hence inverse approaches tend to be more frequently applied in contemporary multivariate calibration.
A main advantage of multivariate calibration techniques is that fast, cheap, or non-destructive analytical measurements (such as optical spectroscopy) can be used to estimate sample properties which would otherwise require time-consuming, expensive or destructive testing (such as HPLC-MS). Equally important is that multivariate calibration allows for accurate quantitative analysis in the presence of heavy interference by other analytes. The selectivity of the analytical method is provided as much by the mathematical calibration as by the analytical measurement modalities. For example, near-infrared spectra, which are extremely broad and non-selective compared to other analytical techniques (such as infrared or Raman spectra), can often be used successfully in conjunction with carefully developed multivariate calibration methods to predict concentrations of analytes in very complex matrices.
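The interference point can be illustrated with a small classical least-squares sketch, assuming the pure-component spectra of an analyte and an interferent are known. The two five-channel spectra and the mixture composition are invented for the example; inverse methods such as PLS would instead learn the model from calibration samples and are not shown here.

```python
def cls_concentrations(mixture, pure_a, pure_b):
    """Least-squares fit of mixture ~ ca * pure_a + cb * pure_b,
    solved through the 2x2 normal equations."""
    aa = sum(a * a for a in pure_a)
    bb = sum(b * b for b in pure_b)
    ab = sum(a * b for a, b in zip(pure_a, pure_b))
    am = sum(a * m for a, m in zip(pure_a, mixture))
    bm = sum(b * m for b, m in zip(pure_b, mixture))
    det = aa * bb - ab * ab
    ca = (am * bb - ab * bm) / det
    cb = (aa * bm - ab * am) / det
    return ca, cb


# Hypothetical, heavily overlapping spectra at five channels
pure_analyte = [0.2, 0.8, 1.0, 0.6, 0.2]
pure_interferent = [0.1, 0.4, 0.8, 1.0, 0.5]
mixture = [0.3 * a + 0.7 * b for a, b in zip(pure_analyte, pure_interferent)]

# Univariate estimate from the analyte's peak channel alone
naive_estimate = mixture[2] / pure_analyte[2]

ca, cb = cls_concentrations(mixture, pure_analyte, pure_interferent)
print("single-channel estimate:", round(naive_estimate, 3))
print("multivariate estimate:", round(ca, 3), round(cb, 3))
```

The single-channel reading attributes the interferent's absorbance to the analyte and overestimates it badly, while the multichannel fit recovers both concentrations from the same measurement.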
Classification, pattern recognition, clustering
Supervised multivariate classification techniques are closely related to multivariate calibration techniques in that a calibration or training set is used to develop a mathematical model capable of classifying future samples. The techniques employed in chemometrics are similar to those used in other fields – multivariate discriminant analysis, logistic regression, neural networks, regression/classification trees. The use of rank reduction techniques in conjunction with these conventional classification methods is routine in chemometrics, for example discriminant analysis on principal components or partial least squares scores.
Unsupervised classification (also termed cluster analysis) is also commonly used to discover patterns in complex data sets, and again many of the core techniques used in chemometrics are common to other fields such as machine learning and statistical learning.
Multivariate curve resolution
In chemometric parlance, multivariate curve resolution seeks to deconstruct data sets with limited or absent reference information and system knowledge. Some of the earliest work on these techniques was done by Lawton and Sylvestre in the early 1970s.[13][14] These approaches are also called self-modeling mixture analysis, blind source/signal separation, and spectral unmixing. For example, from a data set comprising fluorescence spectra from a series of samples each containing multiple fluorophores, multivariate curve resolution methods can be used to extract the fluorescence spectra of the individual fluorophores, along with their relative concentrations in each of the samples, essentially unmixing the total fluorescence spectrum into the contributions from the individual components. The problem is usually ill-determined due to rotational ambiguity (many possible solutions can equivalently represent the measured data), so the application of additional constraints is common, such as non-negativity, unimodality, or known interrelationships between the individual components (e.g., kinetic or mass-balance constraints).[15][16]
Other techniques
Experimental design remains a core area of study in chemometrics and several monographs are specifically devoted to experimental design in chemical applications.[17][18] Sound principles of experimental design have been widely adopted within the chemometrics community, although many complex experiments are purely observational, and there can be little control over the properties and interrelationships of the samples and sample properties.
Signal processing is also a critical component of almost all chemometric applications, particularly the use of signal pretreatments to condition data prior to calibration or classification. The techniques employed commonly in chemometrics are often closely related to those used in related fields.[19]
Performance characterization and figures of merit
Like most arenas in the physical sciences, chemometrics is quantitatively oriented, so considerable emphasis is placed on performance characterization, model selection, verification and validation, and figures of merit. The performance of quantitative models is usually specified by the root mean squared error in predicting the attribute of interest, and the performance of classifiers as a true-positive rate/false-positive rate pair (or a full ROC curve). A recent report by Olivieri et al. provides a comprehensive overview of figures of merit and uncertainty estimation in multivariate calibration, including multivariate definitions of selectivity, sensitivity, SNR and prediction interval estimation.[20] Chemometric model selection usually involves the use of tools such as resampling (including bootstrap, permutation, cross-validation).
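The two figures of merit named above can be computed in a few lines. The sketch below assumes reference and predicted values for a quantitative model and, for a classifier, binary labels (1 = positive) with a continuous score that is thresholded; the numbers are made up for illustration.

```python
import math


def rmse(reference, predicted):
    """Root mean squared error of prediction."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def tpr_fpr(labels, scores, threshold):
    """True-positive and false-positive rate when score >= threshold means 'positive'."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives


def roc_points(labels, scores):
    """One (TPR, FPR) pair per distinct score, from strictest to loosest threshold."""
    return [tpr_fpr(labels, scores, t) for t in sorted(set(scores), reverse=True)]


# Illustrative data
error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
curve = roc_points(labels, scores)

print("RMSE:", round(error, 4))
print("TPR/FPR at 0.5:", tpr_fpr(labels, scores, 0.5))
print("ROC points:", curve)
```

In practice these quantities would be computed on samples held out from model building, for example within a cross-validation loop.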
Multivariate statistical process control (MSPC), modeling and optimization account for a substantial amount of historical chemometric development.[21][22][23] Spectroscopy has been used successfully for online monitoring of manufacturing processes for 30–40 years, and this process data is highly amenable to chemometric modeling. Specifically in terms of MSPC, multiway modeling of batch and continuous processes is increasingly common in industry and remains an active area of research in chemometrics and chemical engineering. Process analytical chemistry, as it was originally termed,[24] or process analytical technology, to use the newer term, continues to draw heavily on chemometric methods and MSPC.
Multiway methods are heavily used in chemometric applications.[25][26] These are higher-order extensions of more widely used methods. For example, while the analysis of a table (matrix, or second-order array) of data is routine in several fields, multiway methods are applied to data sets that involve third-, fourth-, or higher-order arrays. Data of this type are very common in chemistry. For example, a liquid-chromatography / mass spectrometry (LC-MS) system generates a large matrix of data (elution time versus m/z) for each sample analyzed. The data across multiple samples thus comprise a data cube. Batch process modeling involves data sets that have time vs. process variables vs. batch number. The multiway mathematical methods applied to these sorts of problems include PARAFAC, trilinear decomposition, and multiway PLS and PCA.
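The shape of such a data cube, and one common way of preparing it for two-way methods, can be sketched with nested lists. The dimensions (2 samples, 3 elution times, 4 m/z channels) and the encoded values are arbitrary; the unfolding shown is only a rearrangement of the data, not an implementation of PARAFAC or any other multiway decomposition.

```python
def unfold(cube):
    """Unfold a samples x time x m/z cube into a samples x (time * m/z) matrix
    by laying the rows of each sample's matrix end to end."""
    return [[value for time_row in sample for value in time_row] for sample in cube]


n_samples, n_times, n_mz = 2, 3, 4

# Each value encodes its own indices (sample, time, m/z) so the layout is easy to follow
cube = [
    [[100 * s + 10 * t + m for m in range(n_mz)] for t in range(n_times)]
    for s in range(n_samples)
]

unfolded = unfold(cube)
print(len(unfolded), "rows x", len(unfolded[0]), "columns")
print(unfolded[1])
```

Each sample becomes one long row, so an ordinary matrix method can be applied to the unfolded table, at the cost of ignoring the separate time and m/z structure that true multiway models exploit.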
Chemogenomics
Chemogenomics, or Chemical Genomics, is the systematic screening of targeted chemical libraries of small molecules against individual drug target families (e.g., GPCRs, nuclear receptors, kinases, proteases, etc.) with the ultimate goal of identification of novel drugs and drug targets.[1]
Typically some members of a target library have been well characterized
where both the function has been determined and compounds that modulate
the function of those targets (ligands in the case of receptors, inhibitors of enzymes, or blockers of ion channels)
have been identified. Other members of the target family may have
unknown function with no known ligands and hence are classified as orphan receptors.
By identifying screening hits that modulate the activity of the less
well characterized members of the target family, the function of these
novel targets can be elucidated. Furthermore the hits for these targets can be used as a starting point for drug discovery.
A common method to construct a targeted chemical library is to include known ligands of at least one and preferably several members of the target family. Since a portion of ligands that were designed and synthesized to bind to one family member will also bind to additional family members, the compounds contained in a targeted chemical library should collectively bind to a high percentage of the target family.
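The notion of collective coverage can be expressed as a simple set calculation. The sketch below assumes a table of which family members each library compound is known to bind; the compound and target names are placeholders, not real annotations.

```python
def family_coverage(library, family):
    """Fraction of the target family bound by at least one library compound."""
    covered = set()
    for targets in library.values():
        covered |= targets & family
    return len(covered) / len(family)


# Hypothetical kinase family and known binding annotations
kinase_family = {"K1", "K2", "K3", "K4"}
targeted_library = {
    "cmpd_1": {"K1", "K2"},
    "cmpd_2": {"K2", "K3"},
    "cmpd_3": {"K1"},
}

coverage = family_coverage(targeted_library, kinase_family)
print("family coverage:", coverage)
```

Here three compounds together reach three of the four family members; the uncovered member, K4, plays the role of a less well characterized target for which screening hits would still have to be found.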
CHEMOINFORMATICS
Cheminformatics (also known as chemoinformatics and chemical informatics) is the use of computer and informational techniques applied to a range of problems in the field of chemistry. These in silico techniques are used, for example, by pharmaceutical companies in the process of drug discovery. These methods can also be used in chemical and allied industries in various other forms.
History
The term chemoinformatics was defined by F.K. Brown[1][2] in 1998:
"Chemoinformatics is the mixing of those information resources to transform data into information and information into knowledge for the intended purpose of making better decisions faster in the area of drug lead identification and optimization."
Since then, both spellings have been used, and some have evolved to be established as Cheminformatics,[3] while European Academia settled in 2006 for Chemoinformatics.[4] The recent establishment of the Journal of Cheminformatics is a strong push towards the shorter variant.
Basics
Cheminformatics combines the scientific working fields of chemistry, computer science and information science, for example in the areas of topology, chemical graph theory, information retrieval and data mining in the chemical space.[5][6][7] Cheminformatics can also be applied to data analysis for various industries like paper and pulp, dyes and such allied industries.
Applications
Storage and retrieval
Main article: Chemical data and databases
The primary application of cheminformatics is in the storage, indexing and search of information relating to compounds. The efficient search of such stored information includes topics that are dealt with in computer science, such as data mining, information retrieval, information extraction and machine learning. Related research topics include:
File formats
Main article: Chemical file format
The in silico representation of chemical structures uses specialized formats such as the XML-based Chemical Markup Language or SMILES. These representations are often used for storage in large chemical databases.
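As a small illustration of how compact a line notation like SMILES is, the sketch below counts heavy atoms in a SMILES string. It is deliberately limited and assumes only unbracketed atoms of the organic subset (B, C, N, O, P, S, F, Cl, Br, I and their aromatic lowercase forms); bracket atoms, charges, isotopes and implicit hydrogens are not handled, and the function name is made up for the example. A real application would rely on a cheminformatics toolkit instead.

```python
import re
from collections import Counter

# Two-letter symbols must be tried before their one-letter prefixes (Cl before C, Br before B)
ATOM_PATTERN = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")


def count_heavy_atoms(smiles):
    """Count heavy atoms in a SMILES string restricted to unbracketed organic-subset atoms."""
    if "[" in smiles:
        raise ValueError("bracket atoms are outside the scope of this sketch")
    # Aromatic atoms are written in lowercase; capitalize() maps them to the element symbol
    return Counter(token.capitalize() for token in ATOM_PATTERN.findall(smiles))


for name, smiles in [
    ("ethanol", "CCO"),
    ("acetic acid", "CC(=O)O"),
    ("benzene", "c1ccccc1"),
    ("dichloromethane", "ClCCl"),
]:
    print(name, dict(count_heavy_atoms(smiles)))
```

Bond symbols, branch parentheses and ring-closure digits are simply skipped by the pattern, which is enough for counting atoms but not for reconstructing the molecular graph.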
While some formats are suited for visual representations in 2 or 3 dimensions, others are more suited for studying physical interactions, modeling and docking studies.
Virtual libraries
Chemical data can pertain to real or virtual molecules. Virtual libraries of compounds may be generated in various ways to explore chemical space and hypothesize novel compounds with desired properties.
Virtual libraries of classes of compounds (drugs, natural products, diversity-oriented synthetic products) were recently generated using the FOG (fragment optimized growth) algorithm.[8] This was done by using cheminformatic tools to train transition probabilities of a Markov chain on authentic classes of compounds, and then using the Markov chain to generate novel compounds that were similar to the training database.
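The general idea of training a Markov chain and sampling from it can be sketched on toy data. The example below is not the FOG algorithm: it assumes each training "compound" is just an ordered list of fragment names (all hypothetical), estimates first-order transition counts between consecutive fragments, and samples new fragment sequences from those counts.

```python
import random
from collections import defaultdict


def train_transitions(sequences):
    """Count how often each fragment follows another, with start and end markers."""
    counts = defaultdict(lambda: defaultdict(int))
    for sequence in sequences:
        path = ["<start>"] + list(sequence) + ["<end>"]
        for current, following in zip(path, path[1:]):
            counts[current][following] += 1
    return counts


def generate(counts, rng, max_len=10):
    """Sample one fragment sequence by walking the chain until the end marker."""
    state, output = "<start>", []
    while len(output) < max_len:
        options = counts[state]
        following = rng.choices(list(options), weights=list(options.values()))[0]
        if following == "<end>":
            break
        output.append(following)
        state = following
    return output


# Hypothetical training set of fragment sequences
training = [
    ["aryl", "amide", "piperidine"],
    ["aryl", "ether", "piperidine"],
    ["heteroaryl", "amide", "morpholine"],
]

model = train_transitions(training)
rng = random.Random(0)  # fixed seed so the sketch is repeatable
virtual_library = [generate(model, rng) for _ in range(5)]
for compound in virtual_library:
    print("-".join(compound))
```

Every sampled sequence uses only transitions seen in the training data, yet combinations absent from it, such as heteroaryl-amide-piperidine, can appear. That is the sense in which the generated compounds resemble the training database while being new.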
Virtual screening
Main article: Virtual screening
In contrast to high-throughput screening, virtual screening involves computationally screening in silico libraries of compounds, by means of various methods such as docking, to identify members likely to possess desired properties such as biological activity against a given target. In some cases, combinatorial chemistry is used in the development of the library to increase the efficiency in mining the chemical space. More commonly, a diverse library of small molecules or natural products is screened.
Quantitative structure-activity relationship (QSAR)
Main article: Quantitative structure-activity relationship
This is the calculation of quantitative structure-activity relationship and quantitative structure property relationship
values, used to predict the activity of compounds from their
structures. In this context there is also a strong relationship to Chemometrics. Chemical expert systems are also relevant, since they represent parts of chemical knowledge as an in silico representation.
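In its simplest form, a QSAR model is a regression of measured activity on one or more computed descriptors. The sketch below fits a one-descriptor straight line by ordinary least squares; the descriptor (a logP-like value), the activities (pIC50-like values) and the function names are invented for illustration, and real QSAR work uses many descriptors and careful validation.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(descriptor, slope, intercept):
    """Predict activity for a new compound from its descriptor value."""
    return slope * descriptor + intercept


# Hypothetical training compounds: descriptor value and measured activity
descriptor_values = [1.0, 2.0, 3.0, 4.0]
activities = [5.1, 5.9, 7.1, 7.9]

slope, intercept = fit_line(descriptor_values, activities)
predicted = predict(2.5, slope, intercept)
print("slope:", round(slope, 3), "intercept:", round(intercept, 3))
print("predicted activity at 2.5:", round(predicted, 3))
```

The fitted line is then used to estimate the activity of a compound that has not been measured, which is the predictive use of QSAR described above.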