Wednesday, June 26, 2013

Gamess (US) frequently asked questions Part 2: Installation in Linux boxes


This is a guest post by Kirill Berezovsky (Petrozavodsk State University), based on instructions he posted some time ago on the Gamess (US) list.



First, you need to get some software!
• Fortran compiler: gfortran or Intel Fortran (ifort);
• Math libraries: ACML (for AMD CPUs), ATLAS or MKL (Intel Math Kernel Library);
• (optionally) MPI: Intel MPI, OpenMPI, MVAPICH2, ... (if you are using MPI);
(bold is better)

  • I do not recommend OpenMPI because it is considerably slower than Intel MPI.
    But if you want to try it, go to this site, which gives a very short and informative recipe for building 64-bit OpenMPI.

For non-commercial use, you can get this software from
http://software.intel.com/en-us/non-commercial-software-development
(Intel Fortran and MKL are included in Intel® Fortran Composer XE 2013 for Linux)

Like the GAMESS (US) developers, I’m using:
• ifort 12.0.4 (included in Intel Fortran Composer XE 2011 update 4, “l_fcompxe_2011.4.191.tgz”)
• MKL 11.0 (included in Intel Fortran Composer XE 2013 initial release, “l_fcompxe_2013.0.079.tgz”).

Be aware, ifort 13.0.0 can’t compile some GAMESS (US) objects!

You can also install individual components from these archives: read the installer prompts carefully. This will save disk space and may protect you from some mistakes.




Once you have these packages, let’s start configuring your system!
I recommend doing this as root.

Before starting, configure shared memory with a few terminal commands. First, answer the question: how much RAM does your PC have? For example, my PC has 4 GB RAM, so in bytes it will be:

4 GB = 4*1024 MB = 4*1024*1024 KB = 4*1024*1024*1024 bytes = 4294967296 bytes.

Next, halve it: 4294967296 bytes / 2 = 2147483648 bytes.
This number is the maximum size of one shared memory segment.

Next, the total size of shared memory in pages (of 4096 bytes each) will be: 2147483648 / 4096 = 524288 pages.

Write these numbers into /etc/sysctl.conf:
echo "kernel.shmmax=2147483648" >> /etc/sysctl.conf
echo "kernel.shmall=524288" >> /etc/sysctl.conf
And restart your machine.
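
The arithmetic above generalizes to any amount of RAM. Here is a minimal Python sketch of it, assuming 4096-byte pages and the same half-of-RAM rule used above. The function name `shm_settings` is made up for this illustration.

```python
def shm_settings(ram_gb, page_size=4096):
    """Return (shmmax, shmall) following the recipe in the text.

    shmmax: half of the RAM, in bytes (largest single shared memory segment).
    shmall: that same amount expressed in pages (page_size bytes each).
    """
    ram_bytes = ram_gb * 1024**3
    shmmax = ram_bytes // 2
    shmall = shmmax // page_size
    return shmmax, shmall


if __name__ == "__main__":
    shmmax, shmall = shm_settings(4)  # the 4 GB machine from the example
    print(f"kernel.shmmax={shmmax}")
    print(f"kernel.shmall={shmall}")
```

For the 4 GB example this prints the same two values that are written to /etc/sysctl.conf above.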

Install the packages!
1. It is better to use a 64-bit Linux; any distribution will do. I’m using Debian 6.0.7;

2. In terminal, install these packages (just in case):
apt-get install tcsh gcc g++ gfortran build-essential dpkg-dev binutils zlib1g-dev

3. Install Intel Fortran Composer XE 2011;
   1. Unpack the downloaded archive: tar xzvf l_fcompxe_2011.4.191.tgz
   2. Go to the unpacked folder: cd l_fcompxe_2011.4.191
   3. Run the installer: ./install.sh
   4. ...And follow the instructions

4. Install GAMESS (US);
   1. Get the GAMESS archive;
   2. Unpack it (by default in /usr/local/);
   3. Go to the GAMESS folder and run the script: ./config
   4. Answer the questions:
      1. Target machine name: linux64
      2. GAMESS location: /usr/local/gamess
      3. Build location: /usr/local/gamess
      4. GAMESS executable version name (any name you want; in this example, we will use "cpu"): cpu
      5. Fortran compiler (choose the one you use):
         • ifort --> version: 12
         • gfortran --> version (e.g. 4.4), which you can find by running in another terminal: gfortran -v
      6. Math library: mkl
      7. Math library location: /opt/intel/composerxe_2011/mkl (verify it for your installation!)
      8. When it shows a string containing 'bin' and 'lib', type: skip
      9. If you are not using MPI, type: sockets and you will finish the configuration;

      10. Otherwise, if you are using MPI, type: mpi
          1. Next, choose the MPI program: impi
          2. Select the MPI location directory, and go on.

      11. Answer “no” to the “LIBCCHEM” question. Even if you are using an NVIDIA GPU for calculations, answer “no” during this first configuration, go on, and don’t forget to read the important note after these steps.

5. Go to the ddi folder: cd ddi
6. Edit the 'compddi' script: gedit compddi
7. Find and change these lines:
   1. set MAXCPUS=4 (the number of cores; is it 4 on your machine?)
   2. set MAXNODES=1 (for a single node)
8. Run the 'compddi' script: ./compddi
   • If you are NOT using MPI, a ddikick.x file will be created; move it by: mv ddikick.x .. (two dots mean the parent folder)

9. Go to the parent folder: cd ..
10. Then compile GAMESS (US): ./compall

11. Link by running ./lked gamess cpu, which creates the gamess.cpu.x file. Of course, you can use any version name you want, not only 'cpu'.


12. Edit the 'rungms' script:
  • See here the rungms script rewritten by Kirill.

Next you should create folders:
  • mkdir /scr
  • mkdir /scr/root
  • mkdir /root/scr

Add these environment variables to the ~/.bashrc file:

# iFort
export PATH=/opt/intel/composerxe-2011.4.191/bin/intel64:$PATH
export LD_LIBRARY_PATH=/opt/intel/composerxe-2011.4.191/compiler/lib/intel64:$LD_LIBRARY_PATH

# iMPI
export PATH=/opt/intel/impi/4.0.2.003/intel64/bin:$PATH
export LD_LIBRARY_PATH=/opt/intel/impi/4.0.2.003/intel64/lib:$LD_LIBRARY_PATH

# MKL
export LD_LIBRARY_PATH=/opt/intel/composer_xe_2013.1.117/mkl/lib/intel64:$LD_LIBRARY_PATH


To run GAMESS (US) just type:

/usr/local/gamess/rungms [input file] [optionally, the 'version name' of your gamess file ("cpu" in this example)]
just like this:

/usr/local/gamess/rungms BSi85H95.inp
/usr/local/gamess/rungms BSi85H95.inp cpu

Both commands will run exactly the same.

To make running simpler, edit ~/.bashrc:
gedit ~/.bashrc
• add the line: alias gamess='/usr/local/gamess/rungms'
• apply the changes with: source ~/.bashrc
• and now you can run it by:

gamess BSi85H95.inp cpu

END!

Thursday, June 20, 2013

Science by press release

I woke up today with the news that researchers at the University of Aveiro had, "for the first time", altered the translational apparatus of an organism. I was outraged by the news: not by the science itself, but by the mindless hype surrounding it: actually, such a modification had already been performed in 2011 in C. elegans. I first thought that the "first time evah" pitch had been added by ignorant journalists, but the hype was already present in the press release from Univ. Aveiro!
The research publicized today is good and interesting, no doubt about that, but the quest for "good press" should never come at the expense of the truth. There is no excuse for that. Every bit of "good press" achieved with hype/exaggeration unfairly benefits those institutions and/or researchers with no moral qualms, leaving those researchers who are honest enough to not misrepresent their results at a disadvantage.

I've always disliked "science by press release", because (all other things being equal) it disproportionately benefits those who have access to the mass media, or who can afford publicists. Hyped press releases are even worse. And this can only end when science journalists stop relying on press releases to decide what is newsworthy. Though I strongly believe that such a day will not happen in the next 5 × 10^9 years.


Addendum: Previous reports all reassigned a STOP codon to an unnatural amino acid. The report from Univ. Aveiro is indeed the first time that a non-STOP codon has been reassigned in an organism. This difference is unfortunately not present in the press release. I still stand by all other points in my post.

Wednesday, June 19, 2013

Gamess (US) frequently asked questions Part 1: SCF convergence

In spite of the very high quality of the Gamess(US) documentation, the Gamess(US) list is very often flooded with requests from new users regarding the lack of convergence of the SCF procedure. A few words of advice:

When your SCF does not converge, you should re-run the job including a $guess guess=moread $end line, as well as the complete $VEC group present in the output PUNCH file (usually called <jobname>.dat, and present in your scratch directory).

    Addendum:

    Whenever you read a $VEC group from a UHF run you must assign NORB in the $GUESS group. An additional problem is that by default the $VEC group only includes the occupied orbitals, which means that in UHF runs the $VEC group does not include equal numbers of alpha and beta orbitals (e.g., a run with 41 electrons and MULT=2 will have 21 alpha orbitals and 20 beta orbitals). Therefore, if you include

    $guess guess=moread NORB=21 $end

    Gamess will crash because there are not 21 beta orbitals, and if you input

    $guess guess=moread NORB=20 $end

    there will be another error, since there are more than 20 alpha orbitals. In these cases, you should check the number of alpha and beta orbitals. Then, copy the coefficients of the extra alpha orbitals to the end of the beta orbitals. In my example above,

    $guess guess=moread NORB=21 $end

    will yield no problems, since the modification of the VEC group yields equal numbers of alpha and beta orbitals. There is also an option to PUNCH every orbital (occupied+virtuals) at every step. In this case, Gamess always punches a full $VEC group, making it very easy to assign NORB as one can simply inspect the output file to learn the number of orbitals. However, this yields gigantic PUNCH files, and may therefore not be feasible.
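
Checking the number of alpha and beta orbitals by eye is tedious for large $VEC groups. The Python sketch below counts them from the lines between $VEC and $END. It rests on an assumption about the punched layout: each line starts with a 2-character orbital index (wrapping modulo 100) followed by a 3-character line counter that restarts at 1 for every orbital. The function name is made up for this illustration, and the result is ambiguous when the number of alpha orbitals is an exact multiple of 100.

```python
def count_uhf_orbitals(vec_lines):
    """Count alpha and beta orbitals in the body of a UHF $VEC group.

    Assumed layout (verify against your own .dat file): columns 1-2 hold the
    orbital index modulo 100, columns 3-5 hold a line counter that restarts
    at 1 for each orbital. The beta set is assumed to begin when the orbital
    index returns to 1 without having just wrapped from 99 to 0.
    """
    counts = [0]        # counts[0] = alpha, counts[1] = beta (if present)
    prev_index = None
    for line in vec_lines:
        if not line.strip():
            continue
        index = int(line[0:2])
        line_no = int(line[2:5])
        if line_no != 1:
            continue    # continuation line of the current orbital
        if index == 1 and prev_index not in (None, 0):
            counts.append(0)   # index restarted: the beta set begins here
        counts[-1] += 1
        prev_index = index
    alpha = counts[0]
    beta = counts[1] if len(counts) > 1 else 0
    return alpha, beta
```

To use it, read the .dat file, keep only the lines between the $VEC and $END markers, and pass them to the function. A difference between the two counts is the number of alpha orbitals whose coefficients need to be appended to the beta set.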




You should also experiment with changing convergers, damping, etc. Some systems are notoriously hard to converge, and may require several re-iterations of the whole process. 

Thursday, August 30, 2012

Advances in peptide chemistry

Protein synthesis is nowadays achieved through molecular biology techniques: the relevant gene is cloned in an appropriate vector, over-expressed with e.g. a poly-histidine tag, and then purified through high affinity chromatography. Peptide chemistry is therefore often forgotten by biochemists, unless we need to order a short customized peptide from a commercial source.
Danishefsky et al. have now combined solid phase peptide synthesis, native chemical ligation and metal-free dethiylation to synthesize a number of analogues of human parathormone. Their strategy afforded native parathormone with higher purity than obtained from commercial sources, as well as pure analogues not achievable by any other means. These analogues were shown to be much more stable (10% decomposition in 7 days) than parathormone (>90% loss in 7 days), and to be as active as parathormone when injected into mice.
This is very interesting work, which should pave the way towards the synthesis of long-lived synthetic peptide hormones, thus potentially decreasing the number of injections needed to control hormone levels in patients suffering from impaired endocrine function.

Friday, April 13, 2012

Drawing can be torture



Drawing complex three-dimensional molecules in two dimensions can be real torture. I am glad I have never had to draw anything as convoluted as palhinine A. Check the 3-D structure on the left, and try to draw it in less than 10 minutes in ChemDraw or ChemSketch. Good luck!
palhinine A


Thursday, March 15, 2012

QM/MM vs. QM-only studies of large cluster models

How large must a quantum model of an enzyme active site be to achieve optimum results? Proponents of the so-called "cluster model" argue that, most often, good results may be obtained even with small models (< 100 atoms). Fahmi Himo has repeatedly shown that fully including the first layer of amino acids surrounding the reacting substrate (i.e. to about 150 atoms) yields results that are insensitive to the inclusion of a polarizable-continuum solvent field, and has concluded from these data that such models are sufficient to capture all the relevant enzymatic effects on catalysis.

Walter Thiel has now published a QM/MM analysis of the reaction mechanism of acetylene hydratase (previously studied by Fahmi Himo using increasingly large QM-only models). Inclusion of the surrounding protein dramatically changed the results for the largest model studied by Himo, due to the absence (in the "cluster model") of two negatively charged phosphate groups adjacent to the active site. Although these charges are quite "shielded" from the active site because of neighbouring positively-charged amino acids, they give rise to local charge asymmetries that interact differently with the active site during each step of the catalytic cycle. This effect is quite similar to the major influence of the internal protein dipoles on enzyme catalysis expounded by Arieh Warshel, and should be kept in mind by all of us who tend to prefer the QM-only approach: a polarizable-continuum model assumes a homogeneous environment surrounding the QM system, and in proteins "it ain't necessarily so".

Tuesday, November 29, 2011

An interesting hypothesis on the selection of glucose as major fuel source in neurons

Earlier this year, I wondered why neurons preferentially use glucose as fuel. I have now found an interesting paper by Dave Speijer regarding this problem. He proposes the following reasoning to explain this observation:
  • reactive oxygen species are generated in large amounts by NADH dehydrogenase (complex I) when the amount of oxidized ubiquinone is limited
  • generation of large amounts of FADH2 increases the rate of reduction of ubiquinone, and therefore increases indirectly the amount of harmful radical species generated by NADH dehydrogenase
  • glucose oxidation generates a much smaller amount of FADH2 than fatty-acid oxidation. Therefore:


  • Especially vulnerable cells may be expected to have evolved a preference for glucose.

    Incidentally, neurons do seem to lack large amounts of one of the enzymes involved in fatty acid oxidation: thiolase.

    The limits of homology modeling

    The computational prediction of three-dimensional structures of protein sequences may be performed using a wide variety of techniques, such as homology modeling or threading. In threading, the correct fold is searched for by evaluating the energy of the intended sequence when it is "forced" to adopt each of the known folding patterns. In homology modeling, one looks for a high-similarity protein sequence with experimentally-determined 3D structure, and mutates it in silico until the desired sequence is obtained. Many different programs and web-servers are now available for these tasks, differing among themselves in the forcefields used, alignment algorithms, etc. Performance is usually quite good when templates with similarity >40% are used.

    Recently, two small proteins with very high sequence identity (>95%) but widely differing structures have been designed and studied. Starting from a pair of proteins with < 20 % identity and different 3D structures, the authors gradually mutated one sequence into the other, and ended up generating two sequences differing only in one amino acid, but with different folds. Attempts to unravel the precise mechanisms governing the selection of one fold over the other have, however, been inconclusive, because current molecular dynamics protocols and force fields are not accurate enough to measure the small energy differences involved.

    Monday, October 17, 2011

    Limitations of PCM

    A new paper claims to compute the pKa of nitrous acidium ion from gas phase DFT computations followed by estimation of solvation effects by a Polarizable Continuum Method (PCM). It is true that most often geometries do not change too much when going from gas phase to solution, but I strongly doubt the results are as accurate as they could be: PCM does not include the contribution from hydrogen bonds between the solute and the solvent, and I would expect that effect to be quite different in neutral HONO and protonated H2ONO+.

    Thursday, September 29, 2011

    Dividing research into very small chunks...

    Research productivity is most often measured by people who do not have the ability to distinguish good papers from bad papers. Such measurements therefore tend to devolve into mechanical algorithms that count the number of publications and the impact factor of the journal where the research was published, rather than sensible arguments about the merits (or demerits) of the researcher. Evaluating a researcher therefore becomes a "numbers game", where a researcher with a higher number of small papers easily outranks another who has a smaller number of longer, more complex, publications. The race to the "smallest publishable piece of research" increases the number of papers (arguably "good" to the researcher who needs a "good" evaluation) but makes keeping up with the literature more difficult, as one has to keep track of ever increasing numbers of papers with dwindling individual importance. It also detracts from the value of research being reported: in my example today, two papers report computations of very similar compounds. The only difference is the interchange of a nitrogen with a phosphorus atom.
    A single paper would have been much more useful and important, but research managers would count that as less productive :-(


    PS: I happen to disagree strongly with the suggestion, in these papers, of the existence of intramolecular H-bonding, as the angles involved are too small for H-bonds.

    Tuesday, September 27, 2011

    What's in a name?

    The IUPAC distinguishes "Lewis acidity" from "electrophilicity": the first concept relates to the equilibrium constant of the reaction of an electrophile (i.e. the thermodynamics), whereas electrophilicity is related to the rate constant (i.e. the kinetics) of the reaction. However, the actual usage of the words in ordinary chemical parlance is somewhat more ambiguous, as the concepts are often used interchangeably.
    A recent paper on this topic, "Separating Electrophilicity and Lewis Acidity: The Synthesis, Characterization, and Electrochemistry of the Electron Deficient Tris(aryl)boranes B(C6F5)3–n(C6Cl5)n (n = 1–3)", caught my attention. However, this paper does not compare the changes in thermodynamics vs. kinetics of the title compounds upon increasing n. It rather compares their Lewis acidity with their ability to capture an electron (which the authors call electrophilicity). Quite a difference, don't you think?

    Coming soon to a worm near you....

    Three possible stop codons are common in mRNA: UGA, UAA and UAG. These codons usually bind release factors, which prompt the release of the nascent amino acid chain from the ribosome. Some organisms, however, contain tRNA complementary to one of these codons. In these organisms, that codon no longer triggers the ending of the translation process, but codes for an amino acid instead. Several researchers have used this special tRNA to develop mutant cells with expanded genetic codes. Greiss and Chin have now taken this a step further: they have engineered a mutant strain of the worm C. elegans that translates every UAG codon as an artificial amino acid. It was a complex endeavour (details are in their paper...) that surely would have deserved a well-publicized press conference :-)

    Thursday, September 22, 2011

    Puns and wordplay in Science

    In 1975, E. M. Southern developed an elegant method to detect specific DNA after gel electrophoresis (J. Mol. Biol. 98, 503-517). His technique soon became known as "the Southern blot", and the paper has so far gathered >35 thousand citations. This number is a dramatic under-estimate of the impact of the Southern blot in the field of molecular biology, as the technique has become routine and "common knowledge", which means that most practitioners no longer cite the original paper. In 1977, a variation of the technique was developed by Alwine et al. to detect RNA. The name "Northern blotting" was soon proposed for their technique, as a wordplay on the original method. The application of a similar technique to proteins is called "Western blotting".

    Naming methods (or variations) using wordplay is not limited to biochemical techniques. In computational chemistry, novel basis sets obtained from the well-known aug-cc-pVXZ basis set family by decreasing the number of diffuse basis functions have recently been proposed by Don Truhlar. In a humorous touch, the aug- prefix (originally an abbreviation of augmented) was considered an abbreviation of August. The new, smaller, basis sets are therefore called apr-cc-pVXZ, may-cc-pVXZ, jun-cc-pVXZ and jul-cc-pVXZ. Not outright comedy material, but it does bring a quirky smile to your lips, right?

    Tuesday, September 20, 2011

    QM molecular dynamics

    In classical molecular dynamics simulations, we follow the evolution of a system of particles that interact with each other according to Newtonian mechanics. The correct description of chemical bonds, angles and torsions in classical mechanics can only be achieved by introducing carefully parameterized expressions that represent the change in electronic energy upon stretching/compressing a bond, or bending an angle. These parameterized force fields (AMBER, CHARMM, GROMOS, YASARA, OPLS) allow the simulation of very large systems (>10000 atoms) for long simulation times (>20 ns) with an obvious drawback: the quality of the simulations is only as good as the quality of the parameterized expressions, and therefore one is limited to the simulation of specific classes of previously characterized molecules/functional groups. Simulating chemical reactions is generally not possible without special protocols (like thermodynamic integration).
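
To illustrate what such a parameterized expression looks like, many force fields model bond stretching with a simple harmonic term. The following is a minimal Python sketch. The force constant and equilibrium length are placeholder values chosen for illustration, not parameters quoted from any particular force field.

```python
def harmonic_bond_energy(r, k, r0):
    """Energy of a bond stretched or compressed to length r.

    Uses E = k * (r - r0)**2, i.e. the convention without the 1/2 factor
    (as used e.g. by AMBER); k in kcal/(mol*A^2), r and r0 in angstrom.
    """
    return k * (r - r0) ** 2


# Placeholder parameters, for illustration only
K_BOND = 340.0   # kcal/(mol*A^2)
R_EQ = 1.09      # angstrom

print(harmonic_bond_energy(1.09, K_BOND, R_EQ))  # zero at the equilibrium length
print(harmonic_bond_energy(1.19, K_BOND, R_EQ))  # penalty for a 0.1 A stretch
```

Because the harmonic term keeps rising as the bond is stretched, it can never describe bond dissociation, which is one reason why chemical reactions are out of reach for standard force fields.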

    Ab initio molecular simulations (e.g. Car-Parrinello MD) are much more expensive, and are generally limited to (at most) a few dozen atoms and <100 ps. Two papers from Prof. Shogo Sakai's group show that QM molecular simulations can be performed with considerable time-savings if the system is partitioned into several smaller systems. They have not yet developed the theory to the point where one can attempt bond-breaking, but theirs seems a fruitful approach to the problem.

    Thursday, July 14, 2011

    Fe-S clusters

    Biological Fe-S clusters come in many sizes and flavours:
  • 2Fe-2S clusters ligated by four cysteines
  • 2Fe-2S clusters ligated by three cysteines and one aspartate
  • 2Fe-2S clusters ligated by cysteines and histidines (the so-called Rieske clusters)
  • 3Fe-4S clusters ligated by three cysteines
  • 4Fe-4S clusters ligated by four cysteines
  • 4Fe-4S clusters ligated by three cysteines and one aspartate
  • the hideously complex cluster present in hybrid cluster protein (also known as fuscoredoxin or "prismane protein")
  • the P-cluster in nitrogenase
  • etc., etc., etc.
    The large number of electrons in Fe and the complexity of the possible couplings between spin states make the theoretical analysis of the electronic structures in Fe-S clusters quite difficult.
    Takano et al. have recently published a paper on the differences between a Cys3Asp ligated 4Fe-4S cluster and the "regular" (all Cys) 4Fe-4S cluster. The authors nicely analyze the influence of the Asp (and other) ligands on the electronic structure of the 4Fe-4S cluster, observe a -0.10 V difference in redox potential (vs. normal 4Fe-4S) in high dielectric constants, and offer this observation as the reason for the low potential of this cluster.
    I do not accept this last conclusion for two reasons:
  • redox potentials of Cys-ligated 4Fe-4S clusters may differ by >0.4 V from each other, which shows that the influence of the charge distribution of the protein is much more important than the small difference observed by the authors
  • the 0.1 V difference found amounts to ca. 2.3 kcal/mol, which is well within the error range of the computational methods used.
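
The conversion in the last bullet follows from ΔG = -nFΔE. Here is a minimal Python sketch of that arithmetic, assuming a one-electron process (the function name is made up for this illustration):

```python
FARADAY = 96485.332       # Faraday constant, C/mol
JOULES_PER_KCAL = 4184.0  # thermochemical kilocalorie

def potential_to_kcal_per_mol(delta_e_volts, n_electrons=1):
    """Magnitude of the free energy change, |dG| = n * F * |dE|, in kcal/mol."""
    return abs(n_electrons * FARADAY * delta_e_volts) / JOULES_PER_KCAL


print(round(potential_to_kcal_per_mol(0.10), 1))  # 2.3
print(round(potential_to_kcal_per_mol(0.40), 1))  # 9.2
```

The second line puts the >0.4 V spread among Cys-ligated clusters on the same energy scale.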
    Monday, July 11, 2011

    Energy metabolism in brain

    It is a well-known "fact" that under normal conditions glucose is responsible for providing almost all the energy needed by the healthy brain. However, it is not at all clear why that should be so: after all, fatty acids are well known to cross the blood-brain barrier. Why shouldn't they be substrates for beta-oxidation in neurons? After browsing the literature, I still do not have an answer to that question. The Gene Expression Database reports that the enzymes involved in beta-oxidation are indeed expressed in brain, but it is not clear if the data are from tissue homogenates or from purified neurons/astrocytes, etc. Back in 1993, Ebert et al. showed that ca. 20% of the brain's energy needs may be met by medium-chain fatty acids. Drawing on earlier research by other authors, Ebert et al. concluded that astrocytes probably account for the fatty acid oxidation, while the neurons survive on glucose alone (or a mixture of glucose and lactate provided by the astrocytes themselves).

    I would still like to find an explanation for the neurons' dependence on glucose (or glucose/lactate). Any ideas?


    Wednesday, July 6, 2011

    Should we suspect any shameless self-promotion in some Impact Factors?

    Selecting the journal for your next submission is a decision with lots of variables:

  • how likely is the journal to find your work "sexy" enough?
  • what is its impact factor?
  • how long does the journal take from acceptance to online/paper publication?
  • how desperate are you to get your paper published?




    Ideally, impact factor would be an objective measurement... We all know, however, that the actual relationship between "real journal impact" and the impact factor is not always perfect: a single paper with many citations in a small journal may increase its IF dramatically, even if all other papers in that journal are less cited than the papers from preceding years; citations may be inflated artificially by authors citing themselves to exhaustion; bad papers may be highly cited (e.g. in refutations), etc.
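
To see how much a single paper can move the number, recall that the two-year impact factor is the number of citations received in a given year by the papers published in the two preceding years, divided by the number of those papers. Here is a small Python sketch with entirely hypothetical citation counts:

```python
def impact_factor(citations_per_paper):
    """Two-year impact factor from a list holding, for each paper published
    in the two preceding years, the citations it received this year."""
    return sum(citations_per_paper) / len(citations_per_paper)


# Hypothetical small journal: 50 papers, each cited once this year
typical = [1] * 50
# Same journal, but one of the 50 papers is cited 40 times
with_outlier = [1] * 49 + [40]

print(impact_factor(typical))       # 1.0
print(impact_factor(with_outlier))  # 1.78
```

In this made-up case one paper raises the impact factor by almost 80%, although the other 49 papers are cited exactly as before.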
    I have now found (entirely by accident) a journal that increased its impact factor five-fold from 2009 to 2010. That would be surprising in itself. But the real surprise is that in August 2010, this journal published a paper that has thus far received 37 citations, ALL IN THIS SAME JOURNAL.

    You may check for yourselves in Web of Science. The paper is

    Aman MJ, Karauzum H, Bowden MG, Nguyen TL (2010) "Structural Model of the Pre-pore Ring-like Structure of Panton-Valentine Leukocidin: Providing Dimensionality to Biophysical and Mutational Data" J. Biomol. Struct. Dyn., 28, 1-12



    This is not the only surprise. Other papers with high citations are:

    Tao Y, Rao ZH, Liu SQ (2010) "Insight Derived from Molecular Dynamics Simulation into Substrate-Induced Changes in Protein Motions of Proteinase K" J. Biomol. Struct. Dyn., 28, 143-157 (36 citations, of which 35 in J. Biomol. Struct. Dyn.)


    Sklenovsky P, Otyepka M (2010) "In Silico Structural and Functional Analysis of Fragments of the Ankyrin Repeat Protein P18(INK4c)" J. Biomol. Struct. Dyn., 27, 521-539 (36 citations, of which 35 in J. Biomol. Struct. Dyn.)


    Zhang JP (2009) "Studies on the Structural Stability of Rabbit Prion Probed by Molecular Dynamics Simulations" J. Biomol. Struct. Dyn., 27, 159-162 (36 citations, of which 31 in J. Biomol. Struct. Dyn. and 4 others are self-citations by the author)

    Chen CYC, Chen YF, Wu CH, Tsai (2008) "What is the effective component in suanzaoren decoction for curing insomnia? Discovery by virtual screening and molecular dynamic simulation " J. Biomol. Struct. Dyn., 26, 57-64 (35 citations, of which 21 in J. Biomol. Struct. Dyn. and 11 others are self-citations by the author)

    Mittal A, Jayaram B, Shenoy S, Bawa TS (2010) "A Stoichiometry Driven Universal Spatial Organization of Backbones of Folded Proteins: Are there Chargaff's Rules for Protein Folding?" J. Biomol. Struct. Dyn., 28, 133-142 (34 citations, of which 33 in J. Biomol. Struct. Dyn.)





    Wednesday, June 29, 2011

    "We are pleased to invite you....."

    The recent trend toward open access science publishing has yielded a very uneven crop of journals. We do have a few respected Open Access-only publications with high quality research (PLoS ONE and many titles on BioMed Central), but there is also a very large number of publishing firms that email researchers to solicit submissions to brand new Open Access journals. I have received several of these emails, which always claim to have selected me because of my expertise on the topic even though I have often not published anything on it, or even on related subjects. So far, I have received requests to submit reviews to:

  • a special issue on protein biogenesis in "Archaea". (I have studied enzymes of P. furiosus, but never did any work on protein biogenesis or post-translational modifications)
  • International Journal of Medicinal Chemistry
  • Recent Patents on DNA and Gene Sequences (I have never done any sequencing, but that did not prevent the editors from considering me an expert on the area ;-)

    This morning I received the most ludicrous example of "scientific" spam: I was invited to present my work on "A tale of two acids: when arginine is a more appropriate acid than H3O+" to the "EPS Montreal International Renewable Energy Forum 2011". Definitely off-topic!

    Thursday, July 1, 2010

    Computing redox potentials

    First-principles computation of redox potentials in solution is a difficult task due to the large number of solvent molecules that must be included. As the computational cost increases steeply with the number of basis functions, a common approach consists of performing a geometry optimization of the reduced and oxidized species in vacuo, and then computing the energy of these species with a larger basis set and a continuum method that represents the influence of the solvent on the solute electron distribution. Besides the error introduced by assuming that the geometry does not change upon solvation, this approach includes two main sources of errors:
    a) the intrinsic error of the theoretical level used to compute the electronic energies
    b) the error associated with the continuum method itself.

    Whereas the first error may be rigorously quantified by comparison with experimental gas phase values and made very small with the choice of an appropriate basis set/theory level combination, most continuum methods yield less predictable errors (especially when the redox-active portion of the solute is present in a very heterogeneous environment, like an enzyme active site).
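
The final step of the common approach described above, turning the two computed energies into a potential, is simple arithmetic. Here is a minimal Python sketch, assuming the energies of the oxidized and reduced species are (solvated) free energies in hartree and that the potential is referred to an absolute reference-electrode potential supplied by the user. The function name and the example energies are hypothetical, and the 4.44 V used in the example is one commonly quoted estimate for the absolute potential of the standard hydrogen electrode (other estimates exist).

```python
HARTREE_TO_EV = 27.211386  # 1 hartree in electron-volts

def redox_potential(g_ox, g_red, e_ref, n_electrons=1):
    """Reduction potential (in V) relative to a reference electrode.

    g_ox, g_red: free energies (hartree) of the oxidized and reduced species.
    e_ref: absolute potential (V) of the reference electrode (an input
           assumption, not something this sketch computes).
    """
    delta_g_ev = (g_red - g_ox) * HARTREE_TO_EV  # dG of Ox + n e- -> Red
    e_absolute = -delta_g_ev / n_electrons       # E = -dG / (n F), in volts
    return e_absolute - e_ref


# Hypothetical energies: the reduced form lies 0.18 hartree below the oxidized form
print(redox_potential(g_ox=0.0, g_red=-0.18, e_ref=4.44))
```

Since 0.1 V corresponds to only about 2.3 kcal/mol, small errors in either computed energy translate directly into visible errors in the potential.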


    Dejun Si and Hui Li have now improved the continuum solvation methods by including the possibility of assigning different dielectric constants to different parts of the solute cavity surface, thus improving the description of heterogeneous environments. These authors have also shown this approach to correctly predict the relative redox potentials of the type I copper centers (optimized in vacuo) in eleven different proteins with maximum errors < 0.1 V (provided that the systems include approximately 100 protein atoms around the Cu center). The error can be reduced to < 0.05 V by optimizing the geometries using the newly-developed heterogeneous polarizable continuum.

    This new continuum method is implemented in the latest release of GAMESS, a free and very powerful quantum chemistry package available from Mark Gordon's group, at Iowa State University.


    Wednesday, June 2, 2010

    Making the most out of a failed experiment

    The scientific literature is heavily biased towards positive results, and it is therefore a matter of considerable dismay to realize that the experiments one has been doing do not work. Organic Letters today has a new paper showing how to snatch victory from the jaws of (experimental) defeat. The title says it all: Unlikeliness of Pd-Free Gold(I)-Catalyzed Sonogashira Coupling Reactions. Most of us would probably sigh, bang the table, and bury these results in a dissertation footnote where they might never be found by anyone. Congratulations to the authors for their perseverance and for teaching us something more about what does not work in Sonogashira couplings!