The newzmat utility was designed primarily for converting molecule specifications between a variety of standard formats. It can also perform many related functions, such as extracting molecule specifications from Gaussian checkpoint files. Its full set of capabilities includes the following:
newzmat can convert molecule specifications between a variety of data file formats. This includes generating a Z-matrix (and hence input for Gaussian) from the files produced by other programs and also converting between the file formats of any of these programs. newzmat can thus be used to produce Gaussian input from the data files of many popular graphics and mechanics packages, allowing them to act as graphical input front-ends to Gaussian. The resulting data files have the proper symmetry constraints for efficient computation (if applicable).
newzmat can also generate Gaussian 09 checkpoint files from other data files, and (more importantly) generate the data files from checkpoint files. This capability can be used to extract data for display with a visualization package.
newzmat can retrieve intermediate structures from a checkpoint file from (or during) a geometry optimization, for reuse or display.
newzmat has the following general syntax:
newzmat option(s) input-file output-file
where option(s) is one or more options specifying the desired operations, input-file is the file containing the structure to be converted (or retrieved), and output-file is the file in which to place the new molecule specification (or Gaussian input). A slightly different syntax is used when merging information from two files (see below).
If the output filename is omitted, it is given the same base name as the input file, along with a conventional extension denoting its file type. In general, extensions can be omitted from file specifications provided that extension conventions are followed. The default extensions are listed in the following table:
Extension | Description | Option Form |
---|---|---|
.bgf | Biograf internal data file | bgf |
.cac | CaChe molecule file | cache |
.chk | Gaussian 09 checkpoint file | chk |
.com | Gaussian input file (Z-matrix mol. spec.) | zmat |
.com | Gaussian input file (Cart. coords. mol. spec.) | cart |
.con | QUIPU system data file | con |
.dat | Model/XModel/MM2 data file | model |
.dat | MacroModel data file (may be formatted or unformatted) | mmodel, ummodel |
.ent | Brookhaven data file (≡PDB) | ent |
.com | Fractional coords. for crystal structures (requires exactly 3 trans. ops.) | fract |
.inp | MOPAC input file | mopac |
.pdb | Protein Data Bank format (≡Brookhaven) | pdb |
.ppp | Some PPP program (output only) | ppp |
.xyz | Unadorned Cartesian coordinates | xyz |
.zin | Ancient version of ZIndo | zindo |
The options specifying the formats of the input and output molecule specifications are formed from the string -i or -o (respectively), followed immediately by the appropriate option form string from the preceding table corresponding to the desired molecule specification format (no spaces intervene). For example, -ipdb indicates that the input molecule specification is in PDB format and that the extension .pdb should be applied to the input filename if no extension is specified. Similarly, -oxyz specifies an output format of Cartesian coordinates along with a default extension of .xyz for the output filename. The default input and output options are -izmat and -ozmat. Note that -izmat and -icart are synonyms, and either one of them can read a Gaussian input file containing any molecule specification format: Z-matrix, Cartesian coordinates, or mixed internal and Cartesian coordinates.
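The naming convention described above can be sketched in Python. This is only an illustration of the documented rules, not newzmat's own code: the names DEFAULT_EXT and resolve_names are hypothetical, and treating any name that already contains a dot as having an explicit extension is an assumption.

```python
from pathlib import Path

# Option form -> default extension, copied from the table above.
DEFAULT_EXT = {
    "bgf": ".bgf", "cache": ".cac", "chk": ".chk",
    "zmat": ".com", "cart": ".com", "fract": ".com",
    "con": ".con", "model": ".dat", "mmodel": ".dat", "ummodel": ".dat",
    "ent": ".ent", "mopac": ".inp", "pdb": ".pdb",
    "ppp": ".ppp", "xyz": ".xyz", "zindo": ".zin",
}


def resolve_names(input_name, output_name=None, in_form="zmat", out_form="zmat"):
    """Return (input, output) filenames with default extensions applied.

    in_form and out_form are the strings that follow -i and -o;
    both default to "zmat", as documented.
    """
    in_path = Path(input_name)
    if not in_path.suffix:
        # No extension given: apply the default for the input format.
        in_path = in_path.with_suffix(DEFAULT_EXT[in_form])

    if output_name is None:
        # Output omitted: reuse the input base name with the output extension.
        out_path = in_path.with_suffix(DEFAULT_EXT[out_form])
    else:
        out_path = Path(output_name)
        if not out_path.suffix:
            out_path = out_path.with_suffix(DEFAULT_EXT[out_form])

    return str(in_path), str(out_path)
```

For instance, resolve_names("water", "h2o", "pdb") corresponds to the arguments -ipdb water h2o and yields water.pdb and h2o.com.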
To exchange data with a visualization system that newzmat does not support directly, try the PDB format first. This format includes connectivity information and is widely supported. Note that some software packages use the .ent extension rather than .pdb; the -ient and -oent options select the former, while -ipdb and -opdb select the latter. Another commonly used alternative is the MOPAC file format.
The following options further specify the input for newzmat:
-ngeom N
Use the geometry from the Nth structure in the checkpoint file. This functions in the same manner as Geom=(NGeom=N).
-ot list
Use the geometries from the listed structures in the checkpoint file. Lists can include multiple structure numbers (separated by commas) and ranges of structure numbers. For example, -ot 3,7 extracts the structures from steps 3 and 7, and -ot 2-5 extracts all structures from steps 2 through 5.
-step N
Use the structure from step N of the geometry optimization data in a Gaussian 09 checkpoint file (valid only for the -ichk input option).
This option is not available for optimizations in redundant internal coordinates (the default coordinate system). Instead, retrieve the structure from the checkpoint file in a subsequent job by using a route section containing Geom=(Check,Step=N).
-ubohr
Distances in the input file are specified in Bohr (the default is Angstroms).
-urad
Angles in the input file are specified in radians (the default is degrees).
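The list syntax accepted by -ot (comma-separated structure numbers and ranges such as 2-5) can be illustrated with a short Python sketch. The function name parse_structure_list is hypothetical, and accepting a mixture of single numbers and ranges in one list (for example 1,3-5) is an assumption that the text above does not confirm.

```python
def parse_structure_list(spec):
    """Expand a list such as "3,7" or "2-5" into explicit step numbers."""
    steps = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # A range "a-b" covers every step from a through b inclusive.
            first, last = (int(x) for x in part.split("-", 1))
            if first > last:
                raise ValueError(f"descending range: {part!r}")
            steps.extend(range(first, last + 1))
        else:
            steps.append(int(part))
    return steps
```

With this sketch, "3,7" expands to steps 3 and 7, and "2-5" expands to steps 2, 3, 4 and 5, matching the two examples given for -ot.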
The following options retrieve charges from the checkpoint file and apply them as the MM charges for both regular atoms and link atoms. These options specify which kind of charges to retrieve:
-qmul
Mulliken charges.
-qesp
ESP-fit charges.
-qaim
AIM charges.
-qnpa
NPA charges.
-qapt
APT charges.
The following options further specify the output file format:
-mof1
Use MacroModel format 1 (only valid with -ommodel).
-mof2
Use MacroModel format 2 (this is the default if -ommodel is specified).
-optprompt
Prompt for which parameters should be optimized; used when setting up a molecule specification destined for a geometry optimization and -ozmat is specified (or no output option is included). By default, all parameters not fixed by symmetry are optimized.
-prompt
Prompt for route section and title section lines and for the charge and multiplicity when using -ozmat (or no output option is specified). Gaussian input files produced by newzmat set up HF/6-31G(d) single point energy calculations by default.
The following command reads the molecule specification from the PDB file water.pdb and writes a Gaussian input file, including the equivalent Z-matrix, to the file h2o.com:
$ newzmat -ipdb water h2o
Charge and multiplicity [0,1]?
Because -ozmat is the default output option, it can be omitted from the command. Pressing Return at the prompt accepts the default values shown in brackets.
newzmat prompts for the charge and multiplicity for the Z-matrix since these items cannot be determined from the PDB file.
The following command reads the molecule specification from the Gaussian 09 checkpoint file job-11234.chk and writes the PDB file propell.pdb:
$ newzmat -ichk -opdb job-11234 propell
The following command reads the molecule specification from step 5 of the optimization from the checkpoint file newopt.chk and produces the MOPAC file step5.inp:
$ newzmat -ichk -omopac -step 5 newopt step5
The following command reads the molecule specification from the MOPAC file newsalt.inp and writes a Gaussian input file including the equivalent Z-matrix to the file newsalt.com, prompting for the route and title sections and the charge and spin multiplicity for the molecule:
$ newzmat -imopac -prompt newsalt
Percent or Route card? # B3LYP/6-31G(d,p) Opt
Route card?
Titles? Optimization of caffeine at B3LYP/6-31G**
Titles?
Charge and Multiplicity? 0,1
A blank line ends the route section, and a blank line likewise ends the title section.
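When newzmat is driven from a script, the command lines shown in these examples can be assembled programmatically. The following Python sketch only builds the argument list; it uses no options beyond those documented here, and the helper name build_newzmat_command is hypothetical.

```python
import shlex


def build_newzmat_command(input_file, output_file=None,
                          in_form=None, out_form=None, extra=()):
    """Assemble a newzmat argument list from its documented parts.

    in_form and out_form are option form strings such as "chk" or "pdb";
    extra holds further options, e.g. ["-step", 5].
    """
    args = ["newzmat"]
    if in_form:
        args.append("-i" + in_form)   # no space between -i and the form
    if out_form:
        args.append("-o" + out_form)  # no space between -o and the form
    args.extend(str(item) for item in extra)
    args.append(input_file)
    if output_file is not None:
        args.append(output_file)
    return args


# Rebuild the step-5 example as a printable command string.
command = build_newzmat_command("newopt", "step5", "chk", "mopac", ["-step", 5])
command_text = shlex.join(command)
```

The resulting list could be passed to subprocess.run, assuming newzmat is on the search path. Commands that prompt (for charge and multiplicity, or with -prompt) would also need their answers supplied on standard input.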
newzmat can create an input file in which data from two different files have been combined. This feature is helpful for setting up ONIOM calculations in which one wants to apply a custom setup (ONIOM layers, MM atom types, MM atom charges, etc.) from one file to another structure in a different file. It allows you to specify the custom setup only once, and then later apply that information to other structures.
The -t and -s options request and control the creation of the merged input file. A merge command has the following general form:
$ newzmat -itype [-otype] -tformat -sitemn files
The command requires that you specify the locations of the existing input file (input), the new input file you are creating (output), and the template file (template). The order of the input, output and template files on the command line varies depending on whether the input and/or template file(s) are Gaussian checkpoint files:
Input is checkpoint file? | Template is checkpoint file? | files arguments |
---|---|---|
n | n | input output ignore template |
n | y | input output template |
y | n | ignore output input template |
y | y | input output template |
In the preceding table, "ignore" is a placeholder.
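The ordering rules in the preceding table can be written down as a small Python function. This is a direct transcription of the four table rows rather than anything taken from newzmat itself; the function name merge_file_arguments is hypothetical, and the literal word "ignore" is used as the placeholder only because that is what the table shows.

```python
def merge_file_arguments(input_file, output_file, template_file,
                         input_is_chk, template_is_chk, placeholder="ignore"):
    """Return the files arguments for a merge command, in table order."""
    if not input_is_chk and not template_is_chk:
        # Row n/n: input output ignore template
        return [input_file, output_file, placeholder, template_file]
    if input_is_chk and not template_is_chk:
        # Row y/n: ignore output input template
        return [placeholder, output_file, input_file, template_file]
    # Rows n/y and y/y: input output template
    return [input_file, output_file, template_file]
```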
-t requires a file type argument like -i and -o, and it accepts the same values.
The -s option, which can be specified multiple times, selects the source of each item of information in the new input file. It accepts the following keywords, each of which is followed by either 1 or 2 (shown as n below) to indicate that the item should be taken from the input file or the template file, respectively:
XYZn | Geometry (coordinates, nuclear charges, etc.). |
MMTn | MM types. |
MMCn | MM charges. If copied, they are also copied to link atoms if ONIOM data is present. |
PDBn | PDB information (including secondary structure). |
Conn | Connectivity. |
Onin | ONIOM layer and link atom data. |
Micn | MicOpt (freezing/optimizing atoms and rigid block info). |
The default for all items is the input file (i.e., file 1).
Here is an example. We have a Gaussian input file, allsetup.gjf, containing the MM atom types, MM charges, PDB information, connectivity, ONIOM layers, and so on. We can apply all of these settings to the structure in the PDB file new.pdb with the following command:
$ newzmat -icart -tpdb -sXYZ2 -sCon2 allsetup.gjf newinput.gjf ignore new.pdb
This command takes only the geometry and connectivity from file 2 (the template file, new.pdb) and everything else (MM atom types, MM charges, PDB information, ONIOM data, and optimize/freeze atom data) from file 1, the input file allsetup.gjf. The output file, newinput.gjf, contains the result of this merge.
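The way the -s keywords select a source file for each item can be sketched as follows. The sketch starts with every item assigned to file 1 (the documented default) and overrides only the items named on the command line. The names MERGE_ITEMS and merge_sources are hypothetical, and matching keywords without regard to case is an assumption.

```python
# Item keywords from the table above, without the trailing 1 or 2.
MERGE_ITEMS = ("XYZ", "MMT", "MMC", "PDB", "Con", "Oni", "Mic")


def merge_sources(selectors):
    """Map each merge item to its source file: 1 (input) or 2 (template).

    selectors holds the strings that follow -s, such as "XYZ2" or "Con2".
    """
    sources = {item: 1 for item in MERGE_ITEMS}  # default: input file
    by_lower = {item.lower(): item for item in MERGE_ITEMS}
    for selector in selectors:
        name, digit = selector[:-1], selector[-1:]
        if name.lower() not in by_lower or digit not in ("1", "2"):
            raise ValueError(f"unrecognized -s keyword: {selector!r}")
        sources[by_lower[name.lower()]] = int(digit)
    return sources
```

For the example command, merge_sources(["XYZ2", "Con2"]) assigns the geometry and connectivity to file 2 and leaves the other five items on file 1.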
The other options to newzmat are concerned with generating connectivity information, with the use of standard geometrical parameters, and with the determination and use of molecular symmetry. A complete connectivity table can be used to generate Z-matrix specifications suitable for inclusion of symmetry constraints. Such a table is also required for output of the data files for the molecular mechanics programs. If one of the input formats which includes full connectivity is used (e.g., MacroModel data files), the connectivity that it provides is used.
However, when Z-matrix or MOPAC format input is provided, only the connectivity information which is implied by the internal coordinate specification is available. Thus if a new Z-matrix which incorporates the molecular symmetry is to be generated, the remaining connectivity information must be generated. When Cartesian coordinates are read in, naturally, no connectivity information is provided, so the default is to generate the table using the internally stored atomic radii. In addition, when used to generate input structures, the mechanics programs may not generate suitable bond distances and often produce coordinates which are close to but not exactly symmetric. Options control how each of these cases is handled.
-allbonded
In generating new connectivity information, assume all atoms are bonded.
-bmodel
Use standard model B bond lengths along with internal values in determining bond distances.
-density N
Generate natural orbitals for density number N. This option is only useful if you are generating a CaChe file. N should be set to 0 for HF, to 2 for MP2, to 6 for CI, and to 7 for QCISD or CCD.
-fudge
Fudge bond distances to make sure they are reasonable, using internal values. This is the default for model input and is not applicable elsewhere.
-gencon
Generate connectivity information using internal radii.
-getfile
Insist on filename specifications for all arguments, making standard input and output unacceptable.
-lsymm
Use loose cutoffs for determining symmetry. This option implies -symav.
-mdensity M
Subtract generalized density M from that specified with -density to make a difference density, which is then converted to natural orbitals.
-nofudge
Do not fudge bond distances. This is the default and only choice for all cases except model input.
-nogetfile
Cancels -getfile.
-noround
Turns off rounding of Z-matrix parameters.
-nosymav
Turns off averaging of input coordinates.
-nosymm
Turns off all use of symmetry.
-order
Keeps the order of atoms as close as possible to the input order.
-round
Rounds Z-matrix parameters to 0.01 Å and 1 degree.
-symav
Average input coordinates using approximate symmetry operations to achieve exact symmetry.
-symm
Assign molecular symmetry.
-tsymm
Use tight cutoffs for determining symmetry. This option is the default.
-rebuildzmat
Build a new Z-matrix rather than using the read-in one (as would be the default for Z-matrix or MOPAC input). This option implies -gencon, and the option may be abbreviated as -redoz.
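Generating connectivity from atomic radii, as -gencon does, typically means treating two atoms as bonded when their separation is below some threshold derived from their radii. The Python sketch below shows one common form of that test. It is not newzmat's algorithm: the radii are approximate published covalent radii chosen for illustration (newzmat's internally stored values are not documented here), and the scale factor of 1.3 and the name guess_bonds are assumptions.

```python
import math
from itertools import combinations

# Approximate covalent radii in Angstroms (illustrative values only).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def guess_bonds(atoms, scale=1.3):
    """Return index pairs (i, j) of atoms judged to be bonded.

    atoms is a list of (element_symbol, (x, y, z)) with coordinates
    in Angstroms. Two atoms are bonded when their distance does not
    exceed scale times the sum of their radii.
    """
    bonds = []
    for (i, (sym_i, pos_i)), (j, (sym_j, pos_j)) in combinations(enumerate(atoms), 2):
        cutoff = scale * (COVALENT_RADII[sym_i] + COVALENT_RADII[sym_j])
        if math.dist(pos_i, pos_j) <= cutoff:
            bonds.append((i, j))
    return bonds


# Water: the two O-H pairs are bonded, the H-H pair is not.
water = [
    ("O", (0.000, 0.000, 0.117)),
    ("H", (0.000, 0.757, -0.469)),
    ("H", (0.000, -0.757, -0.469)),
]
water_bonds = guess_bonds(water)
```

Under the same picture, -allbonded would correspond to accepting every pair regardless of distance.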
The symmetry averaging process, which guesses the intended symmetry given coordinates which are only approximately symmetric, does not always achieve the intended symmetry. It will take coordinates printed in a Gaussian output file to 6 digits and restore symmetry, and it will usually work given coordinates from molecular mechanics provided that the mechanics optimization was converged reasonably far. In generating coordinates with MacroModel, for example, it is sometimes necessary to do a final full Newton-Raphson step after the normal minimization.
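The idea behind symmetry averaging can be illustrated for the simplest case, a single mirror plane. Each atom is averaged with the mirror image of its partner, so the result is exactly symmetric. This Python sketch is an assumption-laden illustration, not the procedure newzmat uses: the plane is fixed in advance rather than guessed, only one operation is handled, and the name symmetrize_mirror and the tolerance tol are hypothetical.

```python
import math


def symmetrize_mirror(atoms, axis=0, tol=0.1):
    """Average coordinates so they are exactly symmetric under a mirror.

    atoms is a list of (element_symbol, (x, y, z)). The mirror plane
    passes through the origin, perpendicular to the given axis
    (0 = x, 1 = y, 2 = z). tol is the largest acceptable mismatch
    between an atom's mirror image and its partner.
    """
    def reflect(point):
        image = list(point)
        image[axis] = -image[axis]
        return tuple(image)

    averaged = []
    for symbol, position in atoms:
        image = reflect(position)
        # The partner is the same-element atom closest to the image.
        candidates = [p for s, p in atoms if s == symbol]
        partner = min(candidates, key=lambda p: math.dist(p, image))
        if math.dist(partner, image) > tol:
            raise ValueError("structure is not close enough to symmetric")
        # Average the atom with its partner mapped back through the mirror.
        mapped = reflect(partner)
        averaged.append(
            (symbol, tuple((a + b) / 2 for a, b in zip(position, mapped)))
        )
    return averaged


# A slightly distorted water molecule with an approximate mirror plane at x = 0.
distorted = [
    ("O", (0.001, 0.0, 0.100)),
    ("H", (0.760, 0.0, -0.470)),
    ("H", (-0.750, 0.0, -0.480)),
]
symmetric = symmetrize_mirror(distorted)
```

The ValueError branch mirrors the caveat above: if the input is too far from symmetric, no partner is found within the tolerance and the intended symmetry is not recovered.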
newzmat computes the nuclear repulsion energy of the initial read-in structure and of the final structure as a consistency check. If these disagree, a warning is printed. Substantial disagreement indicates a failure of the program.
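The nuclear repulsion energy is the sum of Z_i Z_j / r_ij over all pairs of nuclei, and it depends only on the nuclear charges and interatomic distances. It is therefore unchanged by reordering, translating or rotating the molecule, which is what makes it a convenient consistency check. A minimal Python sketch follows. The function names and the comparison tolerance are assumptions; the threshold at which newzmat prints its warning is not documented here.

```python
import math
from itertools import combinations

# CODATA 2018 Bohr radius in Angstroms.
ANGSTROM_PER_BOHR = 0.529177210903


def nuclear_repulsion(atoms):
    """Nuclear repulsion energy in Hartrees.

    atoms is a list of (nuclear_charge, (x, y, z)) with coordinates
    in Angstroms.
    """
    energy = 0.0
    for (z_i, pos_i), (z_j, pos_j) in combinations(atoms, 2):
        r_bohr = math.dist(pos_i, pos_j) / ANGSTROM_PER_BOHR
        energy += z_i * z_j / r_bohr
    return energy


def structures_consistent(initial, final, tol=1e-6):
    """True when the two structures have matching repulsion energies."""
    return abs(nuclear_repulsion(initial) - nuclear_repulsion(final)) <= tol
```

Two protons separated by exactly one Bohr give an energy of one Hartree, which serves as a quick sanity check of the units.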
Last update: 5 June 2013