Commercial Vendors &
Databases
Gary Wiggins
I571
Fall 2006
Factors in the Current Environment
• Interdisciplinary science
• Consolidation of the Scientific-Technical-Medical (STM) publishing world
• Different cultures in the chemistry
publishing environment compared to that
in biology
• Move to open access journals and data
• Influence of the Web
Size of the Chemical Literature:
2002 Estimate
• ~50 million chemical substances
• ~6 million reagents
• ~7 million published reactions
• ~16,000 protein crystal structures
• ~250,000 small molecule x-ray structures
--Robert Glen and Susan Aldridge (2002)
Size of the Chemical Literature:
2006
• ~88 million chemical substances
• ~? million reagents
• ~? million published reactions
• ~39,000 protein crystal structures
• ~367,000 small molecule x-ray structures
Vendors and Publishers
• Partnership between commercial vendors
and abstracting/indexing services (and to
some extent with journal publishers)
– Most activity in online searching started in the
early 1970s
– Comparatively little change in the vendors’
search systems until relatively recently
• Aggregation of databases
• Cross-file searching
• Command-driven access
Vendors of Chemical Databases
• STN International (http://info.cas.org/stn.html)
– SciFinder and SciFinder Scholar (http://www.cas.org/)
• Thomson Scientific (http://www.isinet.com) (ISI)
• Questel (http://www.questel.orbit.com/index.htm) (Orbit)
– Merged Markush Service
• Thomson Dialog (http://www.dialog.com/)
• Elsevier Scopus (http://www.info.scopus.com/)
• Elsevier MDL (http://www.mdl.com/)
• US National Library of Medicine (http://www.nlm.nih.gov/)
• Ovid Technologies (http://www.ovid.com/)
• CSA (Cambridge Scientific Abstracts) (http://www.csa.com/)
• Chemical Information System (http://www.nisc.com/cis/qcis1.asp)
• Knovel (http://www.knovel.com/)
• Technical Database Services (http://www.tdsonline.com/)
STN International
• Partnership among Chemical Abstracts
Service, FIZ Chemie, and the Japan
Science and Technology Corporation
• Has over 200 STM databases
– STN Database Summary Sheets:
http://info.cas.org/ONLINE/DBSS/dbsslist.html
– Includes some databases also available free
through other venues (e.g., Medline,
GenBank)
Features in Commercial Systems
• Concept of the Basic Index
– Default field; in bibliographic databases often limited to keywords
from titles, abstracts, and index terms
• Special Boolean operators (proximity, adjacency, etc.)
• Truncation (wild cards and left-hand or right-hand truncation)
• Controlled vocabulary tools (MeSH, CAS’s Index Guide, CA
Lexicon)
• Classification of the documents
– PACS (Physics and Astronomy Classification Scheme)
– CA Sections/Subsections
• Structure searching (usually ranges from exact to full substructure
search)
• Numeric and other data that is searchable
• Data analysis tools
• Current awareness options
Vocabulary Control
• “The entities we deal with, such as genes,
sequences, and chemical data, and manipulate
and analyze in the context of bioinformatics and
biomedical research have not always been
properly defined. There are no control
vocabularies, no standards for much of the data,
and no unified way to refer to them.”
--Pablo Tamayo, senior computational biologist and manager, cancer
genomics informatics, MIT Broad Institute, quoted in Drug Discovery &
Development August 2004, 7(8), 52.
Command Language Systems
• Allow field-directed searches
• Incorporate sophisticated Boolean
relationships
– AND, OR, NOT
– Adjacency, Proximity, Logical linking to the
same field or sub-field of a record
– Numbers of intervening words can be
specified
• User must learn the commands
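The adjacency and proximity operators listed above can be illustrated with a toy matcher. This is a minimal sketch, not any vendor's implementation: the function names `tokenize` and `within`, and the parameter `max_between`, are assumptions made up for this example, and the sample text is the title of the 1901 Noyes and Warfel article quoted later in these slides.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def within(text, first, second, max_between=0, ordered=True):
    """Return True if the two terms occur with at most `max_between`
    intervening words. With ordered=True, `first` must precede `second`
    (adjacency when max_between is 0)."""
    words = tokenize(text)
    pos_first = [i for i, w in enumerate(words) if w == first.lower()]
    pos_second = [i for i, w in enumerate(words) if w == second.lower()]
    for a in pos_first:
        for b in pos_second:
            gap = (b - a) if ordered else abs(b - a)
            if 1 <= gap <= max_between + 1:
                return True
    return False

title = "The boiling-point curve for mixtures of ethyl alcohol and water."

# Adjacent, in the order given: "ethyl" directly followed by "alcohol".
print(within(title, "ethyl", "alcohol"))
# "alcohol" and "water" are separated by one word ("and").
print(within(title, "alcohol", "water", max_between=1))
# Same pair in either order.
print(within(title, "water", "alcohol", max_between=1, ordered=False))
```

A command-language system evaluates this kind of test against an index rather than by scanning text, but the logic the searcher specifies is the same: two terms, an order constraint, and a maximum number of intervening words.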
User-Oriented Software
• Front-end systems to mask command
language
– STN’s SciFinder (and SciFinder Scholar)
– STN on the Web, STNEasy, STN Express
– Elsevier MDL’s CrossFire Commander and
DiscoveryGate
– Questel’s QWeb and Imagination
Main Chemical Databases
• Chemical Abstracts
• Beilstein/Gmelin
• Cambridge Structural Database
• Protein Data Bank
• Many other relevant databases
CAS DBs: CA File
• CA File, a bibliographic database covering
journal articles (from ~9500 journals), technical
reports, conference proceedings, dissertations,
patents and other literature
• 1907 to the present (and some earlier); full indexing was
added retrospectively for all records
• Linked through the Registry Number to compound data
• CAplus File, includes CA File data plus e-journals, some preprints, and all articles from
~1500 key chemical journals within one week of
receipt
Old References Recently Added to
CA Database
The boiling-point curve for mixtures of ethyl alcohol and water.
Noyes, William A.; Warfel, R. R. Rose Polytechnic Institute, Terre
Haute, Journal of the American Chemical Society (1901), 23(7),
463-8. CODEN: JACSAT ISSN: 0002-7863. Journal written in
English. CAN 0:1311 AN 1906:1311 CAPLUS (Copyright
2004 ACS on SciFinder (R))
Abstract
In the determination with small amounts of alcohol, the readings of the
thermometer were taken when the vapors first entered the
condenser, as after boiling for a few minutes a relatively large
proportion of the alcohol present would be found in the upper layers
and in the condenser. The thermometer under these conditions
registered about 0.3 higher. An examination of the table and curve
revealed that the minimum boiling point is for alcohol of 96% by
weight. The curve was steeper on the side toward absolute alcohol.
Alcohol of 90.7% had the same boiling point as absolute alcohol.
Relative Contributions of Literature
Types to CA
Used with the permission of Chemical Abstracts Service (CAS),
a division of the American Chemical Society, from:
http://www.cas.org/casdb.html
Growth of Articles in CA
Year | Articles Abstracted
1907 | 7,994
1945 | 22,824
1960 | 104,484
1970 | 230,902
1980 | 407,342
1990 | 394,945
2000 | 573,469
2005 | 737,480
Source: http://www.cas.org/EO/casstats.pdf
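The growth figures above can be turned into average annual growth rates with a few lines of arithmetic. The sketch below uses only the counts from the table; the function name `annual_growth` is an assumption for this example, and the compound-growth formula is a standard simplification that ignores year-to-year fluctuation.

```python
# Articles abstracted in CA per year, copied from the table above.
articles = {
    1907: 7_994,
    1945: 22_824,
    1960: 104_484,
    1970: 230_902,
    1980: 407_342,
    1990: 394_945,
    2000: 573_469,
    2005: 737_480,
}

def annual_growth(start_year, end_year):
    """Average compound annual growth rate between two tabulated years."""
    ratio = articles[end_year] / articles[start_year]
    return ratio ** (1 / (end_year - start_year)) - 1

years = sorted(articles)
for y0, y1 in zip(years, years[1:]):
    print(f"{y0}-{y1}: {annual_growth(y0, y1):+.1%} per year")

print(f"1907-2005 overall: {annual_growth(1907, 2005):+.1%} per year")
```

Note that the 1980 to 1990 interval comes out negative, since the 1990 count is lower than the 1980 count in the table.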
Basic Index from the CA File
Field Name: Basic Index
Contents: single words from title (TI), supplementary term (ST), index term (IT),
and abstract field (AB), as well as CAS Registry Numbers.
Examples:
– S 50-21-5
– S ?FLUOROCARBON?
– S (WATER(S)OIL)/BI
– S transgenic cotton
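The truncated search term ?FLUOROCARBON? in the examples above combines left-hand and right-hand truncation. As a rough illustration, the sketch below translates such a term into a regular expression. It assumes that `?` stands for any number of characters (including none) and handles no other truncation symbols; the function names are hypothetical and the sample sentence is invented.

```python
import re

def truncation_to_regex(term):
    """Convert a term using '?' as an open-ended truncation symbol
    (an assumption for this sketch) into a compiled regular expression."""
    parts = [re.escape(part) for part in term.split("?")]
    return re.compile("^" + "[A-Za-z0-9]*".join(parts) + "$", re.IGNORECASE)

def matching_words(term, text):
    """Return the words in `text` that match the truncated term."""
    pattern = truncation_to_regex(term)
    words = re.findall(r"[A-Za-z0-9]+", text)
    return [w for w in words if pattern.match(w)]

sample = "Perfluorocarbons and chlorofluorocarbon emulsions in water"
print(matching_words("?FLUOROCARBON?", sample))
print(matching_words("WATER", sample))
```

Left-hand truncation is what retrieves "chlorofluorocarbon" here; a term with only right-hand truncation (FLUOROCARBON?) would miss it.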
Special Fields in the CA File
• In addition to the standard bibliographic
citation data, have:
– Controlled Terms (CT) or (IT)
– Classification Codes (CC: the 80 section codes
into which the content of the paper CA is divided:
http://www.cas.org/PRINTED/sects.html)
– Document type (DT)
– Language Code (LA)
– Role (RL)
CAS Roles
• Used in conjunction with chemical substance searches
• Nine super roles: ANST, BIOL, CMBI, FORM,
OCCU, PREP, PROC, RACT, USES
• Over 60 more specific role descriptors, e.g., with PREP:
– BMF Bioindustrial manufacture
– BPN Biosynthetic preparation
– BYP Byproduct
– Combinatorial preparation
– IMF Industrial manufacture
– PNU Preparation, unclassified
– PUR Purification or recovery
– SPN Synthetic preparation
• Also two roles not up-posted to super roles: PRP
(Properties) and MSC (Miscellaneous)
CAS DBs: Registry File
• “Authority” file that lets indexers and
searchers definitively identify a substance
as new or find a previous entry
• Contains all types of chemical substances,
including biomolecules
• Best file for chemical names
• Many physical properties being added
• Linked to CA and other files through the
Registry Number (RN)
CAS Registry Number
• Serves as the accession number in the
Registry File
• RN has no meaning
– Example: Isatin is 91-56-5
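Although a Registry Number carries no chemical meaning, its final digit is a check digit computed from the preceding digits, which makes transcription errors easy to catch. The sketch below validates the format and the check digit; the function name `is_valid_cas_rn` is an assumption for this example, and a valid check digit does not prove that the number has actually been assigned to a substance.

```python
import re

def is_valid_cas_rn(rn):
    """Check the format (2-7 digits, 2 digits, 1 check digit) and the
    check digit of a CAS Registry Number such as '91-56-5'."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", rn)
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    # Weight the digits 1, 2, 3, ... starting from the rightmost body digit.
    total = sum(int(d) * weight
                for weight, d in enumerate(reversed(body), start=1))
    return total % 10 == int(match.group(3))

# Registry Numbers that appear elsewhere in these slides.
for rn in ["91-56-5", "50-21-5", "6713-27-5", "6402-36-4", "906063-52-3"]:
    print(rn, is_valid_cas_rn(rn))

# A single mistyped digit is detected.
print("91-56-4", is_valid_cas_rn("91-56-4"))
```

For isatin, the weighted sum is 6×1 + 5×2 + 1×3 + 9×4 = 55, and 55 mod 10 = 5, which matches the final digit of 91-56-5.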
Registry File Contents
• Includes synonyms, molecular formulas,
alloy composition tables, classes for
polymers, nucleic acid and protein
sequences, ring analysis data, and
structure diagrams
• Also: experimental and calculated property
data from various sources as well as super
roles and document type information from
CAplus
Registry File Contents
• 87,711,955 substances have a RN in the
Registry File as of 9/8/2006
• All substances in CAS files plus others
• Many physical constants now added to the
records, most of them calculated
– Lipinski Rule of Five values
– BP, MP, Density, Optical Rotatory Power,
Refractive Index
– Data for 3D visualization
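The Lipinski Rule of Five values mentioned above are usually applied as four simple thresholds: molecular weight no greater than 500, log P no greater than 5, no more than 5 hydrogen-bond donors, and no more than 10 hydrogen-bond acceptors. The sketch below checks a set of property values against those thresholds. The class and function names are hypothetical, the input values are illustrative rather than taken from the Registry File, and nothing here reflects how CAS computes or stores these properties.

```python
from dataclasses import dataclass

@dataclass
class Properties:
    mol_weight: float   # molecular weight
    log_p: float        # octanol-water partition coefficient (log P)
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors

def rule_of_five_violations(p):
    """Return a list describing each Rule of Five threshold exceeded."""
    violations = []
    if p.mol_weight > 500:
        violations.append("molecular weight > 500")
    if p.log_p > 5:
        violations.append("log P > 5")
    if p.h_donors > 5:
        violations.append("more than 5 H-bond donors")
    if p.h_acceptors > 10:
        violations.append("more than 10 H-bond acceptors")
    return violations

# Illustrative (hypothetical) property values for two compounds.
small = Properties(mol_weight=147.13, log_p=0.8, h_donors=1, h_acceptors=3)
large = Properties(mol_weight=650.0, log_p=6.2, h_donors=2, h_acceptors=4)

print(rule_of_five_violations(small))
print(rule_of_five_violations(large))
```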
Size of the Registry File
Date: Friday, 9/8/2006
Count: 29,884,228 organic and inorganic substances; 57,827,727 sequences
Most recent CAS Registry Number: 906063-52-3
Source: http://www.cas.org/cgibin/regreport.pl
CAS DBs: CASReact
• Derived from journal and patent
documents from 1840 to date
• Contains both single-step and multistep
reactions
• Structure searchable
• Contains yield data, reaction conditions,
etc.
CAS Databases: Other
• CHEMCATS--information about commercially
available chemicals and their worldwide
suppliers
• CHEMLIST--contains chemical substances on
national inventories
• MARPAT--more than 500,000 Markush structure
records for patents found in the CA File with
patent publication year 1988 to the present
• TOXCENTER--covers the pharmacological,
biochemical, physiological, and toxicological
effects of drugs and other chemicals
SciFinder and SciFinder Scholar
• Includes access to the CA, Registry,
CHEMCATS, CHEMLIST files, plus
Medline (1957-)
• Easy structure searching capabilities
• Integrated with ChemPort for easy access
to the primary literature
• Download page for SFS:
– http://www.libraries.iub.edu/index.php?pageId=2114
SciFinder Scholar Under the Hood
• A. Ben Wagner’s look at what really
underlies the apparent simplicity of SFS
searches
• http://ublib.buffalo.edu/libraries/eresources/SciFinder/SciFinder200dpi.pdf
PubChem: A Threat to CAS?
• PubChem, part of the NIH Roadmap plan
under the Molecular Libraries and Imaging
Initiative
• Several million compounds already in the
database
• To be linked to assay data from High
Throughput Screening analyses
• http://pubchem.ncbi.nlm.nih.gov/
InChI: Another Threat?
• IUPAC-NIST Chemical Identifier
• a unique label which would be a non-proprietary
identifier for chemical substances that could be
used in printed and electronic data sources thus
enabling easier linking of diverse data
compilations
• latest version handles:
– organic, covalent structures
– inorganic and organometallic compounds
• http://chemdata.nist.gov/IChI/INChIv11b.zip
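An InChI is a plain text string built from slash-separated layers, which is what makes it usable as a label in both printed and electronic sources. The sketch below splits an identifier into its layers. The example string is the standard InChI for ethanol (a later format than the beta version linked above); the function name `split_inchi` is an assumption, and the parser is deliberately simplified: it keeps only the last occurrence of any layer prefix that repeats.

```python
def split_inchi(inchi):
    """Split an InChI string into version, formula, and prefixed layers."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    version, *layers = inchi[len(prefix):].split("/")
    result = {"version": version, "formula": layers[0] if layers else ""}
    # Remaining layers start with a one-letter prefix, e.g. 'c' for
    # connections and 'h' for hydrogens.
    for layer in layers[1:]:
        result[layer[0]] = layer[1:]
    return result

ethanol = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
for name, value in split_inchi(ethanol).items():
    print(f"{name}: {value}")
```

Generating an InChI from a structure requires the InChI software itself; this sketch only reads a string that already exists.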
Beilstein Database
• Covers organic chemistry back to 1771
• Includes many physical properties
• Includes reaction information
• Structure searchable
• Available on the CrossFire Commander
system (and direct from MDL via
DiscoveryGate) for academic institutions
Gmelin Database
• Covers inorganic and organometallic
chemistry back to 1771
• Includes many physical and chemical
properties
• Not searchable for reactions
• Accessible through the CrossFire
Commander system (and direct from MDL
via DiscoveryGate) for academic
institutions
MDL’s CrossFire Commander
• Download page for Commander at IU:
– http://www.libraries.iub.edu/index.php?pageId=2114
DiscoveryGate for Academics
• CrossFire Beilstein
• CrossFire Gmelin
• MDL® Available Chemicals Directory
• MDL® Screening Compounds Directory
• MDL® Reference Library of Synthetic Methodology
• MDL® Solid-Phase Organic Reactions
• ORGSYN (Organic Syntheses) Database
• Encyclopedia of Reagents for Organic Synthesis
• Comprehensive Organic Functional Group Transformations
• Comprehensive Asymmetric Catalysis
• MDL® Comprehensive Medicinal Chemistry
• MDL® Drug Data Report
• MDL® Metabolite Database
• MDL® Toxicity Database
• ChemInform Reaction Library
• Current Synthetic Methodology
• Derwent Journal of Synthetic Methods
• National Cancer Institute Database
• http://www.mdl.com/solutions/solutions_for/academics/dg_academics.jsp
Reaction Databases
• CASReact
• SPRESI
– http://www.spresi.com/
• Organic Syntheses
– Free version:
http://chemfinder.cambridgesoft.com/reactions/orgsyn.asp
• ISI’s Index Chemicus
• e-EROS (Encyclopedia of Reagents for Organic
Synthesis)
• MDL’s Integrated Major Reference Works
– Reactions indexed with InfoChem’s CLASSIFY Reaction
Classification Code, based on the degree of specificity around
the reacting center:
– http://www.infochem.de/content/downloads/classify.pdf
Cross-Product Approaches
• MDL/InfoChem’s Integrated Major Reference
Works
– Thieme’s Science of Synthesis (successor to
Houben–Weyl)
– Springer’s Comprehensive Asymmetric Synthesis and
their Glycoscience
– Elsevier Science’s Comprehensive Organic
Functional Group Transformations
– Wiley’s Encyclopedia of Reagents for Organic
Synthesis
– Links to primary journal literature.
Physical Property Databases
• Beilstein & Gmelin
• CRC Handbook (CHEMnetBASE)
• Ei ChemVillage
• Knovel
– Perry’s Chemical Engineers’ Handbook
– Lange’s Handbook of Chemistry
• Landolt-Börnstein
• CAS Registry File
Spectral Databases
• Bio-Rad
• Aldrich
• NIST Chemistry WebBook
• Some high-quality free databases on the
Web, e.g.,
• SDBS, Spectral Database for Organic
Compounds
– http://www.aist.go.jp/RIODB/SDBS/menu-e.html
SDBS IR Spectrum for
Traumatic Acid
CCDC
Isatin on the CSD
Cambridge Structural Database
• Bibliographic, chemical and crystallographic
information for:
– organic molecules
– metal-organic compounds
• 3D structures have been determined using:
– X-ray diffraction
– neutron diffraction
• The CSD records results of:
– 3D atomic coordinate data for at least all non-H atoms
CSD components
• ConQuest: search and information
retrieval
• Mercury: structure visualization
• Vista: numerical analysis
• PreQuest: database creation
Accessing the CSD at IUB
• Download the Citrix Metaframe client at:
– http://www.citrix.com/site/SS/downloads/downloads.asp?dID=2755
• Connect to IUB via VPN and link to:
– http://www.libraries.iub.edu/scripts/countResources.php?resourceId=1399945
– For IUPUI, ask Kelsey Forsythe
Other Structural Databases
• Protein Data Bank for polypeptides and
polysaccharides having more than 24 units
http://www.rcsb.org/pdb/
• Nucleic Acids Database for oligonucleotides
http://ndbserver.rutgers.edu/
• Inorganic Crystal Structure Database
http://www.fizinformationsdienste.de/en/DB/icsd/
• CRYSTMET® for metals and alloys
http://www.tothcanada.com/
Materials Chemistry Databases
• TDS specializes in chemical engineering
data. Includes:
– American Institute of Chemical Engineers’
DIPPR Pure Component Data
• 29 fixed-value properties and 13 temperature-dependent properties for about 1600 industrial
chemicals
Patent Databases
• Derwent World Patents Index
• USPATFULL
• PCTFULL (WIPO/PCT Patents Full Text)
• INPADOC (INternational PAtent
DOcumentation Center)
• IFIPAT
• CA and CAplus
• MDL Patent Chemistry Database
Chemical Information System
• 34 environmental databases
– Originally developed by the US National Institutes of
Health and the Environmental Protection Agency
• Covers over 515,000 compounds
– Toxicological and/or carcinogenic research data
– Information on handling hazardous materials
– Chemical/physical property information
– Regulations
– Safety and health effects information
– Pharmaceutical data
Hybrid Links to the Web
• STN’s eScience
– http://www.escience.org/
• Elsevier Science’s Scirus
– http://www.scirus.com/srsapp/
• Elsevier Science’s Scopus (includes Scirus)
– http://www.info.scopus.com
– 15,000 titles going back to the mid 1960s
– More than 500,000 records link to the Beilstein
database on either CrossFire or DiscoveryGate
– 250 million web sources
Traumatic Acid: SFS → eScience
Electronic Journals
• Coverage in some cases back to the 17th
century
• Most major publishers’ backfiles are now
online
• CrossRef
• DOI
• SFX
Shift from Ownership to Licensing
of Journals
• IUB Chemistry Library e-journals
– http://www.indiana.edu/~libchem/ejournals.html
• Shift away from ownership
• Archival issues
– Publisher archives (usually 2-3 locations)
– LOCKSS and other proposals
– Libraries often have no archival rights
Archival Issues
• “Given their transitory nature, are
commercial and even society publishers
the parties to which we want to entrust the
task of keeping and preserving human
knowledge?”
William W. Armstrong, Chemistry Librarian, Louisiana State
University, C&EN 10 October 2005, 83(41), 53
Single Publisher Databases
• Elsevier’s ScienceDirect and their
encyclopedia DBs
– Scirus: http://www.scirus.com/srsapp/
• Wiley’s journal, book, and encyclopedia
DBs: http://www3.interscience.wiley.com/
• American Chemical Society journals
– http://pubs.acs.org/
CrossRef
• CrossRef Search
http://www.crossref.org/crossrefsearch.html
• Pilot initiative running in 2004 in collaboration
with Google
• Includes the content of 45 publishers (out of the
1488 CrossRef publishers and societies)
• Now covers approximately 6.5 million research
articles
• Allows XML searching
Getting at the Data
• New CAS Information Use Policies
– http://www.cas.org/infopolicy.html
• STN’s Information Keep & Share Program
– http://info.cas.org/copyright/index.html
• SciFinder Scholar download restrictions:
100 items at a time
Data Analysis Tools
• STN’s Analyze and Tabulate feature
• STN Express with Discover! (Analysis Edition)
• STN AnaVist
– http://www.cas.org/stnanavist/prices.html
– Flat fee cost based on size of the answer set
analyzed (ranges from $230 for up to 1,000 to $850
for up to 20,000)
• Limited access to records because of A&I
publishers’ fear of data piracy
Open Access
• Institute of Physics: most papers free for 30 days
after publication
– http://www.iop.org/EJ/ and
http://www.iop.org/EJ/journal/NJP
• Public Library of Science
– http://www.publiclibraryofscience.org
• Highwire Press
– http://www.highwire.org/
• PubMed Central
– http://www.pubmedcentral.nih.gov/
Budapest Open Access Initiative
• Based on:
– Self archiving by authors
– Open Access journals, e.g., BioMed Central
• http://www.soros.org/openaccess/
Open Access + Semantic Web
• "Almost all of an author's output (compounds,
spectra, reactions, properties, etc.) is nowadays
computerised and in principle redistributable to
the community for re-use. Few journals actively
validate the primary data (e.g. spectra) involved
in a publication (chemical crystallography being
a clear exception where data are intensively
reviewed by machine). We reassert that
chemists must now move towards publishing
their collective knowledge in a systematic and
easily accessible form for re-use and
innovation....
Open Access + Semantic Web
• We urge that authors, funders, editors,
publishers and readers move further towards the
following protocol:
[1] All information should be ultimately machine-understandable in XML....
[2] Machine-understandable information for a compound
should include a connection table, the IUPAC unique
identifier (InChI) which guarantees that the
connection table can be checked and regenerated,
and a name....
[3] Rights metadata.”
-- Murray-Rust, Rzepa, Tyrrell, Zhang (2004)
Opposition to Open Access
• Reacting to NIH’s proposed policy on open
access, C&EN Editor Rudy Baum says:
“[This] action will inflict long-term damage on
the communication of scientific results and
on maintenance of the archive of scientific
knowledge.”
-- C&EN, September 20, 2004, p. 7
Recent Legislative Action
• In the US, Senators Cornyn and Lieberman
introduced S. 2696 (109th Congress, 2nd
session).
– Federal Research Public Access Act of 2006
• In the UK, half of the Research Councils UK
and the Wellcome Trust have endorsed open
access.
– http://www.rcuk.ac.uk/access/index.asp
– http://www.wellcome.ac.uk/doc_WTD002766.html
Free Services
• ChemFinder
– http://chemfinder.cambridgesoft.com/
• ChemIDplus
– http://chem.sis.nlm.nih.gov/chemidplus/
• Frederick/Bethesda Data and Online Services
– http://cactus.nci.nih.gov/
• PubMed
– http://www.ncbi.nlm.nih.gov/entrez/query.fcgi
• DOE’s STI Information Bridge
– http://www.osti.gov/bridge/
Future
• XML and metadata
– Dymond (DYnamic Metadata ON Demand)
• Virtual journals (Virtual Journal of Nanoscale
Science and Technology)
• Copyright question and open access resolution
• Legal protection of databases
• Impact of InChI and CML
• Demise of Abstracting and Indexing Services?
Conclusion
• “The main challenge is for chemists to
recognise the value of making their data
machine-understandable, rather than
destroying it with traditional paper or slide-focused publication and dissemination
processes.”
-- Murray-Rust, Rzepa, Tyrrell, Zhang (2004)
What is Citation Indexing?
• Utilizes a known relevant document regardless
of when published to find newer journal articles
that have cited that document
• Assumption: Authors who are citing the
document must be writing on a related topic
– Citation indexing lets you find newer articles from an
older reference
– Found on other tools, e.g., SciFinder Scholar,
SCOPUS, but citation indexing doesn’t go as far back
as does SCI
• Gets around the problems of doing a subject
search when you aren’t sure of the words to use
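The idea behind a citation index can be sketched as an inverted lookup: start from each paper's reference list and record, for every cited document, which newer papers cite it. The example below is a minimal sketch with hypothetical paper labels; it is not how any commercial citation index is built, and the function name `build_citation_index` is an assumption.

```python
from collections import defaultdict

def build_citation_index(reference_lists):
    """Invert {citing paper: [cited papers]} into {cited paper: {citing papers}}."""
    cited_by = defaultdict(set)
    for citing, references in reference_lists.items():
        for cited in references:
            cited_by[cited].add(citing)
    return cited_by

# Hypothetical reference lists: each newer paper lists the older work it cites.
reference_lists = {
    "Paper C (2004)": ["Paper A (1995)", "Paper B (2003)"],
    "Paper D (2005)": ["Paper B (2003)"],
    "Paper E (2006)": ["Paper B (2003)", "Paper D (2005)"],
}

index = build_citation_index(reference_lists)

# Starting from one known relevant document, find the newer papers citing it.
print(sorted(index["Paper B (2003)"]))
```

No subject terms are involved at any point, which is why the approach works even when the searcher is unsure which words to use.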
Source Journal Coverage
• SCIE: 5700 titles
• SSCI: 1735 titles*
• A&HCI: 1145 titles*
*also includes selected articles from SCIE
• Weekly updates
– Lag time: 2-3 weeks
• Journal List: http://www.isinet.com/journals/
Web of Science Search Screen
Search Example: Cited
Reference Searching
• Use the Full General Search and Cited
Reference Search Option
• Find publications that have cited the works
of Donald E. Linn.
– Dots before his name indicate he is not the
first listed author on the publication.
– Links are to ISI source journals.
– Unlinked items may be incorrect forms of the
reference.
SCI Cited Ref Search for DE Linn
Lookup Results for DE Linn Search
DE Linn’s 2003 JACS Article
Newer Articles Citing the 2003
JACS Article by DE Linn et al.
Analysis of All Authors Citing DE
Linn
Searches
• Isatin (91-56-5)
• Moronic Acid (RN 6713-27-5)
• Traumatic Acid (RN 6402-36-4)
• Others:
http://www.chm.bris.ac.uk/sillymolecules/sillymols.htm
Beilstein Structure Search
R1=O or S
R2=H, OH, OMe, CH3, or CO2H
X = any halogen
? = any bond value
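A generic query like the one above stands for many specific structures at once. The sketch below counts the substituent combinations the query covers. It assumes that "any halogen" means F, Cl, Br, or I, and it does not enumerate the variable bond; the core structure itself was on the slide image and is not reproduced here.

```python
from itertools import product

# Substituent definitions copied from the query above.
R1 = ["O", "S"]
R2 = ["H", "OH", "OMe", "CH3", "CO2H"]
# Assumption for this sketch: "any halogen" is limited to the four common ones.
X = ["F", "Cl", "Br", "I"]

combinations = list(product(R1, R2, X))

print(len(combinations), "substituent combinations")
for r1, r2, x in combinations[:5]:
    print(f"R1={r1}, R2={r2}, X={x}")
```

A structure search system matches the generic query directly rather than enumerating it, but the count gives a sense of how much a single query with a few variable groups can cover.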
Bibliography
• Kaufman-Wills Group LLC. The Facts About Open Access. Association of
Learned and Professional Societies, 2005. ISBN 0-907341-29-2
• Culp, F. Bartow. "Ten or so things that every chemistry librarian absolutely,
positively has to have to keep from being an absolute plonk." Sci-Tech
News, February 2004, 58(1), 9. (Also published as: SLA Chemistry Division
E-Newsletter Winter 2004, 18(3), 19-20.)
http://www.sla.org/division/dche/Newsletters/Feb_2004.pdf
• Gasaway, Laura. “The open archives movement.” Information Outlook
October 2004, 8(10), 36, 39-40.
• Glen, Robert; Aldridge, Susan. “Developing tools and standards in
molecular informatics.” Chemical Communications 2002, (23), 2745-2747.
DOI: 10.1039/b207793k
http://xlink.rsc.org/?DOI=b207793k
Bibliography
• Huber, C.; Porter, K. “Cheap tricks.”
http://www.indiana.edu/~cheminfo/workshop/cheap.html
• McLeland, Le-Nhung. What every chemist should know
about patents.
http://www.chemistry.org/portal/resources/?id=1b41692a6cf811d6f8dd6ed9fe800100
• Murray-Rust, Peter; Rzepa, Henry S.; Tyrrell, Simon M.;
Zhang, Y. “Representation and use of chemistry in the
global electronic age.” Organic & Biomolecular
Chemistry 2004, 2, 3192-3203.
http://www.ch.ic.ac.uk/rzepa/obc/ (preprint)
Bibliography
• Wagner, A. Ben. "Finding physical properties of chemicals: A
practical guide for scientists, engineers, and librarians.” Science &
Technology Libraries 2001, 21(3/4), 27-45. (published Fall 2003)
Text for personal and professional use available at:
http://ublib.buffalo.edu/libraries/asl/staff/documents/wagner_phys_prop_stl_art.pdf
• Wiggins, Gary. “Overview of databases/data sources.” in Gasteiger,
Johannes, ed. Handbook of Chemoinformatics: From Data to
Knowledge in 4 Volumes. Wiley-VCH: 2003, v. 2, pp. 496-506.
http://www.indiana.edu/~cheminfo/C571/wiggins_chapter_2003.pdf
• Wiggins, Gary. “Teaching chemical literature, databases, and
chemical informatics.” CPT; Committee on Professional Training
[newsletter] Spring 2004, 4(1), 1-2.
http://acswebcontent.acs.org/PDF/cpt/nl_cpt_spring2004.pdf