Retrieving Precise Substances Using Element Counts with Generic Groups in CAS REGISTRY℠
Science IP®, The CAS Search Service.
CAS REGISTRY, the gold standard for chemical information, contains more than 70 million unique organic and inorganic chemical substances, such as alloys, coordination compounds, minerals, mixtures, polymers and salts.
When structure searching in REGISTRY and the other exemplified compound databases, the element counts option with generic groups provides a useful means to conduct broad searches with specific element count requirements.
You can apply element counts to generic groups using Query Definition/Element Count in STN Express® Structure Drawing. The software gives different element count options depending on the selected generic group (Table 1). The Element Count option applied to a generic group limits the retrieval to an exact number (which can be zero), a minimum, a maximum or a range of atoms of a given element. For example, you can draw a heterocycle with exactly one oxygen atom located at any position by using the generic heterocycle group (Hy) with an element count of exactly one oxygen.
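As a rough illustration of what an exact, minimum, maximum or range restriction means, the following Python sketch models an element count as a lower and upper bound and tests it against the ring atoms of a few heterocycles. This is not STN software: the names `ElementCount` and `ring_matches` are hypothetical, and the ring compositions are entered by hand (ring atoms only, hydrogens ignored).

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ElementCount:
    """Hypothetical model of one element count restriction on a generic group."""
    minimum: int = 0
    maximum: Optional[int] = None  # None means no upper bound

    @classmethod
    def exact(cls, n: int) -> "ElementCount":
        # An exact count is a range whose bounds coincide (n may be zero).
        return cls(minimum=n, maximum=n)

    def allows(self, count: int) -> bool:
        if count < self.minimum:
            return False
        return self.maximum is None or count <= self.maximum


def ring_matches(ring_atoms: Dict[str, int], limits: Dict[str, ElementCount]) -> bool:
    """True if every restricted element's count in the ring is allowed.

    Elements without a restriction are left unconstrained.
    """
    return all(
        limit.allows(ring_atoms.get(element, 0))
        for element, limit in limits.items()
    )


# Restriction from the text: a heterocycle with exactly one oxygen.
one_oxygen = {"O": ElementCount.exact(1)}

# Hand-entered ring compositions (illustrative only).
furan = {"C": 4, "O": 1}
oxazole = {"C": 3, "N": 1, "O": 1}
dioxane = {"C": 4, "O": 2}
pyridine = {"C": 5, "N": 1}

for name, ring in [("furan", furan), ("oxazole", oxazole),
                   ("1,4-dioxane", dioxane), ("pyridine", pyridine)]:
    print(name, ring_matches(ring, one_oxygen))
```

In this sketch, furan and oxazole satisfy the restriction, while 1,4-dioxane (two oxygens) and pyridine (no oxygen) do not. A minimum is expressed as `ElementCount(minimum=3)` and a range as `ElementCount(minimum=1, maximum=2)`.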
Table 1: Element Count Options for Generic Groups Using Query Definition/Element Count or Query Definition/Generic Definition
Generic Group | Definition | Element Count Options (Exact, Minimum, Maximum or Range) | Generic Definition Options Related to Element Count
Cy | Any ring system | Any element | Any (default); No. of heteroatoms: 2 or more OR Exactly 1; No. of carbons: 7 or more OR Less than 7
Hy | Heterocycle | Any element | Any (default); No. of heteroatoms: 2 or more OR Exactly 1; No. of carbons: 7 or more OR Less than 7
Cb | Carbocycle | Carbon only | Any (default); No. of carbons: 7 or more OR Less than 7
Ak | Alkyl | Carbon only | Any (default); No. of carbons: 7 or more OR Less than 7
The generic group options available under Query Definition/Generic Definition offer preset element counts to select from, but lack the specificity of the Element Count option. Either option (Element Count or Generic Definition) can suffice, depending on the search query.
In the following example, we show how using the Element Count settings together with Query Definition/Generic Definition (Table 2) provides a useful way to retrieve the precise structures of interest.
Table 2: Generic Definition Options
Generic Group Option | Note | Values
Saturation | | Any, Unsaturated, Saturated
Type of Chain | Applies only to Ak. | Any, Branched, Linear
Number of Hetero Atoms | Applies only to Hy and Cy. | Any, 2 or more, Exactly 1
Type of Ring System | Applies only to Hy, Cy and Cb. | Any, Monocyclic, Polycyclic
Number of Carbon Atoms | | Any, 7 or more, Less than 7
In our search example, we fix the Hy group in the structure query (example below) as always unsaturated and monocyclic, and vary only the element count settings. The queries in the example include those with:
• No element count restriction
• An element count restriction of minimum of 3 nitrogen atoms
• Element count requirement of nitrogen and oxygen each with a minimum of 1, and sulfur and phosphorus each with an element count of exactly zero.
Example: Search for structures with different element counts for the heterocyclic group (Hy) using the structure query below in CAS REGISTRY.
=> FILE REGISTRY
=> Uploading nucleoside hy unlimited.str
Generic attributes : 14:
Saturation : Unsaturated
Type of Ring System : Monocyclic
L1 STRUCTURE UPLOADED
=> SEARCH L1 SSS FULL
L3 7828 SEA SSS FUL L1
=> DISPLAY SCAN
L3 7828 ANSWERS REGISTRY COPYRIGHT 2013 ACS on STN
Absolute stereochemistry.
L3 7828 ANSWERS REGISTRY COPYRIGHT 2013 ACS on STN
Absolute stereochemistry.
Hy:
- unsaturated, monocyclic
- no element count restrictions
=> Uploading nucleoside hy nitrogen.str
Generic attributes : 14:
Saturation : Unsaturated
Type of Ring System : Monocyclic
Element Count : Node 14: Unlimited N,Min,3
L4 STRUCTURE UPLOADED
=> S L4 SSS FULL
L6 995 SEA SSS FUL L4
=> D SCAN
L6 995 ANSWERS REGISTRY COPYRIGHT 2013 ACS on STN
Absolute stereochemistry.
Absolute stereochemistry.
CM 2
Hy:
- unsaturated, monocyclic
- element count with nitrogen minimum of three
=> Uploading nucleoside hy nitrogen oxygen.str
Generic attributes : 14:
Saturation : Unsaturated
Type of Ring System : Monocyclic
Element Count : Node 14: Unlimited N,Min,1 S,Exact,0 O,Min,1 P,Exact,0
L7 STRUCTURE UPLOADED
=> S L7 SSS FULL
L9 39 SEA SSS FUL L7
=> D SCAN
L9 39 ANSWERS REGISTRY COPYRIGHT 2013 ACS on STN
Absolute stereochemistry.
L9 39 ANSWERS REGISTRY COPYRIGHT 2013 ACS on STN
Absolute stereochemistry. Rotation (+).
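The Element Count lines in the transcripts above list each restriction as an element, a mode and a value (for example `N,Min,1` or `S,Exact,0`). The short Python sketch below shows one way such a line could be read into a dictionary for record keeping. The function `parse_element_counts` is hypothetical and is not part of STN; it assumes every restriction appears as a three-part, comma-separated token, which is all the transcripts here show, so how a range restriction is printed is not covered.

```python
from typing import Dict, Tuple


def parse_element_counts(line: str) -> Dict[str, Tuple[str, int]]:
    """Collect Element,Mode,Value tokens from an Element Count transcript line.

    Tokens that do not have exactly three comma-separated parts ending in a
    number (such as "Node", "14:" or "Unlimited") are ignored.
    """
    limits: Dict[str, Tuple[str, int]] = {}
    for token in line.split():
        parts = token.split(",")
        if len(parts) == 3 and parts[2].isdigit():
            element, mode, value = parts
            limits[element] = (mode, int(value))
    return limits


# Element Count line copied from the third query in the example.
line = "Element Count : Node 14: Unlimited N,Min,1 S,Exact,0 O,Min,1 P,Exact,0"
limits = parse_element_counts(line)
print(limits)
# {'N': ('Min', 1), 'S': ('Exact', 0), 'O': ('Min', 1), 'P': ('Exact', 0)}
```

Read this way, the third query requires at least one nitrogen and at least one oxygen in the Hy ring, and excludes sulfur and phosphorus from it.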
This example highlights the use of element count options when the heteroatoms in a heterocycle are not fixed. Table 3 summarizes the precision gained from each element count restriction, as shown by the decreasing number of hits retrieved with each of the searches.
Hy:
- unsaturated, monocyclic
- element count with
  - Nitrogen: minimum of 1
  - Oxygen: minimum of 1
  - Sulfur: exactly 0
  - Phosphorus: exactly 0
Table 3. Summary of Search Results in the Example
Restrictions | Substances Retrieved (L#)
No element count restriction | 7828 substances (L3)
An element count restriction of minimum of 3 nitrogen atoms | 995 substances (L6)
Element count requirement of nitrogen and oxygen each with a minimum of 1, and sulfur and phosphorus each with an element count of exactly zero | 39 substances (L9)
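To put the counts in Table 3 in proportion, the following small Python sketch expresses each answer set as a share of the unrestricted search (L3). The helper `share_of_baseline` is an illustrative name; only the three hit counts come from the example.

```python
def share_of_baseline(count: int, baseline: int) -> float:
    """Return count as a percentage of the baseline answer set."""
    return 100.0 * count / baseline


# Hit counts taken from Table 3.
hits = {"L3": 7828, "L6": 995, "L9": 39}
baseline = hits["L3"]

for name, count in hits.items():
    share = share_of_baseline(count, baseline)
    print(f"{name}: {count} substances, {share:.2f}% of L3")
# L3: 7828 substances, 100.00% of L3
# L6: 995 substances, 12.71% of L3
# L9: 39 substances, 0.50% of L3
```

The nitrogen restriction keeps roughly one answer in eight, and the combined nitrogen, oxygen, sulfur and phosphorus restriction keeps about one in two hundred.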
By using settings for both Element Count and Generic Definition, the search retrieved precise substances meeting both of the requirements.
STN provides access to key science, technology and patent databases, and is the premier single source for the world’s disclosed scientific and technical research. STN databases such as CAS REGISTRY can be accessed through STN Express, STN® on the Web℠ and STN Easy®. To find more information about the REGISTRY database or other display fields, please contact the CAS Customer Center or refer to the CAS REGISTRY Database Summary Sheet at:
http://www.cas.org/File%20Library/Training/STN/DBSS/registry.pdf.