Faster Sorting, Manipulation, Reporting & Test Data
Speed High-Volume File Processing In/Outside Natural
www.cosort.com
Copyright 2006, IRI, Inc.
Who We Are
Innovative Routines International (IRI), Inc.
ISV - HQ in Melbourne, FL
Founded 1978 – Manhasset, NY
World’s 1st commercial sorts for:
CP/M, DOS, UNIX, Windows, Linux
Experts in sorting and data manipulation
Recommended by all UNIX H/W vendors
Embedded by leading ISVs like:
Cincom, Clerity, EDIWatch,
Experian, Fiserv, Kalido,
Mereo, Sabre, SPSS, ViPS
30+ international support offices
www.cosort.com
1-800-333-SORT
What We Sell
CoSORT
– High Performance Sorting, Reporting, ETL
RowGen
– Safe Test Data in Custom File Formats
FAst extraCT (FACT)
– High Performance Oracle Unload
netCONVERT
– Legacy File Migration using Copybook layouts
Logon Security
– Granular UNIX Account Access Control and Audit
x-PRESS
– Comprehensive Cross-Platform Compression Suite
Permitas
– Software Licensing Libraries and Management
CoSORT Platforms
ALL 32 & 64-bit UNIX (AIX, HP-UX, Solaris, Tru64, IRIX, MP-RAS, DG/UX, SINIX, ptx, etc.)
ALL 32 & 64-bit LINUX (RHEL, SLES, Debian, Fedora, Mandrake, Gentoo, Turbo, WOW, AsianUX, Ubuntu, etc.)
Windows XP, NT, 2K/3, DC
IBM i, p, x & z Series
When You’d Want:
CoSORT
Fast, Large File Transformation
Select, Sort, Join, Convert, Aggregate, Reformat
Legacy Sort Migration
From: VS/VSE JCL, SS Unix, Natural Sort, et al
Data and File Format Conversion
e.g. EBCDIC to ASCII, MF-ISAM to CSV
Custom Reporting & Hand-off
Multi-target/format, segmented detail/summary BI
DB Loader Acceleration
Pre-sort flat files on primary index key
ETL Tool & Application Acceleration
Sort & Metadata Hooks for DataStage, Informatica
RowGen
Test Data & Safe File Synthesis
FAst extraCT (FACT)
Oracle Unload, Reorg & ETL
CoSORT Functionality
High-Performance Record Processing
– Sort, Copy, Merge, Join, Check
– Input / Output Conditional Filter/Select, De-Dupe
– Aggregate (Sum, Min, Max, Count, Average), Sequence
– Cross-Calculate (Expression Logic)
– Segment, Re-map, Re-format, Report (Custom Layouts)
File Integration & Transformation
– Sequential files (Line, Record, Variable)
– Unisys VB & Blocked format files
– MF-ISAM & ACUCOBOL Vision (Index) files
– Named & Un-named Pipes
– Records-in-memory
– Custom input, compare & output procedures (User Exits)
– Coming soon: Huge LDIF & Flat XML Sources
CoSORT Functionality
Data Type Collation & Conversion
ASCII
Binary integer
Bit
Character
Date/timestamps
IP Address
Currency
Double
EBCDIC
Edited numeric
Embedded sign
Float
Packed decimal
Whole number
Unicode
Unsigned decimal
Miscellaneous
– Replacement and Conversion of 3rd-Party Sorts
– LOCALE (operating system defined) collation
– Thread-safe APIs
– Granular resource tuning and monitoring
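Two of the data type conversions listed on this slide, EBCDIC characters and packed decimal, can be illustrated with a short Python sketch. This is not CoSORT code. It assumes the EBCDIC data uses code page 037 (one of several EBCDIC code pages) and that packed fields follow the common convention of two BCD digits per byte with the sign in the final nibble (0xD or 0xB negative, other values 0xA-0xF positive). The function names are made up for the example.

```python
from decimal import Decimal


def ebcdic_to_text(raw, codepage="cp037"):
    """Decode EBCDIC bytes to a Python string (assumes code page 037)."""
    return raw.decode(codepage)


def unpack_decimal(raw, scale=0):
    """Decode a packed-decimal field: BCD digits, sign in the last nibble."""
    if not raw:
        raise ValueError("empty packed field")
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    *digits, sign = nibbles          # last nibble carries the sign
    if any(d > 9 for d in digits) or sign < 0x0A:
        raise ValueError("not a valid packed-decimal field")
    value = int("".join(str(d) for d in digits))
    if sign in (0x0B, 0x0D):         # negative sign nibbles
        value = -value
    return Decimal(value).scaleb(-scale)   # apply implied decimal places


if __name__ == "__main__":
    print(ebcdic_to_text(bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6])))
    print(unpack_decimal(b"\x12\x34\x5C", scale=2))
```

The scale argument stands in for the implied decimal point that a record layout (for example a copybook) would normally supply.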
CoSORT User Interfaces
Sort Replacements
– Seamless drop-in services for many third-party sorts
Sort-I (Sort Interactive)
– Command line prompt/batch program for novices
SortCL (Sort Control Language)
– mainframe-familiar DDL/DML for JCL sort migration,
data warehouse integration/staging (ETL), + reporting
– CLI, API, Java GUI for cross-platform design/launch
SortCL Conversion Tools
– for third-party metadata and legacy sort parms
Application Programming Interfaces (APIs)
– 3 callable libraries for third-party software integration
Leveraging What’s There
“The maturing IT industry can no longer propagate the notion of scrapping previous investments to adopt new technologies. Billions of dollars have already been invested in hardware, operating systems and applications. Our solutions integrate with what is already there, and thus, can deliver exceptional ROI.”
Norman Praed, CEO, Progeni Corporation
Sort Interface Support for:
– Amdocs Ensemble
– Ascential (IBM WebSphere) DataStage
– Cincom Supra SQL
– IBM DB2 UDB Loader
– Informatica PowerMart/Center
– MF COBOL (Workbench, Srv/Net Express)
– SAS System
– Software AG Natural
– Sun Mainframe Rehosting MTP/MBM
– SyncSort UNIX (via script conversion)
– Unix (/bin/sort)
Metadata Re-Use Support for:
– MVS and VSE JCL sort parms
– COBOL Copybooks
– Common & Extended Web Log formats
– Microsoft CSV files
– ETL, BI, XML and RDB file formats via MIMB
1) Copy $COSORT_HOME/etc/Makefile.nat2cs into your ~sag/nat/vxxx/bin/build directory
2) Install the replacement with:
   cd ~sag/nat/vxxx/bin/build
   mv Makefile Makefile.orig
   cp Makefile.nat2cs Makefile
3) Uncomment the Makefile LIB_COSORT entry for your O/S
4) Link with:
   make natural cosort=yes
5) Run with:
   setenv PATH ${PATH}:${COSORT_HOME}/bin
   natural [...]
6) Enable debugging by setting the environment variable:
   NAT2SCL_DEBUG=1
Coroutine SORT Architecture
+ Parallel CPU Exploitation
19 MB in 2 seconds on P200/2 w/ NT, 2 keys
1.0 GB in 12 seconds on IBM p690/4
1.8 GB in 67 seconds on Compaq GS140/8
2.4 GB in 39 seconds on SunFire 15K/6
5.2 GB in 20 minutes on Sun UE3000, 23 keys
272 GB in <2 hours on IBM Numa-Q 2000/4, setting a 2000 TPC-H DSS benchmark record.
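For comparison, the figures above can be converted into approximate throughput. The sketch below is plain arithmetic on the quoted numbers, with two assumptions: 1 GB is taken as 1024 MB, and the "<2 hours" run is treated as exactly 2 hours, so its rate is a lower bound. The runs differ in hardware and key counts, so the rates are not directly comparable.

```python
# (label, megabytes sorted, elapsed seconds), taken from the slide above.
# Assumption: 1 GB = 1024 MB; "<2 hours" is counted as a full 2 hours.
BENCHMARKS = [
    ("P200/2 w/ NT", 19, 2),
    ("IBM p690/4", 1.0 * 1024, 12),
    ("Compaq GS140/8", 1.8 * 1024, 67),
    ("SunFire 15K/6", 2.4 * 1024, 39),
    ("Sun UE3000", 5.2 * 1024, 20 * 60),
    ("IBM Numa-Q 2000/4", 272 * 1024, 2 * 60 * 60),
]


def throughput_mb_per_s(megabytes, seconds):
    """Average sort throughput in MB per second."""
    return megabytes / seconds


def report():
    """Map each benchmark label to its throughput, rounded to 0.1 MB/s."""
    return {
        name: round(throughput_mb_per_s(mb, secs), 1)
        for name, mb, secs in BENCHMARKS
    }


if __name__ == "__main__":
    for name, rate in report().items():
        print(f"{name:20s} {rate:8.1f} MB/s")
```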
Sort Control Language (SortCL)
CoSORT’s SortCL is used for very fast data integration and staging: extract-transform-load (ETL) operations on multiple, large external (flat file) data sources.
TDWI’s Customer Intelligence Lifecycle
SortCL: Single Pass Data Manipulations
… through many, large, differently-formatted inputs:
Sort/Merge – on any number of keys in any position
Join – matching 1-1, many-1, inner/outer, left/right
Select – via record filters or conditional include/omit
Convert – translate input field data types to new types
Aggregate – min, max, average, sum, count (sub and grand)
Calculate – across rows to perform math (+ sci functions)
Re-map – change field positions, sizes, and values
Report – to highly-formatted, multi-level output targets
User Exits – for custom input, compare and output criteria
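To make the idea of combining several manipulations in one job concrete, here is a minimal Python sketch that reads its input once and applies select, sort and aggregate together. It only illustrates the concept. It is not SortCL syntax and says nothing about how CoSORT does this internally. The function name and record fields are made up for the example.

```python
from itertools import groupby
from operator import itemgetter


def select_sort_aggregate(records, keep, key_fields, sum_field):
    """Filter records, sort them on key_fields, then total sum_field per key."""
    key = itemgetter(*key_fields)
    selected = (r for r in records if keep(r))        # Select
    ordered = sorted(selected, key=key)               # Sort
    summary = []
    for _, group in groupby(ordered, key=key):        # Aggregate per key break
        rows = list(group)
        row = {f: rows[0][f] for f in key_fields}
        row["count"] = len(rows)
        row["sum"] = sum(r[sum_field] for r in rows)
        summary.append(row)
    return summary


if __name__ == "__main__":
    data = [
        {"region": "E", "amt": 5},
        {"region": "W", "amt": -1},
        {"region": "E", "amt": 7},
        {"region": "W", "amt": 3},
    ]
    for line in select_sort_aggregate(data, lambda r: r["amt"] > 0, ["region"], "amt"):
        print(line)
```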
SortCL Also Re-hosts JCL Sorts.
Consider Tetrad’s OPX:
Operational Processing for UNIX
Tools and architecture to manage jobs on UNIX
Separates job definitions from job processing
Easy integration with 3rd party tools like:
Natural, CoSORT, MF COBOL
Developed in Perl for multi-platform use
Not dependent on Software AG products
How OPX Rehosts via “Job Wrappers”
OPX provides job access to programs through the use of “Wrappers”
A wrapper is a script or program that facilitates the interface between the OPX job and an external program or process
A sample set of wrappers is provided with OPX (including Natural, SORT, IEFBR14, GENER, FTP)
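The wrapper idea described above, a small program that sits between a job and an external program, can be sketched as follows. OPX itself is written in Perl, and this Python sketch is not its API. The function name, its arguments and the returned dictionary are all hypothetical.

```python
import os
import subprocess
import sys


def run_wrapped(command, extra_env=None):
    """Hypothetical job wrapper: run an external program, report the outcome."""
    env = dict(os.environ)
    env.update(extra_env or {})          # job-specific settings for the step
    completed = subprocess.run(command, env=env, capture_output=True, text=True)
    return {
        "command": list(command),
        "returncode": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }


if __name__ == "__main__":
    # A harmless stand-in for the external program the job would call.
    result = run_wrapped([sys.executable, "-c", "print('step done')"])
    print(result["returncode"], result["stdout"].strip())
```

In a real job the command list would name the external program instead, for example the sortcl /spec=file.scl invocation shown elsewhere in this deck.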
OPX SORT Wrapper
Integration with CoSORT
1) SORT Wrapper creates pseudo-JCL for CoSORT
2) Translation: the CoSORT mvs2scl conversion utility converts the pseudo-JCL to Sort Control Language (.scl)
3) SORT Wrapper pre-processes the .scl statements, giving the final CoSORT .scl statements to perform the sort
4) SORT Wrapper calls the CoSORT SortCL utility
5) CoSORT SortCL performs the .scl job as requested
Done!
OPX SORT Wrapper Example
Can we run this on Unix?
//SUMS     JOB
//         EXEC PGM=SORT
//STEPLIB  DD DSN=SORT.RESI.DENCE,DISP=SHR
//SYSOUT   DD SYSOUT=A
//SORTIN   DD DSN=chiefs30.votes,
//            UNIT=2400-3,VOL=SER=887766,
//            DISP=(OLD,KEEP)
//SORTOUT  DD DSN=termsums,
//            UNIT=2400-3,VOL=SER=554433,
//            DISP=(NEW,KEEP)
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,20)
//SYSIN    DD *
  SORT FIELDS=(40,3,CH,A,45,2,CH,A)
  SUM FIELDS=(23,3,CH)
/*
...with OPX and CoSORT - Sure!
OPX SORT Wrapper Example
What does the CoSORT SortCL Version Look Like?
/INFILES=chiefs30.votes
/FIELD=(field_0, POSITION=40, SIZE=3, EBCDIC)
/FIELD=(field_1, POSITION=45, SIZE=2, EBCDIC)
/FIELD=(field_2, POSITION=23, SIZE=3, EBCDIC)
/CONDITION=(cond_0, TEST=(field_0 OR field_1))
/SORT
/KEY=(field_0, ASCENDING)
/KEY=(field_1, ASCENDING)
/OUTFILE=termsums
/FIELD=(field_0, POSITION=40, SIZE=3, EBCDIC)
/FIELD=(field_1, POSITION=45, SIZE=2, EBCDIC)
/FIELD=(field_2_sum, POSITION=23, SIZE=3, EBCDIC)
/SUM field_2_sum FROM field_2 BREAK cond_0
Same job, easier to expand …
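The relationship between the JCL SORT FIELDS statement and the SortCL /FIELD and /KEY statements shown above can be sketched in a few lines of Python. This is only an illustration of the mapping visible on these two slides, not the mvs2scl utility. The CH-to-EBCDIC and A-to-ASCENDING mappings come from the slides. The D-to-DESCENDING mapping and the function names are assumptions.

```python
import re

FORMATS = {"CH": "EBCDIC"}                      # as shown on the slides
ORDERS = {"A": "ASCENDING", "D": "DESCENDING"}  # "D" mapping is an assumption


def parse_sort_fields(card):
    """Split FIELDS=(pos,len,fmt,order,...) into 4-tuples."""
    match = re.search(r"FIELDS=\(([^)]*)\)", card)
    if not match:
        raise ValueError("no FIELDS=(...) clause found")
    parts = [p.strip() for p in match.group(1).split(",")]
    if len(parts) % 4 != 0:
        raise ValueError("expected groups of position,length,format,order")
    return [
        (int(parts[i]), int(parts[i + 1]), parts[i + 2], parts[i + 3])
        for i in range(0, len(parts), 4)
    ]


def to_sortcl(card):
    """Return (/FIELD lines, /KEY lines) for one SORT control statement."""
    fields, keys = [], []
    for n, (pos, size, fmt, order) in enumerate(parse_sort_fields(card)):
        name = f"field_{n}"
        fields.append(f"/FIELD=({name}, POSITION={pos}, SIZE={size}, {FORMATS[fmt]})")
        keys.append(f"/KEY=({name}, {ORDERS[order]})")
    return fields, keys


if __name__ == "__main__":
    field_lines, key_lines = to_sortcl("SORT FIELDS=(40,3,CH,A,45,2,CH,A)")
    print("\n".join(field_lines + ["/SORT"] + key_lines))
```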
Sort Control Language (SortCL)
CLI/Batch Mode Example (Join)
################################################
# CoSORT/SortCL Job Spec file csumtv_join_1.spec
# gets run with $sortcl /spec=csumtv_join_1.spec
# Created 09/18/2006 15:20EST by DW_team_alpha
# Conditional indexed summary join report + calc
################################################
/STATISTICS=/warehouse/stats/csumptv_join.sta   # runtime performance log
/MEMORY-WORK="$COSORT_HOME/etc/cosortrc"        # job-specific tuning file
/SPEC=/warehouse/views/csumtv                   # metadata, condition references
/INFILE=${SQL_DATA}csumtv_temp                  # first input/join file (EV)
/ALIAS=left
/INFILE=${SQL_DATA}csumtv_ICN_temp              # second input/join file (EV)
/ALIAS=right
/JOIN LEFT_OUTER left right WHERE left.TID==right.TID AND left.SSN==right.SSN
/OUTFILE=${SQL_DATA}csumtv.report               # new record layout, data types
/HEADREC=">>>>> STARS Report >>>> %D, DATE, %S, USER ****"
/FIELD=(left.YDATE, SEPARATOR='~', POSITION=1, SIZE=4.2, NUMERIC)
/FIELD=(left.TID, SEPARATOR='\t', POSITION=2, SIZE=10, EBCDIC)
/DATA="~*"
/FIELD=(left.SSN, SEPARATOR=',', POSITION=3, SIZE=22, ASCII)
/FIELD=(left.sum_DAYS_SUPPLY, SEPARATOR='|', POSITION=4, SIZE=12, INTEGER)
/FIELD=(left.sum_A_AMT, SEPARATOR='~', POSITION=5, SIZE=15.2, NUMERIC)
/DATA= IF Cond1 THEN "~ 0.0" THEN IF Cond2 ELSE left.YDATE+1900
/SUM RUNNING FROM left.PERIOD WHERE COND3       # accumulating aggregate
/FIELD=(SEQUENCER+50, SEPARATOR='|', POSITION=6, SIZE=5) # DB reindexer
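For readers unfamiliar with the join semantics in the job above, here is a minimal Python sketch of a left outer join on the two keys TID and SSN. It uses an in-memory hash index for brevity. How CoSORT performs the join is not described in this deck, so treat the technique, the function name and the sample rows as assumptions made for illustration.

```python
from collections import defaultdict


def left_outer_join(left_rows, right_rows, keys):
    """Pair every left row with each matching right row, or with None."""
    index = defaultdict(list)
    for row in right_rows:
        index[tuple(row[k] for k in keys)].append(row)
    joined = []
    for row in left_rows:
        matches = index.get(tuple(row[k] for k in keys))
        if matches:
            joined.extend((row, match) for match in matches)
        else:
            joined.append((row, None))   # unmatched left rows are kept
    return joined


if __name__ == "__main__":
    left = [{"TID": 1, "SSN": "A", "amt": 10}, {"TID": 2, "SSN": "B", "amt": 20}]
    right = [{"TID": 1, "SSN": "A", "icn": "X9"}]
    for pair in left_outer_join(left, right, ["TID", "SSN"]):
        print(pair)
```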
CoSORT’s Java GUI–to–SortCL (gui2scl) Client/Server Sort/ETL Application
Using CoSORT SortCL with Natural for Unix
CALL shcmd sortcl /spec=file.scl
/INFILE=Natural WORK FILE
/OUTFILE=Natural WORK FILE
Benefits:
– 1-pass, multi-file integration, staging, and reporting
– No limit on # of in/output files
– No limits on file sizes or layouts
– Same metadata as RowGen, so the same SortCL job script can be used in RowGen to generate safe test data in exactly the same file format.
CoSORT Value
Affordable ($1K - $25K)
Perpetual use (not a lease)
Volume license discounts
ISV / OEM runtime pricing
GSA schedule, state bidder
Global support (30+ offices)
Product bundling discounts:
– FAst extraCT (DB unload)
– RowGen (safe test data)
– netCONVERT (file porting)
– x-PRESS (compression)
– Logon Security (access)
– Permitas (app licensing)
CoSORT’s RowGen Data Synthesizer
New Product!
Create Custom Files with Safe Data
(Using SortCL Metadata!)
Prototype Applications
– Create data and file formats your projects need
Share Files with Outsourcers
– Provide accurate layouts, not real data
Specify Value Ranges
– Use selection and set files: better than real data
Simulate DB Ops
– Quickly test table loading and query scenarios
Benchmark Testing
– Gen big files for hardware and software PoCs
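The idea of generating safe test data in a fixed file layout, with values drawn from ranges and from a set of allowed values, can be sketched as below. This is not RowGen and does not use SortCL metadata. The record layout (6-digit account, 2-character state, 8-digit amount), the value set and the function name are all invented for the example.

```python
import random

STATES = ["FL", "NY", "CA"]   # stand-in for the contents of a "set file"


def generate_rows(count, seed=0, low=100, high=99999):
    """Return fixed-width synthetic records: account(6) state(2) amount(8)."""
    rng = random.Random(seed)            # seeded so runs are repeatable
    rows = []
    for i in range(count):
        account = f"{i + 1:06d}"
        state = rng.choice(STATES)       # value picked from the set
        amount = rng.randint(low, high)  # value picked from the range
        rows.append(f"{account}{state}{amount:08d}")
    return rows


if __name__ == "__main__":
    for record in generate_rows(5):
        print(record)
```

Because no real records are read, the output has the right shape for testing loads and queries without exposing production data.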
Using Oracle?
Use FACT to Speed Extracts and write metadata for SortCL and SQL*Loader.
Single-pass entire E-T-L operations through a pipe!
Now You Know.
CoSORT is the innovator in UNIX and Windows sort
software, and a key infrastructure tool for the staging,
integrating, manipulating and presenting of large data
volumes. Since 1978, IT installations have chosen
CoSORT to meet their project and performance
objectives in:
Natural & JCL sort migrations
VLDB reorg (unload, sort, reload)
Data warehouse staging (ETL)
Detail and summary reporting
3rd-party sort replacements
Batch jobs and new products