Faster Sorting, Manipulation, Reporting & Test Data
Speed High-Volume File Processing In/Outside Natural
www.cosort.com
Copyright 2006, IRI, Inc.
Who We Are
Innovative Routines International (IRI), Inc.

ISV - HQ in Melbourne, FL
Founded 1978 – Manhasset, NY
World’s 1st commercial sorts for:
CP/M, DOS, UNIX, Windows, Linux

Experts in sorting and data manipulation

Recommended by all UNIX H/W vendors

Embedded by leading ISVs like:
Cincom, Clerity, EDIWatch,
Experian, Fiserv, Kalido,
Mereo, Sabre, SPSS, ViPS

30+ international support offices
www.cosort.com
1-800-333-SORT
What We Sell
CoSORT
– High Performance Sorting, Reporting, ETL
RowGen
– Safe Test Data in Custom File Formats
FAst extraCT (FACT)
– High Performance Oracle Unload
netCONVERT
– Legacy File Migration using Copybook layouts
Logon Security
– Granular UNIX Account Access Control and Audit
x-PRESS
– Comprehensive Cross-Platform Compression Suite
Permitas
– Software Licensing Libraries and Management
CoSORT Platforms
ALL 32 & 64-bit UNIX
(AIX, HP-UX, Solaris, Tru64, IRIX,
MP-RAS, DG/UX, SINIX, ptx, etc.)
ALL 32 & 64-bit LINUX
(RHEL, SLES, Debian, Fedora,
Mandrake, Gentoo, Turbo,
WOW, AsianUX, Ubuntu, etc.)
Windows XP, NT, 2K/3, DC
IBM i, p, x & z Series
When You’d Want:
CoSORT
Fast, Large File Transformation
Select, Sort, Join, Convert, Aggregate, Reformat
Legacy Sort Migration
From: VS/VSE JCL, SS Unix, Natural Sort, et al
Data and File Format Conversion
e.g. EBCDIC to ASCII, MF-ISAM to CSV
Custom Reporting & Hand-off
Multi-target/format, segmented detail/summary BI
DB Loader Acceleration
Pre-sort flat files on primary index key
ETL Tool & Application Acceleration
Sort & Metadata Hooks for DataStage, Informatica
RowGen
Test Data & Safe File Synthesis
FAst extraCT (FACT)
Oracle Unload, Reorg & ETL
CoSORT Functionality

High-Performance Record Processing
– Sort, Copy, Merge, Join, Check
– Input / Output Conditional Filter/Select, De-Dupe
– Aggregate (Sum, Min, Max, Count, Average), Sequence
– Cross-Calculate (Expression Logic)
– Segment, Re-map, Re-format, Report (Custom Layouts)
File Integration & Transformation
– Sequential files (Line, Record, Variable)
– Unisys VB & Blocked format files
– MF-ISAM & ACUCOBOL Vision (Index) files
– Named & Un-named Pipes
– Records-in-memory
– Custom input, compare & output procedures (User Exits)
– Coming soon: Huge LDIF & Flat XML Sources
CoSORT Functionality

Data Type Collation & Conversion
ASCII
Binary integer
Bit
Character
Date/timestamps
IP Address
Currency
Double

EBCDIC
Edited numeric
Embedded sign
Float
Packed decimal
Whole number
Unicode
Unsigned decimal
Miscellaneous
– Replacement and Conversion of 3rd-Party Sorts
– LOCALE (operating system defined) collation
– Thread-safe APIs
– Granular resource tuning and monitoring
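
To make the data type conversions listed above more concrete, here is a minimal Python sketch of two of them: EBCDIC to ASCII text, and packed decimal to a number. It is not CoSORT code. The function names are hypothetical, and code page 037 is an assumption, since EBCDIC has many code pages.

```python
from decimal import Decimal


def ebcdic_to_text(raw: bytes, codepage: str = "cp037") -> str:
    """Decode EBCDIC bytes to a string (code page 037 is an assumption)."""
    return raw.decode(codepage)


def unpack_decimal(raw: bytes, scale: int = 0) -> Decimal:
    """Decode an IBM packed-decimal field.

    Each byte holds two 4-bit digits; the last nibble is the sign
    (0xB or 0xD means negative, anything else is treated as positive).
    `scale` is the number of implied decimal places.
    """
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign = nibbles.pop()
    if any(digit > 9 for digit in nibbles):
        raise ValueError("invalid digit nibble in packed field")
    value = int("".join(str(digit) for digit in nibbles) or "0")
    if sign in (0x0B, 0x0D):
        value = -value
    return Decimal(value).scaleb(-scale)
```

For example, `unpack_decimal(bytes([0x12, 0x34, 0x5C]), scale=2)` decodes the three-byte field `12 34 5C` as 123.45.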
CoSORT User Interfaces

Sort Replacements
– Seamless drop-in services for many third-party sorts

Sort-I (Sort Interactive)
– Command line prompt/batch program for novices

SortCL (Sort Control Language)
– mainframe-familiar DDL/DML for JCL sort migration,
data warehouse integration/staging (ETL), + reporting
– CLI, API, Java GUI for cross-platform design/launch

SortCL Conversion Tools
– for third-party metadata and legacy sort parms

Application Programming Interfaces (APIs)
– 3 callable libraries for third-party software integration
Leveraging What’s There

“The maturing IT industry can
no longer propagate the notion of
scrapping previous investments to
adopt new technologies.
Billions of dollars have already
been invested in hardware,
operating systems and applications.
Our solutions integrate with
what is already there, and thus,
can deliver exceptional ROI.”
Norman Praed, CEO
Progeni Corporation
Sort Interface Support for:
– Amdocs Ensemble
– Ascential (IBM WebSphere) DataStage
– Cincom Supra SQL
– IBM DB2 UDB Loader
– Informatica PowerMart/Center
– MF COBOL (Workbench, Srv/Net Express)
– SAS System
– Software AG Natural
– Sun Mainframe Rehosting MTP/MBM
– SyncSort UNIX (via script conversion)
– Unix (/bin/sort)
Metadata Re-Use Support for:
– MVS and VSE JCL sort parms
– COBOL Copybooks
– Common & Extended Web Log formats
– Microsoft CSV files
– ETL, BI, XML and RDB file formats via MIMB
1) Copy $COSORT_HOME/etc/Makefile.nat2cs
into your ~sag/nat/vxxx/bin/build directory
2) Install the replacement with:
cd ~sag/nat/vxxx/bin/build
mv Makefile Makefile.orig
cp Makefile.nat2cs Makefile
3) Uncomment Makefile LIB_COSORT entry for your O/S
4) Link with:
make natural cosort=yes
5) Run with:
setenv PATH ${PATH}:${COSORT_HOME}/bin
natural [...]
6) Enable debugging by setting the environment variable:
NAT2SCL_DEBUG=1.
Coroutine SORT Architecture
+ Parallel CPU Exploitation
– 19 MB in 2 seconds on P200/2 w/ NT, 2 keys
– 1.0 GB in 12 seconds on IBM p690/4
– 1.8 GB in 67 seconds on Compaq GS140/8
– 2.4 GB in 39 seconds on SunFire 15K/6
– 5.2 GB in 20 minutes on Sun UE3000, 23 keys
– 272 GB in <2 hours on IBM Numa-Q 2000/4, setting a 2000 TPC-H DSS benchmark record.
Sort Control Language (SortCL)
CoSORT’s SortCL is used for very fast data integration and staging:
extract-transform-load (ETL) operations on multiple, large external
(flat file) data sources.
TDWI’s Customer Intelligence Lifecycle
SortCL: Single Pass Data Manipulations
… through many, large, differently-formatted inputs:

Sort/Merge – on any number of keys in any position
Join – matching 1-1, many-1, inner/outer, left/right
Select – via record filters or conditional include/omit
Convert – translate input field data types to new types
Aggregate – min, max, average, sum, count (sub and grand)
Calculate – across rows to perform math (+ sci functions)
Re-map – change field positions, sizes, and values
Report – to highly-formatted, multi-level output targets
User Exits – for custom input, compare and output criteria
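
As a rough illustration of how select, sort, and aggregate combine in one job, the following Python sketch filters records, orders them by key, and emits one summary row per key break. It only mirrors the semantics listed above. It is not SortCL, it makes no claim about how CoSORT achieves a single pass, and the function and field names are hypothetical.

```python
from itertools import groupby


def select_sort_aggregate(rows, keep, key_fields, value_field):
    """Filter rows, sort them on key_fields, then summarize each key group."""

    def key(row):
        return tuple(row[name] for name in key_fields)

    # Select: drop records that fail the condition. Sort: order by the keys.
    selected = sorted(filter(keep, rows), key=key)

    # Aggregate: one output record per change ("break") in the key values.
    for group_key, group in groupby(selected, key=key):
        values = [row[value_field] for row in group]
        summary = dict(zip(key_fields, group_key))
        summary.update(
            count=len(values), sum=sum(values), min=min(values), max=max(values)
        )
        yield summary
```

Calling it with `keep=lambda r: r["amt"] > 0`, `key_fields=["region"]` and `value_field="amt"` would produce one row per region with its count, sum, min and max.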
SortCL Also Re-hosts JCL Sorts.
Consider Tetrad’s OPX:
Operational Processing for UNIX

Tools and architecture to manage jobs on UNIX

Separates job definitions from job processing

Easy integration with 3rd party tools like:
Natural, CoSORT, MF COBOL

Developed in Perl for multi-platform use

Not dependent on Software AG products
How OPX Rehosts via “Job Wrappers”

OPX provides job access to programs
through the use of “Wrappers”

A wrapper is a script or program that
facilitates the interface between the OPX job
and an external program or process

A sample set of wrappers are provided with
OPX (including Natural, SORT, IEFBR14,
GENER, FTP)
OPX SORT Wrapper Integration with CoSORT
1) SORT Wrapper creates pseudo-JCL for CoSORT
   Result: pseudo-JCL for CoSORT
2) Translation by the CoSORT mvs2scl conversion utility
   Result: converted Sort Control Language (.scl)
3) SORT Wrapper pre-processes the .scl statements
   Result: final CoSORT .scl statements to perform sort
4) SORT Wrapper calls the CoSORT SortCL utility
   Result: CoSORT SortCL performs the .scl job as requested
Done!
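
The wrapper flow above can be sketched as a small pipeline. This is only an illustration under stated assumptions: OPX itself is written in Perl, and the slides do not show how mvs2scl is invoked, so the translation and pre-processing steps are passed in as hypothetical callables. The only command taken from the slides is `sortcl /spec=<file>`.

```python
import subprocess
from pathlib import Path
from typing import Callable, Sequence


def run_sort_step(
    pseudo_jcl: str,
    translate: Callable[[str], str],    # stands in for the mvs2scl conversion
    preprocess: Callable[[str], str],   # stands in for the wrapper's .scl pre-processing
    workdir: Path,
    runner: Callable[[Sequence[str]], object] = lambda cmd: subprocess.run(cmd, check=True),
) -> Path:
    """Translate pseudo-JCL to .scl, finalize it, write it out, and run SortCL."""
    converted = translate(pseudo_jcl)   # converted Sort Control Language
    final = preprocess(converted)       # final .scl statements
    spec = workdir / "step.scl"         # hypothetical file name
    spec.write_text(final)
    runner(["sortcl", f"/spec={spec}"])  # hand the job to SortCL
    return spec
```

Passing the runner as a parameter keeps the sketch testable on a machine without CoSORT installed: a test can substitute a function that records the command instead of executing it.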
OPX SORT Wrapper Example
Can we run this on Unix?
//SUMS     JOB
//         EXEC PGM=SORT
//STEPLIB  DD   DSN=SORT.RESI.DENCE,DISP=SHR
//SYSOUT   DD   SYSOUT=A
//SORTIN   DD   DSN=chiefs30.votes,
//              UNIT=2400-3,VOL=SER=887766,
//              DISP=(OLD,KEEP)
//SORTOUT  DD   DSN=termsums,
//              UNIT=2400-3,VOL=SER=554433,
//              DISP=(NEW,KEEP)
//SORTWK01 DD   UNIT=SYSDA,SPACE=(CYL,20)
//SYSIN    DD   *
 SORT FIELDS=(40,3,CH,A,45,2,CH,A)
 SUM FIELDS=(23,3,CH)
/*
...with OPX and CoSORT - Sure!
OPX SORT Wrapper Example
What does the CoSORT SortCL Version Look Like?
/INFILES=chiefs30.votes
/FIELD=(field_0, POSITION=40, SIZE=3, EBCDIC)
/FIELD=(field_1, POSITION=45, SIZE=2, EBCDIC)
/FIELD=(field_2, POSITION=23, SIZE=3, EBCDIC)
/CONDITION=(cond_0, TEST=(field_0 OR field_1))
/SORT
/KEY=(field_0, ASCENDING)
/KEY=(field_1, ASCENDING)
/OUTFILE=termsums
/FIELD=(field_0, POSITION=40, SIZE=3, EBCDIC)
/FIELD=(field_1, POSITION=45, SIZE=2, EBCDIC)
/FIELD=(field_2_sum, POSITION=23, SIZE=3, EBCDIC)
/SUM field_2_sum FROM field_2 BREAK cond_0
Same job, easier to expand …
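
To show how the JCL control card maps onto the SortCL statements above, here is a minimal Python sketch that parses a `SORT FIELDS=(...)` card and emits `/FIELD` and `/KEY` lines in the same style. It is not the mvs2scl utility and does not reproduce its logic. It handles only the SORT card (not SUM), and the EBCDIC default simply follows the example on the slide.

```python
import re

ORDER = {"A": "ASCENDING", "D": "DESCENDING"}


def parse_sort_fields(card: str):
    """Return [(position, length, format, order), ...] from a SORT card."""
    match = re.search(r"SORT\s+FIELDS=\(([^)]*)\)", card)
    if not match:
        raise ValueError("no SORT FIELDS=(...) clause found")
    parts = [part.strip() for part in match.group(1).split(",")]
    if len(parts) % 4:
        raise ValueError("expected groups of position,length,format,order")
    return [
        (int(parts[i]), int(parts[i + 1]), parts[i + 2], parts[i + 3])
        for i in range(0, len(parts), 4)
    ]


def to_sortcl(card: str, infile: str, outfile: str, data_type: str = "EBCDIC") -> str:
    """Emit SortCL-style statements for the sort keys in a SORT card."""
    keys = parse_sort_fields(card)
    fields = [
        f"/FIELD=(field_{n}, POSITION={pos}, SIZE={size}, {data_type})"
        for n, (pos, size, _fmt, _order) in enumerate(keys)
    ]
    lines = [f"/INFILES={infile}", *fields, "/SORT"]
    lines += [f"/KEY=(field_{n}, {ORDER[order]})" for n, (*_, order) in enumerate(keys)]
    lines += [f"/OUTFILE={outfile}", *fields]
    return "\n".join(lines)
```

Running `to_sortcl("SORT FIELDS=(40,3,CH,A,45,2,CH,A)", "chiefs30.votes", "termsums")` yields the two key fields and their ascending `/KEY` statements from the example, without the summed third field.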
Sort Control Language (SortCL)
CLI/Batch Mode Example (Join)
################################################
# CoSORT/SortCL Job Spec file csumtv_join_1.spec
# gets run with $sortcl /spec=csumtv_join_1.spec
# Created 09/18/2006 15:20EST by DW_team_alpha
# Conditional indexed summary join report + calc
################################################
/STATISTICS=/warehouse/stats/csumptv_join.sta    # runtime performance log
/MEMORY-WORK="$COSORT_HOME/etc/cosortrc"         # job-specific tuning file
/SPEC=/warehouse/views/csumtv                    # metadata, condition references
/INFILE=${SQL_DATA}csumtv_temp                   # first input/join file (EV)
/ALIAS=left
/INFILE=${SQL_DATA}csumtv_ICN_temp               # second input/join file (EV)
/ALIAS=right
/JOIN LEFT_OUTER left right WHERE left.TID==right.TID AND left.SSN==right.SSN
/OUTFILE=${SQL_DATA}csumtv.report                # new record layout, data types
/HEADREC=">>>>> STARS Report >>>> %D, DATE, %S, USER ****"
/FIELD=(left.YDATE, SEPARATOR='~', POSITION=1, SIZE=4.2, NUMERIC)
/FIELD=(left.TID, SEPARATOR='\t', POSITION=2, SIZE=10, EBCDIC)
/DATA="~*"
/FIELD=(left.SSN, SEPARATOR=',', POSITION=3, SIZE=22, ASCII)
/FIELD=(left.sum_DAYS_SUPPLY, SEPARATOR='|', POSITION=4, SIZE=12, INTEGER)
/FIELD=(left.sum_A_AMT, SEPARATOR='~', POSITION=5, SIZE=15.2, NUMERIC)
/DATA= IF Cond1 THEN "~ 0.0" THEN IF Cond2 ELSE left.YDATE+1900
/SUM RUNNING FROM left.PERIOD WHERE COND3        # accumulating aggregate
/FIELD=(SEQUENCER+50, SEPARATOR='|', POSITION=6, SIZE=5) # DB reindexer
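
For readers unfamiliar with the `/JOIN LEFT_OUTER ... WHERE` statement in the job above, this Python sketch shows what a left outer join on TID and SSN produces. It is an assumption-laden illustration: it uses an in-memory hash index, which says nothing about how SortCL performs the join, and the function name and output key prefixes are hypothetical.

```python
from collections import defaultdict


def left_outer_join(left_rows, right_rows, keys=("TID", "SSN")):
    """Yield every left row, paired with each right row that matches on keys.

    Left rows with no match are still emitted, just without right.* values.
    """
    index = defaultdict(list)
    for row in right_rows:
        index[tuple(row[k] for k in keys)].append(row)

    for lrow in left_rows:
        out = {f"left.{name}": value for name, value in lrow.items()}
        matches = index.get(tuple(lrow[k] for k in keys))
        if not matches:
            yield out                      # unmatched left record is kept
            continue
        for rrow in matches:
            joined = dict(out)
            joined.update({f"right.{name}": value for name, value in rrow.items()})
            yield joined
```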
CoSORT’s Java GUI–to–SortCL (gui2scl) Client/Server Sort/ETL Application
Using CoSORT SortCL with Natural for Unix

CALL shcmd sortcl /spec=file.scl

/INFILE=Natural WORK FILE

/OUTFILE=Natural WORK FILE

Benefits:
– 1-pass, multi-file integration, staging, and reporting
– No limit on # of in/output files
– No limits on file sizes or layouts
– Same metadata as RowGen, so the same SortCL job script can be
  used in RowGen to generate safe test data in exactly the same file format
CoSORT Value

Affordable ($1K - $25K)

Perpetual use (not a lease)

Volume license discounts

ISV / OEM runtime pricing

GSA schedule, state bidder

Global support (30+ offices)

Product bundling discounts:
– FAst extraCT (DB unload)
– RowGen (safe test data)
– netCONVERT (file porting)
– x-PRESS (compression)
– Logon Security (access)
– Permitas (app licensing)
CoSORT’s RowGen Data Synthesizer
New Product!
Create Custom Files with Safe Data
(Using SortCL Metadata!)

Prototype Applications
– Create data and file formats your projects need

Share Files with Outsourcers
– Provide accurate layouts, not real data

Specify Value Ranges
– Use selection and set files: better than real data

Simulate DB Ops
– Quickly test table loading and query scenarios

Benchmark Testing
– Gen big files for hardware and software PoCs
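
The idea of producing safe files from a layout plus value ranges and sets can be sketched in a few lines of Python. This is not RowGen and does not use SortCL metadata. The layout, field names, and value sets below are hypothetical, chosen only to show a fixed-width record built from ranges and allowed-value lists.

```python
import random

# Hypothetical layout: (field name, width, value spec).
# A (low, high) tuple is a numeric range; a list is a set of allowed values.
LAYOUT = [
    ("term", 3, (1, 999)),
    ("state", 2, ["FL", "NY", "TX"]),
    ("votes", 5, (0, 99999)),
]


def make_record(rng, layout=LAYOUT):
    """Build one fixed-width record whose fields honor the layout."""
    parts = []
    for _name, width, spec in layout:
        if isinstance(spec, tuple):
            low, high = spec
            text = str(rng.randint(low, high)).rjust(width, "0")
        else:
            text = rng.choice(spec).ljust(width)
        if len(text) != width:
            raise ValueError(f"value {text!r} does not fit width {width}")
        parts.append(text)
    return "".join(parts)


def generate(count, seed=0, layout=LAYOUT):
    """Yield `count` synthetic records; the same seed repeats the same data."""
    rng = random.Random(seed)
    for _ in range(count):
        yield make_record(rng, layout)
```

Seeding the generator makes a test file reproducible, which helps when the same data must be regenerated for a repeated benchmark run.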
Using Oracle?
Use FACT to Speed Extracts and write metadata for SortCL and SQL*Loader.
Single-pass entire E-T-L operations through a pipe!
Now You Know.
CoSORT is the innovator in UNIX and Windows sort
software, and a key infrastructure tool for the staging,
integrating, manipulating and presenting of large data
volumes. Since 1978, IT installations have chosen
CoSORT to meet their project and performance
objectives in:

Natural & JCL sort migrations

VLDB reorg (unload, sort, reload)

Data warehouse staging (ETL)

Detail and summary reporting

3rd-party sort replacements

Batch jobs and new products