THE 1ST NATIONAL PUC DOCKETS DATABASE: AEE POWERSUITE Eric Fitz Director, Engineering and Product Development NARUC Subcommittee on Information Services November 2014


THE 1ST NATIONAL
PUC DOCKETS DATABASE:
AEE POWERSUITE
Eric Fitz
Director, Engineering and Product Development
NARUC Subcommittee on Information Services
November 2014
Advanced Energy Economy
AEE is a national association of business leaders
who are making the global energy system more
secure, clean, and affordable.
Mission: Transform public policy to enable rapid
growth of advanced energy companies.
AEE Membership Across Technologies
Two Energy Policy Data Problems:
1) Fragmented Data
2) Big Data
Industry Data Is Fragmented
You must follow dozens of data sources to track important issues.
[Figure: three groups of data sources: databases (EIA, DSIRE, OpenEI, NREL), PUCs (CA, MA, TX, IL, CT), and industry stakeholder groups]
Big Data
Policy work is plagued by the three “Vs”
• Volume of policy data
• Variety of legislative/regulatory processes
• Velocity of data change
AEE Digital Platform Vision
[Figure: databases (EIA, DSIRE, NREL, OpenEI), PUCs (CA, MA, TX, IL, CT), and industry stakeholder groups connected to a single AEE Big Data Asset]
The Solution – AEE’s PowerSuite
PowerSuite is a robust set of tools (BillBoard, DocketDash, and PowerPortal)
that lets you search, track, and collaborate on energy legislation and
utility regulatory proceedings from across the country
through one easy-to-use interface.
PowerSuite Products
Review of Features
Core Features
Search
• First national PUC database
• Advanced energy focused bills
• Simple interface
Track
• Email notifications
• Favorites
• Reporting
Collaborate
• Summaries
• Priority and Position
• Comments
User Testimonial
Jim Kennerly
Senior Policy Analyst
“PowerSuite is really amazing…I've already
discovered some incentives in California (tax
exemptions and such) we didn't even have in the
database! This is really going to help us
tremendously - great product.”
DEMO
DocketDash System Details
DocketDash Coverage: 46 States + DC
[Map of state coverage; legend: Under Development, Quality Assessment (QA) Pending, Review Completed]
DocketDash Key Stats
• Dockets: 190K
• Documents: 2.6M (900 GB of PDFs)
• Pages: 32M (60 GB of raw text)
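To put the Key Stats figures in proportion, the per-unit averages they imply can be worked out directly. The sketch below is only back-of-the-envelope arithmetic on the rounded numbers from the slide. It assumes decimal units (1 GB = 10**9 bytes), which the slide does not specify, and the function name is illustrative.

```python
# Rounded figures from the DocketDash Key Stats slide.
DOCKETS = 190_000
DOCUMENTS = 2_600_000
PAGES = 32_000_000
PDF_BYTES = 900 * 10**9   # assumption: decimal gigabytes
TEXT_BYTES = 60 * 10**9   # assumption: decimal gigabytes


def per_unit_averages():
    """Return rough averages implied by the slide's headline totals."""
    return {
        "documents_per_docket": DOCUMENTS / DOCKETS,
        "pages_per_document": PAGES / DOCUMENTS,
        "pdf_kb_per_document": PDF_BYTES / DOCUMENTS / 1000,
        "text_bytes_per_page": TEXT_BYTES / PAGES,
    }


if __name__ == "__main__":
    for name, value in per_unit_averages().items():
        print(f"{name}: {value:,.1f}")
```

Because the inputs are rounded, the results are order-of-magnitude indicators only: roughly a dozen documents per docket and a dozen pages per document.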
Number of Pages: Wikipedia vs. DocketDash
[Bar chart, pages in millions: Wikipedia 34M*, DocketDash 32M]
DocketDash will surpass Wikipedia’s total content in a few months.
*As of November 2014, http://en.wikipedia.org/wiki/Wikipedia:Statistics
DocketDash Technology Stack
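[Diagram: PUC sites (CA, MA, TX, IL, CT) pass through per-source adapters (A, B, C); stages shown are Collect, Adapt, Process, Store, Index, and Display (User Interface); bills and dockets are held in the AEE Big Data Asset]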
Technology Stack Detail
Collect
• Dynamic docket metadata collection at off-peak hours
Adapt
• Map source schema to AEE standard (Docket #, Title, Description, Parties, Date...)
Download
• Queue downloads and identify scanned documents
Process
• OCR pipeline: scanned document → reassembled PDF + extracted text
• 20 CPU-Years
Validate
• Review metadata and check for failures
OCR = Optical Character Recognition
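The Adapt step, mapping each source schema to one standard, might look something like the sketch below. This is not AEE's implementation. The source labels "A" and "B", every source field name, the date formats, and the `StandardDocket` class are hypothetical and chosen only for illustration. The target fields follow the ones listed on the slide (docket number, title, description, parties, date).

```python
from dataclasses import dataclass, field
from datetime import date, datetime
from typing import Dict, List, Optional


@dataclass
class StandardDocket:
    """Hypothetical common record that every source is mapped into."""
    source: str
    docket_number: str
    title: str
    description: str = ""
    parties: List[str] = field(default_factory=list)
    filed_date: Optional[date] = None


# Hypothetical per-source adapter specs: which raw field feeds which
# standard field, plus the source's date format and party separator.
ADAPTERS: Dict[str, dict] = {
    "A": {
        "fields": {"docket_number": "proceeding_no", "title": "caption",
                   "description": "summary", "parties": "participants",
                   "filed_date": "filed"},
        "date_format": "%m/%d/%Y",
        "party_sep": ";",
    },
    "B": {
        "fields": {"docket_number": "DocketID", "title": "Title",
                   "description": "Abstract", "parties": "Parties",
                   "filed_date": "FiledOn"},
        "date_format": "%Y-%m-%d",
        "party_sep": "|",
    },
}


def adapt(source: str, raw: Dict[str, str]) -> StandardDocket:
    """Map one raw metadata record from `source` onto the standard schema."""
    spec = ADAPTERS[source]
    names = spec["fields"]
    raw_date = raw.get(names["filed_date"])
    raw_parties = raw.get(names["parties"], "")
    return StandardDocket(
        source=source,
        docket_number=raw[names["docket_number"]].strip(),
        title=raw[names["title"]].strip(),
        description=raw.get(names["description"], "").strip(),
        parties=[p.strip() for p in raw_parties.split(spec["party_sep"]) if p.strip()],
        filed_date=(datetime.strptime(raw_date, spec["date_format"]).date()
                    if raw_date else None),
    )
```

Keeping the mapping in data rather than in code means a new source mostly needs a new entry in `ADAPTERS`, as long as its records are flat key-value pairs.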
What Have We Learned?
• PUC docket sites vary dramatically state by state
  • Usability
  • Permalinks
  • Search
  • Data structure
  • Nomenclature
  • Digital vs. paper system
• Creating a standardized docket system is hard
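The digital vs. paper difference is what makes the "identify scanned documents" step necessary: a born-digital PDF already has a text layer, while a scan needs OCR. A minimal heuristic sketch follows. It assumes the per-page text has already been extracted by some PDF tool, and the thresholds, function names, and document IDs are hypothetical rather than taken from DocketDash.

```python
from collections import deque
from typing import Deque, Iterable, List, Tuple


def looks_scanned(page_texts: List[str], min_chars: int = 50,
                  threshold: float = 0.5) -> bool:
    """Guess whether a document is a scan.

    A page counts as "sparse" when its extracted text has fewer than
    `min_chars` non-whitespace characters. The document is treated as
    scanned when at least `threshold` of its pages are sparse, or when
    it has no pages at all.
    """
    if not page_texts:
        return True
    sparse = sum(1 for text in page_texts if len("".join(text.split())) < min_chars)
    return sparse / len(page_texts) >= threshold


def route(documents: Iterable[Tuple[str, List[str]]]) -> Tuple[Deque[str], Deque[str]]:
    """Split (doc_id, page_texts) pairs into an OCR queue and a text-ready queue."""
    ocr_queue: Deque[str] = deque()
    text_queue: Deque[str] = deque()
    for doc_id, page_texts in documents:
        target = ocr_queue if looks_scanned(page_texts) else text_queue
        target.append(doc_id)
    return ocr_queue, text_queue
```

For example, a three-page filing whose pages yield only a few stray characters would land in the OCR queue, while one with hundreds of characters per page would skip OCR. Mixed documents (a digital cover letter attached to scanned exhibits) are the hard case, which is why the threshold is a parameter.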
PowerSuite is FREE
For federal, state, and municipal
government employees
Create an account today >
PowerSuite.aee.net
QUESTIONS?