Tarun Jain, ABB Inc, Extreme Faceting using SOLR Case study at ABB Inc © ABB Group November 6, 2015 | Slide 1
Tarun Jain, ABB Inc,
Extreme Faceting using SOLR
Case study at ABB Inc
© ABB Group
November 6, 2015 | Slide 1
About ABB Inc
Global leader in Power & Industrial Automation
technologies
World’s largest producer of industrial robots
Presence in 100+ countries
2009 revenues USD 33+ billion
About “ABB Products”
“ABB Products” is the central repository of all
product/catalog information within ABB
We maintain the master data related to Product
classification
We maintain master product attribute data for ~485,000
products with 21 million+ attributes
Classification tree structure has 45,000+ nodes and is
maintained in 31 languages
Product attributes are translated into 10 languages
Connected to 18 different downstream applications within
ABB and 5 external applications
About “Product Information Services”
Sub-project of ABB Products
Started in 2008
Provides ABB Product catalog search services to
downstream applications
ABB.com & ABB BusinessOnLine (BOL) are main
consuming applications.
Several more applications in pipeline to start using search
services
Details about Product Information Services
6+ million hits per month from abb.com & ABB BOL
420,000 items with 20+ million attribute values indexed
1200+ attribute types
31 Languages
Running on 2 load-balanced dual-processor quad-core
machines with 16 GB of RAM
Software used:
Windows Server 2003 OS
ASP.Net front-end used to create WebControl
Backend web-services using ASP.Net & WCF
SOLR 1.4 on Tomcat with 3 GB of RAM
Oracle DB
Product Information Services - Features
Text search services
Advanced text search services
Browsing services (Navigation through Classification
Trees)
Facet search filters
Attribute Group List Resolution to Classification Nodes
One general Web Control to support Navigation,
Faceting/List page and Item Detail page
Web Services
Accurate hit counts everywhere
Challenges
Search & Classification tree results to be filtered on
Country, Customer, Consuming application, etc.
Faceting on any of the 1200+ attributes
Hit counts needed to be accurate
Support an ever-growing number of languages
Same codebase for all 3 major consuming applications
Index updates at least 3 times a day
Average response time less than 500 ms
And most importantly: “Everything should be always fast”
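The first two challenges (restricting results by country, customer or consuming application, and faceting on arbitrary attributes) map onto Solr's standard `fq` and `facet.field` request parameters. The following is only a minimal sketch of how such a request could be assembled; the field names (`country`, `application`, `attr_voltage`, `attr_frame_size`) and the helper functions are hypothetical and are not taken from the ABB schema or codebase.

```python
from urllib.parse import urlencode


def escape_phrase(value):
    """Escape backslashes and double quotes so the value is safe inside a quoted phrase."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_filtered_facet_params(text, filters, facet_fields, rows=10):
    """Return the parameter list for one Solr /select request.

    filters      -- hypothetical field -> value restrictions (country, customer, ...)
    facet_fields -- hypothetical attribute fields to return hit counts for
    """
    params = [
        ("q", text),
        ("rows", str(rows)),
        ("facet", "true"),
        ("facet.mincount", "1"),  # skip facet values that have no hits
    ]
    # One fq per restriction, so that each filter can be cached separately by Solr.
    for field, value in sorted(filters.items()):
        params.append(("fq", f'{field}:"{escape_phrase(value)}"'))
    # Repeating facet.field asks for counts on several attributes in one request.
    for field in facet_fields:
        params.append(("facet.field", field))
    return params


def to_query_string(params):
    """URL-encode the parameter list; repeated keys are kept in order."""
    return urlencode(params)


params = build_filtered_facet_params(
    "motor",
    {"country": "US", "application": "abb.com"},
    ["attr_voltage", "attr_frame_size"],
)
query_string = to_query_string(params)
```

The resulting string would be appended to the Solr `/select` URL of the deployment in question. Because the facet counts are computed over the filtered result set, the hit counts shown to a user reflect the country and application restrictions as well.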
Solr vs “Large Commercial Vendor”… Fight !!
SOLR was compared to another major commercial
product
Stress test results in Proof of concept…
SOLR 35 req/sec vs 2 req/sec
Average response times 200 ms vs 1-7 secs
CPU usage 2-3% vs 100%
Sadly, the matchup was not even close (at least for the
scenarios we tested)
Conclusion: Performance of SOLR is inversely
proportional to the cost
Winner – SOLR by a KO
Observations
Going from SOLR 1.3 to 1.4, faceting performance improved 1.5x-2.0x
SOLR has no issue with scaling to 1500+ facets
Java-based index replication is faster than rsync (at least on
Windows)
Tagging filters and excluding them during faceting is a cool feature &
super useful
When using the REST API, use chunky instead of chatty calls
Configure Tomcat/Java to minimize GC as much as possible
Do periodic cleanups of the SOLR schema to minimize stored values,
indexed values, optimize types & fields, etc. Minimize unused bits
in the index
Minimize XML data to/from SOLR
Store all lookup data in memory. Cache as much as possible
Minimize database calls
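Two of the observations above, tagging and excluding filters during faceting and preferring chunky calls, can be illustrated together. The sketch below uses Solr's `{!tag=...}` and `{!ex=...}` local parameters so that a facet the user has already filtered on still reports counts for its other values, and it requests every facet in a single call rather than one call per facet. The field names (`voltage`, `frame_size`) and the helper function are assumptions made for illustration, not the ABB implementation.

```python
from urllib.parse import urlencode


def build_multiselect_facet_params(text, selections, facet_fields):
    """Return the parameter list for one 'chunky' Solr request.

    selections   -- hypothetical field -> list of values the user has ticked
    facet_fields -- hypothetical fields for which counts should be returned
    """
    params = [("q", text), ("facet", "true")]
    for field, values in selections.items():
        if not values:
            continue
        joined = " OR ".join(f'"{v}"' for v in values)
        # Tag the filter (here the tag is simply the field name) so it can be excluded later.
        params.append(("fq", f"{{!tag={field}}}{field}:({joined})"))
    for field in facet_fields:
        if selections.get(field):
            # Exclude this field's own filter while counting its values,
            # so the unselected values keep their counts.
            params.append(("facet.field", f"{{!ex={field}}}{field}"))
        else:
            params.append(("facet.field", field))
    return params


params = build_multiselect_facet_params(
    "motor",
    {"voltage": ["230", "400"]},
    ["voltage", "frame_size"],
)
query_string = urlencode(params)
```

Here `frame_size` is counted over the documents that match the voltage filter, while `voltage` is counted as if that filter were absent. Sending all `facet.field` parameters in one request keeps the number of round trips to Solr constant no matter how many facets a page displays.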
DEMO