Tarun Jain, ABB Inc, Extreme Faceting using SOLR Case study at ABB Inc © ABB Group November 6, 2015 | Slide 1

Tarun Jain, ABB Inc,
Extreme Faceting using SOLR
Case study at ABB Inc
© ABB Group
November 6, 2015 | Slide 1
About ABB Inc

Global leader in Power & Industrial Automation
technologies

World’s largest producer of industrial robots

Presence in 100+ countries

2009 revenues USD 33+ billion
About “ABB Products”

“ABB Products” is the central repository of all
product/catalog information within ABB

We maintain the master data related to Product
classification

We maintain master product attribute data for ~485,000
products with 21 million+ attributes

Classification tree structure has 45,000+ nodes and is
maintained in 31 languages

Product attributes are translated into 10 languages

Connected to 18 different downstream applications within
ABB and 5 external applications
About “Product Information Services”

Sub-project of ABB Products

Started in 2008

Provides ABB Product catalog search services to
downstream applications

ABB.com & ABB BusinessOnLine (BOL) are main
consuming applications.

Several more applications in pipeline to start using search
services
Details about Product Information Systems

6+ million hits per month from abb.com & ABB BOL

420,000 items with 20+ million attribute values indexed

1200+ attribute types

31 Languages

Running on 2 load-balanced dual-processor quad-core
machines with 16 GB of RAM

Software used:

Windows Server 2003 OS

ASP.Net front-end used to create WebControl

Backend web-services using ASP.Net & WCF

SOLR 1.4 running on Tomcat with 3 GB of RAM

Oracle DB
Product Information Services - Features

Text search services

Advanced text search services

Browsing services (Navigation through Classification
Trees)

Facet search filters

Attribute Group List Resolution to Classification Nodes

One general Web Control to support Navigation,
Faceting/List page and Item Detail page

Web Services

Accurate hit counts everywhere
Challenges

Search & Classification tree results to be filtered on
Country, Customer, Consuming application, etc.

Faceting on any of the 1200+ attributes

Hit counts needed to be accurate

Support an ever-growing number of languages

Same codebase for all 3 major consuming applications

Index updates at least 3 times a day

Average response time less than 500 ms

And most importantly: “Everything should always be fast”
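
The filtering and faceting requirements above map onto standard Solr request parameters. A minimal sketch, assuming hypothetical field names (`country`, `application`, `attr_voltage`, `attr_color`) that are not taken from the ABB schema; only the parameter names (`q`, `fq`, `facet`, `facet.field`, `facet.mincount`, `rows`) are Solr's own:

```python
from urllib.parse import urlencode


def build_facet_query(text, filters, facet_fields, rows=10):
    """Return a query string for Solr's /select handler.

    filters      -- dict of field -> value, each sent as its own fq
    facet_fields -- fields to compute facet counts for
    All field names are illustrative assumptions.
    """
    params = [
        ("q", text),
        ("rows", str(rows)),
        ("facet", "true"),
        ("facet.mincount", "1"),  # skip values with zero hits
    ]
    for field, value in filters.items():
        # One fq per restriction; values are quoted as phrases.
        # (Values containing quotes would need escaping first.)
        params.append(("fq", f'{field}:"{value}"'))
    for field in facet_fields:
        params.append(("facet.field", field))
    return urlencode(params)


query = build_facet_query(
    "motor",
    {"country": "US", "application": "abb.com"},
    ["attr_voltage", "attr_color"],
)
print(query)
```

Because the facet counts are computed over the filtered result set, the counts shown next to each attribute value match what the user sees after clicking it.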
Solr vs “Large Commercial Vendor”… Fight !!

SOLR was compared to another major commercial
product

Stress test results in Proof of concept…

SOLR 35 req/sec vs 2 req/sec

Average response times 200 ms vs 1-7 secs

CPU usage 2-3% vs 100%

Sadly, the matchup was not even close (at least for the
scenarios we tested)

Conclusion: performance of SOLR is inversely
proportional to the cost

Winner – SOLR by a KO
Observations

Going from SOLR 1.3 to 1.4, faceting performance improved 1.5x to 2.0x

SOLR has no issue with scaling to 1500+ facets

Java-based index replication is faster than rsync (at least on
Windows)

Tagging filters and excluding them during faceting is a cool feature &
super useful

When using the REST API, use chunky instead of chatty calls

Configure Tomcat/Java to minimize GC as much as possible

Do periodic cleanups of the SOLR schema to minimize stored values,
indexed values, optimize types & fields, etc. Minimize unused bits
in the index

Minimize XML data to/from SOLR

Store all lookup data in memory. Cache as much as possible

Minimize database calls
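
The tag-and-exclude observation refers to Solr's local-parameter syntax: a filter query is labeled with `{!tag=...}` and a facet field names the tags to ignore with `{!ex=...}`, so a facet keeps showing counts for its unselected values after the user picks one. A sketch with hypothetical field names (`attr_color`, `attr_voltage`), which also requests every facet in a single call in the spirit of chunky rather than chatty requests:

```python
from urllib.parse import urlencode


def build_multiselect_query(text, selected, facet_fields):
    """Return a /select query string using tagged filters.

    selected     -- dict of field -> list of values the user picked
    facet_fields -- all fields to facet on, fetched in one request
    Field names are illustrative assumptions, not the ABB schema.
    """
    params = [("q", text), ("facet", "true"), ("wt", "json")]
    for field, values in selected.items():
        # OR the picked values together and tag the filter with the field name.
        joined = " OR ".join(f'"{v}"' for v in values)
        params.append(("fq", f"{{!tag={field}}}{field}:({joined})"))
    for field in facet_fields:
        if field in selected:
            # Exclude this field's own filter when counting its values.
            params.append(("facet.field", f"{{!ex={field}}}{field}"))
        else:
            params.append(("facet.field", field))
    return urlencode(params)


ms_query = build_multiselect_query(
    "motor",
    {"attr_color": ["red", "blue"]},
    ["attr_color", "attr_voltage"],
)
print(ms_query)
```

Here the `attr_color` facet is counted as if its own filter were absent, while `attr_voltage` is still counted against the fully filtered result set.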
DEMO