Tarun Jain, ABB Inc, Extreme Faceting using SOLR Case study at ABB Inc © ABB Group November 6, 2015 | Slide 1
Slide 2: About ABB Inc
- Global leader in power and industrial automation technologies
- World's largest producer of industrial robots
- Presence in 100+ countries
- 2009 revenues of USD 33+ billion

Slide 3: About “ABB Products”
- “ABB Products” is the central repository of all product/catalog information within ABB
- We maintain the master data related to product classification
- We maintain master product attribute data for ~485,000 products with 21 million+ attributes
- The classification tree structure has 45,000+ nodes and is maintained in 31 languages
- Product attributes are translated into 10 languages
- Connected to 18 different downstream applications within ABB and 5 external applications

Slide 4: About “Product Information Services”
- Sub-project of ABB Products
- Started in 2008
- Provides ABB product catalog search services to downstream applications
- ABB.com and ABB BusinessOnLine (BOL) are the main consuming applications
- Several more applications are in the pipeline to start using the search services

Slide 5: Details about Product Information Systems
- 6+ million hits per month from abb.com and ABB BOL
- 420,000 items with 20+ million attribute values indexed
- 1200+ attribute types
- 31 languages
- Running on 2 load-balanced dual-processor quad-core machines with 16 GB of RAM
- Software used:
  - Windows Server 2003 OS
  - ASP.Net front-end used to create the WebControl
  - Backend web services using ASP.Net and WCF
  - SOLR 1.4 on Tomcat with 3 GB of RAM
  - Oracle DB

Slide 6: Product Information Services - Features
- Text search services
- Advanced text search services
- Browsing services (navigation through classification trees)
- Facet search filters
- Attribute Group List
- Resolution to Classification Nodes
- One general Web Control to support the Navigation, Faceting/List and Item Detail pages
- Web Services
- Accurate hit counts everywhere

Slide 7: Challenges
- Search and classification tree results have to be filtered on country, customer, consuming application, etc.
- Faceting on any of the 1200+ attributes
- Hit counts need to be accurate
- Support an ever-growing number of languages
- Same codebase for all 3 major consuming applications
- Index updates at least 3 times a day
- Average response time of less than 500 ms
- And most importantly: “Everything should be always fast”

Slide 8: Solr vs “Large Commercial Vendor”… Fight !!
- SOLR was compared to another major commercial product
- Stress test results in the proof of concept (SOLR vs the commercial product):
  - Throughput: 35 req/sec vs 2 req/sec
  - Average response times: 200 ms vs 1-7 secs
  - CPU usage: 2-3% vs 100%
- Sadly, the matchup was not even close (at least for the scenarios we tested)
- Conclusion: performance of SOLR is inversely proportional to the cost
- Winner: SOLR by a KO

Slide 9: Observations
- Going from SOLR 1.3 to 1.4, faceting performance improved 1.5x to 2.0x
- SOLR has no issue with scaling to 1500+ facets
- Java-based index replication is faster than rsync (at least on Windows)
- Tagging filters and excluding them during faceting is a cool feature and super useful
- While using the REST API, use chunky instead of chatty calls (a few large requests rather than many small ones)
- Configure Tomcat/Java to minimize GC as much as possible
- Do periodic cleanups of the SOLR schema to minimize stored values and indexed values, optimize types and fields, etc.
- Minimize unused bits in the index
- Minimize XML data to/from SOLR
- Store all lookup data in memory; cache as much as possible
- Minimize database calls

Slide 10: DEMO
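
Two of the observations above (tagging filters and excluding them during faceting, and preferring chunky over chatty calls) can be sketched together. The Python sketch below is illustrative only and is not the code used at ABB: the function name build_facet_query, the field names country and voltage, and the host in the printed URL are assumptions made up for the example. It relies only on SOLR's documented {!tag=...} and {!ex=...} local parameters and on repeating facet.field in one request.

```python
from urllib.parse import urlencode, parse_qsl


def build_facet_query(text, filters, facet_fields, rows=10):
    """Build ONE query string that returns results plus all facet counts.

    text         -- the user's search text
    filters      -- dict of field name -> selected value (hypothetical fields)
    facet_fields -- list of field names to facet on
    """
    params = [("q", text), ("rows", str(rows)), ("wt", "json"), ("facet", "true")]

    for field, value in sorted(filters.items()):
        # Escape backslashes and quotes so the value is a valid quoted phrase.
        safe = value.replace("\\", "\\\\").replace('"', '\\"')
        # Tag each filter query with its field name so it can be excluded later.
        params.append(("fq", '{!tag=%s}%s:"%s"' % (field, field, safe)))

    for field in facet_fields:
        if field in filters:
            # Exclude the filter on this field when counting its own values,
            # so the unselected choices still show accurate hit counts.
            params.append(("facet.field", "{!ex=%s}%s" % (field, field)))
        else:
            params.append(("facet.field", field))

    return urlencode(params)


# Example: one "chunky" request instead of one request per facet.
query = build_facet_query("motor", {"country": "US"}, ["country", "voltage"])
print("http://localhost:8983/solr/select?" + query)  # host and path are placeholders
for key, value in parse_qsl(query):
    print(key, "=", value)
```

Because every facet field travels in the same request, the front-end makes a single round trip per page, and the excluded filter lets a selected facet keep showing counts for its other values.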