
Mathematical Approach on
Performance Analysis
for Web based System
WonYoung Lee
2006.02.01
http://www.javaservice.com
Question: What is the definition of Performance?
A.jsp (http://192.168.0.2/a.jsp): unit response time 0.5 seconds
B.jsp (http://192.168.0.2/b.jsp): unit response time 4.0 seconds
[Graph: Application A response time (0 to 10 sec) vs. concurrent users (5 to 25), showing a.jsp (unit response time 0.5 seconds) and b.jsp (unit response time 4.0 seconds).]
[Graph: Application B response time (0 to 10 sec) vs. concurrent users (5 to 25), showing a.jsp (unit response time 0.5 seconds) and b.jsp (unit response time 4.0 seconds).]
[Graph: comparison of A and B, response time vs. concurrent users (5 to 25) for both applications on one chart.]
1. Objective of Performance
1. How many clients?
2. What is a reasonable response time?
3. How do we test or measure it?
1.1 Visit Time & Think Time
[Diagram: a new user visits, clicks several times, and leaves; the visit time spans from arrival to departure. Each click is followed by a response time and then a think time before the next click.]
Request Interval(sec) = Response Time(sec) + Think Time(sec)
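For example (hypothetical numbers): if a page responds in 2 seconds and the user then reads it for 13 seconds before clicking again, the request interval is 2 + 13 = 15 seconds, i.e. that user generates one request every 15 seconds.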
1.2 Think Time
Think time tends to be an approximately constant value for a given business domain.
[Diagram: after each response the user thinks before clicking again; the request interval is the response time plus the think time.]
1.3 Concurrent User
Concurrent User = Active User + Inactive User
A user counts as concurrent for the whole visit time, i.e. while the gap since the last request (request interval, roughly the think time) has not exceeded some maximum think time.
Note: HTTP is connection-less.
[Diagram: six overlapping visit timelines, i.e. 6 concurrent users at that moment. How many of them are active?]
1.4 Active User
[Diagram: of the concurrent users, those whose clicks are currently being serviced by a worker thread in the system (active service) are the active users; the others are in their think time.]
1.5 Throughput
Throughput(tps) = number of requests / measure time(sec)
Units: tph, tpm, tps (also pps, rps, ops, hits/sec)
3,600 tph = 60 tpm = 1 tps
NOTE: Arrival Rate vs. Service Rate
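Quick unit check with hypothetical numbers: 1,800 requests counted over a 5-minute (300-second) window give 1,800 / 300 = 6 tps, the same rate as 360 tpm or 21,600 tph.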
2. Request/Response System Model
1. Mathematical Approach
2. Queuing Theory
3. Quantitative Analysis
4. Measuring
2.1 Request and Response
Assumption: no think time.
Question: what is her capability?
Right answer: the maximum throughput (number of requests handled per minute).
[Graphs: average response time and throughput plotted against the number of test users.]
2.2 Measuring Throughput
[Graphs: as the number of virtual users increases (think time = 0), throughput(tps) climbs and then levels off while response time(sec) keeps increasing.]
2.3 Little's Law
N = T x R
where N = number of active users (think time = 0), T = throughput(tps), R = average response time(sec).
Throughput(tps) = ActiveUser / Average Response Time(sec)
Number of ActiveUser = Throughput(tps) x Average Response Time(sec)
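Worked example with round numbers (not from the slides): a system serving 20 tps at an average response time of 1.5 seconds has N = T x R = 20 x 1.5 = 30 requests in flight, i.e. 30 active users.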
2.4 ActiveUser's Law (Response Time Law)
ActiveUser = ConcurrentUser x Ave. ResponseTime(sec) / ( Ave. ResponseTime(sec) + ThinkTime(sec) )
[Diagram: of the concurrent users, only those currently inside a response (holding a worker thread, i.e. in active service) are active; the rest are in their think time between clicks.]
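Worked example with hypothetical numbers: with 1,000 concurrent users, an average response time of 2 seconds and a think time of 18 seconds, ActiveUser = 1,000 x 2 / (2 + 18) = 100, i.e. only about 100 requests are actually being serviced (and only about 100 worker threads are busy) at any instant.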
2.4.1 Proof of ActiveUser's Law
By Little's Law:  Throughput(tps) = ActiveUser / Resp.Time(sec)
By definition:    Throughput(tps) = ConcurrentUser / Request Interval, where Request Interval = Resp.Time + ThinkTime
Equating the two:
ActiveUser / Resp.Time(sec) = ConcurrentUser / ( Resp.Time + ThinkTime )
Therefore:
ActiveUser = ConcurrentUser x Resp.Time(sec) / ( Resp.Time(sec) + ThinkTime(sec) )
2.4.2 Meaning of Active User -1-
ActiveUser = ConcurrentUser x Ave. ResponseTime(sec) / ( Ave. ResponseTime(sec) + ThinkTime(sec) )
[Diagram: the concurrent users drive the request rate; the active users are the portion being drained at the service rate.]
2.4.3 Meaning of Active User -2-
2.5 Concurrent User Equation
Definitions:
ActiveUser = ConcurrentUser x Resp.Time(sec) / ( Resp.Time(sec) + ThinkTime(sec) )   ... ActiveUser's Law
Throughput(tps) = ActiveUser / Resp.Time(sec)   ... Little's Law
Equations:
① ConcurrentUser = ActiveUser x ( Resp.Time + ThinkTime ) / ResponseTime = ActiveUser x ( 1 + ThinkTime / ResponseTime )
② ConcurrentUser = ActiveUser + ( Throughput x ThinkTime )
③ ConcurrentUser = Throughput x ( ResponseTime + ThinkTime )
(Note: Throughput(tps) here means the arrival rate or the service rate.)
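The three equations are algebraically equivalent ways of getting the same number. Below is a minimal Java sketch (class and variable names are mine, not from the deck) that evaluates all three for one operating point; the sample figures of 238 active users, a 1.5-second response time and a 30-second think time echo the ThinkTime = 30 example used later in sections 2.8 and 2.9.

  // Evaluates concurrent-user equations 1-3 for one operating point.
  // Names and layout are illustrative, not from the deck.
  public class ConcurrentUserEquations {
      public static void main(String[] args) {
          double responseTime = 1.5;   // sec
          double thinkTime    = 30.0;  // sec
          double activeUsers  = 238.0; // users currently in service

          // Little's Law: Throughput = ActiveUser / Resp.Time
          double throughput = activeUsers / responseTime;              // ~158.7 tps

          double eq1 = activeUsers * (1.0 + thinkTime / responseTime); // equation 1
          double eq2 = activeUsers + throughput * thinkTime;           // equation 2
          double eq3 = throughput * (responseTime + thinkTime);        // equation 3

          // All three print ~4,998 concurrent users (about 5,000).
          System.out.printf("tps=%.1f eq1=%.0f eq2=%.0f eq3=%.0f%n",
                  throughput, eq1, eq2, eq3);
      }
  }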
2.6 Example of Concurrent User Monitoring
Quantities monitored:
- Permanent cookie
- Think time
- Arrival rate
- Response time
- Active users (in service)
- Concurrent users
2.7 SLA & ThinkTime
SLA (Service Level Agreement):
- Concurrent users: 5,000
- Average response time: less than 3 sec
ThinkTime agreement (typical values by domain):
- Tele-marketing: 10-15 sec
- MIS intranet: 15-20 sec
- Internet banking: 25-35 sec
- Online shopping mall: 30-40 sec
- Community: even longer
Combining the SLA with real think-time data:
- Concurrent users: 5,000
- Think time = 30 sec (example)
- Average response time: less than 3 sec
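From equation ③ this SLA implies a required peak throughput of about ConcurrentUser / ( ResponseTime + ThinkTime ) = 5,000 / ( 3 + 30 ) ≈ 152 tps. This figure is derived here, not stated in the deck; the measured test in the next section in fact reaches about 158.7 tps at 5,000 users because the observed response time is only 1.5 seconds.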
2.8 Performance Test (ThinkTime = 30)
[Graphs: throughput and average response time vs. virtual users (with a 30-second think time the virtual users correspond to concurrent users). At 5,000 users: 158.7 tps and 1.5 sec; at 8,000 users: 235.3 tps and 4.0 sec.]
ConcurrentUser = Throughput x { ResponseTime + ThinkTime(30) }
5,000 ≈ 158.7 x ( 1.5 + 30 )
8,000 ≈ 235.3 x ( 4.0 + 30 )
2.9 Performance Test (ThinkTime = 0)
[Graphs: throughput and average response time vs. virtual users (with zero think time the virtual users correspond to active users). The same operating points appear at 238 and 941 active users: 158.7 tps at 1.5 sec and 235.3 tps at 4.0 sec, i.e. the 5,000- and 8,000-concurrent-user points of the previous test.]
ConcurrentUser = ActiveUser + Throughput(tps) x ThinkTime(sec)
ConcurrentUser = ActiveUser x { 1 + ThinkTime(sec) / Resp.Time(sec) }
ConcurrentUser = Throughput(tps) x { Resp.Time(sec) + ThinkTime(sec) }
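Cross-checking with Little's Law: ActiveUser = Throughput x Resp.Time, so 158.7 x 1.5 ≈ 238 and 235.3 x 4.0 ≈ 941. This is why the think-time-0 test needs only 238 and 941 virtual users to reproduce the 5,000- and 8,000-concurrent-user operating points of the previous test.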
2.10 Queuing Theory - G/G/1
[Diagram: a G/G/1 queue with arrival rate λ and service rate μ. While λ < μ the throughput follows the arrival rate up to the maximum throughput; as utilization rises, the number of active users and the response time grow.]
2.11 Maximum Concurrent User
[Graphs: active users (N) and response time (R) as functions of the arrival rate (λ). The open question on the slide: what concurrent-user level does each arrival rate correspond to?]
2.12 Saturation Point, Buckle Zone
2.13 Understanding of Throughput Graph
Tuning? What does it mean?
2.14 Throughput and Active User
Demo
3. Multiple Applications
- Different hit ratio
- Different performance
- Homogeneous/heterogeneous bottleneck condition
3.1 2-Application Model
[Graphs 1-3: two applications share 40 active threads. Running alone, application A saturates at 40 TPS and application B at 20 TPS. With request mixes of 1:1, 3:1 and 1:3, the per-application TPS curves level off at different combinations, but in every mix the combined load is capped by the same shared limit (TPSmax).]
3.1.1 2-Application Model
[Graph: the saturated (A, B) throughput pairs from graphs 1-3, obtained with different arrival-rate mixes, all fall on one limiting line: they are limited by the same resource bottleneck, i.e. a homogeneous bottleneck condition.]
3.1.2 2-Application Model
Let application A arrive at r1 (req/sec) with maximum throughput T1, and application B arrive at r2 (req/sec) with maximum throughput T2.
Bottleneck line:    x / T1 + y / T2 = 1
Arrival-rate line:  y / x = r2 / r1
P = (r1, r2) is the current workload; Q is the point where the arrival-rate line meets the bottleneck line.
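Plugging in the saturation points from section 3.1 (T1 = 40 TPS for A, T2 = 20 TPS for B), the bottleneck line is x/40 + y/20 = 1. A hypothetical workload of r1 = 20 and r2 = 5 req/sec sits at 20/40 + 5/20 = 0.75, i.e. at 75% of the critical line, while r1 = 20, r2 = 10 lands exactly on it (0.5 + 0.5 = 1).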
3.2 3-Application Model
Bottleneck plane:   x / T1 + y / T2 + z / T3 = 1, with intercepts (T1, 0, 0), (0, T2, 0) and (0, 0, T3)
Arrival-rate line:  x / r1 = y / r2 = z / r3
P = (r1, r2, r3) is the current workload; Q is where the arrival-rate line meets the bottleneck plane.
3.3 n-Application Model
Critical Inequality Performance Equation:  ∑ { ri / Ti } ≤ 1.0
NOTE: under a homogeneous bottleneck condition.
3.4 Performance Utilization
[Diagram: the workload point P = (x, y, z) lies inside the bottleneck plane x/T1 + y/T2 + z/T3 = 1; Q is its projection along the arrival-rate line onto the plane.]
App. | TPSmax (Ti) | ri (req/sec) | ri / Ti
x    | 400         | 48.2         | 0.12
y    | 9           | 1.2          | 0.13
z    | 50          | 22.7         | 0.45
0.12 + 0.13 + 0.45 = 0.70
Utilization of critical performance:  ∑ { ri / Ti } x 100 = 70%
∑ { ri / Ti } ≤ 1.0
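As a minimal Java sketch of the same calculation (names are mine; the numbers are the three rows of the table above):

  // Sums ri/Ti over the measured applications and reports the
  // utilization of critical performance. Names are illustrative.
  public class CriticalUtilization {
      public static void main(String[] args) {
          double[] tpsMax      = { 400.0, 9.0, 50.0 }; // Ti for x, y, z
          double[] arrivalRate = { 48.2, 1.2, 22.7 };  // ri in req/sec

          double utilization = 0.0;
          for (int i = 0; i < tpsMax.length; i++) {
              utilization += arrivalRate[i] / tpsMax[i]; // ri / Ti
          }
          // Prints about 0.71; the slide rounds each term first and reports 0.70.
          // Either way the critical inequality sum(ri/Ti) <= 1.0 is satisfied.
          System.out.printf("Critical performance utilization: %.2f%n", utilization);
      }
  }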
3.5 Revision of Utilization of Critical Performance
Arrival rates: a.jsp 5.4 req/sec, b.jsp 1.9 req/sec, c.jsp 10.7 req/sec, d.jsp 4.5 req/sec, ... ; TOTAL 25 req/sec
App | Ti (TPSmax) | ri (req/sec) | ri / Ti
a1  | 42          | 5.4          | 0.129
a2  | 9           | 1.9          | 0.210
a3  | 70          | 10.7         | 0.153
a4  | 15          | 4.5          | 0.300
... | ...         | ∑ri = 22.5   | ∑ { ri / Ti } = 0.79
Revised inequality: ∑ { ri / Ti } ≤ ρ x 1.0, where ρ is the total hit ratio of the measured applications ai.
∑ { ri / Ti } / ρ = 0.79 / ( 22.5 / 25 ) = 88%
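Spelling out the arithmetic: the four measured pages carry ρ = 22.5 / 25 = 0.9 of the total traffic, so the revised utilization is 0.79 / 0.9 ≈ 0.88, leaving roughly 12% headroom relative to the critical line under the homogeneous-bottleneck assumption.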
3.6 Multiple Bottleneck Theory
[Diagram: under a heterogeneous bottleneck condition the limit is not one flat plane but a bottleneck surface made up of several saturation facets; the arrival-rate line x/r1 = y/r2 = z/r3 meets this surface at Q, with P = (r1, r2, r3) the current workload.]
Critical Inequality Performance Equation (final):  ∑ { ri / Ti } ≤ 1.0 + ε   (ε > 0)
Heterogeneous bottleneck condition.
3.7 Performance Matrix
NOTE: 8:2 rule (the measured applications a1..an cover about 80% of the traffic; the rest is lumped into "Others").
App.     | Arrival Rate (tps)  | Arrival Ratio     | Max Throughput | λi / Ti
a1       | λ1                  | λ1 / λ = ρ1       | T1             | λ1 / T1
a2       | λ2                  | λ2 / λ = ρ2       | T2             | λ2 / T2
a3       | λ3                  | λ3 / λ = ρ3       | T3             | λ3 / T3
...      | ...                 | ...               | ...            | ...
an       | λn                  | λn / λ = ρn       | Tn             | λn / Tn
(Others) | λothers             | 1 - ρ (e.g. 20%)  | N/A            | N/A
SUM      | ∑ λi + λothers = λ  | ∑ ρi + 0.2 = 1.0  | N/A            | ∑ λi / Ti
∑ λi / Ti ≤ ρ ( 1.0 + ε ),  ε ≥ 0 (ε ≈ 0)   (when ConcurrentUser = N; ρ is the total hit ratio of the measured applications ai)
Critical Performance Utilization:  ∑ λi / Ti ≤ 1.0
3.8 Objective of Performance Estimation
- How many concurrent users can be accepted?
- How many times the current load is that?
Current: N concurrent users, arrival rate λ, current performance utilization ∑ ( λi / Ti ) out of ρ ( 1.0 + ε ).
Future: k x N concurrent users and k x λ arrival rate, where k is the factor that scales the utilization up to 1.0 (100%).
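In other words the scale factor is k = ρ ( 1.0 + ε ) / ∑ ( λi / Ti ). For example (hypothetical figures): if the current utilization ∑ ( λi / Ti ) is 0.70 and ρ ( 1.0 + ε ) ≈ 1.0, then k ≈ 1 / 0.70 ≈ 1.43, so about 1.43 times the current concurrent users (and arrival rate) could be accepted before reaching the critical line.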
3.9 Performance Test for Multiple Scenarios
App.     | Ratio ρi
a1       | 0.5
a2       | 0.2
a3       | 0.1
...      | ...
(Others) | 0.2 (20%)
SUM      | 0.8 (80%)
ThinkTime = 28.5 sec (example)
[Graph: response time vs. virtual users (think time = 28.5 sec); the acceptable response time is reached at about 942 virtual users.]
3.10 Example of Performance Test
No. | Test name                                       | Hit Count | Expected Arrival Rate (req/sec) | R (Ratio) | T (TPS) | Rmax      | Rate/T | R/T
1   | Public auction notice (listItemInfo)            | 297       | 0.083 | 0.14 | 16.25 | 0.14 Rmax | 0.005 | 0.0086
2   | Sale items (listItem)                           | 288       | 0.080 | 0.13 | 1.5   | 0.13 Rmax | 0.053 | 0.0867
3   | Sale items (view itm_real)                      | 180       | 0.050 | 0.08 | 18.5  | 0.08 Rmax | 0.003 | 0.0043
4   | Homepage (index)                                | 116       | 0.032 | 0.05 | 2.75  | 0.05 Rmax | 0.012 | 0.0182
5   | Power search (searchItemByDetailInfo)           | 109       | 0.030 | 0.05 | 8     | 0.05 Rmax | 0.004 | 0.0063
6   | Items near bid deadline (listImpendingBidItem)  | 161       | 0.045 | 0.07 | 28    | 0.07 Rmax | 0.002 | 0.0025
7   | Address search (searchZipAddress_01)            | 55        | 0.015 | 0.03 | 60    | 0.03 Rmax | 0.000 | 0.0005
8   | Item info view (Item_bond)                      | 53        | 0.015 | 0.02 | 19.5  | 0.02 Rmax | 0.001 | 0.0010
9   | Public auction notice (listKamcoAucNotice)      | 48        | 0.013 | 0.02 | 2.5   | 0.02 Rmax | 0.005 | 0.0080
10  | New notice (newAucNotice)                       | 45        | 0.013 | 0.02 | 4     | 0.02 Rmax | 0.003 | 0.0050
11  | View FAQ (viewFAQ)                              | 44        | 0.012 | 0.02 | 24    | 0.02 Rmax | 0.001 | 0.0008
12  | List FAQ (listFAQ)                              | 43        | 0.012 | 0.02 | 19.5  | 0.02 Rmax | 0.001 | 0.0010
13  | New notice (viewAucNotice)                      | 43        | 0.012 | 0.02 | 24    | 0.02 Rmax | 0.000 | 0.0008
14  | ID duplication check (checkDuplicateUserId)     | 33        | 0.009 | 0.02 | 67    | 0.02 Rmax | 0.000 | 0.0003
15  | Auction schedule (listItemInfoBySched)          | 60        | 0.017 | 0.01 | 22.5  | 0.01 Rmax | 0.001 | 0.0004
16  | Consolidated notice (listAucNotice)             | 26        | 0.007 | 0.01 | 9     | 0.01 Rmax | 0.001 | 0.0011
17  | Member registration (welcomeRegisterForm)       | 24        | 0.007 | 0.01 | 22.5  | 0.01 Rmax | 0.000 | 0.0004
18  | Registration consent (createUserForm)           | 22        | 0.006 | 0.01 | 23.5  | 0.01 Rmax | 0.000 | 0.0004
19  | Additional info (listDocForm)                   | 21        | 0.006 | 0.01 | 32    | 0.01 Rmax | 0.000 | 0.0003
20  | Power search (searchItem)                       | 21        | 0.006 | 0.01 | 19    | 0.01 Rmax | 0.000 | 0.0005
21  | New items (listNewItem)                         | 19        | 0.005 | 0.01 | 4     | 0.01 Rmax | 0.001 | 0.0025
22  | Login form (loginForm)                          | 19        | 0.005 | 0.01 | 25.5  | 0.01 Rmax | 0.000 | 0.0004
23  | Terms of use (contractPersonalUser)             | 17        | 0.005 | 0.01 | 24    | 0.01 Rmax | 0.000 | 0.0004
24  | Login (login)                                   | 16        | 0.004 | 0.01 | 3.5   | 0.01 Rmax | 0.001 | 0.0029
SUM |                                                 |           | 0.489 | 0.79 |       | 0.79 Rmax | 0.095 | 0.1535
Remarks (비고) recorded during the individual tests include: a Java exception error; DB server 95% full at 8 virtual users; DB server CPU utilization 100% (at 6 and at 8 virtual users, and on several other tests); web server CPU utilization between 70% and 100% on most tests; DB server and web server CPU utilization 90%; a web server outage caused by the sessions being full; and CPU 100% full overall.
Result of testing each application individually:
- Maximum acceptable concurrent users: 219
- Expected maximum TPS: 6.515 tps
Result of testing the multiple scenarios at the same time:
- Maximum acceptable concurrent users: 202
- Expected maximum TPS: 6.719 tps
1 + 1 = 2.5 ?
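As a plausibility check against equation ③ with the 28.5-second think time from section 3.9: 202 / 6.719 ≈ 30.1 sec and 219 / 6.515 ≈ 33.6 sec per request interval, which would correspond to average response times of roughly 1.6 and 5.1 seconds at the respective limits; the 6.515 tps figure itself matches 1 / ∑(R/T) = 1 / 0.1535. These response-time values are inferred here, not reported in the deck.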
3.11 Terminology of Performance Testing
- Load test
- Stress test
- Availability test
- Performance test
Two Types of Performance Problem
Relative Performance Problem
- SQL query bottleneck (DB index, full scan, heavy query)
- Bottleneck on back-end transactions (CICS, TUXEDO, TCP/IP socket)
- Relatively bad performance of a specific application (synchronized blocks, CPU time)
- Relatively bad performance across most applications caused by undersized H/W
- Other external issues, for example a network bottleneck
Conditional Performance Problem
- JDBC connection resource leakage
- Memory leakage (excessive memory requirement, memory leak, native memory leak)
- Unbalanced WAS tuning (pool size, number of threads, heap size)
- Bugs in the JVM/WAS/JDBC (Sybase JDBC, JVM bug, WAS bug)
- Thread lock / deadlock (application issue, firewall issue)
Relative Performance Problem
[Diagram: request rate vs. service rate; when the service rate cannot keep up with the request rate, active users build up.]
Relative Performance Problem: Ramp-up Test
[Charts: ramp-up test results illustrating a relative performance problem.]
Conditional Performance Problem
- JDBC connection resource leakage
- Memory leakage (excessive memory requirement, memory leak, native memory leak)
- Unbalanced WAS tuning (pool size, number of threads, heap size)
- Bugs in the JVM/WAS/JDBC (Sybase JDBC, JVM bug, WAS bug)
- Thread lock / deadlock (application/framework issue, firewall issue)
- Database locks caused by transactions that are neither committed nor rolled back
- Database issues (buffer full, unexpected batch jobs, ...)
- Uploading or downloading large files
- Unexpected infinite loop in an application: CPU 100%
- Disk or memory full
- Bad performance for a specific application or specific users
3.1 Issues in Capacity Planning
3.1.1 K-university KMS System
S80 (12-way, 8GB, 78,126 tpmC)
S80 (6-way, 6GB, 41,140 tpmC)
Visitors a day: 4,506 users
Peak Concurrent Users: 275 users
Peak Arrival Rate: 18.3 tps
Request Interval: 18 sec
Visitors vs Concurrent Users: 6.1%
CPU: ? (unknown, no issue)
119,266 tpmC / 18.3 tps = 6,517 tpmC/tps
3.1.2 B company
M80 (4-way, 3GB, 34,588 tpmC) x 2
S80 (12-way, 8GB, 78,126 tpmC) x 2
6F1 (4-way, 4GB, 44,500 tpmC) x 1
H80 (2-way, 1GB) x 8
[Seoul:M80(4-way)x2: 69,176 tpmC ]
Visitors a day: 1,621(total 3,435)
Peak Concurrent Users: 600 users
Hits a day : 466,639 hits
Peak Arrival Rate : 20.0 tps
Request Interval : 30 sec
Average Visit time: 1:25:43
Visitors vs Concurrent Users: 31%
CPU Utilization: 70-100%
69,176tpmC/20tps= 3,459 tpmC/tps
6F1 (4-way, 4GB, 44,500 tpmC) added.
CPU Utilization: 60%
113,676tpmC/20tps= 5,684 tpmC/tps
3.1.3 N-bank Internet banking
M80(4-way, 4GB, 34,588 tpmC) x 3
Visitors a day: 96,753 users
Peak Concurrent Users: 1,500-2,000 users
Hits a day : 1,795,867 hits
Peak Arrival Rate : 58.3 tps
Request Interval : 25.7 sec
Visit time: 6min 25sec
Average hits per visit time: 18.6 clicks
Visitors vs Concurrent Users: 1.6-2.0%
CPU Utilization: 70%
103,764 tpmC/58.3tps= 1,780tpmC/tps
3.1.4 J-bank CRM System
S85(12-way, 32GB, 124,818tpmC) X 2
+ HOST DB
Visitors a day: 5,028 users
Peak Concurrent Users: 250 users
Hits a day : 264,060 hits
Peak Arrival Rate : 10 tps
Request Interval : 25.3 sec
Visit time: 4min 10sec
Average hits per visit time: 9.9 clicks
Visitors vs Concurrent Users: 4.97 %
CPU Utilization: 30-40%
249,636tpmC / 10 tps = 24,964 tpmC/tps
3.1.5 D-insurance e-Hanaro System
WAS 6H1(4-Way 4 GB, 40,763 tpmC)
DB H70(4-Way 4GB, 17,134 tpmC)
+HOST CICS
Visitors a day: 2,800 users (registered: 3,300)
Peak Concurrent Users: 350 users
Hits a day : 301,190 hits
Peak Arrival Rate : 33 tps
Request Interval : 9 sec
Visit time: 16 min
Average Hits per visit time: 120.5 clicks
Visitors vs Concurrent Users: 12.5 %
CPU Utilization: 50%
40,763 tpmC / 33 tps= 1,235 tpmC/tps
3.1.6 D company
WAS H80(2-way, 2GB, 14,756(?) tpmC)
DB S80(12-way, 8GB, 67,908tpmC)
Visitors: 258 users
Peak Concurrent Users: 45 users
Hits a day : 65,192 hits
Peak Arrival Rate : 4.2 tps
Request Interval : 11-18 sec
Visitors vs Concurrent Users: 17.4 %
CPU Utilization: 50-60% (DB CPU: 35%)
14,756 tpmC / 4.2 tps= 3,513 tpmC/tps
3.1.7 K-bank eCRM System
WAS H80 (2-way, 4GB, 14,756(?) tpmC) x 2
DB M80(2way-4GB, 18,647(?) tpmC)
Visitors a day: 37,951 users
Peak Concurrent Users: 230 users
Hits a day : 235,527 hits
Peak Arrival Rate : 5.83 tps
Request Interval : 24.7 sec
Visit time: 2 min 33 sec
Average hits per visit time: 6.2 clicks
Visitors vs Concurrent Users: 0.6 %
CPU Utilization: ? (unknown, no issue)
29,512 tpmC / 5.83 tps= 5,062 tpmC/tps
3.1.8 K-Card
M80 (4-way, 4GB, 34,588 tpmC) x 9
WSBCC , Servlet/JSP, CTG
Visitors a day: 5,323 users
Visitors an hour: 4,500 users
Peak Concurrent Users: 2,800 users
Hits a day : 6,198,133 hits
Peak Arrival Rate : 217 tps
Request Interval : 23 sec
Visitors vs Concurrent Users: 53 %
CPU Utilization: (see the graph at the bottom left)
311,292tpmC / 217 tps= 1,435 tpmC/tps
CPU Utilization
[2nd machine]
Peak Arrival Rate : 31.7 tps
Peak Concurrent Users: 600 users
34,588 tpmC / 31.7 tps= 1,091 tpmC/tps
[4th machine]
Peak Arrival Rate : 50 tps
34,588 tpmC / 50 tps= 692 tpmC/tps
3.1.9 Result of My Statistics
[Chart: tpmC per 1 tps observed at each site, normalized to roughly 70% CPU utilization.]
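A minimal Java sketch of how such a tpmC-per-tps figure can be used for rough sizing; the target peak rate and the chosen ratio below are assumptions for illustration, not values from the deck:

  // Rough capacity sizing from an observed tpmC-per-tps ratio.
  public class CapacitySizing {
      public static void main(String[] args) {
          // Both inputs are assumptions for illustration.
          double targetPeakTps = 50.0;   // expected peak arrival rate
          double tpmcPerTps    = 3500.0; // tpmC needed per 1 tps, taken from a
                                         // comparable site above (normalized to ~70% CPU)

          double requiredTpmc = targetPeakTps * tpmcPerTps;
          System.out.printf("Rough capacity estimate: %,.0f tpmC%n", requiredTpmc);
      }
  }

Since the observed ratios above vary widely by workload (from under 1,000 to over 20,000 tpmC per tps), the ratio should be taken from sites whose request mix resembles the system being sized.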
3.2 Bad Process for Performance Management
[Diagram: a process flow covering Proposal, Pilot BMT (BMT methodology), Capacity Estimation (experienced data), Analysis/Development of the applications, Unit application performance tests, Additional application development, Function test, Testing with a small number of real users, Performance prediction (analysis methodology), System open, Monitoring and performance-data logging, Performance analysis, Application tuning, Workload analysis, and Applying the applications.]