More HBase GC tuning




By Lars Hofhansl


My article on hbase-gc-tuning-observations explores how to configure the garbage collector for HBase.






There is actually a bit more to it, especially when block encoding is enabled for a column family and the predominant access is via the scan API with row caching.


















Block encoding currently requires HBase to materialize each KeyValue after decoding during scanning, and hence this has the potential to produce a lot of garbage for each scan RPC, especially when the scan response is large, as might be the case when scanner caching is set to a larger value (see Scan.getCaching()).
























My experiments show that in that case it is better to run with a larger young gen of 512MB (-Xmn512m) and, crucially, make sure that all per-RPC garbage across all handlers actively performing scans fits into the survivor space.








(Note that this statement is true whether or not block encoding is used. Block encoding just produces a lot more garbage.)







HBase actually has a way to limit the size of an individual scan response by setting hbase.client.scanner.max.result.size.
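To make the interaction between scanner caching and the size limit concrete, here is a minimal sketch of how a scan response might be cut off. It is a simplified model and not HBase's actual implementation: the function name, the row sizes, and the caching value are all hypothetical and chosen only for illustration.

```python
def build_scan_response(row_sizes, caching, max_result_size):
    """Return the sizes of the rows that end up in one scan response.

    Toy model (an assumption, not HBase code): rows are added until either
    `caching` rows have been collected or the accumulated size has reached
    `max_result_size`, whichever happens first.
    """
    response, total = [], 0
    for size in row_sizes:
        if len(response) >= caching or total >= max_result_size:
            break
        response.append(size)
        total += size
    return response


# Hypothetical workload: 1000 rows of 64 KB each, scanner caching of 500.
rows = [64 * 1024] * 1000
unlimited = build_scan_response(rows, caching=500, max_result_size=float("inf"))
limited = build_scan_response(rows, caching=500, max_result_size=2 * 1024 * 1024)

print(len(unlimited), sum(unlimited))  # 500 rows, about 31 MB in one response
print(len(limited), sum(limited))      # 32 rows, 2 MB in one response
```

In this model the size limit, not the caching value, bounds how much per-RPC garbage a single handler can produce.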


Quick recap:



The Hotspot JVM divides the heap into PermGen, Tenured Gen, and the Young Gen. The Young Gen itself is divided into Eden and two survivor spaces.


















By default the survivor ratio is 8 (i.e. each survivor space is 1/8 the size of Eden, and Eden plus the two survivor spaces together make up the configured young gen size).












What to do?



With -Xmn512m this comes to ~51MB for each of the two survivor spaces.
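The ~51MB figure follows from the survivor ratio: with a ratio of 8, Eden is eight times the size of one survivor space, so the young gen splits into ten equal parts. A minimal sketch of that arithmetic follows; the function name is made up for illustration, and the JVM may round sizes for alignment, so treat the result as approximate.

```python
MB = 1024 * 1024


def survivor_space_bytes(young_gen_bytes, survivor_ratio=8):
    """Approximate size of ONE survivor space.

    With SurvivorRatio=N, Eden is N times the size of one survivor space,
    so the young gen is split into N + 2 equal parts (Eden takes N of them).
    """
    return young_gen_bytes // (survivor_ratio + 2)


young_gen = 512 * MB                      # -Xmn512m
survivor = survivor_space_bytes(young_gen)
eden = young_gen - 2 * survivor

print(f"survivor: {survivor / MB:.1f} MB each, eden: {eden / MB:.1f} MB")
# survivor: 51.2 MB each, eden: 409.6 MB
```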








Now you want to set hbase.client.scanner.max.result.size such that the expected number of active handler threads times max.result.size is less than the size of one survivor space.










With 30 handlers (the default in HBase as of 0.98) this comes to about 1.7MB per handler. Since not all handlers will always scan using the full buffer, 2MB is probably a good setting.
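The sizing rule can be checked with a few lines of arithmetic. The sketch below uses hypothetical helper names and assumes the ~51MB survivor space derived from -Xmn512m with the default survivor ratio; it computes the strict per-handler budget and how many handlers could hold a full 2MB response at the same time.

```python
MB = 1024 * 1024


def per_handler_budget(survivor_bytes, handlers):
    """Largest response size that still fits if ALL handlers scan at once."""
    return survivor_bytes // handlers


def max_full_scans(survivor_bytes, max_result_size):
    """Number of handlers that can hold a full-size response simultaneously."""
    return survivor_bytes // max_result_size


survivor = 512 * MB // 10   # one survivor space with -Xmn512m, ratio 8
handlers = 30               # handler count quoted in the text

budget = per_handler_budget(survivor, handlers)
print(f"{budget / MB:.2f} MB")            # 1.71 MB
print(max_full_scans(survivor, 2 * MB))   # 25
```

Under these assumptions a 2MB limit slightly exceeds the strict budget: it fits as long as no more than 25 of the 30 handlers are scanning with a full buffer at the same moment.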










Makes sense, doesn't it? If the per-scan results across all active handlers cannot fit into the survivor space, the collector has no choice but to promote them to the tenured generation. That is exactly the scenario one would like to avoid, as we would slowly pollute the tenured gen with per-RPC garbage, eventually requiring a full GC to defragment.


































TL;DR:



When using block encoding, make sure that

#handlers * max.result.size < survivor space

and use a slightly larger young generation:

-Xmn512m (in hbase-env.sh)

hbase.client.scanner.max.result.size = 2097152 (in hbase-site.xml)




