Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weโ€™ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batch: ๐Ÿ› ์ •๊ธฐ ํ‘ธ์‹œ ์•Œ๋ฆผ ๋ฐฐ์น˜ ์ฟผ๋ฆฌ ํ”ฝ์Šค & ItemReader ๊ธฐ๋Šฅ ์ˆ˜์ • #140

Merged
merged 5 commits into from
Jul 25, 2024

Conversation

psychology50
Copy link
Member

@psychology50 psychology50 commented Jul 25, 2024

์ž‘์—… ์ด์œ 


์ž‘์—… ์‚ฌํ•ญ

๐Ÿ˜ฉ ๊ธฐ์กด ๋ฐฉ์‹์˜ ๋ฌธ์ œ์ 

์ž์„ธํ•œ ์ด์œ ๋Š” ๋ธ”๋กœ๊ทธ ๊ฐ€์žฅ ํ•˜๋‹จ์— ์ถ”๊ฐ€ํ•ด๋‘์—ˆ์Šต๋‹ˆ๋‹ค.

SELECT u.id, u.name, d.token 
FROM device_token d
INNER JOIN user u ON d.user_id = u.id AND u.deleted_at IS NULL
WHERE d.activated = true AND u.account_book_notify=true AND u.id > currentId AND u.id <= lastId
ORDER BY u.id
LIMIT 0, chunkSize;
-> Limit: 1000 row(s)  (cost=2532.33 rows=274) (actual time=0.077..44.888 rows=1000 loops=1)
    -> Nested loop inner join  (cost=2532.33 rows=274) (actual time=0.076..44.758 rows=1000 loops=1)
        -> Filter: ((u.account_book_notify = true) and (u.id > 0) and (u.id <= 5000) and (u.deleted_at is null))  (cost=1932.82 rows=479) (actual time=0.050..2.891 rows=2590 loops=1)
            -> Index range scan on u using PRIMARY  (cost=1932.82 rows=9582) (actual time=0.048..2.023 rows=2590 loops=1)
        -> Filter: (d.activated = true)  (cost=1.14 rows=1) (actual time=0.015..0.016 rows=0 loops=2590)
            -> Index lookup on d using FKo3afeawl2mah4kjjvx2chwlbu (user_id=u.id)  (cost=1.14 rows=1) (actual time=0.015..0.016 rows=0 loops=2590)
  • ํ…Œ์ด๋ธ”์„ ์ฝ๋Š” ํฌ์ปค์‹ฑ์ด device_token ํ…Œ์ด๋ธ”์ด ์•„๋‹ˆ๋ผ user์— ๋งž์ถฐ์ ธ ์žˆ์Œ.
  • ์‚ฌ์šฉ์ž๋งˆ๋‹ค ํ‘ธ์‹œ ์•Œ๋ฆผ ๋ฐ์ดํ„ฐ๊ฐ€ ์ €์žฅ๋˜๋Š” ๊ฒƒ์€ ๋ณด์žฅํ•˜์ง€๋งŒ, ๋ชจ๋“  device_token์— ๋Œ€ํ•ด ํ‘ธ์‹œ ์•Œ๋ฆผ์„ ๋ณด๋‚ธ๋‹ค๋Š” ๊ฒƒ์€ ๋ณด์žฅํ•  ์ˆ˜ ์—†์Œ.

image

  • currentId์™€ lastId๋ฅผ ์ฐพ๊ธฐ ์œ„ํ•œ ์ฟผ๋ฆฌ์— ๋ถˆํ•„์š”ํ•œ JOIN์ด ํฌํ•จ๋จ.
  • ์ด ๋ฐฉ์‹๋Œ€๋กœ๋ผ๋ฉด JOIN์ด ๋จผ์ € ์ˆ˜ํ–‰์ด ๋  ๊ฐ€๋Šฅ์„ฑ์ด ๋†’์€๋ฐ, JOIN์ด ๋œ ํ…Œ์ด๋ธ”์€ ์ธ๋ฑ์Šค๋ฅผ ์‚ฌ์šฉํ•  ์ˆ˜ ์—†์œผ๋ฉฐ, ๋” ์ด์ƒ user.id๊ฐ€ pk๋„ ์•„๋‹ˆ๊ธฐ ๋•Œ๋ฌธ์— ์‚ฌ์šฉํ•ด์„œ๋Š” ์•ˆ ๋จ.
  • ๋˜ํ•œ user.id๋ฅผ ๊ธฐ์ค€์œผ๋กœ ํƒ์ƒ‰ํ•˜๋Š” ๊ฒƒ์€ ์œ„์—์„œ ์–ธ๊ธ‰ํ•œ ๋Œ€๋กœ ๋ฌธ์ œ์˜ ์†Œ์ง€๊ฐ€ ์žˆ๊ธฐ ๋•Œ๋ฌธ์— deviceToken.id๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ํƒ์ƒ‰ํ•  ํ•„์š”๊ฐ€ ์žˆ์Œ.

1๏ธโƒฃ Query ์ˆ˜์ •

SELECT u.id, u.name, d.id, d.token 
FROM device_token d
INNER JOIN user u ON d.user_id = u.id AND u.deleted_at IS NULL
WHERE d.activated = true AND u.account_book_notify=true AND d.id > currentId AND d.id <= lastId
ORDER BY d.id
LIMIT 0, chunkSize;

image
image

-> Limit: 1000 row(s) (cost=12856797.61 rows=1000) (actual time=0.244..167.040 rows=1000 loops=1)
 ย  ย  -> Nested loop inner join (cost=12856797.61 rows=474067) (actual time=0.243..166.889 rows=1000 loops=1)
 ย  ย  ย  ย  -> Filter: ((d.activated = true) and (d.id > 76258) and (d.id <= 38150113) and (d.user_id is not null)) (cost=3823747.36 rows=9481344) (actual time=0.076..2.493 rows=1000 loops=1)
 ย  ย  ย  ย  ย  ย  -> Index range scan on d using PRIMARY (cost=3823747.36 rows=18962689) (actual time=0.072..1.810 rows=1000 loops=1)
 ย  ย  ย  ย  -> Filter: ((u.account_book_notify = true) and (u.deleted_at is null)) (cost=0.85 rows=0) (actual time=0.164..0.164 rows=1 loops=1000)
 ย  ย  ย  ย  ย  ย  -> Single-row index lookup on u using PRIMARY (id=d.user_id) (cost=0.85 rows=1) (actual time=0.163..0.163 rows=1 loops=1000)
  • device_token ํ…Œ์ด๋ธ” ์Šค์บ”
    • Primary ์ธ๋ฑ์Šค๋กœ range scan
    • ์˜ˆ์ƒ rows: 18,962,689, ์‹ค์ œ rows: 1,000 (LIMIT ์กฐ๊ฑด)
    • ์‹คํ–‰ ์‹œ๊ฐ„: ์•ฝ 1.810ms
  • user ํ…Œ์ด๋ธ” ์กฐํšŒ
    • ๊ฐ device_token ํ–‰์— ๋Œ€ํ•ด primary ์ธ๋ฑ์Šค๋กœ single-row lookup
    • 1,000๋ฒˆ์˜ ๋ฃจํ”„๋ฅผ ๋Œ๋ฉฐ ๊ฐ๊ฐ ์•ฝ 0.163ms ์†Œ์š”
  • ์ „์ฒด ์ฟผ๋ฆฌ ์„ฑ๋Šฅ
    • ์ด ์‹คํ–‰ ์‹œ๊ฐ„: ์•ฝ 167.040ms
    • ๋ฐ˜ํ™˜๋œ ํ–‰ ์ˆ˜: 1,000

current id ์—…๋ฐ์ดํŠธ๋ฅผ ์œ„ํ•œ d.id๋ฅผ ์ถ”๊ฐ€์ ์œผ๋กœ ๊ฐ€์ ธ์˜ด.


2๏ธโƒฃ idSelectQuery๋ฅผ ์ถ”๊ฐ€ํ•  ์ˆ˜ ์žˆ๋„๋ก ์ •์˜

public class QuerydslNoOffsetPagingItemReader<T> extends QuerydslPagingItemReader<T> {
    private QuerydslNoOffsetOptions<T> options;
    private Function<JPAQueryFactory, JPAQuery<T>> idSelectQuery;

    ...

    public QuerydslNoOffsetPagingItemReader(EntityManagerFactory entityManagerFactory,
                                            int pageSize,
                                            QuerydslNoOffsetOptions<T> options,
                                            Function<JPAQueryFactory, JPAQuery<T>> queryFunction,
                                            Function<JPAQueryFactory, JPAQuery<T>> idSelectQuery) {
        this(entityManagerFactory, pageSize, options, queryFunction);
        this.idSelectQuery = idSelectQuery;
    }

    ...

    @Override
    protected JPAQuery<T> createQuery() {
        JPAQueryFactory queryFactory = new JPAQueryFactory(entityManager);
        JPAQuery<T> query = queryFunction.apply(queryFactory);
        options.initKeys((idSelectQuery != null) ? idSelectQuery.apply(queryFactory) : query, getPage()); // ์ œ์ผ ์ฒซ๋ฒˆ์งธ ํŽ˜์ด์ง•์‹œ ์‹œ์ž‘ํ•ด์•ผํ•  ID ์ฐพ๊ธฐ

        return options.createQuery(query, getPage());
    }
  • ์‚ฌ์šฉ์ž๋Š” idSelectQuery ํ•„ํ„ฐ๋ฅผ ์ถ”๊ฐ€ํ•˜๋ฉด, ํƒ์ƒ‰์— ์‚ฌ์šฉํ•  ์ฟผ๋ฆฌ์™€ ์•„์ด๋”” ๋ฒ”์œ„ ์„ ํƒ์— ์‚ฌ์šฉํ•  ์ฟผ๋ฆฌ๋ฅผ ๋ถ„๋ฆฌํ•  ์ˆ˜ ์žˆ์Œ.

3๏ธโƒฃ Step Builder ํŒจํ„ด์˜ QuerydslNoOffsetPagingItemReaderBuilder ์ถ”๊ฐ€

@Slf4j
@Component
@RequiredArgsConstructor
public class ActiveDeviceTokenReader {
    private final EntityManagerFactory emf;

    private final QUser user = QUser.user;
    private final QDeviceToken deviceToken = QDeviceToken.deviceToken;

    @Bean
    @StepScope
    public QuerydslNoOffsetPagingItemReader<DeviceTokenOwner> querydslNoOffsetPagingItemReader() {
        QuerydslNoOffsetOptions<DeviceTokenOwner> options = QuerydslNoOffsetNumberOptions.of(deviceToken.id, Expression.ASC, "deviceTokenId");

        return QuerydslNoOffsetPagingItemReaderBuilder.<DeviceTokenOwner>builder()
                .entityManagerFactory(emf)
                .pageSize(1000)
                .options(options)
                .queryFunction(queryFactory -> queryFactory
                        .select(createConstructorExpression())
                        .from(deviceToken)
                        .innerJoin(user).on(deviceToken.user.id.eq(user.id))
                        .where(deviceToken.activated.isTrue().and(user.notifySetting.accountBookNotify.isTrue()))
                )
                .idSelectQuery(queryFactory -> queryFactory.select(createConstructorExpression()).from(deviceToken))
                .build();
    }

    private ConstructorExpression<DeviceTokenOwner> createConstructorExpression() {
        return Projections.constructor(
                DeviceTokenOwner.class,
                user.id,
                deviceToken.id,
                user.name,
                deviceToken.token
        );
    }
}
  • Builder ๋ณ€ํ˜• ํŒจํ„ด์ธ Step Builder ํŒจํ„ด์„ ์‚ฌ์šฉํ•จ.
    • ๋ฉ”์„œ๋“œ ์ฒด์ด๋‹์„ ํ†ตํ•ด fluent API ์Šคํƒ€์ผ์„ ์œ ์ง€
  • idSelectQuery๋Š” ์„ ํƒ์ ์œผ๋กœ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์ง€๋งŒ, ๊ทธ ์™ธ ๋ชจ๋“  ์†์„ฑ์€ "์ˆœ์ฐจ์ "์œผ๋กœ "ํ•„์ˆ˜" ์ž…๋ ฅํ•˜๋„๋ก ๊ฐ•์ œ
  • ํƒ€์ž… ์•ˆ์ „์„ฑ, ๋ถˆ๋ณ€์„ฑ, ํ™•์žฅ์„ฑ ๋ชจ๋‘ ๊ณ ๋ คํ•˜์—ฌ ์ž‘์„ฑํ•จ.

๋ฆฌ๋ทฐ์–ด๊ฐ€ ์ค‘์ ์ ์œผ๋กœ ํ™•์ธํ•ด์•ผ ํ•˜๋Š” ๋ถ€๋ถ„

  • ๋ฒ„๊ทธ๋ฅผ ์ˆ˜์ •ํ•ด์„œ ์ •์ƒ ๋™์ž‘ํ•˜๋Š” ๋Œ€์‹  ์ฒ˜๋ฆฌ ์†๋„๊ฐ€ ๋Š๋ ค์กŒ์Šต๋‹ˆ๋‹ค ใ…œ
    • ๋ฐ์ดํ„ฐ ์ฒœ๋งŒ ๊ฐœ ๊ธฐ์ค€ 30min ์ •๋„ ๊ฑธ๋ฆฌ๋Š”๋ฐ, ํ˜„์žฌ ๊ณ ๋ คํ•  ํ•„์š” ์—†๋Š” ๊ฒฝ์šฐ๋ผ ํŒ๋‹จํ•˜๊ณ  ์ฒ˜๋ฆฌ ์•ˆ ํ–ˆ์Šต๋‹ˆ๋‹ค.

๋ฐœ๊ฒฌํ•œ ์ด์Šˆ

  • Job์ด ํ•˜๋‚˜์˜ chunk๋ฅผ ์ฒ˜๋ฆฌํ•˜๋Š”๋ฐ ์•ฝ 180ms ์†Œ์š”.
    • device token์ด 100,000๊ฐœ๋ฉด chunk size๊ฐ€ 1,000์ด๋ฏ€๋กœ ์ด๋ก  ์ƒ 180,00ms (=3min) ์†Œ์š” (์‹ค์ œ: 42,895ms = 0.7min)
  • Device Token ๊ฐœ์ˆ˜์— ๋ฐฐ์น˜์˜ ์‹คํ–‰ ์‹œ๊ฐ„์ด ๊ฒฐ์ •๋œ๋‹ค๋Š” ์ ์„ ๊ณ ๋ คํ•˜๋ฉด ํ•ด๊ฒฐ ๋ฐฉ๋ฒ•์€ ์•„๋ž˜ ๋‘ ๊ฐ€์ง€๋กœ ๊ท€๊ฒฐ
    • JOIN์„ ์ œ๊ฑฐํ•˜๊ณ , chunk size๋ฅผ ํ‚ค์›Œ์„œ ๊ฐœ์„  (join์„ ์ œ๊ฑฐํ•˜๋ฉด ์ด๋ก  ์ƒ reader๋ฅผ 162ms to 0.7ms๊นŒ์ง€ ๊ฐ์†Œ์‹œํ‚ฌ ์ˆ˜ ์žˆ์Œ.)
    • ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค ํŒŒํ‹ฐ์…”๋‹
  • ํ•˜์ง€๋งŒ ํ˜„์žฌ ๊ณ ๋ คํ•  ํ•„์š”๊ฐ€ ์—†์„ ์ •๋„๋กœ Batch์˜ ์„ฑ๋Šฅ์€ ์ค€์ˆ˜ํ•œ ์ƒํƒœ. device token 5,000,000๊ฐœ ๊นŒ์ง„ 3min ๋‚ด์— ์ˆ˜ํ–‰ ๊ฐ€๋Šฅํ•จ. ๋”ฐ๋ผ์„œ ์ด ์ด์ƒ์€ ์˜ค๋ฒ„ ์—”์ง€๋‹ˆ์–ด๋ง์ด๋ผ ํŒ๋‹จํ•˜์—ฌ ์ž‘์—… ์ค‘๋‹จํ•œ ์ƒํƒœ

@psychology50 psychology50 added bug ๊ธด๊ธ‰ํ•˜๊ณ , ์ค‘์š”๋„๊ฐ€ ๋†’์€ ์ด์Šˆ fix ๊ธฐ๋Šฅ ์ˆ˜์ • labels Jul 25, 2024
@psychology50 psychology50 self-assigned this Jul 25, 2024
@psychology50 psychology50 merged commit 6bd74c2 into dev Jul 25, 2024
2 checks passed
@psychology50 psychology50 deleted the fix/batch-query-bug branch July 25, 2024 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug ๊ธด๊ธ‰ํ•˜๊ณ , ์ค‘์š”๋„๊ฐ€ ๋†’์€ ์ด์Šˆ fix ๊ธฐ๋Šฅ ์ˆ˜์ •
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant