DevilKing's blog

冷灯看剑,剑上几分功名?炉香无需计苍生,纵一穿烟逝,万丈云埋,孤阳还照古陵

0%

Avoid double payment in distributed payment env

原文链接

There are three different common techniques used in distributed systems to achieve eventual consistency: read repair, write repair, and asynchronous repair.

Asynchronous repair involves the server being responsible for running data consistency checks, such as table scans, lambda functions, and cron jobs. Additionally, asynchronous notifications from the server to the client are widely used in the payments industry to force consistency on the client side.

write repair, where every write call from the client to the server attempts to repair an inconsistent, broken state.

By design, idempotency safely allows multiple identical calls from clients using an auto-retry mechanism for an API to achieve eventual consistency.

it offers low latency while still providing clean separation between high-velocity product code and low-velocity system management code.

  • An idempotency key is passed into the framework, representing a single idempotent request
  • Tables of idempotency information, always read and written from a sharded master database (for consistency)
  • Database transactions are combined in different parts of the codebase to ensure atomicity, using Java lambdas
  • Error responses are classified as “retryable” or “non-retryable

Orpheus is centered around the assumption that almost every standard API request can be separated into three distinct phases: Pre-RPC, RPC, and Post-RPC.

  1. Pre-RPC: Details of the payment request are recorded in the database.
  2. RPC: The request is made live to the external service over network and the response is received. This is a place to do one or more idempotent computations or RPCs (for example, query service for the status of a transaction first if it’s a retry-attempt).
  3. Post-RPC: Details of the response from the external service are recorded in the database, including its successfulness and whether a bad request is retryable or not.

To maintain data integrity, we adhere to two simple ground rules:

  1. No service interaction over networks in Pre and Post-RPC phases
  2. No database interactions in the RPC phases

针对pre-RPC和post-RPC的database操作进行合并。every database commit in each of the Pre-RPC and Post-RPC phases is combined into a single database transaction. This ensures atomicity — entire units of work (here the Pre-RPC and Post-RPC phases) can fail or succeed as a unit consistently.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
public Response processPayment(InitiatePaymentRequest request, UriInfo uriInfo)
throws YourCustomException {

return orpheusManager.process(
request.getIdempotencyKey(),
uriInfo,
// 1. Pre-RPC
() -> {
// Record payment request information from the request object
PaymentRequestResource paymentRequestResource = recordPaymentRequest(request);
return Optional.of(paymentRequestResource);
},
// 2. RPC
(isRetry, paymentRequest) -> {
return executePayment(paymentRequest, isRetry);
},
// 3. Post RPC - record response information to database
(isRetry, paymentResponse) -> {
return recordPaymentResponse(paymentResponse);
});
}

FSM state machine

针对retry和noretry的情况进行分类

retry or no-retry

Choosing an idempotency key is crucial — the client can choose either to have request-level idempotency or entity-level idempotency based on what key to use.

each api request has an expiring lease

A lease comes with an expiration to cover the scenario where there are timeouts on the server side.

由于replica lag的问题,尽量不要使用replica database