Skip to content

Create and document Java sync improved bulk write API #1458

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 40 commits into from
Sep 24, 2024

Conversation

stIncMale
Copy link
Member

@stIncMale stIncMale commented Jul 23, 2024

Documentation:

API changes (only new program elements were introduced):

  1. com.mongodb.client.model.bulk
    1.1. ClientWriteModel, ClientNamespacedWriteModel. a) Note how unlike WriteModel, it does not have a type parameter specifying the document type. Initially I implemented that (48b9614), but then I removed it (af854ed) because I don't think it is needed in this case. b) The reason behind introducing two write model types is that we are considering the possibility of exposing the new operation at the collection level, where each model must not have a namespace associated with it.
    1.2. ClientInsertOneModel, ClientUpdateOneModel, ClientUpdateManyModel, ClientReplaceOneModel, ClientDeleteOneModel, ClientDeleteManyModel.
    1.3. ClientDeleteOptions, ClientReplaceOptions, ClientUpdateOptions.
    1.4. ClientBulkWriteOptions.
  2. com.mongodb.client.result.bulk
    2.2. ClientDeleteResult.
    2.3. ClientInsertOneResult.
    2.4. ClientUpdateResult.
    2.5. ClientBulkWriteResult.
  3. com.mongodb.ClientBulkWriteException.
  4. com.mongodb.client.MongoCluster.bulkWrite (4 overloads).

BULK-TODO code comments:

The BULK-TODO code comments will be turned into comments (where possible) in the PR to merge JAVA-4586_bulk-write into master, which will be created once this PR is approved and merged into JAVA-4586_bulk-write.

JAVA-5527

@stIncMale stIncMale self-assigned this Jul 23, 2024
final MongoNamespace namespace,
final Bson filter,
final Bson update,
final ClientUpdateOptions options) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to half the number of overloads of methods in ClientWriteModel and the number of new MongoCluster.bulkWrite overloads by allowing to pass @Nullable options. However, the existing bulk write API does not allow null options, and I followed that in the new API.

final ServerAddress serverAddress) {
super(message, serverAddress);
// BULK-TODO Should ClientBulkWriteException.getCode be the same as error.getCode,
// and getErrorLabels/hasErrorLabel contain the same labels as error.getErrorLabels?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe reviewers have thoughts on this comment. My understanding is that the existing bulk write API does not wrap the top-level errors in BulkWriteException, but the new API wraps the top-level errors in ClientBulkWriteException. Given that the driver uses error code and labels to implement some of its functionality, the question seems reasonable. I don't currently have an answer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will have to think about this more. I'm not sure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are all the error labels the driver knows about and uses: TransientTransactionError, UnknownTransactionCommitResult, RetryableWriteError, NoWritesPerformed.

This question splits into two:

  1. What error codes/labels should be copied to ClientBulkWriteException from ClientBulkWriteException.error (the top-level error) for consumption by applications?
    • Simply copying some labels seems to be just incorrect, for example, RetryableWriteError - even if a specific bulkWrite command within a bulk operation can be retried, it does not mean that all previous commands can also be repeated when an application attempts to retry the whole operation. The same reasoning seems to apply to NoWritesPerformed. However, if no bulkWrite commands succeeded, then copying both of these labels seems safe.
    • I don't think we should copy any error codes for consumption by applications.
  2. What error codes/labels should be copied to ClientBulkWriteException from ClientBulkWriteException.error (the top-level error) for consumption by the driver?
    • Given that ClientBulkWriteException will be created by ClientBulkWriteOperation.execute, only the following driver code will have a chance to handle labels and the code from this exception:
      • MongoClusterImpl.OperationExecutorImpl.execute - reads TransientTransactionError, UnknownTransactionCommitResult, and code 91.
      • ClientSessionImpl.withTransaction - reads TransientTransactionError, UnknownTransactionCommitResult.
    • It seems safe to copy the code if its 91.
    • It seems safe to copy the label UnknownTransactionCommitResult.
    • I am not sure about copying TransientTransactionError. @jyemin does this label always result in driver aborting the transaction?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems safest not to copy any error codes / error labels to the top level Exception.

If a user needs to inspect them can't they use the writeErrors map?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems safest not to copy any error codes / error labels to the top level Exception.

I agree that we should not copy error codes/labels for consumption by applications. However, we may still decide to do that for consumption by the driver (and if we do so, the applications will be able to observe that).

I am OK with not copying anything, and instead updating the logic of MongoClusterImpl.OperationExecutorImpl.execute, ClientSessionImpl.withTransaction, and their reactive counterparts such that they look at ClientBulkWriteException.getError.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a user needs to inspect them can't they use the writeErrors map?

I don't think that ClientBulkWriteException.getWriteErrors is relevant here. Only the code/labels from the top-level error seem to make sense to consider in the context of copying them to ClientBulkWriteException.

Copy link
Member Author

@stIncMale stIncMale Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in dbf9a26, f026ee3.

Comment on lines +426 to +435
* @throws ClientBulkWriteException If and only if the operation is unsuccessful or partially unsuccessful,
* and there is at least one of the following pieces of information to report:
* {@link ClientBulkWriteException#getWriteConcernErrors()}, {@link ClientBulkWriteException#getWriteErrors()},
* {@link ClientBulkWriteException#getPartialResult()}.
* @throws MongoException Only if the operation is unsuccessful.
* @since 5.3
* @mongodb.server.release 8.0
* @mongodb.driver.manual reference/command/bulkWrite/ bulkWrite
*/
ClientBulkWriteResult bulkWrite(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this throwing behavior is explicitly allowed by the specification.

@stIncMale stIncMale requested review from jyemin, a team and rozza and removed request for a team July 23, 2024 21:53
@stIncMale stIncMale requested a review from jyemin July 24, 2024 13:09
@stIncMale stIncMale requested a review from vbabanin August 13, 2024 15:09
*
* @return Whether this result was acknowledged.
*/
boolean isAcknowledged();
Copy link
Member

@vbabanin vbabanin Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Handling unacknowledged writes:

We are currently throwing an UnsupportedOperationException whenever an unacknowledged write occurs, rendering all methods ineffective under these conditions.

Ideally, API users would check status of isAcknowladged() before operations that might throw exceptions. However, in practice, I believe some users might overlook or misunderstand these requirements, leading to runtime exceptions.

The server, to my understanding, does not send a reply in cases of unacknowladged write. We might consider the API which handles such misuse more gracefully. For instance, wrapping ClientBulkWriteResult in an Optional within the bulkWrite method could indicate an unacknowledged write. This approach prevents the API from throwing exceptions, potentially reducing runtime issues in user applications. Using Optional might make the API more intuitive. This also seems to be allowed by the spec.

  1. API consistency

If we shift the status-checking approach of unacknowledged writes to use Optional, it could further unify the API. Currently, we have Optional<BsonValue> getUpsertedId(); which returns a non-empty value only if an upsert occurred. ClientUpdateResult lacks a status checking method. Similarly, we could make ClientBulkWriteResult more consistent by returning Optional from methods instead of throwing UnsupportedOperationException in cases when it is not verbose result, aligning it with what is already done in ClientUpdateResult. This way the API should be more consistent. This also seems to be allowed by the spec.

  1. Further consideration with VerboseResult (which i would personally prefer)

In recent discussions, @stIncMale proposed introducing a VerboseResult within ClientBulkWriteResult, which would be encapsulated in Optional. This design would contain all the current methods that throw unsupported exceptions related to verbose result and would align seamlessly with the use of Optional in ClientUpdateResult and Optional for unacknowledged write, enhancing the intuitiveness and cleanliness of the API.
(If I’ve missed any aspects of this approach, @stIncMale, please feel free to correct me.)

This approach would require deviation from the specification. However, current requirements in specification seem to be too restrictive.

  1. Balancing legacy compatibility and API improvements:
    While maintaining consistency with our 'legacy' bulk API aids in user transition, doing so could lead to outdated designs. If we focus on aligning with legacy standards, we risk repeating old patterns and missing opportunities to modernize our API.

I understand these approaches might have been considered previously. I would appreciate hearing your thoughts.

@stIncMale @rozza @jyemin

Copy link
Member Author

@stIncMale stIncMale Aug 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work @vbabanin and @stIncMale

* The eligibility for retries is determined per each {@code bulkWrite} command:
* {@link ClientUpdateManyModel}, {@link ClientDeleteManyModel} in a command render it non-retryable.</p>
* <p>
* This operation is not supported by MongoDB Atlas Serverless instances.</p>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stIncMale
Copy link
Member Author

@jyemin, @rozza, @vbabanin, please re-review.

@stIncMale stIncMale requested a review from jyemin September 3, 2024 19:38
@stIncMale stIncMale requested a review from jyemin September 9, 2024 16:13
Copy link
Collaborator

@jyemin jyemin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work, Valentin.

Just one Javadoc nit, but LGTM either way.

* @since 5.3
*/
@Evolving
interface Verbose {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While Verbose accurately suggests that it provides more detailed or extended information than plain result, using an adjective as a class or interface name is somewhat unconventional in Java spec and the name might not immediately convey its purpose.

I would suggest noun-based naming, just couple of examples: VerboseBulkResult/ExtendedBulkResult

Copy link
Member Author

@stIncMale stIncMale Sep 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent was to refer to it as ClientBulkWriteResult.Verbose. I updated the code where this was not done, and following is how the rendered API docs look like

Screenshot 2024-09-18 at 14 31 18 Screenshot 2024-09-18 at 14 31 24 Screenshot 2024-09-18 at 14 31 39

Do you still think this type should be renamed? If yes, then VerboseResults is the name I'll use.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in b6702a6.

stIncMale and others added 4 commits September 18, 2024 14:12
…riteResult.java

Co-authored-by: Viacheslav Babanin <frest0512@gmail.com>
…eResult.java

Co-authored-by: Viacheslav Babanin <frest0512@gmail.com>
@stIncMale stIncMale requested a review from vbabanin September 18, 2024 20:43
Copy link
Member

@vbabanin vbabanin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@stIncMale stIncMale merged commit ce34eee into mongodb:JAVA-4586_bulk-write Sep 24, 2024
2 checks passed
@stIncMale stIncMale deleted the JAVA-5527 branch September 24, 2024 09:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants