
support multiple per-row updates in a single mutation #2768

Closed
0x777 opened this issue Aug 23, 2019 · 47 comments

@0x777
Member

0x777 commented Aug 23, 2019

Currently we allow updating multiple rows (through where), but all of them get the same update. We need to add support for cases where the updates differ per row, say you want to set name to "hello" for the row with id=1 and to "world" for the row with id=2. Something like this, maybe?

{
  update_user_many(
    updates: [
      { where: {id: {_eq: 1}}, _set: {name: "hello"} },
      { where: {id: {_eq: 2}}, _set: {name: "world"} }
    ]
  ) {
    affected_rows
  }
}

Notes:

  1. What happens when there is overlap across the where conditions? What would affected_rows and returning return?
  2. Constructing the updates argument would need a bit of boilerplate.

Maybe we can simplify the API to just use the primary key/unique constraints?

{
  update_user_many(
    updates: [
      { id: 1, _set: {name: "hello"}},
      { id: 2, _set: {name: "world"}}
    ]
  ) {
    affected_rows
  }
}

We can probably use update .. from as suggested here: https://stackoverflow.com/a/18799497
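
For reference, a minimal sketch of that UPDATE .. FROM approach, assuming a hypothetical users table with integer id and text name columns:

-- One statement applies a different value to each matched row.
UPDATE users AS u
SET    name = v.name
FROM   (VALUES
         (1, 'hello'),
         (2, 'world')
       ) AS v(id, name)
WHERE  u.id = v.id;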

@0x777 0x777 added c/server Related to server e/intermediate can be wrapped up in a week k/enhancement New feature or improve an existing feature p/high candidate for being included in the upcoming sprint labels Aug 23, 2019
@revskill10

@0x777 As I see it, for the simpler API we could also use where with a unique constraint key?

@hutber

hutber commented Dec 28, 2019

I would enjoy this feature

@levid

levid commented Jan 6, 2020

I would also really like this feature. Currently I have to update multiple records individually and this would be so much easier and more efficient.

@marcfalk

Would love this too! For now I'm using upsert although it's not highly recommended in the docs. What are the drawbacks of doing that until a multi-update feature is here?

@revskill10

Currently, I use multiple mutations in one GraphQL query to achieve this, as Hasura allows you to use multiple mutations inside one query, and all of them will execute in one transaction.
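
For illustration, a minimal sketch of that workaround (table and column names are hypothetical); each aliased field is an independent update, and Hasura runs the whole operation in one transaction:

mutation {
  # Each alias is a separate update; all of them run in a single transaction.
  u1: update_users(where: {id: {_eq: 1}}, _set: {name: "hello"}) { affected_rows }
  u2: update_users(where: {id: {_eq: 2}}, _set: {name: "world"}) { affected_rows }
}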

@tirumaraiselvan tirumaraiselvan added p/medium non-urgent issues/features that are candidates for being included in one of the upcoming sprints and removed p/high candidate for being included in the upcoming sprint labels Apr 11, 2020
@FarazPatankar

Is there an update on this at all? Also, I am considering going the upsert way like @marcfalk has and was wondering if there are any drawbacks as well. Would love some thoughts on this. @tirumaraiselvan

@michael-land

michael-land commented Apr 11, 2020

Is there an update on this at all? Also, I am considering going the upsert way like @marcfalk has and was wondering if there are any drawbacks as well. Would love some thoughts on this. @tirumaraiselvan

The PK may end up with gaps unless you use UUIDs (a conflicting upsert still consumes a serial sequence value).

e.g.
id 1
id 9
id 100

@FarazPatankar

Is there an update on this at all? Also, I am considering going the upsert way like @marcfalk has and was wondering if there are any drawbacks as well. Would love some thoughts on this. @tirumaraiselvan

The PK may end up with gaps unless you use UUIDs (a conflicting upsert still consumes a serial sequence value).

e.g.
id 1
id 9
id 100

Anything other than this? And does this have any bad effects apart from the fact that there are gaps?

@aaronbski

+1

@FarazPatankar

For anyone struggling with this, I ended up using the upsert mutation for this due to the lack of a response and it works perfectly.

@michael-land

Is this going to be implemented? I don't see why cron/scheduled jobs have a higher priority than multiple per-row updates and multiple auth roles.

@hafiztahajamil

@praveenweb This is an important feature. You must assign this to someone. Thanks

@BPiepmatz

would love to see this too 👍

@hafiztahajamil

@praveenweb Any updates ?

@hafiztahajamil

Currently, I use multiple mutations in one GraphQL query to achieve this, as Hasura allows you to use multiple mutations inside one query, and all of them will execute in one transaction.

@revskill10 How do you dynamically add multiple mutations to a single one, each with different variable values? How do you make use of the aliases dynamically?

@revskill10

@hafiztahajamil No, you can't. You have to embed query variables inside the mutations. It's fast.

@hafiztahajamil

hafiztahajamil commented Oct 6, 2020 via email

@revskill10

revskill10 commented Oct 18, 2020

@hafiztahajamil For example, if I want to update first_name of multiple users in one mutation, I would do:

mutation {
  update_users_1(objects: $users1_objects) { affected_rows }
  update_users_2(objects: $users2_objects) { affected_rows }
  update_users_3(objects: $users3_objects) { affected_rows }
  update_users_4(objects: $users4_objects) { affected_rows }
}

In the above mutation, I generated the update_users_X fields from code, as well as the usersX_objects variables, like:

usersX_objects = [{
  first_name: 'test'
}]

@valstu

valstu commented Mar 19, 2021

Any updates on this? Currently most of my updates need to fall back to upserts, which is something I'd rather not use.

@Lexe003

Lexe003 commented Mar 24, 2021

Updating multiple rows with different PKs by calling the API multiple times sounds really bad. Hope this will be implemented soon. This should be a core feature.

@harpyng

harpyng commented May 7, 2021

Please provide an update on this feature!

Although the upsert alternative is a possibility, the documentation specifically suggests otherwise:
https://hasura.io/docs/latest/graphql/core/databases/postgres/mutations/upsert.html
"Upsert is not a substitute for update¶
The upsert functionality is sometimes confused with the update functionality. However, they work slightly differently. An upsert mutation is used in the case when it’s not clear if the respective row is already present in the database. If it’s known that the row is present in the database, update is the functionality to use.

For an upsert, all columns that are necessary for an insert are required."

I have a use case where I want to update JSON fields across multiple ecommerce products at once and doing it by running a mutation with hundreds of individual updates or serially calling the API is not ideal.

Upsert is a possibility, but as stated above from the docs, it requires all columns necessary for an insert - so it doesn't work well if there are NOT NULL constraints to consider (or it requires a lot of additional and unnecessary information to be sent with each mutation).
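
For context, a minimal sketch of the upsert workaround under discussion (table, constraint, and column names are hypothetical); every column required for an insert must be re-sent, which is exactly the pain point described above:

mutation {
  insert_products(
    objects: [
      # name is NOT NULL, so it must be re-supplied even though only description changes.
      { id: 1, name: "existing name 1", description: "new description 1" },
      { id: 2, name: "existing name 2", description: "new description 2" }
    ],
    on_conflict: { constraint: products_pkey, update_columns: [description] }
  ) {
    affected_rows
  }
}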

@noel-dolan

noel-dolan commented Jun 3, 2021

Wow! First posted 23 Aug 2019 and still not a feature! This is exactly what I was/am looking for too; I won't hold my breath! :p upsert really isn't an option, as all fields are required, which would mean other data being wiped. Surely this should be high up the list of features to be added! :/

Edit:

Looks like the following might be a viable, if not quite perfect, solution - just add an alias to each update request.

mutation myUpdates {
  update1: update_users(where: {id: {_eq: 1}}, _set: { value: 10 }) { affected_rows }
  update2: update_users(where: {id: {_eq: 2}}, _set: { value: 15 }) { affected_rows }
  update3: update_users(where: {id: {_eq: 3}}, _set: { value: 20 }) { affected_rows }
}

@valstu

valstu commented Jun 4, 2021

I also feel this should have a higher priority; I've had multiple projects where this exact feature would have been useful. I mean, it is quite a common case in any project to update multiple rows at once. @0x777 any idea if this will progress or is it abandoned?

@carlpaten
Contributor

Since upsert can't work against partial unique indexes, this is causing issues with our ETL. I'd hate to have to go around Hasura and talk directly to Postgres, which up until now we've never had to do.

@SebasScript

Hi, I'm also in need of this; I will likely go with multiple mutations within one call for now to get around the limitation. But this is a feature that seems to be a very basic building block in the mutations toolset, and I agree with others on the priority of this.

@gilligan gilligan added iteration and removed k/enhancement New feature or improve an existing feature p/medium non-urgent issues/features that are candidates for being included in one of the upcoming sprints e/intermediate can be wrapped up in a week labels Feb 2, 2022
@mikolajkniejski

Please add this feature!

@0x777
Member Author

0x777 commented Mar 28, 2022

Hey folks, we will be picking this up soon. Can you share your use cases here? It'll be really helpful in designing the API.

@dariocravero
Contributor

@0x777 here's our use case. Say a list of fields needs to be updated by PK; right now we do this:

mutation update($id1: uuid!, $object1: ..., $id2: uuid!, $object2: ...) {
  u1: update_users_by_pk(pk_columns: {id: $id1}, _set: $object1) { id }
  u2: update_users_by_pk(pk_columns: {id: $id2}, _set: $object2) { id }
}

Ideally, we'll have something like:

mutation update($objects: [{ id, _set }, {id, _set}...]) {
  update_users_many($objects) { returning { id } }
}

So, effectively, it's like inserting many rows, but with a way to say what you update.

Thanks for taking this on. It'll help DX a lot and make mutations safer! (Right now we construct that mutation on the fly)

@lxblvs

lxblvs commented Mar 29, 2022

We are also updating a big list of objects. I would prefer something like this:

mutation updateMultiple($itemUpdates: [item_update_input!]!) {
  update_items(updates: $itemUpdates) {
    affected_rows
  }
}

where the parameter is

itemUpdates = [
  {where: {id: {_eq: 1}}, _set: {name: "potato"}},
  {where: {id: {_eq: 2}}, _set: {name: "another potato"}},
  {where: {name: {_eq: "not a potato"}}, _set: {name: "yet another potato"}},
]

@adimyth

adimyth commented May 23, 2022

Exactly! None of the solutions above take into account the possibility of needing a variable number of updates.

@tjad

tjad commented Jun 16, 2022

@0x777 I like the idea of having an [update_input!]! syntax which contains where and _set, similar to your first suggestion
or #2768 (comment)

This at least allows a simple syntax for multiple updates within a single transaction. We can easily create the list of update objects dynamically.

In terms of rows affected, I think it could return a value for each object/clause in order of the provided update objects.

K.I.S.S

This would be a good first iteration at the least, IMO - we can worry about overlapping in the next iteration (with a caveat in the documentation).

Off the top of my head, if the queries are run in a transaction, the overlapping should not be significant as the order of updates will be maintained.

@eviefp
Contributor

eviefp commented Jun 30, 2022

We've gone through a few internal design iterations on this, and I'm here to report what we've landed on and ask for feedback.

What I'm working on

I am currently working on implementing the version initially suggested by @0x777, which is essentially a multi-record update by primary key.

Internally, we will make sure the keys don't repeat. If they do, we will use the last value (since it's a list of updates, we'll just pick the one that is closest to the list's end).

This will get translated into a single Postgres UPDATE statement. This is important, because this means we get the best possible performance.

What were the designs we considered

One key technical fact is that Postgres will NOT update a row twice in the same statement. For example, say we wanted to allow this query:

{
  update_user_many(
    updates: [
      { where: {id: {_gt: 1}}, _set: {name: "hello"} },
      { where: {id: {_eq: 2}}, _set: {description: "world"} }
    ]
  ) {
    affected_rows
  }
}

If we have a record with id = 2, then we can't know, before running the query, whether the first or the second update will trigger. However, only one of them will. This sort of non-determinism is not something we want to carry over to our product, because it's unexpected and can easily introduce bugs in our users' code.

In the case of updates by primary key, it's fairly easy to detect when there's an overlap. However, as soon as we add more operators and other columns into the mix, the problem becomes incredibly complex (oftentimes not solvable).

This means we end up with two options:

  • the one we ended up picking: only allow equality on ID instead of a generalized where clause; this means very good performance and predictable results
  • allow generic where clauses and run multiple updates in sequence; this means worse performance but allows overlapping/updating the same row multiple times, while keeping the latest value

What about RETURNING?

In the primary key update version, returning is relatively simple to do and shouldn't surprise anyone.

However, the general where clause brings further complications: say we run two updates, and the first updates 2 rows, and the second 3. Should affected_rows be 5? Well, that might not be true because some of those rows could overlap. We could say "at least 2, at most 5". Or we could just return a list of affected_rows (one for each update).

Returning columns for affected rows for each operation can also be a bit tricky and impair performance.

Conclusion

So, in conclusion, we're going for the solution that:

  • is fast
  • will be out ASAP for everyone to have a look at and give us feedback
  • has predictable results

At the same time, we're wondering: how important is having a generic where clause? Are there any specific operators other than eq which you feel are essential for this feature?

@jackherizsmith

jackherizsmith commented Jul 7, 2022

Hello! I have just come across this thread; looks like great timing. Given the simplicity of option 1, the complexity of option 2, and the significant improvement this update will mean for a lot of developers, I think your conclusion is the right one for the product. In the meantime, developers looking for an all-in-one solution can simply run a prior query that returns the IDs they are looking to update.

@eviefp
Contributor

eviefp commented Jul 18, 2022

I'm happy to announce this feature has been merged: 84366c9.

You should be able to use this feature in the next release!

During these couple of weeks, we've iterated a few times on the solutions and ended up being able to provide a few more features than originally anticipated. You can read about them in the commit's CHANGELOG.

Essentially, this feature will create a new mutation field named update_<table>_many. This will take a list of updates, each with a where clause along with its own _set/_inc/etc. clause. These updates will run in sequence. The return type will be the equivalent of running each update separately. The advantage is that everything is run in a single transaction, so if one of them fails, everything gets rolled back.
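
For example, here is a sketch of the new field against a hypothetical users table (column names assumed from the earlier examples in this thread):

mutation {
  update_users_many(
    updates: [
      { where: {id: {_eq: 1}}, _set: {name: "hello"} },
      { where: {id: {_eq: 2}}, _set: {name: "world"} }
    ]
  ) {
    # One mutation response per entry in `updates`, in order.
    affected_rows
    returning { id name }
  }
}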

We're excited to hear back from you and get feedback on this new issue! Let us know how you end up using it.

@revskill10

@eviefp This is a deal breaker for Hasura. Cheers for the launch.

@eviefp
Contributor

eviefp commented Jul 18, 2022

Hey @revskill10, do you care to elaborate? What exactly is the deal breaker part?

@hutber

hutber commented Jul 18, 2022

I'd just like to say an extreme thank you to the devs working on this feature over the last 3 years :D I love you, my wife loves you, my wife's wife loves you!!! Everybody loves you!

@lxblvs

lxblvs commented Jul 18, 2022

This is very cool, thank you for this feature. I will definitely use it and praise the developers who wrote this code.

But I still hope that at some point you change your position on complex where clauses in the multi-updates. If we want to shoot ourselves in the foot and run conflicting mutations in a single call - please let us. Or maybe have a flag to run them sequentially and not as a single transaction?

For now we can do a preflight call to resolve the ids or maybe have an Apollo preprocessor in front of Hasura that does it for us.

@revskill10

Hi @eviefp It's almost the same as Prisma's Transaction feature, but with one more difference.

In a Prisma transaction, you can mix and match both queries and mutations, though.

@eviefp
Contributor

eviefp commented Jul 19, 2022

@lxblvs

But I still hope that at some point you change your position on complex where clauses in the multi-updates. If we want to shoot ourselves in the foot and run conflicting mutations in a single call - please let us.

We actually did change that! So right now we allow arbitrary where clauses that can freely overlap. We just run them in sequence, inside a transaction.
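
To illustrate (again with a hypothetical users table): overlapping updates run in order, so the later entry wins for any row both of them match:

mutation {
  update_users_many(
    updates: [
      # Matches every row, including id = 2 ...
      { where: {id: {_gt: 0}}, _set: {name: "first pass"} },
      # ... and then this runs, so row 2 ends up as "second pass".
      { where: {id: {_eq: 2}}, _set: {name: "second pass"} }
    ]
  ) {
    affected_rows
  }
}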

Or maybe have a flag to run them sequentially and not as a single transaction?

If there's user interest, we could definitely add a flag/option to allow running outside a transaction scope, ignoring errors.

@lxblvs

lxblvs commented Jul 19, 2022

@eviefp oh great! Are you also keeping your original update_table_many_by_pk? Running optimised mutations by PK would be quite badass for bulk deterministic mutations (like updating rows in a table where the ids are known by the front-end).

@eviefp
Contributor

eviefp commented Jul 19, 2022

@eviefp oh great! Are you also keeping your original update_table_many_by_pk? Running optimised mutations by PK would be quite badass for bulk deterministic mutations (like updating rows in a table where the ids are known by the front-end).

@lxblvs We gave that up in favor of this current iteration. However, if we get enough requests, we can definitely prioritize the original update_table_many_by_pk!

@ajohnson1200

bulk deterministic mutations (like updating rows in a table where ids are known by the front-end)

Just so that we deeply understand the problem, can you talk through your use case in a little more detail? i.e.: when you say "bulk", are you talking 10 rows, 100 rows, or 1000 rows? And does the new solution that @eviefp outlined above prohibit you from accomplishing that goal... or is it instead that you can accomplish the goal, it's just not as fast?

@sassela
Contributor

sassela commented Sep 15, 2022

Thanks for your patience with this, folks. I'm closing this issue as it was implemented at d76aab9 and released in v2.10.0. But please keep your feedback coming via comments on this issue; the Hasura team will be notified of it.

@sassela sassela closed this as completed Sep 15, 2022