Multi-Tenancy through Database per Client #749
Comments
I'm dropping this off of the 2.0 release. When we do this, we'll need to change quite a few things in
Is there currently a way to achieve schema per client? So all your points above, but instead of a separate db we have a separate schema. Would the only way be to spin up separate document stores at runtime?
@tonykaralis If you look, there should be an open issue about multi-tenancy through schemas. That turned out to be extremely complicated to pull off, and was dropped out of the 2.0 release and never done. You could do it with a DocumentStore per schema/tenant, yes.
@jeremydmiller thanks, I spotted that issue shortly after. It's a shame, as schema per client would be a very powerful feature, but I realise it's a nightmare to achieve. I am thinking schema per client is going to be too costly to manage with separate document stores per client.
This is coming up much more often, maybe we play this in the next big release
Definitely interested in this, phase 2 of our project involves exposing the same multi tenanted data but via an api. The caveat being every call to the api will need to be tenanted. All calls from the api to the db go via Marten to the same database and schema. So for now I just need to figure out how to spin up a custom tenanted session per http request, as the http request will have the tenantid to use. I realise my edge case is not multi tenancy by schema or client, but regardless it would be a huge feature for us.
Would be a very attractive feature (or at least I'd have strong use cases for it). The technical implementation will be interesting though, and it won't be able to make the same guarantees as schema or column level tenancy. E.g. "Be able to apply all document schema operations on all tenants" cannot be transactionally done OOTB, with PG connections being per database.
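To illustrate the point above about non-transactional "all tenants" operations, here is a minimal, library-neutral sketch. It assumes a hypothetical `migrate` callback that applies schema changes to one database; nothing here is Marten's API. Because each database has its own connection, the loop cannot roll back databases that already succeeded, so it records a per-database outcome instead.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class MigrationReport:
    """Outcome of running one schema operation against many databases."""
    succeeded: List[str] = field(default_factory=list)
    failed: Dict[str, str] = field(default_factory=dict)  # database name -> error text


def apply_to_all_databases(
    databases: Iterable[str],
    migrate: Callable[[str], None],  # hypothetical: migrates a single database
) -> MigrationReport:
    report = MigrationReport()
    for name in databases:
        try:
            # Each call commits (or fails) on its own; there is no transaction
            # spanning the other databases.
            migrate(name)
        except Exception as exc:
            report.failed[name] = str(exc)
        else:
            report.succeeded.append(name)
    return report
```

A caller would typically retry only the entries in `report.failed`, which assumes the migration itself is safe to run more than once.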
I have many services that resolve a connection string at runtime. That is to say that I do NOT have knowledge of the target database until the moment the service is asked to interact with it. For now, a custom
@jaredwri I assume you don't have operations that would span multiple databases, i.e. requiring multiple stores at a specific call site? Just thinking your workaround sounds reasonable. If tenancy by DB was supported, it would only move code around a bit (so how you configure store vs. sessions). What I think needs to be considered in supporting tenancy by DB is if/how it alters the outputs of Marten. So at the very least a matter of documentation - the same API call can yield very different exceptions or results based on initial configuration.
@jokoko
Glad to see the post on your blog - this proposal would be great to see - I'm currently stuck on version 3.x because of the overhead in multiple document stores in a containerized environment. The situation I have is service(s) running in a multitenant environment. The service determines which tenant database at the 'last minute', just before performing a query or operation. Currently the workaround I'm using is to have multiple DocumentStores and look them up based on the tenant, but this is painful in Marten due to DocumentStore size and startup time. It's unusable in Marten v4 due to memory usage and generation times. Some notes on my use case:
Basically, I just want to use one DocumentStore and have the connection to the database be provided by me and held at the session level (And have memory usage be tiny). However, I'm open to any solution that fits my use case without me also needing to up my pod memory limit to hundreds of megabytes.
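For readers unfamiliar with the workaround described above (one store per tenant, looked up at the last minute), a rough sketch of the pattern follows. The names `TenantStoreCache`, `resolve_connection` and `build_store` are hypothetical and stand in for whatever the application uses; this is not Marten code.

```python
import threading
from typing import Callable, Dict, Generic, TypeVar

S = TypeVar("S")  # whatever "store" type the application builds per tenant


class TenantStoreCache(Generic[S]):
    """Builds one store per tenant on first use and reuses it afterwards."""

    def __init__(
        self,
        resolve_connection: Callable[[str], str],  # tenant id -> connection string
        build_store: Callable[[str], S],           # connection string -> store
    ) -> None:
        self._resolve = resolve_connection
        self._build = build_store
        self._stores: Dict[str, S] = {}
        self._lock = threading.Lock()

    def store_for(self, tenant_id: str) -> S:
        # The lock is held while building, which keeps the sketch simple but
        # serialises first-time creation across tenants.
        with self._lock:
            store = self._stores.get(tenant_id)
            if store is None:
                store = self._build(self._resolve(tenant_id))
                self._stores[tenant_id] = store
            return store
```

The sketch only shows the lookup; it does nothing about the cost the comment complains about, since every cached store still carries its full memory and startup overhead.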
@PhilipRieck Thank you for taking the time to write that up! The "generate ahead" model is meant to deal with the memory and cold start issue. I think I'm going to stick something in Marten 5 to make that much easier to use before I stock up on a lot more bourbon and attempt to move to IL generation instead. "I do not require Marten to 'know' what tenants exist - I have the list, I have the resolution. Giving the list to Marten and keeping them in sync just for 'all tenant operation' convenience methods doesn't really interest me, since it's just one more piece of data in two places." -- Think database migrations. In that case, Marten absolutely has to know what all the databases are in order to do the schema migrations. I want this built in somehow to Marten so that more people can use this. What you've apparently built for yourself, I'd like to have in the box for other folks. "if I have to restart my services to add a tenant I can work around that - would really rather not, though." -- if we make the tenancy discovery model a little bit pluggable, you could have custom -- or some in the box -- options to automatically spin up a new database for a valid new tenant
Support for multitenancy with database per tenant is something that we are really interested in. Our use case seems to be pretty close to the one described by @PhilipRieck
These are just some points off the top of my head. I think that our use case is quite simple (famous last words?) and being able to pass a connection string when opening a new session might be enough for us. At the moment we open database connections and pass those to Marten, so being able to pass the connection string would already simplify our use case.
Hey everyone, I started jotting down notes yesterday about implementing this. I think I want to say that right now we support these models:
Tenant per schema is still out of scope and really doesn't fit well w/ Marten internals anyway. I think everybody is going to be on board for 1-3, so let's talk more about 4. In the notes I took on potential design, it's not actually going to be any more complicated to assume the hybrid model is a possibility. I also think we have to have some knowledge of what tenants are valid for a given database to do runtime assertions when we add the new separated database model. Furthermore, 4 is something that would conceivably be valuable for my company, where we have clients with individual locations/sub-organizations. I'm not suggesting we try for a full blown tree structure model of tenancy here, but setting the foundation might help. At a minimum, I want to treat the concept of "Database" and "Tenant" as not necessarily locked together as we do this work regardless.
@jeremydmiller Thanks for continuing to look at this. I have opinions on 1 and 3 - both models would be used by me. 2 and 4 are not something I or my team would need at all. I don't see a lot of use for them, but others may disagree. Just to be clear on usage as far as I'm concerned: In a theoretical world where you had a Marten library that allowed me to use a different connection string / selector per Session without rebuilding the DocumentStore, and then you built first-class tenancy notions in a separate library on top of that capability, I would only use the base library. Any way you can give me to separate the ideas of data shape and query building from the physical connection to the database will be a win for me. If you solve other users' tenancy needs at the same time, well that is a huge bonus. In fact, this is basically how I'm trying to work around it now, but it's much harder in the new versions. Looking at this as a whole, I think you'll require at least some minor architectural changes. Once you have a design you like and find yourself just needing someone's time coding it up, please let me know how I can assist - glad to send PRs your way.
@PhilipRieck In what way is "it's much harder in the new versions"? You can still push a connection string into a session, and that's probably the most efficient way to do database per tenant with the existing Marten V4. And I definitely don't agree with having separate libraries for the multi-tenancy. At this point I think this is going to be a fair amount of work to enhance the database migrations capabilities in Marten for multiple databases, but hardly any code after that for the 3 & 4 models above. When we introduced 2. in Marten way back when, I thought that we'd also be doing 3., so the internal hooks are actually kinda set for database per client already.
@PhilipRieck And I'm shutting down until at least next week, but I'll get back to you on the PR help. Definitely take you up on that if you're game. Think it's gonna be way more about writing tests than actual code.
It's quite possible (likely even) that the way I'm working around this currently is not the best way. I'll look more into that and create other threads or use gitter to track that down.
I'm sorry - I wasn't suggesting this approach but trying to use it as an illustration. Re-reading it, I think it's more confusing than helpful so please disregard. If you can get the other approaches with minimal work that's great.
Jotting down some implementation notes on what is going to be variable:
Static database to tenant mapping
Where is this information stored? Thinking we support multiple options:
Dynamic creation of databases
Marten already has some functionality for spinning up databases with configuration (likely moving to Weasel very soon). In development mode, we could spin these up on the fly based on expected or even new tenants.
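As a rough illustration of the static database-to-tenant mapping idea in these notes, combined with the earlier point that "Database" and "Tenant" need not be locked together, the sketch below keeps a tenant-to-database lookup in memory and rejects unknown tenants at runtime. `TenantDatabaseMap` and `UnknownTenantError` are hypothetical names, and where the mapping is actually stored is left open, as it is in the notes.

```python
from typing import Dict, Iterable, List


class UnknownTenantError(KeyError):
    """Raised when a tenant id is not mapped to any database."""


class TenantDatabaseMap:
    """Maps tenant ids to database names; several tenants may share one database."""

    def __init__(self) -> None:
        self._database_by_tenant: Dict[str, str] = {}

    def register(self, database: str, tenants: Iterable[str]) -> None:
        for tenant in tenants:
            existing = self._database_by_tenant.get(tenant)
            if existing is not None and existing != database:
                raise ValueError(
                    f"tenant {tenant!r} is already mapped to {existing!r}"
                )
            self._database_by_tenant[tenant] = database

    def database_for(self, tenant: str) -> str:
        # Runtime assertion: only tenants registered up front are valid.
        try:
            return self._database_by_tenant[tenant]
        except KeyError:
            raise UnknownTenantError(tenant) from None

    def databases(self) -> List[str]:
        # The distinct databases, e.g. as input for running migrations on each.
        return sorted(set(self._database_by_tenant.values()))
```

Calling `register` again at runtime adds tenants without rebuilding anything, which is one way to read the "pluggable discovery" idea discussed earlier in the thread.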
Would be great if the "master" database didn't have to be a database, e.g. making things like DynamoDB/Azure Tables/Cosmos DB be pluggable options here. There shouldn't need to be something as heavy as a full Postgres instance just so Marten can map to tenants.
Assuming that databases are not being spun up on the fly by Marten, having a way to register a new tenant with Marten without needing to restart the application would be invaluable
Well, to use Marten you likely already have "at least" one Postgres instance to stick a central small DB on (it's going to be pretty small, like < 10M likely even for large scale systems) ... So it's not a horrible idea; why would we bring Dynamo into this architecture if it's already Postgres based? Having said that, providing an interface to override this default master config might be valuable to some.
"Having said that, providing an interface to override this default master config might be valueable to some." -- which is exactly what's already in place. And what you're talking about is maybe an extra schema & one table that isn't accessed very much riding on one of the databases that you're using. From my perspective, it makes no sense to use some other kind of storage. |
Happy to pull this into a separate issue as it's a bit of a rabbit hole. To take a current real world example I manage (that doesn't use Marten): using AWS infrastructure, we have around 300 tenants in a single region. We put multiple databases on the same RDS servers, which is by far the most cost effective option. One challenge is we have to put a cap on how many databases per server due to connections draining server memory (50 per server seems ok). On top of that, some tenants are "hotter" than others with much more activity, which means either moving them around to load balance or scaling each RDS instance based on demand. Moving individual databases across servers is not fun to do in RDS. There is a balance between cost and trying to ensure the performance/stability of one tenant doesn't affect others.
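The per-server cap described above can be sketched as a simple placement rule. This is only an illustration under stated assumptions: the server names are made up, the cap of 50 is the figure quoted in the comment, and the rule counts databases only, so it does nothing about "hot" tenants.

```python
from typing import Dict, Optional

MAX_DATABASES_PER_SERVER = 50  # figure quoted in the comment above


def pick_server(
    database_counts: Dict[str, int],  # server name -> databases currently hosted
    cap: int = MAX_DATABASES_PER_SERVER,
) -> Optional[str]:
    """Return the least-loaded server below the cap, or None if all are full."""
    candidates = {s: n for s, n in database_counts.items() if n < cap}
    if not candidates:
        return None  # caller would need to provision another server
    # Ties are broken by server name so the choice is deterministic.
    return min(candidates, key=lambda s: (candidates[s], s))
```

With roughly 300 tenants and a cap of 50, this rule needs at least six servers before it stops returning `None` for new tenants.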
It is indeed a rabbit hole!
We're not on AWS, so DynamoDB wouldn't be our choice, but you have it right - anything we add that holds state adds risk and management effort. As we are fully Kubernetes, our current tenancy storage is a custom resource (CRD) we apply to the etcd database and have a controller managing.
This is perfect. Personally, I'd make the "master config" for tenancy have to be an affirmative choice (as in, you must select the implementation Marten will use to get tenants, translate tenant->connection, etc), rather than having one be the 'blessed' default. But: @jeremydmiller, I know you will need a default implementation to reduce friction for many users and may want to bless one - as long as overriding is clear and performant I'm happy. One note - I would guess most people will quickly outstrip any default you provide. Perhaps a default provider based on I know my opinion is very much colored by my single use case, so thanks for taking it into consideration on this. Also, thanks so much for the progress on this! (And on MartenDB in general, in case you don't hear it enough).
Punchlist
Development Tasks
@PhilipRieck @elexisvenator To all the points:
I wanted to split up #435.
The idea here would be to keep each tenant in a completely separate database. We've already made some significant changes for 2.0 to make this a lot more efficient inside of Marten's internals. Here's what I think still needs to happen: