Revisit the implementation of --standalone as well as the assumptions made #513

Closed
2 of 5 tasks
Tracked by #506
trallard opened this issue Jul 31, 2023 · 25 comments
Assignees
Labels
area: api area: user experience 👩🏻‍💻 Items impacting the end-user experience project: JATIC Work item needed for the JATIC project

Comments

@trallard
Collaborator

trallard commented Jul 31, 2023

One of our goals is to improve the local experience or story of the project. To do so, we need to:

Tasks

@trallard trallard added area: api project: JATIC Work item needed for the JATIC project area: user experience 👩🏻‍💻 Items impacting the end-user experience labels Jul 31, 2023
@trallard trallard added this to the 🚀 JATIC - Q1 milestone Jul 31, 2023
@trallard
Collaborator Author

trallard commented Aug 1, 2023

More than likely it is using a SQLite database for the local implementation.

@asmeurer
Contributor

asmeurer commented Aug 1, 2023

I was able to use --standalone to create an environment, but if I try to create a subsequent environment I get the error

ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/uvicorn/middleware/message_logger.py", line 84, in __call__
    raise exc from None
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/uvicorn/middleware/message_logger.py", line 80, in __call__
    await self.app(scope, inner_receive, inner_send)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/fastapi/applications.py", line 289, in __call__
    await super().__call__(scope, receive, send)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/base.py", line 108, in __call__
    response = await self.dispatch_func(request, call_next)
  File "/home/aaronmeurer/Documents/conda-store/conda-store-server/conda_store_server/server/app.py", line 238, in conda_store_middleware
    response = await call_next(request)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/base.py", line 84, in call_next
    raise app_exc
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/base.py", line 70, in coro
    await self.app(scope, receive_or_disconnect, send_no_error)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/sessions.py", line 86, in __call__
    await self.app(scope, receive, send_wrapper)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/cors.py", line 91, in __call__
    await self.simple_response(scope, receive, send, request_headers=headers)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/cors.py", line 146, in simple_response
    await self.app(scope, receive, send)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/fastapi/routing.py", line 273, in app
    raw_response = await run_endpoint_function(
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/fastapi/routing.py", line 192, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/home/aaronmeurer/Documents/conda-store/conda-store-server/conda_store_server/server/views/api.py", line 572, in api_post_specification
    build_id = api.post_specification(conda_store, specification, namespace_name)
  File "/home/aaronmeurer/Documents/conda-store/conda-store-server/conda_store_server/api.py", line 234, in post_specification
    return conda_store.register_environment(specification, namespace, force=True)
  File "/home/aaronmeurer/Documents/conda-store/conda-store-server/conda_store_server/app.py", line 617, in register_environment
    build = self.create_build(environment.id, specification.sha256)
  File "/home/aaronmeurer/Documents/conda-store/conda-store-server/conda_store_server/app.py", line 675, in create_build
    (
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/celery/canvas.py", line 1035, in apply_async
    return self.run(args, kwargs, app=app, **(
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/celery/canvas.py", line 1060, in run
    tasks, results_from_prepare = self.prepare_steps(
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/celery/canvas.py", line 1253, in prepare_steps
    app.backend.ensure_chords_allowed()
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/celery/backends/base.py", line 1101, in ensure_chords_allowed
    raise NotImplementedError(E_CHORD_NO_BACKEND.strip())
NotImplementedError: Starting chords requires a result backend to be configured.

Note that a group chained with a task is also upgraded to be a chord,
as this pattern requires synchronization.

Result backends that supports chords: Redis, Database, Memcached, and more.

@asmeurer
Contributor

asmeurer commented Aug 1, 2023

I also get this error which I believe prevents the environment from actually showing up in the UI

ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/uvicorn/middleware/message_logger.py", line 84, in __call__
    raise exc from None
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/uvicorn/middleware/message_logger.py", line 80, in __call__
    await self.app(scope, inner_receive, inner_send)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/fastapi/applications.py", line 289, in __call__
    await super().__call__(scope, receive, send)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/base.py", line 108, in __call__
    response = await self.dispatch_func(request, call_next)
  File "/home/aaronmeurer/Documents/conda-store/conda-store-server/conda_store_server/server/app.py", line 238, in conda_store_middleware
    response = await call_next(request)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/base.py", line 84, in call_next
    raise app_exc
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/base.py", line 70, in coro
    await self.app(scope, receive_or_disconnect, send_no_error)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/sessions.py", line 86, in __call__
    await self.app(scope, receive, send_wrapper)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/cors.py", line 83, in __call__
    await self.app(scope, receive, send)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/fastapi/routing.py", line 273, in app
    raw_response = await run_endpoint_function(
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/fastapi/routing.py", line 192, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/home/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/home/aaronmeurer/Documents/conda-store/conda-store-server/conda_store_server/server/views/api.py", line 398, in api_list_environments
    return paginated_api_response(
  File "/home/aaronmeurer/Documents/conda-store/conda-store-server/conda_store_server/server/views/api.py", line 109, in paginated_api_response
    "data": [object_schema.from_orm(_).dict(exclude=exclude) for _ in query.all()],
  File "/home/aaronmeurer/Documents/conda-store/conda-store-server/conda_store_server/server/views/api.py", line 109, in <listcomp>
    "data": [object_schema.from_orm(_).dict(exclude=exclude) for _ in query.all()],
  File "pydantic/main.py", line 579, in pydantic.main.BaseModel.from_orm
pydantic.error_wrappers.ValidationError: 1 validation error for Environment
current_build_id
  none is not an allowed value (type=type_error.none.not_allowed)

@trallard trallard added project: JATIC Work item needed for the JATIC project and removed project: JATIC Work item needed for the JATIC project labels Aug 2, 2023
@costrouc
Member

costrouc commented Aug 2, 2023

The --standalone setting just tells conda-store-server to run the celery workers as a subprocess.

The defaults for conda-store-server:

Technically there shouldn't be anything blocking conda-store-server --standalone from running on windows/osx/linux.

@costrouc
Member

costrouc commented Aug 3, 2023

This is related to issue #520, which is the cause of the errors.

@asmeurer
Contributor

asmeurer commented Aug 4, 2023

Something I noticed running --standalone on Mac (I didn't check if this is also the case on Linux) is that the process is very difficult to actually shut down. If you Control-C, it doesn't actually stop, and you have to Control-C multiple times to get it to stop. But even then, some other process is still running in the background, and you have to kill it manually!

@trallard
Collaborator Author

trallard commented Aug 7, 2023

Thanks for sharing your findings @asmeurer - can you please start identifying/writing here concrete tasks/ideas to address these issues?

@costrouc
Member

@asmeurer the code behind that issue is here: https://github.com/conda-incubator/conda-store/blob/main/conda-store-server/conda_store_server/server/app.py#L341-L368. I would love it if this were cleaned up. I am certain this was not implemented properly.
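For illustration only, here is a minimal sketch of one way a parent process could start a worker as a subprocess and make sure it is gone on shutdown. It is not the code at the link above. The function names `start_worker` and `stop_worker` are hypothetical, and the approach assumes a POSIX system (`start_new_session` and `os.killpg` are not available on Windows).

```python
import os
import signal
import subprocess
import sys


def start_worker(cmd):
    """Start a worker command in its own session (and process group)."""
    # start_new_session=True (POSIX only) keeps the terminal's Ctrl-C from
    # reaching the worker directly, so the parent decides when it stops.
    return subprocess.Popen(cmd, start_new_session=True)


def stop_worker(proc, timeout=10.0):
    """Ask the worker's whole process group to exit, then force it if needed."""
    if proc.poll() is not None:
        return proc.returncode  # already exited
    pgid = os.getpgid(proc.pid)
    os.killpg(pgid, signal.SIGTERM)  # graceful request first
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        os.killpg(pgid, signal.SIGKILL)  # last resort so nothing is orphaned
        return proc.wait()
```

A caller would wrap the server's main loop in `try`/`finally` and call `stop_worker` in the `finally` branch, so that a single Control-C stops the server and the worker group together.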

@asmeurer
Contributor

After the latest changes in main I am now able to successfully create multiple environments on Linux in standalone mode, although the UI issue is still there.

@asmeurer
Contributor

According to https://docs.celeryq.dev/en/stable/getting-started/first-steps-with-celery.html#running-the-celery-worker-server, the celery worker should be run in the background as a daemon using something like supervisor.

@asmeurer
Contributor

Another big thing that needs to be cleaned up here once we get --standalone mostly working is the tests. Right now the tests all operate on the current conda environment:

return pathlib.Path(os.environ["CONDA_PREFIX"])

They rely on isolation happening a level above the tests. But if standalone mode is functional, the tests should be able to just run with their own isolation, and you should be able to just run pytest locally. This means making sure that everything is done in temp directories so that nothing about your current dev conda environment affects things, and so that the tests themselves don't affect it either (as far as I can see none of the tests actually mutate the current conda environment, but with current_prefix as it currently is there's always a risk that that could inadvertently happen). This also would make the tests more consistent. Currently several tests fail for me because my dev conda environment happens to fail some of the tests (e.g., some tests fail when the environment has editable packages installed).

@costrouc
Member

@asmeurer I agree, these are two places that need improvement in the tests:

  • We need a "mock" conda environment to use for testing. Using CONDA_PREFIX is dirty and honestly is only there because it was quick to implement.
  • We need conda-store --standalone tests. These would also be reasonably quick CI tests since we don't have to go through the whole ordeal of running docker-compose .... Additionally, we could run cross-OS (windows/osx/linux) GitHub Actions jobs here.
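As a rough sketch of the "mock" environment idea, the helper below builds a throwaway prefix in a temporary directory instead of reading CONDA_PREFIX. The name `fake_conda_prefix` is hypothetical, and it assumes that a directory containing `conda-meta/history` is enough for the code under test to treat it as an environment. Tests that inspect installed packages would need more files than this.

```python
import contextlib
import pathlib
import tempfile


@contextlib.contextmanager
def fake_conda_prefix():
    """Yield a throwaway directory laid out like an empty conda environment.

    Assumption: a ``conda-meta/history`` file is sufficient for the code
    under test to accept the directory as an environment prefix.
    """
    with tempfile.TemporaryDirectory() as tmp:
        prefix = pathlib.Path(tmp) / "env"
        (prefix / "conda-meta").mkdir(parents=True)
        (prefix / "conda-meta" / "history").touch()
        yield prefix


# Usage: the prefix only exists inside the with-block.
with fake_conda_prefix() as prefix:
    history_exists = (prefix / "conda-meta" / "history").is_file()
```

In a pytest suite the same layout could be created under the built-in `tmp_path` fixture, so nothing in the developer's own conda environment is read or modified.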

@asmeurer
Contributor

@costrouc What do you see a "conda-store --standalone" test looking like? Should we have an integration test that checks the actual command line conda-store-server --standalone to make sure it runs and creates the necessary files?

Otherwise, to me, the existing tests already are functionally standalone tests, because they run without using docker, and use the default configuration, which is the standalone configuration.

I agree we should set up a Mac CI build. I will work on that.

asmeurer added a commit to asmeurer/conda-store that referenced this issue Aug 25, 2023
This is built on conda-incubator#549 but
I've made in a separate PR because I'm not sure if there will be other issues
here and I don't want to block that PR on this (but at the same time, tests
won't pass on Mac without the changes from that PR).

See conda-incubator#513 and conda-incubator#507
@asmeurer
Contributor

Anyway, if we want to add a test that actually runs conda-store --standalone, we might need to fix #513 (comment) first.

@costrouc
Member

Otherwise, to me, the existing tests already are functionally standalone tests, because they run without using docker, and use the default configuration, which is the standalone configuration.

Yeah I agree. We should be able to somehow just use the existing tests. Some tests I do expect to fail since we are using a sqlite backend instead of redis for celery which makes build canceling not work (e.g. build canceling and viewing active tasks).
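To make the SQLite setup concrete, here is a small sketch of Celery settings that keep both the message queue and the task results in one SQLite file. The helper name `sqlite_celery_settings` and the file name are hypothetical, and this is not necessarily how conda-store builds its own configuration. The `sqla+` and `db+` URL prefixes are Celery's SQLAlchemy broker transport and database result backend.

```python
import pathlib


def sqlite_celery_settings(db_path):
    """Build Celery settings that keep the queue and results in one SQLite file.

    The values are illustrative; only the URL schemes come from Celery.
    """
    path = pathlib.Path(db_path).resolve().as_posix()
    return {
        # SQLAlchemy broker transport: tasks are queued in the database.
        "broker_url": f"sqla+sqlite:///{path}",
        # Database result backend: needed for chords, per the error earlier
        # in this thread ("Starting chords requires a result backend").
        "result_backend": f"db+sqlite:///{path}",
    }


settings = sqlite_celery_settings("conda-store.sqlite")
```

These settings could be applied to a Celery application with `app.conf.update(settings)`.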

@asmeurer
Contributor

I think the biggest issue here is the way that --standalone actually starts the worker process (see #513 (comment)). As I noted in #513 (comment), I believe it should be more properly daemonized using something like supervisord.

Another question relates to what behavior we want conda-store-server to have by default. Right now, all the default configuration is designed around standalone operation. This decision dates back to #418 (possibly other places too), and I generally agree with it, because it means that you don't have to worry about a configuration file for standalone mode. But that also raises the question of whether the --standalone flag itself should be required, or if conda-store-server should just default to standalone mode. I guess I'm not completely clear what the expected user interaction with conda-store-server in standalone mode would be.

Incidentally, I also tried manually running conda-store-server and conda-store-worker separately. This seems to work (although note that if you are using iTerm2, you need to run ITERM_PROFILE= conda-store-worker until celery/celery#8379 gets released). Indeed, aside from the fact that you have to use two separate processes, this appears, at least at first blush, to be more robust than the current --standalone approach.

But I also noticed that if you just run the server and don't start the worker, then everything appears to work. The GUI loads and it appears as if you can build an environment, but if you try to do so, it just spins. It does log the environment into the database, meaning it will continue to appear in the GUI in future iterations as a failed build. Is there an existing issue open for this? I didn't find one. If not, should I open one? I think the server should check if there are workers active and fail if it can't find any, and there also should perhaps be some sort of heartbeat to ensure the worker process remains alive.
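A sketch of what such a startup check might look like is below. The names `require_workers` and `NoWorkersError` are hypothetical. The ping function is passed in as an argument because remote-control commands may not be available with every broker (the thread above notes that viewing active tasks does not work with the SQLite backend).

```python
from typing import Callable, Sequence


class NoWorkersError(RuntimeError):
    """Raised when no worker answers the liveness check."""


def require_workers(ping: Callable[[], Sequence[object]]) -> int:
    """Return the number of responding workers, or raise if there are none.

    ``ping`` is injected so the check does not depend on a particular broker.
    With Celery it could be ``lambda: app.control.ping(timeout=1.0)``, which
    returns one reply per worker that answered within the timeout.
    """
    replies = ping()
    if not replies:
        raise NoWorkersError("no workers responded; builds would never start")
    return len(replies)
```

Calling the same function on a timer would give a simple heartbeat: if it starts raising, the server can report that the worker process has gone away instead of leaving builds spinning.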

asmeurer added a commit to asmeurer/conda-store that referenced this issue Aug 30, 2023
This is built on conda-incubator#549 but
I've made in a separate PR because I'm not sure if there will be other issues
here and I don't want to block that PR on this (but at the same time, tests
won't pass on Mac without the changes from that PR).

See conda-incubator#513 and conda-incubator#507
@asmeurer
Contributor

I think the server should check if there are workers active and fail if it can't find any, and there also should perhaps be some sort of heartbeat to ensure the worker process remains alive.

Related to this, if you try to run the server without starting the workers with a fresh database file, you get this error when trying to create an environment in the UI.

INFO:     127.0.0.1:63982 - "POST /api/v1/specification/ HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/anyio/streams/memory.py", line 98, in receive
    return self.receive_nowait()
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/anyio/streams/memory.py", line 93, in receive_nowait
    raise WouldBlock
anyio.WouldBlock

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/base.py", line 78, in call_next
    message = await recv_stream.receive()
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/anyio/streams/memory.py", line 118, in receive
    raise EndOfStream
anyio.EndOfStream

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/fastapi/applications.py", line 292, in __call__
    await super().__call__(scope, receive, send)
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/base.py", line 108, in __call__
    response = await self.dispatch_func(request, call_next)
  File "/Users/aaronmeurer/Documents/conda-store/conda-store-server/conda_store_server/server/app.py", line 235, in conda_store_middleware
    response = await call_next(request)
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/base.py", line 84, in call_next
    raise app_exc
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/base.py", line 70, in coro
    await self.app(scope, receive_or_disconnect, send_no_error)
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/sessions.py", line 86, in __call__
    await self.app(scope, receive, send_wrapper)
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/cors.py", line 91, in __call__
    await self.simple_response(scope, receive, send, request_headers=headers)
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/cors.py", line 146, in simple_response
    await self.app(scope, receive, send)
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/fastapi/routing.py", line 273, in app
    raw_response = await run_endpoint_function(
  File "/Users/aaronmeurer/anaconda3/envs/conda-store-server-dev/lib/python3.10/site-packages/fastapi/routing.py", line 190, in run_endpoint_function
    return await dependant.call(**values)
  File "/Users/aaronmeurer/Documents/conda-store/conda-store-server/conda_store_server/server/views/api.py", line 574, in api_post_specification
    build_id = conda_store.register_environment(
  File "/Users/aaronmeurer/Documents/conda-store/conda-store-server/conda_store_server/app.py", line 588, in register_environment
    self.validate_action(
  File "/Users/aaronmeurer/Documents/conda-store/conda-store-server/conda_store_server/app.py", line 68, in conda_store_validate_action
    ) and (settings.storage_threshold > system_metrics.disk_free):
AttributeError: 'NoneType' object has no attribute 'disk_free'

I believe the issue is that the server is trying to read something from the database (the amount of free space on the disk) which should have been written by a worker. I wouldn't be surprised if this error is also possible under normal working conditions if a race condition somehow occurred that delayed the system metrics task.
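One possible guard, sketched with hypothetical names (`SystemMetrics` and `check_storage` are stand-ins, not the project's actual classes), is to treat a missing metrics row as its own error case before comparing against the threshold:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SystemMetrics:
    """Hypothetical stand-in for the metrics row a worker would write."""

    disk_free: int  # bytes


def check_storage(metrics: Optional[SystemMetrics], storage_threshold: int) -> None:
    """Raise a descriptive error instead of failing on a missing metrics row."""
    if metrics is None:
        # No worker has reported yet, which is the case hit in the traceback.
        raise RuntimeError("no system metrics recorded yet; is a worker running?")
    if storage_threshold > metrics.disk_free:
        raise RuntimeError(
            f"free disk space {metrics.disk_free} is below the threshold {storage_threshold}"
        )
```

This turns the AttributeError into a message that points at the likely cause. Whether the request should fail or proceed when metrics are missing is a design decision this sketch does not settle.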

Also, for whatever reason, when I do this same thing on Windows, I get the 500 error, but no traceback is printed to the terminal. My guess is this is an issue with one of the third-party packages being used (there are at least 4 packages present in the above traceback), but it makes 500 errors extremely annoying to diagnose.

trallard pushed a commit that referenced this issue Oct 9, 2023
* Add a macOS worker to the CI

This is built on #549 but
I've made in a separate PR because I'm not sure if there will be other issues
here and I don't want to block that PR on this (but at the same time, tests
won't pass on Mac without the changes from that PR).

See #513 and #507

* Trigger build

* Try not using mamba to fix macos CI

* Add a separate macos environment file without conda-docker

* Use the macos environment file for macos on CI

* Try using mamba on macos again

* Revert "Try using mamba on macos again"

This reverts commit 031574a.

* Install mamba in the dev environment

* Go back to using mamba but with the correct syntax this time

Revert "Revert "Try using mamba on macos again""

This reverts commit 6cf2877.

* Fix CI

Revert "Go back to using mamba but with the correct syntax this time"

This reverts commit b82c0c8.

* Only test docker on Linux
@asmeurer
Contributor

asmeurer commented Oct 17, 2023

To summarize the remaining issues here:

  • Do we need --standalone at all? All it does is run both the server and the worker in the same process, but it's not hard to run them separately.

  • If we do want it, it's possible to forcibly Control-C out of the standalone server in a way that leaves the worker processes running. See #513 (comment). However, I've also noticed that this is partly an issue with celery itself: even when the worker is run separately, cancellation doesn't work very smoothly. The difference is that when workers are run separately, the process doesn't end until the workers are actually shut down. It's not clear how important this issue is. It comes up during development, but I don't know if it comes up in deployment.

  • If you run the server separately from the worker, but don't start the worker, you get a bunch of errors (e.g. #513 (comment)). In general, the server doesn't seem to check whether the worker is actually running.

Also some related issues:

@costrouc
Member

@nkaretnikov needs to update the status on this issue.

@nkaretnikov
Contributor

  • Linux, Windows, macOS should work in standalone mode.
  • I'd like to test all three again at some point, just to be sure. I had no time to do it last week. I remember seeing an issue on either Intel or ARM64 Macs, where our conda env couldn't be built because of a missing dep. It might already be fixed. I suspect it's on M1 because we have no CI tests for it.
  • Windows has the MAX_PATH limit, which currently can be bypassed with admin access, which might require us changing our build_path (this is WIP). A related issue is the conda-store prefix limit (also WIP). See [META] - Path length issues #650.
  • The port issue I'm not aware of and would need to take a look.
  • I'm making some improvements to make it easier to run in standalone mode (nice UI, no config needed). Ideally, we should have a one-command wrapper that would start both the UI and server (with matching versions), to make it easier for people to run. UI is very slow to start when I run yarn run start.
  • There's some stuff about the --standalone param not being needed. I'm going to ignore this since it's low prio, we can fix this later.

@kcpevey
Contributor

kcpevey commented Nov 7, 2023

Just run through manually testing on each system. If all looks good, this should be complete.

@asmeurer
Contributor

asmeurer commented Nov 7, 2023

There's also the issue on Mac where if you Control-C too hard, it will exit the server and apparently be stopped, but the worker tasks will still be running in the background. They will need to wait until they get the cancellation request and stop themselves. The apparent fix is to run the worker as a daemon (#513 (comment)). This shouldn't affect ordinary operation but it can mean weird things can happen after shutting down conda-store if the workers aren't actually stopped all the way.

I'm not sure if this also happens on Linux or Windows. It might be related to how "fork" works on Mac.

@nkaretnikov
Contributor

I'll test and open new issues for everything mentioned above (if there are no open issues already).

@nkaretnikov
Contributor

Status update: still need to follow up here, but this is at the very top of my list of things to do.

@nkaretnikov
Contributor

Status update:

Tested standalone again on 5e4e2e5: Linux, Windows 11 (ARM64), macOS (Intel, ARM64) work. On macOS ARM64, I had to comment out playwright from the dev yaml file, see #630. I was able to build this env:

channels:
- conda-forge
dependencies:
- ipython>=8.15.0
description: test
name: test
prefix: null
variables: null

I've read through the entire issue and identified action items. All of them are in this comment:

These 3 issues will be fixed as part of the same milestone, so I'm going to close this one.

Projects
Archived in project
Development

No branches or pull requests

5 participants