[FEA] Add example of workflow deployment using metaflow #353

Open
raybellwaves opened this issue Mar 21, 2024 · 1 comment
Labels
feature request New feature or request

Comments

@raybellwaves
Contributor

Is your feature request related to a problem? Please describe.
It would be useful to add an example of a workflow deployment using Metaflow to https://docs.rapids.ai/deployment/stable/examples/

Describe the solution you'd like
It could be a notebook, but Metaflow works with .py files. The notebook could still contain the code that generates the .py file and show how to run it.
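
A minimal sketch of that idea, using only the standard library so it can sit in a notebook cell. The flow body, the file name `hello_flow.py`, and the helpers `write_flow_file` and `run_command` are hypothetical placeholders; a real example would write out the RAPIDS flow instead. The only Metaflow-specific assumption is the usual `python <file> run` invocation.

```python
import sys
import tempfile
from pathlib import Path

# Hypothetical placeholder flow; a real notebook would hold the RAPIDS flow here.
FLOW_SOURCE = '''\
from metaflow import FlowSpec, step


class HelloFlow(FlowSpec):

    @step
    def start(self):
        self.next(self.end)

    @step
    def end(self):
        pass


if __name__ == "__main__":
    HelloFlow()
'''


def write_flow_file(directory, filename="hello_flow.py"):
    """Write the flow source to a .py file and return its path."""
    path = Path(directory) / filename
    path.write_text(FLOW_SOURCE)
    return path


def run_command(path):
    """Build the command that runs the flow: `python <file> run`."""
    return [sys.executable, str(path), "run"]


flow_path = write_flow_file(tempfile.mkdtemp())
command = run_command(flow_path)
print(" ".join(command))

# To launch the flow from the notebook (requires metaflow to be installed):
#   import subprocess
#   subprocess.run(command, check=True)
```

The launch itself is left as a comment so the cell still runs in an environment without Metaflow.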

Describe alternatives you've considered
NA

Additional context
I have experience with this and am happy to create a PR. Folks at Outerbounds may have thoughts as well, cc @hugobowne @tuulos @emattia. The notebook could also simply be cross-posted to https://outerbounds.com/blog/nvidia-cloud-gpu-announcement/ or to other documents on using GPUs with Metaflow.

@raybellwaves added the "feature request" label Mar 21, 2024
@emattia

emattia commented Mar 21, 2024

Nice idea @raybellwaves, lmk how I can help. Here is a starter template I've used recently:

from metaflow import FlowSpec, step, kubernetes
from metaflow.profilers import gpu_profile


class RapidsFlow(FlowSpec):

    @step
    def start(self):
        self.next(self.train)

    @gpu_profile(interval=1)
    @kubernetes(
        cpu=4,
        memory=22000,
        gpu=1,
        image="docker.io/rapidsai/rapidsai:23.06-cuda11.8-runtime-ubuntu22.04-py3.10",
    )
    @step
    def train(self):
        """https://docs.rapids.ai/api/cuml/stable/estimator_intro/"""

        import cuml
        from cuml.datasets.classification import make_classification
        from cuml.model_selection import train_test_split
        from cuml.ensemble import RandomForestClassifier as cuRF

        n_samples = 5_000_000
        n_features = 100
        n_classes = 2

        # random forest depth and size
        n_estimators = 25
        max_depth = 10

        # generate synthetic data [ binary classification task ]
        X, y = make_classification(
            n_classes=n_classes,
            n_features=n_features,
            n_samples=n_samples,
            random_state=0,
        )

        self.X_train, self.X_test, self.y_train, self.y_test = train_test_split(
            X, y, random_state=0
        )

        model = cuRF(max_depth=max_depth, n_estimators=n_estimators, random_state=0)

        self.trained_RF = model.fit(self.X_train, self.y_train)
        self.predictions = model.predict(self.X_test)

        self.next(self.eval)

    @kubernetes(
        cpu=1,
        memory=12000,
        gpu=1,
        image="docker.io/rapidsai/rapidsai:23.06-cuda11.8-runtime-ubuntu22.04-py3.10",
    )
    @step
    def eval(self):
        import cuml
        from cupy import asnumpy
        from sklearn.metrics import accuracy_score

        cu_score = cuml.metrics.accuracy_score(self.y_test, self.predictions)
        sk_score = accuracy_score(asnumpy(self.y_test), asnumpy(self.predictions))

        print(" cuml accuracy: ", cu_score)
        print(" sklearn accuracy : ", sk_score)

        self.next(self.end)

    @step
    def end(self):
        pass


if __name__ == "__main__":
    RapidsFlow()
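
For reference, the two scores printed in the eval step should agree, since both report the fraction of predictions that match the true labels. A plain-Python sketch of that metric follows; `accuracy` is a hypothetical helper for illustration, not the cuML or scikit-learn implementation.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that equal the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


labels = [0, 1, 1, 0]
predictions = [0, 1, 0, 0]
score = accuracy(labels, predictions)
print(score)  # 0.75 (three of four predictions match)
```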

We could cross-post an example to https://outerbounds.com/docs/how-to-index/, or write a blog post if we build something closer to a real-world reference.

@raydouglass raydouglass transferred this issue from rapidsai/docs Apr 9, 2024