Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pyreverse Aggregation and Composition arrows are same #6543

Closed
khashashin opened this issue May 7, 2022 · 12 comments
Closed

pyreverse Aggregation and Composition arrows are same #6543

khashashin opened this issue May 7, 2022 · 12 comments
Assignees
Labels
Enhancement ✨ Improvement to a component Hacktoberfest Needs PR This issue is accepted, sufficiently specified and now needs an implementation pyreverse Related to pyreverse component
Milestone

Comments

@khashashin
Copy link

khashashin commented May 7, 2022

Bug description

pyreverse generates same diagram for aggregation and composition. Here is the sample code:

# Aggregation
class Student:
    def __init__(self, id):
        self._id = id

    def registration_number(self, department_id):
        return str(self._id) + '-' + department_id


class Department:
    def __init__(self, id, student):
        self._id = id
        self._student = student

    def student_registration(self):
        return self._student.registration_number(self._id)


if __name__ == '__main__':
    student = Student(50)
    department = Department('ENG', student)

    print(department.student_registration())

# Composition
class Student:
    def __init__(self, id):
        self._id = id

    def registration_number(self, department_id):
        return str(self._id) + '-' + department_id


class Department:
    def __init__(self, department_id, student_id):
        self._id = department_id
        self._student = Student(student_id)

    def student_registration(self):
        return self._student.registration_number(self._id)


if __name__ == '__main__':
    department = Department('ENG', 50)

    print(department.student_registration())

Configuration

No response

Command used

pyreverse -o png -p composition.py .
pyreverse -o png -p aggregation.py .

Pylint output

Format png is not supported natively. Pyreverse will try to generate it using Graphviz...
parsing ./__init__.py...
parsing /mnt/d/GitHub/test/django_with_sass/experiments/composition.py...
parsing /mnt/d/GitHub/test/django_with_sass/experiments/__init__.py...

Format png is not supported natively. Pyreverse will try to generate it using Graphviz...
parsing ./__init__.py...
parsing /mnt/d/GitHub/test/django_with_sass/experiments/aggregation.py...
parsing /mnt/d/GitHub/test/django_with_sass/experiments/__init__.py...

Expected behavior

Different Arrow type.

image
Source

Pylint version

pylint --version
pylint 2.13.8
astroid 2.11.4
Python 3.10.0 (default, Dec  3 2021, 00:34:18) [GCC 9.3.0]

OS / Environment

Windows 11
WSL2

Additional dependencies

sudo apt install graphviz

@khashashin khashashin added the Needs triage 📥 Just created, needs acknowledgment, triage, and proper labelling label May 7, 2022
@Pierre-Sassoulas Pierre-Sassoulas added Enhancement ✨ Improvement to a component pyreverse Related to pyreverse component and removed Needs triage 📥 Just created, needs acknowledgment, triage, and proper labelling labels May 8, 2022
@Pierre-Sassoulas Pierre-Sassoulas added the Needs PR This issue is accepted, sufficiently specified and now needs an implementation label Jul 8, 2022
@dgutson
Copy link

dgutson commented Oct 27, 2022

Please assign this to @qequ

@Pierre-Sassoulas
Copy link
Member

Do you want me to assign this to you @qequ ? 😅

@qequ
Copy link
Contributor

qequ commented Oct 28, 2022

@Pierre-Sassoulas please assign this to me 👍

@qequ
Copy link
Contributor

qequ commented Nov 17, 2022

@Pierre-Sassoulas @DudeNr33 I've been working on this and I have some doubts about astroid/pyreverse and the way associations edges are created.

The way pyreverse currently works is checking in the dicts of instance_attrs_type and locals_type for attributes that are objects and then creating the relationship between the current node and the object that is an attribute.

For aggregations, that are loosely coupled relationships and that each object is independent from each other, I think a good way to identify them would be to check if the arguments are objects and using a type hint like this for the previous example

# Aggregation
class Student:
   def __init__(self, id):
       self._id = id

   def registration_number(self, department_id):
       return str(self._id) + '-' + department_id


class Department:
   def __init__(self, id, student: Student):
       self._id = id
       self._student = student

   def student_registration(self):
       return self._student.registration_number(self._id)


if __name__ == '__main__':
   student = Student(50)
   department = Department('ENG', student)

   print(department.student_registration())

and then you get a ClassDef to differentiate from an instance like the one in composition.

The thing is, if I use type hints over an instance like the following snippet

class Department:
    def __init__(self, department_id, student_id):
        self._id = department_id
        self._student: Student = Student(student_id)

    def student_registration(self):
        return self._student.registration_number(self._id)

the attribute _student is identified as a ClassDef instead of a Instance. I've lost lost the capability to difference between an aggregation and a composition.
Is there a way to keep astroid from setting the Instance as a ClassDef? or better, Is there a way to get the arguments of the methods and check if there are any objects to mark as aggregations?

@Pierre-Sassoulas
Copy link
Member

My first instinct would be to tell you to use inference (NodeNG.infer()).

@DudeNr33
Copy link
Collaborator

Thanks @qequ for taking this on!

I will have to take a closer look at this and hope that I find the time for that on the weekend.
To add to Pierre's comment, my first thoughts are:

  • why is the _student attribute identified as a ClassDef? This itself seems like a bug to me.
  • I would try to differentiate between the composition and aggregation case by trying to find out if the instance was created in the scope of the class or not. If yes --> Composition, else --> Aggregation. Relying on type hints has the disadvantage that this feature will only benefit those users who actually use them.

@qequ
Copy link
Contributor

qequ commented Nov 18, 2022

@DudeNr33
The _student attribute identified as a Classdef happens when using type hints.
Having an aggregation with no type hints like this

# Aggregation
class Student:
    def __init__(self, id):
        self._id = id

    def registration_number(self, department_id):
        return str(self._id) + '-' + department_id


class Department:
    def __init__(self, id, student):
        self._id = id
        self._student = student

    def student_registration(self):
        return self._student.registration_number(self._id)


if __name__ == '__main__':
    student = Student(50)
    department = Department('ENG', student)

    print(department.student_registration())

_student is identified as Uninferable and no association is created

classes

When using type hints like in the example I showed in the previous message _student is identified as a ClassDef. I didn't know this was a bug. I'm not familiar with how astroid works, I thought this was normal behaviour 😬.

About the second point, my first thought was that: finding a way to check if the instance had been created in the instance of the class, but I coudn't find a way to. Neither ClassEntity nor NodeNG seems to keep information about the instance creation to check if it was created in the scope or not, that's why I tried with type hints. Is there any way to get that information?

@DudeNr33
Copy link
Collaborator

I tried around myself a bit.
To distinguish if the value of the attribute is a class instance created inside the scope of the class itself or not, I would start around here:
https://github.com/PyCQA/pylint/blob/57f38c39ae8df066212df75d684f8608b9f41b9e/pylint/pyreverse/inspector.py#L179-L184

You will need to inspect the parent of the assignattr and look at the value that is assigned:

assignattr.parent.value

For the following code

class Dependency:
    pass


class Compositor:
    def __init__(self):
        self.composition = Dependency()


class Aggregator:
    def __init__(self, dep: Dependency):
        self.aggregation = dep

you should see that the composition attribute of Compositor will have a Call object as assignattr.parent.value. You can then try to infer the value of the Call's func attribute, which in this case would be the ClassDef of Dependency:

set(assignattr.parent.value.func.infer())  # --> {<ClassDef.Dependency...10ae4c700>}

If you do the same for the aggregation of Aggregator, you would find that assignattr.parent.value is not a Call, but a Name (at least in the example above).

This is just a hint where and how you could start, feel free to explore different approaches.

Also this would only handle the simplest case, where the creation of the object takes place directly in the assignment to the attribute itself. One could think of more complicated examples, like creating the object in a helper method and assigning the return value to the attribute:

class Compositor:
    def __init__(self):
        self.composition = self._create_dep()
    def _create_dep(self) -> Dependency:
        return Dependency()

Or storing the created object first in a intermediate local variable before assigning it:

class Compositor:
    def __init__(self):
        intermediary = Dependency()
        self.composition = intermediary

Those cases would require more logic.
However, I would start with the simple case from above. Realizing this feature will require also enhancements to other parts of pyreverse to distinguish between composition and aggregation, so the final pull request won't be too small.
So I would propose to first implement only the simple case, make a PR, and then if needed enhance the logic to cover more complex cases in later PRs.

@Pierre-Sassoulas
Copy link
Member

Closing as #7836 was merged, thank you @qequ, great feature !

@qequ
Copy link
Contributor

qequ commented Nov 27, 2022

@DudeNr33 @Pierre-Sassoulas should we open a new issue for the cases that @DudeNr33 pointed out that pyreverse can´t handle yet?

class Compositor:
    def __init__(self):
        self.composition = self._create_dep()
    def _create_dep(self) -> Dependency:
        return Dependency()
class Compositor:
    def __init__(self):
        intermediary = Dependency()
        self.composition = intermediary

@Pierre-Sassoulas
Copy link
Member

I don't know if we want to open this can of worms. If we start to handle those two "simple" cases we'll have to handle a large number of arbitrarily hard (or rather Turing complete hard) other use cases. I'll let @DudeNr33 decide.

@DudeNr33
Copy link
Collaborator

We don't have a policy that you must open an issue in order to create a pull request.

I would open an issue first if

  • the proposed change needs discussion and decision before implementation - for example a change where you're not sure whether a direct PR would be accepted or not
  • you have a specific bug/feature request you'd like to see implemented, but are not able to provide a PR yourself right away

In other cases feel free to directly open a PR. If the change requires a changelog entry, you can first create the PR itself, and then add a changelog entry referencing the PR number instead of an issue and push that in a second commit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement ✨ Improvement to a component Hacktoberfest Needs PR This issue is accepted, sufficiently specified and now needs an implementation pyreverse Related to pyreverse component
Projects
None yet
Development

No branches or pull requests

6 participants