Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Auto Techsupport] Event driven Techsupport Changes #1796

Merged
merged 72 commits into from
Nov 16, 2021

Conversation

vivekrnv
Copy link
Contributor

@vivekrnv vivekrnv commented Sep 3, 2021

What I did

sonic-utilities changes required for feature "Event Driven TechSupport Invocation & CoreDump Mgmt". HLD

Summary of the changes:

  • Added the AUTO GEN CLI for the CFG DB tables required for this feature
  • Added the coredump_gen_handler.py & techsupport_cleanup.py scripts.
  • Added the UT's required for these scripts.
  • Enhanced coredump-compress & generate-dump scripts

How I did it

How to verify it

tests/coredump_gen_handler_test.py::TestCoreDumpCreationEvent::test_core_dump_cleanup PASSED [ 22%]
tests/coredump_gen_handler_test.py::TestCoreDumpCreationEvent::test_core_dump_with_exit_event_unknown_cmd PASSED [ 22%]
tests/coredump_gen_handler_test.py::TestCoreDumpCreationEvent::test_core_dump_with_no_exit_event PASSED [ 22%]
tests/coredump_gen_handler_test.py::TestCoreDumpCreationEvent::test_feature_table_not_set PASSED [ 22%]
tests/coredump_gen_handler_test.py::TestCoreDumpCreationEvent::test_global_rate_limit_interval PASSED [ 22%]
tests/coredump_gen_handler_test.py::TestCoreDumpCreationEvent::test_invalid_since_argument PASSED [ 22%]
tests/coredump_gen_handler_test.py::TestCoreDumpCreationEvent::test_invoc_ts_after_rate_limit_interval PASSED [ 22%]
tests/coredump_gen_handler_test.py::TestCoreDumpCreationEvent::test_invoc_ts_state_db_update PASSED [ 22%]
tests/coredump_gen_handler_test.py::TestCoreDumpCreationEvent::test_max_core_size_limit_not_crossed PASSED [ 22%]
tests/coredump_gen_handler_test.py::TestCoreDumpCreationEvent::test_per_proc_rate_limit_interval PASSED [ 22%]
tests/coredump_gen_handler_test.py::TestCoreDumpCreationEvent::test_since_argument PASSED [ 22%]
tests/techsupport_cleanup_test.py::TestTechsupportCreationEvent::test_dump_cleanup PASSED [ 79%]
tests/techsupport_cleanup_test.py::TestTechsupportCreationEvent::test_no_cleanup_state_disabled PASSED [ 79%]
tests/techsupport_cleanup_test.py::TestTechsupportCreationEvent::test_no_cleanup_state_enabled PASSED [ 80%]
tests/techsupport_cleanup_test.py::TestTechsupportCreationEvent::test_state_db_update PASSED [ 80%]

Previous command output (if the output of a command-line utility has changed)

New command output (if the output of a command-line utility has changed)


admin@sonic:~$ show auto-techsupport global
STATE      RATE LIMIT INTERVAL    MAX TECHSUPPORT SIZE    MAX CORE SIZE  SINCE
-------  ---------------------  ----------------------  ---------------  ----------
enabled                    180                      10                5  2 days ago

admin@sonic:~$ show auto-techsupport-feature 
FEATURE NAME    STATE       RATE LIMIT INTERVAL
--------------  --------  ---------------------
bgp             enabled                     600
database        enabled                     600
dhcp_relay      enabled                     600
lldp            enabled                     600
macsec          enabled                     600
mgmt-framework  enabled                     600
nat             enabled                     600
pmon            enabled                     600
radv            enabled                     600
restapi         disabled                    800
sflow           enabled                     600
snmp            enabled                     600
swss            disabled                    800
syncd           enabled                     600
teamd           enabled                     600
telemetry       enabled                     600
admin@sonic:~$ show auto-techsupport history

TECHSUPPORT DUMP                          TRIGGERED BY    CORE DUMP
----------------------------------------  --------------  -----------------------------
sonic_dump_r-lionfish-16_20210901_221402  bgpcfgd         bgpcfgd.1630534439.55.core.gz
sonic_dump_r-lionfish-16_20210901_203725  snmp-subagent   python3.1630528642.23.core.gz
sonic_dump_r-lionfish-16_20210901_222408  lldpmgrd        python3.1630535045.34.core.gz

vivekrnv and others added 30 commits August 7, 2021 23:30
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
This reverts commit 4fdf805.
This reverts commit 7be9ee4.
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
if [[ ! -z $CONTAINER_ID ]]; then
CONTAINER_NAME=$(docker inspect --format='{{.Name}}' ${CONTAINER_ID} | cut -c2-)
if [[ ! -z ${CONTAINER_NAME} ]]; then
# coredump_gen_handler invokes techsupport if all the other required conditions are met
Copy link
Contributor Author

@vivekrnv vivekrnv Oct 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@qiluo-msft Regarding the extra space char comment. Do you want the "#" to align with if?

@azure-pipelines
Copy link

Commenter does not have sufficient privileges for PR 1796 in repo Azure/sonic-utilities

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
@vivekrnv
Copy link
Contributor Author

/azpw run

@mssonicbld
Copy link
Collaborator

/AzurePipelines run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@vivekrnv
Copy link
Contributor Author

vivekrnv commented Nov 6, 2021

/azpw run

@mssonicbld
Copy link
Collaborator

/AzurePipelines run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@liat-grozovik
Copy link
Collaborator

@qiluo-msft and @ganglyu could you please review latest code changes?

if [[ ! -z ${CONTAINER_NAME} ]]; then
# coredump_gen_handler invokes techsupport if all the other required conditions are met
# explicitly passing in the env vars because coredump-compress's namespace doesn't have these set by default
for path in $(find /usr/local/lib/python[^2]*/dist-packages -maxdepth 0); do
Copy link
Contributor

@qiluo-msft qiluo-msft Nov 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^2

check python and python3* only? #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only python3.x, Can't involve both python2 & python3 both. imports won't work if i do so

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then check python3* only? you are checking non-2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handled

done
setsid $(echo > /tmp/coredump_gen_handler.log;
export PYTHONPATH=$PYTHONPATH;
python3 /usr/local/bin/coredump_gen_handler.py ${PREFIX}core.gz ${CONTAINER_NAME} &>> /tmp/coredump_gen_handler.log) &
Copy link
Contributor

@qiluo-msft qiluo-msft Nov 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

python3

If add a shebang to the py file, do you still need to use python3 and PYTHONPATH? #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i can avoid python3 but PYTHONPATH is still required.


@AUTO_TECHSUPPORT.command(name="global")
@clicommon.pass_db
def AUTO_TECHSUPPORT_GLOBAL(db):
Copy link
Contributor

@qiluo-msft qiluo-msft Nov 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AUTO_TECHSUPPORT_GLOBAL

Do you have a strong reason to use UPPER_CASE function name? We prefer lower_case. #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's auto_generated by SONiC Auto CLI gen. #1644

@vivekrnv
Copy link
Contributor Author

/azpw run

@mssonicbld
Copy link
Collaborator

/AzurePipelines run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@vivekrnv
Copy link
Contributor Author

/azpw run

@mssonicbld
Copy link
Collaborator

/AzurePipelines run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@dgsudharsan
Copy link
Collaborator

@qiluo-msft Can you please sign off if all comments are addressed?

@qiluo-msft qiluo-msft merged commit a3e34e3 into sonic-net:master Nov 16, 2021
qiluo-msft pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Nov 16, 2021
#### Why I did it

Changes required for feature "Event Driven TechSupport Invocation & CoreDump Mgmt". [HLD](sonic-net/SONiC#818 )

Requires: sonic-net/sonic-utilities#1796.
Merging in any order would be fine.

Summary of the changes:

- Added the YANG Models for the new tables introduces as a part of this feature.
- Enhanced init_cfg.json with the default config required
- Added a compile Time flag which enables/disables the config required for this feature inside the init_cfg.json
- Enhanced the supervisor-proc-exit-listener script to populate `<feature>:<critical_proc> = <comm>:<pid>` info in the STATE_DB when it observes an proc exit notification for the critical processes running inside the docker.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants