Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes for flaky WPF test #3785

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -196,6 +196,11 @@ class BtActionServer
*/
void populateErrorCode(typename std::shared_ptr<typename ActionT::Result> result);

/**
* @brief Setting BT error codes to success. Used to clean blackboard between different BT runs
*/
void cleanErrorCodes();

// Action name
std::string action_name_;

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -230,6 +230,7 @@ void BtActionServer<ActionT>::executeCallback()
{
if (!on_goal_received_callback_(action_server_->get_current_goal())) {
action_server_->terminate_current();
cleanErrorCodes();
return;
}

Expand Down Expand Up @@ -284,6 +285,8 @@ void BtActionServer<ActionT>::executeCallback()
action_server_->terminate_all(result);
break;
}

cleanErrorCodes();
SteveMacenski marked this conversation as resolved.
Show resolved Hide resolved
}

template<class ActionT>
Expand All @@ -310,6 +313,14 @@ void BtActionServer<ActionT>::populateErrorCode(
}
}

template<class ActionT>
void BtActionServer<ActionT>::cleanErrorCodes()
{
for (const auto & error_code : error_code_names_) {
blackboard_->set<unsigned short>(error_code, 0); //NOLINT
}
}

} // namespace nav2_behavior_tree

#endif // NAV2_BEHAVIOR_TREE__BT_ACTION_SERVER_IMPL_HPP_
10 changes: 10 additions & 0 deletions nav2_common/nav2_common/launch/rewritten_yaml.py
Original file line number Diff line number Diff line change
Expand Up @@ -87,6 +87,7 @@ def perform(self, context: launch.LaunchContext) -> Text:
param_rewrites, keys_rewrites = self.resolve_rewrites(context)
data = yaml.safe_load(open(yaml_filename, 'r'))
self.substitute_params(data, param_rewrites)
self.add_params(data, param_rewrites)
self.substitute_keys(data, keys_rewrites)
if self.__root_key is not None:
root_key = launch.utilities.perform_substitutions(context, self.__root_key)
Expand Down Expand Up @@ -121,6 +122,15 @@ def substitute_params(self, yaml, param_rewrites):
yaml_keys = path.split('.')
yaml = self.updateYamlPathVals(yaml, yaml_keys, rewrite_val)

def add_params(self, yaml, param_rewrites):
# add new total path parameters
yaml_paths = self.pathify(yaml)
for path in param_rewrites:
if not path in yaml_paths:
new_val = self.convert(param_rewrites[path])
yaml_keys = path.split('.')
if 'ros__parameters' in yaml_keys:
yaml = self.updateYamlPathVals(yaml, yaml_keys, new_val)

def updateYamlPathVals(self, yaml, yaml_key_list, rewrite_val):
for key in yaml_key_list:
Expand Down
5 changes: 4 additions & 1 deletion nav2_system_tests/src/waypoint_follower/test_case_py.launch
Original file line number Diff line number Diff line change
Expand Up @@ -41,10 +41,13 @@ def generate_launch_description():
params_file = os.path.join(bringup_dir, 'params/nav2_params.yaml')

# Replace the `use_astar` setting on the params file
param_substitutions = {
'controller_server.ros__parameters.FollowPath.prune_distance': '1.0',
'controller_server.ros__parameters.FollowPath.forward_prune_distance': '1.0'}
configured_params = RewrittenYaml(
source_file=params_file,
root_key='',
param_rewrites='',
param_rewrites=param_substitutions,
convert_types=True)

context = LaunchContext()
Expand Down
2 changes: 0 additions & 2 deletions nav2_system_tests/src/waypoint_follower/tester.py
Original file line number Diff line number Diff line change
Expand Up @@ -236,9 +236,7 @@ def main(argv=sys.argv[1:]):
# stop on failure test with bogous waypoint
test.setStopFailureParam(True)
bwps = [[-0.52, -0.54], [100.0, 100.0], [0.58, 0.52]]
starting_pose = [-2.0, -0.5]
test.setWaypoints(bwps)
test.setInitialPose(starting_pose)
result = test.run(True, False)
assert not result
result = not result
Expand Down
10 changes: 6 additions & 4 deletions nav2_waypoint_follower/src/waypoint_follower.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -217,7 +217,10 @@ WaypointFollower::followWaypoints()
feedback->current_waypoint = goal_index;
action_server_->publish_feedback(feedback);

if (current_goal_status_.status == ActionStatus::FAILED) {
if (
current_goal_status_.status == ActionStatus::FAILED ||
current_goal_status_.status == ActionStatus::UNKNOWN)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really a failed case if unknown from this timeout? Seems like we should instead temporarily increase the timeout back to 15 minutes (@pepisg ) while working through a potential patch to rclcpp to better keep actively processing goals around.

Copy link
Contributor

@pepisg pepisg Aug 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exposed that as a parameter in #3787 that defaults to 15 min

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlexeyMerzlyakov do you think we should remove this bit and instead use #3787 as the temp work around?

Copy link
Collaborator Author

@AlexeyMerzlyakov AlexeyMerzlyakov Aug 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is complementary with PR3787, that improves the situation and fixes some bugs, that 3787 does not. So both PR-s ought to be merged (moreover, new parameters in RewrittenYaml I plan to use to comb-up Costmap Filters system tests).

Copy link
Member

@SteveMacenski SteveMacenski Aug 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlexeyMerzlyakov I meant only the addition of || current_goal_status_.status == ActionStatus::UNKNOWN. I thought this line was to mask the unknown return type problem? If we increase the timeout, wouldn't that make it so we didn't see UNKNOWN? This logic treats UNKNOWN as a terminal condition failure, which I suppose is an interpretation. I could get behind that but:

If we keep this here should https://github.com/ros-planning/navigation2/pull/3785/files#diff-872945d6a63001626ab7379639b5b3d5f11982ea3bdf08da01e7f111a9c2dfecR279 be changed for incrementing new goals? You're treating UNKNOWN as a terminal condition in your change, so this doesn't make as much sense anymore as a potentially non-terminal condition, correct? Is there anywhere UNKNOWN is potentially set where its not terminal?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This status is being treated as terminal one for WPF server in order to avoid infinite cycles whether for some reasons the goal got into "UNKNOWN" state. This is intended to fix/mask not only the case when the goal is getting after it was destroyed, but rather for any other potential problems. Briefly looking to RCLCPP and RCL code shows that UNKNOWN state might appear in case if goal was invalidated, destroyed, does not exists at the moment of if the transition was invalid or any other error occurred. So, receiving UNKNOWN goal for action server seems to be treated as problem state, if I got the RCLCPP / RCL intention correctly.

When I've wrote this change, I've also focused on BtActionNode server, receiving the result from downstream nodes. For this case, "UNKNOWN" status is being treated as error situation, and error is being thrown, interrupting normal operation.
From this point of view, looking on the UNKNOWN goal status as on error seems to be logical.

If we keep this here should https://github.com/ros-planning/navigation2/pull/3785/files#diff-872945d6a63001626ab7379639b5b3d5f11982ea3bdf08da01e7f111a9c2dfecR279 be changed for incrementing new goals?

Yes, this should be removed for UNKNOWN stage, if we will treat UNKNOWNS as errors. Thanks for the notice!
If we will agree on main idea, I will remove it in next update.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup that works for me!

Copy link
Collaborator Author

@AlexeyMerzlyakov AlexeyMerzlyakov Sep 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 85b6b71

{
nav2_msgs::msg::MissedWaypoint missedWaypoint;
missedWaypoint.index = goal_index;
missedWaypoint.goal = goal->poses[goal_index];
Expand Down Expand Up @@ -272,9 +275,7 @@ WaypointFollower::followWaypoints()
}
}

if (current_goal_status_.status != ActionStatus::PROCESSING &&
current_goal_status_.status != ActionStatus::UNKNOWN)
{
if (current_goal_status_.status != ActionStatus::PROCESSING) {
// Update server state
goal_index++;
new_goal = true;
Expand Down Expand Up @@ -324,6 +325,7 @@ WaypointFollower::resultCallback(
current_goal_status_.status = ActionStatus::FAILED;
return;
default:
RCLCPP_ERROR(get_logger(), "Received an UNKNOWN result code from navigation action!");
current_goal_status_.status = ActionStatus::UNKNOWN;
return;
}
Expand Down