Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CDAP-16739] Added hard limit for Avro data file, proper error messag… #406

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

nitinmotgi
Copy link
Contributor

…e redirecting user to use file source connector with format avro. Fixed parse-as-json to highlight error as well as handling integers correctly now

  • test case added
  • test file added
  • error message reviewed.

…e redirecting user to use file source connector with format avro. Fixed parse-as-json to highlight error as well as handling integers correctly now
String msg = e.getCause() != null ? e.getCause().getMessage() : e.getMessage();
// In case there are more rows being handled, we attempt to surface the Json that is
// causing an issue to make it easier for users to identify the problem quickly.
if (rows.size() > 1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't rows always size 1? Otherwise throwing an ErrorRowException would incorrectly mark every Row as an error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not always. If we parsing AvroFile. Parse-as-avro-file will generate records > 1.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but the RecipeExecutor will break that into multiple lists of size 1. Either this is always size 1, or the entire ErrorRowException contract is wrong.

@@ -52,6 +52,40 @@ public void testParseAsAvroFile() throws Exception {
Assert.assertEquals(1495194308245L, results.get(1688).getValue("timestamp"));
}

@Test
public void testAvroWithJsonFieldParsingWithParseJsonIgnoreError() throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these new tests are testing two directives. More specifically, these are not testing any change in behavior in the avro directive, they are just testing the changes in json directive.

Unit tests should be testing isolated components when possible. It would be better to enhance the parse json test. For example, if we remove the parse avro file directive, we could easily remove test coverage of the parse json directive accidentally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a problem customer had. We have other tests for catching parse-as-json problems. This was when a json in avro file have. Using redacted user data as test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and this is a unit test, meaning it should be testing very specific parts of the code. With the ignore-error changed introduced, it should be testing that the json directive returns empty results when given bad data and ignore-error is true.

rows.add(new Row("body", data));

List<Row> results = TestingRig.execute(directives, rows);
Assert.assertEquals(6693, results.size()); // total 6670, 7 records are bad.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these types of asserts are not specific enough. For example, how can we be sure the right rows are filtered? It would be better to just send a single row in as input.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using customer test data to make sure we don't break it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't make sure of anything besides the size

@nitinmotgi
Copy link
Contributor Author

@albertshau PTAL

@nitinmotgi
Copy link
Contributor Author

@albertshau PTAL

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants