First version of getEurostatRCV with tidyr #4
Rewrote getEurostatRCV with tidyr
I tried the new tidyr package for getEurostatRCV, replacing the previously used melt from the
reshape package. The speed improvement is sizable. With the "avia_goincc" data, system.time() gives
with reshape:
       user  system elapsed
      59.52   10.15   70.41
with tidyr:
       user  system elapsed
       0.05    0.00    0.05
The improvement is probably due to dplyr, which tidyr uses. Could tidyr also be used in getEurostatRaw?
The data.frame that the modified function returns is a bit different from the previous version. The variable columns
no longer have (row) names. I think those names duplicated information, and they made the output of str() on
that data.frame confusing and hard to read.
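To illustrate the kind of reshaping involved, here is a minimal sketch of a wide-to-long conversion with tidyr::gather. It is not the code from this pull request: the toy table, its column names (unit, geo, X2012, X2013) and its values are made up for illustration, and the real getEurostatRCV does more than this.

```r
library(tidyr)

# Toy stand-in for a wide Eurostat-style table (hypothetical values):
# one row per combination of dimensions, one column per time period.
wide <- data.frame(
  unit  = c("PAS", "PAS"),
  geo   = c("FI", "SE"),
  X2012 = c(10, 20),
  X2013 = c(11, 21),
  stringsAsFactors = FALSE
)

# Collect every column except the dimension columns into
# a key column (time) and a value column (value).
long <- gather(wide, time, value, -unit, -geo)

# Inspect the structure of the long-format result.
str(long)
```

The result has one row per dimension combination and time period, which is the general shape the description above refers to.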
Ideas:
getEurostatRCV is long and hard to remember. Would get_eurostat be better, with getEurostatRaw made internal?
Or just eurostat, removing the existing eurostat function? Is the stat fin version really needed?
If you accept the idea of using tidyr, maybe you should still merge it into a development branch first, since the modification changes the return value and has not been tested much.