Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First version of getEurostatRCV with tidyr #4

Merged
merged 1 commit into from
Aug 4, 2014

Conversation

jhuovari
Copy link

@jhuovari jhuovari commented Aug 4, 2014

Rewrote getEurostatRCV with tidyr

I tried a new tidyr package for getEurostatRCV and replaced the previously used melt from
reshape package. The speed improved is sizable. With “avia_goincc” data the system.time() gives
with reshape:
user system elapsed
59.52 10.15 70.41

with tidyr:
user system elapsed
0.05 0.00 0.05

Improved is probably due to dplyr that tidyr uses. Could be used also to getEurostatRaw?

The data.frame that the modified function returns is bit different from previous version. Variable columns
no longer have (row)names. It think that was duplication of information and output of str() from
that data.frame was confusing and hard to read.

Ideas:
getEurostatRCV is long and hard to remember. Better get_eurostat and make getEurostatRaw internal?
or only eurostat and remove existing eurostat-function? is stat fin version really needed?

If you accept the idea on using tidyr, mayby you should anyway merge it to some development branch, as modification will change the return value and is not tested that much.

antagomir added a commit that referenced this pull request Aug 4, 2014
First version of getEurostatRCV with tidyr
@antagomir antagomir merged commit 1e29fd4 into rOpenGov:master Aug 4, 2014
@antagomir
Copy link
Member

Excellent !

The whole package is in progress and no stability guarantees are given yet. So I merged straight to master branch. Let us test this during the package development process.

The data.frame that the modified function returns is bit different from previous version. Variable columns
no longer have (row)names. It think that was duplication of information and output of str() from
that data.frame was confusing and hard to read.

This seems ok.

Improved is probably due to dplyr that tidyr uses. Could be used also to getEurostatRaw?
getEurostatRCV is long and hard to remember. Better get_eurostat and make getEurostatRaw internal?

I support the idea of making the raw function internal, and optimizing its performance in the same way. Let us do when the time allows; if you have the chance to make this change you are welcome. These function names then need be changed also from Roxygen documentation and the example vignette.

or only eurostat and remove existing eurostat-function? is stat fin version really needed?

Statfin is a remnant,, perhaps not necessary any more. We have kept it for now but considering the removal. This would at least make the function naming more clear, agreed.

@jhuovari jhuovari deleted the tidyr branch August 11, 2014 12:17
antaldaniel added a commit that referenced this pull request Jun 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants