Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial datetime bindings #200

Merged
merged 44 commits into from
Aug 22, 2018
Merged

Initial datetime bindings #200

merged 44 commits into from
Aug 22, 2018

Conversation

pganssle
Copy link
Member

@pganssle pganssle commented Aug 9, 2018

This is a rough initial crack at the bindings to the CPython datetime API. For the moment it mostly only supports up to Python 3.6, I will make a subsequent PR to make it fully support Python 3.7 (it should work on Python 3.7, but there are a few changes in 3.7).

The Python 2 support is mostly an after-thought, and I have marked several tests as xfail on that platform because for whatever reason errors are not being propagated. This error behavior isn't necessarily integral to the functionality and so it is left to a later PR as well.

Additionally, for whatever reason, even on Python 3 it seems possible to create some malformed dates and times - I have added these tests as xfail as well.

@pganssle
Copy link
Member Author

pganssle commented Aug 9, 2018

It would probably be a good idea to:

  1. Flesh out test_datetime.rs - doing more testing from the Rust side of things
  2. See if it's possible to include the rustapi_module calls int the coverage of the Rust code
  3. Move the datetime module in rustapi_module into its own submodule, to make it clear that more tests of the Python side of the bindings can go in the rustapi_module.

@pganssle
Copy link
Member Author

pganssle commented Aug 9, 2018

The failures (and the xfail on Python 2.7) is due to a change in the implementation of CPython that happened in Python 3.6.

It appears that prior to this change fixing bpo-29100, bounds were not checked as part of calling the C constructor.

For now I'll modify the xfail, and I can backport the functionality into src/objects/datetime.rs later.

}

#[repr(C)]
#[derive(Debug, Copy, Clone)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a special reason why this type and the ones below are Copy?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I'm sure I copy-pasted this. I'm not entirely sure what the consequences are of deriving Copy and Clone, but I imagine that if you memcpy PyDateTime_Date it would be fine, but the other two hold a reference to a PyObject, and so there are reference counting implications of making a memcpy without discarding the original object.

I'll remove at least Copy, but I dunno if Clone should also get the same treatment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, I figured out the reason why it was #[derive(Debug, Copy, Clone)] - it's because I used rust-bindgen to generate the initial bindings.

pub static ref PyDateTimeAPI: PyDateTime_CAPI = unsafe { PyDateTime_IMPORT() };
}

#[inline(always)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless we have specific profiling data, I'm for doing #[inline] and letting the compiler decide instead of #[inline(always)].

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine by me. I went with #[inline(always)] because these are macros in the C API and I figured always-inline better replicates that behavior.


// Accessor functions for PyDateTime_Date
// Note: These have nonsensical names
#[macro_export]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to export this macro? Otherwise I'd remove the #[macro_export]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same goes for PyDateTime_GET_MONTH and PyDateTime_GET_DAY``

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These macros are part of the public C API, which is why I exposed them in the public API here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I'd say they should be not exported for now and export them as part of the ffi2/ffi3 module once the use_extern_macros is stabilized (which is expected with rust 2018)

Copy link
Member Author

@pganssle pganssle Aug 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option is to refactor these as functions. I think it's probably OK to have the function version look like:

pub unsafe fn PyDatetime_GET_MONTH(o: *mut PyObject) -> u32 {
    (*(o as *mut PyDateTime_Date)).data[2] as u32
}

The C memory is laid out specifically to allow this sort of type punning, I just felt a little weird about doing it in Rust.

I struggled mightily to come up with a trait bound that would allow us to restrict these functions to just PyDateTime_Date and PyDateTime_DateTime, but in the end I moved all that stuff into the safe interface and left these as macros.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If functions are possible, than I'm definitely for functions.

}
}

// datetime.datetime bindings
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you give the save wrapper type real rust type doc comments with ///? Maybe even copying a bit from https://docs.python.org/3/library/datetime.html#available-types. Like:

/// A datetime.datetime object, which is a combination of a date and a time
pub struct PyDateTime(PyObject);

raise SystemExit(errno)


def get_py_version_cfgs():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a pretty neat solution 👍


return out_cfg

setup_requires = ['setuptools-rust>=0.6.1']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.6.1 -> 0.10.2

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theoretically we could also get rid of this entirely, since it's preferable to use pyproject.toml, though that would mean that python setup.py test probably wouldn't work (not sure if it works now).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, deleting that line is also fine.

py_type: &str,
args: &str,
) -> (&'p PyObjectRef, &'p PyObjectRef, &'p PyObjectRef) {
macro_rules! unwrap_py {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can get rid of that macro by unwrapping the import directly and replacing the macro with

    let unwrap_py = |e: PyResult<_>| e.map_err(|e| e.print(*py)).unwrap();

#[cfg(Py_3_7)]
pub TimeZone_UTC: *mut PyObject,

pub Date_FromDate: Option<
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are those functions wrapped in an Option?

Copy link
Member Author

@pganssle pganssle Aug 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's been a while since I wrote this code, but I think the reason is that this structure is a global that starts out empty and is only initialized later by a call to PyDateTime_IMPORT(), so during the pre-import phase all of these pointers would be to uninitialized memory.

I think I actually solved this a bit more elegantly by using lazy_static for the global API module, so that the global is initialized on first use.

Having each of them be nullable seems like the right thing to do (since they are nullable and null until imported), though I don't know if I've actually accomplished that properly in this case, or if it matters when the supported way to use this structure is to access the static global, and in the static global it's not likely that these would not be populated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a struct that is never constructed in rust code (just casted to from the import function), so I'd say there's no dangling pointer there. The lazy_static should be sufficient.

@konstin
Copy link
Member

konstin commented Aug 11, 2018

One thing that I found across the code is u32 vs. i32. C's int translates to c_int, which is i32.

let h = hour as c_int;
let mi = minute as c_int;
let s = second as c_int;
let u = microsecond as c_int;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clippy doesn't like this line:

warning: 5th binding whose name is just one char
   --> src/objects/datetime.rs:115:13
    |
115 |         let u = microsecond as c_int;
    |             ^
    |
    = note: #[warn(many_single_char_names)] on by default
    = help: for further information visit https://rust-lang-nursery.github.io/rust-clippy/v0.0.212/index.html#many_single_char_names

The easiest solution is probably inlining those variables

@konstin
Copy link
Member

konstin commented Aug 11, 2018

The tests/rustapi_module should move to the examples folder, so we can take advantage of rust-lang/rfcs#2517 once it lands.

use python::{Python, ToPyPointer};

// Traits
pub trait PyDateComponentAccess {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a bit of a bikeshed, but imo we could remove the Component from the name. A small doc comment like Gives access to year, month and day would also nice for rustdoc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it was long enough that I was considering moving it into a namespace, like:

pub mod  Components {
pub trait PyDateAccess { ... }
}

Either way I'll remove Component from the name.

#[macro_export]
macro_rules! PyDateTime_GET_MONTH {
($o: expr) => {
(*$o).data[2] as c_int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clippy complains about this:

warning: casting u8 to i32 may become silently lossy if types change
   --> src/ffi3/datetime.rs:271:9
    |
271 |         (*$o).data[3] as c_int
    |         ^^^^^^^^^^^^^^^^^^^^^^ help: try: `i32::from((*$o).data[3])`
...
277 |     PyDateTime_GET_DAY!(o as *mut PyDateTime_Date)
    |     ---------------------------------------------- in this macro invocation
    |
    = help: for further information visit https://rust-lang-nursery.github.io/rust-clippy/v0.0.212/index.html#cast_lossless

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I see any good way around it. We could store month and day as u8 in the Rust API I suppose, though we'll still have this problem for year.

Hopefully the test suite will alert us if this becomes a problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

`c_int::from((*$o).data[3]) works and makes clippy happy, so I'd take that

@konstin
Copy link
Member

konstin commented Aug 11, 2018

In general, the additions look good and especially the tests are great! Once the comments are resolved we can merge this.

@pganssle
Copy link
Member Author

One thing that I found across the code is u32 vs. i32. C's int translates to c_int, which is i32.

On the ffi side, I've tried to use the C types wherever possible. On the Rust side, I've tried to use the rule "u32 for things that can never be negative, i32 for things that can", which is why it's i32 for all the components of a timedelta, but u32 for all the components of a datetime.

If we want to use types specifically based on Python ranges, datetime should probably look like:

year: u16
month: u8
day: u8
hour: u8
minute: u8
second: u8
microsecond: u32
fold: u8
tzinfo: *PyObject

I generally used u32 because I have a general sense that u32 is faster than things like u16. If we want to use types that have closer to the actual range of possible uses, year would be a signed type, since people occasionally work with dates BCE (though using the Gregorian calendar to represent such datetimes is an anachronism).

I can switch them all over to i32, though - any change to the types used in the Python C API would probably be a big enough problem that we'd have plenty of time to make backwards-incompatible changes.

@konstin
Copy link
Member

konstin commented Aug 11, 2018

On the safe rust side I'm fine with both specific types as well as u32 or i32 for everything. What worried me was that casting back and forth between u32 and i32 (or int) is potentially dangerous in unsafe code, e.g. in get_year. (It could also be that's perfectly fine and I'm just too worried)

@pganssle pganssle force-pushed the datetime branch 4 times, most recently from 3447558 to d51a3bf Compare August 20, 2018 18:25
// one time. So long as PyDateTime_IMPORT has no side effects (it should not),
// this will be at most a slight waste of resources.
static __PY_DATETIME_API_ONCE: Once = Once::new();
static mut __PY_DATETIME_API_UNSAFE_CACHE: *const PyDateTime_CAPI = ptr::null();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the rust side, you don't need the two underscores. Unless you declare an item or struct field pub, it's invisible outside the module

@pganssle pganssle force-pushed the datetime branch 2 times, most recently from 3cc0a12 to 16d624e Compare August 21, 2018 16:49
@pganssle
Copy link
Member Author

Ok, I believe I have re-enabled the extension-module feature. I think that's the last thing for this batch, so merge when ready.

@pganssle
Copy link
Member Author

The current travis builds are throwing errors about examples/rustapi_module not being in the workspace. Not sure what's up with that, it's working for me locally and examples/* is in Cargo.toml.

@konstin
Copy link
Member

konstin commented Aug 21, 2018

That's my fault (I've been to quick removing stuff) and I'm working on it

pganssle and others added 20 commits August 21, 2018 18:33
As noted in the comments, using call_once to initialize the
PyDateTimeAPI singleton is causing deadlocks with the GIL; these
deadlocks can be avoided by falling back on CPython's own locking
behavior.
Clippy complains about the one-letter variables, but it's preferable to
do the typecasts in-line anyway.
It *should* be safe to cast u32 to i32 in the case of the Py{Date}{Time}
constructors, because any unsigned values that would overflow would
raise an error anyway.

This adds tests to ensure that this is the case - they currently fail on
Python <3.6 because of a known issue with the Python C API.
This makes it cleaner to call PyTime::new and PyDateTime::new from Rust
without needing to retrieve a Python None.
Because the PyDateTime_CAPI is always accessed via a lazy-initialized
static reference, it is not necessary to handle the case where the
functions on the C struct have not been initialized.
In the future we can make ffi::object, ffi::pycapsule, etc as
crate-public, but importing the specific symbols is a light touch way to
do this.
These were basically cargo culted from lazy_static's implementation, but
they are not idiomatic or necessary to indicate that these are private
variables.
For whatever reason I cannot build rustapi_module without this, and
rather than figure out the core problem, I figured that, for symmetry,
it makes sense to just implement Debug for PyObject.
This more closely mimics the CPython API, since the import logic
populates the global, it should also populate the cache.

This also allows users to eagerly initialize the Python C API if
preferred (for example, doing so before populating a bunch of threads,
or before making performance measurements that will be thrown off by a
lazy import).
This is really a test module, but the Rust convention is to put
something like this under examples/, and when it lands, we can take
advantage of "Project-based Examples for Cargo Projects" - RFC link
at rust-lang/rfcs#2517
The PEP 518 way to do this is with pyproject.toml. tox doesn't support
PEP 518 yet, but we get around that by using pip install -e . as part of
the tox build until PEP 518 support arrives in tox.
While the valid ranges for the constructor parameters is the same when
expressed as either u32 or i32, since the Python API uses i32 in their
public interface, we won't have to make any changes to the signatures if
the Python behavior changes (e.g. supporting negative years) without
their API changing.
Because it's unlikely that anything other than the `year` parameter will
change in the C Python API, the rest can be restricted to their logical
ranges, which improves the compile-time error checking.
I've just seen that this had been hidden from clippy through the ffi module reordering, but fixing this on master would cause merge conflicts, so I'm fixing this here directly
@konstin
Copy link
Member

konstin commented Aug 21, 2018

Fixed, besides a flaky test

@konstin
Copy link
Member

konstin commented Aug 22, 2018

We're finally ready to merge ✨

Thank you for all the work put into this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants