Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update GEOS algorithms to support XY CoordinateSequence #872

Open
dbaston opened this issue Apr 14, 2023 · 6 comments
Open

Update GEOS algorithms to support XY CoordinateSequence #872

dbaston opened this issue Apr 14, 2023 · 6 comments

Comments

@dbaston
Copy link
Member

dbaston commented Apr 14, 2023

The CoordinateSequence implementation introduced in #799 allows a CoordinateSequence to be backed by any of the Coordinate types: XY, XYZ, XYM, and XYZM. However, most of the GEOS library accesses coordinates using CoordinateSequence::getAt<Coordinate>(), which requires assumes the CoordinateSequence is backed by XYZ or XYZM Coordinates. So for now, GEOS pads XY and XYM sequences with Z values, forcing them into XYZ or XYZM.

I made certain parts of the GEOS code base "2D safe" (most of geos::algorithm and geos::operation::valid) by removing XYZ access (using CoordinateSequence::getAt<CoordinateXY> instead of CoordinateSequence::getAt(). As long as we only run these algorithms, this allows removal of the GEOS_COORDSEQ_PADZ define in CoordinateSequence.cpp, so that 2D geometries geometries are backed by XY coordinates rather than XYZ coordinates. I did some performance benchmarking with GEOSisValid and not find a compelling performance benefit. However, enabling the code to function with XY-backed sequences may have benefits for memory-constrained users and would make #747 usable for the most common case of an external XYXY buffer. Running the test suite with GEOS_COORDSEQ_PADZ removed will give some idea of the extent of GEOS code that is "2D safe".

Making 2D algorithms only use 2D access as described above is a fairly easy problem. A trickier issue is the handling of algorithms such as convex hull, distance, and Delaunary triangulation that do not require Z coordinates but preserve them if they are available. When adding M support to overlay, I used a helper function to dispatch to LineIntersector methods that are templated for the correct types. This allows us to contain the use of templates to a single point in the overlay algorithm (IntersectionAdder is not templated). And as a side effect, parts of the overlay code also support XY-backed sequences (and avoid wasted Z interpolation of NaNs).

Dispatch helper:

template<typename F, class... Args>
static void binaryDispatch(const CoordinateSequence& seq1, const CoordinateSequence& seq2, F& fun, Args... args)
{
using CoordinateXYZ = Coordinate;
auto typ1 = seq1.getCoordinateType();
auto typ2 = seq2.getCoordinateType();
switch(type_pair(typ1, typ2)) {
case type_pair(CoordinateType::XY, CoordinateType::XY): fun.template operator()<CoordinateXY, CoordinateXY>(args...); break;
case type_pair(CoordinateType::XY, CoordinateType::XYZ): fun.template operator()<CoordinateXY, CoordinateXYZ>(args...); break;
case type_pair(CoordinateType::XY, CoordinateType::XYM): fun.template operator()<CoordinateXY, CoordinateXYM>(args...); break;
case type_pair(CoordinateType::XY, CoordinateType::XYZM): fun.template operator()<CoordinateXY, CoordinateXYZM>(args...); break;
case type_pair(CoordinateType::XYZ, CoordinateType::XY): fun.template operator()<CoordinateXYZ, CoordinateXY>(args...); break;
case type_pair(CoordinateType::XYZ, CoordinateType::XYZ): fun.template operator()<CoordinateXYZ, CoordinateXYZ>(args...); break;
case type_pair(CoordinateType::XYZ, CoordinateType::XYM): fun.template operator()<CoordinateXYZ, CoordinateXYM>(args...); break;
case type_pair(CoordinateType::XYZ, CoordinateType::XYZM): fun.template operator()<CoordinateXYZ, CoordinateXYZM>(args...); break;
case type_pair(CoordinateType::XYM, CoordinateType::XY): fun.template operator()<CoordinateXYM, CoordinateXY>(args...); break;
case type_pair(CoordinateType::XYM, CoordinateType::XYZ): fun.template operator()<CoordinateXYM, CoordinateXYZ>(args...); break;
case type_pair(CoordinateType::XYM, CoordinateType::XYM): fun.template operator()<CoordinateXYM, CoordinateXYM>(args...); break;
case type_pair(CoordinateType::XYM, CoordinateType::XYZM): fun.template operator()<CoordinateXYM, CoordinateXYZM>(args...); break;
case type_pair(CoordinateType::XYZM, CoordinateType::XY): fun.template operator()<CoordinateXYZM, CoordinateXY>(args...); break;
case type_pair(CoordinateType::XYZM, CoordinateType::XYZ): fun.template operator()<CoordinateXYZM, CoordinateXYZ>(args...); break;
case type_pair(CoordinateType::XYZM, CoordinateType::XYM): fun.template operator()<CoordinateXYZM, CoordinateXYM>(args...); break;
case type_pair(CoordinateType::XYZM, CoordinateType::XYZM): fun.template operator()<CoordinateXYZM, CoordinateXYZM>(args...); break;
}
}

Example usage:

class DoIntersect {
public:
DoIntersect(algorithm::LineIntersector& li,
const CoordinateSequence& seq0,
std::size_t i0,
const CoordinateSequence& seq1,
std::size_t i1) :
m_li(li),
m_seq0(seq0),
m_i0(i0),
m_seq1(seq1),
m_i1(i1) {}
template<typename T1, typename T2>
void operator()() {
const T1& p00 = m_seq0.getAt<T1>(m_i0);
const T1& p01 = m_seq0.getAt<T1>(m_i0 + 1);
const T2& p10 = m_seq1.getAt<T2>(m_i1);
const T2& p11 = m_seq1.getAt<T2>(m_i1 + 1);
m_li.computeIntersection(p00, p01, p10, p11);
}
private:
algorithm::LineIntersector& m_li;
const CoordinateSequence& m_seq0;
std::size_t m_i0;
const CoordinateSequence& m_seq1;
std::size_t m_i1;
};
/*public*/
void
LineIntersector::computeIntersection(const CoordinateSequence& p, std::size_t p0,
const CoordinateSequence& q, std::size_t q0)
{
DoIntersect dis(*this, p, p0, q, q0);
CoordinateSequences::binaryDispatch(p, q, dis);
}

Other algorithms, such as convex hull, nearest points and Delaunay triangulation, might require different techniques. One way might be to perform the algorithm using CoordinateXY references and then construct a result by casting CoordinateXY* back to whatever the actual backing type was (e.g., CoordinateXYM*).

I'm not sure the effort to make these changes is worth it, but wanted to leave some breadcrumbs for anyone who wants to pursue this farther.

@dr-jts
Copy link
Contributor

dr-jts commented Apr 14, 2023

Other algorithms, such as convex hull, nearest points and Delaunay triangulation, might require different techniques.

What about representing single coordinates using a structure wide enough to hold all potential ordinates? (Probably CoordinateXYZM, unless there is a reason for something different). That will keep Z and M ordinates if they are present, and allow copying them into newly created CoordinateSequences (which will only keep the ordinates defined for the sequence).

@pramsey
Copy link
Member

pramsey commented Apr 14, 2023

For XY coordinate sequences, imposing an XYZM struct on top would read out of bounds for the last coordinates X and M values. For internally managed buffers we could just always make our buffers slightly larger than necessary, but for external buffers we'd be in a pickle.

@dr-jts
Copy link
Contributor

dr-jts commented Apr 14, 2023

That's the issue with using references to Coordinate references, I guess. Is that really so much more efficient?

Or perhaps a hybrid approach, where some access is done just be reference, but if the coordinate will form part of a new geometry, it is copied fully.

@dr-jts
Copy link
Contributor

dr-jts commented Apr 14, 2023

How about using CoordinateXY references for predicate/metric operations, and a full coordinate copy for constructive operations? Constructive operations tend to do a lot of work anyway, so the cost difference may be low (and worth it if ZM-preserving is a functional goal).

@tfiner
Copy link

tfiner commented May 3, 2024

Why go through all this trouble to maintain dimensions beyond XY for what is essentially "user" data for each and every coordinate?

@dr-jts
Copy link
Contributor

dr-jts commented May 3, 2024

Why go through all this trouble to maintain dimensions beyond XY for what is essentially "user" data for each and every coordinate?

  • This allows GEOS to be used as a representation for geometry with Z and M (either for simple "store-and-forward", or for processing
  • Because some existing clients (in particular, PostGIS) expect to be able to represent and process geometries with Z and M ordinates.
  • This allows supporting algorithms which depend on Z or M ordinates (e.g. linear referencing)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants